
Dataset Information


Multi-sample pooling and Illumina Genome Analyzer sequencing methods to determine gene sequence variation for database development.


ABSTRACT: Determination of sequence variation within a genetic locus to develop clinically relevant databases is critical for molecular assay design and clinical test interpretation, so multisample pooling for Illumina Genome Analyzer (GA) sequencing was investigated using the RET proto-oncogene as a model. Samples were Sanger-sequenced for RET exons 10, 11, and 13-16. Ten samples with 13 known unique variants ("singleton variants" within the pool) and seven common changes were amplified and then equimolar-pooled before sequencing on a single flow cell lane, generating 36-base reads. For comparison, a single "control" sample was run in a different lane. After alignment, a 24-base quality score-screening threshold and 3′ read-end trimming of three bases yielded low background error rates with a 27% decrease in aligned read coverage. Sequencing data were evaluated using an established variant detection method (percent variant reads), the presented subtractive correction method, and SNPSeeker software. In total, 41 variants (of which 23 were singleton variants) were detected in the 10-sample pool data, including all Sanger-identified variants. The 23 singleton variants were detected near the expected 5% allele frequency (average 5.17% ± 0.90% variant reads), well above the highest background error (1.25%). Based on background error rates, read coverage, simulated 30-, 40-, and 50-sample pool data, expected singleton allele frequencies within pools, and variant detection methods, ≥30 samples (which demonstrated a minimum of 1% variant reads for singletons) could be pooled to reliably detect singleton variants by GA sequencing.
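
The "expected 5% allele frequency" quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only and is not the authors' analysis code. It assumes each singleton variant is heterozygous in exactly one diploid sample, that pooling is perfectly equimolar, and that amplification and sequencing introduce no allele bias. The function name `expected_singleton_fraction` is hypothetical.

```python
def expected_singleton_fraction(n_samples: int, ploidy: int = 2) -> float:
    """Expected fraction of reads carrying a singleton variant.

    Assumption (not stated in the source): the variant is present as one
    allele in exactly one sample, so it is 1 of ploidy * n_samples alleles
    in an ideal equimolar pool.
    """
    if n_samples < 1 or ploidy < 1:
        raise ValueError("n_samples and ploidy must be positive integers")
    return 1.0 / (ploidy * n_samples)


if __name__ == "__main__":
    # Pool sizes mentioned in the abstract: the sequenced 10-sample pool
    # and the simulated 30-, 40-, and 50-sample pools.
    for n in (10, 30, 40, 50):
        frac = expected_singleton_fraction(n)
        print(f"{n:>2} samples: expected singleton fraction {frac:.2%}")
```

Under these assumptions a 10-sample pool gives 1 variant allele in 20, or 5%, which matches the expected value quoted above; the observed average was 5.17% ± 0.90%. The same idealised calculation gives 1% for a 50-sample pool. Real pools will deviate from these figures, which is why the abstract also weighs background error rates and read coverage.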

SUBMITTER: Margraf RL 

PROVIDER: S-EPMC2922832 | biostudies-literature | 2010 Sep

REPOSITORIES: biostudies-literature


Publications

Multi-sample pooling and Illumina Genome Analyzer sequencing methods to determine gene sequence variation for database development.

Margraf RL, Durtschi JD, Dames S, Pattison DC, Stephens JE, Mao R, Voelkerding KV

Journal of Biomolecular Techniques: JBT, 2010-09-01, issue 3



Similar Datasets

| S-EPMC3121147 | biostudies-literature
| S-EPMC7657563 | biostudies-literature
| S-EPMC3334598 | biostudies-literature
| S-EPMC5956518 | biostudies-literature
| S-EPMC5755963 | biostudies-literature
| S-EPMC4787001 | biostudies-literature
| S-EPMC4008303 | biostudies-literature
| S-EPMC6506499 | biostudies-literature
| S-EPMC4086933 | biostudies-literature