SONiCS: PCR stutter noise correction in genome-scale microsatellites.
ABSTRACT: Motivation: Massively parallel capture of short tandem repeats (STRs, or microsatellites) provides a strategy for population genomic and demographic analyses at high resolution with or without a reference genome. However, the high Polymerase Chain Reaction (PCR) cycle numbers needed for target capture experiments create genotyping noise through polymerase slippage, known as PCR stutter. Results: We developed SONiCS (Stutter mONte Carlo Simulation), a solution for stutter correction based on dense forward simulations of PCR and capture experimental conditions. To test SONiCS, we genotyped a 2499-marker STR panel in 22 humpback dolphins (Sousa sahulensis) using target capture, and generated capillary-based genotypes to validate five of these markers. In these 110 comparisons, SONiCS showed a 99.1% accuracy rate and a 98.2% genotyping success rate, miscalling a single allele in a marker with low sequence coverage and rejecting another as un-callable. Availability and implementation: Source code and documentation for SONiCS are freely available at https://github.com/kzkedzierska/sonics. Raw read data used in experimental validation of SONiCS have been deposited in the Sequence Read Archive under accession number SRP135756. Supplementary information: Supplementary data are available at Bioinformatics online.
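The forward-simulation idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SONiCS implementation: the function name `simulate_pcr_stutter`, the per-cycle efficiency, and the slippage probabilities `p_down` and `p_up` are all assumptions chosen for illustration, and the capture step is not modelled. Each molecule is copied with some probability per cycle, and each new copy may lose or gain one repeat unit.

```python
import random
from collections import Counter


def simulate_pcr_stutter(start_length, n_start=10, cycles=10, efficiency=0.9,
                         p_down=0.02, p_up=0.005, seed=0):
    """Monte Carlo forward simulation of one allele through PCR.

    All parameter values are hypothetical. Returns a Counter mapping
    repeat-unit count to the number of molecules after `cycles` cycles.
    """
    rng = random.Random(seed)
    pool = Counter({start_length: n_start})
    for _ in range(cycles):
        new_copies = Counter()
        for length, count in pool.items():
            for _ in range(count):
                if rng.random() >= efficiency:
                    continue  # this molecule was not copied in this cycle
                r = rng.random()
                if r < p_down:
                    child = max(1, length - 1)  # slippage: copy loses one unit
                elif r < p_down + p_up:
                    child = length + 1          # slippage: copy gains one unit
                else:
                    child = length              # faithful copy
                new_copies[child] += 1
        pool.update(new_copies)  # Counter.update adds the new counts
    return pool


if __name__ == "__main__":
    pool = simulate_pcr_stutter(12)
    total = sum(pool.values())
    for length in sorted(pool):
        print(length, round(pool[length] / total, 4))
```

Repeating such a simulation over many parameter settings and comparing the simulated length distributions with the observed read-length distribution is one way a genotype could be chosen; how SONiCS actually scores candidates is not stated in the abstract.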
Project description:Short tandem repeats (STRs) are polymorphic genomic loci valuable for various applications such as research, diagnostics and forensics. However, their polymorphic nature also introduces noise during in vitro amplification, making them difficult to analyze. Although it is possible to overcome stutter noise by using amplification-free library preparation, such protocols are presently incompatible with single cell analysis and with targeted-enrichment protocols. To address this challenge, we have designed a method for direct measurement of in vitro noise. Using a synthetic STR sequencing library, we have calibrated a Markov model for the prediction of stutter patterns at any amplification cycle. Using this model, we accurately genotyped cases of severe amplification bias and biallelic STR signals, and validated the model for several high-fidelity PCR enzymes. Finally, we compared this model in the context of a naïve STR genotyping strategy against the state-of-the-art on a benchmark of single cells, demonstrating superior accuracy.
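A Markov-style prediction of the stutter pattern at a given amplification cycle, as mentioned in the description above, might look like the following Python sketch. It is an assumption-laden illustration rather than the authors' calibrated model: the states are repeat-unit counts, and the efficiency and per-copy slippage probabilities (`p_down`, `p_up`) are hypothetical constants, whereas the source calibrates its model on a synthetic STR library.

```python
def stutter_profile(n_states, start, cycles, efficiency=0.9,
                    p_down=0.02, p_up=0.005):
    """Expected fraction of molecules at each repeat length after `cycles`.

    State i means "i repeat units". Each cycle, a fraction `efficiency`
    of the molecules in every state is copied; a copy moves one state
    down or up with the (hypothetical) probabilities p_down and p_up.
    """
    counts = [0.0] * n_states
    counts[start] = 1.0
    for _ in range(cycles):
        new = counts[:]  # templates persist alongside their copies
        for i, c in enumerate(counts):
            copied = efficiency * c
            down = p_down if i > 0 else 0.0            # no state below 0
            up = p_up if i < n_states - 1 else 0.0     # no state above the top
            if down:
                new[i - 1] += copied * down
            if up:
                new[i + 1] += copied * up
            new[i] += copied * (1.0 - down - up)
        counts = new
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    for n_cycles in (10, 20, 30):
        profile = stutter_profile(n_states=25, start=12, cycles=n_cycles)
        print(n_cycles, [round(p, 4) for p in profile[10:15]])
```

Because the update is deterministic, the profile for any cycle number can be computed directly, which is the property the description relies on when it refers to predicting stutter patterns "at any amplification cycle".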
Project description:It has been posited that the longest uninterrupted stretch (LUS) of tandem repeats, as defined by the number of exactly matching repeating motif units, is a better predictor of rates of stutter than the parental allele length (PAL). While there are cases where this hypothesis is likely correct, such as the 9.3 allele in the TH01 locus, there can be situations where it may not apply as well. For example, the PAL may capture flanking indel variations while remaining insensitive to polymorphisms in the repeat, and these haplotypic changes may impact the stutter rate. To address this, rates of stutter were contrasted against the LUS as well as the PAL on different flanking haplotypic backgrounds. This study shows that rates of stutter can vary substantially depending on the flanking haplotype, and while there are cases where the LUS is a better predictor of stutter than the PAL, examples to the contrary are apparent in commonly assayed forensic markers. Further, flanking variation that is 7 bp from the repeat region can impact rates of stutter. These findings suggest that non-proximal effects, such as DNA secondary structure, may be impacting the rates of stutter in common forensic short tandem repeat markers.
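The LUS as defined in the description above (the number of exactly matching, consecutive motif units) can be computed with a short Python sketch. The function name and the example sequence are hypothetical and are not taken from the study; the sequence is a made-up repeat with one interrupted unit, not a claim about any real locus.

```python
def longest_uninterrupted_stretch(seq, motif):
    """Return the largest number of back-to-back exact copies of `motif` in `seq`."""
    m = len(motif)
    if m == 0:
        raise ValueError("motif must be non-empty")
    # run[i] = number of consecutive motif copies starting at position i
    run = [0] * (len(seq) + 1)
    best = 0
    for i in range(len(seq) - m, -1, -1):
        if seq[i:i + m] == motif:
            run[i] = 1 + run[i + m]
            best = max(best, run[i])
    return best


if __name__ == "__main__":
    # Hypothetical allele: six full units, one unit missing a base, three more units
    allele = "AATG" * 6 + "ATG" + "AATG" * 3
    print(longest_uninterrupted_stretch(allele, "AATG"))
```

Scanning from the right and reusing `run[i + m]` checks every reading frame, so a run that starts inside an earlier partial match is still counted in full.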
Project description:Stuttering is a neurodevelopmental disorder characterized by frequent and involuntary disruptions during speech production. Adults who stutter are often subject to negative perceptions. The present study examined whether negative social and cognitive impressions are formed when listening to speech, even without any knowledge about the speaker. Two experiments were conducted in which naïve participants were asked to listen to and provide ratings on samples of read speech produced by adults who stutter and typically-speaking adults without knowledge about the individuals who produced the speech. In both experiments, listeners rated speaker cognitive ability, likeability, anxiety, as well as a number of speech characteristics that included fluency, naturalness, intelligibility, the likelihood the speaker had a speech-and-language disorder (Experiment 1 only), rate and volume (both Experiments 1 and 2). The speech of adults who stutter was perceived to be less fluent, natural, intelligible, and to be slower and louder than the speech of typical adults. Adults who stutter were also perceived to have lower cognitive ability, to be less likeable and to be more anxious than the typical adult speakers. Relations between speech characteristics and social and cognitive impressions were found, independent of whether or not the speaker stuttered (i.e., they were found for both adults who stutter and typically-speaking adults) and did not depend on being cued that some of the speakers may have had a speech-language impairment.
Project description:Short tandem repeats (STRs) are highly mutable genetic elements that often reside in regulatory and coding DNA. The cumulative evidence of genetic studies on individual STRs suggests that STR variation profoundly affects phenotype and contributes to trait heritability. Despite recent advances in sequencing technology, STR variation has remained largely inaccessible across many individuals compared to single nucleotide variation or copy number variation. STR genotyping with short-read sequence data is confounded by (1) the difficulty of uniquely mapping short, low-complexity reads; and (2) the high rate of STR amplification stutter. Here, we present MIPSTR, a robust, scalable, and affordable method that addresses these challenges. MIPSTR uses targeted capture of STR loci by single-molecule Molecular Inversion Probes (smMIPs) and a unique mapping strategy. Targeted capture and our mapping strategy resolve the first challenge; the use of single molecule information resolves the second challenge. Unlike previous methods, MIPSTR is capable of distinguishing technical error due to amplification stutter from somatic STR mutations. In proof-of-principle experiments, we use MIPSTR to determine germline STR genotypes for 102 STR loci with high accuracy across diverse populations of the plant A. thaliana. We show that putatively functional STRs may be identified by deviation from predicted STR variation and by association with quantitative phenotypes. Using DNA mixing experiments and a mutant deficient in DNA repair, we demonstrate that MIPSTR can detect low-frequency somatic STR variants. MIPSTR is applicable to any organism with a high-quality reference genome and is scalable to genotyping many thousands of STR loci in thousands of individuals.
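The use of single-molecule information described above can be sketched in Python as a per-tag consensus: reads sharing a molecular tag derive from one captured molecule, so disagreement within a tag is likely amplification stutter rather than true variation. This is only an illustration under stated assumptions, not the MIPSTR pipeline; the function name, the `(tag, repeat_length)` input format, and the thresholds `min_reads` and `min_fraction` are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_reads=3, min_fraction=0.6):
    """Collapse reads to one repeat length per molecular tag.

    `reads` is an iterable of (tag, repeat_length) pairs. Tags with fewer
    than `min_reads` reads, or without a length supported by at least
    `min_fraction` of their reads, are dropped. Thresholds are hypothetical.
    """
    by_tag = defaultdict(list)
    for tag, length in reads:
        by_tag[tag].append(length)

    consensus = {}
    for tag, lengths in by_tag.items():
        if len(lengths) < min_reads:
            continue  # too few reads to trust this molecule
        length, support = Counter(lengths).most_common(1)[0]
        if support / len(lengths) >= min_fraction:
            consensus[tag] = length
    return consensus


if __name__ == "__main__":
    reads = [("AAC", 10), ("AAC", 10), ("AAC", 9),
             ("GGT", 12), ("GGT", 12), ("GGT", 12),
             ("TTA", 10)]
    per_molecule = consensus_by_tag(reads)
    print(per_molecule)
    print(Counter(per_molecule.values()))  # molecule-level allele counts
```

Counting alleles over molecules instead of reads is what would let a low-frequency length seen consistently within its own tags be treated as a candidate somatic variant rather than stutter.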
Project description:Stuttering is a neurodevelopmental disorder that affects the timing and rhythmic flow of speech production. When speech is synchronized with an external rhythmic pacing signal (e.g., a metronome), even severe stuttering can be markedly alleviated, suggesting that people who stutter may have difficulty generating an internal rhythm to pace their speech. To investigate this possibility, children who stutter and typically-developing children (n=17 per group, aged 6-11 years) were compared in terms of their auditory rhythm discrimination abilities of simple and complex rhythms. Children who stutter showed worse rhythm discrimination than typically-developing children. These findings provide the first evidence of impaired rhythm perception in children who stutter, supporting the conclusion that developmental stuttering may be associated with a deficit in rhythm processing.
Project description:Affecting 1% of the general population, stuttering impairs the normally effortless process of speech production, which requires precise coordination of sequential movement occurring among the articulatory, respiratory, and resonance systems, all within millisecond time scales. Those afflicted experience frequent disfluencies during ongoing speech, often leading to negative psychosocial consequences. The aetiology of stuttering remains unclear; compared to other neurodevelopmental disorders, few studies to date have examined the neural bases of childhood stuttering. Here we report, for the first time, results from functional (resting state functional magnetic resonance imaging) and structural connectivity analyses (probabilistic tractography) of multimodal neuroimaging data examining neural networks in children who stutter. We examined how synchronized brain activity occurring among brain areas associated with speech production, and white matter tracts that interconnect them, differ in young children who stutter (aged 3-9 years) compared with age-matched peers. Results showed that children who stutter have attenuated connectivity in neural networks that support timing of self-paced movement control. The results suggest that auditory-motor and basal ganglia-thalamocortical networks develop differently in stuttering children, which may in turn affect speech planning and execution processes needed to achieve fluent speech motor control. These results provide important initial evidence of neurological differences in the early phases of symptom onset in children who stutter.
Project description:Stuttering affects nearly 1% of the population worldwide and often has life-altering negative consequences, including poorer mental health and emotional well-being, and reduced educational and employment achievements. Over two decades of neuroimaging research reveals clear anatomical and physiological differences in the speech neural networks of adults who stutter. However, there have been few neurophysiological investigations of speech production in children who stutter. Using functional near-infrared spectroscopy (fNIRS), we examined hemodynamic responses over neural regions integral to fluent speech production including inferior frontal gyrus, premotor cortex, and superior temporal gyrus during a picture description task. Thirty-two children (16 stuttering and 16 controls) aged 7-11 years participated in the study. We found distinctly different speech-related hemodynamic responses in the group of children who stutter compared to the control group. Whereas controls showed significant activation over left dorsal inferior frontal gyrus and left premotor cortex, children who stutter exhibited deactivation over these left hemisphere regions. This investigation of neural activation during natural, connected speech production in children who stutter demonstrates that in childhood stuttering, atypical functional organization for speech production is present and suggests promise for the use of fNIRS during natural speech production in future research with typical and atypical child populations.
Project description:Speech timing deficits have been proposed as a causal factor in the disorder of stuttering. The question of whether individuals who stutter have deficits in nonspeech timing is one that has been revisited often, with conflicting results. Here, we uncover subtle differences in a manual metronome synchronization task that included tempo changes with adults who stutter and fluent speakers. We used sensitive circular statistics to examine both asynchrony and consistency in motor production. While both groups displayed a classic negative mean asynchrony (tapping before the beat), individuals who stutter anticipated the beat even more than their fluent peers, and their consistency was particularly affected at slow tempi. Surprisingly, individuals who stutter did not have problems with interval correction at tempo changes. We also examined the influence of music experience on synchronization behavior in both groups. While music perception and training were related to synchronization behavior in fluent participants, these correlations were not present for the stuttering group; however, one measure of stuttering severity (self-rated severity) was negatively correlated with music training. Overall, we found subtle differences in paced auditory-motor synchronization in individuals who stutter, consistent with a timing problem extending to nonspeech.
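The circular statistics mentioned above are commonly computed by mapping each tap asynchrony onto a phase angle within the metronome interval and averaging the resulting unit vectors. The Python sketch below shows that standard calculation; the function name and the example values are hypothetical, and the study's exact analysis is not specified in the description.

```python
import math


def circular_summary(asynchronies_ms, interval_ms):
    """Return (mean asynchrony in ms, resultant vector length R).

    Each asynchrony (tap time minus beat time) becomes an angle on a
    circle whose circumference is one inter-beat interval. R near 1
    means consistent tapping; R near 0 means taps spread over the cycle.
    """
    angles = [2.0 * math.pi * a / interval_ms for a in asynchronies_ms]
    c = sum(math.cos(t) for t in angles) / len(angles)
    s = sum(math.sin(t) for t in angles) / len(angles)
    mean_angle = math.atan2(s, c)   # direction of the mean vector
    r = math.hypot(c, s)            # length of the mean vector
    return mean_angle * interval_ms / (2.0 * math.pi), r


if __name__ == "__main__":
    taps = [-60.0, -45.0, -52.0, -70.0, -38.0]  # hypothetical asynchronies
    mean_ms, consistency = circular_summary(taps, interval_ms=500.0)
    print(round(mean_ms, 1), round(consistency, 3))
```

A negative mean corresponds to tapping before the beat, matching the negative mean asynchrony the description reports for both groups.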
Project description:PURPOSE: The purpose of the present study was to extend previous research by analyzing the ability of adults who stutter to use phonological working memory in conjunction with lexical access to perform a word jumble task. METHOD: Forty English words consisting of 3, 4, 5, and 6 letters (n = 10 per letter length category) were randomly jumbled using a web-based application. During the experimental task, 26 participants were asked to silently manipulate the scrambled letters to form a real word. Each vocal response was coded for accuracy and speech reaction time (SRT). RESULTS: Adults who stutter attempted to solve fewer word jumble stimuli than adults who do not stutter at the 4-letter, 5-letter, and 6-letter lengths. Additionally, adults who stutter were significantly less accurate solving word jumble tasks at the 4-letter, 5-letter, and 6-letter lengths compared to adults who do not stutter. At the longest word length (6-letter), SRT was significantly slower for the adults who stutter than the fluent controls. CONCLUSION: Results of the current study lend further support to the notion that differences in various aspects of phonological processing, including vision-to-sound conversions, sub-vocal stimulus manipulation, and/or lexical access are compromised in adults who stutter.
Project description:Microsatellites are polymorphic short tandem repeats of 1-6 nucleotides ubiquitously present in the genome that are extensively used in living organisms as genetic markers and in oncology to detect microsatellite instability (MSI). While the standard method for analyzing microsatellites is based on PCR followed by capillary electrophoresis, it generates undesirable frameshift products known as 'stutter peaks', caused by polymerase slippage, that can greatly complicate the analysis and interpretation of the data. Here we present an easily multiplexed approach that replaces PCR with low temperature isothermal amplification using recombinase polymerase amplification (LT-RPA). It drastically reduces, and sometimes completely abolishes, the formation of stutter artifacts, thus greatly simplifying allele calling. Using HT17, a mononucleotide DNA repeat that was previously proposed as an optimal marker to detect MSI in tumor DNA, we showed that LT-RPA improves the limit of detection of MSI by up to fourfold compared to PCR, notably for small deletions, and simplifies the identification of the mutant alleles. It was successfully applied to clinical colorectal cancer samples and enabled detection of MSI. This easy-to-handle, rapid and cost-effective approach may substantially improve the analysis of microsatellites in several biological and clinical applications.