<HashMap><database>biostudies-literature</database><scores/><additional><submitter>Tharakaraman K</submitter><funding>Intramural NIH HHS</funding><pagination>408</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC1599759</full_dataset_link><repository>biostudies-literature</repository><omics_type>Unknown</omics_type><volume>7</volume><pubmed_abstract>&lt;h4>Background&lt;/h4>Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set.&lt;h4>Results&lt;/h4>We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence.&lt;h4>Conclusion&lt;/h4>Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances.</pubmed_abstract><journal>BMC bioinformatics</journal><pubmed_title>Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements.</pubmed_title><pmcid>PMC1599759</pmcid><funding_grant_id>Z99 LM999999</funding_grant_id><pubmed_authors>Landsman D</pubmed_authors><pubmed_authors>Marino-Ramirez L</pubmed_authors><pubmed_authors>Tharakaraman K</pubmed_authors><pubmed_authors>Spouge JL</pubmed_authors><pubmed_authors>Sheetlin SL</pubmed_authors></additional><is_claimable>false</is_claimable><name>Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements.</name><description>&lt;h4>Background&lt;/h4>Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set.&lt;h4>Results&lt;/h4>We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence.&lt;h4>Conclusion&lt;/h4>Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances.</description><dates><release>2006-01-01T00:00:00Z</release><publication>2006 Sep</publication><modification>2024-11-20T02:32:42.23Z</modification><creation>2019-03-27T01:46:03Z</creation></dates><accession>S-EPMC1599759</accession><cross_references><pubmed>16961919</pubmed><doi>10.1186/1471-2105-7-408</doi></cross_references></HashMap>