The factor of 10 in forensic DNA match probabilities.
ABSTRACT: The classic experiments that led to the view that profile probability assignments are usually within a factor of 10 of each other were updated. The data used in this study consist of 15 Identifiler loci collected from a wide range of forensic populations. Following Budowle et al., the terms cognate and non-cognate are used. The cognate database is the database from which the profiles are simulated. The profile probability assignment was usually larger in the cognate database. In 44%-65% of the cases, the profile probability for 15 loci in the non-cognate database was within a factor of 10 of the profile probability in the cognate database. This proportion was between 60% and 80% when the FBI and NIST data were used as the non-cognate databases. A second experiment compared the match probability assignment using a generalised database and recommendation 4.2 from NRC II (the 4.2 assignment) with a proxy for the matching proportion developed using subpopulation allele frequencies and the product rule. The findings support the view that the 4.2 assignment has a large conservative bias. These results agree with previous research.
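The two assignments compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the product rule is the usual Hardy-Weinberg product over loci and that the 4.2 assignment uses the population-structure formulae (equations 4.10) of NRC II with a coancestry coefficient `theta`. The function names, the data layout, and any value of `theta` are hypothetical.

```python
import math

def genotype_prob(p_a, p_b=None, theta=0.0):
    """Single-locus genotype probability.

    theta = 0 gives the product rule (p^2 or 2*p_a*p_b).
    theta > 0 gives the population-structure formulae assumed here
    to correspond to NRC II equations 4.10.
    Pass p_b=None for a homozygote.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_b is None:  # homozygote
        return (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a) / denom
    return 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b) / denom

def profile_prob(profile, freqs, theta=0.0):
    """Multiply single-locus probabilities across loci.

    profile: {locus: (allele1, allele2)}
    freqs:   {locus: {allele: frequency}} for one database
    """
    prob = 1.0
    for locus, (a, b) in profile.items():
        p_a = freqs[locus][a]
        p_b = None if a == b else freqs[locus][b]
        prob *= genotype_prob(p_a, p_b, theta)
    return prob

def within_factor_of_10(p_cognate, p_non_cognate):
    """True when the two assignments differ by at most one order of magnitude."""
    return abs(math.log10(p_cognate / p_non_cognate)) <= 1.0
```

Evaluating `profile_prob` for the same simulated profile against a cognate and a non-cognate frequency table, then passing both results to `within_factor_of_10`, reproduces the kind of comparison the abstract reports as a percentage of cases.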
Project description:Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was D2S1338, with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10<sup>-17</sup>. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications.
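The abstract does not say how the discrimination power and combined matching probability were computed. A minimal sketch under the common definitions (matching probability as the sum of squared observed genotype frequencies, discrimination power as its complement, and the combined value as the product over loci) could look like this; all function names are hypothetical.

```python
from collections import Counter
from math import prod

def matching_probability(genotypes):
    """Sum of squared genotype frequencies at one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    Allele order within a genotype is ignored.
    """
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())

def discrimination_power(genotypes):
    """Probability that two random individuals differ at this locus."""
    return 1.0 - matching_probability(genotypes)

def combined_matching_probability(genotypes_per_locus):
    """Product of per-locus matching probabilities (assumes independent loci)."""
    return prod(matching_probability(g) for g in genotypes_per_locus)
```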
Project description:The Abbreviated Profile of Hearing Aid Benefit (APHAB) questionnaire reports subjective hearing impairments in four typical conditions. We investigated the association between the frequency-specific probability of hearing loss and scores from the unaided APHAB (APHAB<sub>u</sub>) to determine whether the APHAB<sub>u</sub> could be useful in primary diagnoses of hearing loss, in addition to pure tone and speech audiometry. This retrospective study included database records from 6558 patients (average age 69.0 years). We employed a multivariate generalised linear mixed model to analyse the probabilities of hearing losses (severity range 20-75 dB, evaluated in 5-dB steps), measured at different frequencies (0.5, 1.0, 2.0, 4.0, and 8.0 kHz), for nearly all combinations of APHAB<sub>u</sub> subscale scores (subscale scores from 20 to 80%, evaluated in steps of 5%). We calculated the probability of hearing loss for 28,561 different combinations of APHAB<sub>u</sub> subscale scores (results available online). In general, the probability of hearing loss was positively associated with the combined APHAB<sub>u</sub> score (i.e. increasing probability with increasing scores). However, this association was negative at one frequency (8 kHz). The highest probabilities were for a hearing loss of 45 dB at test frequency 2.0 kHz, but with a wide spreading. We showed that the APHAB<sub>u</sub> subscale scores were associated with the probability of hearing loss measured with audiometry. This information could enrich the expert's evaluation of the subject's hearing loss, and it might help resolve suspicious cases of aggravation. The 0.5 and 8.0 kHz frequencies influenced hearing loss less than the frequencies in-between, and 2.0 kHz was most influential on intermediate degree hearing loss (around 45 dB), which corresponded to the frequency-dependence of speech intelligibility measured with speech audiometry.
Project description:We describe a new computational method for estimating the probability that a point mutation at each position in a genome will influence fitness. These 'fitness consequence' (fitCons) scores serve as evolution-based measures of potential genomic function. Our approach is to cluster genomic positions into groups exhibiting distinct 'fingerprints' on the basis of high-throughput functional genomic data, then to estimate a probability of fitness consequences for each group from associated patterns of genetic polymorphism and divergence. We have generated fitCons scores for three human cell types on the basis of public data from ENCODE. In comparison with conventional conservation scores, fitCons scores show considerably improved prediction power for cis regulatory elements. In addition, fitCons scores indicate that 4.2-7.5% of nucleotides in the human genome have influenced fitness since the human-chimpanzee divergence, and they suggest that recent evolutionary turnover has had limited impact on the functional content of the genome.
Project description:BACKGROUND:Recently, the Combined DNA Index System (CODIS) Core Loci Working Group established by the US Federal Bureau of Investigation (FBI) reviewed and recommended changes to the CODIS core loci. The Working Group identified 20 short tandem repeat (STR) loci (composed of the original CODIS core set loci (minus TPOX), four European recommended loci, PentaE, and DYS391) plus the Amelogenin marker as the new core set. Before selecting and finalizing the core loci, some evaluations are needed to provide guidance for the best options of core selection. METHOD:The performance of the current and newly proposed CODIS core loci sets was evaluated with simplified analyses for adventitious hit rates in reasonably large datasets under single-source profile comparisons, mixture comparisons and kinship searches, and for international data sharing. Informativeness (for example, match probability, average kinship index (AKI)) and mutation rates of each locus were some of the criteria to consider for loci selection. However, the primary factor was performance with challenged forensic samples. RESULTS:The current battery of loci provided in already validated commercial kits meets the needs for single-source profile comparisons and international data sharing, even with relatively large databases. However, the 13 CODIS core loci are not sufficiently powerful for kinship analyses and searching potential contributors of mixtures in larger databases; 19 or more autosomal STR loci perform better. Y-chromosome STR (Y-STR) loci are very useful to trace paternal lineage, deconvolve female and male mixtures, and resolve inconsistencies with Amelogenin typing. The DYS391 locus is of little theoretical or practical use. Combining five or six Y-chromosome STR loci with existing autosomal STR loci can produce better performance than the same number of autosomal loci for kinship analysis and still yield a sufficiently low match probability for single-source profile comparisons.
CONCLUSION:A more comprehensive study should be performed to provide the necessary information to decision makers and stakeholders about the construction of a new set of core loci for CODIS. Finally, selection of loci should be driven by the concept that the needs of casework should be supported by the processes of CODIS (or for that matter any forensic DNA database).
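The abstract does not spell out its "simplified analyses" for adventitious hit rates. One common back-of-envelope version, shown here only as a hedged sketch, treats each comparison of a single-source profile against a database of `db_size` unrelated profiles as an independent trial with the same random match probability. The function names and the independence assumption are illustrative, not taken from the study.

```python
import math

def expected_adventitious_hits(match_prob, db_size):
    """Expected number of chance matches when searching one profile."""
    return match_prob * db_size

def prob_any_adventitious_hit(match_prob, db_size):
    """1 - (1 - p)^N, computed stably for very small p."""
    return -math.expm1(db_size * math.log1p(-match_prob))
```

For small values of `match_prob * db_size`, the two quantities are nearly equal, which is why the expected count is often used as the hit-rate estimate.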
Project description:An RNA switch triggers biological functions by toggling between two conformations. RNA switches include bacterial riboswitches, where ligand binding can stabilize a bound structure. For RNAs with only one stable structure, structural prediction usually just requires a straightforward free energy minimization, but for an RNA switch, the prediction of a less stable alternative structure is often computationally costly and even problematic. The current sampling-clustering method predicts stable and alternative structures by partitioning structures sampled from the energy landscape into two clusters, but it is very time-consuming. Instead, we predict the alternative structure of an RNA switch from conditional probability calculations within the energy landscape. First, our method excludes base pairs related to the most stable structure in the energy landscape. Then, it detects stable stems ("seeds") in the remaining landscape. Finally, it folds an alternative structure prediction around a seed. While having comparable riboswitch classification performance, the conditional-probability computations had fewer adjustable parameters, offered greater predictive flexibility, and were more than one thousand times faster than the sampling step alone in sampling-clustering predictions, the competing standard. Overall, the described approach helps traverse thermodynamically improbable energy landscapes to find biologically significant substructures and structures rapidly and effectively.
Project description:PURPOSE:In this study we investigated a set of 100 sentence contexts and their cloze probabilities to develop a database of linguistic stimuli for Brazilian Portuguese children and adolescents. The study also examined age-related changes in cloze probabilities, and specified the predictor effects of age and cloze probabilities on idiosyncratic responses and errors (semantic, syntactic, and other errors). Finally, the study also aimed to shed light on cultural effects on word generation by comparing Brazilian and Portuguese sentence databases. METHOD:361 typically developing monolingual Brazilian speakers, with ages ranging from 7 to 18 years, participated in the study. The cloze task was composed of 100 sentence contexts, based on the European Portuguese database. Responses were classified as valid (correct) or invalid (semantic, syntactic, and other-type errors). Statistical analyses were based on mixed-effects logistic models. RESULTS:Sixty-three sentences met criteria for high cloze probabilities, 30 for medium cloze, and 7 for low cloze. Age was a significant predictor of idiosyncratic responses, semantic and syntactic errors: older participants were less likely to produce idiosyncratic responses, as well as semantic and syntactic errors. Cloze probability values were concordant in the Brazilian and Portuguese databases for 31 out of 49 (83.7%) high-cloze sentences and for 7 low-cloze sentences. CONCLUSION:In this study we have provided a database with cloze probability values for a set of 100 sentence-final word contexts for Brazilian Portuguese children and adolescents. Results showed that both age and sentence contextual level predicted sentence final word completion. Older participants were more likely to choose the same final word consistently, with the contextual level of a given sentence also contributing to the final word selection. Age should be controlled for in future studies probing semantic processing with this set of sentences.
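As a rough illustration of the quantity being tabulated, the cloze probability of a sentence context is conventionally the proportion of participants who complete it with the most frequent (modal) word. The sketch below assumes that definition and a simple case and whitespace normalisation. The function name is hypothetical, and the study's own response coding and high/medium/low cut-offs are not reproduced here.

```python
from collections import Counter

def cloze_probability(responses):
    """Return (modal_word, proportion of responses equal to it).

    responses: list of completion strings for one sentence context.
    """
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    if not cleaned:
        raise ValueError("no valid responses")
    word, count = Counter(cleaned).most_common(1)[0]
    return word, count / len(cleaned)
```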
Project description:A key problem in computational proteomics is distinguishing between correct and false peptide identifications. We argue that evaluating the error rates of peptide identifications is not unlike computing generating functions in combinatorics. We show that the generating functions and their derivatives (spectral energy and spectral probability) represent new features of tandem mass spectra that, similarly to Delta-scores, significantly improve peptide identifications. Furthermore, the spectral probability provides a rigorous solution to the problem of computing statistical significance of spectral identifications. The spectral energy/probability approach improves the sensitivity-specificity tradeoff of existing MS/MS search tools, addresses the notoriously difficult problem of "one-hit-wonders" in mass spectrometry, and often eliminates the need for decoy database searches. We therefore argue that the generating function approach has the potential to increase the number of peptide identifications in MS/MS searches.
Project description:The accelerated pace of genomic sequencing has increased the demand for structural models of gene products. Improved quantitative methods are needed to study the many systems (e.g., macromolecular assemblies) for which data are scarce. Here, we describe a new molecular dynamics method for protein structure determination and molecular modeling. An energy function, or database potential, is derived from distributions of interatomic distances obtained from a database of known structures. X-ray crystal structures are refined by molecular dynamics with the new energy function replacing the van der Waals potential. Compared to standard methods, this method improved the atomic positions, interatomic distances, and side-chain dihedral angles of structures randomized to mimic the early stages of refinement. The greatest enhancement in side-chain placement was observed for groups that are characteristically buried. More accurate calculated model phases will follow from improved interatomic distances. Details usually seen only in high-resolution refinements were improved, as is shown by an R-factor analysis. The improvements were greatest when refinements were carried out using X-ray data truncated at 3.5 Å. The database potential should therefore be a valuable tool for determining X-ray structures, especially when only low-resolution data are available.
Project description:We employed our previously developed 27-plex ancestry-informative single nucleotide polymorphism (SNP) panel to infer the ancestral components of bone remains of a possible foreign pilot found in south-western China. For ancestry assignment of this unknown individual, we first obtained the 27-SNP genotype of the individual. Then, based on a reference database of 3081 individuals from 33 populations, we calculated the match probability and likelihood ratio using the self-developed software program Forensic Intelligence. Inferred ancestral components of this individual were calculated with STRUCTURE at K = 3. A complete profile was obtained for the individual using our multiplexed SNP assay. The European population was within one order of magnitude of the highest likelihood. The major ancestral component of this individual was 97.6% European.
Project description:Aberrant transcriptional repression through chromatin remodeling and histone deacetylation has been postulated as the driving force for tumorigenesis. FBI-1 (formerly called Pokemon) is a member of the POK family of transcriptional repressors. Recently, FBI-1 was characterized as a critical oncogenic factor that specifically represses transcription of the tumor suppressor gene ARF, potentially leading indirectly to p53 inactivation. Our investigations on transcriptional repression of the p53 pathway revealed that FBI-1 represses transcription of ARF, Hdm2 (human analogue of mouse double minute oncogene), and p21CIP1 (hereafter indicated as p21) but not of p53. FBI-1 showed a more potent repressive effect on p21 than on p53. Our data suggested that FBI-1 is a master controller of the ARF-Hdm2-p53-p21 pathway, ultimately impinging on cell cycle arrest factor p21, by inhibiting upstream regulators at the transcriptional and protein levels. FBI-1 acted as a competitive transcriptional repressor of p53 and Sp1 and was shown to bind the proximal Sp1-3 GC-box and the distal p53-responsive elements of p21. Repression involved direct binding competition of FBI-1 with Sp1 and p53. FBI-1 also interacted with corepressors, such as mSin3A, NCoR, and SMRT, thereby deacetylating Ac-H3 and Ac-H4 histones at the promoter. FBI-1 caused cellular transformation, promoted cell cycle proliferation, and significantly increased the number of cells in S phase. FBI-1 is aberrantly overexpressed in many human solid tumors, particularly in adenocarcinomas and squamous carcinomas. The role of FBI-1 as a master controller of the p53 pathway therefore makes it an attractive therapeutic target.