Project description:Dependent on concise, pre-defined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics datasets, along with a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. Relative to other algorithms, LADS improves sequencing accuracy on longer peptides and improves the discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.
Project description:Precision de novo peptide sequencing using mirror proteases of Ac-LysargiNase and trypsin for large-scale proteomics
Project description:Sequencing-based approaches have led to new insights about DNA methylation. While many different techniques for genome-scale mapping of DNA methylation have been employed, throughput has been a key limitation for most. To further facilitate the mapping of DNA methylation, we describe a protocol for gel-free multiplexed reduced representation bisulfite sequencing (mRRBS) that reduces the workload dramatically and enables processing of 96 or more samples per week. mRRBS achieves CpG coverage similar to that of the original RRBS protocol, while the higher throughput and lower cost make it better suited for large-scale DNA methylation mapping studies, including cohorts of cancer samples. Libraries of 96 human samples
Project description:Cytosine methylation of DNA CpG dinucleotides in gene promoters is an epigenetic modification that regulates gene transcription. While many methods exist to interrogate methylation states, no current methods offer large-scale, targeted, single-CpG resolution. We report an approach combining bisulfite treatment followed by RainDance microdroplet PCR with next-generation sequencing to assay the methylation state of 50 genes in the regions 1 kb upstream and downstream of their transcription start sites. Wild-type and hypermethylated Jurkat DNA (New England Biolabs) were treated with bisulfite to convert all unmethylated cytosines to uracil. Following bisulfite treatment, targeted amplification was carried out using a custom primer library and microdroplet PCR. The PCR product was sheared to 200 bp and ligated to sequencing adapters following standard protocols. Sequencing was conducted with single-end 100 bp reads on an Illumina GAIIx for wild-type Jurkat DNA or Jurkat CpG DNA, with a single sample per lane.
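The bisulfite step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this project: the function names `bisulfite_convert` and `call_cpg_methylation` are hypothetical, and the sketch assumes a single top-strand read that aligns to the reference without gaps or sequencing errors. Unmethylated cytosines are converted to uracil, which is read as thymine after PCR, so a C retained at a reference CpG indicates methylation.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus PCR on a top-strand sequence.

    Unmethylated C is converted to U and amplified as T; methylated C is protected.
    `methylated_positions` holds 0-based indices of methylated cytosines (an assumption
    of this sketch, not a format used by the project).
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_methylation(reference, read):
    """Call methylation at each reference CpG from an ungapped, same-length read.

    Returns {position: True/False}; positions showing neither C nor T are skipped.
    """
    reference = reference.upper()
    read = read.upper()
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = True   # C retained: methylated
            elif read[i] == "T":
                calls[i] = False  # C converted: unmethylated
    return calls


# Toy example: CpGs at positions 1 and 4; only position 1 is methylated.
reference = "ACGTCGAC"
read = bisulfite_convert(reference, methylated_positions=[1])
calls = call_cpg_methylation(reference, read)
```

In practice, per-CpG methylation levels would be estimated from many reads covering each site, and reads would first be aligned with a bisulfite-aware aligner; both steps are outside the scope of this sketch.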