Project description:The microbiome sequencing model is a Named Entity Recognition (NER) model that identifies and annotates microbiome nucleic acid sequencing methods and platforms in text. This is the final model version used to annotate metagenomics publications in Europe PMC and to enrich metagenomics studies in MGnify with sequencing metadata from the literature. For more information, please refer to the following blog posts: http://blog.europepmc.org/2020/11/europe-pmc-publications-metagenomics-annotations.html https://www.ebi.ac.uk/about/news/service-news/enriched-metadata-fields-mgnify-based-text-mining-associated-publications
Project description:NGPS is a method for de-novo, full-length protein sequencing in high throughput. The method is based on cleavage of the protein at semi-random sites by microwave-assisted acid hydrolysis (MAAH), enrichment of LC-MS/MS amenable peptides from the hydrolysate by solid-phase-extraction, LC-MS/MS analysis, de-novo long peptide tag sequencing of resulting peptides and assembly of peptide tags into consensus contigs.
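The final step described above, assembling peptide tags into contigs, can be illustrated with a minimal sketch. This is not the NGPS implementation: it assumes error-free tags, exact suffix-prefix matching, and a greedy merge order, and the names `overlap`, `drop_contained`, `assemble`, and `min_len` are hypothetical. A real assembler would also have to handle sequencing errors and the isobaric residues leucine and isoleucine, and would build a consensus from redundant tags rather than simply concatenating them.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` residues exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(seqs):
    """Remove duplicates and sequences fully contained in another sequence."""
    unique = sorted(set(seqs))  # sorted for deterministic behaviour
    return [s for s in unique if not any(s != t and s in t for t in unique)]


def assemble(tags, min_len=3):
    """Greedily merge the pair of contigs with the longest exact overlap.

    Illustrative assumption: tags are error-free, so exact string
    matching is enough to detect overlaps.
    """
    contigs = drop_contained(tags)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])


if __name__ == "__main__":
    # Hypothetical peptide tags covering one short protein region.
    print(assemble(["MKTAY", "TAYIAK", "IAKQR", "KTAY"]))
```

Greedy merging is used here only because it is short to write; it can produce chimeric contigs when tags come from repetitive regions, which is one reason a consensus-based approach is preferable in practice.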
Project description:P53 mutation is closely associated with the occurrence and progression of colon cancer. In this project, we performed crotonylomics sequencing on a homologous pair of human colon cancer cell lines: HCT116+/+ (wild-type p53) and HCT116-/- (p53-null). The crotonylomics sequencing results showed that p53 deficiency regulated crotonylation of non-histone proteins.
Project description:Critical protein therapeutics, such as antibodies and nanobodies, are often not encoded in reference genomes, limiting their accurate characterization via standard proteomics. Current methods rely on indirect inference, fragmented outputs, and labor-intensive workflows, which hinder functional insights and routine application. Here, we present a generalizable, end-to-end workflow for direct protein sequencing, combining streamlined sample preparation, AI-driven de novo peptide sequencing, and tailored assembly to reconstruct contiguous protein sequences. A novel composite scoring framework prioritizes longer assemblies and coverage, enhancing accuracy and reducing ambiguity. Validation across diverse protein modalities demonstrates its utility and ability to robustly reconstruct functionally critical regions essential for optimizing therapeutic efficacy, stability, and immunogenicity. This workflow advances precision proteomics with promising applications in therapeutic discovery, immune profiling, and protein engineering.
Project description:We selected human intervertebral disc samples for proteomics analysis. According to the Pfirrmann classification, there were 1 grade I case, 1 grade II case, 3 grade III cases, and 3 grade IV cases. RNA sequencing and single-cell RNA sequencing data were integrated with the proteomics data to identify hub genes for intervertebral disc degeneration using bioinformatic methods.
Project description:Dependent on concise, pre-defined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics datasets, together with a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to other algorithms and improves the discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.