Project description:Stop-loss mutations cause over twenty different diseases. These mutations can have multiple consequences that are hard to predict. Stop-loss in ITM2B/BRI2 results in C-terminal extension of the encoded protein and, upon furin cleavage, in the production of two 34-amino-acid peptides, ADan and ABri, that accumulate as amyloids in the brains of patients affected by familial Danish and British dementia. To systematically explore the consequences of Bri2 C-terminal extension, we measure amyloid formation for 676 ADan substitutions and identify the region that forms the putative amyloid core of ADan fibrils, located between positions 20 and 26, where stop-loss occurs. Moreover, we measure amyloid formation for ~18,000 random C-terminal extensions of Bri2 and find that ~32% of these sequences can nucleate amyloids. We find that the amino acid composition of these nucleating sequences varies with peptide length and that short extensions of 2 specific amino acids (aliphatics, aromatics and cysteines) are sufficient to generate novel amyloid cores. Overall, our results show that the C-terminus of Bri2 contains an incomplete amyloid motif that can turn amyloidogenic upon extension. C-terminal extension with de novo formation of amyloid motifs may thus be a widespread pathogenic mechanism resulting from stop-loss, highlighting the importance of determining the impact of these mutations for other sequences across the genome.
Project description:Spliced peptides are short protein fragments spliced together in the proteasome by peptide bond formation. Accurately estimating the contribution of proteasome-spliced peptides (PSPs) to the global Human Leukocyte Antigen (HLA) ligandome is critical. A recent study suggested that PSPs contribute up to 30% of the HLA ligandome. We performed a thorough reanalysis of the reported results using multiple computational tools and various validation steps and concluded that only a fraction of the proposed PSPs passes the quality filters. To better estimate the actual number of PSPs, we present an alternative workflow. We performed de novo sequencing of the HLA-peptide spectra and discarded all de novo sequences found in the UniProt database. We checked whether the remaining de novo sequences could match spliced peptides from human proteins. The spliced sequences were appended to the UniProt FASTA file, which was searched by two search tools at an FDR of 1%. We find that 2-6% of the HLA ligandome could be explained as spliced protein fragments. The majority of these potential PSPs have good peptide-spectrum match properties and are predicted to bind the respective HLA molecules. However, it remains to be shown how many of these potential PSPs actually originate from proteasomal splicing events.
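The matching step described above (checking whether a de novo sequence that is absent from the database could still be explained as two fragments of one protein) can be sketched as follows. This is a minimal illustration, not the workflow used in the study: the function names, the restriction to two fragments from the same protein, the acceptance of either fragment order, and the toy protein sequence are all assumptions.

```python
def find_all(text, sub):
    """Yield every start index at which `sub` occurs in `text`."""
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + 1)


def spliced_explanations(peptide, protein):
    """List ways to explain `peptide` as two non-overlapping fragments of `protein`.

    Returns tuples (left, right, left_start, right_start). A peptide that occurs
    contiguously in the protein is not treated as spliced, so the result is empty.
    Assumption: both fragments come from the same protein, in either order.
    """
    if peptide in protein:
        return []
    hits = []
    for cut in range(1, len(peptide)):
        left, right = peptide[:cut], peptide[cut:]
        for i in find_all(protein, left):
            for j in find_all(protein, right):
                # Keep only fragment pairs that do not overlap in the protein.
                if j >= i + len(left) or j + len(right) <= i:
                    hits.append((left, right, i, j))
    return hits


# Hypothetical toy protein, not a real UniProt entry.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
hits = spliced_explanations("TAYISFVK", protein)
for left, right, i, j in hits:
    print(f"{left}@{i} + {right}@{j}")
```

A single peptide can have several candidate explanations, as the toy example shows, which is one reason such matches are only potential PSPs until validated.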
Project description:The identification of promoter and upstream regulatory sequences is a key step towards understanding gene regulation; to this end, primer extension techniques, which can be performed transcriptome-wide, are the first step, as they allow the location of experimental transcription start sites and other features (such as
Project description:Our extended analysis of the HLA-presented antigen landscape in cervical cancer cells using an integrative proteogenomics approach identifies, next to tumour-associated antigens and tumour-specific neoantigens, presentation of viral canonical and alternative reading frame (ARF)-derived HLA-presented sequences, including peptides derived from HPV-E1.
Project description:HLA-I molecules bind short peptides and present them to CD8+ T cells for TCR recognition. The length of HLA-I ligands typically ranges from 8 to 12 amino acids, but high variability is observed between different alleles. Here we used recent HLA peptidomics data to analyze in an unbiased way peptide length distributions over 85 different HLA-I alleles. Our results revealed clear clustering of HLA-I alleles with distinct peptide length distributions, which enabled us to unravel some of the molecular basis of peptide length distributions and predict peptide length distributions based on HLA-I sequences only. We further took advantage of our collection of curated HLA peptidomics studies to investigate multiple specificity in HLA-I molecules and validated these observations with binding assays. Explicitly modeling peptide length distributions and multiple specificity significantly improved predictions of naturally presented HLA-I ligands, as demonstrated in an independent benchmarking based on ten newly generated HLA peptidomics datasets from meningioma samples.
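A peptide length distribution of the kind compared across alleles above can be computed as a normalized histogram of ligand lengths. The sketch below is only an illustration under stated assumptions: the function name, the 8 to 12 length window used as defaults, and the placeholder sequences are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def length_distribution(peptides, min_len=8, max_len=12):
    """Return {length: fraction} for peptides whose length is in [min_len, max_len]."""
    counts = Counter(len(p) for p in peptides if min_len <= len(p) <= max_len)
    total = sum(counts.values())
    lengths = range(min_len, max_len + 1)
    if total == 0:
        return {n: 0.0 for n in lengths}
    return {n: counts[n] / total for n in lengths}


# Placeholder sequences (not real HLA-I ligands): one 8-mer, two 9-mers, one 10-mer.
toy_ligands = ["AAAAAAAA", "AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDDD"]
dist = length_distribution(toy_ligands)
print(dist)
```

Computing one such distribution per allele gives vectors that can then be compared or clustered.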
Project description:We combined electrophoresis and nanopore sequencing to analyze MarathonRT velocity over the heterogeneous sequences and structures of the HOTAIR RNA template. We reveal that the local sequences and structures of the template have a negligible effect on MarathonRT, and that MarathonRT can copy the long RNA in a single turnover, which leads to unusually synchronized primer extension at a constant speed of 25 nt/sec. We further demonstrate that ultra-stable RNA structure insertions do not obstruct MarathonRT, suggesting that MarathonRT can immediately disrupt any RNA structures within the template.
Project description:Background. Assessment of non-HLA variants alongside standard HLA testing was previously shown to improve the identification of potential coeliac disease (CD) patients. We aimed to identify new genetic variants associated with CD in the Polish population that would improve CD risk prediction when used alongside HLA haplotype analysis. Results. Association analysis using four HLA-tagging SNPs showed that, as in other populations, positive predicting genotypes (HLA-DQ2.5/DQ2.5, HLA-DQ2.5/DQ2.2, and HLA-DQ2.5/DQ8) were found at higher frequencies in CD patients than in healthy control individuals in the Polish population. Both CD-associated SNPs discovered by GWAS were found in the CD susceptibility region, confirming the previously determined association of the major histocompatibility complex (MHC) region with CD pathogenesis. The two most significant SNPs from the GWAS were rs9272346 (HLA-dependent; localized within 1 kb of DQA1) and rs3130484 (HLA-independent; mapped to MSH5). Specificity of CD prediction using the four HLA-tagging SNPs reached 92.9%, but sensitivity was only 45.5%. However, when a testing combination of the HLA-tagging SNPs and the MSH5 SNP was used, specificity decreased to 80%, and sensitivity increased to 74%.
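The trade-off reported above, where adding a marker raises sensitivity and lowers specificity, can be illustrated with a small calculation. This is a sketch under assumptions: the eight individuals and their marker calls are invented toy data, and combining the two tests with a logical OR is an assumption, since the abstract does not state how the combination was scored.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) from parallel lists of booleans."""
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)            # patients called positive
    fn = sum(t and not p for t, p in pairs)        # patients called negative
    tn = sum(not t and not p for t, p in pairs)    # controls called negative
    fp = sum(not t and p for t, p in pairs)        # controls called positive
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy cohort: 4 patients followed by 4 controls (not study data).
is_patient = [True, True, True, True, False, False, False, False]
hla_call   = [True, True, False, False, False, False, False, False]
extra_call = [False, False, True, False, True, False, False, False]

# Assumed combination rule: positive if either test is positive.
combined = [a or b for a, b in zip(hla_call, extra_call)]

hla_metrics = sensitivity_specificity(is_patient, hla_call)
combined_metrics = sensitivity_specificity(is_patient, combined)
print("HLA only:", hla_metrics)
print("Combined:", combined_metrics)
```

Under an OR rule, sensitivity can only stay the same or rise and specificity can only stay the same or fall, which matches the direction of the change described in the results.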