Project description:This project presents a comprehensive genotyping dataset for 224 healthy Kazakh individuals. Genomic DNA was collected from local medical stations across Kazakhstan and genotyped using the Illumina Infinium SNP Genotyping Array GSA MG v2. The study contributes to understanding the genetic variation of the Kazakh population and highlights single nucleotide variants with biomedical significance.
2025-05-26 | PRJEB89820 | EVA
Project description:WGS and analysis of hundreds of Kazakh individuals
Project description:Esophageal squamous cell carcinoma (ESCC) is a metabolic disorder characterized by lipid metabolic reprogramming. To investigate the regional characteristics and lipid metabolism of ESCC patients in Xinjiang, China, we characterized the serum lipid composition of Kazakh ESCC patients through an integrated analysis of transcriptomic and lipidomic data. Serum samples from 30 Kazakh ESCC patients and 30 healthy individuals were subjected to targeted lipid metabolomics analysis via UPLC‒MS/MS, while 3 tumor samples and matched adjacent normal tissues from 30 ESCC patients were subjected to transcriptome analysis. Compared with the healthy group, the ESCC patients showed marked changes in serum lipid subclass content, chain length and unsaturation. Integrated lipidomic and transcriptomic analyses revealed that unsaturated fatty acid biosynthesis, fatty acid metabolism, lipid degradation, cholesterol metabolism and the AMPK signaling pathway were enriched in tumor tissues. In addition, RT–qPCR results demonstrated that genes closely related to these pathways were differentially expressed between the ESCC group and the healthy control group. Considering the key role of AMPK in lipid metabolism, we conducted a targeted lipid metabolomics analysis of AMPK-knockdown esophageal cancer cells by UPLC‒MS/MS. These findings suggest that AMPK may be correlated with lipid metabolism in Kazakh ESCC patients and point to AMPK and other lipid metabolism-related markers as potential therapeutic targets against ESCC progression.
Project description:Investigate long non-coding RNA (lncRNA) expression characteristics in the peripheral blood lymphocytes of Xinjiang Kazakh people with essential hypertension.
Project description:Estimating the relationships between individuals is a fundamental challenge in many fields. In particular, relationship estimation can provide valuable information in missing persons cases. The recently developed investigative genetic genealogy approach uses high-density single nucleotide polymorphisms (SNPs) to determine close and more distant relationships, with hundreds of thousands to tens of millions of SNPs generated either by microarray genotyping or by whole-genome sequencing. Current studies usually assume that SNP profiles are generated with minimal error. In missing persons cases, however, the DNA samples can be highly degraded, and the SNP profiles generated from them usually contain many errors. In this study, a robust machine learning approach was developed for estimating relationships from high-error SNP profiles. The approach uses a hierarchical classification strategy that first classifies relationships by degree and then classifies the relationship types within each degree separately. Feature selection was applied to each classification to improve performance. Both simulated and real data sets with various genotyping error rates were used to evaluate the approach, which was more accurate and more robust than the individual measures for SNP profiles with genotyping errors. In addition, the highest accuracy was obtained when the training and test sets had the same genotyping error rate, so estimating the genotyping error rate of the SNP profiles is critical to accurate relationship estimation.
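The two-stage strategy described in the entry above (degree first, then relationship type within the predicted degree, with feature selection at each stage) can be illustrated with a minimal sketch. This is not the study's implementation: the nearest-centroid classifier, the variance-ratio feature score, the three numeric features, the class names and the toy values are all assumptions chosen only to show the control flow.

```python
from collections import defaultdict
from statistics import mean, pvariance


def select_features(X, y, k):
    """Keep the k features with the highest between/within class variance ratio."""
    scores = []
    for j in range(len(X[0])):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row[j])
        between = pvariance([mean(v) for v in groups.values()])
        within = mean([pvariance(v) for v in groups.values()])
        scores.append((between / (within + 1e-12), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


class CentroidClassifier:
    """Nearest-centroid classifier on the selected features (assumed stand-in model)."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.cols = select_features(X, y, self.k)
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append([row[j] for j in self.cols])
        self.centroids = {
            label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, row):
        x = [row[j] for j in self.cols]
        return min(
            self.centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, self.centroids[lab])),
        )


class HierarchicalRelationshipClassifier:
    """Stage 1 predicts the degree; stage 2 predicts the type within that degree."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, X, degrees, types):
        self.degree_clf = CentroidClassifier(self.k).fit(X, degrees)
        self.type_clfs = {}
        for d in set(degrees):
            idx = [i for i, dd in enumerate(degrees) if dd == d]
            # Each degree gets its own classifier and its own feature selection.
            self.type_clfs[d] = CentroidClassifier(self.k).fit(
                [X[i] for i in idx], [types[i] for i in idx]
            )
        return self

    def predict_one(self, row):
        degree = self.degree_clf.predict_one(row)
        return degree, self.type_clfs[degree].predict_one(row)


# Toy, made-up pairwise features (hypothetical, not data from the study).
X = [
    [0.25, 0.00, 0.5], [0.24, 0.01, 0.4],   # parent-child
    [0.25, 0.25, 0.5], [0.26, 0.24, 0.6],   # full-sibling
    [0.12, 0.00, 0.1], [0.13, 0.01, 0.2],   # half-sibling
    [0.12, 0.00, 0.9], [0.13, 0.01, 0.8],   # grandparent-grandchild
]
degrees = [1, 1, 1, 1, 2, 2, 2, 2]
types = (["parent-child"] * 2 + ["full-sibling"] * 2
         + ["half-sibling"] * 2 + ["grandparent-grandchild"] * 2)

model = HierarchicalRelationshipClassifier(k=1).fit(X, degrees, types)
print(model.predict_one([0.25, 0.23, 0.5]))
print(model.predict_one([0.11, 0.00, 0.85]))
```

Because feature selection runs separately for each classifier, each stage can rely on a different measure, which is one plausible reading of why a per-classification selection step helps when some measures degrade under genotyping error.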
Project description:Genome-wide sequencing of the Kazakh horse of the Mugalzhar breed from the Aktobe region (Kazakhstan). DNA samples from 20 individuals were sequenced at 15X coverage.