Dataset Information

Preprocessed reduced representation bisulfite sequencing (RRBS) data from 173 human umbilical cord blood samples, collected for a study on associations between perinatal DNA methylation marks and progression to type 1 diabetes by age 15

ABSTRACT: The samples were collected from participants of the Finnish Diabetes Prediction and Prevention (DIPP) Study, born between 1995 and 2006. DIPP is a prospective follow-up cohort of children with a moderate or high risk of type 1 diabetes, based on the HLA-DR-DQ genotype. Islet cell autoantibodies (ICA, GADA, IAA, IA2A and ZnT8A) were measured 1-4 times per year until age 15 or the year 2018. The aim was to study associations between perinatal DNA methylation marks and later progression to type 1 diabetes. Case individuals, who became persistently positive for at least two biochemical autoantibodies (GADA, IAA, IA2A or ZnT8A) and/or were diagnosed with type 1 diabetes during the follow-up, were compared to control individuals, who remained autoantibody-negative throughout the follow-up. These data were also used in the development of data analysis methodology for bisulfite sequencing studies.

To protect the privacy of the study participants, the sequence read data are not publicly available. However, the processed data can be downloaded here. These include two count matrices: "methylated_reads" and "total_reads". The matrix "methylated_reads" contains the methylated read counts at each high-coverage CpG site (altogether approximately 2.5 million rows) in each of the 173 samples (173 columns), and the matrix "total_reads" contains the corresponding total read counts (coverage). Please note that the methylated read counts are read counts, not percentages. Methylation proportions can be calculated as methylated_reads/total_reads. The row names are the genomic locations of these CpG sites in hg19 (GRCh37) coordinates (1,2).

For privacy reasons, all potential SNPs were excluded from these publicly available count matrices. Specifically, we removed all common (minor allele frequency > 1 %) human SNPs, as listed in dbSNP (3). We also removed all SNPs that were detected in one or more samples, even with "low" evidence, by BS-SNPer, which is software for detecting SNPs from bisulfite sequencing data (4). Altogether, 204443 out of 2752981 rows were removed from the original coverage-filtered count matrices that were analyzed in the present study.

Description of the sample attributes:

Individual: The individual-specific identifiers, such as "Subject1". Since each sample is from a different individual, these correspond to the sample identifiers (Subject1 == Sample 1 etc.).

Experimental Group: The variable of interest (called "class" in the associated publications) with three possible values: 1) case, 2) control and 3) NA (neither case nor control).
1) Case: became persistently positive for at least two biochemical autoantibodies (GADA, IAA, IA2A or ZnT8A) and/or was diagnosed with type 1 diabetes during the follow-up.
2) Control: remained autoantibody-negative throughout the follow-up.
3) NA: The remaining 51 individuals with a missing value ("NA") did not qualify as cases or controls, since they were either persistently positive for only one biochemical autoantibody or transiently positive for one or more autoantibodies. We excluded these 51 individuals from the case-control comparison but included them in the comparison between the sexes.

Library preparation batch: The sequencing libraries were prepared in 7 batches. The names of the batches do not have any special meaning. That is, "1A" is not necessarily more similar to "1B" than it is to "3B". We treated this as a categorical technical variable with 7 categories.

PC1 and PC2: Projections of the sample-specific methylation proportion vectors on the first two orthonormal principal components. The principal component analysis (PCA) was performed on the original coverage-filtered methylation proportion matrix (methylated/total reads), where missing values at each CpG site were imputed by the median over the samples with non-missing values. The original methylation proportion matrix included 2752981 rows, whereas these publicly available matrices include 2548538 rows (all potential SNPs excluded). Hence, PCA on the publicly available data would result in slightly different values for PC1 and PC2. We included these as covariates in the differential methylation analysis to represent technical variation (in addition to the library preparation batches).

References
1. Church DM, Schneider VA, Graves T, Auger K, Cunningham F, Bouk N, et al. Modernizing reference genome assemblies. PLoS Biol. 2011 Jul;9(7):e1001091.
2. Genome Reference Consortium. NCBI downloads: https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz, accessed Feb 10th, 2019.
3. NCBI. dbSNP: https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/common_all_20180423.vcf.gz, accessed April 29th, 2021.
4. Gao S, Zou D, Mao L, Liu H, Song P, Chen Y, et al. BS-SNPer: SNP calling in bisulfite-seq data. Bioinformatics. 2015 Dec 15;31(24):4006–8.
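The steps described in the abstract (methylation proportions as methylated_reads/total_reads, per-site median imputation, and projection onto the first two principal components) can be sketched roughly as follows. This is only an illustrative sketch on toy arrays, not the submitters' pipeline: the function names are hypothetical, the on-disk format of the count matrices is not stated in this record, and centering each CpG site before the SVD is an assumption about how the PCA was set up.

```python
import numpy as np


def methylation_proportions(methylated, total):
    """Return methylated/total per site and sample; NaN where coverage is 0."""
    methylated = np.asarray(methylated, dtype=float)
    total = np.asarray(total, dtype=float)
    props = np.full(total.shape, np.nan)
    # Divide only where there is coverage; other entries stay NaN (missing).
    np.divide(methylated, total, out=props, where=total > 0)
    return props


def impute_site_median(props):
    """Replace missing values at each CpG site (row) by that site's median
    over the samples with non-missing values. Rows with no observed value
    at all would stay NaN; the toy data below has none."""
    medians = np.nanmedian(props, axis=1, keepdims=True)
    return np.where(np.isnan(props), medians, props)


def first_two_pcs(props):
    """Project samples onto the first two principal components.
    Assumption: samples are the observations and each site is centered."""
    x = props.T                              # samples x sites
    x = x - x.mean(axis=0, keepdims=True)    # center each site
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :2] * s[:2]                  # sample scores on PC1 and PC2


# Toy stand-ins for "methylated_reads" and "total_reads":
# rows are CpG sites, columns are samples (the real data has 173 columns).
methylated_reads = np.array([[5, 0, 2],
                             [10, 3, 0],
                             [1, 1, 1],
                             [0, 8, 4]])
total_reads = np.array([[10, 0, 4],
                        [10, 6, 5],
                        [2, 4, 8],
                        [5, 8, 8]])

props = methylation_proportions(methylated_reads, total_reads)
imputed = impute_site_median(props)
scores = first_two_pcs(imputed)   # one row per sample, columns PC1 and PC2
```

As the abstract notes, scores computed from the public matrices would differ slightly from the deposited PC1 and PC2 values, because the public matrices exclude the SNP-overlapping rows. The sign of each component is also arbitrary in any SVD-based PCA.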

INSTRUMENT(S): Illumina HiSeq 2500

ORGANISM(S): Homo sapiens

SUBMITTER: Essi Laajala

PROVIDER: E-MTAB-10530 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Similar Datasets

Project description: Type 1 diabetes mellitus (T1DM) results from immune-mediated destruction of pancreatic beta cells. However, the clinical and immunologic phenotypes of T1DM are variable. Several auto-antibodies, including GADA, IA-2A, and ZnT8A, have been identified in T1DM, but their prevalence varies across the broad spectrum of T1DM. Here, we systematically profiled auto-antibodies from serum samples of 16 T1DM patients, 16 type 2 diabetes (T2DM) patients, and 27 healthy controls with normal glucose tolerance (NGT) using protein microarrays containing 9,480 proteins. Among the 9,480 different proteins on the array, we identified novel auto-antibody candidates (EEF1A1-AAb and UBE2L3-AAb) by M-test coupled with PLS-DA. These auto-antibodies were more prevalent in T1DM patients than in controls and were detected in 40% of T1DM patients without GADA. Furthermore, these auto-antibodies might help to differentiate subtypes of T1DM when combined with GADA. These novel auto-antibodies provide new diagnostic information on T1DM, as well as new insights into its pathogenesis. Auto-antibodies from serum samples were profiled using a high-density, fluorescence-based protein microarray containing duplicate spots of 9,480 human proteins derived from the Ultimate ORF collection. The cohort of patients and controls consisted of 16 T1DM patients, 16 T2DM patients, and 27 healthy controls with NGT. This cohort was used to screen candidate auto-antibodies using protein microarrays (ProtoArray platform version 5.0, Invitrogen Corp., Carlsbad, CA). Serum samples were drawn from T1DM patients who had 1) a fasting C-peptide level <0.3 nmol/L or a serum C-peptide level <0.6 nmol/L after glucagon loading, 2) initiation of insulin treatment within six months after diagnosis, and 3) a duration of diabetes ≤12 months. The mean age of the T1DM patients in the first cohort was 42 ± 16 years. The control serum samples were obtained from T2DM patients who had been treated only with oral anti-diabetic drugs for at least 5 years and from NGT individuals who had no history of diabetes, no first-degree relatives with diabetes, a fasting plasma glucose concentration of <6.1 mmol/L, and an HbA1c value of <5.8%.
