Dataset Information

Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci.


ABSTRACT: Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events, which produce many unknown isoforms. Here, we propose two approaches to address this complexity, integration and selection, both of which focus on the protein structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isoform ratio QTLs (i2-rQTLs), which alter protein structure, in six immune subsets. Second, we select CDS-incomplete isoforms annotated in GENCODE and identify 175-337 isoform-ratio QTLs (i-rQTLs). By comprehensive long-read capture RNA sequencing of these incomplete isoforms, we reveal 29 full-length isoforms with unannotated CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases.

SUBMITTER: Yamaguchi K 

PROVIDER: S-EPMC9402578 | biostudies-literature | 2022 Aug

REPOSITORIES: biostudies-literature

Publications

Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci.

Yamaguchi Kensuke, Ishigaki Kazuyoshi, Suzuki Akari, Tsuchida Yumi, Tsuchiya Haruka, Sumitomo Shuji, Nagafuchi Yasuo, Miya Fuyuki, Tsunoda Tatsuhiko, Shoda Hirofumi, Fujio Keishi, Yamamoto Kazuhiko, Kochi Yuta

Nature Communications, 2022 Aug 24, issue 1


Similar Datasets

| S-EPMC6499313 | biostudies-literature
| S-EPMC1378042 | biostudies-other
| S-EPMC4079314 | biostudies-literature
| S-EPMC2668052 | biostudies-literature
| S-EPMC4820289 | biostudies-literature
| S-EPMC5409091 | biostudies-literature
| S-EPMC6140573 | biostudies-literature
| S-EPMC8785530 | biostudies-literature
| S-EPMC8081720 | biostudies-literature
| S-EPMC7083320 | biostudies-literature