Dataset Information

Beyond Homozygosity Mapping: Family-Control analysis based on Hamming distance for prioritizing variants in exome sequencing.


ABSTRACT: A major challenge in current exome sequencing of autosomal recessive (AR) families is the lack of an effective method to prioritize single-nucleotide variants (SNVs). AR families are generally too small for linkage analysis, and the length of homozygous regions is an unreliable guide to causative variants. Common filtering steps usually yield a list of candidate variants that cannot be narrowed down further or ranked. To prioritize shortlisted SNVs, we consider each homozygous candidate variant together with a set of SNVs flanking it. We compare the resulting array of genotypes between an affected family member and a number of control individuals and argue that, within a family, the differences between the family member and the controls should be larger for a pathogenic variant and its flanking SNVs than for a random variant. We measure the difference between the arrays of two individuals by the Hamming distance and develop a suitable test statistic, which is expected to be large for a causative variant and its flanking SNVs. We prioritize candidate variants based on this statistic. Applied to six patients with known pathogenic variants, the approach placed these variants in the top 2 to 10 percentiles of ranks.
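The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the test statistic, so the mean Hamming distance to the controls is used here purely as a stand-in. The genotype coding (0, 1, 2 copies of the alternate allele), the function names, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Genotypes = Sequence[int]  # assumed coding: 0, 1 or 2 copies of the alternate allele


def hamming_distance(a: Genotypes, b: Genotypes) -> int:
    """Number of positions at which two equal-length genotype arrays differ."""
    if len(a) != len(b):
        raise ValueError("genotype arrays must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_distance_to_controls(affected: Genotypes, controls: List[Genotypes]) -> float:
    """Stand-in score (an assumption, not the paper's statistic):
    average Hamming distance from the affected individual to each control."""
    if not controls:
        raise ValueError("at least one control is required")
    return sum(hamming_distance(affected, c) for c in controls) / len(controls)


def rank_candidates(
    windows: Dict[str, Tuple[Genotypes, List[Genotypes]]]
) -> List[Tuple[str, float]]:
    """Rank candidate variants by score, largest first.

    Each entry maps a (hypothetical) variant name to the affected individual's
    genotype array over the candidate and its flanking SNVs, plus the
    corresponding arrays for the controls.
    """
    scores = {
        name: mean_distance_to_controls(affected, controls)
        for name, (affected, controls) in windows.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: varA sits in a homozygous stretch unlike the controls,
# varB looks much like the controls.
example_windows = {
    "varA": ([2, 2, 2, 2, 2], [[0, 1, 0, 1, 0], [1, 0, 2, 0, 1]]),
    "varB": ([0, 2, 1, 0, 0], [[0, 2, 1, 0, 1], [0, 1, 1, 0, 0]]),
}

for name, score in rank_candidates(example_windows):
    print(name, score)
```

In this toy example the candidate whose flanking genotypes differ most from the controls ranks first, which is the behaviour the abstract expects of a causative variant.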

SUBMITTER: Imai A 

PROVIDER: S-EPMC5155624 | biostudies-literature | 2015 Jul

REPOSITORIES: biostudies-literature

Publications

Beyond Homozygosity Mapping: Family-Control analysis based on Hamming distance for prioritizing variants in exome sequencing.

Atsuko Imai, Akihiro Nakaya, Somayyeh Fahiminiya, Martine Tétreault, Jacek Majewski, Yasushi Sakata, Seiji Takashima, Mark Lathrop, Jurg Ott

Scientific Reports, 2015-07-06


Similar Datasets

| S-EPMC3248690 | biostudies-literature
| S-EPMC4231049 | biostudies-literature
| S-EPMC3291334 | biostudies-literature
| S-EPMC7655865 | biostudies-literature
2014-03-20 | GSE56043 | GEO
| S-EPMC3260153 | biostudies-literature
2010-11-01 | GSE21958 | GEO
| S-EPMC3234372 | biostudies-literature
2014-03-20 | E-GEOD-56043 | biostudies-arrayexpress