
Dataset Information


RaPID-Query for Fast Identity by Descent Search and Genealogical Analysis.


ABSTRACT:

Motivation

Due to the rapid growth of genetic databases, genealogical search, a process of inferring familial relatedness by identifying DNA matches, has become a viable approach to help individuals find missing family members or law enforcement agencies locate suspects. A fast and accurate method is needed to search an out-of-database individual against millions of individuals. Most existing approaches only offer all-vs-all within-panel matching. Some prototype algorithms offer one-vs-all queries for an out-of-panel individual, but they do not tolerate errors.

Results

A new method, random projection-based identical-by-descent (IBD) detection (RaPID) query, is introduced to make fast genealogical search possible. RaPID-Query identifies IBD segments between a query haplotype and a panel of haplotypes. By integrating matches over multiple PBWT indexes, RaPID-Query quickly locates IBD segments of at least a given cutoff length while allowing mismatched sites. A single query against all UK Biobank autosomal chromosomes was completed within 2.76 seconds on average, with a minimum length of 7 cM and 700 markers. On a chromosome 20 sequencing panel with 86,265 sites, RaPID-Query simultaneously achieved a false negative rate of 0.016 and a false positive rate of 0.012. This is comparable to the state-of-the-art IBD detection methods TPBWT (out-of-sample) and Hap-IBD. The high-quality IBD segments yielded by RaPID-Query were able to distinguish up to the fourth degree of familial relatedness for a given individual pair, with area under the receiver operating characteristic curve values of at least 97.28%.
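To make the idea of integrating matches over several projected indexes more concrete, the following is a minimal toy sketch, not the RaPID-Query implementation. It assumes that each projection keeps one randomly chosen site per fixed-size window, and it replaces the PBWT index with a naive linear scan of the panel for readability. All function names and parameters (`window`, `n_projections`, `min_windows`, `min_hits`) are hypothetical and chosen only for illustration.

```python
import random


def choose_sites(n_sites, window, rng):
    """Pick one random site index from each consecutive window of sites."""
    return [rng.randrange(start, min(start + window, n_sites))
            for start in range(0, n_sites, window)]


def matching_runs(a, b, min_len):
    """Return (start, end) pairs, end exclusive, of maximal runs where
    a[i] == b[i] and the run is at least min_len long."""
    found = []
    start = None
    for i in range(len(a) + 1):
        same = i < len(a) and a[i] == b[i]
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                found.append((start, i))
            start = None
    return found


def toy_query(panel, query_hap, window=4, n_projections=10,
              min_windows=3, min_hits=2, seed=0):
    """Report, per panel haplotype, site intervals supported by at least
    min_hits projections (assumption: a naive scan stands in for PBWT)."""
    rng = random.Random(seed)
    n_sites = len(query_hap)
    n_windows = (n_sites + window - 1) // window
    hits = [[0] * n_windows for _ in panel]

    for _ in range(n_projections):
        sites = choose_sites(n_sites, window, rng)
        projected_query = [query_hap[s] for s in sites]
        for idx, hap in enumerate(panel):
            projected_hap = [hap[s] for s in sites]
            # Exact matching on the projection; a mismatched site is
            # tolerated whenever the projection did not sample it.
            for start, end in matching_runs(projected_query, projected_hap,
                                            min_windows):
                for w in range(start, end):
                    hits[idx][w] += 1

    results = {}
    for idx, counts in enumerate(hits):
        supported = [c >= min_hits for c in counts]
        segments = matching_runs(supported, [True] * n_windows, min_windows)
        if segments:
            results[idx] = [(s * window, min(e * window, n_sites))
                            for s, e in segments]
    return results


query_hap = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
identical = list(query_hap)
complement = [1 - allele for allele in query_hap]
one_mismatch = list(query_hap)
one_mismatch[5] = 1 - one_mismatch[5]

panel = [identical, complement, one_mismatch]
result = toy_query(panel, query_hap)
print(result)
```

In this toy setting, the identical haplotype is matched over its full length in every projection, and the complementary haplotype is never matched. The haplotype with a single mismatch is reported only if enough projections happen to skip the mismatched site, which is how a projection-and-vote scheme can tolerate errors while still relying on exact matching within each index.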

Availability

The RaPID-Query program is available at https://github.com/ucfcbb/RaPID-Query.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Wei Y 

PROVIDER: S-EPMC10244210 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature


Publications

RaPID-Query for fast identity by descent search and genealogical analysis.

Wei Y, Naseri A, Zhi D, Zhang S

Bioinformatics (Oxford, England), 2023-06-01, issue 6



Similar Datasets

| S-EPMC6612857 | biostudies-literature
| S-EPMC11741331 | biostudies-literature
| S-EPMC11796457 | biostudies-literature
| S-EPMC11275886 | biostudies-literature
| S-EPMC3035716 | biostudies-literature
| S-EPMC7750976 | biostudies-literature
| S-EPMC4315301 | biostudies-literature
| S-EPMC10718418 | biostudies-literature
| S-EPMC8192555 | biostudies-literature
| S-EPMC3494893 | biostudies-literature