ABSTRACT:
Motivation
Linkage disequilibrium (LD) matrices derived from large populations are widely used in population genetics for fine-mapping, LD score regression, and linear mixed models for genome-wide association studies (GWAS). However, these matrices can become very large when they are derived from millions of individuals; hence, moving, sharing, and extracting granular information from this amount of data can be cumbersome.
Results
We sought to address the need for compressing and easily querying large LD matrices by developing LDmat. LDmat is a standalone tool that compresses large LD matrices into an HDF5 file format and queries the compressed matrices. It can extract submatrices corresponding to a sub-region of the genome, a list of selected loci, or loci within a minor allele frequency range. LDmat can also rebuild the original file formats from the compressed files.
Availability and implementation
LDmat is implemented in Python and can be installed on Unix systems with the command 'pip install ldmat'. It can also be accessed through https://github.com/G2Lab/ldmat and https://pypi.org/project/ldmat/.
Supplementary information
Supplementary data are available at Bioinformatics online.
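
The three kinds of query named in the Results (a genomic sub-region, a list of selected loci, a minor allele frequency range) can be illustrated with a minimal sketch. This is not LDmat's API or its HDF5 layout: the function names `select_loci` and `submatrix`, the dictionary that stores only the upper triangle, and the toy positions, frequencies, and LD values are all hypothetical and chosen purely for illustration.

```python
# Hypothetical illustration only: NOT LDmat's API or its on-disk HDF5 layout.
from typing import Dict, List, Optional, Set, Tuple

# An LD matrix is symmetric, so only the upper triangle (i <= j) is stored.
SparseLD = Dict[Tuple[int, int], float]


def select_loci(
    positions: List[int],
    mafs: List[float],
    start: Optional[int] = None,
    end: Optional[int] = None,
    min_maf: float = 0.0,
    max_maf: float = 0.5,
    keep: Optional[Set[int]] = None,
) -> List[int]:
    """Return the indices of loci that pass every filter that was supplied."""
    chosen = []
    for idx, (pos, maf) in enumerate(zip(positions, mafs)):
        if start is not None and pos < start:
            continue  # left of the requested region
        if end is not None and pos > end:
            continue  # right of the requested region
        if not (min_maf <= maf <= max_maf):
            continue  # outside the minor allele frequency range
        if keep is not None and pos not in keep:
            continue  # not in the explicit list of loci
        chosen.append(idx)
    return chosen


def submatrix(ld: SparseLD, indices: List[int]) -> List[List[float]]:
    """Rebuild a dense symmetric submatrix; absent entries are treated as 0."""
    dense = []
    for i in indices:
        row = []
        for j in indices:
            key = (i, j) if i <= j else (j, i)  # mirror the lower triangle
            row.append(ld.get(key, 0.0))
        dense.append(row)
    return dense


# Toy data (made up): four loci with base-pair positions and allele frequencies.
positions = [100, 250, 400, 900]
mafs = [0.05, 0.30, 0.12, 0.45]
ld = {(0, 0): 1.0, (0, 1): 0.8, (1, 1): 1.0,
      (1, 2): 0.4, (2, 2): 1.0, (3, 3): 1.0}

region = select_loci(positions, mafs, start=200, end=500)
region_ld = submatrix(ld, region)
```

Storing one triangle and dropping near-zero entries is one common way to shrink a symmetric, mostly sparse matrix; whether and how LDmat does this is described in the paper and repository, not here.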
SUBMITTER: Weiner RJ
PROVIDER: S-EPMC9969815 | biostudies-literature | 2023 Feb
REPOSITORIES: biostudies-literature

Weiner, Rockwell J; Lakhani, Chirag; Knowles, David A; Gürsoy, Gamze
Bioinformatics (Oxford, England) 20230201 2