Dataset Information

Liftoff: accurate mapping of gene annotations.


ABSTRACT: Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of these genomes, annotation of gene features and other functional elements is essential; however, for most species, only the reference genome is well-annotated. One strategy to annotate new or improved genome assemblies is to map or 'lift over' the genes from a previously-annotated reference genome. Here we describe Liftoff, a new genome annotation lift-over tool capable of mapping genes between two assemblies of the same or closely-related species. Liftoff aligns genes from a reference genome to a target genome and finds the mapping that maximizes sequence identity while preserving the structure of each exon, transcript, and gene. We show that Liftoff can accurately map 99.9% of genes between two versions of the human reference genome with an average sequence identity >99.9%. We also show that Liftoff can map genes across species by successfully lifting over 98.3% of human protein-coding genes to a chimpanzee genome assembly with 98.2% sequence identity. Liftoff can be installed via bioconda and PyPI. Additionally, the source code for Liftoff is available at https://github.com/agshumate/Liftoff. Supplementary data are available at Bioinformatics online.
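
The idea of choosing the mapping that maximizes sequence identity can be illustrated with a toy sketch. This is not Liftoff's implementation: the function names, the candidate loci, and the pre-aligned sequences below are all hypothetical, and the sketch ignores the exon, transcript, and gene structure constraints that the abstract describes. It only shows how percent identity might be computed over an existing alignment and used to rank candidate placements.

```python
def percent_identity(aligned_ref: str, aligned_tgt: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.

    Assumes the two strings are already aligned (equal length, '-' for gaps).
    Gap columns count as mismatches in this simplified definition.
    """
    if len(aligned_ref) != len(aligned_tgt):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        return 0.0
    matches = sum(
        1 for r, t in zip(aligned_ref, aligned_tgt) if r == t and r != "-"
    )
    return 100.0 * matches / len(aligned_ref)


def best_mapping(candidates):
    """Return the (locus, aligned_ref, aligned_tgt) tuple with the highest identity."""
    return max(candidates, key=lambda c: percent_identity(c[1], c[2]))


# Hypothetical candidate placements of one reference gene on a target assembly.
candidates = [
    ("chr1:1000-1011", "ATGGCCAAGTGA", "ATGGCCAAGTGA"),
    ("chr5:500-511", "ATGGCCAAGTGA", "ATGACC-AGTGA"),
]

locus, ref_seq, tgt_seq = best_mapping(candidates)
print(locus, round(percent_identity(ref_seq, tgt_seq), 1))
```

In this toy input the first candidate is an exact match and the second contains one substitution and one gap, so the first placement is selected.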

SUBMITTER: Shumate A 

PROVIDER: S-EPMC8289374 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature

Publications

Liftoff: accurate mapping of gene annotations.

Alaina Shumate, Steven L. Salzberg

Bioinformatics (Oxford, England), 2021 Jul 1, issue 12


Motivation: Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of these genomes, annotation of gene features and other functional elements is essential; however, for most species, only the reference genome is well-annotated.

Results: One strategy to annotate new or improved genome assemblies is to map or 'lift over' the genes from a previou ...

Similar Datasets

| S-EPMC3337258 | biostudies-literature
| S-EPMC4460903 | biostudies-literature
| S-EPMC3243146 | biostudies-literature
| S-EPMC11769679 | biostudies-literature
| S-EPMC4222628 | biostudies-literature
| S-EPMC11525321 | biostudies-literature
| S-EPMC8624953 | biostudies-literature
| S-EPMC2686450 | biostudies-literature
| S-EPMC2652876 | biostudies-literature
| S-EPMC4339237 | biostudies-literature