Dataset Information

MIMt-23S: A new 23S reference database for taxonomic assignment of metagenomic samples


ABSTRACT: MIMt-23S is a database of prokaryotic 23S sequences drawn from Target Loci type material, RefSeq genomes and GenBank genomes. To build it, we collected all curated 23S sequences from Target Loci and appended the 23S regions (prokaryotic large subunit rRNA) predicted with RNAmmer-1.2 in all available RefSeq genomes. The result is a complete database in which most sequences have been manually curated by RefSeq curators and are identified at species level, or even at subspecies or strain level. The full version of MIMt-23S additionally contains 23S regions from the genomes of new species deposited in GenBank, always keeping the full 23S region and the exact species name so that the full taxonomic classification can be retrieved. Thus, all sequences in both versions of MIMt-23S are full-length large subunit rRNA sequences identified at all taxonomic levels. A classifier trained on MIMt-23S for use in QIIME is also provided.

The database format is:

>SeqID K__kingdom;P__phylum;C__class;O__order;F__family;G__genus;S__Genus_species
CGCGACTACGACTACGCTCAGACGCATCGTACGCAGACTACGTCAGTCAGACGTCGCTGCTCGTCGTACGTACGCT

A file containing only the taxonomy associated with each sequence is also available, in the format:

SeqID Full_taxonomy

A further file lists the species whose sequences are 100% identical, which programs therefore cannot tell apart when a taxonomic classification is performed. All files are available for both the curated-only version and the full version (which also includes the 23S regions predicted from GenBank genomes).
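The header format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the MIMt-23S distribution. It assumes that the sequence identifier and the lineage are separated by whitespace and that a sequence may span several lines. The function names `parse_header` and `read_fasta`, the identifier `Seq1`, and the example lineage are hypothetical and chosen only for illustration.

```python
# Rank prefixes used in the MIMt-23S header, mapped to readable names.
RANKS = {"K": "kingdom", "P": "phylum", "C": "class", "O": "order",
         "F": "family", "G": "genus", "S": "species"}


def parse_header(header):
    """Split a header line into (seq_id, {rank_name: taxon}).

    Assumption: the identifier and the lineage are separated by whitespace.
    """
    seq_id, lineage = header.lstrip(">").strip().split(None, 1)
    taxa = {}
    for field in lineage.split(";"):
        prefix, _, name = field.partition("__")
        if prefix in RANKS and name:
            taxa[RANKS[prefix]] = name
    return seq_id, taxa


def read_fasta(lines):
    """Yield (seq_id, taxa, sequence) for each record in an iterable of lines."""
    seq_id, taxa, chunks = None, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, taxa, "".join(chunks)
            seq_id, taxa = parse_header(line)
            chunks = []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, taxa, "".join(chunks)


# Illustrative record only; it is not taken from the database.
example = [
    ">Seq1 K__Bacteria;P__Pseudomonadota;C__Gammaproteobacteria;"
    "O__Enterobacterales;F__Enterobacteriaceae;G__Escherichia;S__Escherichia_coli",
    "CGCGACTACGACTACGCTCAGACG",
    "CATCGTACGCAGACTACG",
]
records = list(read_fasta(example))
```

To process a downloaded file, an open file handle can be passed to `read_fasta` in place of the example list, since the function only needs an iterable of text lines.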

ORGANISM(S): Prokaryotes

SUBMITTER:  

PROVIDER: S-BSST2014 | biostudies-other |

REPOSITORIES: biostudies-other

Similar Datasets

| S-BSST2008 | biostudies-other
| S-BSST2015 | biostudies-other
| S-BSST2009 | biostudies-other
| S-EPMC3753567 | biostudies-literature
| S-EPMC5349245 | biostudies-literature
2015-05-01 | GSE58431 | GEO
| S-EPMC12689518 | biostudies-literature
| S-EPMC4702849 | biostudies-literature
| S-EPMC11417245 | biostudies-literature