ABSTRACT:
SUBMITTER: Zhai J
PROVIDER: S-EPMC11185591 | biostudies-literature | 2024 Jun
REPOSITORIES: biostudies-literature
Jingjing Zhai, Aaron Gokaslan, Yair Schiff, Ana Berthel, Zong-Yan Liu, Wei-Yun Lai, Zachary R Miller, Armin Scheben, Michelle C Stitzer, M Cinta Romay, Edward S Buckler, Volodymyr Kuleshov
bioRxiv: the preprint server for biology, 2024-08-22
Interpreting function and fitness effects in diverse plant genomes requires transferable models. Language models (LMs) pre-trained on large-scale biological sequences can learn evolutionary conservation and, when fine-tuned on limited labeled data, offer better cross-species prediction than supervised models. We introduce PlantCaduceus, a plant DNA LM based on the Caduceus and Mamba architectures, pre-trained on a curated dataset of 16 angiosperm genomes. Fine-tuning PlantCaduceus on limited labe ...