ABSTRACT:
SUBMITTER: Madani A
PROVIDER: S-EPMC10400306 | biostudies-literature | 2023 Aug
REPOSITORIES: biostudies-literature
AUTHORS: Madani A, Krause B, Greene ER, Subramanian S, Mohr BP, Holton JM, Olmos JL, Xiong C, Sun ZZ, Socher R, Fraser JS, Naik N
Nature Biotechnology, 2023-01-26, issue 8
Deep-learning language models have shown promise in various biotechnological applications, including protein design and engineering. Here we describe ProGen, a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics. The model was trained on 280 million protein sequences from >19,000 families and is augmented with control tags specifying pro ...