Large language models generate functional protein sequences across diverse families
Published 2023
Title
Large language models generate functional protein sequences across diverse families
Journal
NATURE BIOTECHNOLOGY
Publisher
Springer Science and Business Media LLC
Online
2023-01-27
DOI
10.1038/s41587-022-01618-2

Related references
Note: Only part of the references are listed.
- A backbone-centred energy function of neural networks for protein design
- (2022) Bin Huang et al. NATURE
- Protein sequence design with a learned potential
- (2022) Namrata Anand et al. Nature Communications
- ColabFold: making protein folding accessible to all
- (2022) Milot Mirdita et al. NATURE METHODS
- ProtGPT2 is a deep unsupervised language model for protein design
- (2022) Noelia Ferruz et al. Nature Communications
- Deep diversification of an AAV capsid protein by machine learning
- (2021) Drew H. Bryant et al. NATURE BIOTECHNOLOGY
- Protein sequence design by conformational landscape optimization
- (2021) Christoffer Norn et al. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
- Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations
- (2021) Payel Das et al. Nature Biomedical Engineering
- Fast and sensitive taxonomic assignment to metagenomic contigs
- (2021) M Mirdita et al. BIOINFORMATICS
- Low-N protein engineering with data-efficient deep learning
- (2021) Surojit Biswas et al. NATURE METHODS
- Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
- (2021) Alexander Rives et al. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
- Protein design and variant prediction using autoregressive generative models
- (2021) Jung-Eun Shin et al. Nature Communications
- Highly accurate protein structure prediction with AlphaFold
- (2021) John Jumper et al. NATURE
- De novo protein design by deep network hallucination
- (2021) Ivan Anishchenko et al. NATURE
- Current approaches for automated model building into cryo-EM maps using Buccaneer with CCP-EM
- (2020) Soon Wen Hoh et al. Acta Crystallographica Section D-Structural Biology
- Signal Peptides Generated by Attention-Based Neural Networks
- (2020) Zachary Wu et al. ACS Synthetic Biology
- Unified rational protein engineering with sequence-based deep representation learning
- (2019) Ethan C. Alley et al. NATURE METHODS
- On the catalytic mechanism of bacteriophage endolysins: Opportunities for engineering
- (2019) Michael J. Love et al. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS
- Overview of refinement procedures within REFMAC5: utilizing data from different sources
- (2018) Oleg Kovalevskiy et al. Acta Crystallographica Section D-Structural Biology
- Catalytic diversity and cell wall binding repeats in the phage encoded endolysins
- (2018) Sebastian S. Broendum et al. MOLECULAR MICROBIOLOGY
- The EVcouplings Python framework for coevolutionary sequence analysis
- (2018) Thomas A Hopf et al. BIOINFORMATICS
- The coming of age of de novo protein design
- (2016) Po-Ssu Huang et al. NATURE
- De novo design of protein homo-oligomers with modular hydrogen-bond network-mediated specificity
- (2016) S. E. Boyken et al. SCIENCE
- Deep learning
- (2015) Yann LeCun et al. NATURE
- De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy
- (2015) Po-Ssu Huang et al. Nature Chemical Biology
- BetaCavityWeb: a webserver for molecular voids and channels
- (2015) Jae-Kwan Kim et al. NUCLEIC ACIDS RESEARCH
- Control over overall shape and size in de novo designed proteins
- (2015) Yu-Ru Lin et al. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
- Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models
- (2015) Richard R. Stein et al. PLoS Computational Biology
- Robust and accurate prediction of residue–residue interactions across protein interfaces using evolutionary information
- (2014) Sergey Ovchinnikov et al. eLife
- Pfam: the protein families database
- (2013) Robert D. Finn et al. NUCLEIC ACIDS RESEARCH
- Towards automated crystallographic structure refinement with phenix.refine
- (2012) Pavel V. Afonine et al. ACTA CRYSTALLOGRAPHICA SECTION D-BIOLOGICAL CRYSTALLOGRAPHY
- Principles for designing ideal protein structures
- (2012) Nobuyasu Koga et al. NATURE
- The NCBI Taxonomy database
- (2011) S. Federhen NUCLEIC ACIDS RESEARCH
- Direct-coupling analysis of residue coevolution captures native contacts across many protein families
- (2011) F. Morcos et al. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
- XDS
- (2010) Wolfgang Kabsch ACTA CRYSTALLOGRAPHICA SECTION D-BIOLOGICAL CRYSTALLOGRAPHY
- Features and development of Coot
- (2010) P. Emsley et al. ACTA CRYSTALLOGRAPHICA SECTION D-BIOLOGICAL CRYSTALLOGRAPHY
- Lessons from the lysozyme of phage T4
- (2010) Walter A. Baase et al. PROTEIN SCIENCE
- Learning generative models for protein fold families
- (2010) Sivaraman Balakrishnan et al. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS
- Evaluation at atomic resolution of the role of strain in destabilizing the temperature-sensitive T4 lysozyme mutant Arg 96 → His
- (2009) Blaine H. M. Mooers et al. PROTEIN SCIENCE
- Graphical Models of Residue Coupling in Protein Families
- (2008) J. Thomas et al. IEEE-ACM Transactions on Computational Biology and Bioinformatics
- Identification of direct residue contacts in protein-protein interaction by message passing
- (2008) M. Weigt et al. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA