Protein language models trained on multiple sequence alignments learn phylogenetic relationships
Related publications (39)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
The evolutionary relationships between organisms or phylogenies are fundamental to biology. They are invaluable as guiding tools to mine, organize and exploit the enormous amounts of biological data in the post-genomic era. The advent of high-throughput se ...
Correlation-based alignment is an alternative alignment method for electron beam lithography. Using complex marker patterns, such as Penrose patterns, which contain more positional information, greater alignment accuracy can be achieved. Correlation-based ...
For a long time, natural language processing (NLP) has relied on generative models with task specific and manually engineered features. Recently, there has been a resurgence of interest for neural networks in the machine learning community, obtaining state ...
Networks are central in the modeling and analysis of many large-scale human and technical systems, and they have applications in diverse fields such as computer science, biology, social sciences, and economics. Recently, network mining has been an active a ...
Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have be ...
Background: The HIV-1 genome is subject to pressures that target the virus resulting in escape and adaptation. On the other hand, there is a requirement for sequence conservation because of functional and structural constraints. Mapping the sites of select ...
Sequence motifs are of greater biological importance in nucleotide and protein sequences. The conserved occurrence of identical motifs represents the functional significance and helps to classify the biological sequences. In this paper, a new algorithm is ...
The replication of Moloney murine leukaemia virus relies on the formation of a stable homodimeric 'kissing complex' of a GACG tetraloop interacting through only two C center dot G base pairs flanked of 5'-adjacent Unpaired adenosines A9. Previous NMR inves ...
Background: Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting d ...
Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multiprotein complexes and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution betwee ...
Proceedings of the National Academy of Sciences2016