Imagine a world where the most pressing global issues, from climate change to disease treatment and plastic waste disposal, share a common solution. This might sound like science fiction, but advancements in Artificial Intelligence (AI), particularly in the realm of protein folding, suggest this could become a reality. The ability to accurately predict protein structures using AI is not just a scientific breakthrough; it’s a key that could unlock solutions to numerous challenges facing humanity.
The Protein Folding Problem: A Century-Old Mystery
For decades, determining the structure of proteins was one of the most significant challenges in biology. Understanding protein structures is crucial because a protein’s shape dictates its function. Think of proteins as tiny machines, each designed for a specific purpose within our bodies and the natural world. Their functionality depends heavily on their precise three-dimensional (3D) structure.
What are Proteins? What is Protein Folding?
Proteins begin as simple chains of amino acids. Each amino acid contains a central carbon atom bonded to a hydrogen atom, an amine group on one side, and a carboxyl group on the other. The fourth bond connects to one of 20 different side chains, which determines the specific amino acid.

[Figure: basic protein structure, showing the side chain]
These amino acids link together through peptide bonds, forming a long string. Electrostatic forces, hydrogen bonds, and interactions with the surrounding solvent then cause this string to coil and fold in on itself, ultimately defining the protein’s 3D structure.

[Figure: amino acids joined by peptide bonds, forming a long string]
Why does Protein Shape Matter?
The shape of a protein is critical to its function. For example, haemoglobin, the protein in red blood cells, has a specific binding site that allows it to efficiently carry oxygen throughout the body. Similarly, proteins in muscles change shape to facilitate movement and contraction. Understanding these shapes allows scientists to manipulate and optimize protein functions for various applications.
“These are machines, they need to be in their correct orientation in order to work together to move, for example, the proteins in your muscles. They change their shape a little bit in order to pull and contract.”
- John Jumper, 2024 Nobel Prize in Chemistry, Google DeepMind
The Traditional Approach: X-Ray Crystallography for the Protein Folding Problem
Historically, scientists determined protein structures using X-ray crystallography. This method involves creating a crystal from the protein and exposing it to X-rays. The resulting diffraction pattern is then used to deduce the protein’s structure. However, this process is time-consuming, expensive, and often challenging. It can cost tens of thousands of dollars per protein, and it may take years to determine the structure of just one protein.
“The first way protein structure was determined was by creating a crystal out of that protein. This was then exposed to x-rays to get a diffraction pattern, and then scientists would work backwards to try to figure out what shape of molecules would create such a pattern.”

The AI Breakthrough: AlphaFold and the Protein Revolution
The advent of AI has revolutionized the field of protein structure prediction. Algorithms like AlphaFold, developed by DeepMind, have drastically reduced the time and resources required to determine protein structures. This breakthrough has opened new avenues for solving global problems.
Reaching that point took decades. Predicting a structure by computation alone proved far harder than expected, but researchers kept trying.
The CASP Competition for the Protein Folding Problem
To accelerate progress in protein structure prediction, the Critical Assessment of Structure Prediction (CASP) competition was established in 1994. The challenge was to create a computer model that could predict a protein’s structure from its amino acid sequence. Predictions were scored from 0 to 100 on how closely they matched the experimentally determined structure, and a score above 90 was considered a solved structure.
“The challenge was simple, to design a computer model that could take an amino acid sequence and output its structure. The modelers would not know the correct structure beforehand, but the output from each model would be compared to the experimentally determined structure.”
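To make the scoring idea concrete, here is a minimal sketch of a distance-based accuracy score in the spirit of GDT_TS, the metric commonly reported for CASP. The function name `gdt_ts`, the toy coordinates, and the assumption that the two structures are already superimposed are all illustrative. The real metric also searches over superpositions, so treat this as a simplification rather than the official scoring code.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed lists of (x, y, z) points.

    For each cutoff (in angstroms), take the fraction of residues whose predicted
    position lies within that distance of the experimental position, then average
    the fractions and scale to 0-100.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("structures must be non-empty and the same length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical four-residue example: errors of 0.5, 1.5, 3.0 and 9.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(f"GDT_TS = {score:.2f}")  # prints "GDT_TS = 56.25"
```

A perfect prediction scores 100 under this sketch, and a single badly placed residue lowers the score without zeroing it, which is why a threshold such as 90 is a demanding but reachable target.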
AlphaFold 1: The First Attempt at Solving the Protein Folding Problem
AlphaFold 1 used a standard deep neural network trained on a vast database of known protein structures. The algorithm took a protein’s amino acid sequence and evolutionary data as inputs and predicted a 2D representation of the protein structure, essentially a map of the distances between pairs of amino acids. While it showed promise, it did not meet the CASP threshold for accurate structure prediction.
“As input, AlphaFold took the protein’s amino acid sequence and an important set of clues given by evolution.”
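One common 2D representation of a 3D structure is a pairwise distance map, where entry (i, j) holds the distance between residues i and j. The sketch below builds such a map from known coordinates, to show what kind of object a network could be asked to predict. The function name `distance_map` and the toy coordinates are illustrative, and treating this as the exact output format of AlphaFold 1 is an assumption.

```python
import math

def distance_map(coords):
    """Return an N x N matrix of Euclidean distances between (x, y, z) points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

# Hypothetical three-residue structure (coordinates in angstroms).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
dmap = distance_map(coords)

for row in dmap:
    print(" ".join(f"{d:5.1f}" for d in row))
```

The map is symmetric with zeros on the diagonal, and it does not change when the whole structure is rotated or shifted, which makes it a convenient prediction target.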
AlphaFold 2: A Transformative Leap in Solving the Protein Folding Problem
AlphaFold 2 marked a significant advancement. The team incorporated the “transformer” architecture, similar to that used in large language models like ChatGPT.
This new model, called the Evoformer, used “attention” to add context to the amino acid sequences, breaking them down into chunks and making connections between them.
“AlphaFold 2 was really a system about designing our deep learning, the individual blocks, to be good at learning about proteins, to have the types of geometric, physical, and evolutionary concepts that were needed, and to put them into the middle of the network instead of a process around it.”
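The “attention” idea can be sketched in a few lines. The example below is a bare scaled dot-product self-attention over a handful of toy vectors, with no learned weights. The function names and the two-dimensional embeddings are illustrative assumptions, and the real network uses learned projections and many more components, so this only shows how each position gathers context from the others.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Each vector acts as query, key and value. Returns the context-mixed
    vectors and the attention weight matrix.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of all value vectors, one component at a time.
        outputs.append([sum(w * v[i] for w, v in zip(row, vectors))
                        for i in range(dim)])
    return outputs, weights

# Hypothetical embeddings for three sequence positions.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(embeddings)

for i, row in enumerate(weights):
    print(i, [round(w, 3) for w in row])
```

Position 0 puts more weight on position 1 than on position 2 because their embeddings are more similar, which is the sense in which attention “makes connections” between related parts of a sequence.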
Key Steps to AlphaFold 2’s Success:
- Maximum Compute Power: Leveraging Google’s extensive computing resources.
- Large and Diverse Data Sets: Training the AI on a comprehensive collection of protein structures.
- Better AI Algorithms: Implementing advanced machine learning techniques.
How AlphaFold 2 Solves the Protein Folding Problem

[Figure: the Evoformer]
The Evoformer in AlphaFold 2 contains two towers: a biology tower and a geometry tower. The biology tower processes evolutionary information, while the geometry tower handles pair representations of the protein structure. These towers exchange information, refining their understanding of the protein through multiple iterations.
“The Evoformer contained two towers, evolutionary information in the biology tower and pair representations in the geometry tower.”
Triangular attention is also applied, where the AI uses the triangle inequality to constrain the distances between triplets of amino acids. This helps the model produce a self-consistent picture of the structure.
“For each triplet of amino acids, AlphaFold applies the triangle inequality. The sum of two sides must be greater than the third. This constrains how far apart these three amino acids can be.”
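The constraint itself is easy to state in code. The sketch below checks every triplet in a pairwise distance matrix against the triangle inequality and reports the triplets that no real 3D arrangement could satisfy. The function name `triangle_violations` and the toy matrices are illustrative. In AlphaFold 2 the constraint is encouraged through learned attention rather than an explicit check like this, so the code only illustrates the geometric rule.

```python
from itertools import combinations

def triangle_violations(dist, tol=1e-9):
    """Return index triplets (i, j, k) whose pairwise distances break the triangle inequality."""
    violations = []
    for i, j, k in combinations(range(len(dist)), 3):
        a, b, c = dist[i][j], dist[j][k], dist[i][k]
        # No side of a triangle may be longer than the other two combined.
        if a > b + c + tol or b > a + c + tol or c > a + b + tol:
            violations.append((i, j, k))
    return violations

# Hypothetical distance matrices for three residues.
consistent = [[0, 3, 5],
              [3, 0, 4],
              [5, 4, 0]]
inconsistent = [[0, 1, 10],
                [1, 0, 1],
                [10, 1, 0]]

print(triangle_violations(consistent))    # prints []
print(triangle_violations(inconsistent))  # prints [(0, 1, 2)]
```

In the second matrix, residues 0 and 2 are claimed to be 10 units apart even though both sit within 1 unit of residue 1, which is exactly the kind of inconsistency the constraint rules out.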
The structure module then uses the geometrical features learned by the network to predict the final 3D protein structure. Unlike previous methods, AlphaFold 2 does not explicitly encode the fact that the amino acids form a chain. Instead, it positions each amino acid separately, allowing the chain to emerge naturally.

[Figure: Passing the Threshold]
“It’s more like we give it a bag of amino acids and it’s allowed to position each of them separately.”
The Impact of AlphaFold: A New Era in Scientific Discovery
AlphaFold’s success has had a profound impact on scientific research. It has significantly accelerated the pace of discovery and innovation in various fields.
Unveiling Millions of Protein Structures
Before AlphaFold, scientists had painstakingly determined the structures of about 150,000 proteins over six decades. In a single stroke, AlphaFold predicted over 200 million protein structures, covering nearly all proteins known to exist in nature. This achievement has advanced research by decades.
“Over six decades, all of the scientists working around the world on proteins painstakingly found about 150,000 protein structures. Then in one fell swoop, AlphaFold came in and unveiled over 200 million of them.”
Applications Across Disciplines
AlphaFold is being used to develop vaccines for diseases like malaria, break down antibiotic resistance enzymes, and understand how protein mutations lead to diseases such as schizophrenia and cancer. It has also provided new insights into the life mechanisms of endangered species. The AlphaFold 2 paper has been cited over 30,000 times, underscoring its significant impact on scientific research.
Nobel Prize Recognition
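The significance of this breakthrough was recognized with the 2024 Nobel Prize in Chemistry, shared by Demis Hassabis and John Jumper of Google DeepMind for protein structure prediction and by David Baker for computational protein design. This prestigious award highlights the transformative impact of AI in solving complex scientific problems.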
Designing New Proteins: The Next Frontier
Building on the success of AlphaFold, scientists are now using AI to design completely new proteins from scratch. This capability opens up even more possibilities for addressing global challenges.
RFdiffusion: Creating Proteins from Noise
David Baker’s lab has developed a technique called “RFdiffusion” that uses generative AI, similar to image generators like DALL-E, to design proteins. The AI is trained by adding random noise to known protein structures and then learning to remove this noise. Once trained, the AI can generate proteins for specific functions from a random noise input.
“His technique called ‘RFdiffusion’ is trained by adding random noise to a known protein structure, and then the AI has to remove this noise. Once trained in this way, the AI can be asked to produce proteins for various functions.”
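The “adding random noise” half of that training recipe can be sketched as follows. The example blends a structure’s coordinates with Gaussian noise using the standard forward step from denoising diffusion models. The function name `add_noise`, the `alpha_bar` parameter, and the toy coordinates are illustrative assumptions. The source does not describe the actual noising scheme used by the Baker lab, so this is a generic diffusion sketch rather than their method.

```python
import math
import random

def add_noise(coords, alpha_bar, rng):
    """Blend (x, y, z) coordinates with Gaussian noise.

    alpha_bar = 1.0 returns the original structure, alpha_bar = 0.0 returns
    pure noise, and values in between give partially corrupted structures.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [tuple(signal * x + noise * rng.gauss(0.0, 1.0) for x in point)
            for point in coords]

# Hypothetical three-residue backbone, roughly 3.8 angstroms apart.
structure = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
rng = random.Random(0)  # fixed seed so the run is repeatable

for alpha_bar in (1.0, 0.5, 0.0):
    noisy = add_noise(structure, alpha_bar, rng)
    print(alpha_bar, [tuple(round(x, 2) for x in p) for p in noisy])
```

During training, a model would see the noisy coordinates and be asked to recover the clean ones. At generation time the process runs in reverse, starting from pure noise and denoising step by step.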
Potential Applications
The ability to design proteins has numerous potential applications, including creating human-compatible antibodies to neutralize lethal snake venom, developing new vaccines, and designing enzymes that can capture greenhouse gases or break down plastic. These advancements promise to revolutionize medicine, environmental science, and materials science.
AI Beyond Proteins: Transforming Other Fields
The success of AI in protein folding is just the beginning. AI is transforming other fields of science and technology, offering solutions to complex problems and driving innovation.
Materials Science
DeepMind’s GNoME program has identified 2.2 million new crystals, including about 380,000 stable materials that could be used in superconductors and batteries. This represents a significant leap forward in materials science, with the potential to power future technologies.
The Future of AI in Science
AI is pushing the boundaries of human knowledge at an unprecedented rate. By solving fundamental problems, AI is unlocking new avenues of discovery and driving progress across various disciplines. The speed and scale at which AI can analyze data and generate solutions are revolutionizing the way science is conducted.
“If we think of the whole tree of knowledge, you know, there are certain problems which are root node problems. If you unlock them, if you discover a solution to them, it would unlock a whole new branch or avenue of discovery.”
Conclusion: Embracing the AI Revolution
The AI revolution in protein folding and materials science is transforming our ability to address global challenges. AI is not just a tool; it’s a partner that enhances human capabilities and accelerates the pace of discovery. By embracing AI and investing in its development, we can unlock solutions to some of the most pressing problems facing humanity. Imagine a world without new diseases, with abundant clean energy, and where plastic pollution is a distant memory. Is this future within reach thanks to AI?
Disclaimer: Throughout this article, certain terms are used with specific meanings in the context of AI and biochemistry:
- Protein Folding: The process by which a chain of amino acids assumes its functional three-dimensional shape.
- AI: Artificial Intelligence, referring to machine learning algorithms and neural networks.
- AlphaFold: A specific AI program developed by DeepMind for protein structure prediction.
- Evoformer: The transformer architecture used in AlphaFold 2.
- Protein structure prediction: AI algorithms predicting the structure of a protein from its amino acid sequence.