Proteins are the doers of biology. In the human body, they perform most of the work in cells and are required for the structure, function and regulation of the body’s tissues and organs. Each protein is made of a precise sequence of amino acids that allows it to fold up into a three-dimensional shape determining its function. In the past, the all-important structure of a protein could only be captured by difficult and time-consuming laboratory analysis. Now comes a dramatic new turn—a window on life’s basic building blocks.
DeepMind, a UK-based firm that is part of Alphabet, Google’s parent company, has developed an artificial intelligence and machine learning system that can predict the three-dimensional structure of a protein from the sequence of amino acids that make it up. Last year, the company’s database had 350,000 entries. Then on July 28, DeepMind co-founder and chief executive Demis Hassabis announced that the company had expanded its database of predicted protein structures to more than 200 million, covering nearly all cataloged proteins known to science, including those in humans, plants, bacteria, animals and other organisms, and that it is making them publicly available and free, accessible with no more effort than a Google search. The database, built with the system known as AlphaFold, is the equivalent of a James Webb Space Telescope for biology, providing astounding new visuals of a world beyond.
Proteins don’t fold neatly like dishtowels. Many look like a skein of yarn after a cat has played with it. They often have precisely engineered moving parts that are linked to chemical events and, most importantly, bind to other molecules. For example, antibodies are proteins produced by the immune system that bind to foreign molecules, like those on the surface of an invading virus, such as the spikes on the coronavirus. Thus, scientists have sought for decades to learn the exact folding of proteins and their functions. Researchers have long used a technique known as X-ray crystallography to better grasp proteins’ structure, and the field’s central repository contains some 185,000 experimentally solved structures.
Then came artificial intelligence. AlphaFold’s algorithms learned to predict how a protein folds from its underlying amino acid sequence, leading to an explosion of new information. Another project, called RoseTTAFold, at the University of Washington’s Institute for Protein Design is on a similar quest. The protein folding predictions will need verification, in some cases by real-world experiments. But for drug and vaccine developers wanting to know how a protein looks or behaves, the prediction itself, a visual representation, can provide a remarkable leg up. The journals Science and Nature Methods both cited the breakthrough as the most important of 2021.
Using these new methods, researchers have been able to explore the nuclear pore complex, which acts as a kind of gatekeeper for everything that goes in and out of the cell nucleus. It contains more than 1,000 protein subunits, woven together, so it is a difficult jigsaw puzzle for scientists. Using AlphaFold, researchers were able to create a model nearly twice as complete as the old one, covering two-thirds of the complex.
AlphaFold does not reveal all of biology’s mysteries, nor is it the only advance needed for drug development or disease fighting. But the views are truly astonishing.