For a few years now John McGeehan, a biologist and director of the Center for Enzyme Innovation in Portsmouth, England, has been searching for a molecule that can break down the 150 million tons of soda bottles and other plastic waste scattered around the world.
Working with researchers on both sides of the Atlantic, he has found a few good candidates. But his task is that of the most demanding locksmith: pinpointing chemical compounds that will twist and fold on their own into microscopic shapes that fit snugly into the molecules of a plastic bottle and break them apart, the way a key opens a door.
Determining the exact chemical content of a given enzyme is a fairly simple challenge these days. But identifying its three-dimensional shape can involve years of biochemical experimentation. So last fall, after reading that an artificial intelligence lab in London called DeepMind had built a system that automatically predicts the shapes of enzymes and other proteins, Dr. McGeehan asked the lab if it could help with his project. It could.
Toward the end of a workweek, he sent DeepMind a list of seven enzymes. The following Monday, the lab returned shapes for all seven. “It put us a year ahead of where we were, if not two,” Dr. McGeehan said.
Now, any biochemist can speed up their work in much the same way. On Thursday, DeepMind released the predicted shapes of more than 350,000 proteins, the microscopic mechanisms that drive the behavior of bacteria, viruses, the human body and all other living things. The new database includes three-dimensional structures for all proteins expressed by the human genome, as well as for proteins that appear in 20 other organisms, including the mouse, the fruit fly and the E. coli bacterium.
This vast and detailed biological map, which provides about 250,000 shapes that were previously unknown, could accelerate efforts to understand diseases, develop new drugs and repurpose existing ones. It could also lead to new kinds of biological tools, such as an enzyme that efficiently breaks down plastic bottles and converts them into materials that are easily reused and recycled.
“It can take you forward in time, influencing the way you think about problems and helping you solve them faster,” said Gira Bhabha, an assistant professor in the Department of Cell Biology at New York University. “Whether you study neuroscience or immunology, whatever your field of biology, it can be useful.”
This new knowledge is a key of its own kind: If scientists can determine the shape of a protein, they can determine how other molecules will bind to it. That could reveal, say, how bacteria resist antibiotics, and how to counter that resistance. Bacteria resist antibiotics by expressing certain proteins; if scientists were able to identify the shapes of these proteins, they could develop new antibiotics or new drugs that suppress them.
In the past, pinpointing the shape of a protein required months, years or even decades of trial-and-error experiments involving X-rays, microscopes and other equipment on the lab bench. But DeepMind can shorten the timeline significantly with its AI technology, known as AlphaFold.
When Dr. McGeehan sent his list of seven enzymes to DeepMind, he told the lab that he had already identified the shapes of two of them, but he did not say which two. It was a way of testing how well the system worked; AlphaFold passed the test by correctly predicting both shapes.
Even more remarkable, Dr. McGeehan said, was that the predictions arrived within days. He later learned that AlphaFold had actually completed the task in just a few hours.
AlphaFold predicts protein structures using what is called a neural network, a mathematical system that can learn tasks by analyzing vast amounts of data (in this case, thousands of known proteins and their physical shapes) and extrapolating into the unknown.
It’s the same technology that recognizes the commands you speak into your smartphone, identifies faces in the photos you post to Facebook, and translates one language into another on Google Translate and other services. But many experts believe that AlphaFold is one of the most powerful applications of the technology.
“It shows that AI can do useful things amidst the complexity of the real world,” said Jack Clark, one of the authors of the AI Index, an effort to track the progress of artificial intelligence technology around the world.
As Dr. McGeehan discovered, it can be remarkably accurate. AlphaFold can predict the shape of a protein with an accuracy that rivals physical experiments about 63 percent of the time, according to independent benchmark tests that compare its predictions to known protein structures. Most experts had assumed that such a powerful technology was still years away.
“I thought it would take another 10 years,” said Randy Read, a professor at the University of Cambridge. “It was a complete transformation.”
But the system’s accuracy varies, so some predictions in DeepMind’s database will be less useful than others. Each prediction in the database comes with a “confidence score” indicating how accurate it is likely to be. DeepMind researchers estimate that the system provides a “good” prediction about 95 percent of the time.
As a result, the system cannot completely replace physical experiments. It is used alongside work on the lab bench, helping scientists decide which experiments to run and filling in the gaps when experiments fail. AlphaFold recently helped researchers at the University of Colorado Boulder identify a protein structure they had been struggling to pin down for more than a decade.
DeepMind’s developers have opted to share their database of protein structures freely, rather than sell access to it, in the hope of spurring advances across the biological sciences. “We are interested in maximum impact,” said Demis Hassabis, chief executive and co-founder of DeepMind, which is owned by the same parent company as Google but operates more like a research lab than a commercial business.
Some scientists have compared DeepMind’s new database to the Human Genome Project. Completed in 2003, the Human Genome Project provided a map of all human genes. Now, DeepMind has provided a map of the nearly 20,000 proteins expressed by the human genome — another step toward understanding how our bodies work and how we might respond when things go wrong.
The technology is also expected to keep improving. A lab at the University of Washington has built a similar system called RoseTTAFold, and like DeepMind, it has openly shared the computer code that runs its system. Anyone can use the technology, and anyone can work to improve it.
Even before DeepMind began sharing its technology and data openly, AlphaFold was feeding a wide range of projects. Researchers at the University of Colorado are using the technology to understand how bacteria such as E. coli and salmonella develop resistance to antibiotics, and to develop ways of combating that resistance. Researchers at the University of California, San Francisco, have used the tool to improve their understanding of the coronavirus.
The coronavirus wreaks havoc on the body through 26 different proteins. With help from AlphaFold, the researchers have improved their understanding of one key protein and are hoping the technology can help them better understand the other 25.
Even if this comes too late to have an impact on the current pandemic, it could help in preparing for the next one. “A better understanding of these proteins will help us target not only this virus but other viruses,” said Kliment Verba, one of the San Francisco researchers.
The possibilities are numerous. After DeepMind gave Dr. McGeehan the shapes of seven enzymes that could potentially rid the world of plastic waste, he sent the lab a list of 93 more. “They are working on these now,” he said.