New AI Model Makes Drug Discovery Faster, Smarter, and More Transparent

A new algorithm could help researchers better predict how molecules bind to proteins, an essential step in designing more effective drugs to treat a wide range of diseases.

Bruce Donald, PhD, James B. Duke Distinguished Professor of Computer Science and professor of biochemistry, and Yuxi (Jaden) Long, a former undergraduate in the Donald Lab and now a graduate student at Memorial Sloan Kettering Cancer Center, developed the Predicting Affinity Through Homology (PATH) model.  

Yuxi (Jaden) Long presenting a poster on PATH while he was an undergraduate at Duke University.

PATH uses dramatically fewer parameters than traditional deep learning models, which makes its results simpler and easier to interpret. The new tool has already been integrated into OSPREY (Open Source Protein Redesign for You), a free software suite developed by the Donald Lab.

“Previous models, which used deep learning, use billions of parameters and tens of thousands of features,” Donald said. “They report good correlations, but we don’t know why.” 

PATH changes that by combining interpretable machine learning with algebraic topology, a branch of mathematics devoted to studying shapes. This approach allows researchers to trace predictions back to individual molecular interactions. “We can now understand how the algorithm made these predictions,” Long said. “We can see exactly how much each atom contributed.”

Unlike previous models, which tend to be overly optimistic and predict that molecules will bind even when most actually don’t, PATH excels at distinguishing binders from non-binders. “If you look at a million small molecules and a protein target, only two or three will actually bind,” Donald explained. “Most previous models predict binding because they have only seen positive examples, so they are ‘trained to please.’ PATH incorporates a second module that specifically discriminates between binders and non-binders.”

The result is not just more accurate predictions but also a massive boost in efficiency: by cutting unnecessary computational steps, PATH can run up to 1,000 times faster than previous methods.

PATH works with any protein, small molecule, peptide, or antibody.

The Donald Lab plans to use PATH to develop new cancer drugs and HIV antibodies, particularly those targeting kinases, protein interactions, and transcription factors. The goal is to design drugs that bind precisely to their intended targets — and nothing else — for maximum effectiveness. 

While PATH won’t solve all the challenges of drug discovery and protein binding prediction, it represents a significant step toward building faster, more transparent, and more interpretable AI tools for drug design. 


Publication: Predicting Affinity Through Homology (PATH): Interpretable binding affinity prediction with persistent homology. Long Y, Donald BR. PLoS Comput Biol. 2025 Jun 27;21(6):e1013216. doi: 10.1371/journal.pcbi.1013216.

Funding: NIH Outstanding Investigator Grant to BRD: R35 GM-144042  

 
