December 15, 2014
Scientists developing new drugs increasingly turn to massive online catalogs of prospective compounds to test. The trouble is that very few of those chemical contenders have any real potential to be useful, because of inaccuracies in the computerized screening process that identified them.
Researchers at the UNC Eshelman School of Pharmacy have created SPLIF, a new, freely shared algorithm that greatly improves the chances of finding chemical compounds that are truly active against a disease target.
Swing and a Miss
The discovery of new medicines usually starts by screening a large collection of chemical compounds against a specific protein target that plays a role in disease. Increasingly, this initial screen is done by ultrafast computer programs that pull chemical compounds one by one from databases of tens of millions and virtually “dock” them into a pocket on the protein. Those that fit the pocket are hits and can be purchased and put through real-world tests in a lab.
Most of the time, though, those hits turn out to be misses, says Dmitri Kireev, PhD, director of computational biophysics and molecular design in the School’s Center for Integrative Chemical Biology and Drug Discovery.
“These screening algorithms too often overlook compounds that are actually active against the target while declaring many inactive compounds to be hits,” he says. “As a result, only 3 to 5 percent of purchased virtual hits turn out to be truly active when tested. That is a much better success rate than in a random high-throughput screening, but it’s not good enough.”
Kireev’s team developed SPLIF, an algorithm that reprocesses all those computer-generated docking results and does a much better job of distinguishing genuine actives from false ones.
Picking out the Real Deal
SPLIF stands for structural protein-ligand interaction fingerprint. At the heart of the algorithm is a mathematical model fed by data extracted from the crystal structures of protein-substrate complexes. In multiple benchmark screenings, SPLIF significantly outperformed alternative hit-detection strategies, Kireev says.
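The published SPLIF method encodes each docked pose by the specific protein-ligand atom contacts it makes and scores poses against contacts seen in crystal structures. As a rough illustration of that fingerprint idea only (a toy sketch, not the actual SPLIF implementation; the atom labels, coordinates, and the 4.5 Å contact cutoff below are all invented for the example), one can represent a pose as the set of atom pairs in close contact and compare two poses with a Tanimoto similarity:

```python
from math import dist  # Euclidean distance, Python 3.8+

CONTACT_CUTOFF = 4.5  # Å; an illustrative contact threshold, not SPLIF's

def contact_fingerprint(protein_atoms, ligand_atoms, cutoff=CONTACT_CUTOFF):
    """Encode a pose as the set of (protein atom, ligand atom) pairs in contact.

    Atoms are (label, (x, y, z)) tuples. This toy version ignores the
    interaction typing and atom environments a real fingerprint would use.
    """
    return {
        (p_label, l_label)
        for p_label, p_xyz in protein_atoms
        for l_label, l_xyz in ligand_atoms
        if dist(p_xyz, l_xyz) <= cutoff
    }

def tanimoto(fp_a, fp_b):
    """Similarity between two set-based fingerprints (1.0 = identical contacts)."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Hypothetical coordinates: a docked pose vs. a crystal-structure reference.
protein = [("GLU85:OE1", (0.0, 0.0, 0.0)), ("LEU40:CD1", (8.0, 0.0, 0.0))]
pose    = [("N1", (3.0, 0.0, 0.0)), ("C7", (6.0, 0.0, 0.0))]
crystal = [("N1", (2.0, 0.0, 0.0)), ("C7", (20.0, 0.0, 0.0))]

fp_pose = contact_fingerprint(protein, pose)
fp_ref  = contact_fingerprint(protein, crystal)
print(tanimoto(fp_pose, fp_ref))  # shares one of two contacts with the reference
```

A pose whose contacts match those observed in real protein-substrate complexes scores high; a pose that merely fills the pocket without reproducing known interactions does not, which is how this style of rescoring weeds out false hits.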
For example, Kireev’s team used SPLIF to identify hits against MER kinase, a target enzyme connected to a type of leukemia. Of sixty-two purchased and tested SPLIF hits, fifteen turned out to be potent MER inhibitors.
Kireev’s team performed ten screenings against public benchmark databases with known numbers of active and inactive compounds. SPLIF was the best in six of the ten screens and came in second in the other four. The next-best tool came in first in only two screens, and every other method tried failed dramatically at least twice. On average, SPLIF proved four to six times better than popular computational tools for screening compounds, Kireev says.
“No screening method is expected to be the best in every single situation,” Kireev says. “But imagine running a lab that can only afford the $1,500 to $2,500 needed to buy twenty to thirty virtual hits for testing. If a usual success rate for the screening tool you use is 3 percent, you aren’t likely to get any actives at all. But with the success rate of 15 to 20 percent we’ve seen with SPLIF, you’d be sure to secure a few hits.”
The source code of the SPLIF algorithm is freely available, and an article describing the method and the outcome of the benchmark studies was published in the American Chemical Society’s Journal of Chemical Information and Modeling.
The Study Authors
- Dmitri Kireev, PhD, is a research professor at the UNC Eshelman School of Pharmacy and director of computational biophysics and molecular design in the School’s Center for Integrative Chemical Biology and Drug Discovery.
- Chenxiao Da, PhD, is a postdoctoral fellow in the School’s Division of Chemical Biology and Medicinal Chemistry.
This work was supported by the University Cancer Research Fund (University of North Carolina at Chapel Hill) and by federal funds from the National Cancer Institute, National Institutes of Health, under contract number HHSN261200800001E.