Monday 17 August 2009

How does rescoring improve results in docking?

Despite more than a decade of research into improved scoring functions, a scoring function that can accurately predict binding affinities remains an elusive goal. Even the simpler problem of identifying ligands from a data set of inactive molecules is a challenge for modern scoring functions, although for a given protein a particular scoring function may work very well. While there is certainly a need for the development of improved scoring functions with better performance over a wider range of protein families, it is also important to make the maximal use of currently available scoring functions. One of the ways to do this is to combine existing scoring functions in a so-called rescoring experiment.
Testing Assumptions and Hypotheses for Rescoring Success in Protein−Ligand Docking. Noel M. O'Boyle, John W. Liebeschuetz and Jason C. Cole, Journal of Chemical Information and Modeling, 2009, ASAP.

A rescoring experiment simply involves taking the docking poses found by Scoring Function A, and assessing them (after local optimization if you want to avoid artifacts) with Scoring Function B. Compared to the length of time a docking requires, rescoring is almost instant. Although rescoring can improve results in a virtual screen, it does not always do so. It is therefore important to understand the underlying reasons for success in rescoring, as this would allow an appropriate choice of Scoring Functions A and B.
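To make the procedure concrete, here is a minimal Python sketch of a rescoring step. It is only an illustration, not the protocol used in the paper: the data model (a pose represented by a label), the function name `rescore`, the toy scores, and the assumption that higher scores are better for both functions are all hypothetical, and the optional local optimization step is left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical data model: a pose is just a label here; a real pose would
# hold the 3D coordinates of the ligand in the binding site.
Pose = str
DockingResults = Dict[str, List[Tuple[Pose, float]]]  # ligand -> [(pose, score_a)]


def rescore(results: DockingResults,
            score_b: Callable[[Pose], float]) -> List[Tuple[str, float]]:
    """Rank ligands by score B of the pose that score A ranked best.

    Assumes higher scores are better for both scoring functions.
    """
    ranked = []
    for ligand, poses in results.items():
        best_pose, _ = max(poses, key=lambda p: p[1])  # pose chosen by A
        ranked.append((ligand, score_b(best_pose)))    # assessed by B
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy example with made-up numbers.
docked = {
    "lig1": [("lig1_pose1", 55.0), ("lig1_pose2", 61.0)],
    "lig2": [("lig2_pose1", 70.0), ("lig2_pose2", 42.0)],
}
toy_score_b = {"lig1_pose1": 3.0, "lig1_pose2": 9.5,
               "lig2_pose1": 4.0, "lig2_pose2": 8.0}.get

ranking = rescore(docked, toy_score_b)
```

In this toy data, Scoring Function A alone would rank lig2 first (70.0 against 61.0), whereas after rescoring the poses chosen by A with Scoring Function B, lig1 comes out on top.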

JCIM has just published some work of mine in which I investigate two hypotheses for rescoring success:
  • That rescoring success occurs due to some consensus effect between the two scoring functions that eliminates false positives
  • That rescoring success occurs due to complementarity between the scoring functions; that is, the first scoring function is better at pose prediction, while the second is better at scoring actives relative to inactives
As far as I am aware, this is the first study to investigate why rescoring can improve results in a virtual screen.

1 comment:

Unknown said...

"rescoring success occurs due to complementarity between the scoring functions; that is, the first scoring function is better at pose prediction, while the second is better at scoring actives relative to inactives".

This is exactly the approach used by Glide. It uses two different scoring functions: one for selecting the best pose for a given ligand, and another to rank different ligands. See J. Med. Chem., 2004, 47 (7), pp 1739–1749.