10 Common Mistakes in Fragment Screening

There is an excellent review paper from Dan Erlanson and Ben Davis that came out last year detailing some of the more common mistakes and artifacts that can arise in fragment-based screening campaigns (so-called “unknown knowns”).  I encourage readers to go read the original paper.  I have summarized some of the key points below:

1) Not checking compound identity to make sure what you think you purchased is what you actually have.

2) Low-level impurities in compound stocks can cause problems at the high concentrations used in fragment screens.

3) DMSO, commonly used to store fragments in plates, can act as a mild oxidant and is also hygroscopic.

4) Pan-assay interference compounds (PAINS) are common in many libraries and are found to give false positives to many targets.

5) Reactive functional groups in fragment hits can cause covalent binding or aggregation of the target.

6) Many fragments can show binding or inhibition while acting as aggregators rather than reversible binders.  Including a small percentage of detergent in the assay can help keep these kinds of fragments from giving positive signals.

7) STD-NMR is very sensitive to weak binders, but because it relies on a relatively fast dissociation rate for the ligand, tight binders (<1 µM) can be missed by this method.

8) X-ray crystallographic structures are often taken as the “truth” when they are in fact a model of an electron density.  Fragments can often be modeled into the density in incorrect orientations or in place of solvent atoms.

9) SPR methods are very sensitive to fragment binding, but can be confounded by non-specific binding of fragment to the target or chip, as well as compound-dependent aggregation.

10) Fragment hits should be validated by more than one method before embarking on optimization.  They should also be screened for being aggregators by DLS or other methods.

Using R to automate ROC analysis

ROC analysis is used in many types of research.  I use it to examine the ability of molecular docking to enrich a list of poses for experimental hits.  This is a pretty standard way to compare the effectiveness of docking methodologies and make adjustments in computational parameters.

An example ROC plot on a randomly generated dataset

Normally this kind of plot would take at least an hour to make by hand in Excel, so I wrote a function in R that generates a publication-quality ROC plot on the fly.  This is handy if you want to play around with the hit threshold of the data (i.e., the binding affinity) or experiment with different scoring functions.

According to Wikipedia:

a receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the total actual positives (TPR = true positive rate) vs. the fraction of false positives out of the total actual negatives (FPR = false positive rate), at various threshold settings.

There are already several ROC plot calculators on the web.  But I wanted to write my own using the R statistical language owing to its ability to produce very high-quality, clean graphics.  You can find the code here:

https://github.com/mchimenti/data-science-coursera/blob/master/roc_plot_gen.R

The function takes a simple two-column input in CSV format.  One column is “score,” the other is “hit” (1 or 0).  In the context of docking analysis, “score” is the docking score and “hit” is whether or not the molecule was an experimental binder.  The area under the curve is calculated using the “trapz” function from the “pracma” (practical mathematics) package.
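For readers who want to see the calculation itself, here is a minimal sketch in Python (not the R function linked above) of how the ROC points and the trapezoidal area under the curve can be computed from “score” and “hit” columns.  The function names, the header row in the CSV, and the default that a lower (more negative) score is better are all my assumptions for illustration; flip higher_is_better if your scoring function works the other way.

```python
import csv
import io


def roc_points(scores, hits, higher_is_better=False):
    """Return (fpr, tpr) lists, one point per distinct score threshold."""
    # Rank molecules from best to worst score. Assumption: docking scores
    # are "more negative is better", so ascending order is the default.
    ranked = sorted(zip(scores, hits), key=lambda p: p[0],
                    reverse=higher_is_better)
    n_pos = sum(hits)
    n_neg = len(hits) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one hit and one non-hit")

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, hit) in enumerate(ranked):
        if hit:
            tp += 1
        else:
            fp += 1
        # Only emit a point once all molecules tied at this score are counted.
        is_last = i + 1 == len(ranked)
        if is_last or ranked[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


def read_score_hit_csv(text):
    """Parse CSV text with a header row naming 'score' and 'hit' columns."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return [float(r["score"]) for r in rows], [int(r["hit"]) for r in rows]


# Made-up example data, for illustration only.
example = "score,hit\n-9.1,1\n-8.4,1\n-7.9,0\n-7.2,1\n-6.5,0\n-5.0,0\n"
scores, hits = read_score_hit_csv(example)
fpr, tpr = roc_points(scores, hits)
print(f"AUC = {auc_trapezoid(fpr, tpr):.3f}")
```

On this toy dataset, 8 of the 9 possible hit/non-hit pairs are ranked correctly, so the AUC comes out to 8/9 (about 0.889).  Plotting fpr against tpr gives the ROC curve.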

 

Gilead’s innovative approach to Hep C drug, Sovaldi

Hepatitis C virus (HCV) is a single-stranded RNA virus that infects an estimated 180 million people worldwide.

In 2013, Gilead received FDA approval for a new HCV drug, Sovaldi (sofosbuvir), that inhibits viral replication by targeting the virus’s NS5B polymerase.  Sovaldi has shown a very high cure rate (nearly 100% HCV suppression and sustained virological response) in clinical trials of previously untreated patients and has fewer side effects than pegylated-interferon and ribavirin therapies.

Sovaldi is a methyluridine-monophosphate prodrug: it is metabolized in the body to methyluridine-triphosphate, which acts as a potent substrate mimic and inhibitor of the NS5B polymerase.

What is interesting about Sovaldi is the approach the scientists took to getting the inhibitor into the cell, relying on phosphoramidate prodrug  technology that had been effectively used to develop anti-HIV drugs, but had never been applied before to this class of anti-HCV drugs.

During development, the researchers decided that they needed to deliver the charged methyluridine-monophosphate (rather than the neutral methyluridine) into the cell on the basis of two key observations:  1) the methyluridine triphosphate is the active compound against HCV NS5B polymerase, while the methyluridine alone is inactive (owing to very low conversion to monophosphate in vivo) and 2) the methyluridine monophosphate derivative can be anabolized in the cell to the potent triphosphate form by an endogenous uridine-cytidine monophosphate kinase.

The active uridine triphosphate (6) can be created when 4 is metabolized to methyluridine-5′-monophosphate. Compound 5 is not phosphorylated and is inactive in cells.

 

The phosphoramidate prodrug technology had never been applied to this class of HCV inhibitors until Sovaldi.

The idea behind phosphoramidate prodrug technology is to create a neutral, membrane-permeable prodrug derivative that can be metabolized in the liver, by carboxylesterase-mediated cleavage and subsequent steps, back to the monophosphate form.

The researchers applied the approach, and after a significant amount of SAR investigation and PK/PD studies around the chemical composition of the phosphoramidate substituents, they concluded that the compound shown above had the optimal structure to deliver the methyluridine-monophosphate to the liver.

The result is a new generation of highly effective HCV therapeutics with few side effects that can make a significant difference in the lives of patients living with HCV.

 

Tackling challenging targets with Chemotype Evolution

Carmot Therapeutics, a small company located in San Francisco’s Mission Bay, has developed a very innovative drug discovery technology, called Chemotype Evolution (CE), that relies on fragment-based discovery but is different from traditional FBDD and HTS approaches in important ways.

The first important innovation is that CE relies on a “bait” molecule as a starting point for screening.  The bait can be a known ligand, cofactor, or inhibitor.  The bait is then derivatized with a linker moiety that allows it to become chemically bonded with every fragment in a proprietary library.  This process generates a screening library that contains thousands of bait-fragment hybrids.

These hybrids are then screened against the target for binding using either biophysical or biochemical screening techniques in a high-throughput plate format.

The most powerful aspect of CE is the ability to iterate over chemical space, giving access to an exponentially growing number of possible fragment-bait hybrids.  The method can be iterated with new “baits” derived from the best fragment hits of the previous round.  Thus, instead of being limited to the 7,000 fragments in your library, after 3 iterations you can access 7,000^3 possible combinations (343 billion possible compounds), while selecting only the most target-relevant chemotypes at each stage.
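The arithmetic behind that claim can be checked with a short Python sketch.  The function names are mine, and the assumption that only one bait is carried forward per round is purely illustrative (the real number of baits per iteration is not given here).  Under that assumption, the combinatorial space grows as the library size raised to the number of iterations, while the number of hybrids actually made and screened grows only linearly.

```python
def accessible_hybrids(library_size, iterations):
    """Size of the combinatorial space reachable after a number of rounds."""
    return library_size ** iterations


def hybrids_screened(library_size, iterations, baits_per_round=1):
    """Hybrids actually made and screened.

    Assumption (hypothetical): each round pairs every fragment in the
    library with a fixed number of baits carried over from the last round.
    """
    return library_size * baits_per_round * iterations


for rounds in (1, 2, 3):
    space = accessible_hybrids(7000, rounds)
    made = hybrids_screened(7000, rounds)
    print(f"{rounds} iteration(s): {space:,} accessible, {made:,} screened")
```

With a 7,000-fragment library, three iterations give 343,000,000,000 accessible combinations, while under the one-bait assumption only 21,000 hybrids are ever screened.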

Schematic of the Chemotype Evolution process through 3 iterations. Note that at any point after each iteration, the hit molecules can be taken into hit-to-lead optimization.

The CE approach is similar in concept to the “tethering” approach pioneered at Sunesis, but differs in that no protein engineering of cysteine residues needs to be performed.  The bait molecule performs the role of the engineered cysteine, providing a “handle” that binds to the target and selects for complementary fragment binders.

Carmot Therapeutics just embarked upon their first major industry collaboration with the January 2014 announcement of a partnership with Amgen to use CE technology against two challenging targets.  Identifying leads and developing hits will be carried out jointly between the companies, while clinical trials will proceed at Amgen.  I think Carmot is definitely a company to watch given its innovative and potentially paradigm-shifting discovery technology and increasing interest from big pharma.