Big-Data Analysis Points Toward New Drug Discovery Method
New Tool May Identify Treatments for Cancer, Other Diseases
A research team led by scientists at UC San Francisco has developed a computational method to systematically probe massive amounts of open-access data to discover new ways to use drugs, including some that have already been approved for other uses.
The method enables scientists to bypass the usual experiments in biological specimens and to instead do computational analyses, using open-access data to match FDA-approved drugs and other existing compounds to the molecular fingerprints of diseases like cancer. The specificity of the links between these drugs and the diseases they are predicted to be able to treat holds the potential to target drugs in ways that minimize side effects, overcome resistance and reveal more clearly how both the drugs and the diseases are working.
“This points toward a day when doctors may treat their patients with drugs that have been individually tailored to the idiosyncrasies of their own disease,” said first author Bin Chen, PhD, assistant professor with the Institute for Computational Health Sciences (ICHS) and the Department of Pediatrics at UCSF.
In a paper published online on July 12 in Nature Communications, the UCSF team used the method to identify four drugs with cancer-fighting potential, demonstrating that one of them – an FDA-approved drug called pyrvinium pamoate, which is used to treat pinworms – could shrink hepatocellular carcinoma, a type of liver cancer, in mice. This cancer, which is associated with underlying liver disease and cirrhosis, is the second-leading cause of cancer deaths around the world – with a very high incidence in China – yet it has no effective treatment.
Large-Scale Analyses Without Need For Biological Experiments
The researchers first looked in The Cancer Genome Atlas (TCGA), a comprehensive map of genomic changes in nearly three dozen types of cancer that contains more than two petabytes of data, and compared the gene expression signatures in 14 different cancers to the gene expression signatures for normal tissues that were adjacent to these tumors. This enabled them to see which genes were up- or down-regulated in the cancerous tissue, compared to the normal tissue.
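The tumor-versus-normal comparison described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function name `disease_signature`, the gene labels, the expression values and the fold-change cutoff are all hypothetical, and a real analysis would likely apply a statistical test with multiple-testing correction rather than a bare cutoff.

```python
import math
from statistics import mean


def disease_signature(tumor, normal, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes that pass the cutoff.

    `tumor` and `normal` map a gene name to a list of expression values
    from tumor samples and from adjacent normal tissue, respectively.
    Positive values mean up-regulated in tumor; negative, down-regulated.
    """
    signature = {}
    for gene in tumor.keys() & normal.keys():
        # A pseudocount of 1 keeps the ratio defined when a mean is zero.
        ratio = (mean(tumor[gene]) + 1.0) / (mean(normal[gene]) + 1.0)
        log2fc = math.log2(ratio)
        if abs(log2fc) >= min_abs_log2fc:
            signature[gene] = log2fc
    return signature


# Made-up expression values for three placeholder genes.
tumor = {"GENE_A": [30.0, 34.0], "GENE_B": [3.0, 3.0], "GENE_C": [10.0, 12.0]}
normal = {"GENE_A": [7.0, 7.0], "GENE_B": [15.0, 15.0], "GENE_C": [11.0, 11.0]}

sig = disease_signature(tumor, normal)
for gene in sorted(sig):
    direction = "up" if sig[gene] > 0 else "down"
    print(f"{gene}: {direction}-regulated in tumor (log2FC = {sig[gene]:.2f})")
```

In this toy input, GENE_A comes out up-regulated, GENE_B down-regulated, and GENE_C is dropped because its expression barely differs between the two tissues.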
Once they knew that, they were able to search in another open-access database, called the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 dataset, to see how thousands of compounds and chemicals affected cancer cells. The researchers ranked 12,442 small molecules profiled in 71 cell lines based on their ability to reverse abnormal changes in gene expression that lead to the production of harmful proteins. Each of these profiles included measurements of gene expression from 978 “landmark genes” at different drug concentrations and different treatment durations. The abnormal changes are common in cancers, although different tumors exhibit different patterns of abnormalities.
The researchers used a third database, ChEMBL, for data on how well biologically active chemicals killed specific types of cancer cells in the lab – specifically for data on a drug efficacy measure known as the IC50. Finally, Chen used the Cancer Cell Line Encyclopedia to analyze and compare molecular profiles from more than 1,000 cancer cell lines.
Their analyses revealed that four drugs were likely to be effective, including pyrvinium pamoate, which they tested against liver cancer cells that had grown into tumors in laboratory mice.
“Since in many cancers, we already have lots of known drug efficacy data, we were able to perform large-scale analyses without running any biological experiments,” Chen said.
A Better Predictor of Drug Efficacy
He and colleagues developed a ranking system, which he calls the Reverse Gene Expression Score (RGES), a predictive measure of how a given drug would reverse the gene-expression profile in a particular disease – tamping down genes that are over-expressed, and ramping up those that are weakly expressed, thus restoring gene expression to levels that more closely match normal tissue.
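The idea behind a reversal score can be sketched in a few lines of Python. This is a simplified stand-in, not the published RGES calculation: it uses cosine similarity between a disease signature and a drug-induced expression change, so that a strongly negative value means the drug pushes gene expression in the opposite direction from the disease. The function name `reversal_score`, the compound names and all numbers are hypothetical.

```python
import math


def reversal_score(disease_sig, drug_sig):
    """Cosine similarity over shared genes; negative values suggest reversal.

    Both arguments map a gene name to a signed expression change
    (disease vs. normal tissue, or drug-treated vs. untreated cells).
    """
    shared = disease_sig.keys() & drug_sig.keys()
    dot = sum(disease_sig[g] * drug_sig[g] for g in shared)
    norm_disease = math.sqrt(sum(disease_sig[g] ** 2 for g in shared))
    norm_drug = math.sqrt(sum(drug_sig[g] ** 2 for g in shared))
    if norm_disease == 0 or norm_drug == 0:
        return 0.0  # No usable signal on the shared genes.
    return dot / (norm_disease * norm_drug)


# Made-up disease signature: GENE_A and GENE_C over-expressed, GENE_B under-expressed.
disease = {"GENE_A": 2.0, "GENE_B": -2.0, "GENE_C": 1.0}

# Made-up drug-induced changes for three placeholder compounds.
drugs = {
    "compound_x": {"GENE_A": -1.5, "GENE_B": 1.8, "GENE_C": -0.5},  # opposes the disease
    "compound_y": {"GENE_A": 1.0, "GENE_B": -1.0, "GENE_C": 0.5},   # mimics the disease
    "compound_z": {"GENE_A": 0.1, "GENE_B": 0.1, "GENE_C": 0.0},    # little effect
}

# Most negative score first: the strongest predicted reversal ranks at the top.
ranked = sorted(drugs, key=lambda name: reversal_score(disease, drugs[name]))
for name in ranked:
    print(name, round(reversal_score(disease, drugs[name]), 3))
```

Sorting in ascending order puts the compound that most strongly opposes the disease signature first, which mirrors the ranking step described in the article under this sketch's sign convention.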
Chen used open-access databases to determine that RGES was correlated with drug efficacy in liver cancer, breast cancer and colon cancer. He focused on liver cancer cell lines, but since they have not been investigated as much as breast and colon cancer cell lines, there was far less data available to study them. So, he used RGES scores for drugs and other biologically active molecules that had been tested on non-liver cancer cell types. The RGES scores were powerful enough that he could still predict which molecules might kill liver cancer cells.
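One common way to check whether a score tracks drug efficacy is a rank correlation between the score and a measured potency value such as IC50. The sketch below computes a Spearman correlation in plain Python. The article does not say which statistic the team used, so this choice is an assumption, and the scores and IC50 values are invented for illustration only.

```python
from statistics import mean


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Invented values: one reversal-style score and one IC50 (micromolar) per compound.
scores = [-0.9, -0.5, -0.1, 0.3, 0.7]
ic50_um = [0.2, 1.5, 0.9, 12.0, 30.0]

rho = spearman(scores, ic50_um)
print(f"Spearman rho = {rho:.2f}")
```

A lower IC50 means a compound is more potent. Under the sign convention assumed here, where a more negative score means stronger reversal, a positive correlation would indicate that compounds predicted to reverse the disease signature also tend to kill cancer cells at lower concentrations.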
Chen’s collaborators from the Asian Liver Center at Stanford University examined four candidate molecules with known mechanisms of drug action. They found that all four killed five distinct liver cancer cell lines grown in the lab. Pyrvinium pamoate was the most promising drug, shrinking liver tumors grown beneath the skin in mice.
Cancer researchers usually target individual genetic mutations, but Chen said drugs that are targeted in this way often are less effective than anticipated and generate drug resistance. He said a broader measure such as RGES might lead to better drugs and also help researchers identify new drug targets.
Because RGES is based on the molecular characteristics of real tumors, Chen said it also may be a better predictor of a drug’s true clinical promise than high-throughput screening of large panels of drugs and other small molecules, which is based on drug activity in lab-grown cell lines.
“As costs come down and the number of gene expression profiles in diseases continues to grow, I expect that we and others will be able to use RGES to screen for drug candidates very efficiently and cost-effectively,” Chen said. “Our hope is that ultimately our computational approach can be broadly applied, not only to cancer, but also to other diseases where molecular data exist, and that it will speed up drug discovery in diseases with high unmet needs. But I’m most excited about the possibilities for applying this approach to individual patients to prescribe the best drug for each.”
The senior UCSF co-author on the study was Atul Butte, MD, PhD, director of the ICHS. The senior co-author from Stanford was Mei-Sze Chua, PhD, senior research scientist at the Asian Liver Center (ALC) and Department of Surgery at Stanford University School of Medicine. The co-first author from Stanford was Li Ma, PhD, a postdoctoral fellow at the Stanford ALC. Additional co-authors from UCSF include, from the ICHS and Department of Pediatrics, Marina Sirota, PhD, an assistant professor, and Hyojung Paik, PhD, a postdoctoral fellow; additional authors from the Stanford ALC and Department of Surgery were Wei Wei, PhD, a research associate, and Samuel So, MD, the executive director of the ALC, and the Lui Hac Minh Professor and Professor of Surgery at Stanford University School of Medicine.
The study was funded by the National Institutes of Health. Butte is a founder and scientific advisor to NuMedii, Inc., a drug-discovery company.
UC San Francisco (UCSF) is a leading university dedicated to promoting health worldwide through advanced biomedical research, graduate-level education in the life sciences and health professions, and excellence in patient care. It includes top-ranked graduate schools of dentistry, medicine, nursing and pharmacy; a graduate division with nationally renowned programs in basic, biomedical, translational and population sciences; and a preeminent biomedical research enterprise. It also includes UCSF Health, which comprises three top-ranked hospitals, UCSF Medical Center and UCSF Benioff Children’s Hospitals in San Francisco and Oakland, and other partner and affiliated hospitals and healthcare providers throughout the Bay Area.