NAS Report Calls for Building Biomedical Knowledge Network to Drive Precision Medicine

A National Academy of Sciences committee co-chaired by UCSF Chancellor Susan Desmond-Hellmann, MD, MPH, recommends the creation of a Google Maps-like data network that could transform the future of medical discovery, diagnosis and treatment.

The so-called “Knowledge Network” would integrate the wealth of data emerging on the molecular basis of disease with information on environmental factors and patients’ electronic medical records, with the goal of developing more diagnostics and treatments tailored to individual patients — known as “precision medicine.”

This diagram illustrates a comprehensive biomedical knowledge network that supports a New Taxonomy of disease.

The development of this broadly accessible data network would allow scientists to share emerging research findings faster, thereby accelerating the development of tailored treatments. It also would allow clinicians to make more informed decisions about treatments. It would reduce health care costs and ultimately improve care.

The report, titled “Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and a New Taxonomy of Disease,” [PDF] is the result of a one-year study conducted at the special request of Francis Collins, MD, PhD, director of the National Institutes of Health (NIH).

Keith Yamamoto, PhD, vice chancellor for research at UCSF, who served on the National Academy of Sciences committee, considers the proposal “the most important National Academy of Sciences Framework Analysis since that advisory body recommended that the United States go forward with the Human Genome Project.”

The task of the committee, which also included Bernard Lo, MD, UCSF professor of medicine and director of the Program in Medical Ethics, was to propose a strategy for incorporating molecular data into the current classifications of diseases — going beyond descriptions that are today generally defined broadly by organs and symptoms — thus creating a “new taxonomy” of disease.   

But the committee went further, proposing an infrastructure in which the patient would be the lynchpin of a health care system where findings from the laboratory would inform patient care, and individuals’ responses to treatment would inform basic research.

The Knowledge Network would be centered on a dynamic, interactive data repository, or “Information Commons,” that, like Google Maps, would link layers of data to reveal information. The layers – environmental exposures, signs and symptoms, genetics, epigenetics, microbial exposures and other types of patient data – would be linked to data on individual patients. Data would be continuously refreshed with new basic and clinical research results and patient outcomes.
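
The layered design described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not anything specified in the report: the class name InformationCommons, its methods, the patient IDs and the placeholder observations are all hypothetical, and only the layer names come from the description above.

```python
from collections import defaultdict

# Layer names taken from the article; everything else here is hypothetical.
LAYERS = (
    "environmental exposures",
    "signs and symptoms",
    "genetics",
    "epigenetics",
    "microbial exposures",
)


class InformationCommons:
    """Toy layered store: each layer maps a patient ID to a set of observations."""

    def __init__(self):
        self._layers = {name: defaultdict(set) for name in LAYERS}

    def add(self, layer, patient_id, observation):
        """Record one observation for one patient in the named layer."""
        self._layers[layer][patient_id].add(observation)

    def patient_view(self, patient_id):
        """Stack every layer for one patient, like overlaying map layers."""
        return {
            name: sorted(records.get(patient_id, ()))
            for name, records in self._layers.items()
        }

    def patients_sharing(self, layer, observation):
        """Find patients linked by a common observation within one layer."""
        return sorted(
            pid for pid, seen in self._layers[layer].items() if observation in seen
        )


# Placeholder data, invented purely for illustration.
commons = InformationCommons()
commons.add("genetics", "patient-1", "variant-A")
commons.add("genetics", "patient-2", "variant-A")
commons.add("signs and symptoms", "patient-1", "elevated blood sugar")

print(commons.patients_sharing("genetics", "variant-A"))
print(commons.patient_view("patient-1"))
```

Keeping each layer as its own index keyed by patient is one simple way to let a researcher query across patients within a layer while a clinician views all layers for a single patient. A real repository would also need the consent, privacy and validation safeguards the committee discusses.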

The approach would enable basic scientists to mine and manipulate patient data in order to explore common molecular mechanisms across illnesses, and to test their hypotheses about the causes of diseases. Clinicians could tap into the network to learn about the latest findings, informing their diagnoses and enriching their treatment approaches.

The challenge of creating such a complex data network would be like “building Europe’s great cathedrals,” said co-chair Charles Sawyers, MD, a Howard Hughes Medical Institute investigator and the inaugural director of the Human Oncology and Pathogenesis Program at Memorial Sloan-Kettering Cancer Center. “One generation will start building them, but they will ultimately be completed by another, with plans changing over time.”

Far-Reaching Impact for Treating Disease
The implications of the initiative could be significant. Many diseases, such as Type II diabetes, are thought to have diverse molecular and environmental influences, yet are classified — and treated — as one disease. Conversely, many illnesses share a common molecular cause, yet do not share the same disease classification.

Exacerbating the problem, the International Classification of Diseases (ICD), used to track and diagnose disease and determine the level of reimbursement for care, is based primarily on physical signs and symptoms, seldom incorporating new data, patient characteristics or socio-environmental influences on disease. Moreover, information is slow to reach the clinic to benefit patients. Data from ICD-9, developed in 1977, remains widely used in the clinic. ICD-10, established in 1992, is just being implemented. ICD-11 is not expected to be available until 2015.

“A disconnect exists between the wealth of scientific advances in research and the incorporation of this information into the clinic,” said Desmond-Hellmann, an oncologist by training. “Biomedical research information can take years to trickle to doctors and patients, while wasteful health care expenditures are carried out for treatments that are only effective in specific subgroups. Meanwhile, researchers don’t have access to comprehensive and timely information from the clinic. Opportunities are being missed to understand, diagnose and treat diseases more precisely, and to better inform health care decisions.”

The potential for improving the practice of medicine based on molecular research findings is clear in the few areas, mostly involving cancer, where it’s being applied. Today, patients with breast cancer receive a precise diagnosis based on the specific characteristics of their tumor. This allows clinicians to tailor treatments and patients’ relatives to learn their predisposition to the disease.

Likewise, scientists know of nine molecular drivers of non-small-cell lung cancers, and in the long term, drugs could be developed to target the specific mechanisms. Already, in clinical trials, 10 percent of patients with non-small-cell lung cancer have had a dramatic response to a particular drug.

In contrast, adult patients with Type II diabetes are lumped into one category, based primarily on their age and their blood sugar levels. Physicians can’t predict their response to the drug they likely will receive: metformin, developed in 1957. Likewise, they can’t predict the likelihood that they will develop kidney failure, blindness or other diabetes-related complications, and so can’t tailor treatment regimens. Similarly, it’s impossible to assess the risk of developing diabetes among patients’ siblings and children.

Advancing Basic Research Collaborations
While the potential impact of the Knowledge Network on patients is clear, the impact on basic research would be profound as well, said Yamamoto, UCSF professor of cellular and molecular pharmacology. 

In the database ‘cloud’ envisioned by the committee, he said, “One could imagine pulling down information about who’s studying a given type of molecular mechanism, or cross-correlating that mechanism to a disease, and asking, ‘What other diseases are linked?’ ‘What environmental factors influence this mechanism?’ ‘Who are the people doing this type of work?’

“If you swept out in bigger and bigger areas of the network, pretty quickly studies, and the people doing them, would show up that you didn’t know about. You’d ask, ‘Who are these people? How are they approaching some of the same questions that interest me?’ In this imagined network, there would be a self-assembled team of collaborators. You’d be motivated to sit down with them and learn from each other. You could ask and answer questions you couldn’t possibly answer by yourself.”

The strategy for fostering new research collaborations is similar to that of the UCSF Clinical and Translational Science Institute’s Profiles database, which works by providing a quick search to discover people, research expertise, and networks of co-authors and similar people. 

Proposed Next Steps


The NAS committee’s proposed next steps for creating a Knowledge Network of disease and deriving a new taxonomy from it:

  • Conduct pilot studies that begin to populate the Knowledge Network’s data repository, or “Information Commons.”
  • Integrate data from pilot studies with results of basic biomedical research, to create a dynamic, interactive Knowledge Network.
  • Initiate a process with federal agencies to assess privacy issues.
  • Ensure widespread data sharing on the links between molecular data and disease symptoms, so that a wide diversity of researchers can mine them. Standards should provide incentives that respect privacy concerns and motivate data sharing.
  • Develop an efficient validation process to incorporate information from the Knowledge Network into a new taxonomy of disease.
  • Develop incentives that encourage public-private partnerships involving government, drug developers, regulators, advocacy groups and payers.

While the committee’s charge was to consider the value and feasibility of a new framework, the scientists also proposed exploratory first steps, which the NIH sponsors received positively, Desmond-Hellmann said. “We anticipate they will be able to use the report as a framework for their actions.”

The recommendations include a series of pilot studies to populate the Knowledge Network’s data repository, allowing scientists to assess the feasibility of integrating molecular data with medical histories and health outcomes in the ordinary course of care. One proposed pilot is a Million American Genomes Initiative (MAGI), which would involve sequencing the genomes of 1 million Americans, looking for connections with medical histories that could reveal genetic causes of disease.

Another is the UCSF/Kaiser study mapping the genetics of 100,000 patients to study disease, health and aging, which was featured prominently in the report as an example. “We expect that many at UCSF will be involved in the action plan regarding this study,” Desmond-Hellmann said.

Next steps would include integrating data from such studies with results of the basic research network.

Of course, significant thought would have to be given to the many factors supporting the network, the committee members noted in their report.

“The complexity and potential impact of such a network requires that the scientific community, together with the public, patient representatives, and disease advocacy groups, carefully consider privacy and consent issues, oversight of such research projects, validation of the clinical significance of discoveries, and pre-competitive collaboration among industry and academic researchers, and data sharing,” said committee member Lo, of UCSF.  

“Disease advocacy groups already are driving innovative research to find better treatment,” he said. Now, he said, “Scientists need to work closely with the public to explain the benefits of research on precision medicine and to collaboratively develop policies on consent, privacy and data sharing that will build public trust for such cutting-edge research.”

As for funding the endeavor, the committee suggested creative financing: redirected federal resources, collaborations with biotechnology and pharmaceutical companies, private sources and a shift in the acquisition of molecular data to the clinic setting.

The decision to further explore the implementation of the initiative rests with the NIH’s Collins, though other groups, institutions and agencies, such as UCSF, insurance companies and the U.S. Department of Health and Human Services, could begin to implement the recommendations.

The proposal, Desmond-Hellmann acknowledges, is not small, but the time to begin is now. “Our long-term goal,” she said, “is to embed our scientific enterprise into the normal course of clinical care.”

Related Links

Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and a New Taxonomy of Disease [PDF]
National Academy of Sciences Report

UCSF and Kaiser Complete Massive Genoming Project
July 21, 2011

Kaiser, UCSF Awarded $25 Million from NIH to Build Resource for Genetic Research
Oct. 13, 2009