Welcome to the Revolution.
First came steam power, then electric power, and then the information age. Now, according to the World Economic Forum, we’re entering the Fourth Industrial Revolution, as the sciences converge around digitized information and data in ways that disrupt nearly every field in every country.
Human health is one of the fields that will be—and, indeed, already is being—most transformed by this revolution, says Robert Califf, BS'73, MD'78, HS'78, HS'80-'83, vice chancellor for health science data and director of Duke Forge, the School of Medicine’s ambitious new hub for all things data science. That’s why leading academic medical institutions like Duke must be in the vanguard of this profound change: to help society adjust and determine how best to organize, analyze, and use the growing abundance of data.
Technological advances within the past two decades have given us access to unprecedented amounts of information. Duke, with its premier medical school, health system, professional education, and biomedical research enterprises, as well as top programs in engineering, computer science, statistics, social sciences, and policy, is perfectly positioned to play a critical role in determining how best to apply all this information to improve health—locally, nationally, and globally.
“Big data” is a phrase we’re starting to hear a lot, around medicine and in other fields. “It’s one of those terms that means different things to different people,” Califf says.
“The fundamental way to think about it is, when data exceeds what you can handle on your standard laptop with your standard analysis packages, it’s ‘Big.’”
-Robert Califf, vice chancellor for health science data and director of Duke Forge
There may be no bigger or more complex collection of data than the human body. We have around 3.2 billion base pairs in our genome; multiple proteins and metabolites that interact at the level of tissues, blood, and organs; and myriad behavioral factors, social interactions, and environmental impacts. Until very recently, most of that stuff was invisible to us. No more.
“Until now, computing systems were unable to handle all of these multiple dimensions,” says Califf, professor of medicine and Donald F. Fortin, MD, Professor of Cardiology. “Now, with advanced technologies like whole genome sequencing, researchers can generate massive amounts of information relevant to health and health care at every level, from individuals to entire populations. The challenge is figuring out how to organize, analyze, and harness this huge quantity of data to produce better health care policies, practices, and health outcomes. With the digitalization of information—the Fourth Industrial Revolution—we are finally at a point where we can do this.”
From Information to Impact
Big data, made up as it is of bytes and bits and data sets, can be a hard concept to get your arms around. Its great promise lies at the point where all those seemingly insubstantial particles of information intersect with real people—individuals, communities, and whole populations—facing real health challenges.
“Big data is very important for national health policy because it holds the potential to provide much more value in health care”
-Mark McClellan, MD, PhD, director of the Duke-Margolis Center for Health Policy and Robert J. Margolis, MD, Professor of Business, Medicine, and Policy
“That means more convenient care; it means lower-cost care because it can be targeted better; and it means better outcomes for people by knowing exactly what works best.
“The scope of data that is now available and the potential for generating relevant evidence on lots of questions that could never be answered before is unprecedented: How can a health care organization prevent heart attacks by addressing some of the social and community-based factors that influence how they manage their risk factors? What exact combination of treatments is needed for a patient with a certain type of cancer, or with a predisposition to dementia? How can we make decisions well and use our health care dollars most effectively?”
Those questions, and the potential they represent, fall squarely within the mission of Duke University School of Medicine and related units such as the Duke-Margolis Center, says Dean Mary E. Klotman, BS’76, MD’80, HS’80-85. She has made data science a top priority for the School of Medicine, last summer naming Michael Pencina, PhD, the new vice dean of data science and information technology. A senior member of her leadership team, Pencina is responsible for developing and implementing quantitative science strategies as they pertain to the education and research missions of the school.
“The advent of big data, and our university’s commitment to this area of opportunity, gives us an unprecedented ability to lead the way in improving health and health care while reducing costs and shedding inefficiencies,” says Klotman. “Our opportunity, and our obligation, is to bring the full spectrum of data science tools and expertise to bear on answering the hard problems and delivering care that improves outcomes, delivers value, and honors our commitment to access and health equity for vulnerable populations and communities.”
That’s where Duke Forge comes in. The School of Medicine launched “the Forge” in 2017 to act as a communication and knowledge hub to support and advance health data science being done throughout Duke, facilitating interactions and collaborations among scholars, clinicians, and experts across the university and beyond to develop actionable insights and improve health outcomes.
Califf, the former commissioner of the U.S. Food and Drug Administration, returned to Duke to direct the Forge. (Prior to his post at FDA, he was the founding director of the Duke Clinical Research Institute.) Califf’s unique professional arrangement is another boost to Duke’s efforts in this area: His time is split between Duke and Verily, which is part of Alphabet, Google’s parent company. The powerful ability of Alphabet’s enterprise to apply vast resources to handle large quantities of data and move it around the world provides a model for universities to consider when envisioning the future.
Vice Provost for Research Larry Carin, PhD, has been a major ally in the development of strategy for the Forge. Carin has played a central role in the recruitment of key faculty and in the university-wide educational program called “+DS.” +DS simply means:
“Regardless of your field, you should add data science.”
-Larry Carin, PhD, vice provost for research
Duke currently ranks in the top 15 institutions in national rankings for machine learning and artificial intelligence, making it an ideal place for the work of the Forge. Duke Forge’s goal is to “free the data” by engaging partners to curate, analyze, and disseminate reliable and actionable information that leads to improved health for individuals and populations. The Forge spans multiple schools and departments at Duke, uniting faculty, staff, and students in the challenge to create innovative ways to fuse fields like biostatistics and machine learning, implement the insights gained into patient care, and leverage digital information to improve health and prevent disease.
“Data by itself is less valuable than data joined with other types of data,” says Erich Huang, MD, PhD, co-director of Duke Forge and assistant professor of biostatistics and bioinformatics. “Duke Forge can help bring together electronic health record data, socioeconomic data, genomic data, even financial data. Big data is noisy and messy, but when you join different types of data together, it becomes easier to parse through and distinguish signal from noise, which can help us determine what we should be doing from a health care standpoint to make an impact on patient outcomes.”
Priorities and Partnerships
One of Duke Forge’s priorities is to deploy technology to address health disparities, particularly in specific urban neighborhoods and the vast expanse of rural America where health status is declining at an alarming rate, Califf says. “This is one of the main reasons I came back to Duke,” he says. “We have an important job to do now to level the playing field.”
The Forge also aims to enhance what is known as data liquidity—the ability to freely and efficiently move and share data—and ensure that information disseminated is accurate and trustworthy. Also high on the to-do list: preparing current and future generations of clinicians and scientists for a data-rich future.
A key element of the transformation of Duke’s health system will be the formation of “Learning Health Units,” which will place analysts and clinical information experts within operating units of the health system. This signature program, led by Adrian Hernandez, MD, vice dean for clinical research, will use modern analytics to guide clinical care to meet the “quadruple aim” of better clinical outcomes at a lower cost with a better experience for patients, their families, and clinicians in the system.
The Forge team works closely with other university initiatives like the Rhodes Information Initiative at Duke, an interdisciplinary program designed to increase big data computational research and expand opportunities for student engagement in this rapidly growing field, and with faculty experts at other schools.
“The Forge brings together a core of thought leaders in data science and associated technology so we can assist Duke Health faculty in their research efforts,” Huang explains. “Most of our clinician-scientists aren’t experts in data science—they’re experts in surgery or cardiology or their scope of practice. Our job at Duke Forge is to provide that expertise, advice, and assistance to them. When we marry the people who have the technical skills in the domain of big data with the people with the clinical domain expertise, that’s very powerful.”
Creating a Road Map for Medicine
In medicine, a key step toward the data-driven future is to create an accurate “road map,” Califf explains. He likens it to the work of his Alphabet colleagues, who are applying fourth-revolution concepts to driverless cars. “You can’t think about driverless cars unless you have mapped every single road in the country,” says Califf. “That mapping gives us the ability to integrate information from all those roads in real time, and even have a human voice share real-time decision support with you in an interactive manner.”
For human biology and health care, we don’t yet have that map, but Califf believes Duke will be instrumental in creating it.
“Not long ago, the ability to map the human condition was unthinkable,” he says. “But now, with the cloud and other massive changes that have occurred in computing, it’s actually possible to map out the biology and the behavioral factors and the social interactions and more. We’re in the very early phases of making the maps, but of course, health can’t wait, so we also have investigators here who are already applying data science in exciting ways to improve health care.” (See sidebars)
Revolutions are seldom tidy, and in the years ahead, charting the big data revolution will present challenges we no doubt haven’t even thought of yet. But the potential for enormous change is invigorating, and few institutions in the nation can match the combination of strengths and resources that Duke can bring to the task, says McClellan.
“As a leading academic medical center and a global coordinator for clinical data, Duke is in a great position to help to pull together data on socioeconomic and non-medical factors that influence health, thanks to the strong programs that this university has and is building on related to population health and well-being,” McClellan notes. “And Duke has some of the world’s leading methodologic experts on how to turn data into evidence, in artificial intelligence and machine learning and system design. There are not very many places that can pull all of those critical pieces together.”