Predictive Analytics

The authors report no relationships that could be construed as a conflict of interest. The views expressed in this article are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute, National Institutes of Health, or the Centers for Disease Control and Prevention. From the *Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA; and the †Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, GA, USA. Correspondence: M. M. Engelgau (michael.engelgau@nih.gov).

Institutes of Health (NIH). In most high-income countries, extensive efforts have synthesized this knowledge through systematic reviews and developed evidence-based guidelines for intervention delivery within health care systems and community-based settings [1-7]. In low- and middle-income countries, the World Economic Forum and World Health Organization have studied the economic toll of noncommunicable diseases and the cost of scaling up a set of proven-effective interventions, so-called best buys (e.g., providing drug therapy and counseling for eligible persons at high risk to prevent heart attacks and strokes) [8]. Thus, effective interventions are now available and recommended for implementation across the globe. However, adaptable and sustainable implementation strategies for HLBS interventions are lacking, and the return on investment for this vast knowledge base is diminished as a result. This has led many, including leadership from institutions such as the World Health Organization [9], World Bank [10], academia [11], US Agency for International Development [12], and NIH [13-19], to call for a more developed implementation research agenda.
Implementation research studies intervention delivery in real-world contexts of health care delivery systems and community public health systems, with the goal of delivering interventions optimally and sustainably across the entire socioecological context [20]. Several major challenges for implementation research include understanding key barriers and facilitators within the socioecological context, various health and community policies, delivery strategies within health systems (e.g., physical infrastructure, staffing and their skill levels, availability of diagnostics and therapeutics, geospatial issues), and community contexts (e.g., disease burden, community resources, social deprivation, and economic opportunities). Elements at each of these levels can impact intervention implementation and thus enhance or impair intervention delivery and outcomes.
Predictive analytics [21] offers novel methodologies that could pinpoint key barriers and facilitators across the community context and lend insight into which implementation strategies might be most promising for research agendas. Implementing the NHLBI Strategic Vision [22], in which implementation research is prominent, will potentially benefit from predictive analytics informing the research agenda. In this perspective, we briefly review the opportunities and challenges of predictive analytics for refining the implementation research agenda targeted at reducing the population burden of HLBS disorders.

WHAT IS PREDICTIVE ANALYTICS?
Predictive analytics in health is a set of analytic procedures that take existing information and forecast future probabilities of disease patterns using population- and individual-level data along with biomedical and other types of data [21-24]. These annotated data can then drive analytic models that can make health decisions [24] at both the individual and population levels. This differs markedly from traditional hypothesis-driven biomedical research. Predictive analytics applies statistical algorithms to comprehensive data (e.g., geospatial, burden of disease, demography, variation in community and health care capacity and in local resource settings) and strives to understand the complex interrelationships between determinants of health, the variability of health care and public health service delivery, and the likelihood of future health outcomes. Predictive analytics could help forecast potential solutions for populations, specifically vulnerable population subgroups, by simulating implementation strategies and then exploring the facilitators and challenges in specific contexts, while simultaneously considering available resources and capacity.
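The idea of forecasting probabilities and then simulating an implementation strategy can be illustrated with a minimal sketch. It assumes a logistic model; the coefficients, the three predictor names, the made-up population, and the "mobile clinic" scenario are all hypothetical and are not taken from any study cited here. It also assumes that changing a predictor changes risk in the way the model implies, which a purely predictive model does not guarantee.

```python
import math

# Hypothetical coefficients for illustration only; not estimated from real data.
INTERCEPT = -3.0
COEFFICIENTS = {
    "age_decades": 0.4,    # age in units of 10 years
    "smoker": 0.7,         # 1 if current smoker, else 0
    "km_to_clinic": 0.03,  # distance to the nearest clinic, in km
}


def predicted_risk(person):
    """Return a probability between 0 and 1 from a logistic model."""
    score = INTERCEPT + sum(COEFFICIENTS[k] * person[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-score))


def mean_risk(people):
    """Average predicted risk across a group of people."""
    return sum(predicted_risk(p) for p in people) / len(people)


# A tiny made-up population combining individual and community-level data.
population = [
    {"age_decades": 5.0, "smoker": 1, "km_to_clinic": 40.0},
    {"age_decades": 6.5, "smoker": 0, "km_to_clinic": 5.0},
    {"age_decades": 4.0, "smoker": 1, "km_to_clinic": 25.0},
]

# Hypothetical delivery scenario: a mobile clinic caps travel distance at 10 km.
scenario = [dict(p, km_to_clinic=min(p["km_to_clinic"], 10.0)) for p in population]

baseline = mean_risk(population)
with_clinic = mean_risk(scenario)
print(f"baseline mean predicted risk: {baseline:.3f}")
print(f"scenario mean predicted risk: {with_clinic:.3f}")
```

Comparing the two averages is the simplest form of scenario simulation; a real analysis would estimate the model from data and would need to justify any causal reading of the predictors.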

HOW PREDICTIVE ANALYTICS IS USED IN HEALTH
Currently, predictive analytics is used mostly as a business tool and now increasingly in health. Businesses may use current and past data to better understand their customers, products, and competitors while also identifying potential opportunities and risks [21]. This approach is not hypothesis driven per se. Rather, the data drive the analytical direction seeking novel relationships. For health, understanding how the community context can facilitate delivery of health interventions is key. Predictive analytics can simulate a wide variety of clinical-, systems-, and population-level delivery interventions and also predict outcome measures and forecast results from selected intervention delivery scenarios. These efforts are part of larger movements attempting to understand how best to align areas such as precision health, precision public health, and population health [25-27]. Predictive analytics can help determine efficient implementation research strategies that then can be tested in delivery systems.
Examples of predictive analytics vary broadly, with uses in individuals, populations, and health systems, and can incorporate large amounts of data from a single source or several sources with the goal of predicting disease risk or health events [24,28-37] (Table 1). The spectrum of data sources can range from genomics and epidemiology to social media and can provide a range of results that include identifying disease burdens and risk factors, high health risk states, and strategies to target resources at high-risk or high-burden groups. A common theme is the use of large amounts of data and, in some cases, multiple types of data beyond traditional health-related data, used in innovative ways.

SOME LIMITATIONS OF PREDICTIVE ANALYTICS
We need to better understand how prediction models actually impact health care decisions, patient outcomes, cost, and quality of care. While we can be thoughtful in identifying health information that is enlightening, determining whether it triggers a health decision or action can be challenging [38]. Other technical modeling challenges include the quality and heterogeneity of the data used for the analyses, model calibration (models cannot account for the effects of unmeasured covariates, which can lead to misleading results), and finally, gaining the trust of providers and consumers who are new to using this type of information and to changing behaviors.
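One common way to examine calibration is to group cases by predicted risk and compare the mean prediction in each group with the observed event rate. The sketch below assumes that approach; the function name `calibration_table` and the eight predictions and outcomes are made up for illustration and do not come from any cited study.

```python
def calibration_table(predictions, outcomes, n_bins=2):
    """Return (mean predicted risk, observed event rate) for each risk group.

    Cases are sorted by predicted risk and split into n_bins groups of
    roughly equal size; empty groups are skipped.
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    pairs = sorted(zip(predictions, outcomes))
    n = len(pairs)
    rows = []
    for i in range(n_bins):
        chunk = pairs[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows


# Made-up predictions (probabilities) and outcomes (1 = event occurred).
predictions = [0.1, 0.1, 0.2, 0.2, 0.6, 0.6, 0.8, 0.8]
outcomes = [0, 0, 0, 1, 1, 1, 1, 1]

table = calibration_table(predictions, outcomes, n_bins=2)
for mean_pred, observed in table:
    print(f"mean predicted {mean_pred:.2f} vs observed {observed:.2f}")
```

Large gaps between the two columns suggest that the model over- or under-predicts risk in that group, which is one way the effect of an unmeasured covariate can surface.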
The challenges with the use of social media data need careful consideration. Limitations include sample bias due to highly variable participation rates across age and population groups, and the difficulty of extracting useful content from large volumes of data that contain little relevant content [35,36]. Beyond sample bias, interventions delivered by social media can also cause short-term backlash when users realize they do not control the information they receive. Medium-term repercussions can include potentially driving unhealthy behaviors underground (e.g., participants will refrain from tweeting about unhealthy behaviors and/or will obfuscate them to avoid social media targeting). Finally, longer-term risks may include changes in what people share in the public social media domain.

HOW CAN PREDICTIVE ANALYTICS HELP ADVANCE IMPLEMENTATION RESEARCH?
Implementation research is focused on the portion of the translational research spectrum that takes proven-effective interventions and studies their implementation in the actual context where they will be delivered [39] (Fig. 1). With T4 translation research, all the real-world factors that can facilitate implementation, or those that can be barriers to it, are critically important. For example, these can include the following domains: individual and patient factors; family dynamics; community, social, and physical environment; geographic location (e.g., urban/rural); social determinants of disease; educational and economic opportunities; social and physical mobility; and social and public health policy. These domains all can contribute to health risk, disease status, or health outcomes at the individual and at the population levels, and can affect implementation efforts. They also are the drivers that will enhance, and the challenges that will slow, implementation of health interventions at both the individual and community levels.

Table 1

Individual
Aim: To construct a genomic risk score for CVD and to estimate its potential as a screening tool for primary prevention [30,33].
Data sources: Large-scale genome-wide data combined with targeted genetic association data.
Results: Complements use of conventional risk factor predictions and predicts future CVD trajectory for clinical-level prevention and treatment.

Aim: To determine pharmacist interventions that maximize benefits and minimize risks from medications while improving outcomes [34].
Data sources: Medical and medication records, health system claims data, physician notes.
Results: Tailors medication use to individual benefits within the health system.

Population
Aim: To develop a predictive algorithm to estimate 5-yr risk of incident CVD in community settings [28].
Data sources: Longitudinal population surveys.
Results: Population risk levels for policy decisions.

Aim: To use expressed Twitter language to characterize community-level psychological correlates of age-adjusted mortality from CVD [35,36].
Data sources: Twitter language content, county-level mortality rates, census data.
Results: Local CVD health intervention prioritization.

Health system
Aim: To determine population-level health experiences, population health, and health care costs [29,37].
Data sources: Health system, social, environmental, and geographic data; community health status and vulnerabilities.
Results: Targets resources to have largest impact.

As we found in the health examples using predictive analytics (Table 1), the ability to assimilate large and diverse data can help us understand how these domains combine and interact across the socioecological spectrum, from the individual to the community and to the complex system-level environment. These insights should inform implementation strategies, help identify those at risk or with disease burden, and uncover the key levers needed for successful intervention delivery. By characterizing the complex systems in which people live and work, this will also provide better approaches to identify and address health inequities (that is, preventable health disadvantage due to poor socioeconomic and social determinants) driven by factors that go beyond those in the health sector [40]. Predictive analytics approaches can incorporate data that describe the burden of risk or disease in the context of the community environment while considering locale, assets, resources, and opportunity.

DISCUSSION
Predictive analytics provides a methodological approach that could be valuable in developing more refined implementation research agendas. It is becoming of great interest to NIH and NHLBI: it was the topic of a recent NIH methodology seminar and is the topic of an NHLBI workshop planned for April 2019. It may help identify high-burden communities where health inequities are common, reveal important levers that can yield maximum benefit, and determine factors and drivers of future adverse health outcomes, both clinical and public health, at the individual and population levels. It may also help tackle the complex systems where individuals and populations receive their clinical and public health care and also live, work, and play. All this information will help inform the implementation research agenda for NHLBI. However, the scope of predictive analytics is broader than described here. Precision medicine and precision public health are leveraging this tool along with other tools that employ big data approaches such as machine learning, artificial intelligence, neural networks, and data mining [21,25,26,41]. Use of these approaches can be tailored for the individual or population. The use of predictive analytics may enhance the emerging field of precision public health, whose goal is to predict and understand population risks and tailor interventions to the populations that are most likely to benefit from a specific intervention [25]. More broadly, it has been noted to have attributes that improve our ability to prevent disease, promote health, and reduce health disparities in populations by applying emerging methods and technologies for measuring disease, pathogens, exposures, behaviors, and susceptibility in populations, and by developing policies and targeted implementation programs to improve health [27].
Some lessons from delivering individual-based precision medicine may be beneficial and complementary to successfully delivering precision public health. For example, precision public health delivery may benefit from combining individual- and population-level prevention efforts because individual genetically targeted approaches can have population-level impact (e.g., familial hypercholesterolemia). In addition, precision medicine methods can incorporate big data methods and help define new approaches for precision public health [27]. However, others have cautioned that precision public health should not focus solely on genetics because this may detract from the impact of considering broader social determinants [26]. Other challenges include ethical, legal, and privacy issues associated with accessing and analyzing big data from individuals and groups without consent. These same issues would challenge delivery of interventions through predictive analytics approaches. In sum, the benefits and the risks associated with a predictive analytic approach will both need to be weighed.

CONCLUSIONS
To date, research investments have yielded many interventions that we know are highly effective. Yet they are not being delivered in ways that achieve individual- or population-level impact. Implementation research can provide a means to determine optimal and sustainable strategies to deliver these interventions. Calls have been made for more efforts in this research area. Novel tools such as predictive analytics are currently tapping into big data and developing prediction models for health outcomes at the individual, population, and system levels. These models may incorporate big data from many domains and simulate the complex environments where interventions are delivered. These simulations may prove to be a valuable asset because they may be able to refine the implementation research agenda and make it more efficient. This becomes even more important under current scenarios of limited resources. In conclusion, implementing the NHLBI Strategic Vision and its focus on implementation research will benefit from predictive analytics informing our implementation research agenda as we chart the future together.