Developing Non-Laboratory Cardiovascular Risk Assessment Charts and Validating Laboratory and Non-Laboratory-Based Models

Background: Developing a simplified risk assessment model based on non-laboratory risk factors that can determine cardiovascular risk as accurately as a laboratory-based one can be valuable, particularly in developing countries where resources are limited. Objective: To develop a simplified non-laboratory cardiovascular disease (CVD) risk assessment chart based on a previously reported laboratory-based chart, and to evaluate the internal validation, external validation, and recalibration of both risk models in order to assess the performance of the risk scoring tools in another population. Methods: A 10-year non-laboratory-based risk prediction chart was developed for fatal and non-fatal CVD using Cox proportional hazards regression. Data from the Isfahan Cohort Study (ICS), a population-based study of 6504 adults aged ≥ 35 years who were followed up for at least ten years, were used to derive the non-laboratory-based model. Participants were followed up until the occurrence of CVD events. Tehran Lipid and Glucose Study (TLGS) data were used to evaluate the external validity of both the non-laboratory and laboratory risk assessment models in a population other than the one used for model derivation. Results: In the discrimination and calibration analyses, the non-laboratory model had a Harrell’s C of 0.73 (95% CI 0.71–0.74) and a Nam-D’Agostino χ2 of 11.01 (p = 0.27), respectively. The non-laboratory model agreed with the laboratory model and classified high-risk and low-risk patients as accurately as the laboratory one. Both the non-laboratory and laboratory risk prediction models showed good discrimination in the external validation, with Harrell’s C of 0.77 (95% CI 0.75–0.78) and 0.78 (95% CI 0.76–0.79), respectively. Conclusions: Our simplified risk assessment model based on non-laboratory risk factors could determine cardiovascular risk as accurately as the laboratory-based one. This approach can provide a simple risk assessment tool where laboratory testing is unavailable, inconvenient, or costly.


Introduction
Cardiovascular disease (CVD) is the most common preventable non-communicable disease (NCD) worldwide, with an estimated 17.8 million deaths in 2017. It is predicted that CVD will cause more than 23 million deaths (about 30.5%) worldwide by 2030 [1][2][3].
A reduction of CVD mortality rates has been reported in high income regions. However, 50% of CVD mortality and 80% of the CVD global burden occur in low and middle-income countries (LMICs), including the Eastern Mediterranean Region (EMR) [4].
In Iran, over the last two decades, the most common causes of death have shifted from infectious diseases to NCDs, especially CVD [5]. Global Burden of Disease (GBD) data for 2010 and 2015 showed that CVD was the leading cause of mortality and disability-adjusted life years (DALYs), accounting for 46% of all deaths and 20-23% of the burden of disease in Iran [6].
Many global prevention and control guidelines recommend applying CVD risk assessment charts to identify people at high risk of developing CVD within a specified period of time, usually 10 years. Risk-based management helps to target specific intensive preventive and treatment interventions [7][8][9][10]. Several risk assessment models have been reported previously [11][12][13][14][15][16][17][18]. Although the widely used risk algorithms can be beneficial in developed countries, they are based on laboratory measures that are neither affordable nor available in some LMICs [19,20].
Recently, significant efforts have been directed to develop non-laboratory based CV risk algorithms that can predict the disease as accurately as laboratory-based ones but are more feasible to use in clinical practice [10,20].
Laboratory and non-laboratory based CVD risk charts were developed by WHO for different regions worldwide and validated in other cohorts [10]. However, one potential limitation is the use of hazard ratios (HRs) derived from a mainly Western population in an LMIC. Additionally, the new WHO charts are recalibrated by region and might be improved by focusing on national data. Providing affordable approaches for the prediction of CVD risk based on national data is especially crucial in LMICs, where many primary care facilities are not available and most individuals remain unaware of their underlying cardiovascular risk [20,21].
To simplify the laboratory-based Persian Atherosclerotic cardiovascular disease Risk Stratification (PARS) model that we reported previously [22], we aimed to develop a simplified Persian Atherosclerotic cardiovascular disease Risk Stratification (SPARS) model based on non-laboratory risk factors and to assess whether it can predict CVD risk as accurately as the laboratory-based PARS model. Furthermore, we aimed to validate the PARS and SPARS models, both developed from the Isfahan Cohort Study (ICS), in the Tehran Lipid and Glucose Study (TLGS) population [23].

Study population
The ICS is a population-based longitudinal study of 6504 Iranian adults, who were recruited in 2001 using multi-stage random cluster sampling and followed up for at least ten years [24].
Written informed consent was obtained from all subjects. Ethical approval was obtained from the Ethics Committee of the Isfahan Cardiovascular Research Center, a WHO collaborating center in the Eastern Mediterranean Region (EMR), and from Isfahan University of Medical Sciences, and the study conformed to the Declaration of Helsinki.
Inclusion criteria were Iranian nationality, age ≥ 35 years, mental competence, and not being pregnant. Participants with known coronary heart disease, heart failure, stroke and ischemic heart attack (n = 181) and subjects without a single follow-up (n = 891) were excluded from the study. Of the originally recruited sample, 5432 participants were free of CVD at baseline and had at least one follow-up.
Follow-up by phone call was carried out every two years to identify any reported CVD events, including sudden death, unstable angina, fatal and non-fatal myocardial infarction, and fatal and non-fatal stroke. All medical records of participants who reported CVD events were collected, and verbal autopsies were performed for deceased participants. A panel of cardiologists and neurologists reviewed the patients' medical records to validate and confirm the event diagnoses [25].
Loss to follow-up was 891 participants (14.1%) at the first phone call follow-up and decreased to 104 (1.6%) at the fifth stage. The baseline characteristics and prevalence of CVD risk factors were not significantly different between subjects lost to follow-up and those studied [25,26]. Participants without any event and those lost to follow-up were considered censored. The detailed description of the design, methodology, follow-up, follow-up success rate, risk factor measurements, and endpoints of the ICS has been reported previously [22,25,27].

External Cohort
The Tehran Lipid and Glucose Study (TLGS) was used for external validation because of the similarities of the research protocols including adequate follow-up duration, using similar CVD events definition, similar age range and systematic measurement of CVD risk factors.
The TLGS is a longitudinal study that started in 1999 to identify risk factors for non-communicable diseases among the population of district no. 13 of Tehran, the capital of Iran. The study methods have been described elsewhere [23,28].

Development of the Non-laboratory-based model
In our previously reported risk model based on laboratory risk factors, we considered age, sex, systolic blood pressure (SBP), total cholesterol (TC), diabetes (based on fasting blood test), smoking status, family history of CVD, and waist-to-hip ratio (WHR) [22]. In the current study, we developed a new non-laboratory-based model, named simplified PARS (SPARS), by considering risk factors such as age, smoking status, SBP, self-reported history of diabetes, and WHR. We used Cox proportional hazards regression to estimate the hazard ratios (HRs). Deviation from the proportional hazards assumption was assessed graphically and by fitting an extended Cox model that included time-varying covariates. There was no significant deviation from the proportional hazards assumption. We then constructed a 10-year simplified risk assessment chart of CVD incidence using the SPARS model. SBP was categorized into four classes based on the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC 7) [29]: <120, 120-139, 140-159, and ≥160 mm Hg. WHR was categorized as <0.85, 0.85-0.90, 0.90-0.95, and ≥0.95 in females and <1, 1-1.05, 1.05-1.10, and ≥1.10 in males [30]. Smoking status was categorized as smoker or non-smoker.
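The SBP and WHR classes listed above can be illustrated with a short Python sketch. It is only an illustration of the stated cut-points, not the authors' implementation: the function names are hypothetical, and because adjacent WHR ranges in the text share a boundary value (for example 0.90), the sketch assumes that a value lying exactly on a cut-point belongs to the higher class.

```python
from bisect import bisect_right

# Lower bounds of the 2nd, 3rd and 4th classes, taken from the text.
SBP_CUTS = [120, 140, 160]  # mm Hg
WHR_CUTS = {
    "female": [0.85, 0.90, 0.95],
    "male": [1.00, 1.05, 1.10],
}


def sbp_class(sbp_mmhg):
    """Return 1..4 for SBP <120, 120-139, 140-159, >=160 mm Hg."""
    return bisect_right(SBP_CUTS, sbp_mmhg) + 1


def whr_class(whr, sex):
    """Return 1..4 for the sex-specific WHR classes.

    Assumption: a value exactly on a cut-point goes to the higher class.
    """
    return bisect_right(WHR_CUTS[sex], whr) + 1
```

For example, `sbp_class(145)` returns class 3 (140-159 mm Hg) and `whr_class(0.87, "female")` returns class 2.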
The simplified risk chart displays 10-year CVD risk in color-coded bands: <5% (low risk), 5% to <10% (intermediate risk), 10% to <15% (high risk), and ≥15% (very high risk). The thresholds were chosen conceptually based on the ROC curve and the indices introduced by Song et al. [31]. They were then reviewed by cardiologists, who confirmed their usefulness for classifying individuals from low to high risk in the target population.
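As a minimal sketch of how a predicted 10-year risk maps onto the chart's bands, the following Python function applies the thresholds described above. The function name is hypothetical, the risk is assumed to be expressed as a proportion between 0 and 1, and a risk of exactly 15% is assigned to the very high band, which is an assumption of this sketch.

```python
def risk_band(risk_10y):
    """Map a 10-year CVD risk (proportion, 0..1) to a chart band."""
    if not 0.0 <= risk_10y <= 1.0:
        raise ValueError("risk must be a proportion between 0 and 1")
    if risk_10y < 0.05:
        return "low"
    if risk_10y < 0.10:
        return "intermediate"
    if risk_10y < 0.15:
        return "high"
    # Assumption: exactly 15% is treated as very high risk.
    return "very high"
```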
The performance and predictive accuracy of the models were assessed by discrimination and calibration. To avoid the optimism that might result from assessing risk discrimination in the data from which the model was derived, resampling methods, including bootstrapping with 50000 random samples and 10-fold cross-validation, were used to assess discrimination using Harrell's C [32]. The calibration of the risk models was assessed in the derivation dataset using the Nam-D'Agostino chi-square test [33].
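To make the discrimination measure concrete, the sketch below computes Harrell's C for right-censored data in plain Python. It is a simplified illustration rather than the code used in the study (which was run in SAS and Matlab): the function name is hypothetical, pairs with tied follow-up times are ignored, and tied predictions are counted as half-concordant.

```python
def harrell_c(times, events, scores):
    """Harrell's C for right-censored data.

    times  : follow-up time for each subject
    events : 1 if the CVD event was observed, 0 if censored
    scores : predicted risk (higher means the event is expected sooner)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the shorter time is an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking. In a bootstrap or cross-validation scheme, the same function would be applied to each resampled or held-out set and the results averaged.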

Evaluation of External validation
External validation assesses whether the model can be used in a population other than the one in which the model was derived [34].
The performance of the non-laboratory-based SPARS prediction function in the TLGS cohort, used as an external cohort, was assessed in two ways. First, the SPARS function was refitted in the TLGS database using the same variables in a multiple Cox proportional hazards regression model. Second, the SPARS function was recalibrated using the method applied in the new revised WHO model [10]. The process involved the use of mean risk factor levels (from TLGS) and country-specific estimates of annual incidence of CVD within five-year age groups, from GBD 2017 (Methods.A in the Appendix).
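The general idea of recalibrating a Cox-based risk function with new mean risk factor levels and a new incidence estimate can be sketched as follows. This is a generic re-anchoring of the baseline risk, given only as an illustration; the exact procedure used in the study is described in the Appendix and may differ. The function names are hypothetical, and the conversion from annual incidence to 10-year risk assumes a constant hazard over the 10 years.

```python
import math


def ten_year_risk_from_annual_rate(annual_rate):
    """Convert an annual incidence rate to a 10-year risk.

    Assumption: the hazard is constant over the 10 years.
    """
    return 1.0 - math.exp(-10.0 * annual_rate)


def recalibrated_risk(x, betas, mean_x, annual_rate):
    """10-year risk for one person after re-anchoring the baseline.

    x           : the person's risk-factor values
    betas       : log hazard ratios from the Cox model
    mean_x      : mean risk-factor values in the target population
    annual_rate : annual CVD incidence for the person's age-sex group
    """
    lp = sum(b * v for b, v in zip(betas, x))
    mean_lp = sum(b * v for b, v in zip(betas, mean_x))
    # Survival at the mean risk-factor levels is set to the observed level.
    baseline_survival = 1.0 - ten_year_risk_from_annual_rate(annual_rate)
    return 1.0 - baseline_survival ** math.exp(lp - mean_lp)
```

With this construction, a person whose risk factors equal the population means is assigned the population-level 10-year risk, and the hazard ratios only shift individuals above or below that level.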
Model performance was evaluated in three steps: comparison of regression coefficients (Methods.A in the Appendix), discrimination using Harrell's C, and calibration using the Nam-D'Agostino chi-square test.
All external validation analyses were also repeated for the previously reported laboratory-based model (PARS) to assess its external validity. SAS software, version 9.4 (SAS Institute Inc) was used for statistical modeling and analysis. Chart generation and model validations were performed using Matlab version 8.6 (The MathWorks Inc., Natick, MA, USA).

Non-laboratory-based model development
During 49452.8 person-years of follow-up (range 0.1-12, median 10.9 years), there were a total of 705 cardiovascular events (564 ischemic heart disease [IHD] events, 141 strokes). The mean age of men and women was 51.2 ± 11.9 and 50.3 ± 11.3 years, respectively. The prevalence of smoking was higher in men than in women (41.6% vs. 3.3%), while a higher proportion of women had high WHR (94.6% vs. 39.2%) and self-reported diabetes (8.6% vs. 5.8%). SBP distribution was similar between sexes.
HRs for fatal and non-fatal CVD events for each risk predictor included in the non-laboratory-based model are provided in Table 1 and compared with the laboratory-based model.
The 10-fold cross-validation yielded a mean Harrell's C of 0.73 (95% confidence interval [CI], 0.71-0.74). In bootstrap validation, the mean Harrell's C was 0.73 (min-max, 0.70-0.77). The Nam-D'Agostino calibration χ2 was 11.01 (p = 0.27). Thus, the proposed non-laboratory-based model was internally validated and showed good performance in terms of discrimination and calibration. Fitting the new non-laboratory WHO model to our population resulted in a C-index of 0.73 (95% CI, 0.71-0.76) and χ2 of 15.76 (p = 0.07) for females and a C-index of 0.71 (95% CI, 0.68-0.74) and χ2 of 5.91 (p = 0.75) for males (Table A.1 in the Appendix). Thus, our proposed model performed better than the new WHO model. The risk chart based on the proposed non-laboratory-based model was then created (Figure 1). We observed no difference in Harrell's C and only a small difference in the calibration χ2 when we compared the non-laboratory-based model with the previously reported laboratory-based model (Harrell's C: 0.73; Nam-D'Agostino χ2: 10.82; p = 0.29). In addition, we found strong agreement between risk predictions based on the laboratory and non-laboratory models. Of the individuals at greater than 10% risk using the laboratory-based model, all were also identified as being at greater than 5% risk with the non-laboratory-based model. When using 10% and 15% risk thresholds with the non-laboratory-based model, about 93% and 71% of CVD events, respectively, were identified.
Although the C statistic is a suitable criterion for the overall predictive discrimination of risk models, clinical treatment policies are usually tied to a specific absolute level of risk. It is therefore useful to show whether the risk models classify patients correctly at the different levels of 10-year CVD risk commonly used in guidelines [20]. One measure is the percentage of individuals correctly classified, defined as the sum of the true negatives (those below the risk threshold without any CVD events) and the true positives (those above the risk threshold who ultimately did have an event) divided by the total number of individuals in the sample (Figure 2, Table 2). The laboratory-based and non-laboratory-based models classified individuals at similar rates across the risk levels commonly used in clinical guidelines. Both models correctly classified over 80% of patients when the risk threshold was set at 10% and 15%. This percentage decreased as the risk threshold dropped: at the 5% risk threshold, the laboratory-based and non-laboratory-based models correctly classified 77% and 68% of patients, respectively.
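The percentage correctly classified described above can be written as a short Python function. This is a minimal sketch with a hypothetical function name; it assumes that a predicted risk equal to the threshold counts as above the threshold, and that each individual's event status over the 10 years is known.

```python
def percent_correctly_classified(risks, events, threshold):
    """Percentage of individuals who are true positives or true negatives.

    risks     : predicted 10-year risks (proportions)
    events    : 1 if a CVD event occurred, 0 otherwise
    threshold : risk threshold, e.g. 0.10 for 10%
    """
    pairs = list(zip(risks, events))
    # True positives: at or above the threshold and had an event.
    true_pos = sum(1 for r, e in pairs if r >= threshold and e)
    # True negatives: below the threshold and had no event.
    true_neg = sum(1 for r, e in pairs if r < threshold and not e)
    return 100.0 * (true_pos + true_neg) / len(pairs)
```

In a toy sample of four people with risks 0.02, 0.08, 0.12, and 0.20, of whom only the last two had an event, all four are correctly classified at a 10% threshold, whereas at a 5% threshold the person with a risk of 0.08 becomes a false positive and the percentage drops to 75%.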

External validation
A total of 824 CVD events occurred during 55982.8 person-years of follow-up in the TLGS cohort. The 10-year CVD event rates were 1.4% in the ICS and 1.5% in the TLGS. The numbers of participants, person-years of follow-up, CVD events, and the risk factor levels at the baseline examination of the two cohorts are shown in Table 3.
The HRs for major CVD risk factors in the non-laboratory-based risk function were obtained from a Cox regression model fitted to the TLGS cohort and compared with the original function developed in the ICS cohort (Table 4). Major risk factors showed a similar relation to CVD in both cohorts for the non-laboratory model. For most risk factor categories, the magnitude of the HRs did not differ significantly. A few exceptions reached statistical significance in the non-laboratory-based model: the HRs were significantly higher in TLGS for men (P = 0.005) and for WHR (P = 0.04 and 0.03 for the third and fourth categories, respectively).
For the laboratory-based model, male gender was associated with a higher HR (P < 0.001), and SBP of 120 to 139 mm Hg was associated with a lower HR (P = 0.03) in TLGS. Other risk factors showed a similar relation to CVD in both cohorts (Table 5).
In the discrimination analysis, the non-laboratory-based model separated individuals with CVD events from those without in the TLGS cohort at least as well as in the ICS cohort. The Harrell's C for the non-laboratory-based model was 0.77 (95% CI, 0.75-0.78) in TLGS versus 0.73 (95% CI, 0.71-0.74) in ICS. However, with respect to calibration, the non-laboratory-based model slightly overestimated the event rates observed in the TLGS cohort: the Nam-D'Agostino χ2 was 29.89 (p = 0.001) in TLGS versus 11.01 (p = 0.27) in ICS (Figure 3). In the recalibrated non-laboratory-based function, the C statistic was 0.73 (95% CI, 0.71-0.75). Thus, the model retained acceptable discrimination performance in the external validation cohort.

Discussion
We developed a simplified non-laboratory-based CVD risk chart, named SPARS, and then evaluated and validated both the previously published laboratory-based PARS model [22] and the SPARS risk prediction model using TLGS as an external cohort. The proposed SPARS model does not require laboratory measurements such as serum lipids or glucose. This simplified non-laboratory-based model, which is clinically convenient, user-friendly, and affordable, can be applied in situations with little access to laboratory tests, particularly in primary health care (PHC) systems or in low-resource settings such as the health houses that cover the rural population in Iran [35]. Our study shows that the non-laboratory-based risk model can predict CVD outcomes with similar accuracy to a laboratory-based one. The predictive discrimination of 0.73 for the non-laboratory-based model did not differ from the corresponding value for the laboratory-based model. We also found strong agreement with the laboratory-based model with respect to the classification of patients into moderate or high risk groups, similar to findings from previous reports [10,18,20]. Furthermore, our study showed that 90% of patients with diabetes who were classified as being at greater than 10% risk of developing CVD in 10 years based on the PARS model were also classified as being at greater than 10% risk with the SPARS model. Other studies reported poor performance among people with diabetes (e.g., 45% of men and 25% of women in the WHO study) [10,18]. This difference might be due to the inclusion of self-reported history of diabetes in our non-laboratory-based model, which highlights the importance of including a diagnosis of diabetes in the risk score. Therefore, because non-laboratory screening tools can correctly classify patients at the thresholds recommended by guidelines for initiating treatment, these tools are suitable for predicting CVD risk in LMICs and can reduce the cost of laboratory tests.
We developed SPARS in a manner analogous to the previously reported PARS model in order to provide a user-friendly chart. Nevertheless, the color code has been revised to make the chart easier to apply than that of the previous PARS model. Our chart differs from the new WHO chart in that it includes waist-to-hip ratio instead of body-mass index (BMI). In both the PARS and SPARS models, WHR was a significant predictor of cardiovascular risk in our population.
Our user-friendly chart takes into account practicality, cost, and feasibility. Health workers can determine the risk value, and treatment can be initiated, in one clinic visit with minimal equipment and at lower cost and in less time, as there is no waiting time for laboratory results. Major global guidelines promote the use of multivariable risk models to guide treatment decisions. Decisions about whether to initiate pharmacological treatment in addition to lifestyle interventions, as well as decisions about treatment intensity, are guided by the level of risk estimated by these models. Risk assessment models are incorporated into the recommendations of many guidelines, such as those for hypertension or hyperlipidemia. Individuals at higher risk for CVD events require more intensive management. Conversely, low-risk individuals can be spared the associated harms and high costs of overtreatment. The World Health Organization (WHO) has also recommended two packages of essential NCD interventions (WHO-PEN and HEARTS) with protocols that include simple and affordable tools such as CVD risk assessment charts for early detection and treatment [7].
Moreover, in many LMICs, clinical guidelines are developed based on laboratory-based risk assessment models which are costly and guideline developers do not consider the availability of facilities or health-care workers to implement these laboratory-based screening or treatment guidelines [20].
In the present study, the PARS and SPARS prediction models were evaluated externally using data obtained from the TLGS study. HRs for major CVD risk factors were remarkably similar to those derived from the ICS, except that the HRs for males and for WHR were somewhat higher in TLGS. Both the original and the recalibrated PARS and SPARS functions discriminated well between individuals with and without CVD in the ICS and the TLGS. In the calibration analysis, slight overestimation was observed when the PARS and SPARS functions were applied directly to the TLGS. Calibration is generally poorer when a model is applied to a different population, mainly owing to differences between the two cohorts in the relative risks associated with risk factors and in the mean levels of the risk factors [36]. To use a risk assessment tool optimally, and for it to be acceptable for treatment guidelines, clinicians need to be confident that the absolute risk prediction functions can be generalized to settings beyond those in which they were originally developed [36]. We have demonstrated that both the PARS and SPARS prediction functions work reasonably well in the TLGS population.
To overcome the overestimation, we recalibrated both the PARS and SPARS risk models based on national and more contemporary statistics from GBD, applying the new approach used in the WHO models [10]. This approach involved only a few modeling steps [17,37,38]. Descriptive epidemiological data, including country-specific cardiovascular disease incidence, can be readily incorporated to revise the models so that they reflect changes in disease incidence and risk factor profiles [10].
Recently, the WHO developed laboratory and non-laboratory risk charts. However, they were presented only for regions and not for individual countries, although CVD risk differs between countries within some regions [39]. The risk chart developed in our study could be particularly useful for application in other countries in the EMR or other LMICs because many of these countries do not have locally developed risk scores based on their own cohort studies.
We also developed electronic and mobile applications based on the SPARS risk assessment chart for easier use by general practitioners, physicians in related specialties such as cardiology, and other health workers. All of them can rapidly implement a simple non-laboratory approach for initial screening and for preventive and treatment interventions without the need for blood testing. The web-based program is freely accessible at http://www.prognosis.ir/Pars/index2.php. Other electronic and mobile applications are under preparation for use by the general public, to provide them with preventive recommendations.
Our study has several strengths. The first is the development of a practical model based solely on non-laboratory measurements. Such simplified approaches could be used as part of stepwise approaches to help target laboratory testing at the people most likely to benefit from the extra information, and could be used even when values for some risk factors are unavailable for individuals in low-resource settings. The second is the recalibration approach we used, which relies on GBD statistics and can therefore be developed easily with fewer modelling steps. The third is that the risk model reported here predicts the combined outcome of fatal and non-fatal events, thereby improving on risk calculators that predict fatal events alone. Predicting only fatal events in risk models significantly underestimates total CVD risk, particularly in populations with a low fatality rate [18].
However, our study has some limitations. First, we did not include heart failure, vascular dementia, or peripheral vascular disease as outcomes, so the overall cardiovascular risk may be underestimated; however, only a few risk assessment models include these outcomes. Second, as in all cohort studies, some loss to follow-up was inevitable. The main reason for loss to follow-up in the ICS was a change of phone numbers in some city blocks, which was carried out by the national government to develop and expand the communication infrastructure. There were also some changes of address, and only a few people were unwilling to take part in multiple follow-ups. Therefore, loss to follow-up was likely to be largely random [22,25,27]. Finally, although the current study has broad coverage in comparison with other similar studies in Iran, recruitment was still limited to the central area of Iran.
In conclusion, we derived a non-laboratory-based cardiovascular risk prediction model using age, sex, smoking status, self-reported history of diabetes, blood pressure level, and waist-to-hip ratio, all of which can be obtained in one outpatient visit. Such a user-friendly tool can help primary health care providers determine the risk of CVD using an affordable screening tool. Remarkably, the simplified model was in close agreement with the laboratory-based model. We also validated and recalibrated both the laboratory-based and non-laboratory-based models using mean risk factor levels from the external cohort and country-specific estimates of the annual incidence of CVD from GBD statistics, thereby enabling more accurate identification of individuals at high risk of cardiovascular disease in different settings.

Additional File
The additional file for this article can be found as follows: • Appendix. Methods A. DOI: https://doi.org/10.5334/gh.890.s1