Assessment of Claims of Improved Prediction Beyond the Framingham Risk Score
- Ioanna Tzoulaki, PhD;
- George Liberopoulos, MD;
- John P. A. Ioannidis, MD
- Author Affiliations: Department of Epidemiology and Public Health, Imperial College of Medicine, London, England (Dr Tzoulaki); Clinical and Molecular Epidemiology Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece (Drs Liberopoulos and Ioannidis); Biomedical Research Institute, Foundation for Research and Technology–Hellas, Ioannina (Dr Ioannidis); and Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, and Department of Medicine, Tufts University School of Medicine, Boston, Massachusetts (Dr Ioannidis).
- Corresponding Author: John P. A. Ioannidis, MD, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina 45110, Greece (jioannid@cc.uoi.gr).
Abstract
Context With heightened interest in predictive medicine, many studies try to document information that can improve prediction of major clinical outcomes.
Objective To evaluate the reported design and analysis of studies that examined whether additional predictors improve predictive performance when added to the Framingham risk score (FRS), one of the most widely validated and cited clinical prediction scores.
Study Selection Two independent investigators screened the 1908 articles indexed in the ISI Web of Knowledge database that cited the 1998 article describing the FRS, with the search last updated in September 2009. Articles were eligible if they included any analyses comparing the predictive performance of the FRS vs the FRS plus some additional predictor for a prospectively assessed outcome.
Data Analyses We recorded information on FRS calculation, modeling of additional predictors, outcomes assessed, population evaluated, subgroup analysis documentation, and flaws in the methods that may have affected the reported improvements in predictive ability. We also evaluated the correlation of reported design and analysis features with the predictive model discrimination and improvements with the additional predictors.
Results We evaluated 79 eligible articles. Forty-nine studies (62%) did not calculate the FRS as it has been proposed, 15 (19%) modeled the additional predictor in more than 1 way and presented only the best-fit or area-under-the-curve (AUC) results for only 1 model, 41 (52%) did not examine the original outcome that the FRS was developed for, 33 (42%) studied a population different from what the FRS was intended for, and 25 (32%) claimed improved prediction in 1 subgroup but only 7 (9%) formally tested subgroup differences. Evaluation of independence in multivariable regressions, discrimination in AUC, calibration, and reclassification were reported in 77, 36, 7, and 7 studies, respectively, but these methods were adequately documented in only 60, 13, 4, and 2 studies, respectively. Overall, 63 studies (80%) claimed some improved prediction. Increase in AUC was larger when the predictive performance of the FRS was lower (ρ = −0.57, P < .001). Increase in AUC was significantly larger when evaluation of independence in multivariable regression or discrimination in AUC analysis was not adequately documented and when the additional predictor had been modeled in more than 1 way and only 1 model was reported for AUC.
Conclusion The majority of examined studies claimed that they found factors that could offer additional predictive value beyond what the FRS could achieve; however, most had flaws in their design, analyses, and reporting that cast some doubt on the reliability of the claims for improved prediction.
- KEYWORDS:
- CARDIOVASCULAR DISEASES
- CORONARY DISEASE
- DATA INTERPRETATION, STATISTICAL
- FRAMINGHAM RISK SCORE
- PREDICTIVE VALUE OF TESTS
- PROGNOSIS
- RISK ASSESSMENT
- RISK FACTORS
Emphasis on improvements in prediction has become a hallmark of the quest for personalized and individualized medicine,1 and the predictive literature is drawing increasing attention in medicine. However, the utility of this literature can be hampered by methodological limitations affecting design, analysis, and reporting.2,3,4,5,6 One challenge is to demonstrate that new candidate predictors can offer independent, incremental information beyond what is already known based on traditional risk factors. A sophisticated new predictor may have good predictive ability on its own but may not improve predictive ability further when simple, easy-to-measure traditional factors are already taken into account.
We assessed empirically a systematic sample of studies that evaluated various candidate prognostic factors in their ability to improve prediction of coronary heart disease (CHD) or other outcomes beyond what the Framingham risk score (FRS) can achieve.7 The FRS is one of the most thoroughly validated and widely used predictive scores in medical literature. Its original publication in 1998 is one of the most-cited articles across all biomedicine.7 The FRS was developed with robust methods and is intended to offer prospective risk assessment for CHD risk in men and women who do not have overt CHD. Risk is calculated based on age, blood pressure, total or low-density lipoprotein cholesterol level, high-density lipoprotein cholesterol level, smoking status, and presence of diabetes mellitus. We aimed to assess, among studies that tried to find additional risk factors that could improve prediction beyond the FRS, whether the reported design and analysis were adequate or claims for improvement were susceptible to bias.
METHODS
Selection of Studies
We assembled studies that examined whether 1 or more factors other than those included in the FRS could improve the predictive performance of the FRS. Identifying all studies that have done such analyses is impractical, if not impossible. Instead, we assembled a systematic sample with an objective and readily replicable search strategy: we searched all articles that cited the 1998 Circulation article describing the FRS,7 herein called the “reference FRS article.” We used the ISI Web of Knowledge database, and the last search was updated in September 2009, yielding 1908 citations. Two evaluators (I.T., G.L.) independently screened the title and abstract of each citing article, with a third evaluator (J.P.A.I.) arbitrating their discrepancies. Potentially eligible articles were then scrutinized in full text.
Articles were eligible if they included 1 or more analyses comparing the predictive performance of the FRS against the predictive performance of the FRS plus some additional predictor for a prospectively assessed outcome. Eligible articles cited the reference FRS article either in the methods in describing what they used as baseline for the comparison or in some other section alluding to traditional risk factors or the FRS in a way that suggested that FRS variables were used as the baseline for the comparison (as long as the methods did not suggest that some other score or model was used). We included articles regardless of the types of additional predictors considered and how they were measured and handled in the analysis and regardless of whether the examined outcome was CHD as defined by the reference FRS article, CHD with different definitions, or some other outcome.
Moreover, we considered articles regardless of what analyses were reported to address incremental predictive ability (evaluation of independence in multivariable regressions with and without the additional predictors, evaluation of discrimination by calculation and comparison of area under the curve [AUC] with and without the additional predictors in receiver operating characteristic [ROC] curves, evaluation of model calibration [accuracy of absolute risk prediction in various risk categories], evaluation of reclassification of participants into risk categories, or any other method addressing incremental predictive ability). We included studies regardless of whether they gave quantitative results or only qualitative statements comparing predictive models. Studies were included regardless of whether the eligible comparative analyses were the sole analyses or among several assessed in the same article.
We did not consider articles without data on actual patients (eg, decision analyses with simulated data); reviews; studies without prospective follow-up (eg, case-control and cross-sectional studies); studies that examined only the correlation/regression of additional predictors against the FRS or components thereof; studies that sought only to recalibrate the FRS in a different population, without considering additional predictors; and studies that compared the FRS with other risk scores without evaluating the additive value of these scores over and above the FRS.
When an article considered separately 2 or more predictors for their incremental predictive ability or more than 1 outcome, each was considered a different analysis. When several additional predictors were considered together, we did not separate them.
Definitions and Data Extraction
Two evaluators (I.T., G.L.) independently extracted data from eligible studies. Any inconsistencies were identified and cross-checked by both investigators with a third evaluator (J.P.A.I.) arbitrating any remaining discrepancies. For each eligible article, we recorded first author, journal, publication year, and sample size; information on the FRS, additional predictors, outcomes assessed, population evaluated, and methods of evaluation of predictive ability that may have affected study results; and inferences on whether the additional predictors improved prediction beyond the FRS.
The estimated baseline performance of the FRS alone and the estimated improvement by a candidate additional predictor may be affected in a number of ways: if the FRS is calculated or used suboptimally (not as originally developed and proposed), if the candidate additional predictor is modeled in a way that its performance is inappropriately exaggerated, if the examined outcome or population is not what the FRS was intended for, or if spurious claims are made for improved prediction only in specific subgroups of patients. We tried to capture these diverse possibilities.
We recorded the following aspects of suboptimal FRS calculation and use: whether any of the standard risk factors was deleted or modified or other factors added; whether it was not calculated based on the regression coefficients or points proposed for each factor in the reference FRS article (or the well-accepted minor variant version of the National Cholesterol Education Program Adult Treatment Panel III guidelines),8 but based on some other approach (eg, new regression coefficients developed from the data in the specific study); whether patients with specific FRS values were excluded, narrowing or altering the dynamic range; and whether the FRS was not used as a continuous score but in grouped categories.
For each additional candidate predictor, we noted whether it had been modeled in more than 1 way (eg, continuous and categorical, or different splits used in categorical analyses) and results for incremental predictive ability were presented for only 1 model. If so, we recorded whether the presented results were clearly the best fit (those with maximum estimated improvement in predictive ability) or this was unclear.
For outcomes, we recorded whether they were CHD or some completely different outcome. For CHD, we further recorded whether it was defined as in the reference FRS article (angina pectoris, myocardial infarction, coronary insufficiency, or CHD death) and whether outcome was assessed within 10 years of follow-up. For the population studied, we recorded whether it included participants of nonwhite ethnicity (>10%) or participants with overt history of CHD at start of follow-up. For subgroup analyses, we recorded whether claims were made for improved predictive performance only in a specific subgroup of patients without properly documenting that the improvement is different beyond chance (P < .05) in these subgroups vs other participants.
Assessing Incremental Predictive Ability
Several methods for the evaluation of prediction models have been proposed. For each article and additional predictor of interest, we noted which methods of evaluation of predictive ability were reported. For 4 methods that have been widely discussed and proposed in the cardiovascular disease literature and beyond, we noted whether they had been adequately documented.
For multivariable regressions examining independence of the additional predictor, we considered documentation to be adequate when a test was used (eg, Bayes or Akaike information criteria) that penalized for including an additional variable in the model (fully adequate documentation) or when the investigators reported whether the additional predictor was independently statistically significant at P < .05 without addressing model parsimony (partially adequate documentation).
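As an illustration of what a parsimony-penalizing comparison involves, the following minimal sketch computes the Akaike and Bayesian information criteria for a baseline model and for the same model plus 1 additional predictor. The log-likelihoods, parameter counts, and sample size are hypothetical values chosen for illustration, not data from any reviewed study, and the function names are assumptions.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better).

    Assumption: n_obs is the number of participants.
    """
    return n_params * math.log(n_obs) - 2 * log_likelihood


def extended_model_preferred(ll_base, k_base, ll_ext, k_ext, n_obs):
    """Report, per criterion, whether the extended model scores lower."""
    return {
        "aic": aic(ll_ext, k_ext) < aic(ll_base, k_base),
        "bic": bic(ll_ext, k_ext, n_obs) < bic(ll_base, k_base, n_obs),
    }


# Hypothetical fit: baseline model with 7 parameters vs the same model
# plus 1 additional predictor, fitted to 2000 participants.
print(extended_model_preferred(-520.0, 7, -518.2, 8, 2000))
# {'aic': True, 'bic': False}
```

In this hypothetical case the small gain in log-likelihood is enough to lower the Akaike criterion but not the more heavily penalized Bayesian criterion, which is why reporting the criterion used matters.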
For the evaluation of discrimination in ROC curves,9 AUC (or C-index) analyses were considered adequately documented if there was description or reference to the method, provision of both AUC estimates (with and without the additional predictor), and provision of either confidence intervals (CIs) or a P value for the comparison of AUCs. Values for the AUC, 95% CIs, and P values were recorded when available. Whenever a study examined subgroups, we focused on the whole population unless only data per subgroup were provided; in those cases, we extracted data for each subgroup separately.
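The 2 AUC estimates required for adequate documentation can be computed as in the minimal sketch below, which uses the pairwise (concordance) definition of the AUC. The predicted risks and outcomes are hypothetical values, and the function names are assumptions. The sketch does not produce the CI or P value for the comparison, which would need an additional method such as a bootstrap.

```python
def auc(scores, events):
    """C-statistic: probability that a randomly chosen participant with an
    event has a higher score than one without; ties count as one half."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def delta_auc(base_scores, extended_scores, events):
    """AUC with the additional predictor minus AUC without it."""
    return auc(extended_scores, events) - auc(base_scores, events)


# Hypothetical predicted risks for 8 participants (1 = event occurred).
events = [0, 0, 0, 1, 0, 1, 1, 1]
base = [0.05, 0.10, 0.20, 0.15, 0.30, 0.25, 0.40, 0.50]
extended = [0.04, 0.08, 0.15, 0.22, 0.25, 0.30, 0.45, 0.55]

print(auc(base, events))                  # 0.8125
print(auc(extended, events))              # 0.9375
print(delta_auc(base, extended, events))  # 0.125
```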
Calibration (accuracy) of the predictive model was considered adequately documented when model calibration was given both with and without the additional predictor with a goodness-of-fit test.
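The text does not name a specific goodness-of-fit test. As one common choice, the sketch below computes a Hosmer-Lemeshow-type statistic, which would be calculated once for the model without and once for the model with the additional predictor. The data are hypothetical and the function name is an assumption.

```python
def hosmer_lemeshow(pred, events, groups=10):
    """Hosmer-Lemeshow-type chi-square statistic.

    Participants are sorted by predicted risk and split into `groups`
    roughly equal bins; each bin contributes
    (observed - expected)^2 / (n * mean_risk * (1 - mean_risk)).
    Larger values indicate poorer calibration.
    """
    pairs = sorted(zip(pred, events))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(e for _, e in chunk)
        mean_risk = expected / size
        if 0.0 < mean_risk < 1.0:  # skip degenerate bins (zero variance)
            stat += (observed - expected) ** 2 / (
                size * mean_risk * (1.0 - mean_risk)
            )
    return stat


# Hypothetical example with 2 risk groups of 10 participants each.
pred = [0.2] * 10 + [0.8] * 10
well_calibrated = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2  # 2/10 and 8/10 events
overpredicted_low = [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2  # 5/10 and 8/10 events

print(hosmer_lemeshow(pred, well_calibrated, groups=2))
print(hosmer_lemeshow(pred, overpredicted_low, groups=2))
```

The statistic is conventionally referred to a chi-square distribution to obtain a P value; that step is omitted here to keep the sketch free of external dependencies.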
Adequate documentation of reclassification required the following: use of standard categories of risk (<10%, 10%-20%, >20%)8 or justified use of different categories; reporting of how many participants changed risk categories for each type of change; and reporting of whether participants who had changed risk category had moved in the correct direction depending on whether they had an event (improved reclassification if moving to a higher predicted risk category for those who had an event and if moving to a lower predicted risk category for those who did not have an event).10
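The 3 reporting elements above can be tabulated as in the following minimal sketch, which uses the standard risk categories and counts each type of move separately for participants with and without an event. The net reclassification improvement returned at the end is one common summary and is an assumption here, since the text does not name a specific index. All risks and outcomes are hypothetical.

```python
def risk_category(risk):
    """Standard 10-year risk categories: <10%, 10%-20%, >20%."""
    if risk < 0.10:
        return 0
    if risk <= 0.20:
        return 1
    return 2


def reclassification(base_risk, new_risk, events):
    """Count category changes by direction and event status.

    Returns the counts and a net reclassification improvement:
    (events moved up - events moved down) / number of events
    + (non-events moved down - non-events moved up) / number of non-events.
    """
    counts = {"event_up": 0, "event_down": 0,
              "nonevent_up": 0, "nonevent_down": 0}
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for b, n, e in zip(base_risk, new_risk, events):
        move = risk_category(n) - risk_category(b)
        if move == 0:
            continue
        key = ("event" if e else "nonevent") + ("_up" if move > 0 else "_down")
        counts[key] += 1
    nri = ((counts["event_up"] - counts["event_down"]) / n_events
           + (counts["nonevent_down"] - counts["nonevent_up"]) / n_nonevents)
    return counts, nri


# Hypothetical predicted risks without and with the additional predictor.
base_risk = [0.08, 0.15, 0.15, 0.25, 0.12, 0.05]
new_risk = [0.12, 0.25, 0.08, 0.15, 0.12, 0.12]
events = [1, 1, 0, 0, 1, 0]

counts, nri = reclassification(base_risk, new_risk, events)
print(counts)
print(round(nri, 3))
```

Moves are correct when participants with an event move up and those without an event move down; the last participant in this example is a non-event moved up, which counts against the new model.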
Finally, we estimated what proportion of studies claimed in interpreting their findings that they had found an improvement in prediction.
Correlates of AUC Estimates and Improvement in AUC
We evaluated whether improvement in AUC (the difference in AUCs with vs without the additional predictor, ΔAUC) was larger when the reported AUC of FRS alone was lower. We calculated the Spearman correlation coefficient between ΔAUC and the AUC of FRS alone; in sensitivity analyses, we evaluated the correlation between ΔAUC and the mean of the AUC of FRS alone and the AUC of the FRS plus the additional predictors.
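The rank correlation used here can be sketched as follows: each variable is converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The AUC values below are hypothetical and the function names are assumptions.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical analyses: AUC of the FRS alone and the reported improvement.
auc_frs_alone = [0.60, 0.65, 0.70, 0.75, 0.80]
delta = [0.08, 0.05, 0.06, 0.01, 0.00]

print(round(spearman(auc_frs_alone, delta), 3))  # -0.9
```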
We also evaluated whether reported design and analysis features correlated with results on predictive performance. Specifically, we evaluated whether the AUC with and without the additional predictors as well as the ΔAUC were different depending on whether the analysis involved adequate vs suboptimal calculation or use of the FRS; appropriate vs potentially exaggerated modeling of the additional predictor; CHD vs other outcome; appropriate population (white ethnicity, no baseline CHD) vs other target population; the whole population vs a subgroup; adequate vs inadequate documentation of independence in multivariable regression; and adequate vs inadequate documentation of discrimination in AUC analyses, as defined previously. Too few studies had performed calibration and reclassification evaluations to allow meaningful similar analyses. We used the Mann-Whitney U test to compare the point estimates in AUC and in ΔAUC between subgroups defined by these study features. A 2-sided P value <.05 was used to denote statistical significance. Statistical tests were performed with SPSS version 14 (SPSS Inc, Chicago, Illinois).
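A minimal sketch of the Mann-Whitney U comparison between 2 groups of analyses follows. It uses the large-sample normal approximation without tie or continuity corrections, which is an assumption and may differ from the output of a statistical package, particularly for small groups. The ΔAUC values are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Return (U for sample a, 2-sided P value).

    U counts pairs (x in a, y in b) with x > y, ties counted as one half.
    The P value uses the normal approximation with no tie or continuity
    correction (a simplifying assumption).
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    n1, n2 = len(a), len(b)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical improvements in AUC in adequately vs inadequately
# documented analyses.
adequate = [0.01, 0.02, 0.00, 0.03]
inadequate = [0.05, 0.06, 0.04, 0.08]

u, p = mann_whitney_u(adequate, inadequate)
print(u, round(p, 3))
```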
RESULTS
After exclusions (eFigure), 79 articles were eligible.11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89 The κ coefficient for eligibility between the 2 independent investigators on initial screening was 0.86. Main study characteristics are shown in eTables 1 and 2.
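For readers unfamiliar with the agreement statistic, the sketch below shows how a κ coefficient for 2 raters is computed, assuming the Cohen form (observed agreement corrected for the agreement expected by chance). The screening decisions are hypothetical and are not the actual eligibility data.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for 10 articles (1 = eligible).
rater_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(round(cohen_kappa(rater_1, rater_2), 3))  # 0.737
```

The raters agree on 9 of 10 articles, but because most articles are ineligible, a large share of that agreement is expected by chance, so κ is well below 0.9.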
Calculation and Use of the FRS
Twenty-four studies11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34 used different risk factors from those included in the reference FRS article. Additions, deletions, and modifications of risk factors appear in Table 1. Twenty-three studies11,13,17,18,19,20,21,22,23,25,26,27,28,29,30,31,32,33,34,35,36,37,38 did not calculate the FRS but simply used variables that they claimed corresponded to FRS factors. Three articles39,40,41 narrowed the dynamic range by excluding participants from further analysis based on FRS values. In 19 studies39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57 of the 56 that calculated a continuous score, the score was categorized and the continuous information was not used. Overall, 49 studies11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,58,59,60,61 (62%) calculated or used the FRS suboptimally for at least 1 aspect described here.
Table 1. Additions, Deletions, and Modifications of the Risk Factors Used for the Framingham Risk Score in 79 Eligible Articles Compared With the Framingham Risk Score Definition in Wilson et al7
Additional Predictors
In total, 86 different predictors were assessed in 192 analyses; between 1 and 51 different predictors were assessed per article (eTable 1). Fifteen studies (19%) examined different models of analysis; 4 studies11,40,61,62 reported the model with the best fit, and 11 studies24,25,31,53,54,57,63,64,65,66,67 reported only the results of the AUC analysis for one model without clarifying whether it was the best-fitting one.
Outcomes
Forty-one studies11,12,15,16,17,19,27,28,29,30,31,32,33,34,35,36,37,39,43,44,46,48,52,55,56,57,61,62,77,78,79,80,81,82,83,84,85,86,87,88,89 (52%) did not examine CHD as an outcome of interest in any analysis. Various other outcomes were assessed (eTable 3).
Follow-up ranged from 2 to 32 years (median, 7.1 years); only 23 studies (29%) had a mean or median follow-up time of 10 years or longer and 35 studies14,16,17,18,19,21,23,24,26,27,28,30,35,38,41,43,45,46,49,53,54,58,61,65,69,71,75,76,77,78,79,80,81,82,83 (44%) a mean or median follow-up time of 8 years or longer.
Study Population
A population of exclusively white ethnicity was examined in 39 studies (49%).12,13,16,17,21,23,24,26,30,31,32,33,37,39,40,47,48,49,50,51,52,53,57,61,63,64,65,67,69,71,72,74,76,78,80,83,84,85,86 Among the remaining studies, 24 studies14,20,25,27,28,29,34,35,38,42,45,54,56,58,60,66,68,70,73,75,77,81,82,89 included more than 10% nonwhite participants (2 had exclusively nonwhite participants), and for 16 studies, population ethnicity was unclear. In 18 studies (23%),28,29,30,31,32,33,34,36,37,41,55,56,57,63,79,83,84,85 participants with overt CHD at baseline were included. Overall, 33 studies (42%) clearly included more than 10% nonwhite participants or patients with CHD at baseline.
Subgroup Analyses
Thirty-nine studies (51%) stratified the analysis examining the additive predictive value of a novel risk factor in different subgroups. Subgroup analysis was defined according to FRS categories (15 studies39,42,43,44,52,53,55,56,57,59,66,71,74,86,88), categories of other factors (16 studies12,14,15,25,26,27,28,46,49,58,60,63,68,75,76,87), or a combination (7 studies16,40,45,48,54,62,69). Improved prediction in one subgroup vs others was claimed in 25 of those 39 studies (64%).12,14,15,16,25,26,28,40,42,44,45,48,54,55,58,59,60,66,69,71,73,74,75,76,87 Only 7 studies12,25,28,45,49,73,75 (18%) presented a formal test for differences between subgroups.
Assessing Incremental Predictive Ability
Of the 79 eligible studies, 77 reported some multivariable regression analysis for at least 1 additional predictor and at least 1 outcome. The other 2 studies52,56 performed only univariate regressions of the additional predictor stratified by FRS category. Also, 36 studies11,12,14,15,16,20,21,24,25,26,31,33,54,57,58,59,60,61,63,64,65,66,67,68,70,71,73,74,76,77,78,81,82,85,88,89 used ROC analysis, 7 studies11,14,23,24,61,67,85 tested the calibration of multivariable regressions, and 7 studies24,54,57,61,64,67,76 described reclassification analysis. Documentation of these methods appears in Table 2.
Table 2. Documentation of Methods of Assessing Incremental Predictive Ability
Overall, 63 articles11,12,13,14,15,16,17,18,19,21,22,24,27,28,29,30,32,33,34,35,36,38,39,40,41,42,43,44,45,47,48,49,50,51,52,53,54,55,57,58,59,61,62,63,66,68,69,70,72,73,74,75,76,78,79,80,81,83,84,85,86,87,88 (80%) claimed improved prediction for at least 1 predictor and 1 outcome.
AUC Analyses
Twenty-nine studies reported AUC estimates for both the FRS alone and the FRS with additional predictors, providing 88 such pairs of estimates. The AUC of the FRS alone and the FRS with additional predictors ranged from 0.50 to 0.83 (median, 0.74) and 0.57 to 0.84 (median, 0.75), respectively. There was a strong inverse correlation between the ΔAUC and the baseline AUC of the FRS alone (Spearman correlation coefficient, −0.57; P < .001) and between ΔAUC and the mean AUC of the FRS alone and the FRS with additional predictors (correlation coefficient, −0.47; P < .001). Absolute improvements in AUC of at least 0.05 were seen only when the FRS achieved an AUC of 0.72 or less (Figure).
Figure. Association between area under the receiver operating characteristic curve (AUC) of the Framingham risk score (FRS) alone and the difference in AUC between the FRS alone and the FRS plus an additional predictor in 88 analyses. Different markers indicate whether the improvement in AUC was nominally statistically significant (P < .05) (17 analyses), was not statistically significant (30 analyses), or had no information on level of significance in the article (41 analyses).
Table 3 shows the median AUC values and the estimated improvement in prediction (ΔAUC) when the data were classified according to prespecified features of study design and analysis. The AUC estimates were significantly higher in studies with suboptimal FRS calculation or use vs those where the FRS was calculated and used as proposed; in studies that used CHD rather than other outcomes; and in those where the evaluation of independent information in multivariable regression was adequately documented vs those that did not adequately document this aspect. Also, the ΔAUC was significantly larger when multivariable regression or ROC analysis was not adequately documented and when the additional predictor had been modeled in more than 1 way but only 1 model was reported for AUC analyses.
Table 3. Median AUC Values and ΔAUC According to Different Aspects of Study Design and Analysis
COMMENT
In this empirical evaluation, the majority of examined studies claimed that they found factors that could offer additional predictive value beyond what the FRS could achieve. However, most studies had flaws in their design, analyses, and reporting that cast doubt on the reliability of the claims of improved prediction. Furthermore, some methodological limitations were associated with larger estimates of reported improvements in predictive ability; studies without such limitations showed on average no improvement in AUC with the additional candidate predictors.
Most studies examining the additive predictive value of a risk factor with the FRS did so by presenting statistically significant associations with the outcome after the FRS or its components were included in multivariable regression models. However, a P value alone usually offers weak support for credibility and provides no information on model calibration and discrimination.91,92 Calibration, the important ability of the model to give predictions of absolute risk that are commensurate with levels seen in reality, was rarely reported. Discrimination, how well the model can separate those who do and do not experience an outcome, was reported in about a third of the studies via AUC analysis. However, even then, reporting of AUC analyses was often inadequate. Few studies assessed the ability of the additional risk factor to reclassify individuals into risk categories and only 2 provided adequate documentation. Reclassification analyses are essential for understanding whether a new model can alter management decisions.10,93 The sparse use of these methods, partly explained by their recent introduction, and suboptimal reporting diminish the clinical utility of this literature.
Considerable absolute improvements in AUC were reported only in studies where the FRS performed poorly, below what is typically expected of this widely used and carefully validated score. This may be due to chance (regression to the mean), genuine differences in study characteristics, or a conglomerate of diverse biases that decrease the predictive performance of the FRS. The baseline predictive performance of the FRS was also lower when the AUC was calculated for an outcome different from what the FRS was initially developed for. In such cases, the FRS is transformed into a “straw-man” predictive model and even mediocre new candidate risk factors can achieve an improvement. This result is analogous to observations from randomized clinical trials where drugs may show exaggerated treatment effects when the comparator drug is ineffective.94
Furthermore, inadequately documented regression and AUC analyses, and modeling of additional predictors in more than 1 way with only 1 model reported, were associated with inflated estimates of improvement in the AUC. Studies without these methodological flaws did not show any substantial improvements in AUC. Publication bias and other types of selective reporting bias might also inflate the apparent improvement if articles and models that show no improvement in AUC remain unpublished. Misuse of subgroup analysis was also observed, as has been previously documented in diverse other fields.90
We used a convenient systematic sampling strategy to select eligible studies. Many other articles might examine improved prediction of a risk factor beyond the FRS without citing the reference FRS article. However, it is unlikely that those studies would be better reported. Furthermore, reported methods are not always commensurate with what was actually done in a study,95,96 and this is even more likely in the predictive literature, where no widely accepted reporting standards are available. However, when the reported information suggests considerable problems or biases, it is unlikely that the actual study was immune to them.14
Another limitation is that we considered only studies in which at least 1 analysis assessed incremental predictive ability. These articles are the most sophisticated in the large literature on new candidate predictors. Numerous other articles only considered candidate risk factors for association with some outcome alone or simply juxtaposed them with the FRS in plain correlation evaluations without establishing incremental prediction. Empirical evidence from other fields, eg, cancer, suggests that almost all of these studies report significant associations for the examined predictors.4 Methodological problems in these studies are even more extensive.5
Our analyses focused on cardiovascular research. However, their implications probably extend to prediction models for other diseases. Rigorous design, analysis, and reporting are essential elements of research studies. Efforts to enhance the transparency of reporting of prognostic marker and predictive model studies6 require wider adoption. Currently, incremental predictive effects are difficult to discern from the spurious effects that biased analysis or reporting can create.
Author Contributions: Dr Ioannidis had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Tzoulaki, Liberopoulos, Ioannidis.
Acquisition of data: Tzoulaki, Liberopoulos, Ioannidis.
Analysis and interpretation of data: Tzoulaki, Liberopoulos, Ioannidis.
Drafting of the manuscript: Tzoulaki, Ioannidis.
Critical revision of the manuscript for important intellectual content: Liberopoulos, Ioannidis.
Statistical analysis: Tzoulaki, Liberopoulos, Ioannidis.
Administrative, technical, or material support: Tzoulaki.
Study supervision: Ioannidis.
Financial Disclosures: None reported.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.
- 73.
- 74.
- 75.
- 76.
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.
- 86.
- 87.
- 88.
- 89.
- 90.
- 91.
- 92.
- 93.
- 94.
- 95.
- 96.









