Glucosamine and Chondroitin for Treatment of Osteoarthritis
A Systematic Quality Assessment and Meta-analysis
- Timothy E. McAlindon, DM;
- Michael P. LaValley, PhD;
- Juan P. Gulin, MD;
- David T. Felson, MD
Abstract
Context Glucosamine and chondroitin preparations are widely touted in the lay press as remedies for osteoarthritis (OA), but uncertainty about their efficacy exists among the medical community.
Objective To evaluate benefit of glucosamine and chondroitin preparations for OA symptoms using meta-analysis combined with systematic quality assessment of clinical trials of these preparations in knee and/or hip OA.
Data Sources We searched for human clinical trials in MEDLINE (1966 to June 1999) and the Cochrane Controlled Trials Register using the terms osteoarthritis, osteoarthrosis, degenerative arthritis, glucosamine, chondroitin, and glycosaminoglycans. We also manually searched review articles, manuscripts, and supplements from rheumatology and OA journals and sought unpublished data by contacting content experts, study authors, and manufacturers of glucosamine or chondroitin.
Study Selection Studies were included if they were published or unpublished double-blind, randomized, placebo-controlled trials of 4 or more weeks' duration that tested glucosamine or chondroitin for knee or hip OA and reported extractable data on the effect of treatment on symptoms. Fifteen of 37 studies were included in the analysis.
Data Extraction Reviewers performed data extraction and scored each trial using a quality assessment instrument. We computed an effect size from the intergroup difference in mean outcome values at trial end, divided by the SD of the outcome value in the placebo group (0.2, small effect; 0.5, moderate; 0.8, large), and applied a correction factor to reduce bias. We tested for trial heterogeneity and publication bias and stratified for trial quality and size. We pooled effect sizes using a random effects model.
Data Synthesis Quality scores ranged from 12.3% to 55.4% of the maximum, with a mean (SD) of 35.5% (12%). Only 1 study described adequate allocation concealment and 2 reported an intent-to-treat analysis. Most were supported or performed by a manufacturer. Funnel plots showed significant asymmetry (P≤.01) compatible with publication bias. Tests for heterogeneity were nonsignificant after removing 1 outlier trial. The aggregated effect sizes were 0.44 (95% confidence interval [CI], 0.24-0.64) for glucosamine and 0.78 (95% CI, 0.60-0.95) for chondroitin, but they were diminished when only high-quality or large trials were considered. The effect sizes were relatively consistent for pain and functional outcomes.
Conclusions Trials of glucosamine and chondroitin preparations for OA symptoms demonstrate moderate to large effects, but quality issues and likely publication bias suggest that these effects are exaggerated. Nevertheless, some degree of efficacy appears probable for these preparations.
Osteoarthritis (OA) is a major public health problem for which there are few effective medical remedies.1 Nonsteroidal anti-inflammatory agents are the most commonly prescribed agents for this disorder but are a frequent cause of serious adverse effects.2-3 Glucosamine and chondroitin are compounds extracted from animal products that have been used in various forms for OA in Europe for more than a decade and have recently acquired substantial popularity because of several lay publications.4 Because of their safety, these remedies would have great utility in the treatment of OA even if they were only modestly effective. They are absorbed from the gastrointestinal tract5-6 and appear to be capable of increasing proteoglycan synthesis in articular cartilage.7-8 Furthermore, these agents have been tested in a number of clinical trials that are widely interpreted as demonstrating efficacy for osteoarthritis.9-21 The medical community in the United Kingdom and the United States, on the other hand, appears to have paid little attention to the potential benefits of these compounds.22 This skepticism appears to be based largely on concerns about the quality of these trials, although this matter has not been evaluated formally.23
Such concerns may be well founded. Considerable progress has been made recently in elucidating the specific aspects of methods used in these trials that affect the validity of their conclusions.24-25 These studies have shown that trials with methodological flaws, especially inadequate allocation concealment and absence of intent-to-treat approaches,25 are associated with exaggerated estimates of benefit.26-27
Therefore, we appraised the evidence provided by clinical trials of glucosamine and chondroitin preparations in OA by combining a systematic quality assessment with a meta-analysis.
METHODS
Identification of Clinical Trials
We searched for clinical trials of glucosamine or chondroitin compounds using electronic searches of MEDLINE (from 1966 to June 1999) and the Cochrane Controlled Trials Register.28 Osteoarthritis, osteoarthrosis, degenerative arthritis, glucosamine, chondroitin, and glycosaminoglycans were entered as Medical Subject Heading terms and as textwords. The terms were then connected through Boolean operators and the result was limited to studies reporting only on human subjects and clinical trials. We placed no limitations on whether the clinical trial was controlled or randomized, on language, or on age group. Manuscript or abstract publications were also sought by screening citation lists in review articles and published manuscripts. Abstracts presented at national and regional meetings of the American College of Rheumatology, the British Society for Rheumatology, and the Osteoarthritis Research Society were manually searched in supplement issues of Arthritis and Rheumatism, the British Journal of Rheumatology, and Osteoarthritis and Cartilage published between 1978 and 1998. Where abstract data were incomplete, we contacted the primary author to request further information. Finally, we attempted to identify unpublished data by contacting content experts, study authors, and manufacturers of glucosamine or chondroitin.
Inclusion Criteria
Because of evidence that these compounds may take several weeks to exert any therapeutic effect, we included only controlled trials that were at least 4 weeks in duration and that tested oral or parenteral glucosamine sulfate, glucosamine hydrochloride, or chondroitin sulfate against placebo among individuals with knee or hip OA. Only trials clearly stated to be double-blind and that had randomized treatment assignments were included in our meta-analysis. We also required that each trial include at least 1 of the outcome measures currently recommended for OA clinical trials (Table 1).29
Table 1. Hierarchy of Outcome Measures Used in the Meta-analysis*
Trial Efficacy Measures
We adopted 2 approaches in defining the outcome of each trial. In the primary analysis, we tested the outcome that the authors stated to be the main measure used in their trial. For the secondary analyses, however, we compiled a list of 5 outcome measures recently recommended for OA clinical trials29 and extracted the reported outcome that was highest on this list. We took this approach to examine the possibility of bias resulting from post hoc selection of "primary outcomes" among these trials.
Data Extraction
The data extraction was performed by 2 reviewers (M.P.L. and J.P.G.) using a standardized form. Where necessary, means and measures of dispersion were approximated from figures in the manuscripts. For 4 trials9, 30-32 that presented mean values without measures of variability, we imputed SDs by multiplying the means for the trial arm by the median coefficient of variability (a measure of variability that is not sensitive to trial duration) from other trial arms included in the meta-analysis that used the same outcome.
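As an illustration only, the imputation described above might be sketched as follows. The function names and the example values are hypothetical, and the sketch assumes that each reference arm supplies a mean and an SD for the same outcome; it is not the authors' code.

```python
from statistics import median


def coefficient_of_variation(mean, sd):
    """Return SD divided by the absolute mean for one trial arm."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return sd / abs(mean)


def impute_sd(arm_mean, reference_arms):
    """Impute an SD as the arm mean times the median CV of the reference arms.

    reference_arms: list of (mean, sd) pairs from trial arms that reported
    variability for the same outcome measure.
    """
    cvs = [coefficient_of_variation(m, s) for m, s in reference_arms]
    if not cvs:
        raise ValueError("at least one reference arm is required")
    return abs(arm_mean) * median(cvs)


# Hypothetical values, not taken from any trial in the meta-analysis.
reference = [(50.0, 20.0), (40.0, 10.0), (60.0, 30.0)]
imputed_sd = impute_sd(45.0, reference)
```

The median is used rather than the mean so that a single arm with unusually high or low variability does not dominate the imputed value.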
Quality Assessment Instrument
We evaluated each reported clinical trial by applying a quality assessment instrument that has been developed and tested in studies by Chalmers et al25 and Rochon et al.33 This instrument assigns a score for reported compliance with each of 14 aspects of clinical trial conduct (control appearance, allocation concealment, patient blinding, observer blinding to treatment, observer blinding to results, prior estimate of numbers, compliance testing, inclusion of pretreatment variables in analysis, presentation of statistical end points, statistical evaluation of type II error, presentation of confidence limits around between-group differences in statistical end points, quality of statistical analyses, withdrawals, side-effects discussion). The potential scores derived from this system range from 0 to 68 for negative and from 0 to 65 for positive studies and are expressed as a percentage of the maximum possible score for each trial.
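The conversion of raw item scores to a percentage of the maximum can be sketched as below. Only the two maxima (68 and 65) come from the text; the item names and point values in the example are hypothetical placeholders, and the function name is an assumption made for illustration.

```python
# Maximum attainable totals stated in the text for negative and positive studies.
MAX_SCORE = {"negative": 68, "positive": 65}


def quality_percent(item_scores, trial_result):
    """Express a trial's summed item scores as a percentage of the maximum.

    item_scores: dict mapping an item name to the points awarded.
    trial_result: "negative" or "positive", which selects the maximum.
    """
    maximum = MAX_SCORE[trial_result]
    total = sum(item_scores.values())
    if not 0 <= total <= maximum:
        raise ValueError("total score is outside the range of the instrument")
    return 100.0 * total / maximum


# Hypothetical item scores for a positive trial.
example = {"allocation_concealment": 10, "patient_blinding": 8, "withdrawals": 8}
percent = quality_percent(example, "positive")
```

Expressing the score as a percentage lets positive and negative studies be compared on one scale despite their different maxima.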
The instrument has been shown to be consistent between reviewers, and has been used to evaluate large numbers of trials.33 These have demonstrated mean (SD) quality scores of 38.5% (13.1%) for journal articles and 33.6% (12.8%) for those published in supplements.33 In addition to quality scoring, we recorded separately whether an intent-to-treat approach had been undertaken in the study analysis.
Finally, we attempted to determine the presence of industrial sponsorship for each trial. When an article or an abstract included no disclosure for sponsorship, we contacted authors directly. We asked about source of funding, author affiliation, and level of sponsor involvement in the study.
Quality Scoring
Following a training session, 2 rheumatologists (T.E.M. and D.T.F.) independently reviewed the articles and performed the quality scoring. To optimize consistency, disagreements in quality scores were adjudicated by a process in which the 2 reviewers discussed all discordant items. Reviewers were subsequently allowed to adjust their score assignments, although strict concordance was not considered mandatory. The mean of the postadjudication scores of the 2 reviewers was used in the analyses. One of the articles, published in German,31 was scored for quality with the help of a biostatistician fluent in German.
Statistical Approach
Efficacy Analyses. We performed separate meta-analyses for trials of glucosamine compounds and trials of chondroitin. Intent-to-treat results were used whenever possible.
We calculated an effect size for each trial from the difference in mean outcome value between the treatment and placebo arms at the end of the trial, divided by the SD of the outcome value in the control group at trial end. Our use of the SD of the control group at the end of the trial was based on our concern that treatment might artificially change variation in the treated group.34 If trial end values were not presented in the report, change from baseline was used as a proxy measurement. To reduce bias, we multiplied all effect sizes by a correction factor that depended on the sample size as defined by Hedges and Olkin.35 Effect sizes provide unitless measures of treatment efficacy that are centered at zero if the treatment effect is similar to placebo. A scale for effect sizes has been suggested by Cohen,36 with 0.8 reflecting a large effect, 0.5 a moderate effect, and 0.2 a small effect. We pooled effect sizes in our analyses using a random effects model, since this tends to produce more conservative estimates than the fixed effects model.37
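The calculation described above can be sketched as follows, under several stated assumptions. The text does not give the exact form of the correction factor, the variance formula, or the random effects estimator, so this sketch assumes the common approximation 1 - 3/(4m - 1) with m = n_ctrl - 1, the usual large-sample variance for an effect size standardized by the control SD, and the DerSimonian-Laird estimator. All function names, the `lower_is_better` flag, and the example data are hypothetical.

```python
import math


def effect_size(mean_treat, mean_ctrl, sd_ctrl, n_ctrl, lower_is_better=True):
    """Standardized difference using the control-group SD, with a small-sample correction."""
    d = (mean_treat - mean_ctrl) / sd_ctrl
    if lower_is_better:
        d = -d  # orient the sign so that positive values favor treatment
    m = n_ctrl - 1  # assumed degrees of freedom for the correction
    return (1.0 - 3.0 / (4.0 * m - 1.0)) * d


def effect_size_variance(d, n_treat, n_ctrl):
    """Assumed large-sample variance of an effect size scaled by the control SD."""
    return (n_treat + n_ctrl) / (n_treat * n_ctrl) + d * d / (2.0 * (n_ctrl - 1))


def pool_random_effects(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random effects estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q, the statistic underlying the test for heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "ci_low": pooled - 1.96 * se,
            "ci_high": pooled + 1.96 * se, "q": q, "tau2": tau2}


# Hypothetical trials: (mean_treat, mean_ctrl, sd_ctrl, n_treat, n_ctrl), pain scores.
trials = [(40.0, 50.0, 20.0, 30, 30), (35.0, 45.0, 18.0, 50, 50), (42.0, 48.0, 22.0, 40, 40)]
es = [effect_size(mt, mc, sd, nc) for mt, mc, sd, nt, nc in trials]
var = [effect_size_variance(d, nt, nc) for d, (mt, mc, sd, nt, nc) in zip(es, trials)]
result = pool_random_effects(es, var)
```

When the between-trial variance estimate (`tau2`) is zero, the random effects result coincides with the fixed effects result; otherwise the confidence interval widens, which is why the random effects model tends to be the more conservative choice.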
Sensitivity Analyses. To assess the effect of choice of primary outcome on the results, we derived a pooled estimate based on the secondary outcome measure selected from our predefined hierarchy. When 1-month results were presented for a trial, we tested the treatment effect at this time point in a separate meta-analysis. Also, to investigate the influence of trial quality on the study outcome, we dichotomized each group about the median quality score and the median trial size and repeated the meta-analyses among these subsets. We repeated the analyses after excluding the 4 trials for which data imputations had been made. Finally, to investigate possible biases associated with combining heterogeneous outcome measures, we performed subset analyses in which the models were confined to only trials with pain outcomes or with algofunctional outcomes (ie, Lequesne index, a questionnaire-based disability score).38
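The dichotomization about the median can be sketched as below. The text does not say how a trial falling exactly on the median was assigned, so placing it in the upper group is an assumption here; the function name, field names, and trial values are hypothetical.

```python
from statistics import median


def split_about_median(trials, key):
    """Split trials into those below the median of `key` and those at or above it."""
    cut = median(t[key] for t in trials)
    lower = [t for t in trials if t[key] < cut]
    upper = [t for t in trials if t[key] >= cut]  # assumption: ties go to the upper group
    return cut, lower, upper


# Hypothetical trials with a quality score (percentage) and a sample size.
trials = [
    {"id": "A", "quality": 20.0, "n": 40},
    {"id": "B", "quality": 30.0, "n": 150},
    {"id": "C", "quality": 40.0, "n": 60},
    {"id": "D", "quality": 50.0, "n": 250},
]
quality_cut, low_quality, high_quality = split_about_median(trials, "quality")
size_cut, small_trials, large_trials = split_about_median(trials, "n")
```

Each resulting subset would then be pooled separately with the same meta-analytic model used for the full sample.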
Evaluation for Bias. We tested for the possibility of bias among our sample of clinical trials using 2 approaches. In the first, we generated funnel plots, which graph the effect size for a trial on the horizontal axis and the number of subjects in that trial on the vertical axis. Asymmetry in the funnel plot suggests bias. In a second analysis, we regressed effect size on the inverse of the study variance, using a method described by Egger et al,39 which considers bias to be present if the intercept for the regression is different from null at P<.1.38
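A regression test of funnel plot asymmetry can be sketched as follows. This sketch uses the commonly described form of the Egger regression, in which each effect size divided by its standard error is regressed on the inverse of the standard error; that form is an assumption and may differ in detail from the analysis run here. The function name and the example data are hypothetical.

```python
import math


def egger_regression(effects, standard_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE); returns the intercept test."""
    if len(effects) != len(standard_errors) or len(effects) < 3:
        raise ValueError("need at least 3 trials with matching effects and SEs")
    x = [1.0 / se for se in standard_errors]  # precision
    y = [es / se for es, se in zip(effects, standard_errors)]  # standardized effect
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must differ between trials")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_intercept = math.sqrt(rss / df * (1.0 / n + x_bar ** 2 / sxx))
    # The t statistic is undefined when the fit is exact (zero residual variance).
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se_intercept": se_intercept,
            "slope": slope, "t": t_stat, "df": df}


# Hypothetical data in which smaller trials (larger SE) show larger effects.
effects = [0.35, 0.55, 0.60, 0.95, 1.10]
standard_errors = [0.10, 0.18, 0.25, 0.40, 0.50]
fit = egger_regression(effects, standard_errors)
```

An intercept near zero is compatible with a symmetric funnel plot; a two-sided P value for the intercept can be obtained by referring `t` to a t distribution with `df` degrees of freedom.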
RESULTS
Trials
We identified 17 placebo-controlled clinical trials that fulfilled our inclusion criteria.9-10,15-16,18, 20, 30-32,41-47,49 We excluded 2 studies that did not report sufficient numerical results to permit data extraction.44, 46 Therefore, our meta-analysis is based on 15 trials. The characteristics of the included studies are presented in Table 2. Four glucosamine18, 30, 42-43 and 4 chondroitin trials10, 15, 31, 45 also reported outcome observations at the 4-week time point. Three trials reported mean values for outcome variables but did not list the variability associated with the means.9, 30-31 For 4 trials9, 30-32 variability was imputed from other studies in the meta-analysis.
Table 2. Characteristics of Eligible Clinical Trials of Glucosamine and Chondroitin Preparations*
Meta-analysis
We found moderate treatment effect sizes for glucosamine (0.44; 95% confidence interval [CI], 0.24-0.64) and large effects for chondroitin (0.96; 95% CI, 0.63-1.3; Figure 1). The test for heterogeneity was significant (P<.001) among the chondroitin trial sample, however. One chondroitin trial47 had a substantially larger effect (effect size, 4.6) than any other trial. When this trial was removed from the chondroitin analysis, the test for heterogeneity became nonsignificant (P = .5) and the effect diminished to 0.78 (95% CI, 0.60-0.95). These results were not substantially altered when we repeated the analyses using the outcome measures imposed from our predefined hierarchy (glucosamine, 0.49 [95% CI, 0.24-0.74]; chondroitin, 0.88 [95% CI, 0.67-1.1]). Smaller effect sizes were observed for the 1-month outcome among the 9 trials that reported observations at this time point (glucosamine, 0.26 [95% CI, 0.10-0.42]; chondroitin, 0.40 [95% CI, 0.17-0.62]). Effect sizes were similar without correction for bias (glucosamine, 0.46; chondroitin, 1.0), and after excluding the 4 studies for which imputations had been made (glucosamine, 0.35; chondroitin, 0.87). Similar results were also observed on confining the models to trials with pain outcomes (3 glucosamine trials: effect size, 0.51 [95% CI, 0.05-0.96]; 8 chondroitin trials: effect size, 0.86 [95% CI, 0.64-1.09]) and trials reporting Lequesne index (3 glucosamine trials: effect size, 0.41 [95% CI, 0.14-0.69]; 2 chondroitin trials: effect size, 0.63 [95% CI, 0.32-0.94]).
Figure 1. Forest Plot of Effect Sizes for Trials and Pooled Effects
95% confidence intervals are shown using lines extending from the symbols. Effect size is based on the scale proposed by Cohen36 in which 0.8 reflects a large effect, 0.5 a moderate effect, and 0.2 a small effect.
Quality Scores
The level of agreement between the 2 reviewers was good with intraclass correlation coefficients of 0.75 prior to and 0.92 after adjudication (P<.01). Quality scores ranged from 12.3% to 55.4% with a mean (SD) of 35.5% (12%). Only 1 study provided sufficient information to determine that allocation concealment had been adequate.43 Furthermore, only 2 articles reported an intent-to-treat analysis,32, 43 and only 1 of these gave sufficient statistical information for this to be incorporated in our meta-analysis. Indeed, 7 studies did not present dropout rates, and the remainder reported a mean (SD) rate of 1.2% (4.2%) per month.
Sponsorship
None of the studies reported independent funding from any governmental or not-for-profit organization. Six articles presented sufficient information to ascertain manufacturer support. Contact with authors from the remaining studies confirmed some level of manufacturer sponsorship for all except 2 studies. Six studies received direct financial support from a manufacturer. Seven articles included an investigator from the company as an author. In at least 4 studies, the manufacturer conducted key aspects of the trial such as randomization, data collection, or statistical analysis. These results are summarized in Table 2.
Evaluation for Publication Bias
The funnel plots for the trials included in our analyses are depicted in Figure 2. Both plots showed significant asymmetry, reflecting a relative absence of trials with both small numbers and small or null treatment effects. Analyses in which we tested quantitatively for publication bias by regressing effect size with inverse of study variance showed strong evidence for bias (glucosamine, intercept estimate, 1.3, P = .01; chondroitin, 3.8, P = .002).
Figure 2. Funnel Plot of Glucosamine and Chondroitin Trials
Effect size is based on the scale proposed by Cohen36 in which 0.8 reflects a large effect, 0.5 a moderate effect, and 0.2 a small effect.
Influence of Trial Quality Scores
Pooled effect sizes were substantially higher among lower-quality compared with higher-quality trials. For glucosamine, the pooled effect for trials with a quality score below the median was 0.7 (95% CI, 0.4-1.0) vs 0.3 (95% CI, 0.1-0.5) for trials with a quality score above the median. For chondroitin, the pooled effect for trials with a quality score below the median was 1.7 (95% CI, 0.7-2.7) vs 0.8 (95% CI, 0.6-1.0) for trials with a quality score above the median.
Influence of Trial Size
For glucosamine, the pooled effect for small trials was 0.5 (95% CI, 0.1-0.9) compared with 0.4 (95% CI, 0.1-0.7) for large trials. In contrast, for chondroitin, the pooled effect for small trials was much greater (1.7 [95% CI, 0.5-2.8]) than for large trials (0.8 [95% CI, 0.6-1.0]).
COMMENT
Trials of glucosamine and chondroitin preparations for OA collectively demonstrate moderate to large treatment effects on symptoms, but our assessments of methodological aspects of these studies suggest that the actual efficacy of these products is likely to be substantially more modest. Furthermore, the efficacy was smaller when measured after only 4 weeks of treatment, suggesting that induction of full therapeutic benefit may take longer than 1 month. Nevertheless, even modest efficacy could have clinical utility, given the safety of these preparations.
We evaluated the quality of each clinical trial by applying a validated assessment instrument, which has been developed and described in detail in Chalmers et al25 and Rochon et al.33 This instrument scores aspects of how a trial is reported to have been conducted, including allocation concealment. Allocation concealment is separately assessed from blinding as it relates to preventing selection bias and protecting assignment sequence before and until treatment allocation, while blinding is concerned with preventing ascertainment bias and protecting assignment sequence after allocation.50 Using this instrument, a full score of 10 points is assigned if a report outlines its procedural methods that would ensure allocation concealment (eg, in the study by Noack et al,43 indistinguishable treatments randomly precoded by a central pharmacy). Partial credit of 5 points is given when a method is used that generates a small chance of the next treatment assignment being predicted (eg, in the study by Pujalte et al,16 indistinguishable treatments "blindly assigned" from a "previously randomized list"). No credit is given when quasi-randomization procedures are used (eg, chart numbers), or when the method cannot be discerned from the report, as in the majority of these trials.
In theory, a poorly described study could receive a low score even if well conceived. It should be noted, however, that it is poor quality reports that have been associated with inflated estimates of benefit.27, 51 These and other investigations in this field52 strongly suggest that inadequate reports generally reflect inadequate methods. Nevertheless, it is possible that some well-performed trials have been given low scores because of inadequate descriptions of their methods.
There is also some potential for variability of quality scoring between observers in the application of this instrument to individual studies. While the interobserver reproducibility was found to be good, we chose to further increase reliability by taking the mean of their final scores following a session to adjudicate differences.
As in trials of other pharmacologic agents for arthritis disorders,53 these studies exhibited numerous methodological problems and biases. Particular methodological flaws that have been associated with inflation of treatment effects and that were frequent in these trials included inadequate allocation concealment26-27 and absence of intent-to-treat approaches.25 Further empirical evidence for inflated estimates of benefit is suggested by our observation of smaller effect sizes among the higher-quality and the larger studies.
The second major concern is that we found statistical evidence of bias reflecting an absence of trials with both small numbers of participants and small (or null) treatment effects. Such bias may arise from various sources including selective publication of positive trials, post hoc selection of study outcome measures, and premature trial termination once a positive outcome is achieved.39 The imposition of an outcome measure chosen independently of the investigators made little difference to the overall results, suggesting that publication bias may be a more likely explanation for this asymmetry. This possibility is strengthened by the finding that most, if not all, of the trials received some level of sponsorship from a manufacturer of the study compound. On the other hand, we contacted authors of published articles, and content experts, in attempts to determine the existence of any unpublished trials of these compounds, and found none.
We included trials reported in supplements and as abstracts in this analysis. Although absence of peer review has been associated with lower quality scores,33 we chose to include these trials for 2 reasons: First, we had envisaged that negative studies would more likely be represented among this group; second, we intended to include trials in our review that were cited in lay publications and that likely contributed to the current vogue for these preparations.
A possible limitation to our analysis and to any meta-analysis is that the trials may be so varied (heterogeneous) that producing a pooled effect is meaningless. Although we did not pool glucosamine and chondroitin trials, 2 possible sources of heterogeneity still exist in our analysis. The first is that we combined studies that were heterogeneous in the routes of administration and in the preparations tested within the 2 compound types. It might be argued that differences in administration and preparation of trial compounds could result in biological differences in the mode of action of these nutritional compounds. However, from an empirical perspective, this consideration has little impact on our analyses because all our sample trials showed positive effects, and statistical assessments consistently found little or no evidence of heterogeneity.
A second potential source of heterogeneity is that we pooled trials that measured the outcome in different ways (eg, pain or function) and used different instruments (eg, visual analog scales, Lequesne index,38 Western Ontario and McMaster Universities Osteoarthritis Index). To address this, we used an effect size that was derived from the standardized mean difference, which should enhance comparability between different outcome types. To explore further the possibility of heterogeneity due to different outcome measures, we performed analyses confined to trials with pain-based outcomes and to those that used the Lequesne index. The effect sizes remained relatively consistent in these analyses, suggesting that heterogeneity due to different outcome measures did not adversely affect our analyses.
We also made imputations for the measures of variability from 4 trials that did not report these data.30-32,54 We did this because we wished to include as many trials as possible, yet we were aware that some articles would not include enough detail to allow calculation of effect size. Because trials could be of different durations, we used the coefficient of variability, a measure that is not sensitive to trial duration, to calculate the SD and effect size. This approach has been used in previous meta-analyses.54 Because of potential concerns, however, we repeated the main analyses after excluding studies for which imputations had been made. This made little difference in the results of the meta-analyses.
In summary, we have found that trials of glucosamine and chondroitin preparations for OA symptoms demonstrate moderate to large effects but exhibit methodological problems that have been associated with exaggerated estimates of benefit.39 Overall, it seems probable that these compounds do have some efficacy in treating OA symptoms and that they are safe. Because of this, they may have considerable utility in OA treatment. We recommend further high-quality, independent studies to determine the actual efficacy and utility of these preparations.
Acknowledgments
Financial Disclosure: None of the authors are affiliated with, or funded by, any manufacturer of chondroitin or glucosamine products.
Funding/Support: This study was funded by grant AR20613 from the National Institutes of Health.
Acknowledgment: We thank Camlin Tierney, PhD, Biostatistics Department, Harvard School of Public Health, for her help in quality scoring the German language publication.
Corresponding Author and Reprints: Timothy E. McAlindon, DM, The Arthritis Center, Boston University School of Medicine, 715 Albany St, A203, Boston, MA 02118 (e-mail: tmcalind@bu.edu).










