Not finding a difference doesn’t prove equivalence
The recent LINC trial was a randomised controlled trial comparing a mechanical chest compression device (LUCAS) with manual CPR(1). “No significant difference” was found for any of the main outcome measures considered.
So do you think the LINC trial demonstrated that mechanical CPR using the LUCAS device is equivalent, or at least not inferior, to manual CPR?
This was an interesting and important trial for those of us who manage prehospital cardiac arrest patients. In some social media discussions, it has been interpreted as evidence that the two are equivalent resuscitative techniques, or that LUCAS is not inferior to manual CPR.
However, unless you see a p-value less than 0.05 in the table above (issues of multiple hypothesis testing aside), no evidence of anything was demonstrated: not of difference, and certainly not of equivalence. When faced with 2-sided p-values above 0.05, investigators often conclude that there is “no difference” between the treatments, leading readers to assume that the treatments are equivalent. A better conclusion is that there is “no evidence” of a difference between treatments (see opinion piece by Sackett, 2004(2)). To determine whether treatments are equivalent, equivalence must be tested directly.
How can we test for equivalence?
First, we must define equivalence. It is crucial that this definition is provided a priori, i.e. before the data are examined. As the focus of the LINC study was on superiority, the investigators did not offer an a priori definition of equivalence. However, the CIRC study(3), conducted some time earlier and similar in design, did. (This study examined an alternative mechanical CPR device, the Zoll AutoPulse.)
When establishing equivalence between treatments, instead of the more customary null hypothesis of no difference between treatments, the hypothesis that the true difference is equal to a specified ‘delta’ (δ) is tested (4).
To analyse the LINC results for equivalence, we can derive our delta values from the CIRC study, which, as we’ve said, did offer an a priori definition of equivalence. For the purpose of illustration, we will use the risk-difference stopping boundaries calculated for the CIRC study, rather than the odds-ratio-based equivalence margins, on the grounds of greater simplicity and clinical appropriateness. Therefore, we set our equivalence margins at -δ=-1.4% and δ=1.6%, meaning that if LUCAS fares no more than 1.4 percentage points worse and no more than 1.6 percentage points better than manual CPR, we will consider the two techniques equally efficacious. Thus, we will declare equivalence between LUCAS and manual CPR if the 2-sided 95% CI for the treatment difference lies entirely within -1.4% and 1.6%, and noninferiority if the one-sided 97.5% CI for the treatment difference (equivalent to the lower limit of the two-sided 95% CI) lies above -1.4%(5).
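The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event counts are hypothetical (they are not the LINC data), the function names are our own, and a simple Wald (normal approximation) interval is assumed for the risk difference, which may not be the method used in either trial. Only the margins of -1.4% and 1.6% come from the text.

```python
import math

# Margins from the text, expressed as proportions (risk difference scale).
NEG_MARGIN = -0.014  # -delta: device no more than 1.4 points worse
POS_MARGIN = 0.016   # +delta: device no more than 1.6 points better


def risk_difference_ci(events_device, n_device, events_manual, n_manual, z=1.96):
    """Return (difference, lower, upper) for device minus manual.

    Assumption: a Wald interval, with z=1.96 giving a 2-sided 95% CI.
    """
    p_device = events_device / n_device
    p_manual = events_manual / n_manual
    diff = p_device - p_manual
    se = math.sqrt(
        p_device * (1 - p_device) / n_device
        + p_manual * (1 - p_manual) / n_manual
    )
    return diff, diff - z * se, diff + z * se


def classify(lower, upper, neg_margin=NEG_MARGIN, pos_margin=POS_MARGIN):
    """Apply the rule from the text to a 2-sided 95% CI."""
    return {
        # Equivalence: the whole interval lies inside the margins.
        "equivalent": lower > neg_margin and upper < pos_margin,
        # Noninferiority: only the lower limit has to clear -delta.
        "noninferior": lower > neg_margin,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (NOT trial data).
    diff, lower, upper = risk_difference_ci(110, 1000, 80, 1000)
    print(f"difference={diff:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
    print(classify(lower, upper))
```

With these made-up counts the lower limit sits above -1.4% but the upper limit exceeds 1.6%, so the sketch reports noninferiority without equivalence. That is the same pattern of conclusion discussed for the longer term neurological outcome below.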
These concepts and how they differ from a traditional comparison are more readily appreciated graphically (Fig. 1).
Figure 1. Two one-sided test procedure and the equivalence margin in equivalence/noninferiority testing between LUCAS and manual CPR
1a. A traditional comparative study, such as the LINC trial, reports confidence intervals that show no evidence of a difference, as they encompass 0.
1b. Using equivalence margins (-δ and δ) derived from a similar study (CIRC), we show that the LINC trial does not demonstrate that LUCAS and manual CPR are equally efficacious, since the 95% CIs do not lie completely within the equivalence margins.
The presentation of the LINC trial’s results shows no evidence of a difference in outcomes between mechanical and manual CPR, which is not the same as showing that they are equivalent or that mechanical CPR is noninferior. However, if we re-examine the data using equivalence margins (-δ, δ) derived from a similar study (CIRC), there is some evidence that the LUCAS device is not inferior to manual CPR (but not necessarily equivalent) with respect to longer term good neurological outcome.
1. Rubertsson S, Lindgren E, Smekal D, et al. Mechanical Chest Compressions and Simultaneous Defibrillation vs Conventional Cardiopulmonary Resuscitation in Out-of-Hospital Cardiac Arrest
JAMA. 2014 Jan 1;311(1):53-61
2. Sackett D. Superiority trials, non-inferiority trials, and prisoners of the 2-sided null hypothesis
Evid Based Med 2004;9:38-39 [Open Access]
3. Lerner EB, Persse D, Souders CM, et al. Design of the Circulation Improving Resuscitation Care (CIRC) Trial: a new state of the art design for out-of-hospital cardiac arrest research
Resuscitation. 2011 Mar;82(3):294-9
4. Dunnett CW, Gent M. Significance testing to establish equivalence between treatments, with special reference to data in the form of 2X2 tables. Biometrics. 1977 Dec;33(4):593-602
5. Piaggio G, Elbourne DR, Pocock SJ, et al. Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA. 2012;308(24):2594-604. [Open Access]