Explore this issue: August 2018
How robust are the statistically significant findings in randomized trials in the head and neck cancer literature where surgery was a primary intervention?
The statistical significance of most randomized controlled trials (RCTs) in the head and neck oncologic literature hinges on only a few events. In a majority of trials, the calculated fragility index (FI) score was lower than the number of patients lost to follow-up. The FI helps address the limitations of the threshold P value and may serve as a useful adjunct to other metrics such as effect size and 95% confidence intervals.
Background: RCTs are the basis for evidence-based medicine and guide clinical decision making. The conclusions drawn from these trials are often based on tests of statistical significance, which suggest that the findings are robust and not spurious. Traditionally, a threshold P value of < .05 has been used to determine whether an intervention reached statistical significance; however, the P value is frequently criticized as overly simplistic. Many readers place the same degree of confidence in similar P values, irrespective of additional factors such as sample size or number of outcome events.
The FI was developed to communicate the limitations of the P value. The FI score is defined as the minimum number of patients whose status would have to change from a nonevent to an event for statistical significance to be lost. It is calculated by iteratively adding events to the trial arm with the fewest events until the recalculated P value is ≥ .05.
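The iterative procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the FI developers' own implementation: it assumes a two-sided Fisher exact test as the significance test (the summary does not name one), and the function names `fisher_exact_two_sided` and `fragility_index`, the tie-breaking rule, and the convention of returning 0 for a result that is not significant to begin with are all hypothetical choices made for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Count nonevent-to-event changes needed to push P to alpha or above.

    Returns 0 if the starting result is not significant (assumed convention).
    """
    def p_value(ea, eb):
        return fisher_exact_two_sided(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return 0

    # Modify the arm with the fewest events; ties go to arm A (assumption).
    modify_a = events_a <= events_b
    flips = 0
    while p_value(events_a, events_b) < alpha:
        if modify_a:
            if events_a == n_a:
                break  # no nonevents left to convert in this arm
            events_a += 1
        else:
            if events_b == n_b:
                break  # no nonevents left to convert in this arm
            events_b += 1
        flips += 1
    return flips
```

As an illustration with made-up numbers, a trial with 8 events among 10 patients in one arm and 1 event among 6 in the other has a Fisher exact P value of about .035. Calling `fragility_index(8, 10, 1, 6)` returns 1: changing a single patient in the second arm from a nonevent to an event raises the P value to about .118, so the significance of that hypothetical result hinges on one event.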
The developers of the FI have demonstrated that 24% of RCTs published in high-impact journals hinge on three or fewer events, and that more than 50% of trials had an FI score that was lower than the number of patients lost to follow-up. The FI tool has yet to be explored in the head and neck surgical patient population.