- Type 1 error/effect of multiple analyses;
- Type 2 error/effect of inadequate power (sample size); and
- Over-interpretation of data.
According to Dr. Mehanna, Type 1 errors can be adjusted for by making the p-value more stringent. “Multiple analyses have a high risk of Type 1 error, especially if you’re using p = 0.05 as the level of significance,” Dr. Mehanna said. “That can be an overestimation of the effect of the intervention. We do a lot of analyses in the hope that we can find something that is less than p = 0.05, we consider that significant, and then we build our paper around it.”
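To make the multiple-analyses point concrete, here is a minimal simulation sketch (not from the article; the 20 analyses and 10,000 repetitions are illustrative assumptions). If a study runs 20 independent analyses on pure noise, the chance that at least one comes out “significant” at p = 0.05 is roughly 64%; a Bonferroni-style stricter threshold pulls that back to about 5%.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

n_studies = 10_000  # simulated studies, each running many analyses
n_tests = 20        # analyses per study, all on pure noise (no real effect)
alpha = 0.05

# Two-sided p-values for standard-normal test statistics under the null.
z = rng.standard_normal((n_studies, n_tests))
p = 2 * norm.sf(np.abs(z))

# Probability that at least one of the 20 analyses looks "significant".
fwer_naive = (p < alpha).any(axis=1).mean()
fwer_bonferroni = (p < alpha / n_tests).any(axis=1).mean()

print(f"At least one p < {alpha} by chance: {fwer_naive:.2f}")            # ~0.64
print(f"With Bonferroni (p < {alpha / n_tests}): {fwer_bonferroni:.2f}")  # ~0.05
```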
Dr. Mehanna cites the nivolumab study as a clinical trial that did well in avoiding a Type 1 error: The protocol was set and published before the study started, detailing exactly what analyses would take place. “It built the sample size for a significance level of 0.05, or 0.001, or whatever, depending on the size and the number of analyses that were going to take place,” he said, “and stated exactly every question they were going to look at.”
We are recruiting and supporting recruitment into trials, but only where the trials do not impinge on our decision making. —Hisham Mehanna, MD
Type 2 errors mean a true effect is missed because the sample size of the clinical trial isn’t large enough to detect it. Dr. Mehanna pointed to two randomized studies, one in 1980 and the other in 2009, that showed no benefit from neck dissection in early oral cancer (Cancer. 1980;46:386–390; Head Neck. 2009;31:765–772). Each study looked at only approximately 75 patients, and surveillance became one of the standards of care for early oral cancer. In 2015, Dr. D’Cruz’s study comparing elective neck dissection with surveillance involved nearly 600 patients. “The real shock,” Dr. Mehanna said, “was that there was a 12.5% absolute difference in overall survival between having an elective neck dissection and surveillance. That, to me, was an ‘oh dear’ moment, because for many years our patients were being treated with surveillance.” The previous studies had been much too small to detect an effect of that size reliably.
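A rough power calculation shows why roughly 75 patients could not reliably detect a difference of that size while roughly 600 could. This is a sketch under assumed baseline survival rates (0.80 vs. 0.675; only the 12.5-point gap comes from the article), using a simple normal approximation for comparing two proportions:

```python
from math import sqrt
from scipy.stats import norm

def power_two_proportions(p1: float, p2: float, n_per_arm: int,
                          alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions
    with equal arms (normal approximation)."""
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.sf(z_crit - abs(p1 - p2) / se)

# Assumed survival rates 12.5 percentage points apart.
p_dissection, p_surveillance = 0.80, 0.675

for n_per_arm in (38, 300):  # ~75 patients total vs. ~600 total
    pw = power_two_proportions(p_dissection, p_surveillance, n_per_arm)
    print(f"{n_per_arm} per arm: power ≈ {pw:.0%}")
# -> roughly 24% power with ~75 patients, ~94% with ~600
```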
Over-interpretation of data is the third crime, according to Dr. Mehanna, who pointed to the 2009 study that confirmed HPV-positive patients did better than HPV-negative patients. It also suggested that there were three different risk categories. The question was whether low-risk patients who were already doing well were being overtreated with chemoradiotherapy.
Some clinicians extrapolated data from the 2006 Bonner study (which showed that adding cetuximab to radiotherapy resulted in better outcomes than radiotherapy alone) to conclude that cetuximab plus radiotherapy could benefit HPV-positive patients, and they changed clinical practice on that basis (N Engl J Med. 2006;354:567–578).
“However, our De-ESCALaTE trial showed that out of every 13 patients who were treated with cetuximab, one died unnecessarily because they hadn’t been treated with cisplatin,” Dr. Mehanna said. “That has had a big effect on our practice. In fact, those studies have led to a drastic reduction in the use of cetuximab for low-risk HPV patients worldwide, an almost overnight reduction in cetuximab for those patients.”
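The “one out of every 13” figure is a number needed to treat (here, effectively a number needed to harm): the reciprocal of the absolute difference in event rates between the two arms. A quick back-calculation sketch; the ~7.7% difference below is implied by the quoted figure, not taken from the trial report:

```python
def nnt(risk_a: float, risk_b: float) -> float:
    """Number needed to treat/harm: reciprocal of the absolute risk difference."""
    return 1 / abs(risk_a - risk_b)

# An NNT of 13 implies an absolute risk difference of about 1/13.
implied_difference = 1 / 13
print(f"NNT of 13 implies roughly a {implied_difference:.1%} absolute difference "
      f"in deaths between the cetuximab and cisplatin arms")  # ~7.7%
```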
Furthermore, Dr. Mehanna said that trials also benefit healthcare systems. “The PET/CT arm resulted in a cost saving of £1,415 per person treated,” Dr. Mehanna said. “In the U.K., about 2,000 patients a year would previously have undergone a neck dissection for this indication. That translates into almost £3 million a year in savings, which covers the HPV vaccine for almost 10,000 children a year.” There is also a fair amount of data showing that clinical outcomes for patients treated within research-active healthcare systems are better than outcomes at organizations with little or no research activity.
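The quoted savings multiply out as stated; here is a quick check (the per-child vaccine cost on the last line is implied by the quote, not stated in it):

```python
saving_per_patient = 1_415   # £ saved per person in the PET/CT arm
patients_per_year = 2_000    # UK patients previously given a neck dissection

annual_saving = saving_per_patient * patients_per_year
print(f"Annual saving: £{annual_saving:,}")  # £2,830,000, the "almost £3 million" quoted

children_vaccinated = 10_000
print(f"Implied HPV vaccine cost: £{annual_saving / children_vaccinated:,.0f} per child")  # ≈ £283
```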