A Randomized Pilot Trial of Virtual Reality Surgical Planning for Head and Neck Oncologic Resection

INTRODUCTION

Positive surgical margins (PSM) after resection of head and neck squamous cell carcinoma (HNSCC) are one of the most important prognostic indicators and predictors of recurrence. Despite this, oral cavity cancer has one of the highest rates of positive margins among solid tumors in men and women. High surgical complexity and surgeon task load burden (TLB) are among the factors contributing to PSM rates. Surgeons must manage a complex three-dimensional (3D) surgical field while avoiding critical structures with convoluted trajectories. Advancing technologies such as computer-aided design, virtual reality (VR), and 3D modeling may represent opportunities to enhance surgical planning, improve surgeon–pathologist communication, and avoid treatment failure. Recent VR advancements include virtual toolboxes whereby users can mark, measure, annotate, erase, and otherwise modify radiograph models, enabling “virtual resection” before surgical cases.

The majority of research utilizing VR and 3D scanning has shown benefits in training settings and for clinical use in intra-operative navigation. Most studies on the role of VR and surgical planning assess bony reconstruction. Few studies have focused on the role of VR in pre-operative oncologic surgical planning and its impact on surgeon TLB. Although VR is increasingly used in the healthcare setting, its measurable benefits remain largely unknown. It is crucial to assess the utility of this technology to avoid increasing cost without meaningful benefit.

In this pilot study, we investigated the feasibility of a novel Virtual Reality Case Enhancement Protocol (VRCEP). This protocol included VR model generation, VR planning, and 3D post-resection model markup. We sought to assess key barriers to VR implementation, such as workflow and surgeon TLB, while exploring metrics of utility and value through margin events. Currently, there are few valid, reproducible endpoints to quantify the impact of this evolving technology on oncologic outcomes. While positive final margins are useful, their overall frequency is low, necessitating high numbers of patients to achieve power; however, positive frozen margins have been correlated with worse outcomes. Additionally, specimen-driven margins in oral cavity cancer are considered superior to defect-driven margins (DDMs) in defining margin status and correlating with recurrence risk. We therefore used a composite of “margin events” (positive final margins, positive frozen margins, and the need for DDMs) to lower the threshold for detection of a measurable event and increase power. In selecting these endpoints, we chose metrics that could correlate with survival while yielding measurable value in the short term. Overall, we aimed to evaluate the feasibility and impact of the VRCEP in planning for head and neck oncologic resection.

METHODS

Trial Design

This study was approved by the Thomas Jefferson University Hospital Institutional Review Board. We conducted a prospective, non-blinded, randomized controlled pilot trial with patients from a single academic tertiary hospital between January and September 2022. The primary endpoint was feasibility, defined as successful completion of at least 80% of the VRCEPs, including generation of VR models, completion of VR planning, and generation of 3D specimen models for margin markup. The sample size was determined based on previous literature. Sample sizes between 24 and 50 have been recommended for pilot studies assessing feasibility. Previous feasibility studies have reported a median of 36 patients per arm, and feasibility studies with continuous endpoints have used a sample size of 30 per arm. Given the effort and time required for preparation and completion of the study in the clinical setting, we elected to limit our collection to 20 randomized patients per arm.

Head and neck surgeons from the institution carried out study evaluations. Adult patients diagnosed with mucosal head and neck carcinoma undergoing definitive oncologic resection met the inclusion criteria. Patients under 18 years of age and those with a known impairment affecting their ability to provide informed consent were excluded. After informed consent was obtained, cases were randomized to one of two treatment arms: standard of care (SOC) plus the VRCEP or SOC alone. Patients underwent simple, fixed 1:1 randomization with assignment through the REDCap (Nashville, Tennessee) randomization module. For the SOC cohort, surgeons reviewed pre-operative cross-sectional images from CT, MRI, and/or PET as in standard clinical practice. For the VRCEP cohort, surgeons performed a VR review of three models in conjunction with standard cross-sectional imaging. Surgeons used the full virtual suite of instruments to alter the models and conduct virtual resections.

Procedures and Equipment

Figure 1: Visual description of the Virtual Reality Case Enhancement Protocol (VRCEP). Pre-operative planning included standard image review, virtual reality case review, and virtual resection (top). After surgical resection, post-operative scanning and marking were completed (bottom). https://doi.org/10.1002/lary.31874

For cases randomized to VRCEP, VR was used for pre-operative surgical planning. Radiographic images (CT or MRI) were uploaded as a DICOM file to Medical Holodeck V1.0 (Zurich, Switzerland), an immersive 360° 3D software that runs on the Oculus Rift VR headset (Meta, Menlo Park, Calif.). This program was selected for its low cost and automated soft tissue segmentation capability, which does not require technicians for optimal use. Software functions included 3D manipulation of segmented rendering of patient imaging, omnidirectional cross-sectional viewing, measuring, cutting, free-hand annotating, masking, and screen capture/recording. Using omnidirectional view, measure, mark, and cut, the surgeon virtually isolated and resected the tumor (Fig. 1).

Following surgical resection, specimens were taken from the operating room to the gross pathology room for 3D scanning (post-resection modeling) and pathological analysis. Specimens were rinsed with water to remove blood contents and dried to reduce visual glare. Specimens were placed on a structured light 3D scanner (EinScan SP, Shining 3D, Hangzhou, China) and serially imaged. Each side of the specimen was scanned individually, resulting in two separate 3D data surfaces. Three-point cross-registration was used to geometrically align both surfaces into a singular 3D model. The resulting meshwork was rendered into a watertight, photorealistic virtual 3D model that was loaded into 3D Slicer for surgeon and pathologist review and marking.
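The three-point cross-registration step can be illustrated with a generic landmark-based rigid alignment. The sketch below is not the scanner software's algorithm, which the text does not describe; it assumes three corresponding, non-collinear landmarks have been picked on each scanned surface, and the function names are hypothetical. It builds an orthonormal frame from each landmark triplet and derives the rotation and translation that carry one frame onto the other.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("landmarks must be distinct and non-collinear")
    return tuple(x / length for x in v)

def _frame(p0, p1, p2):
    """Right-handed orthonormal basis built from three landmarks."""
    x = _unit(_sub(p1, p0))
    z = _unit(_cross(x, _sub(p2, p0)))
    y = _cross(z, x)
    return (x, y, z)

def transform(point, rotation, translation):
    """Apply a 3x3 rotation (list of rows) and then a translation."""
    rotated = tuple(sum(rotation[i][j] * point[j] for j in range(3))
                    for i in range(3))
    return tuple(r + t for r, t in zip(rotated, translation))

def three_point_alignment(src, dst):
    """Rigid transform mapping the src landmark frame onto the dst frame.

    Exact when the two landmark triangles are congruent; with noisy
    landmarks it is only an initial alignment (assumption of this sketch).
    """
    fs, fd = _frame(*src), _frame(*dst)
    # R = sum_k fd_k fs_k^T, so that R maps each source axis to its target axis.
    rotation = [[sum(fd[k][i] * fs[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    rotated_origin = transform(src[0], rotation, (0.0, 0.0, 0.0))
    translation = _sub(dst[0], rotated_origin)
    return rotation, translation

# Hypothetical landmarks: the second surface is the first rotated 90 degrees
# about the z-axis and shifted by (1, 2, 3).
src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
dst = [(1.0, 2.0, 3.0), (1.0, 3.0, 3.0), (0.0, 2.0, 3.0)]
R, t = three_point_alignment(src, dst)
aligned = transform((1.0, 1.0, 1.0), R, t)
```

In practice a landmark alignment of this kind is usually followed by a surface-based refinement; whether the scanner software does so is not stated in the text.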

Outcome Measures

Feasibility

Feasibility was assessed on three components: successful pre-operative generation of the imaging-based VR model, completion of the VR procedure by the surgeon, and generation and markup of the post-resection 3D model. The VRCEP was considered feasible if at least 80% of protocols were completed.

Surgeon assessment

The NASA-TLX survey was administered to evaluate the impact of VR on surgical procedural burden. This is a validated tool designed to measure the subjective workload of a task, with each category scored from 0 to 100 in five-point increments. The measured categories include mental demand, physical demand, temporal demand, effort, performance, and frustration level. Higher scores in each category indicate a higher demand for the task or a lack of success in the task. Scores from all categories were averaged for the total unweighted NASA-TLX score for each case. Scores were compared between SOC and VRCEP.
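The unweighted scoring described above amounts to a simple mean of the six subscale ratings. The following is a minimal sketch of that calculation; the dictionary keys and the example ratings are hypothetical and are not trial data.

```python
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "effort",
    "performance",
    "frustration",
)

def unweighted_tlx(ratings):
    """Mean of the six NASA-TLX subscale ratings for one case.

    Each rating must lie between 0 and 100 in five-point increments.
    """
    for name in SUBSCALES:
        value = ratings[name]
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"{name} must be 0-100 in steps of 5, got {value}")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)

# Hypothetical ratings for a single case (illustrative only).
example_case = {
    "mental_demand": 70,
    "physical_demand": 40,
    "temporal_demand": 55,
    "effort": 60,
    "performance": 30,
    "frustration": 15,
}
total_score = unweighted_tlx(example_case)  # (70+40+55+60+30+15) / 6 = 45.0
```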

In addition to the validated NASA-TLX, a post-operative surgeon assessment survey was developed to capture the surgeon’s perspective of the VRCEP. Post-operatively, surgeons in the VRCEP cohort were surveyed on the VR’s helpfulness and impact on pre-operative planning, and any limitations in its use.

Margin events

The margin event score (MES) and margin event rate (MER) were developed as secondary endpoints using the Clinical Trials Transformation Initiative Framework. While this framework is typically designed for qualitative, patient-driven outcomes for digital health technology, we believe it is applicable in the design of the quantitative trial endpoints presented here.

A “margin event” (ME) was defined as defect-driven margins, positive intra-operative frozen section margins, and/or positive final margins. Thus, each case could have a maximum of three possible events. The MES was calculated per case, along with a mean MES per cohort. We also calculated an MER per cohort as the sum of all MEs over the total possible MEs. Defect-driven margins were defined as margins taken at the time of surgery from the defect/tumor bed, rather than the surgical specimen. The standard for our institution is to take margins from the surgical specimen for pathologic analysis when possible. Positive final margins were defined as at least one positive margin at the time of surgery. We did not penalize for intra-operative frozen sections taken for reasons other than margin assessment, such as confirmation of disease or for nerve margins, given the challenges of identifying this on imaging.
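As a minimal sketch of these definitions, the MES for a case can be computed as the count of its margin events, and the MER for a cohort as the total events divided by the total possible events. The sketch assumes every case is assessable for all three event types, so the denominator is three per case; the trial's actual denominators may have differed (for example, if frozen sections were not taken in every case). The class, field names, and cohort data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Case:
    defect_driven_margin: bool
    positive_frozen_margin: bool
    positive_final_margin: bool

    def mes(self) -> int:
        """Margin event score: number of margin events in this case (0-3)."""
        return (int(self.defect_driven_margin)
                + int(self.positive_frozen_margin)
                + int(self.positive_final_margin))

def mean_mes(cases):
    """Mean MES across a cohort."""
    return sum(case.mes() for case in cases) / len(cases)

def mer(cases, events_per_case=3):
    """Margin event rate: observed events over total possible events.

    Assumes each case contributes `events_per_case` possible events.
    """
    return sum(case.mes() for case in cases) / (events_per_case * len(cases))

# Hypothetical cohort of four cases (illustrative only, not trial data).
cohort = [
    Case(False, False, False),
    Case(True, False, False),
    Case(True, True, False),
    Case(False, False, False),
]
cohort_mean_mes = mean_mes(cohort)  # 3 events / 4 cases = 0.75
cohort_mer = mer(cohort)            # 3 events / 12 possible = 0.25
```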

Ad hoc margin event analysis

Initially, we did not stratify for cases in which taking specimen-driven margins was not feasible. We considered DDMs in such cases to be “expected defect-driven margins”; these cases included endoscopic laser laryngeal, maxillectomy, and skull base resections. We performed an ad hoc analysis to assess the impact, if any, of these expected DDMs on the significance of the findings. For this analysis, we counted a DDM as an ME only in cases where specimen-driven margins were feasible, and defined this as an “unexpected defect-driven margin” (uDDM).

Post-resection modeling

Post-resection 3D tumor models were constructed as part of the trial assessments. The construction of the models was used as part of the feasibility assessment, but we also conducted surgeon and pathologist markup of margins in an attempt to assess the feasibility of modeling utility. In future studies, we hope to utilize this part of the protocol to assess agreement between providers and effective communication between the services.

Statistical Analysis

Statistical analysis was performed in SPSS Version 28.0.1.1. Descriptive statistics were calculated for patient demographics, cancer stage, and surgeon. Two-tailed t-tests were used to compare overall NASA-TLX scores and subcategory scores between the VRCEP and SOC cohorts and between cancer subsites. Fisher’s exact test was used to compare DDMs, positive frozen margins, positive final margins, and intra-operative change in plan between cohorts. A chi-square test was used to compare MER between cohorts. Two-tailed t-tests were used to compare the average MES. Statistical significance was defined as p < 0.05.

RESULTS

Thirty-nine patients were enrolled between January 2022 and September 2022, with 19 patients assigned to the VRCEP and 20 patients to SOC. Three patients (one randomized to VRCEP and two randomized to SOC) did not undergo the intended intervention because their treatment plan changed to nonsurgical management: one received chemoradiotherapy, one received radiation, and one opted for palliative care. One further VRCEP patient did not undergo the allocated intervention and was therefore excluded from surgeon TLB and margin event analyses. One SOC patient was excluded from the analysis because of meeting the exclusion criteria (benign pathology).

Thirty-four participants with squamous cell carcinoma were included in the per-protocol analysis, with 17 VRCEP and 17 SOC. No differences in demographic, subsite, and T-stage characteristics between groups were identified (p > 0.05) except for a higher proportion of males in the SOC cohort (66.7% versus 33.3%). There were no intra-operative complications or VR-related adverse events. Three VRCEP patients and one SOC patient had reoperation due to post-operative complications: exploration due to thrombosis/hemorrhage (VRCEP), washout due to flap infection (VRCEP), antibiotic and free flap exploration due to delirium (VRCEP), and reoperation for margins due to specimen fragmentation (SOC).

Impact of VR Case Enhancement Protocol

Feasibility

Figure 2: VR Case Enhancement Protocol case. Screen captures of the patient’s scan and virtual resection (top). Images of the surgical specimen on the scanner and in EinScan (middle). 3D renderings of the surgical specimen in 3D slicer with and without markings (bottom).

The pre-operative case was uploaded and completed for 17/18 patients randomized to VRCEP. Three-dimensional specimens were loaded and marked by surgeons and pathologists for all 17 cases with completed pre-operative planning. Results are displayed in Figure 2.
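The feasibility check against the pre-specified 80% cutoff reduces to a single proportion. A minimal sketch using the counts reported above (17 completed protocols out of 18 patients); the function names are hypothetical.

```python
def completion_rate(completed, randomized):
    """Proportion of protocols completed."""
    if randomized <= 0:
        raise ValueError("randomized must be positive")
    return completed / randomized

def is_feasible(completed, randomized, threshold=0.80):
    """True when the completion rate meets the feasibility cutoff."""
    return completion_rate(completed, randomized) >= threshold

rate = completion_rate(17, 18)      # 17/18, approximately 0.944
meets_cutoff = is_feasible(17, 18)  # 0.944 >= 0.80
```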

Surgeon task-load burden

Surgeon task load burden was quantified using the NASA-TLX. There was no significant difference between the VRCEP and SOC cohorts in mean score (45.8 versus 53.0) or in any subcategory (p > 0.05). For both cohorts, the lowest scores were in the frustration subcategory and the highest were in mental demand. Following stratification by surgeon, there remained no significant differences in overall or subcategory scores.

Surgeon assessment survey—benefit of VRCEP

Surgeons assessed the impact of VR. Surgeons agreed that VRCEP was helpful and easily integrated in 16/16 cases. VR contributed to a change of plan before surgery in 31% of cases. Of those with pre-operative approach changes, only one included an intra-operative plan change. Overall, 3/17 cases in the VRCEP group had an intra-operative change in plan compared to 6/17 SOC cases (p = 0.44). In five cases, surgeons commented that VR was only somewhat or marginally helpful. In all five comments, surgeons noted that the benefit was limited by the technology’s level of detail, segmentation quality, and differentiation of soft tissue. Surgeons noted the benefits of being able to view pre-operative imaging obliquely in addition to standard coronal, axial, and sagittal slicing.
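The intra-operative change-in-plan comparison (3/17 versus 6/17) can be checked with a two-sided Fisher’s exact test. The sketch below is a plain-Python version of the usual definition, summing the probabilities of all 2×2 tables with the same margins that are no more likely than the observed one; it is not the SPSS implementation, and the function name is hypothetical. Table weights are compared as integers to avoid floating-point ties.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Number of ways to obtain k events in row 1 given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    observed = weight(a)
    as_extreme = sum(weight(k) for k in range(lowest, highest + 1)
                     if weight(k) <= observed)
    return as_extreme / comb(n, row1)

# Rows: VRCEP, SOC. Columns: change in plan, no change in plan.
p_value = fisher_exact_two_sided(3, 14, 6, 11)
```

Under these assumptions the result rounds to 0.44, consistent with the reported p-value.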

Margin event rates and score

The impact of VR on margins was analyzed by comparing positive frozen margin, positive final margin, DDM, and composite metrics between the VRCEP and SOC cohorts. The average MES per case was significantly lower in the VRCEP cohort than in the SOC cohort (0.29 versus 0.94, p = 0.014). The VRCEP cohort had a lower, though not statistically significant, positive frozen margin rate (18.8% versus 38.5%, p = 0.41) and positive final margin rate (5.9% versus 20.0%, p = 0.32) compared to SOC. The DDM rate was significantly lower in the VRCEP cohort compared to SOC (10.0% versus 58.8%, p = 0.032). The MER was also significantly lower in the VRCEP cohort (11.6% versus 40.0%, p = 0.0041). With the exclusion of the “expected” DDMs where specimen-driven margins were not feasible, the uDDM rate was still significantly lower in the VRCEP versus SOC cohort (10% versus 57.1%, p = 0.03). In this analysis, the MER remained significantly lower in the VRCEP cohort (11.6% versus 35.6%, p = 0.0047).

CONCLUSION

In this prospective trial, VRCEP was a feasible addition to a range of head and neck surgeries. It was associated with an improved margin event rate and score with no measured impact on surgeon TLB. Further investigation is warranted to evaluate the impact of similar technologies for oncologic surgical planning, the MER and MES as endpoints for surgical trials, and metrics of TLB.
