Research Article, Open Access (CC BY-NC-ND)

Augmentation of a multidisciplinary team meeting with a clinical decision support system to triage breast cancer patients in the United Kingdom

    Martha Martin (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Hartmut Kristeleit (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Danny Ruta* (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Christina Karampera (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Rezzan Hekmat (IBM Watson Health, Cambridge, MA 02142, USA)
    Winnie Felix (IBM Watson Health, Cambridge, MA 02142, USA)
    Bertha InHout (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Ashutosh Kothari (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Majid Kazmi (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)
    Lesedi Ledwaba-Chapman (School of Life Course & Population Sciences, King's College London, WC2R 2LS, UK)
    Amanda Clery (School of Life Course & Population Sciences, King's College London, WC2R 2LS, UK)
    Yanzhong Wang (School of Life Course & Population Sciences, King's College London, WC2R 2LS, UK)
    Bola Coker (School of Life Course & Population Sciences, King's College London, WC2R 2LS, UK)
    Anita M Preininger (IBM Watson Health, Cambridge, MA 02142, USA)
    Roy Vergis (IBM Watson Health, Cambridge, MA 02142, USA)
    Thomas Eggebraaten (IBM Watson Health, Cambridge, MA 02142, USA)
    Chris Gloe (IBM Watson Health, Cambridge, MA 02142, USA)
    Irene Dankwa-Mullan (IBM Watson Health, Cambridge, MA 02142, USA)
    Gretchen Purcell Jackson (Vanderbilt University Medical Center, Nashville, TN 37232, USA; Intuitive Surgical, Sunnyvale, CA 94086, USA)
    Anne Rigg (Guys & St. Thomas' NHS Foundation Trust, Guy's Cancer Center, London, SE1 9RT, UK)

    *Author for correspondence. E-mail address: danny.ruta@gstt.nhs.uk

    Published Online: https://doi.org/10.2217/fmai-2023-0001

    Abstract

    Aim: Multidisciplinary team (MDT) meetings struggle with increasing caseloads. Recent National Health Service (NHS) guidance proposes that patients are triaged for ‘no discussion at MDT’. We examine whether an artificial intelligence (AI)-based clinical decision-support system (CDSS) can support human triage. Methods: Local best practice breast cancer MDT treatment decisions were compared with treatment decisions made by: two, two-person MDT triage teams with and without the CDSS; the CDSS acting ‘alone’; and the historical MDT. A decision tree on whether to triage patients to the CDSS or the MDT was created using supervised learning algorithms. Results: When localized, the CDSS achieved high concordance with local best practice (treatment plan decisions: 92% CDSS vs 96% team 1 vs 92% team 2, not significant [NS]; treatment type decisions: 89% CDSS vs 93% team 1 vs 82% team 2, NS). Using a decision tree 40.2% of cases can be correctly triaged to the CDSS for a treatment plan, and 34.6% for treatment type recommendations. Conclusion: AI-enabled CDSSs can potentially reduce the clinical workload for a breast cancer MDT by up to 40%. Before routine deployment they need to be appropriately localized and validated in prospective studies to evaluate clinical effectiveness and economic impact.

    Tweetable abstract

    AI-enabled clinical decision-support systems can potentially reduce the clinical workload for breast cancer multidisciplinary teams by up to 40%. Before routine deployment they need to be appropriately localized and validated in prospective real-world studies.

    Background

    The advent of a new generation of machine learning (ML) methods and algorithms applied to healthcare [1,2] has led to the development of AI-enabled computerized clinical decision-support systems (CDSS), that present therapeutic options to support oncologists who are treating cancer patients [3]. IBM developed a CDSS, Watson for Oncology (WFO) that used natural language processing and machine learning to generate ranked, evidence-based therapeutic options [4]. This CDSS provided a range of therapeutic options for consideration in cancer treatment plan decisions for all key treatment modalities (surgery, chemotherapy, radiotherapy, hormone therapy and immunotherapy) and specific treatment types (e.g., types of surgical procedures or chemotherapies) for several tumor types. It was developed in collaboration with experts at Memorial Sloan Kettering Cancer Center (MSKCC) and a detailed description of the WFO system is published elsewhere [5].

    One potential application of an AI-enabled CDSS is the streamlining of cancer multidisciplinary team (MDT) meetings. Cancer care by an MDT has been mandated in the United Kingdom's (UK) National Health Service (NHS) since 2000 [6]. As the complexity and volume of cancer cases have increased, and as care has become more personalized, MDTs have struggled to operate effectively [7]. A study by Cancer Research UK found that around half of patients were discussed for 2 min or less, with insufficient time to discuss more complex patients [8]. In 2020, NHS England published guidance on streamlining MDTs [7]. It proposed a process of “introducing Standards of Care as a routine part of MDT to stratify patient cases into those which require full multidisciplinary discussion in the MDT, and those cases which can be listed but not discussed in the MDT, as patient need is met by a Standard of Care (SoC)”. A SoC is defined as “a point in the pathway of patient management where there is a recognized international, national, regional or local guideline on the intervention(s) that should be made available to a patient”. The guidance states that for a patient to be assigned for ‘no discussion at the MDT’, the SoC must have been reviewed by an appropriate person or triage group. The guidance does not specify who/what should make up this triage group, instead it is decided locally.

    This study examines and quantifies the extent to which an AI-based CDSS (WFO in this study) can be used by the MDT to enhance structured decisions around clinically complex patients, and ultimately reduce the prevailing workload and time pressures by streamlining some patients to ‘no discussion at the MDT’. The study objective was to identify a set of less complex, non-metastatic breast cancer cases in which there was less decision conflict and which were suitable for triage by a smaller quorum of clinicians. It assessed the concordance of therapeutic options generated for early-stage non-metastatic breast cancer by two triage groups, each comprising an oncologist and a surgeon, with the therapeutic options provided by the CDSS, in a retrospective sample of breast cancer patients treated at Guy's Cancer Centre.

    Methods

    Study design & patient selection

    The retrospective concordance study compared local best practice breast cancer MDT treatment decisions with treatment decisions made by: two MDT triage teams, each comprising one consultant breast medical oncologist and one consultant breast surgeon, with and without decision support from the CDSS; the CDSS acting ‘alone’; and the historical MDT decision. MDT triage teams were blinded to each other's and historical MDT decision outcomes. Local best practice MDT breast cancer treatment decisions, or ‘consensus panel decisions’ were derived at the end of the study from the consensus decisions of the two consultant medical oncologists and two consultant breast surgeons with knowledge of the historical MDT outcome and the CDSS therapeutic options, and these decisions were considered the gold standard. The teams reviewed 213 MDT retrospective cases. We included patients discussed at the MDT between 2017 and 2018 apart from three who opted out of MDT triage. Female patients with Stage 1–3 invasive breast cancer were selected, excluding patients who could not be analyzed by the CDSS due to unavailable treatment options. These were patients with recurrent breast cancer, bilateral breast cancer, male patients and adenoid/colloid/secretory/tubular histology. Although the CDSS supported metastatic patients on the first line of therapy, these were also excluded as advanced disease was not deemed appropriate for triage in a real-world setting.

    The study was discussed with the GSTT Information Governance team and was classified as a service evaluation and did not require ethical or Institutional Review Board (IRB) approval. In accordance with information governance recommendations, all eligible patients were invited by letter to opt-out, and three patients chose to do so.

    WFO

    The breast cancer module contained a repository of up to 270 clinical attributes that influenced the WFO algorithm. Examples include demographic/family/medical histories, comorbidities, functional status, endocrine status, genetic profiling, tumor characteristics such as tumor, node, metastasis (TNM) staging/grade/receptor status, prior treatment modalities including their timing and responses to treatments, and major organ functional status. For a given case, WFO used a dynamic subset of attributes to derive a treatment recommendation.

    WFO provided a list of US-based treatments, which may differ from UK practice. We were not able to retrain WFO to learn and adapt to UK practice within the constraints of this study. We addressed these differences in practice by ‘post-localization’ analysis which we describe in the Results section of this paper.

    Data collection

    Patient attributes were extracted from electronic health records and transcribed into a version of the CDSS to support research, the Concordance Case Tool. Treatment options were collected and managed as study data using Research Electronic Data Capture (REDCap) electronic data capture tools hosted at Guy's and St Thomas Hospital. REDCap is a secure, web-based software platform designed to support data capture for research studies [9,10]. Study data were categorized by ‘treatment plan’ (surgery, chemotherapy, endocrine therapy, radiotherapy) and ‘treatment type’ (e.g., for surgery, wide local excision versus mastectomy).

    Concordance analysis

    Concordance was compared between local current best practice MDT treatment decisions and: the historical MDT decisions; the MDT triage team decisions before and after being shown the CDSS therapeutic options; and the CDSS therapeutic options. The consensus panel reviewed all cases where CDSS therapeutic options were discordant with local best practice. Several clinical scenarios were identified where CDSS US-based treatments differed from UK practice. We were not able to retrain the CDSS to learn and adapt to UK practice within the constraints of this study; however, such localization is feasible, and to assess the potential impact of CDSS adaptation to UK practices these scenarios were re-classified as concordant in a further ‘post-localization’ analysis.

    The two independent MDT triage teams were presented with the same patient attributes as the CDSS, and their agreed treatment plan and type were recorded by an independent researcher in REDCap. The teams were then shown the CDSS output and allowed to revise their prior decisions. This generated ‘pre’ and ‘post’ CDSS output for each team.

    Data analysis & statistics

    The characteristics of each breast cancer case were summarized using counts and frequencies. Concordance is reported with 95% confidence intervals (CIs) approximated using the Wilson interval. Fisher's exact test is used to assess differences in concordance between the MDTs and the CDSS and to assess differences in concordance within case characteristics. Estimates for performance metrics were obtained by randomly sampling 50% of the data 1000 times and taking the median performance of the decision tree on these subsamples. 95% CIs for the performance metrics were calculated by taking the 2.5th and 97.5th percentiles of the 1000 values. All analyses were conducted using the R software version 3.6.3.
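    The two interval procedures described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function names, the choice to subsample without replacement, and the percentile interpolation method are all assumptions. Applied to the 182 concordant cases out of 213 reported for treatment plan before localization, the Wilson interval should come out close to the 80.1–89.6% shown in Table 2.

```python
import random
import statistics
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion, returned as (lower, upper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * ((p * (1 - p) / n + z**2 / (4 * n**2)) ** 0.5) / denom
    return centre - half_width, centre + half_width


def subsample_summary(cases, metric, n_rep=1000, frac=0.5, seed=0):
    """Median and 2.5th/97.5th percentiles of `metric` over random subsamples.

    Assumption: each subsample is drawn without replacement and contains
    `frac` of the cases; the text does not state this detail.
    """
    rng = random.Random(seed)
    k = max(1, round(len(cases) * frac))
    values = [metric(rng.sample(cases, k)) for _ in range(n_rep)]
    # With n=40 the first and last cut points are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return statistics.median(values), cuts[0], cuts[-1]


# Treatment plan concordance before localization: 182 of 213 cases (Table 2).
low, high = wilson_interval(182, 213)
print(f"{100 * low:.1f} to {100 * high:.1f}")
```

    In `subsample_summary`, `cases` is any list of per-case records and `metric` is a function that maps a subsample to a single number, such as the specificity of a decision tree on that subsample.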

    A decision tree that decides whether to triage breast cancer patients to the CDSS or the MDT was created using fast-and-frugal trees (FFTs) that were constructed with the R package FFTrees [11]. FFTs [12,13] are supervised learning algorithms for binary classification tasks. We chose age, prior early-stage treatments, clinical stage, histology, estrogen receptor (ER) status, progesterone receptor (PR) status and human epidermal growth factor receptor 2 (HER2) status as variables for the decision tree because they had particularly high or low decision concordance and were clinically relevant. The outcome variable being classified was the CDSS concordance with consensus panel MDT treatment recommendations.

    The values of a confusion matrix of case concordance and the prediction by the CDSS were used to calculate the performance metrics that in turn were used to assess the predictive ability of the FFTs. The performance of decision trees was assessed using sensitivity, specificity, balanced accuracy, weighted accuracy and percentage of patients whose cases were sent to be analyzed by the CDSS.

    In this study, a false positive translates to a breast cancer patient case being sent to the CDSS and receiving treatment therapeutic options discordant with the consensus panel MDT. Conversely, a false negative translates to a breast cancer patient case being sent to an MDT despite the CDSS providing therapeutic options concordant with the consensus panel MDT. A false positive will result in poorer patient outcomes than a false negative; therefore, we aimed to create a decision tree that minimized the number of false positives, in other words, that maximized specificity.
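    The definitions above map onto a confusion matrix in the usual way. The sketch below, in Python, shows how the reported metrics follow from the four cell counts; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def triage_metrics(tp, fp, fn, tn):
    """Performance metrics for a triage rule, from confusion-matrix counts.

    tp: sent to the CDSS, CDSS concordant with the consensus panel
    fp: sent to the CDSS, CDSS discordant with the consensus panel
    fn: sent to the MDT, although the CDSS was concordant
    tn: sent to the MDT, CDSS discordant
    """
    total = tp + fp + fn + tn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one concordant and one discordant case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "fraction_triaged_to_cdss": (tp + fp) / total,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = triage_metrics(tp=40, fp=0, fn=60, tn=10)
print(metrics)
```

    With no false positives the specificity is 1.0, which is the property the study prioritized: no case sent to the CDSS receives options discordant with the consensus panel.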

    Weighted accuracy considers both sensitivity and specificity and, depending on a weight, w, can be weighted in favor of one or the other. Values of w closer to 1 weight in favor of predictive tools with high sensitivity and values of w closer to 0 weight in favor of tools with high specificity. We chose weighted accuracy with a weight of 0.3 to be the performance metric used in building and choosing FFTs. The most accurate MDT triage team achieved 96.2 and 93.9% concordance for treatment plan and type, respectively. These values were therefore selected as the minimum benchmarks for specificity in the study. After the best performing FFT was chosen, an estimate of each performance metric was calculated with 95% CIs.
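    The text does not give a formula for weighted accuracy. The sketch below assumes the common definition, w × sensitivity + (1 − w) × specificity, which matches the description of w given above. The function name is illustrative. Under this assumption, the treatment plan sensitivity and specificity reported in Table 4 (43.4% and 100.0%) give a value of about 83.0% at w = 0.3, in line with the weighted accuracy reported there.

```python
def weighted_accuracy(sensitivity, specificity, w):
    """Weighted accuracy, assuming the form w * sens + (1 - w) * spec.

    w near 1 favors sensitivity; w near 0 favors specificity.
    w = 0.5 gives balanced accuracy.
    """
    if not 0 <= w <= 1:
        raise ValueError("w must lie between 0 and 1")
    return w * sensitivity + (1 - w) * specificity


# Treatment plan values from Table 4, expressed as proportions.
print(weighted_accuracy(0.434, 1.0, w=0.3))  # weight used in the study
print(weighted_accuracy(0.434, 1.0, w=0.5))  # balanced accuracy
```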

    Results

    Sample characteristics

    The characteristics of each case of breast cancer included in the study can be found in Table 1.

    Table 1. The characteristics of the sample.
    Characteristic | Number of patients, n (%)
    All cases | 213 (100.0)
    Age (years)
      <50 | 108 (50.7)
      ≥50 | 105 (49.3)
    Sex
      Female | 213 (100.0)
      Male | 0 (0.0)
    Prior early-stage treatments
      None | 105 (49.3)
      Chemotherapy | 8 (3.8)
      Surgery | 59 (27.7)
      Surgery and chemotherapy | 30 (14.1)
      Surgery and chemotherapy, targeted | 11 (5.2)
    Clinical stage
      Stage 1 | 43 (20.2)
      Stage 2 | 133 (62.4)
      Stage 3 | 37 (17.4)
    Tumor grade
      Low | 15 (7.0)
      Intermediate | 96 (45.1)
      High | 102 (47.9)
    Tumor focality
      Unifocal | 170 (79.8)
      Multifocal | 40 (18.8)
      Multicentric | 3 (1.4)
    Tumor location
      Lateral | 125 (58.7)
      Medial | 58 (27.2)
      Medial and lateral (overlapping) | 30 (14.1)
    Histology
      Ductal | 182 (85.4)
      Lobular | 18 (8.5)
      Rare subtypes | 13 (6.1)
    ER status
      Negative | 75 (35.2)
      Positive | 138 (64.8)
    PR status
      Negative | 112 (52.6)
      Positive | 101 (47.4)
    HER2 status
      Negative | 177 (83.1)
      Positive | 36 (16.9)

    Post-localization

    Four main clinical practice changes to the CDSS's therapeutic options were identified in order to adapt the CDSS to conform to local best practice (referred to here as ‘localization’). In 13 cases this was a change in treatment plan, while in 36 cases this was a change in treatment type. These were: use of neoadjuvant carboplatin for triple negative breast cancer patients (TNBC; 0/13 for treatment plan, 22/36 for treatment type), avoidance of adjuvant Capecitabine after neoadjuvant chemotherapy in ER positive (ER+) patients (7/13 for treatment plan, 7/36 for treatment type), avoidance of adjuvant capecitabine after neoadjuvant chemotherapy in patients that had received neoadjuvant carboplatin (6/13 for treatment plan, 6/36 for treatment type), and use of adjuvant pertuzumab in node negative (N0) patients (0/13 for treatment plan, 1/36 for treatment type).

    Concordance

    Figure 1 displays the concordance of each group.

    Figure 1. Percentage concordance of treatment recommendations (n = 213).

    CDSS: Clinical decision support system; MDT: Multidisciplinary team.

    The CDSS achieved 85% concordance with local best practice for treatment plan recommendations and 70% concordance for treatment type recommendations. When localized to local clinical best practice, the CDSS achieved much higher concordance, and no statistically significant difference from either MDT triage team was observed (treatment plan decisions: 92% CDSS vs 96% team 1 vs 92% team 2, not significant [NS]; treatment type decisions: 89% CDSS vs 93% team 1 vs 82% team 2, NS).

    Tables 2 and 3 display for treatment plan and treatment type, respectively, the CDSS's concordance before and after localization to local clinical best practice.

    Table 2. Concordance of treatment plan before and after clinical decision support system localization to local clinical best practice.
    Characteristic | All cases, n (%) | Pre-localization CC, n | Pre-localization concordance, % (95% CI) | Pre-localization p-value | Post-localization CC, n | Post-localization concordance, % (95% CI) | Post-localization p-value
    All patients | 213 | 182 | 85.4 (80.1–89.6) | | 197 | 92.5 (88.1–95.3) |
    Age (years)
      <50 | 108 (50.7) | 93 | 86.1 (78.3–91.4) | 0.932 | 103 | 95.4 (89.6–98.0) | 0.174
      ≥50 | 105 (49.3) | 89 | 84.8 (76.7–90.4) | | 94 | 89.5 (82.2–94.0) |
    PEST
      None | 105 (49.3) | 99 | 94.3 (88.1–97.4) | <0.001 | 99 | 94.3 (88.1–97.4) | 0.419
      Chemo | 8 (3.8) | 8 | 100.0 (67.6–100.0) | | 8 | 100.0 (67.6–100.0) |
      Surgery | 59 (27.7) | 50 | 84.7 (73.5–91.8) | | 52 | 88.1 (77.5–94.1) |
      Surgery and chemo | 30 (14.1) | 14 | 46.7 (30.2–63.9) | | 27 | 90.0 (74.4–96.5) |
      Surgery and chemo, targeted | 11 (5.2) | 11 | 100.0 (74.1–100.0) | | 11 | 100.0 (74.1–100.0) |
    Clinical stage
      Stage 1 | 43 (20.2) | 42 | 97.7 (87.9–99.6) | 0.007 | 43 | 100.0 (91.8–100.0) | 0.063
      Stage 2 | 133 (62.4) | 113 | 85.0 (77.9–90.0) | | 122 | 91.7 (85.8–95.3) |
      Stage 3 | 37 (17.4) | 27 | 73.0 (57.0–84.6) | | 32 | 86.5 (72.0–94.1) |
    Tumor grade
      Low | 15 (7.0) | 11 | 73.3 (48.0–89.1) | 0.194 | 12 | 80.0 (54.8–93.0) | 0.057
      Intermediate | 96 (45.1) | 80 | 83.3 (74.6–89.5) | | 87 | 90.6 (83.1–95.0) |
      High | 102 (47.9) | 91 | 89.2 (81.7–93.9) | | 98 | 96.1 (90.3–98.5) |
    Tumor focality
      Unifocal | 170 (79.8) | 145 | 85.3 (79.2–89.8) | 0.771 | 157 | 92.4 (87.4–95.5) | 0.883
      Multifocal | 40 (18.8) | 34 | 85.0 (70.9–92.9) | | 37 | 92.5 (80.1–97.4) |
      Multicentric | 3 (1.4) | 3 | 100.0 (43.9–100.0) | | 3 | 100.0 (43.9–100.0) |
    Tumor location
      Lateral | 125 (58.7) | 102 | 81.6 (73.9–87.4) | 0.149 | 112 | 89.6 (83.0–93.8) | 0.163
      Medial | 58 (27.2) | 52 | 89.7 (79.2–95.2) | | 56 | 96.6 (88.3–99.0) |
      Medial and lateral | 30 (14.1) | 28 | 93.3 (78.7–98.2) | | 29 | 96.7 (83.3–99.4) |
    Histology
      Ductal | 182 (85.4) | 160 | 87.9 (82.4–91.9) | 0.034 | 172 | 94.5 (90.2–97.0) | <0.001
      Lobular | 18 (8.5) | 12 | 66.7 (43.7–83.7) | | 12 | 66.7 (43.7–83.7) |
      Rare subtypes | 13 (6.1) | 10 | 76.9 (49.7–91.8) | | 13 | 100.0 (77.2–100.0) |
    ER status
      Negative | 75 (35.2) | 66 | 88.0 (78.7–93.6) | 0.565 | 73 | 97.3 (90.8–99.3) | 0.088
      Positive | 138 (64.8) | 116 | 84.1 (77.0–89.2) | | 124 | 89.9 (83.7–93.9) |
    PR status
      Negative | 112 (52.6) | 97 | 86.6 (79.1–91.7) | 0.755 | 107 | 95.5 (90.0–98.1) | 0.129
      Positive | 101 (47.4) | 85 | 84.2 (75.8–90.0) | | 90 | 89.1 (81.5–93.8) |
    HER2 status
      Negative | 177 (83.1) | 147 | 83.1 (76.8–87.9) | 0.53 | 162 | 91.5 (86.5–94.8) | 0.404
      Positive | 36 (16.9) | 35 | 97.2 (85.8–99.5) | | 35 | 97.2 (85.8–99.5) |

    CC: Concordant cases; CI: Confidence interval; HER2: Human epidermal growth factor receptor 2; ER: Estrogen receptor; PEST: Prior early-stage treatments; PR: Progesterone receptor.

    Table 3. Concordance of treatment type before and after clinical decision support system localization to local clinical best practice.
    Characteristic | All cases, n (%) | Pre-localization CC, n | Pre-localization concordance, % (95% CI) | Pre-localization p-value | Post-localization CC, n | Post-localization concordance, % (95% CI) | Post-localization p-value
    All patients | 213 | 150 | 70.4 (64.0–76.1) | | 190 | 89.2 (84.3–92.7) |
    Age (years)
      <50 | 108 (50.7) | 75 | 69.4 (60.2–77.3) | 0.867 | 100 | 92.6 (86.1–96.2) | 0.163
      ≥50 | 105 (49.3) | 75 | 71.4 (62.2–79.2) | | 94 | 85.7 (77.8–91.1) |
    PEST
      None | 105 (49.3) | 72 | 68.6 (59.2–76.7) | 0.005 | 95 | 90.5 (83.4–94.7) | 0.660
      Chemo | 8 (3.8) | 8 | 100.0 (67.6–100.0) | | 8 | 100.0 (67.6–100.0) |
      Surgery | 59 (27.7) | 47 | 79.7 (67.7–88.0) | | 50 | 84.7 (73.5–91.8) |
      Surgery and chemo | 30 (14.1) | 14 | 46.7 (30.2–63.9) | | 27 | 90.0 (74.4–96.5) |
      Surgery and chemo, targeted | 11 (5.2) | 9 | 81.8 (52.3–94.9) | | 10 | 90.9 (62.3–98.4) |
    Clinical stage
      Stage 1 | 43 (20.2) | 42 | 97.7 (87.9–99.6) | <0.001 | 43 | 100.0 (91.8–100.0) | 0.032
      Stage 2 | 133 (62.4) | 85 | 63.9 (55.5–71.6) | | 116 | 87.2 (80.5–91.9) |
      Stage 3 | 37 (17.4) | 23 | 62.2 (46.1–75.9) | | 31 | 83.8 (68.9–92.3) |
    Tumor grade
      Low | 15 (7.0) | 10 | 66.7 (41.7–84.8) | 0.265 | 11 | 73.3 (48.0–89.1) | 0.054
      Intermediate | 96 (45.1) | 73 | 76.0 (66.6–83.5) | | 84 | 87.5 (79.4–92.7) |
      High | 102 (47.9) | 67 | 65.7 (56.1–74.2) | | 95 | 93.1 (86.5–96.6) |
    Tumor focality
      Unifocal | 170 (79.8) | 127 | 74.7 (67.7–80.6) | 0.005 | 154 | 90.6 (85.3–94.1) | 0.277
      Multifocal | 40 (18.8) | 20 | 50.0 (35.2–64.8) | | 33 | 82.5 (68.1–91.3) |
      Multicentric | 3 (1.4) | 3 | 100.0 (43.9–100.0) | | 3 | 100.0 (43.9–100.0) |
    Tumor location
      Lateral | 125 (58.7) | 84 | 67.2 (58.6–74.8) | 0.358 | 107 | 85.6 (78.4–90.7) | 0.084
      Medial | 58 (27.2) | 42 | 72.4 (59.8–82.2) | | 56 | 96.6 (88.3–99.0) |
      Medial and lateral | 30 (14.1) | 24 | 80.0 (62.7–90.5) | | 27 | 90.0 (74.4–96.5) |
    Histology
      Ductal | 182 (85.4) | 132 | 72.5 (65.6–78.5) | 0.248 | 166 | 91.2 (86.2–94.5) | 0.006
      Lobular | 18 (8.5) | 10 | 55.6 (33.7–75.4) | | 12 | 66.7 (43.7–83.7) |
      Rare subtypes | 13 (6.1) | 8 | 61.5 (35.5–82.3) | | 12 | 92.3 (66.7–98.6) |
    ER status
      Negative | 75 (35.2) | 40 | 53.3 (42.2–64.2) | <0.001 | 70 | 93.3 (85.3–97.1) | 0.230
      Positive | 138 (64.8) | 110 | 79.7 (72.2–85.6) | | 120 | 87.0 (80.3–91.6) |
    PR status
      Negative | 112 (52.6) | 69 | 61.6 (52.4–70.1) | 0.005 | 103 | 92.0 (85.4–95.7) | 0.251
      Positive | 101 (47.4) | 81 | 80.2 (71.4–86.8) | | 87 | 86.1 (78.1–91.6) |
    HER2 status
      Negative | 177 (83.1) | 119 | 67.2 (60.0–73.7) | 0.039 | 157 | 88.7 (83.2–92.6) | 0.820
      Positive | 36 (16.9) | 31 | 86.1 (71.3–93.9) | | 33 | 91.7 (78.2–97.1) |

    CC: Concordant case; CI: Confidence interval; ER: Estrogen receptor; HER2: Human epidermal growth factor receptor 2; PEST: Prior early-stage treatments; PR: Progesterone receptor.

    The CDSS showed high treatment plan concordance with local best practices in treatment naive patients, patients with prior chemotherapy, and in Stage 1 disease. After adapting to UK treatment practice (localization) high treatment plan concordance was also seen in ductal histology. For treatment type high concordance was demonstrated in patients with prior chemotherapy, Stage 1 disease and multicentric tumors. Statistically significant positive differences in concordance were also found in cases who were ER+, PR+, and HER2+ but concordance levels were low (79.7, 80.2 and 86.1%). Post-localization, the CDSS showed higher treatment type concordance with local best practice in cases with Stage 1 cancer (100%).

    FFT for treatment plan

    FFT analysis was used to identify clinical parameters in which the CDSS therapeutic options showed high concordance with MDT outcomes. This would allow identification of a subset of patients that could be triaged to the CDSS and would not require an MDT discussion.

    Figure 2 displays the decision tree that decides whether a case should be triaged to the CDSS or an MDT for their treatment plan. A case should be triaged to the CDSS for their treatment plan if:

    • The case has a stage 1 cancer;

    • The case has stage 2 or 3 cancer, is under 50 years, and HER2+;

    • The case has stage 2 or 3 cancer, is under 50 years, HER2-, and ER-.
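    The three rules above can be written as a short sequence of checks, which is how a fast-and-frugal tree is typically applied. The Python sketch below is an illustration only: the function and parameter names are hypothetical, and the order in which the cues are tested is inferred from the bullet list rather than taken from Figure 2.

```python
def triage_treatment_plan(stage, age, her2_positive, er_positive):
    """Return "CDSS" or "MDT" for a treatment plan, following the listed rules.

    Assumed cue order: clinical stage, then age, then HER2, then ER.
    Only Stage 1-3 cases were eligible in the study.
    """
    if stage not in (1, 2, 3):
        raise ValueError("only Stage 1-3 cases are covered by this tree")
    if stage == 1:
        return "CDSS"
    # Stage 2 or 3 from here on.
    if age >= 50:
        return "MDT"
    if her2_positive:
        return "CDSS"
    # Under 50 and HER2-negative: only ER-negative cases go to the CDSS.
    return "MDT" if er_positive else "CDSS"


print(triage_treatment_plan(stage=2, age=45, her2_positive=False, er_positive=False))
print(triage_treatment_plan(stage=3, age=62, her2_positive=True, er_positive=False))
```

    Every case that does not match one of the three rules falls through to the MDT, which reflects the study's preference for avoiding false positives.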

    Figure 2. Decision tree for treatment plan (top) and treatment type (bottom).

    CDSS: Clinical decision support system; ER: Estrogen receptor; HER2: Human epidermal growth factor receptor 2; MDT: Multidisciplinary team; PR: Progesterone receptor.

    The performance of this decision tree is presented in Table 4. The sensitivity of the decision tree was 43.4% (95% CI: 36.4–50.5%) and the specificity was 100.0% (95% CI: 100.0–100.0%). On average, 40.2% (95% CI: 33.6–46.7%) of cases were triaged to the CDSS. Because the specificity of this tree did not fall below 100%, we can say that on average, we would expect 40% of cases to be correctly triaged to the CDSS for a treatment plan.

    Table 4. Performance of the fast-and-frugal trees for treatment plan and type.
    Outcome | Performance metric | Average value, % (95% CI)
    Treatment plan | Sensitivity | 43.4 (36.4–50.5)
    Treatment plan | Specificity | 100.0 (100.0–100.0)
    Treatment plan | Balanced accuracy | 71.7 (68.2–75.3)
    Treatment plan | Weighted accuracy | 83.0 (80.9–85.2)
    Treatment plan | % triaged to Watson for Oncology (WFO) | 40.2 (33.6–46.7)
    Treatment type | Sensitivity | 38.9 (32.6–46.3)
    Treatment type | Specificity | 100.0 (100.0–100.0)
    Treatment type | Balanced accuracy | 69.5 (66.3–73.2)
    Treatment type | Weighted accuracy | 81.7 (79.8–83.9)
    Treatment type | % triaged to WFO | 34.6 (29.0–41.1)

    FFT for treatment type

    Figure 2 displays the decision tree that decides whether a case should be triaged to the CDSS or an MDT for their treatment type. A case should be triaged to the CDSS for their treatment type if:

    1. The case has a stage 1 cancer;

    2. The case has stage 2 or 3 cancer, is under 50 years, ER-, PR- and HER2-.
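    The two treatment type rules reduce to a single condition for Stage 2 or 3 cases. A minimal Python sketch follows; the function and parameter names are hypothetical and the cue order in Figure 2 is not reproduced.

```python
def triage_treatment_type(stage, age, er_positive, pr_positive, her2_positive):
    """Return "CDSS" or "MDT" for treatment type, following the listed rules.

    Only Stage 1-3 cases were eligible in the study.
    """
    if stage not in (1, 2, 3):
        raise ValueError("only Stage 1-3 cases are covered by this tree")
    if stage == 1:
        return "CDSS"
    # Stage 2 or 3: only patients under 50 who are ER-, PR- and HER2-
    # are sent to the CDSS.
    receptor_negative = not (er_positive or pr_positive or her2_positive)
    if age < 50 and receptor_negative:
        return "CDSS"
    return "MDT"


print(triage_treatment_type(2, 45, er_positive=False, pr_positive=False, her2_positive=False))
print(triage_treatment_type(2, 45, er_positive=False, pr_positive=False, her2_positive=True))
```

    This rule is stricter than the treatment plan rule, which is consistent with the smaller share of cases triaged to the CDSS for treatment type.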

    The performance of this decision tree is presented in Table 4. The sensitivity of the decision tree was 38.9% (95% CI: 32.6–46.3%) and the specificity was 100.0% (95% CI: 100.0–100.0%). On average, 34.6% (95% CI: 29.0–41.1%) of cases were triaged to the CDSS. Because the specificity of this tree did not fall below 100%, we can say that on average, we would expect 35% of cases to be correctly triaged to the CDSS for treatment type.

    Discussion

    This study is one of the first to evaluate the utility of an AI-based oncology CDSS tool in streamlining MDT workloads. We found that up to 40% of eligible breast cancer cases can be triaged by a CDSS straight to the treating clinician, avoiding MDT discussion and potentially reducing MDT workloads or increasing the time available to discuss more complex treatment decisions. A 40% reduction in the size of the MDT list may not equate to a 40% reduction in MDT time if the time required to discuss a case is proportional to its complexity and it is the less complex patients who are triaged to the CDSS. That proportionality may not hold, however, and a prospective study would be required to establish the actual time saving.

    The validity of this CDSS's therapeutic options has been assessed in studies in India, Korea and China by measuring the level of concordance between the CDSS and local multidisciplinary tumor boards [14–19]. Results have been variable, ranging from 20.2% concordance in colon cancer patients aged 70 and over at Gil Medical Centre, South Korea [14], to 98% in HER2-positive metastatic breast cancer patients at Manipal Hospital, India [17]. Several explanations have been offered for this variance in concordance, including tumor type, disease stage, patient age, availability of drugs and other local practice differences, different release versions of the CDSS [14–19] and an MSKCC bias towards US published studies [20].

    Although published evidence is limited, it is not unreasonable to suggest that CDSSs may have a place in routine cancer care. This is supported by a study from Thailand which found 70% of CDSS therapeutic options acceptable or identical to local practice where the gold standard was made by an expert panel of clinicians rather than concordance with the MDT only [5], and a systematic review of all previous concordance studies showed the CDSS had greater concordance with MDTs than individual clinician decisions [21].

    A recent meta-analysis found the highest consistency in the CDSS ‘recommended’ and ‘for consideration’ MDT decisions was in breast cancer cases [22]. Although direct comparison is limited due to our separating treatment plan versus treatment type, their subgroup analysis was consistent with our own decision tree. They found there was greater consistency in lower stage cases, and when they looked at “recommended” treatment options only they found greater consistency in hormone receptor negative patients (including hormone receptor negative, HER2+) than hormone receptor positive patients, which also echoes our own decision tree for treatment type and sending cases straight to the CDSS. Papers also highlighted the need for localization as well as the role of CDSSs in supporting rather than replacing clinician decision making [22].

    Our study examined a specific application of a CDSS within a UK cancer pathway designed to address a pressing NHS need of cancer care providers: how best to introduce standards of care to streamline MDTs so that some patients, once reviewed by an appropriate person or triage group to have met recognized national or local guidelines or ‘Standard of Care’ protocols, can be assigned for ‘no discussion at the MDT’ (NHS England, 2019). Currently, there is no consensus or evidence base on what constitutes the most appropriate person or triage group. We compared the treatment plan and treatment type decisions of a consensus panel of two breast medical oncologists and two breast surgeons, which we defined as local best practice, with the decisions made by those same clinicians, when they formed two independent two-person oncologist/surgeon triage teams. We assessed consensus panel – triage team concordance both with and without clinical decision support from a CDSS. Both teams achieved very high concordance for treatment plan decisions (96% and 93%). Concordance for treatment type decisions was slightly lower than the treatment plan for team 1 (93%), but was markedly lower for team 2 (82%). This suggests that streamlining MDT recommendations for treatment type using two-person triage teams may introduce unacceptable variations in clinical decision making for those patients who are not discussed at a full MDT. Access to the CDSS changed the teams' decisions in only one case for team 1 and 3 cases for team 2, suggesting that the CDSS offered minimal clinical benefit as a support tool to the triage teams.

    When the CDSS was assessed as a stand-alone triage tool for use with breast cancer patients meeting the eligibility criteria, it initially failed to achieve the levels of concordance with local best practices seen in the human triage teams. However, after the tool was localized to take account of four key differences between MSKCC and local MDT clinical best practice, our post-localization CDSS analysis suggests that the CDSS approached the performance levels seen in our two human triage teams for treatment plan decisions and may have actually performed better than one of the teams for treatment type decisions. Our FFT analysis, however, provides the most promising evidence for the potential value of an AI CDSS within a cancer pathway. Our findings suggest that, with a simple triage decision tree that uses clinical stage, patient age, HER2, ER and PR status and that could be administered by a trained non-clinical MDT coordinator/supervising clinician or automated within the electronic patient record, and with localization of a CDSS to account for local best practice, it might be possible to streamline a breast cancer MDT pathway such that 40% of eligible patients could be referred to a CDSS for treatment plan therapeutic options and assigned for ‘no discussion at the MDT’. This would require the appropriate clinical information, including staging, to be available at the time of triage. Our FFT analysis predicts that this CDSS would be 100% concordant with local MDT best practice for this sub-group of cases, and would therefore require minimal clinical supervision. If treatment type therapeutic options are required, approximately 35% of eligible patients could be safely referred to the CDSS, again with 100% concordance with local MDT best practice.
We estimate that if the triage decision tree, combined with the eligibility criteria for the version of the CDSS used in this study were applied to the current Guy's Cancer Centre breast cancer pathway, the clinical burden on the MDTs would be reduced by approximately 30%. This could be translated into shorter MDTs and more time for clinical work, or longer discussions on more complex cases. This could be investigated in a prospective study implementing AI-CDSS triage, by the MDT meeting being kept the same length initially and the clinical team evaluating if there is more value to patient care by the meetings being shorter, or the meetings being the same length but having more time to discuss fewer cases.
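
    One simple way to relate a pathway-level estimate of this kind to the share of eligible patients referred to the CDSS is sketched below. It rests on an assumption not stated in the text: that the overall reduction is the product of the share of pathway cases meeting the CDSS eligibility criteria and the share of those eligible cases triaged away from the MDT. The function name is hypothetical, and the eligible share used in the example is a placeholder rather than a study figure.

```python
def mdt_workload_reduction(eligible_share: float, triaged_share: float) -> float:
    """Fraction of all pathway cases removed from MDT discussion.

    Assumes (not stated in the study) that the reduction is simply the
    product of the two shares.
    """
    for name, value in (("eligible_share", eligible_share),
                        ("triaged_share", triaged_share)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return eligible_share * triaged_share


# triaged_share=0.40 is the reported share of eligible patients referred to the CDSS.
# eligible_share=0.75 is a PLACEHOLDER chosen for illustration, not a study figure.
reduction = mdt_workload_reduction(eligible_share=0.75, triaged_share=0.40)
print(f"{reduction:.0%}")  # prints 30%
```

    A local service could substitute its own eligible share to see how sensitive the estimated reduction is to case mix.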

    There were several methodological limitations in our study design, and we made some assumptions when considering the potential value of a CDSS for routine UK breast cancer care that may not hold true. Concerning the study design, we used the decisions of a ‘consensus panel’ of our two participating medical oncologists and two surgeons as a proxy ‘ground truth’/gold standard for local best practice, and these same clinicians then formed the two triage teams. The two oncologists and one of the surgeons are amongst the most senior members of the local breast MDTs. All four clinicians are core MDT members; therefore, it is not unreasonable to use their decisions as a proxy for local best practice. A more rigorous design could have included other clinical members of a full MDT in the consensus panel, although we have no reason to believe their participation would materially alter the consensus decisions. We could also have formed triage teams from clinicians who were not part of the consensus panel, and drawn on clinicians from other UK centers to establish the ‘ground truth’/gold standard. Of note, however, as standards of care to streamline MDTs are introduced in the UK, the triage groups asked to review patients are likely to comprise clinicians similarly drawn from the full MDTs.

    Our post-localization CDSS analysis assumes that the machine learning algorithms used to generate CDSS therapeutic options can be adapted to consider local clinical practice differences from MSKCC practice, such as those identified in our study. While this is technically quite feasible, the resource implications in re-training a CDSS for adaptation to every national or regional setting in which such a tool is implemented may make localization commercially non-viable unless cheaper computational methods other than machine learning (such as expert systems) can be applied.

    Cancer stage is a vital patient characteristic driving the triage decision tree that resulted in a sizeable sub-group of patients being sent to the CDSS (avoiding the MDT) such that 100% concordance with local MDT best practice was achieved. This presupposes that accurate clinical staging is possible early in the clinical pathway prior to the MDT discussion. This may not be the case for all local breast cancer pathways. It may require that the pathway be re-designed, for example, making imaging or histology results available at an earlier point in the pathway.
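
    The dependence of triage on early staging can be illustrated with a fast-and-frugal tree in which each cue is checked in turn and any missing input sends the case to the full MDT. This is a minimal sketch only: the cue order, the stage set and the age threshold are placeholders invented for illustration and are not the tree derived in the study, and all names (`Patient`, `PLACEHOLDER_CUES`, `triage`) are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Callable, Optional, Sequence, Tuple

MDT = "full MDT discussion"
CDSS = "refer to CDSS"


@dataclass(frozen=True)
class Patient:
    clinical_stage: Optional[str]  # None if staging is not yet available
    age: Optional[int]
    her2_positive: Optional[bool]
    er_positive: Optional[bool]
    pr_positive: Optional[bool]


# Each cue is (label, predicate); a True predicate exits the tree to the MDT.
Cue = Tuple[str, Callable[[Patient], bool]]

# PLACEHOLDER cues and thresholds, invented for illustration only (NOT the study's tree).
PLACEHOLDER_CUES: Sequence[Cue] = (
    ("stage", lambda p: p.clinical_stage not in {"I", "II"}),
    ("age", lambda p: p.age >= 70),
    ("HER2", lambda p: p.her2_positive),
    ("ER", lambda p: not p.er_positive),
    ("PR", lambda p: not p.pr_positive),
)


def triage(patient: Patient, cues: Sequence[Cue] = PLACEHOLDER_CUES) -> Tuple[str, str]:
    """Return (route, reason) for one patient."""
    # Without complete inputs (for example, no clinical stage) the tree cannot run.
    missing = [f.name for f in fields(patient) if getattr(patient, f.name) is None]
    if missing:
        return MDT, "missing: " + ", ".join(missing)
    # Check cues in order; the first one that fires exits to the MDT.
    for label, exits_to_mdt in cues:
        if exits_to_mdt(patient):
            return MDT, f"cue fired: {label}"
    # Final exit: no cue fired, so the case goes to the CDSS.
    return CDSS, "no cue fired"


print(triage(Patient("I", 55, False, True, True)))
print(triage(Patient(None, 55, False, True, True)))
```

    In this sketch a patient whose staging is unavailable at triage is always routed to the full MDT, which is why a pathway that delivers imaging or histology late would send fewer cases to the CDSS.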

    The impact we suggest a CDSS could make on the clinical MDT burden is based on the mix of breast cancer patients seen at the Guy's Cancer Centre. The Centre serves both as a secondary referral center for the local population of South London and as a tertiary referral center with a large early phase clinical trials programme. However, since patients with metastatic disease were excluded from this study, we believe our findings are generalizable to other secondary care cancer centers. It may also be the case that compliance with nationally agreed standards of care is more likely to increase in such settings, where there may be less access to specialized medical oncology expertise. Our study was not designed to test this hypothesis. Still, our finding that the post-localization CDSS treatment type therapeutic options had a higher concordance with our consensus panel decisions than one of the two-person triage teams provides some indirect support.

    Conclusion

    Before AI-enabled, computerized CDSSs can be recommended as effective guideline implementation interventions to promote evidence-based medicine, they must themselves be shown to be evidence-based. Previous prospective concordance studies for the CDSS we used in this study provide some preliminary evidence for its clinical effectiveness in specific settings, and this retrospective study builds on that by demonstrating how, when used as a triage tool and adapted to local best practice guidelines, AI-enabled CDSSs can potentially reduce the clinical workload for a breast cancer MDT by up to 40%. Before such tools can be deployed routinely within a UK cancer pathway, and similarly for deployment globally, they would need to be appropriately localized and clinically validated in prospective studies designed to evaluate both clinical effectiveness and economic impact. Localization and tumor-group-specific prospective evaluation would be required to identify subgroups with the greatest concordance. These tools are meant to augment the decisions of the MDT and are unlikely to replace it. However, they can help reduce the cognitive burden and decompress decision processes. For rapid NHS adoption, a coordinated multicenter evaluation approach, for example through the NHS Transformation Directorate, would seem appropriate.

    Summary points
    • This study is one of the first to evaluate the utility of an artificial intelligence (AI)-based oncology clinical decision support system (CDSS) tool in streamlining workloads for multidisciplinary teams (MDTs).

    • Up to 40% of eligible breast cancer cases can be triaged by a CDSS straight to the treating clinician and assigned for ‘no discussion at the MDT’.

    • This could be translated into shorter MDTs and more time for clinical work, or longer discussions on more complex cases.

    • Routine deployment of an AI CDSS may require that the pathway be re-designed, for example, making imaging or histology results available at an earlier point in the pathway.

    • Re-training a machine learning CDSS for adaptation to every national or regional setting in which such a tool is implemented may make localization commercially non-viable unless cheaper computational methods other than machine learning (such as expert systems) can be applied.

    • Before such tools can be deployed routinely within a UK cancer pathway, and similarly for deployment globally, they would need to be appropriately localized and clinically validated in prospective studies designed to evaluate both clinical effectiveness and economic impact.

    Author contributions

    M Martin, D Ruta, H Kristeleit, R Hekmat, W Felix, I Dankwa-Mullan led the conceptualization of the study, data collection, data analysis, data interpretation and writing and editing of the manuscript. C Karampera, R Vergis, T Eggebraaten made important intellectual contributions towards conceptualization, data collection, data analysis, data interpretation, writing and editing. B InHout, A Kothari, A Rigg, M Kazmi made important contributions towards data collection, writing and editing. L Ledwaba-Chapman, A Clery, Y Wang, B Coker, C Gloe made important contributions towards data collection, data analysis and data interpretation. AM Preininger, I Dankwa-Mullan, GP Jackson made important contributions to the writing and editing of the manuscript and agreed with the final version of the manuscript.

    Financial & competing interests disclosure

    R Hekmat, W Felix, R Vergis, T Eggebraaten, C Gloe, AM Preininger, I Dankwa-Mullan, GP Jackson were previously employees of IBM. IBM provided some funding to assist in the completion of data entry for the project. Otherwise, there was no specific funding for the project. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

    No writing assistance was utilized in the production of this manuscript.

    Ethical conduct of research

    The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the Declaration of Helsinki for all human or animal experimental investigations. In addition, for investigations involving human subjects, informed consent has been obtained from the participants involved.

    Open access

    This work is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/

    References

    • 1. Char DS, Shah NH, Magnus D. Implementing Machine Learning in Health Care – Addressing Ethical Challenges. N. Engl. J. Med. 378(11), 981–983 (2018).
    • 2. He J, Baxter SL, Xu J et al. The practical implementation of artificial intelligence technologies in medicine. Nat. Med. 25(1), 30–36 (2019).
    • 3. Somashekhar SP, Sepúlveda MJ, Norden A et al. Early experiences with IBM Watson for Oncology (CDS) cognitive computing system for lung and colorectal cancer. J. Clin. Oncol. 35(Suppl. 15), 8527–8527 (2017).
    • 4. Steadman I. ibm-watson-medical-doctor. Wired. www.wired.co.uk/article/ibm-watson-medical-doctor (Accessed 9 December 2021).
    • 5. Suwanvecho S, Suwanrusm H, Jirakulaporn T et al. Comparison of an oncology clinical decision-support system's recommendations with actual treatment decisions. JAMIA 28(4), 832–838 (2021).
    • 6. Department of Health, UK. The NHS Cancer Plan: A plan for investment, a plan for reform. Department of Health, London, UK (2000).
    • 7. NHS Cancer Programme, London. Streamlining Multi-Disciplinary Team Meetings: Guidance for Cancer Alliances. NHS England, NHS Improvement London, UK (2019).
    • 8. Cancer Research UK. Meeting Patients' Needs, improving the effectiveness of multidisciplinary team meetings. Cancer Research UK, London, UK (2017).
    • 9. Harris PA, Taylor R, Thielke R et al. Research electronic data capture (REDCap) – A metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 42(2), 377–381 (2009).
    • 10. Harris PA, Taylor R, Minor BL et al. The REDCap consortium: Building an international community of software partners. J. Biomed. Inform. 95, doi: 10.1016/j.jbi.2019.103208 (2019).
    • 11. Phillips ND, Neth H, Woike JK et al. FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees. Judgment and Decision Making 12(4), 344–368 (2017).
    • 12. Martignon L, Hoffrage U. Fast, frugal, and fit: Simple heuristics for paired comparison. Theory and Decision 52(1), 29–71 (2002).
    • 13. Martignon L, Katsikopoulos KV, Woike JK. Categorization with limited resources: A family of simple heuristics. J. Math. Psychol. 52(6), 352–361 (2008).
    • 14. Lee WS, Ahn SM, Chung JW et al. Assessing concordance with Watson for Oncology, a cognitive computing decision support system for colon cancer treatment in Korea. JCO Clin. Cancer Inform. 2, 1–8 (2018).
    • 15. Kim M, Kim BH, Kim JM et al. Concordance in postsurgical radioactive iodine therapy recommendations between Watson for Oncology and clinical practice in patients with differentiated thyroid carcinoma. Cancer 125(16), 2803–2809 (2019).
    • 16. Kim D, Kim YY, Lee JH et al. A comparative study of Watson for Oncology and tumor boards in breast cancer treatment. Korean J. Clin. Oncol. 15(1), 3–6 (2019).
    • 17. Somashekhar SP, Sepúlveda MJ, Puglielli S et al. Watson for Oncology and breast cancer treatment recommendations: agreement with an expert multidisciplinary tumor board. Ann. Oncol. 29(2), 418–423 (2018).
    • 18. Liu C, Liu X, Wu F et al. Using artificial intelligence (Watson for Oncology) for treatment recommendations amongst Chinese patients with lung cancer: feasibility study. J. Med. Internet Res. 20(9), e11087 (2018).
    • 19. Xu F, Sepúlveda MJ, Jiang Z et al. Artificial intelligence treatment decision support for complex breast cancer among oncologists with varying expertise. JCO Clin. Cancer Inform. 3, 1–15 (2019).
    • 20. Tupasela A, Di Nucci E. Concordance as evidence in the Watson for Oncology decision-support system. AI & SOCIETY 35, 811–818 (2020).
    • 21. Arriaga Y, Hekmat R, Draulis K et al. A systematic review of studies of concordance with expert opinion for a globally implemented oncology clinical decision-support system. Proceedings of the 2020 AMIA Summit 762–763 (2020).
    • 22. Jie Z, Zhiying Z, Li L. A meta-analysis of Watson for Oncology in clinical application. Sci. Rep. 11(1), 1–13 (2021).