Promise and pitfalls: evaluating deep learning-driven genotyping in GISTs
This author has read with great interest the recent publication by Kong et al. in The Journal of Pathology (1). The study explored the application of artificial intelligence (AI), specifically deep learning (DL), to predict the mutational subtypes of gastrointestinal stromal tumors (GISTs) in the context of tyrosine kinase inhibitor (TKI) selection. Kong et al. categorized GISTs into four clinically relevant genotypic groups—(I) imatinib-sensitive; (II) imatinib dose-adjustment (corresponding to KIT exon 9 mutations); (III) avapritinib-sensitive (corresponding to the PDGFRA D842V mutation); and (IV) wild-type—and investigated whether these categories could be reliably predicted using DL-based digital pathology applied to hematoxylin and eosin (HE)-stained slides.
They trained their model on 443 whole-slide images from 85 GIST cases and 73 from 69 adjacent normal mucosal tissues, then evaluated model performance using 231 whole-slide images, including 205 from 35 GIST cases and 26 from GIST-adjacent normal tissues. The model achieved a case-level sensitivity of 0.864, specificity of 0.962, accuracy of 0.757, and an area under the curve (AUC) of 0.910, suggesting reasonably high diagnostic performance. However, this author believes the study does not yet provide sufficient evidence to support DL-based digital pathology as a standalone method for GIST genotyping.
Methodological concerns
One key methodological concern is the inclusion of 26 peritumoral samples—non-neoplastic tissues—as part of the negative class in the diagnostic validation. While such samples may serve as appropriate controls in general GIST detection tasks, their use in genotypic classification could introduce significant bias. Given the marked histological differences between tumor and non-tumor tissues, the model is unlikely to misclassify these peritumoral samples, potentially inflating specificity and overall accuracy. This raises concern that the model’s predictive capacity may be overestimated.
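The inflation mechanism described above can be illustrated with simple arithmetic. The following is a minimal sketch, not a reanalysis of Kong et al.: the one-vs-rest confusion counts for the 35 tumor cases are hypothetical, chosen only for illustration, and the 26 peritumoral samples are assumed to be classified correctly every time.

```python
def specificity(tn: int, fp: int) -> float:
    """True negatives divided by all actual negatives."""
    return tn / (tn + fp)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Correct calls divided by all calls."""
    return (tp + tn) / (tp + tn + fp + fn)


# HYPOTHETICAL one-vs-rest counts for a single genotype class,
# tumor samples only (35 cases in total). Not taken from Kong et al.
tp, fn = 8, 2
tn, fp = 20, 5

# Peritumoral (non-neoplastic) samples added to the negative class.
# ASSUMPTION: the model never assigns a tumor genotype to normal tissue.
easy_negatives = 26

sensitivity = tp / (tp + fn)  # unaffected by extra negatives
spec_tumor_only = specificity(tn, fp)
spec_with_normals = specificity(tn + easy_negatives, fp)
acc_tumor_only = accuracy(tp, tn, fp, fn)
acc_with_normals = accuracy(tp, tn + easy_negatives, fp, fn)

print(f"sensitivity:               {sensitivity:.3f}")
print(f"specificity, tumor only:   {spec_tumor_only:.3f}")
print(f"specificity, with normals: {spec_with_normals:.3f}")
print(f"accuracy, tumor only:      {acc_tumor_only:.3f}")
print(f"accuracy, with normals:    {acc_with_normals:.3f}")
```

Under these assumed counts, specificity rises from 0.80 to about 0.90 and accuracy from 0.80 to about 0.89, while sensitivity is unchanged. The model's ability to discriminate among tumor genotypes is identical in both calculations, which is why metrics restricted to tumor samples would be the more informative figures.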
Kong et al. further assessed clinical utility by evaluating treatment responses to TKI therapy in 35 patients from the test set. They reported that DL-based digital pathology yielded higher predictive accuracy for treatment response than genotyping: 0.7143 vs. 0.4286 for non-responders. While intriguing, this author finds this conclusion fundamentally problematic. The DL model was trained using genotype labels, not treatment outcomes, as ground truth. As such, asserting that the model outperforms genotyping—its own reference standard—lacks logical foundation and may merely reflect a coincidental alignment rather than genuine predictive superiority. This discrepancy demands particularly cautious interpretation.
Broader implications for clinical practice
Beyond these methodological issues, the study also raises broader questions about how DL models may integrate into existing diagnostic workflows. Experienced pathologists and oncologists at high-volume centers can often predict mutation status using histopathological and clinical features, while less experienced clinicians may seek expert consultation or referral. Nevertheless, there may be scenarios, albeit rare, in resource-limited or remote settings where expert pathological consultation is unavailable or telepathology is impractical. In such situations, DL tools could serve as valuable adjuncts to improve diagnostic decision-making.
In practical clinical scenarios, DL may also provide additional value. In borderline-resectable, locally advanced GIST, where tissue is scarce and rapid assessment of TKI sensitivity is needed, DL applied to biopsy specimens could be highly informative. Many PDGFRA-mutant GISTs are resistant to imatinib (2,3), and avoiding ineffective neoadjuvant therapy could directly impact resectability and outcomes. Fortunately, Kong et al. reported excellent predictive accuracy for PDGFRA-mutant GISTs (AUC 0.96), likely reflecting their characteristic gastric location and epithelioid morphology (4).
Another important point is that DL may offer unique value not by replicating what skilled pathologists already recognize, but by identifying novel histopathological features that have been overlooked. As noted by Kong et al., the association between PDGFRA mutations and features such as cytoplasmic vacuolization and lymphocytic infiltration is intriguing and warrants further investigation. AI’s capacity to highlight such subtle histologic signals could generate new biological insights, potentially leading to tangible advances in diagnostic pathology.
Perspective
This study reinforces the potential of DL-based pathology in predicting mutational subtype and therapeutic response in GISTs. It complements earlier work by Fu et al. (5), who applied DL to a dataset of 1,233 GIST cases and demonstrated that HE-based models could (I) outperform the Miettinen risk classification system; (II) further stratify intermediate-risk cases; and (III) predict KIT, PDGFRA, and wild-type genotypes with AUCs of 0.81, 0.91, and 0.71, respectively. Kong et al.’s findings thus add to growing evidence supporting DL-driven genotyping, even in rare diseases such as GISTs, where large-scale genomic datasets are difficult to obtain. These advances are noteworthy, but their interpretation must also take into account the unique biological characteristics of GIST.
Indeed, it is precisely the biological characteristics of GIST, not skepticism about AI itself, that underlie this author’s cautious stance on DL-based genotypic prediction. Most GISTs are driven by gain-of-function mutations in receptor tyrosine kinases, most commonly KIT or PDGFRA (6,7). Because imatinib exerts its therapeutic effect by competitively inhibiting the kinase activity of these mutant proteins (8), the mutational genotype, which directly determines the structural characteristics of these kinases, is the most rational tool for predicting drug response. For this reason, both Fu et al. and Kong et al. structured their DL models around imatinib sensitivity. Genotyping will therefore continue to serve as the gold standard.
Conclusions
DL-based digital pathology for GIST genotyping is advancing from conceptual proof toward clinical validation. While genotyping will remain the gold standard, DL models may complement expert pathology in meaningful ways: by identifying subtle morphologic-genotypic correlations, by supporting decision-making in resource-limited settings, and by facilitating rapid therapeutic decisions in the neoadjuvant setting. The promise of AI will be realized when these tools are evaluated in practice-oriented contexts, such as prospective trials using biopsy specimens or preoperative therapy cohorts.
Acknowledgments
None.
Footnote
Provenance and Peer Review: This article was commissioned by the editorial office, Gastrointestinal Stromal Tumor. The article has undergone external peer review.
Peer Review File: Available at https://gist.amegroups.com/article/view/10.21037/gist-2025-4/prf
Funding: None.
Conflicts of Interest: The author has completed the ICMJE uniform disclosure form (available at https://gist.amegroups.com/article/view/10.21037/gist-2025-4/coif). The author has no conflicts of interest to declare.
Ethical Statement: The author is accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Kong X, Shi J, Sun D, et al. A deep-learning model for predicting tyrosine kinase inhibitor response from histology in gastrointestinal stromal tumor. J Pathol 2025;265:462-71.
2. Heinrich MC, Maki RG, Corless CL, et al. Primary and secondary kinase genotypes correlate with the biological and clinical activity of sunitinib in imatinib-resistant gastrointestinal stromal tumor. J Clin Oncol 2008;26:5352-9.
3. Cassier PA, Fumagalli E, Rutkowski P, et al. Outcome of patients with platelet-derived growth factor receptor alpha-mutated gastrointestinal stromal tumors in the tyrosine kinase inhibitor era. Clin Cancer Res 2012;18:4458-64.
4. Sakurai S, Hasegawa T, Sakuma Y, et al. Myxoid epithelioid gastrointestinal stromal tumor (GIST) with mast cell infiltrations: a subtype of GIST with mutations of platelet-derived growth factor receptor alpha gene. Hum Pathol 2004;35:1223-30.
5. Fu Y, Karanian M, Perret R, et al. Deep learning predicts patients outcome and mutations from digitized histology slides in gastrointestinal stromal tumor. NPJ Precis Oncol 2023;7:71.
6. Hirota S, Isozaki K, Moriyama Y, et al. Gain-of-function mutations of c-kit in human gastrointestinal stromal tumors. Science 1998;279:577-80.
7. Heinrich MC, Corless CL, Duensing A, et al. PDGFRA activating mutations in gastrointestinal stromal tumors. Science 2003;299:708-10.
8. Mol CD, Dougan DR, Schneider TR, et al. Structural basis for the autoinhibition and STI-571 inhibition of c-Kit tyrosine kinase. J Biol Chem 2004;279:31655-63.
Cite this article as: Kanda T. Promise and pitfalls: evaluating deep learning-driven genotyping in GISTs. Gastrointest Stromal Tumor 2025;8:6.

