1. Introduction
Cancer remains a major cause of mortality worldwide, and conventional treatments such as chemotherapy and radiotherapy exhibit inconsistent efficacy, high costs, and considerable side effects. Over the past decade, a promising alternative has emerged: cancer immunotherapy, which leverages the body’s immune system to identify and eradicate cancer cells [1]. Significant progress has been made in several novel immunotherapies, including immune checkpoint inhibitors (ICIs), chimeric antigen receptor T-cell therapies, and personalized therapeutic cancer vaccines. However, the heterogeneity and complexity of cancer types still pose enormous challenges. Efficient use of large medical datasets is essential for overcoming these obstacles. Without artificial intelligence (AI), linking translational and clinical data to derive meaningful insights would remain an insurmountable task. Recently, scGPT, a generative pretrained transformer for single-cell biology that was pretrained on more than 33 million cells, was developed to extract biological insights from single-cell data [2]. A personalized learning workflow has also been proposed as a simple and effective pipeline to discover neoantigens for cancer immunotherapy [3].
AI for medical research is what antibiotics were to medicine in 1910, and the era of digital penicillin has now arrived. AI is becoming a driving force behind cancer research, providing capabilities that were once unimaginable. It is playing a critical role in identifying modifiable risk factors, discovering new drug targets, and creating innovative clinical platforms. As AI continues to evolve, it will become not just a tool but an imperative, driving breakthroughs that are otherwise unattainable and transforming every stage of cancer care, from detection and diagnosis to treatment, recurrence, and survival.
2. Key challenges in AI-powered cancer immunotherapy
AI-powered cancer immunotherapy stands at the forefront of precision medicine, combining advanced computational techniques with biological insights to personalize treatments and improve patient outcomes. By analyzing large, diverse datasets, AI models can uncover and generate previously unrecognized patterns, predict therapeutic responses, and streamline the development of drugs and vaccines
[4]. Despite recent advancements, several critical challenges continue to hinder the full potential of AI in this domain. These obstacles are rooted in the inherent complexity of biological systems, the limitations of current computational tools, and the difficulty of translating research findings into practical clinical applications. Some of the key challenges in AI-powered cancer immunotherapy are categorized below.
2.1. Personalized cancer medicine development considering individual biological factors
A fundamental challenge in personalized cancer medicine development lies in the effective integration and interpretation of diverse, often incomplete datasets, such as genomic, proteomic, clinical, and imaging data, specifically for tailoring treatments to individual patients. To achieve this, these varied data types must be synthesized in a way that provides actionable, patient-specific insights. However, current approaches have notable limitations. For example, one multimodal model proposed to integrate genomic, imaging, and clinical trial data struggles with incomplete or noisy datasets [5]. He et al. [6] introduced a context-aware autoencoder to predict clinical drug responses; however, it relies solely on cell-line data, which cannot fully capture patient-specific complexities.
These limitations significantly hinder progress in developing personalized cancer medicine. For example, tasks such as optimizing messenger RNA (mRNA) vaccine sequences are complicated by the need to navigate a vast combinatorial space while simultaneously addressing the intricate biological constraints unique to each patient
[7]. Similarly, discovering predictive biomarkers requires advanced systems that can analyze noisy and incomplete data, which is essential for identifying meaningful patterns unique to a patient’s tumor
[8].
2.2. Modeling complex biological systems with precision and scalability
Biological systems—especially within the context of cancer immunotherapy—exhibit complex dynamics that span the molecular, cellular, and tissue scales
[9]. To accurately represent these processes, computational models must be both scalable and precise. Although significant progress has been made through advancements such as AlphaFold
[10] in protein structure prediction, these tools primarily focus on static representations of individual protein structures, which limits their ability to model dynamic cellular- and tissue-level interactions. Recent efforts such as AF2Complex
[11] offer a promising extension by addressing protein–protein interactions. However, these methods still fall short when applied to tumor microenvironments (TMEs), whose spatial organization and functional dynamics must be comprehensively captured
[12].
2.3. Building fair and trustworthy AI systems
Integrating AI into cancer immunotherapy raises important ethical and privacy concerns that must be addressed to ensure its responsible adoption. Many existing AI systems rely on datasets that lack sufficient diversity, resulting in inconsistent performance across different populations
[13]. Another challenge lies in the opacity of many AI models, which often function as “black boxes” due to their complex architectures and enormous numbers of parameters. Current methods for improving explainability, such as those outlined in Refs. [14], [15], primarily focus on image datasets, limiting their usefulness in critical areas such as clinical trials and treatment planning
[16]. Privacy concerns also pose a significant challenge. Reliance on sensitive patient data introduces the risks of breaches, unauthorized access, and difficulties in meeting regulatory standards
[17]. Developing advanced privacy-preserving techniques and adhering to stricter regulatory guidelines are essential for protecting patient information while enabling meaningful AI applications.
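One simple way to make the concern about inconsistent performance across populations concrete is to audit a model's sensitivity separately for each subgroup. The following Python sketch is illustrative only: the function names, the group labels "A" and "B", and the toy labels are assumptions introduced here, not data or methods from the cited studies.

```python
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Return the true-positive rate (recall) for each subgroup."""
    hits = defaultdict(int)       # correctly predicted positives per group
    positives = defaultdict(int)  # actual positives per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / positives[g] for g in positives}

def recall_gap(y_true, y_pred, groups):
    """Largest difference in recall between any two subgroups."""
    rates = recall_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())

# Toy example with two hypothetical subgroups (not real patient data).
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A"] * 5 + ["B"] * 5

rates = recall_by_group(y_true, y_pred, groups)
gap = recall_gap(y_true, y_pred, groups)
print(rates, gap)
```

A large gap between subgroups would flag the kind of uneven performance described above; in practice such an audit would need adequately sized, representative cohorts and confidence intervals.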
3. Opportunities for AI in cancer immunotherapy
3.1. Generative AI in mRNA-based personalized cancer vaccines
While mRNA coronavirus disease 2019 (COVID-19) vaccines have been highly successful, extending this technology to treat cancers and other diseases requires significant advancements, particularly in the design of mRNAs that exhibit elevated expression levels and extended durability. Key sequence components, such as coding sequences (CDSs) and untranslated regions (UTRs), play a critical role in the therapeutic effectiveness of mRNA. Optimizing these sequences is crucial to unlocking the full potential of mRNA therapeutics, yet it remains extremely challenging due to the vast number of possible mRNA sequences
[18]. Current strategies for designing 5' UTRs explore only a small portion of the available design space, and most research has focused on comparisons with natural UTRs, lacking a broad investigation of alternative designs. To address these limitations, generative AI offers a promising approach
[19]. With their exceptional capacity for translation-based reasoning, generative AI models can contribute to the development of superior mRNA molecules by uncovering potential structural and sequence patterns
[7]. Utilizing these models could provide an effective strategy for mRNA design, significantly advancing the development of mRNA-based vaccines and therapeutics (Fig. 1).
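To give a sense of the size of this design space, the following minimal Python sketch counts the synonymous coding sequences for a peptide using the codon degeneracy of the standard genetic code. The function name and the example peptide (SIINFEKL, a widely used model epitope) are illustrative choices rather than details from the cited works, and UTR variation is ignored, so the count covers only the CDS portion of the design space.

```python
import math

# Number of codons that encode each amino acid in the standard genetic code
# (61 sense codons in total; stop codons are excluded).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_synonymous_cds(peptide: str) -> int:
    """Number of distinct coding sequences that translate to `peptide`."""
    return math.prod(CODON_DEGENERACY[aa] for aa in peptide.upper())

def log10_design_space(peptide: str) -> float:
    """Base-10 logarithm of the count, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in peptide.upper())

print(count_synonymous_cds("SIINFEKL"))  # 5184 sequences for 8 residues
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why exhaustive enumeration quickly becomes impractical and guided or generative search is attractive.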
3.2. Accelerating cancer drug discovery through quantum machine learning
In recent years, drug discovery has been revolutionized by AI methods
[20]. Biotechnology companies focused on drug development are now establishing new business models based on AI-driven molecular modeling techniques. In this evolving landscape, AI tools such as ChatGPT and other large language models (LLMs) are making their mark in drug discovery
[21]. However, current methods primarily rely on approximations to model drug–target interactions, which limits their precision and applicability
[22].
Quantum machine learning (QML), founded on the principles of quantum mechanics, offers a potentially transformative solution, as it could in principle simulate molecular interactions and estimate quantities such as binding affinities with high accuracy
[23]. By closely aligning with the physical reality of these interactions, QML has the potential to significantly advance our understanding of drug–target dynamics and improve the precision of computational drug discovery. For example, Li et al.
[24] proposed a hybrid quantum generator for discovering drug candidates by generating molecular structures that adhere to established chemical and physical properties. In addition, hybrid quantum-classical workflows offer practical solutions for integrating QML into real-world drug-discovery processes
[25]. These advances make QML particularly valuable for applications such as prioritizing promising drug candidates, substantially accelerating the path to therapeutic discovery.
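The structure of a hybrid quantum-classical workflow can be illustrated with a deliberately tiny example: a parameterized quantum circuit evaluates a cost, and a classical optimizer updates the circuit parameter. The sketch below simulates a single qubit with NumPy on classical hardware and uses the parameter-shift rule for the gradient. It is a toy under stated assumptions (one qubit, one RY rotation, the Pauli-Z expectation as the cost, hypothetical function names) and is not the method of the cited works or a model of any real molecule.

```python
import numpy as np

PAULI_Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def ry(theta: float) -> np.ndarray:
    """Matrix of the single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def expectation(theta: float) -> float:
    """'Quantum' step: prepare RY(theta)|0> and measure <Z> (simulated)."""
    state = ry(theta) @ np.array([1.0, 0.0])
    return float(state @ PAULI_Z @ state)

def parameter_shift_grad(theta: float) -> float:
    """Gradient of <Z> obtained from two shifted circuit evaluations."""
    return 0.5 * (expectation(theta + np.pi / 2) - expectation(theta - np.pi / 2))

def optimize(theta: float = 0.1, lr: float = 0.4, steps: int = 100):
    """Classical step: plain gradient descent on the circuit parameter."""
    for _ in range(steps):
        theta -= lr * parameter_shift_grad(theta)
    return theta, expectation(theta)

theta_opt, energy = optimize()
print(theta_opt, energy)  # theta approaches pi, <Z> approaches -1
```

In a realistic setting the `expectation` call would be dispatched to quantum hardware or a dedicated simulator, and the cost would encode a chemically meaningful quantity; only the outer loop shown here would remain classical.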
3.3. Chain-of-thought reasoning in biomarker discovery
Although biomarkers are critically important for evaluating the efficacy of cancer immunotherapies
[26], the growing number of complex biomarkers increases both the costs and time required for decision-making in oncology practice. LLMs have emerged as useful tools for addressing these challenges, as they can reason across distributed and multimodal evidence sources
[27], [28]. However, a prominent concern regarding LLMs is hallucination, in which models generate fabricated or erroneous information that lacks grounding in actual data
[29]. Moreover, the autoregressive nature of most LLMs can result in plausible yet incorrect predictions, particularly in cases involving rare or ambiguous clinical scenarios
[30].
Retrieval-augmented generation (RAG) [31] and chain-of-thought (CoT) reasoning [32] offer promising approaches for addressing some of the inherent limitations of LLMs. RAG-based systems have been tested in healthcare settings, demonstrating their ability to retrieve and integrate clinical trial data for decision-making support
[33]. Similarly, studies have shown that LLMs combined with CoT reasoning can be fine-tuned to provide clinically relevant insights, such as predicting patient outcomes, underscoring their potential value in cancer immunotherapy
[34]. Despite their promise, these applications require rigorous validation to ensure that the retrieved knowledge is comprehensive, up-to-date, and free from bias. To mitigate these issues, the development of curated, high-quality biomedical databases with rigorous updating protocols is necessary. Additionally, collaborative efforts among healthcare institutions, researchers, and AI developers are essential to establish centralized, unbiased repositories optimized for clinical use.
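The retrieval step that grounds a RAG system can be sketched in a few lines. The Python example below ranks passages by bag-of-words cosine similarity and assembles a prompt that restricts the model to the retrieved context. The three passages, the function names, and the prompt wording are illustrative assumptions made here; a real system would use a curated biomedical knowledge base, a learned embedding model, and an actual LLM call, none of which are shown.

```python
import math
import re
from collections import Counter

# Illustrative passages only; not a curated clinical knowledge base.
CORPUS = [
    "PD-L1 expression on tumor cells is used as a biomarker for response "
    "to immune checkpoint inhibitors.",
    "Tumor mutational burden counts somatic mutations per megabase of "
    "sequenced DNA.",
    "Microsatellite instability arises from defective DNA mismatch repair.",
]

def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus=CORPUS, k: int = 1) -> list:
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus, key=lambda doc: cosine(q, Counter(tokenize(doc))), reverse=True
    )
    return ranked[:k]

def build_prompt(query: str, k: int = 1) -> str:
    """Assemble a prompt that asks the model to stay within the context."""
    context = "\n".join(f"- {p}" for p in retrieve(query, k=k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

print(build_prompt("What does tumor mutational burden measure?"))
```

Because the answer is constrained to retrieved passages, the quality and currency of the underlying repository directly bound the quality of the output, which is why the curation and updating protocols discussed above matter.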
3.4. Integrating multimodal large language models into tumor modeling analysis
TMEs are intricate niches characterized by cellular, molecular, and genetic diversity, particularly in the context of cancer immunotherapy. AI technologies have been employed in TME discovery to delineate tissue compartments without the use of antibodies. This includes identifying components such as normal ductal epithelium, precursor cells of pancreatic cancer, smooth muscle cells, acini, adipocytes, collagen, islets of Langerhans, lymph nodes, and nerves
[35]. For instance, employing machine learning in combination with hematoxylin and eosin (H&E) staining
[36] makes it possible to visualize three-dimensional (3D) tissue architecture across large tissue volumes. However, the modeling of biological interactions involving multimodal data continues to be a significant obstacle. Multimodal large language models (MLLMs) can parse and analyze vast amounts of scientific data spanning different data types to extract relevant information about cellular interactions and pathways. This capability is crucial for identifying and modeling complex relationships among various biological entities, such as cells, proteins, and signaling pathways
[37].
3.5. Optimizing clinical trials with trustworthy AI
Clinical trials are a key step in developing new therapies, yet nearly 90% of products under clinical trial fail to reach clinical implementation. AI tools such as LLMs can significantly enhance this process at multiple stages
[38]. The use of AI in medical imaging also facilitates the early detection of disease progression, enabling more informed and timely decisions compared with traditional methods
[39]. However, the complexity and opacity of many AI models, which involve millions of parameters and nonlinear transformations, pose challenges to trust and adoption
[40]. To address these issues, explainable AI frameworks must be implemented for tracing decision-making processes and identifying the key features driving predictions. Furthermore, fairness-aware algorithms are essential to mitigate biases and ensure equitable outcomes for diverse populations. Another critical consideration is safeguarding patient privacy. Robust data-anonymization methods, secure sharing protocols, and strict adherence to ethical standards are necessary to protect sensitive information. By embedding fairness, explainability, and privacy protections into AI systems, clinical trials can become more efficient, inclusive, and trustworthy, ultimately paving the way for broader adoption and improved success rates.
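As one example of identifying the key features driving predictions, permutation importance measures how much a model's accuracy drops when a single feature is shuffled. The Python sketch below is model-agnostic and uses synthetic data; the function names, the toy model, and the data are assumptions introduced for illustration and do not represent any specific clinical trial system.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the outcome.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances

# Synthetic data: the outcome depends only on the first feature.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

def toy_model(features):
    """Stand-in for a trained classifier (hypothetical)."""
    return (features[:, 0] > 0).astype(int)

importances = permutation_importance(toy_model, X, y)
print(importances)  # large value for feature 0, zero for the others
```

Such global importance scores are only a first step toward explainability; they do not explain individual predictions and can be misleading when features are strongly correlated.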
4. Conclusions
Over the past decade, remarkable progress has been made in AI-powered cancer immunotherapy, yet its practical realization continues to present substantial challenges. One prominent issue lies in the efficient use of real-world data, which includes electronic health records, medical imaging, and laboratory tests. Unlike structured clinical trial data, which is gathered under well-defined protocols, real-world data is often fragmented, unstructured, and heterogeneous. Utilizing this data with AI holds great potential to advance personalized immunotherapy but necessitates the creation of integrated systems equipped with advanced algorithms to manage the data complexity. Key priorities in this domain also include identifying novel therapeutic targets and developing efficient AI-driven drug-discovery platforms that can effectively match targets to candidate therapies. These initiatives may be supported by scalable and interpretable AI systems capable of processing multimodal datasets and delivering transparent, actionable insights to healthcare providers.
Future efforts should emphasize the establishment of unified frameworks for integrating multimodal data, alongside rigorous standards for model validation, reproducibility, and adherence to ethical principles. Ensuring patient privacy through advanced anonymization methods and secure data-sharing practices is equally essential for maintaining public trust. Addressing these challenges will enable AI to make transformative contributions to cancer immunotherapy, thereby driving innovation, improving treatment outcomes, and enhancing the overall quality of cancer care for patients.
Acknowledgments
This research was supported by the Australian Centre for AI in Medical Innovation (ACAMI), funded by the Victoria State Government; the National University of Singapore (NUHSRO/2020/133/Startup/08, NUHSRO/2023/008/NUSMed/TCE/LOA, NUHSRO/2021/034/TRP/09/Nanomedicine, NUHSRO/2021/044/Kickstart/09/LOA, and 23-0173-A0001); the National Medical Research Council (MOH-001388-00, CG21APR1005, MOH-001500-00, and MOH-001609-00); the Singapore Ministry of Education (MOE-000387-00 and MOET32023-0005); and the National Research Foundation (NRF-000352-00).