The human body, an open and intricately complex system, displays distinct responses to varied medical interventions. Traditional Chinese medicine (TCM), rich in historical and cultural significance, offers a unique perspective on the intricate relationship between the human body and nature. Integrating distinct philosophical foundations, diagnostic methods, and therapeutic practices [1], [2], TCM has taken a prominent role in addressing diseases like influenza, drawing on its extensive clinical experience and comprehensive strategies [3]. The role of TCM was particularly highlighted during the coronavirus disease 2019 (COVID-19) pandemic [4], [5], [6], underscoring its importance on the global health stage. At its core, TCM advocates for a holistic treatment approach, which not only acknowledges the complexity of illnesses but also mirrors the deep philosophical insights and substantial practical knowledge accumulated within this medical tradition.
TCM’s perspective on health and disease differs markedly from that of contemporary biomedicine in its underlying philosophies, terminologies, and methodologies for diagnosis and treatment [7]. TCM research is increasingly embracing scientific methodologies, incorporating computational modeling and data analysis [8], [9]. This shift calls for a systematic investigation into TCM’s material basis to streamline its complex theoretical constructs into a coherent knowledge system. The term “material basis” refers to the chemical constituents that form a medicinal substance or compound, delivering therapeutic effects through their synergistic, multi-target, and multi-pathway mechanisms. Understanding the material basis is crucial for both guiding clinical practices and facilitating new drug discoveries, as exemplified by artemisinin [10].
Merging traditional TCM insights with modern scientific approaches, while preserving holistic herbal medicine principles and meeting strict evidence-based criteria, presents considerable challenges. Research into the material basis of TCM spans a multifaceted domain, encompassing various dimensions and levels. Artificial intelligence (AI), through mimicking human cognitive functions, provides potent solutions for complex challenges. However, the intersection of AI with TCM’s holistic material basis research, particularly its application in deep analysis, remains underexplored. To this end, this study introduces an innovative research framework based on systems theory, the System Function Decoding Model (SFDM), which involves four main steps: define, quantify, infer, and validate. The primary objective of this research is to optimize the study of TCM’s holistic material basis through AI, focusing on quantification and inference processes.
1. Integrating AI into holistic material basis research of TCM through SFDM
1.1. Unveiling the complexities of TCM’s holistic material basis
The material foundation of TCM comprises a complex network of bioactive compounds such as alkaloids, glycosides, polysaccharides, and essential oils, each contributing uniquely to TCM’s therapeutic efficacy. A profound understanding of TCM’s material basis involves more than identifying and quantifying these compounds; it requires a comprehensive examination of their physicochemical properties, intermolecular interactions, and collective impact on the body’s pharmacokinetic (PK) and pharmacodynamic (PD) activities. Such an in-depth approach is essential for the development of precise medicinal formulations and therapeutic protocols that consider individual differences, such as genetic makeup, age, and gender.
Modern research into TCM’s holistic material basis adheres to a principle against isolating components: the efficacy of a TCM formulation is held to derive from the synergistic interactions among its constituents, mirroring TCM’s holistic philosophy. This research examines the full spectrum of TCM’s chemical constituents and their interactions within the body, including their effects on absorption, distribution, metabolism, and excretion (ADME) and their combined impact on various biological targets. Exploring TCM’s holistic material basis bridges interdisciplinary fields such as pharmaceutics, pharmacology, systems biology, and precision medicine. It aims to elucidate the intricate relationships between TCM components at molecular and systemic levels, accounting for variabilities from individual to population scales, thereby uncovering the unique mechanisms and therapeutic potentials of TCM.
1.1.1. Comparative analysis of research strategies
Fig. 1 illustrates a comparative analysis of the top-down and bottom-up approaches in TCM material basis research. The Top-Down Scientific Discovery framework begins with the most comprehensive layer, “formula,” and incrementally narrows down to the “herbal,” “component,” and finally the “ingredient” levels. This phenotype-driven discovery pathway starts with the formula’s observable effects and systematically isolates active ingredients through chemical separation and activity tracing.
In contrast, the Bottom-Up Scientific Innovation framework starts from the molecular “target” and “dose-response relationship,” ascending to “drug combinations” and “syndrome.” This method focuses on molecular interactions, aiming to decode TCM’s essence from the foundational level. It concentrates on ADME, toxicity, and the in vitro-in vivo correlation (IVIVC), thereby elucidating the underlying topological dynamics of syndromes. While the top-down approach seeks to simplify TCM’s complexity by deconstructing its formulas to elemental constituents, the bottom-up strategy endeavors to maintain TCM’s integrity, constructing a detailed understanding from the molecular level.
1.1.2. Top-down approach
Inspired by a reductionist philosophy, the top-down approach in TCM research methodically deconstructs formulations from broad to detailed perspectives to elucidate their mechanisms and efficacy. It starts with an overarching analysis of the effects of formulations, advancing to the separation of chemical and biological components, setting the stage for the identification and pharmacological exploration of individual compounds. Techniques like metabolomics [11], [12], TCM syndrome metabolomics [13], gene expression profiling, and whole-genome chip assays [14] are crucial for deepening the understanding of TCM’s mechanisms. Furthermore, immunoaffinity chromatography [15] uniquely facilitates the selective removal of multiple components in TCM formulations. Despite its proficiency in isolating and identifying pharmacologically active substances, the top-down approach remains limited in the depth of its pharmacological analysis and in its exploration of the overall mechanisms by which formulations achieve their efficacy.
1.1.3. Bottom-up approach
In contrast, the bottom-up approach uses bioinformatics and computational analysis to create a “component-target-disease” framework that investigates the synergistic effects of TCM components. It uses an interdisciplinary methodology, incorporating chemoinformatics and bioinformatics [16], to identify potential active molecules, and employs molecular docking [17] to predict TCM’s active components and understand their mechanisms. Supportive databases like CPMCP [18], SymMap [19], TCMSP [20], ETCM [21], TCMID [22], SuperTCM [23], TCMBank [24], HERB [25], and CMAUP [26] underpin this research. Network pharmacology [27] is extensively applied in this context to deepen the comprehension of TCM’s holistic efficacy and to chart new paths for understanding component interactions. However, this approach encounters challenges such as the potential for data oversimplification and homogenization. Additionally, the discovery of new drugs through target detection methods [28] has been limited, especially in treating complex diseases where single-target strategies may not yield the anticipated therapeutic outcomes.
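As a rough illustration of the “component-target-disease” idea, the sketch below links components to diseases through shared targets using plain Python dictionaries. All component, target, and disease names are hypothetical placeholders, not entries from any of the databases cited above, and the helper functions `components_by_disease` and `shared_targets` are assumptions introduced only for this example.

```python
from collections import defaultdict

# Hypothetical placeholder associations (not real database records).
component_targets = {
    "component_A": {"target_1", "target_2"},
    "component_B": {"target_2", "target_3"},
    "component_C": {"target_4"},
}
target_diseases = {
    "target_1": {"disease_X"},
    "target_2": {"disease_X", "disease_Y"},
    "target_3": {"disease_Y"},
    "target_4": {"disease_Z"},
}


def components_by_disease(component_targets, target_diseases):
    """Map each disease to the components reaching it and the targets they act through."""
    result = defaultdict(lambda: defaultdict(set))
    for component, targets in component_targets.items():
        for target in targets:
            for disease in target_diseases.get(target, ()):
                result[disease][component].add(target)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {d: {c: set(t) for c, t in comps.items()} for d, comps in result.items()}


def shared_targets(network, disease):
    """Return targets for a disease that are hit by more than one component."""
    counts = defaultdict(int)
    for targets in network.get(disease, {}).values():
        for target in targets:
            counts[target] += 1
    return {target for target, n in counts.items() if n > 1}


network = components_by_disease(component_targets, target_diseases)
```

Targets returned by `shared_targets` are one simple proxy for possible multi-component convergence; an actual network pharmacology study would add evidence weights, pathway annotation, and experimental follow-up.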
1.2. AI: A brief overview
AI, a rapidly evolving computer science field, is driven by advancements in big data, computational hardware, and algorithmic innovations. It aims to equip machines or software with the ability to perform tasks requiring human intelligence, ranging from basic automation to complex decision-making processes. The AI research domain is extensive, including subfields like machine learning (ML), natural language processing (NLP), computer vision, and robotics, which enable computer systems to replicate human cognitive functions such as learning, reasoning, and self-correction.
Fig. 2 illustrates AI’s multidisciplinary role in scientific research, integrating experimental, theoretical, computational, and data-driven paradigms to address complex challenges. AI applications in scientific research involve data collection, information extraction, and intelligent decision-making, relying on a synergy of computer software technology, mathematical statistics, and substantive expertise. In contemporary scientific research, AI has become crucial for analyzing and addressing complex issues across disciplines, from biology to meteorology. AI methodologies are broadly categorized into end-to-end data-driven models and mechanistic models, reflecting AI’s diverse applications and its pivotal role in enhancing our understanding and management of complex systems and phenomena.
1.2.1. End-to-end data-driven models
These models excel at transforming raw data directly into final outputs, eliminating the need for manual data processing or feature extraction, and are ideal for managing large and complex datasets. They autonomously learn to identify critical patterns and structures. For instance, the AlphaFold deep learning model has revolutionized our understanding of protein structures by accurately predicting their three-dimensional (3D) shapes, using a comprehensive database of known protein structures [29]. Furthermore, in NLP, models like ChatGPT have demonstrated remarkable proficiency in understanding and generating human language, contributing to advancements in areas such as cancer diagnosis and treatment planning [30].
1.2.2. Mechanistic models
Contrasting with data-driven approaches, mechanistic modeling relies on established scientific principles to develop predictive models, often grounded in quantitative theories from physics, chemistry, or biology. These models seek to elucidate and forecast the behaviors of complex systems. For instance, weather forecasting models utilize atmospheric physics to predict climatic changes with significant accuracy [31]. In healthcare, mechanistic models are invaluable for predicting disease progression and enhancing diagnostic processes. For example, AI-enhanced image analysis has demonstrated potential in improving breast cancer detection rates and reducing false positives [32]. Moreover, these models can incorporate individual patient data, including historical health records, biomarker profiles, and treatment responses, to offer personalized disease progression forecasts. Bayesian models are particularly notable for their application in clinical prognosis and diagnostic accuracy improvements [33].
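To make the Bayesian idea concrete, the following minimal sketch applies Bayes’ theorem to update the probability of disease after a positive test result. The function name `posterior_probability` and the numeric values in the example call are illustrative assumptions, not figures from the cited studies.

```python
def posterior_probability(prior, sensitivity, specificity):
    """Probability of disease given a positive test, via Bayes' theorem."""
    for name, value in (("prior", prior),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    total_positive = true_positive + false_positive
    if total_positive == 0.0:
        raise ValueError("a positive result is impossible under these inputs")
    return true_positive / total_positive


# Hypothetical inputs: 10% pre-test probability, 90% sensitivity, 80% specificity.
example_posterior = posterior_probability(prior=0.10, sensitivity=0.90, specificity=0.80)
```

With these assumed inputs the post-test probability is one third, which shows how a moderately specific test applied at low prevalence still leaves substantial uncertainty.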
1.3. SFDM and AI in TCM research: A systems theory approach
Integrating AI into TCM holistic material basis research introduces significant challenges due to TCM’s reliance on a holistic, experientially driven framework that defies straightforward quantification. TCM formulations, characterized by their complexity from multiple constituents to diverse therapeutic pathways, demand analytical methods that can encompass this multifaceted nature. Effective AI application in TCM requires models that are deeply rooted in TCM principles and capable of analyzing beyond mere constituent quantities to understand their collective biological mechanisms.
To navigate these complexities, we use a systems theory framework (Fig. 3), viewing TCM as the interaction between two complex systems: TCM formulations and the body’s response systems. This perspective frames the clinical use of TCM as a dynamic interplay of material, energy, and information flows. The “material flow” encompasses TCM’s comprehensive material basis, including both the medicinal constituents and the physiological reactions they trigger. Following the “drug-properties theory of TCM” and the “eight principles of TCM syndrome,” TCM formulations are made from diverse natural ingredients and administered through various methods, including oral, inhalational, and transdermal routes. These interact with the body’s system through processes of PK and PD, aiming for therapeutic efficacy.
1.3.1. Decoding complex systems in TCM with SFDM
Grounded in systems theory, SFDM decodes complex systems by systematically examining how interactions among components yield emergent properties not evident in individual parts. This approach addresses the limitations of traditional linear research methods in capturing the dynamic complexity of systems, making it especially relevant for the holistic material basis study of TCM.
Define: This foundational research step identifies key components, issues, and system boundaries, integrating TCM theory, biomedicine, and systems biology to outline the system’s structure and critical factors such as chemical properties, pharmacological targets, and human metabolism. This step establishes a multidimensional framework for TCM’s holistic material basis.
Quantify: This step converts qualitative relationships into quantitative models using analytical technologies and AI to measure components and interactions. This phase involves building mathematical models to represent the system’s structure and dynamics, utilizing AI for data processing and model construction.
Infer: This phase leverages AI’s predictive capabilities to simulate potential system behaviors and outputs under various scenarios. Techniques like molecular dynamics simulations and bioinformatics are used to predict drug-target interactions and pathways, with AI algorithms enhancing prediction accuracy.
Validate: This involves using experimental or clinical data to test the reliability of inferences and adjust models based on feedback to align predictions with observed results. This iterative process ensures model validity and effectiveness.
SFDM transcends disciplinary boundaries, offering a comprehensive framework for TCM research. It integrates AI with traditional TCM theories and methods, promoting the synthesis of multidisciplinary knowledge and facilitating a deeper understanding of TCM’s holistic material basis.
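The four SFDM steps can be pictured as a small pipeline. The sketch below is a toy illustration only, not the authors’ implementation: it assumes a single input (a component dose) and a single output (a response score), stands in for “quantify” with an ordinary least-squares line, and uses made-up numbers. All function names are hypothetical.

```python
def define():
    """Define: name the system's input and output (hypothetical placeholders)."""
    return {"input": "component_dose", "output": "response_score"}


def quantify(observations):
    """Quantify: fit response = slope * dose + intercept by least squares."""
    n = len(observations)
    mean_x = sum(x for x, _ in observations) / n
    mean_y = sum(y for _, y in observations) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in observations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in observations)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def infer(model, dose):
    """Infer: predict the response at a dose that was not measured."""
    slope, intercept = model
    return slope * dose + intercept


def validate(model, held_out, tolerance):
    """Validate: compare predictions with held-out data against a tolerance."""
    worst = max(abs(infer(model, x) - y) for x, y in held_out)
    return worst <= tolerance, worst


def run_sfdm_once(training, held_out, tolerance):
    """Run define -> quantify -> validate once and report the outcome."""
    system = define()
    model = quantify(training)
    passed, worst = validate(model, held_out, tolerance)
    return system, model, passed, worst


# Made-up data for illustration only.
training = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
held_out = [(3.0, 7.2)]
system, model, passed, worst_error = run_sfdm_once(training, held_out, tolerance=0.5)
```

If validation fails, the workflow would return to the define or quantify step with revised assumptions or additional data, which is the iterative feedback the validate step describes.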
1.3.2. Objectives and scope of the study
The primary goal of this study is to construct a systematic framework that harnesses AI to bolster research into the material basis of TCM. This entails addressing the inherent challenges associated with integrating AI into the complex dimensions of TCM through the implementation of the SFDM, which is articulated through phases of definition, quantification, inference, and validation. This investigation delineates the role and function of TCM theoretical principles within complex systems under a systems theory framework, merging TCM research with contemporary medical knowledge. It elaborates on the practical advancements of AI in the quantification and inference of TCM’s material basis, transforming AI from merely a tool for data processing into a powerful ally in comprehending and applying traditional TCM knowledge. This research spans AI, systems science, and TCM, demonstrating a multidisciplinary effort and setting a pioneering benchmark for incorporating AI into TCM research.
In refining the objectives, the study accentuates a multidimensional comprehension of the material basis system in the definition phase, advocating for a robust theoretical exposition of TCM from dual perspectives: a top-down, systems theory-guided approach, and a bottom-up, elements-structure-function-based approach. This dual perspective enables a thorough exploration of TCM’s material basis, covering everything from active substances and biopharmaceutical regulators to drug combinations, self-assembly mechanisms, and remote control. Additionally, the research explores AI’s capabilities in the quantitative assessment of TCM’s material basis and its predictive proficiency in inference analysis. This encompasses evaluating AI’s effectiveness in modeling the complex interactions among TCM elements and forecasting system behaviors from quantitative data, accompanied by two specific technical proposals for applying AI in material basis research.
2. Systems theory-guided top-down research framework in TCM
This section outlines a comprehensive, systems theory-guided framework for TCM research, streamlining and simplifying TCM’s intricate aspects through a structured, top-down approach. By integrating AI into this methodology, the study seeks to enhance the understanding and application of TCM’s material basis, formulation design, mechanisms of action, clinical positioning, and interaction with external conditions.
As depicted in Fig. 4, this structured top-down approach maps the system from macro to micro levels, integrating contemporary medical insights, and outlines a hierarchical structure focusing on elements, structure, function, boundaries, and environment. This framework not only provides clarity to the fundamental questions of “who,” “why,” “what,” “which,” and “where” in TCM but also furnishes AI with a defined research trajectory. It enhances AI’s capacity to parse and simulate the complex mechanisms of action between TCM formulations and TCM syndromes, establishing a novel methodology for TCM research.
2.1. Elements—Who: Material basis
“Elements” are the foundational components within a system, signifying the various chemical constituents that comprise Chinese herbs, known as the “material basis.” This segment starts with the macro level of herbal materials, progressively narrowing down to the micro level of chemical components and individual ingredients, highlighting their collective role in the system’s overall functionality.
2.1.1. Herbal materials
Herbal materials, including natural plants such as licorice [34], ginseng [35], and Astragalus [36], and those derived from animals [37], minerals [38], and fungi [39], offer rich resources for TCM formulations. The type of herbs, the season of harvest [40], the growing environment [41], and the processing methods [42] can all affect their medicinal properties. The combination and proportion of these herbs form the foundation of TCM formulations, allowing us to understand the relationship between Chinese medicine and syndromes at a holistic level.
2.1.2. Components
Each herb contains multiple chemical components like flavonoids [43] and saponins [44], which are the material carriers of the herb’s therapeutic actions. The diversity of these components in herbs and the subtle variations in their interactions determine their synergistic effects in TCM formulations. Therefore, an analysis at the component level provides us with a scientific basis for understanding complex Chinese herbal formulations from a chemical perspective.
2.1.3. Ingredients
We further investigate the individual ingredients that constitute these chemical components. Although not all individual ingredients are directly involved in the medicinal effect, certain specific active molecules, such as quercetin [45], chlorogenic acid [46], and emodin [47], are of particular interest due to their significant pharmacological activity. These active molecules play key roles in TCM formulations and have become a focus of extensive research.
For example, the classic TCM formula Danggui Buxue Tang, which treats symptoms of qi and blood deficiency, primarily features Angelicae Sinensis Radix and Astragalus. Danggui Buxue Tang contains a variety of active components such as saponins and flavonoids [48]. Among them, Astragalus saponin IV is most abundant at specific extraction ratios and thus exerts its optimal therapeutic effect [49].
2.2. Structure—Why: Formulation design
“Structure” investigates the relationships and functional interactions among these elements, forming the system’s architecture. It delves into the holistic complexity of Chinese herbal medicine, its component interactions, and compatibility with the human body, aiming to unravel the logic and dynamics within herbal formulations and their profound interactions with TCM syndromes.
2.2.1. Herbal compatibility
The combination of herbs in formulas adheres to the TCM principle of “monarch, minister, assistant, and courier,” reflecting a deep understanding of the synergistic and potentiating mechanisms of herbs and the holistic treatment philosophy [50]. In clinical practice, although the compositions of TCM formulas are complex, certain herbs are commonly combined due to their complementary properties, forming frequently used pairs. For example, the combination of Scutellaria baicalensis and Coptis chinensis can enhance anti-inflammatory and antimicrobial effects [51]. However, there is often a lack of clear scientific data supporting the dose-response relationships of many combinations [52].
2.2.2. Component ADME
In the study of the ADME of Chinese medicine components, traditional radiolabeling techniques are not suitable for tracking complex components of Chinese medicine. Modern mass spectrometry, particularly liquid chromatography-high resolution mass spectrometry, is well-suited for analyzing Chinese medicine components and their metabolites, opening new avenues for studying the ADME characteristics of Chinese medicine [53]. Moreover, facing the complex chemical composition of Chinese medicine and its typical application as combination therapy, researchers have established multi-component PK and drug combination PD methods to assess drug interactions [54].
2.2.3. Dose-response relationship of ingredients
The dose-response relationship of ingredients is a core concept in multiple biological fields such as pharmacology, toxicology, and risk assessment [55]. It describes the relationship between drug dose and biological response; it not only determines the appropriate dose and frequency of a drug in populations but is also crucial for the development of new cytotoxic drugs [56]. The dose-response relationship is typically characterized by a sigmoidal model [57], with formulations ranging in complexity from simple single-parameter equations to complex multi-compartment PK-PD models. For instance, studies on Huashi Baidu decoction (Q-14) have identified bioactive compounds with dose-dependent inhibitory effects on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), highlighting the importance of dose-response research in Chinese medicine [58].
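One widely used sigmoidal form is the Hill (sigmoid Emax) equation, sketched below as an assumption about what such a model might look like rather than the specific model of the cited studies. The parameter names `e_min`, `e_max`, `ec50`, and `hill` and the numeric values are illustrative.

```python
def hill_response(dose, e_min, e_max, ec50, hill):
    """Hill (sigmoid Emax) response at a given dose.

    e_min: response at zero dose; e_max: response at saturating dose;
    ec50: dose giving a half-maximal effect; hill: steepness of the curve.
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    if ec50 <= 0 or hill <= 0:
        raise ValueError("ec50 and hill must be positive")
    if dose == 0:
        return e_min
    ratio = (dose / ec50) ** hill
    return e_min + (e_max - e_min) * ratio / (1.0 + ratio)


# Hypothetical parameters: 0-100% inhibition, EC50 of 10 dose units, Hill slope 2.
doses = [0.0, 1.0, 10.0, 100.0]
responses = [hill_response(d, e_min=0.0, e_max=100.0, ec50=10.0, hill=2.0) for d in doses]
```

By construction, the response at a dose equal to `ec50` lies exactly halfway between `e_min` and `e_max`, and larger `hill` values make the transition around that dose steeper.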
2.3. Function—What: Mechanism of action
“Function” examines the purposes and outcomes of the system, elucidating the clinical efficacy and mechanisms of action of TCM formulations. This involves a detailed examination from macro to micro levels, merging TCM theory with modern biological insights to uncover the therapeutic principles of Chinese medicine.
2.3.1. Efficacy of herbal medicine
Guided by TCM theory, the efficacy level of herbal medicine classifies and summarizes the therapeutic and health-promoting effects of herbs. Herbs are categorized by their expected therapeutic effects, such as “clearing heat” or “invigorating blood to dispel stasis,” providing a clear framework for understanding how ancient physicians chose appropriate herbs based on specific patient syndromes [59]. For instance, some studies classify herb actions into different efficacy categories based on TCM theory, illustrating effective use of heat-clearing and detoxifying herbs in slowing the progression of hand, foot, and mouth disease [60].
2.3.2. Biological mechanisms of components
At the component level, we examine how specific chemical components function within an organism. For instance, flavonoid compounds can activate antioxidant pathways, producing anti-inflammatory effects and inhibiting the activity of inflammation-related enzymes, thereby reducing inflammatory responses [61]. Studies show that flavonoids can modulate the inflammatory response in cardiovascular diseases by inhibiting key pathways and reducing inflammatory cytokine expression [62].
2.3.3. Targets of ingredients
At the level of targets of ingredients, we focus on the binding of single chemical entities to specific receptors within an organism and the mechanisms by which they activate or inhibit related biological responses [63]. Analyzing the interaction networks of TCM ingredients with biological targets reveals at the micro level how TCM produces therapeutic effects. For example, computational analysis of target-disease associations for commonly used Chinese herbs provides strategies for the design of targeted herbal formulations for chronic diseases [64]. Currently, emerging single-cell multi-omics technologies are becoming key tools for identifying and validating targets of ingredients of Chinese medicine [65].
2.4. Boundaries—Which: Clinical positioning
“Boundaries” define the research scope by concentrating on specific system factors and minimizing external disruptions. This concept is crucial for determining the clinical positioning of Chinese medicine in relation to TCM syndromes, offering a clear research direction and ensuring a systematic, in-depth analysis.
2.4.1. TCM syndromes
This level concentrates on TCM’s traditional classification and description of diseases and symptoms. For instance, “liver qi stagnation” indicates an impeded flow of liver energy, which can lead to physical and emotional issues. This description covers not only clinical symptoms but also constitution, lifestyle, and other factors, providing a comprehensive understanding of the system’s starting point. In practical applications, TCM therapies have shown promising results in treating cancer-related depression, demonstrating the potential of TCM as an alternative therapy [66]. Similar TCM syndromes under different pathological states indicate that TCM syndrome classification transcends the boundaries of specific diseases [67].
2.4.2. Pathophysiological characteristics
Integrating modern medical viewpoints and technologies, this level enables more accurate identification and description of the biomedical characteristics of TCM syndromes. “Damp-heat” in TCM corresponds to specific inflammatory responses or metabolic abnormalities in modern medicine, forming a bridge between ancient TCM theories and contemporary biomedicine. For example, adipotoxicity, hypoxia, and inflammation in obesity align with the pathophysiological characteristics of the TCM “damp-heat” syndrome [68] and are associated with chronic low-grade systemic inflammation, a driving factor for metabolic diseases.
2.4.3. Biomarkers
At this level, we focus on biomarkers associated with specific TCM syndromes or disease states. Identifying and distinguishing biomarkers of different TCM syndromes enhances the objectivity of diagnosis and the efficacy of treatment. Multi-omics studies chemically characterize Chinese herbal medicine ingredients and analyze biofluid samples, revealing how these herbs induce dynamic biomarker changes at molecular and cellular levels [69]. Furthermore, dynamic network biomarker algorithms applied to transcriptomic data from patients with chronic hepatitis B, together with clinical validation of biomarkers for heart yin deficiency and heart yang deficiency in chronic heart failure syndrome, have underscored the potential for diagnosing critical states in TCM syndromes [70].
2.5. Environment—Where: External conditions
“Environment” considers the external factors that influence the system, such as the patient’s constitution, lifestyle, and socio-psychological conditions. Understanding these external conditions is vital for tailoring treatments to individual needs, thereby optimizing therapeutic outcomes.
2.5.1. TCM constitution
Constitution reflects an individual’s susceptibility to diseases, arising from the interplay between genetics and environment [71]. Research has indicated that certain constitutional types may be more prone to specific diseases such as depression [72]. Analyzing the relationship between TCM constitutional types and diseases lays the foundation for health management and disease prevention [73]. Studies have also explored the connection between the nine types of TCM constitution and conditions such as being overweight, obese, and underweight [74], which aids in offering precise health management plans for individuals.
2.5.2. Epigenetic characteristics
The field of epigenetics highlights the dynamism and modifiability of genetic information, influenced by environmental factors and regulated by epigenetic mechanisms [75]. Core epigenetic regulatory pathways, such as DNA methylation and chromatin remodeling, play a decisive role in the ADME of drugs and are closely related to adverse drug reactions [76]. For instance, drugs and their metabolites can alter epigenetic states, resulting in varied drug responses among patients. The capability of TCM to modulate epigenetic modifications offers new strategies for the prevention and treatment of diseases such as atherosclerosis [77].
2.5.3. Personalized treatment
Personalized treatment accounts for each patient’s unique traits, including genetics, TCM constitution, lifestyle, and disease history, which are crucial for customizing treatment strategies, dosages, and courses [78]. TCM provides personalized treatment recommendations for chronic disease patients based on constitution theory, achieving precise diagnosis and therapy by comprehensively identifying syndromes, diseases, and constitutions [79]. The alignment of traditional TCM constitutional classification with modern genetic typing foreshadows a shift in medicine from holistic to personalized and refined treatment [80]. The integration of modern biotechnology and advanced diagnostic methods with the TCM typing treatment model paves new paths for the development of personalized TCM treatments and health maintenance.
2.6. Application: TCM research in cancer therapy
Expanding on the foundational systems theory-guided framework established in earlier sections, this analysis extends its disciplined methodology to the domain of cancer therapy, underscoring the significant role of TCM as a valuable adjunctive treatment. Recognized internationally for its prowess in bolstering immunity, stabilizing physiological balance, and actively suppressing the proliferation of cancer cells, TCM’s assimilation into oncological treatment regimens demonstrates its all-encompassing approach to health care [81], [82], [83].
Fig. 5 serves as a visual exposition, employing a structured, top-down methodology to dissect the intricate application of TCM in oncology, encompassing its elements, structure, function, boundaries, and environment interactions. This comprehensive examination includes bioactive component screening, augmentation of therapeutic structures, modulation of biological pathways, stratification of oncological stages, and the development of prophylactic strategies influenced by environmental factors. This detailed case study emphasizes the nuanced integration of TCM throughout the cancer therapy lifecycle—from prevention to treatment and management—providing a blueprint for systematically exploring TCM’s potential in cancer therapy and beyond.
In TCM “elements,” research has identified several Chinese herbs, including Astragalus and its compounds, for their pronounced anti-cancer properties [84], [85]. These components have demonstrated potential in clinical trials, highlighting TCM’s role in advancing anti-cancer drug development through methods like metabolomics and network pharmacology [86]. The “structure” of TCM in cancer therapy emphasizes its crucial role in combination treatments [87]. TCM enhances the efficacy of primary cancer treatments, reduces side effects, and modulates anti-tumor immune responses. Integrating TCM small molecules with conventional drugs demonstrates the potential to enhance cancer treatment outcomes and patient responsiveness [88].
“Function” delves into how TCM formulations influence cancer, revealing their capability to intervene in cancer progression through various biological pathways [
89]. These actions include modulating the tumor microenvironment, reversing tumor immune escape, and effectively combating multidrug resistance, highlighting TCM’s unique value in tumor immunotherapy [
90]. Investigating TCM theory’s “boundaries” aids in precisely understanding cancer progression, providing insights into disease severity, treatment strategies, and intervention timing [
91], [
92]. TCM’s holistic approach, complemented by modern technologies, enhances precision in cancer treatment, despite challenges in early diagnosis [
93], [
94], [
95].
Lastly, the “environment” aspect highlights TCM’s comprehensive approach to cancer treatment, emphasizing whole-body regulation and prevention [
96], [
97]. TCM’s preventive strategies, aligned with personalized treatment trends, offer innovative therapeutic avenues by considering individual susceptibilities and genetic factors [
98].
3. “Elements-structure-function” bottom-up exploration of material basis
Transitioning from the top-down systems theory framework previously outlined, this section engages in a meticulous, bottom-up investigation of TCM’s material basis via the “elements-structure-function” model. This methodology resonates with the intrinsic nature of TCM’s material basis, aspiring to systematically decode the active components (“who” is acting), their action mechanisms (“why” they are effective), and the resulting therapeutic outcomes (“what” effects are manifested).
This bottom-up method provides a detailed delineation of TCM elements, crucial for utilizing AI in later stages of quantification and inference. Starting from the molecular sphere and scaling upward, this scrupulous definition process is essential for grasping a multi-layered perspective of the material basis. It establishes the foundation for AI to model intricate connections and predict outcomes in TCM compounds, ultimately refining the research process of TCM’s material basis through systematic AI deployment.
Illustrated in
Fig. 6, our approach begins with the isolation and classification of pharmacologically active and regulatory substances within TCM at an elemental level. This initial stage sheds light on their attributes and provenance, priming us for discerning their interplay. Advancing to the structural tier, our analysis deepens the understanding of TCM’s approach in harmonizing these components, disclosing the intricacies of drug interactions and self-organization mechanisms. Such knowledge is pivotal for optimizing ADME traits, enhancing drug delivery, and ensuring precise therapeutic targeting. At the functional tier, the exploration clarifies TCM’s extended regulatory influence in clinical settings, showcasing how TCM constituents orchestrate concerted actions throughout the body. This comprehensive inquiry underscores the scientific rigor and precision needed to define TCM’s material basis and illustrates the seamless integration of ancient wisdom with modern analytical methods, thereby enhancing the understanding and utility of TCM remedies.
3.1. Active substances
Active substances are the cornerstone of TCM, driving its therapeutic effects through interactions within the human body. This segment delves into the varied active substances integral to TCM’s efficacy, encompassing inorganic elements, small molecule ingredients, polysaccharides, peptides, microRNAs (miRNAs), and the emerging role of exosomes.
3.1.1. Inorganic elements
Essential for TCM’s effectiveness and safety, inorganic elements such as calcium, magnesium, and zinc—often derived from mineral medicines—play critical roles in bodily functions [
38]. Their regulation is vital for TCM quality control, with advanced techniques such as inductively coupled plasma mass spectrometry (ICP-MS) being used for precise measurement [
99], [
100].
3.1.2. Small molecule ingredients
Small molecules, including alkaloids and flavonoids, are key to TCM’s pharmacological diversity, offering a range of therapeutic actions from antioxidant to anti-inflammatory effects. Modern analytical technologies, combined with big data and AI, advance the identification and mechanistic study of these molecules [
101].
3.1.3. Polysaccharides
Found in various medicinal materials, polysaccharides exhibit immunomodulatory and anti-tumor activities. Their interaction with the gut microbiota, crucial for various physiological functions, underscores the complexity of TCM’s action mechanisms [
102], [
103]. Given that “structure determines function,” precise identification of polysaccharide structures is crucial, yet it remains a challenge because of their high structural complexity, high molecular weight, and branching.
3.1.4. Peptides
Peptides are highly bioactive components in TCM, consisting of multiple amino acids linked by peptide bonds. They are abundantly present in snake venom, bee venom, and certain herbs and have been proven to have significant immunomodulatory, anti-inflammatory, anti-tumor, and antioxidant pharmacological effects [
104]. However, peptide research and application face challenges due to their complex molecular structures and sensitivity to environmental factors like temperature, pH, and enzymatic activity, which make them prone to degradation and instability. Additionally, the bioavailability of peptides, especially their oral absorption rates, is generally low, limiting their potential in drug development [
105]. Cutting-edge techniques in liquid chromatography-mass spectrometry (LC-MS) and proteomics are crucial for exploring their therapeutic potential [
106].
3.1.5. miRNA
miRNAs are short-chain non-coding RNA molecules about 21 to 24 nucleotides in length, primarily regulating gene expression by binding to the 3′ untranslated region of target messenger RNAs (mRNAs), thereby affecting key biological processes such as cell proliferation, differentiation, and apoptosis. The therapeutic potential of miRNAs, still in the early stages of exploration, has shown promise for treating various diseases [
107]. TCM research is gradually uncovering the important role of miRNAs in active ingredients. For example, miR2911 found in
Lonicerae Japonicae Flos has shown unique stability and can directly inhibit viral replication within the host, accelerating patient recovery [
108]. miR162a from goji berries can enter the bloodstream through oral intake and promote osteoblast formation, potentially impacting the treatment of osteoporosis [
109]. The miRNAs currently identified in relation to TCM are just the tip of the iceberg, and it is expected that more cases of TCM-miRNA interactions will be discovered in the future, providing new directions and ideas for TCM new drug development.
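The binding of a miRNA to the 3′ untranslated region described above is commonly screened computationally by looking for sites complementary to the miRNA “seed” (nucleotides 2 to 8 from the 5′ end). The following is a minimal sketch of that idea only; the miRNA and UTR sequences are made up for illustration, the function names are hypothetical, and real target prediction tools also consider site context and conservation.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the mRNA motif (5'->3') that pairs with miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # positions 2..8, using 0-based slicing
    # Pairing is antiparallel, so take the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Hypothetical sequences, not taken from any database.
mirna = "UACGGAUCCAGUCAGUACGUAG"
utr = "AAGAUCCGUUUCCGAUCCGUAA"

site = seed_site(mirna)
matches = find_seed_matches(mirna, utr)
print(site, matches)
```

A perfect seed match is only a necessary-looking signal in this sketch; whether a plant-derived miRNA actually regulates a host transcript still has to be shown experimentally.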
3.1.6. Exosomes
In cell biology, exosomes generally refer to a subtype of extracellular vesicles (EVs), which are vesicles released by cells into the extracellular space. Although the term “nanoparticles” is also used to describe similar structures in some literature, it refers to a broader category of substances at the micro- and nanoscale. As a subset of EVs, exosomes show increasing value in TCM research. Exosomes, loaded with various RNAs, proteins, and lipids, facilitate key information transfer between cells, influencing and regulating physiological and pathological processes [
110]. Studies have found that exosomes extracted from ginseng have the potential to inhibit the growth of melanoma [
111] and differentiation of osteoclasts [
112]. By deeply analyzing exosome composition, we can more precisely decipher TCM’s mechanisms of action at the cellular and tissue levels. Additionally, exosomes are explored as a novel drug delivery system with potential to enhance drug bioavailability and targeting [
113]. Although extracting high-purity exosomes from TCM poses technical challenges, there is reason to anticipate a more comprehensive understanding of exosomes’ unique role in TCM in the future.
3.2. Biopharmaceutical regulatory substances
In TCM, alongside active substances that directly contribute to therapeutic outcomes, a range of biopharmaceutical regulatory substances also plays a crucial role. These substances shape the PK of TCM formulations, significantly influencing the ADME of active ingredients. Furthermore, they modulate toxicity and bolster both the efficacy and safety of TCM treatments. The origins of these regulatory substances are diverse, spanning from excipients and the TCM ingredients themselves to various biological environmental factors.
3.2.1. Excipients
Introduced during TCM processing, excipients like alcohol, vinegar, salt, and honey stabilize drug effects and actively regulate physiological targets and drug transport mechanisms, such as P-glycoprotein transport [
114], [
115]. For example, cyclodextrins are notable for enhancing drug solubility and engaging in biological interactions relevant to cholesterol management and Alzheimer’s disease therapy [
116].
3.2.2. TCM itself
Within TCM’s complex formulations, certain ingredients embody the concept of “medicine-auxiliary unity,” acting simultaneously as active agents and regulatory elements. Flavonoids not only offer anti-inflammatory or antioxidant benefits but also modulate the activity of metabolic enzymes and transport proteins, thereby affecting the PK of co-administered substances [
117]. Polysaccharides exemplify another dual role; they provide immunomodulatory effects and improve the solubility and stability of other active compounds, thus enhancing overall therapeutic action [
118].
3.2.3. Biological environmental factors
The interaction of TCM components with endogenous substances, such as bile salts and plasma proteins, significantly influences drug behavior. Bile salts facilitate lipid digestion and absorption and, in conjunction with specific transporters and other molecules, can enhance drug bioavailability [
119], [
120]. Additionally, the binding of drugs to plasma proteins upon entering the bloodstream profoundly impacts drug distribution and efficacy, with variations in internal factors like pH and electrolytes further contributing to differences in drug action among individuals [
121].
3.3. Drug combinations
In both modern medicine and TCM, drug combinations are strategic in enhancing treatment efficacy, reducing toxicity, and preventing resistance, especially for complex conditions such as cancer and infectious diseases. The combination theory in TCM is crucial for analyzing interactions within multi-component formulations. It examines the independent action of each component as well as the synergistic or antagonistic effects arising from their interactions. This dual perspective is vital for appreciating the complex interplay of components in TCM and for designing effective combination therapies.
3.3.1. Independent drug actions
Independent drug action is a principle of drug combination where each drug exerts its pharmacological activity independently, without interaction or influence from others [
122]. This means that the effect of each drug in the combination does not depend on the presence of other drugs. The main advantage of independent drug action is that it provides multiple opportunities for patients with varying drug sensitivities to benefit from at least one drug, potentially increasing treatment effectiveness and success rates. This is particularly evident in TCM, where different components may target different physiological pathways or pathological processes, providing multiple therapeutic entry points for different patient groups. This multi-targeted efficacy is a distinct advantage of TCM.
3.3.2. Combined drug effects
In contrast to independent drug actions, combined drug effects emphasize the mechanisms and outcomes of drug interactions [
123]. Components in TCM often interact, producing synergistic or antagonistic effects, and such interactions can help enhance efficacy or reduce side effects. For example, the combination of Shuanghuanglian formulation with antibiotics can significantly enhance therapeutic effects [
124]. Under synergistic action, the combined effect of two drugs, like stevioside and eugenol interacting on specific biological pathways, exceeds the sum of their individual effects, offering cardiac protection [
125]. In antagonistic action, one drug may reduce the effect of another, such as the interaction between TTA-A2 and paclitaxel at the same active site [
126]. Potentiation occurs when one drug, such as lycopodium acid, enhances the effect of another—like the antimicrobial activity of antibiotics—even if it has no significant effect alone [
127]. These modes of action are not only crucial for understanding the efficacy of TCM but also provide new perspectives for drug design.
3.3.3. Evaluating combined drug effects
Methodologies for evaluating combined drug effects have significantly evolved, integrating advanced computational techniques with traditional pharmacological approaches. Notably, advancements have focused on utilizing both effect-based and dose-effect-based methods for a comprehensive assessment of drug combinations [
128]. Zheng et al. [
129] introduce SynergyFinder Plus, enhancing the analytical landscape with extended mathematical models that facilitate nuanced analyses of drug synergy and sensitivity. This platform integrates statistical evaluations and confidence intervals, providing a deeper understanding of combination therapies. In parallel, Malyutina et al. [
130] present a cross design with sensitivity and synergy scoring, streamlining the evaluation of drug interactions. This approach is noted for its reproducibility and efficiency, reducing the reliance on extensive experimental materials. Together, these innovations mark a transition to a more holistic analysis, merging mathematical rigor with pharmacological insights to refine the process of selecting and evaluating drug combinations.
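Effect-based evaluation compares the observed effect of a combination with the effect expected if the drugs did not interact. The sketch below illustrates two widely used reference models, Bliss independence and highest single agent (HSA). It is an illustrative calculation only, not the SynergyFinder Plus implementation; the function names and the inhibition values are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected fractional effect if the two drugs act independently (Bliss)."""
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a, effect_b):
    """Expected effect under the highest single agent (HSA) reference."""
    return max(effect_a, effect_b)


def synergy_excess(observed, effect_a, effect_b, reference=bliss_expected):
    """Observed minus expected effect: > 0 suggests synergy, < 0 antagonism."""
    return observed - reference(effect_a, effect_b)


# Hypothetical fractional inhibitions (0 = no effect, 1 = full inhibition).
effect_a, effect_b, observed = 0.40, 0.50, 0.80

excess_bliss = synergy_excess(observed, effect_a, effect_b)
excess_hsa = synergy_excess(observed, effect_a, effect_b, reference=hsa_expected)
print(round(excess_bliss, 2), round(excess_hsa, 2))
```

The same observation can score differently under different reference models, which is one reason platforms that report several models side by side, with confidence intervals, are useful.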
3.3.4. Predicting combined drug effects
In the pursuit of identifying synergistic drug combinations, research strategies increasingly incorporate computational methods to navigate the extensive datasets associated with drug pairing and dosage combinations [
131]. These methods utilize public genomic and phenotypic resources to offer insights into cellular responses to drugs and aid in developing algorithms that predict effective drug combinations [
132]. Furthermore, analyzing the interplay between drug targets and disease-associated proteins within protein interaction networks illuminates the potential of diverse drug-drug-disease combinations, paving the way for novel therapeutic strategies [
133].
Recent advancements exemplify this shift towards computational synergy prediction. Malyutina et al. [
130] introduced a novel cross design paired with a drug combination sensitivity score and an S synergy score. This approach optimizes the evaluation of drug interactions, ensuring robust and precise assessments with reduced experimental requirements, thus enhancing the discovery rate in high-throughput drug combination screenings. Complementing this, Gan et al. [
134] applied a network medicine framework to TCM, demonstrating that the network proximity of an herb’s targets to symptom-related modules within the human protein interactome can predict the herb’s effectiveness in treating specific symptoms. This method not only validates the scientific basis of TCM but also sets a precedent for the molecular understanding of natural medicine. Together, these studies underscore the indispensable role of computational tools in the modern landscape of drug combination research, enabling a more efficient and scientifically grounded exploration of therapeutic potentials.
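The notion of network proximity can be illustrated with a simplified “closest distance” measure: for each herb target, take the shortest-path distance to the nearest protein in the symptom module, then average over targets. The sketch below is a toy illustration under stated assumptions, not the pipeline used in the cited study; the graph, node names, and target sets are invented, and the comparison against randomized node sets that published proximity analyses typically include is omitted.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(graph, herb_targets, symptom_module):
    """Mean, over herb targets, of the distance to the nearest module protein.

    Assumes every target can reach at least one module protein.
    """
    total = 0
    for target in herb_targets:
        dist = shortest_path_lengths(graph, target)
        total += min(dist[p] for p in symptom_module if p in dist)
    return total / len(herb_targets)


# Toy undirected interactome as an adjacency list (hypothetical proteins A-F).
graph = {
    "A": ["B"],
    "B": ["A", "C", "F"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
    "F": ["B"],
}
symptom_module = ["C", "D"]

far_herb = closest_proximity(graph, ["A", "F"], symptom_module)
near_herb = closest_proximity(graph, ["C", "E"], symptom_module)
print(far_herb, near_herb)
```

In this toy case the second herb's targets sit closer to the symptom module, so under the proximity hypothesis it would be the more plausible candidate for that symptom.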
3.4. Self-assembly
Self-assembly, where molecules naturally organize into stable, ordered structures, is increasingly recognized in TCM research for its pharmacological and pharmaceutical implications. This process, central to nanotechnology due to its environmental friendliness, biodegradability, and biocompatibility, has become instrumental in the development of novel drug delivery systems [
135], [
136]. For example, TCM decoction techniques can induce self-assembly of herbal compounds into nanoparticles, enhancing the bioavailability of medicines [
137], [
138]. Additionally, interactions between TCM compounds and the body’s endogenous molecules like bile salts or proteins can modify the efficacy and distribution of drugs [
139].
Technological advances have opened new avenues for exploring self-assembly in TCM, leading to the creation of targeted drug delivery mechanisms, particularly for cancer treatment. These mechanisms utilize biomimetic approaches for enhanced specificity in tumor targeting, offering promising prospects in therapy and diagnostics [
140], [
141], [
142], [
143]. The discussion on self-assembly in TCM spans several areas, including the behavior of small and large molecules and their interactions with endogenous small molecules and macromolecules. This comprehensive approach sheds light on the varied potential of self-assembly for drug delivery within TCM.
3.4.1. Small molecules in TCM
Small molecules in TCM enter nanotechnology through two primary self-assembly mechanisms. The first is the self-assembly of individual molecules, as seen with terpenoids (including betulin, betulinic acid, and oleanolic acid) [
144] and with steroids and glycosides that form gel-like structures [
145], offering significant drug delivery advantages. Moreover, compounds like ursolic acid [
146] and rhein [
147] have been found to self-assemble into nanostructures that enable effective drug delivery. The second mechanism is the supramolecular assembly of two small molecules, such as berberine and baicalin [
148], enhancing biocompatibility and significantly improving antibacterial activity. Additionally, the self-assembly strategy of berberine helps to neutralize the toxicity of aristolochic acid [
149]. Through a similar mechanism, the combination of sanguinarine and baicalin can form hydrogels that enhance antibacterial effects [
150].
3.4.2. Large molecules in TCM
In the case of large TCM molecules, self-assembly capabilities allow polysaccharides and proteins to form supramolecular structures with unique functions, affecting the release and absorption of drugs and providing a new direction for controlled release formulations [
151]. Polysaccharides can self-assemble in response to pH changes or interact with metal ions [
152]. Similarly, the formation of higher-order protein structures, like ferritin [
153] and silk proteins [
154], is a self-assembly process. The self-assembly capabilities of zein [
155] and edible dock proteins [
156] in drug delivery are notable, and saponin-involved protein self-assembly can alter material properties [
157], opening new research directions for drug delivery technologies.
3.4.3. TCM with endogenous small molecules
The self-assembly of TCM components with endogenous small molecules such as bile acids and phospholipids affects drug behavior in the body. Together they form supramolecular structures such as micelles, which enhance drug absorption and bioavailability. Bile acids, especially in the intestines, interact with drugs to promote drug release [
158]. Studies on drug interactions with bile acids have explored their synergistic effects, stoichiometry, and binding constants to understand their interaction mechanisms [
159]. Furthermore, stearic acid has been found to enhance the stability of drug salt nanomicelles, promoting gastrointestinal absorption of drugs [
160].
3.4.4. TCM with endogenous macromolecules
The interaction between TCM and endogenous macromolecules is critical for the stability, release, and absorption of drugs. Drugs may undergo supramolecular self-assembly with proteins in the digestive system, such as trypsin and pepsin, affecting the catalytic activity of these enzymes. The binding of anthocyanin B3 to trypsin and pepsin [
161], the direct inhibition of these enzymes by naringin [
162], and the non-covalent interactions of polyphenolic compounds with digestive enzymes [
163] are examples of this process. Once in the bloodstream, TCM components may also self-assemble with proteins like albumin and globulin, affecting drug distribution and stability in the blood. Studies show that the binding of drugs to plasma proteins has profound effects on drug PK and PD [
164]. Hydrophobic interactions and hydrogen bonding enable flavonoid compounds to bind to plasma proteins, affecting their bioavailability [
165]. For instance, the low concentration of berberine in plasma and its high concentration in tissues may be related to its self-assembly with hemoglobin [
166].
3.5. Remote control
Remote control within the context of TCM refers to the complex interplay among various organs and systems within the body, unveiling new vistas for understanding the multifaceted actions and liver toxicity mechanisms inherent in TCM formulations (
Fig. 7). This concept, exemplified by the brain-gut axis, highlights the significance of bidirectional communication pathways in assessing TCM’s efficacy and safety. With advancements in metagenomics and metabolomics, researchers are now better equipped to explore TCM’s influence on gut microbiota and internal metabolites, paving the way for more precise and safe treatment methodologies.
Brain-gut axis: This axis underscores the symbiotic relationship between cognitive states and gastrointestinal health, spotlighting gut microbiota’s role in modulating neurological functions through metabolic byproducts like neurotransmitters and short-chain fatty acids [
167]. Such insights are pivotal for tailoring TCM approaches to central nervous system disorders, emphasizing the gut’s influence on mental well-being.
Brain-heart axis: Delving into the interconnection between emotional health and cardiac function, this axis reveals the dual impact of psychological states and heart diseases on each other [
168], [
169]. It underscores the potential of TCM in harmonizing heart and brain health, offering novel perspectives on treating cardiovascular and neurological conditions.
Brain-skin axis: This pathway illustrates how psychological stressors can precipitate skin disorders through neuroendocrine and immune responses [
170], [
171]. It advocates for a holistic TCM treatment strategy that addresses both the psychosomatic and physiological aspects of dermatological conditions.
Lung-gut axis: Highlighting the reciprocal influence of respiratory and gastrointestinal health, this axis points to the role of gut microbiota in respiratory diseases [
172], [
173]. It underscores the importance of maintaining gut health in TCM respiratory treatments.
Liver-gut axis: Focusing on the symbiotic relationship between the liver and the gut, this axis is crucial for understanding the metabolism and potential toxicity of TCM compounds [
174]. It emphasizes the role of bile acids in coordinating liver health, vital for TCM treatments targeting liver diseases.
Gut-muscle, gut-bone, and gut-kidney axes: These axes explore the impact of gut microbiota on muscle, bone, and kidney health, respectively, highlighting the extensive influence of the gut microbiota beyond the gastrointestinal tract [
175], [
176], [
177], [
178], [
179], [
180]. They provide a foundation for TCM strategies aimed at preserving muscle mass, enhancing bone metabolism, and managing chronic kidney diseases.
Kidney-bone axis and indirect hepatotoxicity: Examining how renal health impacts bone metabolism [
181], [
182] and exploring indirect hepatotoxicity mechanisms [
183], [
184], [
185], [
186] expand the scope of TCM research to include preventative and treatment strategies for liver injuries.
Through the remote-control networks, TCM research delves into the body’s complex internal networks, enhancing our comprehension of TCM’s therapeutic capabilities and its interactions with various bodily systems. This multidisciplinary approach enriches our understanding of TCM’s mechanisms and paves the way for innovative therapeutic interventions, combining ancient wisdom with modern science.
4. Components as fundamental units of “elements” in material basis
TCM research now emphasizes “components” as fundamental units, marking a paradigm shift towards an integrative understanding of its holistic material basis. This shift from analyzing isolated compounds to exploring synergistic interactions highlights TCM’s complexity and aligns with the dynamic nature of contemporary medical research. Illustrated in
Fig. 8, this shift portrays the TCM material basis as a vibrant ecosystem of bioactive substances, emphasizing the collective efficacy of constituents such as alkaloids, peptides, and polysaccharides.
4.1. Definition of TCM components
In addressing “components” within TCM research, it becomes clear that broad categorizations such as “total flavonoids” or “total saponins” barely scratch the surface of their complex nature. To refine our understanding, “components” are explored from compound production, efficacy, and notably, a structure-oriented perspective, which resonates with AI’s analytical capabilities. This approach not only aligns with AI’s prowess in data structuring but also propels AI into the elemental analysis phase, enriching our initial insights. The introduction of “component structure” theory underscores our methodological foundation, promoting a deeper, integrated exploration of TCM’s holistic material basis and bridging traditional wisdom with scientific rigor.
Defining TCM components involves a multidimensional framework that reflects the complexity of TCM and its integration with modern science. These components are essential for determining the efficacy, safety, and quality control of TCM formulations. By employing production-, efficacy-, and structure-oriented perspectives, we gain a thorough understanding of TCM components, enriching their characterization for modern medical use, as depicted in
Fig. 9.
4.1.1. Production-oriented
This production-oriented viewpoint links component identification to chemical methods for separation and purification. This approach, exemplified by the development of China’s first natural hypoglycemic drug from mulberry twig alkaloids [
187], underscores the importance of processing techniques. Evolving chromatographic methods now aim to achieve extracts with structural component purities exceeding 85% [
188], [
189]. While offering operational ease and facilitating rapid bioactive molecule utilization, this method calls for further pharmacological studies to explore the holistic efficacy of compound TCM formulations.
4.1.2. Efficacy-oriented
This efficacy-oriented perspective focuses on identifying compounds that reflect TCM’s holistic therapeutic effects. It addresses the complexity within TCM formulations, where multiple active ingredients contribute to therapeutic effects across different targets. This approach is crucial for leveraging TCM’s multifaceted therapeutic potential. For instance, studies by Zhang et al. [
190] and Xing et al. [
191] at China Pharmaceutical University focus on identifying “combinatorial bioactive ingredients” to replicate the therapeutic effects of original TCM prescriptions. Though complex, this strategy is pivotal in elucidating TCM’s holistic therapeutic effects.
4.1.3. Structure-oriented
Concentrating on molecular structures and their similarities, this approach facilitates the identification of bioactive entities within TCM. Using advanced tools such as ClassyFire [
192], Scaffold Hunter [
193], and SCONP [
194] enhances the efficiency of structural classification and prediction of biological activity in TCM compounds. This structure-oriented approach predicts PK and pharmacological actions of TCM ingredients [
195], [
196] and leverages AI to transform TCM research, facilitating the rapid discovery of effective components.
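Structure-oriented grouping rests on a quantitative notion of molecular similarity. A common choice is the Tanimoto coefficient between structural fingerprints, sketched below with fingerprints represented as plain sets of substructure labels. This is a minimal illustration and does not reproduce ClassyFire, Scaffold Hunter, or SCONP; the compound names, feature labels, and the grouping threshold are all hypothetical.

```python
from itertools import combinations


def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical fingerprints: each compound is a set of substructure labels.
fingerprints = {
    "compound_1": {"benzopyranone", "phenol", "catechol"},
    "compound_2": {"benzopyranone", "phenol", "methoxy"},
    "compound_3": {"isoquinoline", "methoxy", "quaternary_N"},
}

# Pairwise similarities between all compounds.
similarities = {
    (x, y): tanimoto(fingerprints[x], fingerprints[y])
    for x, y in combinations(sorted(fingerprints), 2)
}

# Pairs at or above an assumed threshold are treated as the same component.
THRESHOLD = 0.5
same_component = [pair for pair, s in similarities.items() if s >= THRESHOLD]
print(similarities, same_component)
```

In practice the fingerprints would come from a cheminformatics toolkit and the threshold would be tuned against known compound classes; the set-based version here only shows the arithmetic.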
4.2. Component structure theory
“Component structure theory” marks a significant advancement in TCM research by focusing on “components” as fundamental units defined by their structural and functional congruence. This theory emerges in response to the need for a deeper, integrated analysis of TCM’s holistic material basis, transitioning from examining isolated bioactive entities to understanding their synergistic interactions. Anchored by three pivotal elements—chemical structure, stoichiometric structure, and aggregate structure—as depicted in
Fig. 10, this theory provides a robust framework to address critical challenges in TCM research. These challenges include identifying effective components, determining their optimal dosages, and examining their aggregation states within the body.
4.2.1. Chemical structure
This aspect emphasizes the importance of chemical similarities in classifying TCM components, facilitating a streamlined approach to analyzing TCM’s complex ingredient matrix. By identifying compounds with shared functional groups and potential bioactivity, this method pinpoints pharmacologically active units within TCM formulations. While structural resemblance lays the groundwork, the real challenge lies in correlating these similarities with the compounds’ behavior in a biological context, necessitating an integrated chemoinformatics and bioinformatics strategy to fully anticipate these properties.
4.2.2. Stoichiometric structure
Stoichiometric relationships are central to TCM material basis analysis, focusing on the holistic consideration of components within and among themselves. Initially, the interaction of constituents within a component is examined, assuming independent pharmacological effects due to their structural and biopharmaceutical congruence. The analysis then expands to inter-component relationships, where the cumulative efficacy and possible synergistic or antagonistic effects of combining diverse components are explored. This stoichiometric insight is pivotal for formulating TCM doses and maintaining consistency across different formulations, highlighting the importance of dosage optimization in TCM efficacy.
4.2.3. Aggregate structure
Research delves into the self-assembly behaviors of components and their resulting aggregate structures during formulation and within the body. This examination of both small molecule and large molecule self-assembly, and their interaction with bodily substances like bile salts and proteins, uncovers the influence of these higher-order structures on drug PK. Understanding these molecular assembly mechanisms is crucial for drug design, as it enhances the solubility, bioavailability, and stability of therapeutic agents, ensuring rigorous quality control in TCM formulations.
5. AI in quantification and inference of material basis
In the preceding sections, we defined the elemental components of TCM’s material basis under the guidance of the SFDM. Building on these definitions, we now examine the current state of AI applications in quantifying and inferring TCM’s material basis.
5.1. Overview of AI technologies
Before delving into the applications of AI in quantifying and inferring the material basis of TCM, it is imperative to comprehend the evolution of AI technologies and their applications in herbal and pharmaceutical development. AI has become a critical factor in enhancing the accuracy and reliability of TCM diagnosis, driving the objective, quantifiable, and standardized evolution of TCM diagnosis towards evidence-based medicine [197]. Moreover, the prospects of AI application in the traditional medical field are broad, particularly in TCM’s four diagnostic methods [198]. AI’s application in TCM, particularly in four key technological directions, shows significant potential.
5.1.1. Machine learning
ML, a core branch of AI, has shown vast potential in the TCM domain. This technology enables effective predictions by learning from massive data sets, which is crucial for managing complex data in TCM. The application of ML in TCM diagnostics is increasingly widespread. Tian et al. [199] reviewed ML applications in TCM diagnostics, highlighting the importance of data preprocessing, model selection, and evaluation metrics. Furthermore, Wang et al. [200] analyzed the molecular features of Chinese herbal medicines and their active components using ML methods and successfully predicted meridian classification, demonstrating the potential of ML to improve the accuracy of TCM classification.
5.1.2. Natural language processing
NLP is a technology that allows computers to understand, interpret, and generate human language, which is crucial for in-depth analysis of medical literature and precise processing of patient records. With the introduction of the transformer architecture and its derivatives, such as bidirectional encoder representations from transformers (BERT) and generative pre-trained transformer-4 (GPT-4), which are pre-trained on extensive text data, NLP has undergone revolutionary progress, significantly enhancing machine translation and text comprehension capabilities. The rapid development of large language models is reshaping research across fields, offering a novel approach to the complex domain of molecular studies [201].
For instance, one study used the BERT model to construct a standardized model for TCM symptoms, effectively unifying various expressions of synonymous TCM symptoms and greatly improving data processing accuracy and efficiency [202]. Furthermore, NLP technologies extend beyond text analysis to the development of auxiliary diagnostic systems. Another study developed an AI-based TCM auxiliary diagnostic system capable of processing unstructured notes in electronic health records using bidirectional long short-term memory networks with conditional random fields (Bi-LSTM-CRF) and convolutional neural networks (CNNs), accurately diagnosing various common diseases and generating corresponding syndrome lists [203].
5.1.3. Computer vision
Computer vision, which processes information from digital images or videos, has proven valuable in the TCM domain, especially in medical image analysis, and is becoming integral to TCM diagnosis and treatment. A key research area is tongue image analysis. One study used deep learning to diagnose stomach cancer by examining tongue images and their associated microbiome, verifying the practicality of computer vision in traditional TCM tongue diagnosis and showcasing its potential to enhance disease diagnosis accuracy [204]. Additionally, hyperspectral imaging combined with ML has been effectively applied to the quality control of Chinese medicines. By analyzing hyperspectral data of Chinese medicines in depth, this approach has significantly improved the accuracy and efficiency of quality assessment, providing strong technical support for the standardization and quality supervision of Chinese medicine preparations [205].
5.1.4. Knowledge representation and reasoning
Knowledge representation and reasoning in TCM focuses on effectively expressing and processing TCM knowledge in computer systems to support decision-making and new drug discovery. This involves creating knowledge graphs with extensive medical concepts and relationships, enabling AI to perform complex reasoning and support medical decision-making. This process not only facilitates a deeper understanding of TCM knowledge but also provides researchers with a powerful tool for discovering new treatment methods and drugs [206].
In specific application cases, several studies have demonstrated the practical utility of knowledge representation and reasoning technologies in the TCM domain. For example, Li et al. [207] developed a recurrent neural network model using a TCM cerebral palsy knowledge graph and electronic medical records to enhance diagnostic accuracy. Zhao et al. [208] constructed a TCM knowledge graph and applied it to knowledge discovery for diabetic nephropathy, systematically mining and sharing diagnostic and treatment knowledge to strengthen the information support for medical decision-making. Additionally, Jin et al. [209] proposed a knowledge graph-enhanced multi-graph neural network (GNN) model for herbal recommendation, showing how attention mechanisms and TCM knowledge graphs can improve the precision and quality of herbal recommendation systems.
5.1.5. Application of AI in the study of TCM material basis
Table 1 [210], [211], [212], [213], [214], [215], [216], [217], [218], [219], [220], [221], [222], [223], [224], [225], [226], [227], [228], [229], [230], [231] summarizes a series of representative research cases, showcasing the application of AI technologies, such as deep learning and ML, in the study of the material basis of TCM. These studies leverage AI technologies to enhance the precision in analyzing TCM components and deepen the research on pharmacological mechanisms. By processing and analyzing vast amounts of data, AI not only accelerates the screening process for new TCM components but also enhances the personalization and accuracy of drug research and development. These technologies not only modernize TCM research but also bolster scientific support for its safety and efficacy.
5.2. AI in quantitative analysis
Quantitative analysis in TCM research aims to systematically identify and evaluate the physicochemical properties, chemical composition, and biopharmaceutical behaviors of TCM components, laying the foundation for in-depth mechanistic studies and clinical applications. Traditional analysis methods are limited by inefficiency, subjectivity, and lack of automation. The introduction of AI technologies can overcome these barriers, propelling rapid advancement in the modernization of TCM research.
5.2.1. Identification of medicinal material sources
In TCM research and practice, ensuring the authenticity and high quality of medicinal materials is crucial. Combined with DNA barcoding techniques, AI offers a revolutionary method for identifying medicinal material sources. This approach, by analyzing genetic information, can precisely identify and differentiate types of medicinal materials, significantly enhancing the accuracy of identification [232]. AI-assisted image recognition also excels in efficiently and accurately identifying the morphology of medicinal materials, quickly verifying their authenticity and quality, playing a vital role in source tracing and quality control [233].
5.2.2. Identification and classification of chemical components
The complex chemical components of TCM are the material basis of its therapeutic effects, and AI technology shows immense potential in the identification and classification of these components. AI accelerates the identification and precise classification of chemical components in complex TCM through high-throughput screening and automated analysis [234]. Using advanced technologies like GNNs, AI can predict compound properties directly from molecular structures without traditional molecular descriptors, significantly improving the efficiency and accuracy of chemical analyses [235].
5.2.3. Analysis of biopharmaceutical properties
The biopharmaceutical properties, including the ADME characteristics of drugs, are crucial for evaluating drug bioavailability and safety. The application of AI technology in this field, through molecular simulation and intelligent prediction, enables efficient and precise assessment of physicochemical parameters and ADME characteristics. GNNs excel at predicting complex molecular properties, handling multifidelity datasets, and applying transfer learning [236]. Moreover, the use of physiologically-based PK (PB-PK) models allows for the simulation of drug distribution processes in various tissues and organs within the body, providing accurate predictive information for new drug development and safety evaluation [237].
5.3. AI in inference analysis
The inference analysis phase of TCM material basis research requires in-depth exploration of the mechanisms of action, interaction networks, individual variations, and potential indications of active TCM components, crucial for the modernization and clinical translational application of TCM. AI’s capabilities in multidimensional data mining and knowledge discovery are reshaping traditional reasoning modes fundamentally, injecting new momentum into uncovering the unique molecular regulatory principles of TCM and expanding its clinical application scope.
5.3.1. Efficacious substances and mechanisms of action
AI, combined with computational omics technologies, displays great potential in identifying efficacious substances and deducing mechanisms of action. AI algorithms can efficiently identify molecules with potential therapeutic effects from vast databases of natural products, accelerating the new drug discovery process [238]. This involves constructing computational models to accurately predict interactions between molecules and target proteins, delving into their molecular mechanisms of action [239]. For instance, through structure-activity relationship (SAR) models and AI algorithms, researchers can quickly identify TCM components with specific pharmacological activities, such as hepatoprotective effects [214]. Additionally, using quantitative SAR (QSAR) models to predict potential toxicity of compounds provides a scientific basis for pharmacotoxicological evaluations, further ensuring drug safety [231].
5.3.2. Interactions from a systems biology perspective
The holistic efficacy of TCM derives from the interactive effects of active molecules within its complex component system. The application of network pharmacology, coupled with AI technologies, enables the dissection of the complex interactions between TCM components and biological networks from multi-omics data [240]. AI-enhanced network medicine frameworks systematically map disease symptoms and TCM targets onto the human protein interaction network, revealing the molecular underpinnings of TCM’s diagnostic and therapeutic principles [134]. This approach not only facilitates a deeper understanding of TCM’s scientific basis but also, by delineating the interaction patterns between disease and drug molecular networks, sheds light on the molecular origins of diseases and identifies crucial treatment interventions. Furthermore, AI technologies can extract patterns of patient symptoms and drug response modes from extensive clinical datasets, providing vital decision support for the customization of treatment plans and dosage optimization [241].
5.3.3. Discovery of new indications and therapeutic potentials
AI technology, particularly virtual screening and molecular docking, efficiently uncovers active molecules with therapeutic potential from TCM resources. Computational similarity analyses, which assess the similarity between TCM molecules and known drugs, can predict new indications for TCM molecules, offering strong clues for the development of new therapeutic targets [242]. Furthermore, large-scale virtual screening methods can identify a variety of potential active compounds from TCM, including candidates against osteoporosis [243] and candidates with antiviral activity (e.g., against SARS-CoV-2) [213], greatly expanding the application potential of TCM in the treatment of these diseases.
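The similarity-based prediction of new indications described above can be illustrated with a minimal sketch. It assumes that each molecule is already encoded as a set of "on" fingerprint bits and uses the Tanimoto coefficient as the similarity measure; the function names, molecule names, fingerprints, indications, and threshold are all hypothetical and are not taken from the cited studies.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def suggest_indications(
    query: FrozenSet[int],
    reference: Dict[str, Tuple[FrozenSet[int], str]],
    threshold: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Rank known drugs by similarity to the query molecule.

    Returns (drug name, indication, similarity) for every reference drug
    whose similarity reaches the threshold, most similar first.
    """
    hits = []
    for name, (bits, indication) in reference.items():
        score = tanimoto(query, bits)
        if score >= threshold:
            hits.append((name, indication, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical toy data: fingerprints and indications are invented.
reference = {
    "drug_A": (frozenset({1, 2, 3, 4}), "indication_X"),
    "drug_B": (frozenset({1, 2, 7, 8}), "indication_Y"),
    "drug_C": (frozenset({10, 11, 12}), "indication_Z"),
}
query = frozenset({1, 2, 3, 5})  # a hypothetical TCM molecule

hits = suggest_indications(query, reference, threshold=0.3)
for name, indication, score in hits:
    print(f"{name}: {indication} (similarity {score:.2f})")
```

In practice the fingerprints would come from a cheminformatics toolkit, and a similarity hit would only be a hypothesis to be tested experimentally.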
6. Advanced AI applications in TCM material basis research
Building on foundational research, this study introduces two advanced AI-driven proposals to enhance the bottom-up analysis of TCM’s material basis. These proposals aim to harness AI’s full potential in navigating the complex, multi-dimensional interactions that characterize TCM formulations, offering innovative solutions to longstanding research challenges.
6.1. “Component-syndrome” end-to-end data-driven model
Building on previous discussions, the “component-syndrome” end-to-end data-driven model, as depicted in Fig. 11, represents a significant advancement in TCM research. This model leverages AI to decode the complex relationships among TCM components, biological targets, and clinical syndromes. Its development signifies a pivotal shift from analyzing individual components to embracing a holistic, multi-component framework, acknowledging that the efficacy of TCM stems from the dynamic interplay of multiple components.
The rationale behind this model stems from the limitations of existing AI models, such as QSAR and molecular docking, which focus on isolated interactions between components and biological targets. These models, while insightful, do not encompass the entirety of TCM’s intricate approach, which often involves a symphony of components acting in concert across multiple pathways to address a range of syndromes. The “component-syndrome” model extends beyond the scope of component-target models by encapsulating the synergy between multiple components and their collective impact on a spectrum of clinical syndromes. Where component-target models map a single pathway, the “component-syndrome” model leverages a network approach, acknowledging that the therapeutic impact of TCM extends across a network of biological pathways.
6.1.1. Core principles
We base the “component-syndrome” model on the principle of molecular similarity, positing that structurally similar molecules exhibit similar biological activities. Given that TCM formulations contain a variety of chemical components, their interactions and synergistic effects are crucial for the overall therapeutic effectiveness. We start by constructing a “component-component” topological network, grouping chemically similar components, and then expand to a “component-target-syndrome” network analysis, exploring how each chemical component interacts with specific biological targets and affects clinical syndromes. Ultimately, we develop the “component-syndrome” end-to-end model, moving beyond the analysis of single chemical components and focusing on how components are associated with specific clinical syndromes. This reveals at a macroscopic level how TCM components collectively act on biological pathways and influence syndromes.
6.1.2. Main algorithms
GNNs are central to constructing the “component-syndrome” model, ideal for managing complex network structures in TCM research. GNNs effectively capture the complex relationships between chemical component nodes and their interactions within the network. For instance, Lee et al. [244] demonstrated the application of GNNs in establishing mapping relationships between molecular structures and specific attributes, such as odor. Similarly, Gautam et al. [245] utilized GNNs in their QSAR model to predict the blood-brain barrier permeability of metabolites produced by humans and microbiomes. These case studies highlight the strong potential of GNNs in deciphering complex chemical data and predicting compound properties, further proving their value in constructing “component-component” and “component-syndrome” networks. Therefore, GNNs are an ideal tool to help deepen our understanding of the complexity and holistic effectiveness of TCM.
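To make the idea of a GNN operating on molecular graphs more concrete, the following is a minimal sketch of a single message-passing step followed by a graph-level readout. It is not the architecture used in the cited studies: the mean aggregation, the fixed weight matrices, and the toy three-atom graph are assumptions chosen for illustration, and a real model would learn its weights from data.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a small dense matrix by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def message_passing_layer(
    features: Dict[int, Vector],
    adjacency: Dict[int, List[int]],
    w_self: Matrix,
    w_nbr: Matrix,
) -> Dict[int, Vector]:
    """One layer: h_v' = ReLU(W_self h_v + W_nbr * mean of neighbor features)."""
    updated = {}
    for node, h in features.items():
        neighbors = adjacency.get(node, [])
        if neighbors:
            mean_nbr = [
                sum(features[n][i] for n in neighbors) / len(neighbors)
                for i in range(len(h))
            ]
        else:
            mean_nbr = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_nbr, mean_nbr))]
        updated[node] = [max(0.0, value) for value in z]  # ReLU
    return updated


def readout(features: Dict[int, Vector]) -> Vector:
    """Sum node vectors into one graph-level vector (order-independent)."""
    dim = len(next(iter(features.values())))
    return [sum(h[i] for h in features.values()) for i in range(dim)]


# Hypothetical 3-atom chain 0-1-2 with 2-dimensional atom features.
node_features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
bonds = {0: [1], 1: [0, 2], 2: [1]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

hidden = message_passing_layer(node_features, bonds, identity, identity)
graph_vector = readout(hidden)
print(hidden, graph_vector)
```

The graph-level vector would normally be passed to a prediction head (for example, a property or activity classifier), and several such layers would be stacked so that information propagates beyond immediate neighbors.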
6.1.3. Technical steps
(1) High-throughput screening and identification: Using techniques like LC-MS to obtain detailed spectral data is crucial for identifying various chemical components. Molecular network topology and structural similarity fingerprinting are used for pattern recognition and component identification in the obtained spectral data [246], aiding in accurately identifying key chemical components from complex data.
(2) Data-driven component clustering: Utilize cheminformatics and GNNs to cluster identified chemical constituents based on molecular similarity. The key here is using GNNs to decipher complex interactions between molecules and to cluster constituents based on these interactions. Each resulting component represents a group of chemically similar constituents, laying the foundation for subsequent network analysis.
(3) Component interaction and activity prediction: Use a “component-target-syndrome” network to link TCM syndromes [134], and use GNNs to predict component interactions and pharmacological activities. Combining known pharmacological data and biomarker information, the “component-syndrome” end-to-end data-driven model can predict the potential therapeutic activity of each component, furthering our understanding of how they collectively act on specific biological pathways and clinical syndromes.
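Steps (2) and (3) above can be sketched in a deliberately simplified form. The example below replaces the GNN with two plain graph operations: it groups constituents into components by thresholding precomputed pairwise similarities (connected components via union-find), and it then counts constituent-target-syndrome paths to obtain a crude component-syndrome association. All constituent names, targets, syndromes, similarity values, and the threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_constituents(
    names: Iterable[str],
    similarity: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Group constituents whose pairwise similarity reaches the threshold."""
    parent = {name: name for name in names}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in parent:
        groups[find(name)].append(name)
    return sorted(sorted(group) for group in groups.values())


def component_syndrome_scores(
    components: List[List[str]],
    constituent_targets: Dict[str, List[str]],
    target_syndromes: Dict[str, List[str]],
) -> List[Dict[str, int]]:
    """Count constituent -> target -> syndrome paths for each component."""
    scores = []
    for members in components:
        counts = defaultdict(int)
        for member in members:
            for target in constituent_targets.get(member, ()):
                for syndrome in target_syndromes.get(target, ()):
                    counts[syndrome] += 1
        scores.append(dict(counts))
    return scores


# Hypothetical toy inputs.
names = ["sap_1", "sap_2", "flav_1", "flav_2"]
similarity = {
    ("sap_1", "sap_2"): 0.8,
    ("flav_1", "flav_2"): 0.7,
    ("sap_1", "flav_1"): 0.1,
}
constituent_targets = {"sap_1": ["T1"], "sap_2": ["T1", "T2"], "flav_1": ["T3"]}
target_syndromes = {"T1": ["syndrome_A"], "T2": ["syndrome_B"], "T3": ["syndrome_B"]}

components = cluster_constituents(names, similarity, threshold=0.6)
scores = component_syndrome_scores(components, constituent_targets, target_syndromes)
for members, score in zip(components, scores):
    print(members, score)
```

Path counting is only a stand-in for the learned component-syndrome mapping; it shows the data flow (constituents, components, targets, syndromes) rather than a predictive model.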
6.2. Oral formulations IVIVC mechanistic model
A key challenge in bottom-up analysis of TCM is accurately predicting and understanding the complex mechanisms of oral TCM formulations within the human body. The multi-component nature of TCM results in nonlinear and multidimensional complexities in their therapeutic effects, with traditional research methods often failing to capture the dynamic interactions among various components and their comprehensive interaction with biological systems. Therefore, we propose the application of AI technology, particularly mechanistic models, to overcome these challenges.
As shown in Fig. 12, AI mechanistic models are applied here to optimize IVIVC studies for oral TCM formulations. This includes simulating processes like drug dilution and ADME, which often overlap and interact spatially, surpassing simple independent or linear relationships. Especially for multi-component systems like TCM, considering factors like molecular structure, self-assembly, macroscopic movement within the gastrointestinal tract, PK, and interactions between organ systems necessitates an advanced model that can integrate all these aspects.
Using AI to construct and optimize these mechanistic models deepens our understanding of TCM component behavior and their interactions with complex biological pathways in the human body. This understanding is vital for predicting clinical effects, designing appropriate dosages, and minimizing adverse reactions. Therefore, optimizing IVIVC studies with AI not only enhances the accuracy of efficacy assessment and safety evaluation of TCM but also provides scientific support for the modernization and international development of TCM. The following sections will provide a complete research guidance plan for optimizing the mechanistic model of oral drug IVIVC, covering core principles, main algorithms, and technical steps.
6.2.1. Core principles
A major challenge is combining dissolution and absorption models to simulate the ongoing dynamic behavior of oral drugs in the gastrointestinal tract [247]. Therefore, we propose an IVIVC optimization method for oral drugs, centering on the application of the PB-PK model. This model, based on a deep understanding of biology, predicts drug behavior in the human body, especially in the context of oral administration, and has shown great potential in new drug development [248]. The application of the PB-PK model is particularly important for multi-component systems like TCM. The complexity of TCM arises not only from its diverse chemical components but also from their intricate interactions within the body. The PB-PK model allows us to consider the molecular structure of drugs, their self-assembly characteristics, macroscopic movement in the gastrointestinal tract, PK properties, and interactions between different organ systems. This comprehensive model helps us more accurately predict drug behavior in the body, especially for complex TCM formulations.
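The coupling of dissolution and absorption can be illustrated with a heavily simplified compartmental sketch. This is not a full PB-PK model: it assumes a single constituent, first-order dissolution, absorption, and elimination, and three lumped compartments (undissolved drug in the gut, dissolved drug in the gut, and plasma). The rate constants, dose, and time units are hypothetical, and the explicit Euler scheme is chosen for readability rather than accuracy.

```python
from typing import List, Tuple


def simulate_oral_dose(
    dose: float,
    k_diss: float,
    k_abs: float,
    k_elim: float,
    t_end: float,
    dt: float = 0.01,
) -> Tuple[List[float], List[float], Tuple[float, float, float, float]]:
    """Simulate drug amounts after an oral dose with explicit Euler steps.

    Returns the time points, the amount in plasma at each time point, and
    the final amounts (solid, dissolved, plasma, eliminated).
    """
    solid, dissolved, plasma, eliminated = dose, 0.0, 0.0, 0.0
    times, plasma_curve = [0.0], [0.0]
    for step in range(1, int(round(t_end / dt)) + 1):
        d_diss = k_diss * solid * dt      # solid -> dissolved in the gut
        d_abs = k_abs * dissolved * dt    # dissolved -> plasma
        d_elim = k_elim * plasma * dt     # plasma -> eliminated
        solid -= d_diss
        dissolved += d_diss - d_abs
        plasma += d_abs - d_elim
        eliminated += d_elim
        times.append(step * dt)
        plasma_curve.append(plasma)
    return times, plasma_curve, (solid, dissolved, plasma, eliminated)


# Hypothetical comparison: fast versus slow dissolution of the same dose.
times, fast_curve, fast_final = simulate_oral_dose(
    100.0, k_diss=2.0, k_abs=1.0, k_elim=0.3, t_end=24.0
)
_, slow_curve, slow_final = simulate_oral_dose(
    100.0, k_diss=0.2, k_abs=1.0, k_elim=0.3, t_end=24.0
)

fast_peak, slow_peak = max(fast_curve), max(slow_curve)
fast_tmax = times[fast_curve.index(fast_peak)]
slow_tmax = times[slow_curve.index(slow_peak)]
print(f"fast dissolution: peak {fast_peak:.1f} at t = {fast_tmax:.2f}")
print(f"slow dissolution: peak {slow_peak:.1f} at t = {slow_tmax:.2f}")
```

Even this toy model shows why dissolution and absorption cannot be treated independently: slowing dissolution lowers and delays the plasma peak. A PB-PK model would replace the lumped compartments with physiologically parameterized tissues and add the multi-component interactions discussed in the text.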
6.2.2. Main algorithms
Molecular dynamics simulation: This computational method simulates the trajectories of molecules over time by calculating the forces between interacting particles and the resulting movements. In IVIVC studies, molecular dynamics simulation can predict the behavior of drug molecules in specific environments in the body, such as the dissolution and absorption processes in the gastrointestinal tract. This simulation helps understand how the molecular structure of drugs affects their bioavailability and PK properties, particularly when considering complex multi-component systems of TCM.
Bayesian algorithms: These statistical methods update the probability of hypotheses or parameters using prior knowledge and new data. In IVIVC models, Bayesian algorithms can be used to optimize model parameters, especially in situations with scarce data or high uncertainty. By combining prior knowledge (such as known drug properties) and new experimental data, model parameters can be estimated more accurately. Bayesian methods also allow for the quantification of uncertainty, providing credible intervals for model predictions, which is especially useful for clinical decision-making.
Used together, the two methods are complementary: molecular dynamics simulation provides a microscopic view of drug behavior, while Bayesian algorithms help optimize and validate the model’s predictive capabilities at a macroscopic level. This integrated approach is particularly important for the complex multi-component systems of TCM, allowing a more comprehensive assessment of the IVIVC properties of drugs and providing stronger scientific support for the effectiveness and safety of TCM oral formulations.
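The Bayesian parameter update described above can be sketched for a single parameter. The example assumes a one-compartment model with first-order elimination, C(t) = C0 exp(-kt), Gaussian measurement noise with known standard deviation, and a grid approximation of the posterior over the elimination rate constant k. The observations, prior, noise level, and grid are hypothetical values chosen for illustration.

```python
import math
from typing import List, Tuple


def grid_posterior(
    times: List[float],
    observed: List[float],
    c0: float,
    sigma: float,
    k_grid: List[float],
    prior: List[float],
) -> List[float]:
    """Posterior over k on a grid, for C(t) = c0 * exp(-k * t) plus Gaussian noise."""
    log_post = []
    for k, p in zip(k_grid, prior):
        log_lik = 0.0
        for t, y in zip(times, observed):
            predicted = c0 * math.exp(-k * t)
            log_lik += -0.5 * ((y - predicted) / sigma) ** 2
        log_post.append(math.log(p) + log_lik)
    peak = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def credible_interval(
    k_grid: List[float], posterior: List[float], mass: float = 0.95
) -> Tuple[float, float]:
    """Equal-tailed credible interval read off the cumulative posterior."""
    tail = (1.0 - mass) / 2.0
    cumulative = 0.0
    lower, upper = k_grid[0], k_grid[-1]
    found_lower = False
    for k, p in zip(k_grid, posterior):
        cumulative += p
        if not found_lower and cumulative >= tail:
            lower, found_lower = k, True
        if cumulative >= 1.0 - tail:
            upper = k
            break
    return lower, upper


# Hypothetical observations, roughly consistent with k near 0.3.
times = [1.0, 2.0, 4.0, 8.0]
observed = [7.5, 5.4, 3.1, 0.9]

k_grid = [0.001 * i for i in range(1, 1001)]
# Hypothetical prior knowledge: k is believed to be around 0.5 (sd 0.3).
prior = [math.exp(-0.5 * ((k - 0.5) / 0.3) ** 2) for k in k_grid]

posterior = grid_posterior(times, observed, c0=10.0, sigma=0.2,
                           k_grid=k_grid, prior=prior)
post_mean = sum(k * p for k, p in zip(k_grid, posterior))
lower, upper = credible_interval(k_grid, posterior)
print(f"posterior mean k = {post_mean:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

With only four observations the data already dominate the broad prior; with scarcer or noisier data the prior would carry more weight, which is the situation the text highlights. Models with many parameters would need sampling or variational methods rather than a grid.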
6.2.3. Technical steps
(1) Dynamic dissolution-absorption simulation: This step employs molecular dynamics simulation to predict how oral drugs dissolve and are absorbed in the gastrointestinal tract. The simulation considers the interacting physiological variables involved in drug dissolution, such as fluctuations in pH, enzyme activity, and concentrations of bile salts and phospholipids. For instance, our previous research elucidated the mechanism of action of bile salts and phospholipids on the permeability enhancement of discontinuous saponin components in the gastrointestinal environment, improving the understanding of drug absorption mechanisms [249].
(2) Multi-scale model integration: This approach integrates models at various scales to capture the behavior of multi-component TCM mixtures in the body, from the self-assembly details of drug molecules to their macroscopic movement in the gastrointestinal tract. For example, our prior research indicated that Astragalus polysaccharides could improve the biopharmaceutical properties of saponin components [250]. Multi-scale models not only reveal the self-assembly process of drug molecules but also simulate their movement in the gastrointestinal tract, which is key to understanding how TCM forms aggregates and nanostructures there, and thus to optimizing drug release curves and guiding clinical dosage design.
(3) PB-PK model optimization: This step enhances predictive accuracy using Bayesian networks and other ML techniques. These methods allow for a comprehensive simulation of drug ADME, considering spatial interactions and overlaps between these processes [251]. They can also integrate data from different laboratories and clinical trials and automatically adjust model parameters, improving the model’s applicability in different populations. Bayesian networks offer particular insight when handling drug molecular structure, macroscopic movement in the gastrointestinal tract, PK properties, and interactions between organ systems.
7. Conclusions
This study establishes a robust methodological framework that integrates AI with TCM through the SFDM, aligning with the complexity of TCM and the analytical precision of AI. It begins by elucidating the complexities of TCM’s holistic material basis and systematically unveils TCM’s research framework using a systems theory-guided top-down approach. The narrative highlights TCM’s holistic nature and demonstrates AI’s transformative impact in decoding the complex interrelations within TCM formulations. This includes a detailed analysis of active substances, biopharmaceutical regulatory substances, drug combinations, and self-assembly mechanisms explored from a bottom-up perspective. We emphasize the importance of viewing “components”—groups of structurally and functionally similar constituents—as fundamental units to articulate the entirety of TCM material basis. The introduction of the “component structure theory” addresses key aspects to consider during research, such as chemical structure, stoichiometric relationships, and aggregate states of components.
Moreover, AI technologies like ML, NLP, and computer vision are pivotal in advancing the research on TCM’s material basis. In the realm of quantitative analysis, AI enables precise identification of medicinal material sources, classification of chemical components, and analysis of biopharmaceutical properties. In inference analysis, AI’s capabilities extend to delineating efficacious substances and their mechanisms of action, analyzing interactions from a systems biology perspective, and discovering new indications and therapeutic potentials. The study introduces two advanced AI-driven models that underscore the significant role of AI in TCM research: the “component-syndrome” model, an end-to-end data-driven approach that integrates and analyzes complex datasets to predict syndrome-component correlations, and the IVIVC model for oral formulations, which mechanistically models the dissolution, absorption, and efficacy of TCM components, providing a predictive framework for their clinical efficacy.
Ultimately, this research propels forward the integration of traditional medicinal knowledge with modern computational techniques, laying a methodological foundation for future endeavors at the intersection of AI and TCM. This balanced approach promises innovative advancements that merge traditional insights with scientific inquiry, underscoring AI’s potential to enrich the TCM field and marking a step towards its enhanced scientific understanding and broader application.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (82230117). The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.
Compliance with ethics guidelines
Jingqi Zeng and Xiaobin Jia declare that they have no conflict of interest or financial conflicts to disclose.