[1] Pan Y. 2018 special issue on artificial intelligence 2.0: theories and applications. Front Inform Technol Electron Eng 2018; 19(1):1-2.
[2] Lyu YG. Artificial intelligence: enabling technology to empower society. Engineering 2020; 6(3):205-206.
[3] Lyu YG, Wu F. Toward a more general empowering artificial intelligence. Engineering 2023; 25:1-2.
[4] Lyu YG, Wu F. Further empowering humans in specific fields and rethinking AGI testing. Engineering 2024; 34:1-2.
[5] Li DF, Xu F. Synergizing knowledge graphs with large language models: a comprehensive review and future prospects. 2024. arXiv: 2407.18470.
[6] Pan S, Luo L, Wang Y, Chen C, Wang J, Wu X. Unifying large language models and knowledge graphs: a roadmap. IEEE Trans Knowl Data Eng 2024; 36(7):3580-3599.
[7] Kau A, He X, Nambissan A, Astudillo A, Yin H, Aryani A. Combining knowledge graphs and large language models. 2024. arXiv: 2407.06564.
[8] Yuan B, Chen Y, Zhang Y, Jiang W. Hide and seek in noise labels: noise-robust collaborative active learning with LLMs-powered assistance. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics; 2024 Aug 11–16; Bangkok, Thailand. Stroudsburg: Association for Computational Linguistics (ACL); 2024. p. 10977–11011.
[9] Hao Z, Jiang H, Jiang S, Ren J, Cao T. Hybrid SLM and LLM for edge-cloud collaborative inference. In: Proceedings of the Workshop on Edge and Mobile Foundation Models; 2024 Jun 3–7; Tokyo, Japan. New York City: Association for Computing Machinery (ACM); 2024. p. 36–41.
[10] Zhang K, Wang J, Ding N, Qi B, Hua E, Lv X, et al. Fast and slow generating: an empirical study on large and small language models collaborative decoding. 2024. arXiv: 2406.12295.
[11] McClenny LD, Braga-Neto UM. Self-adaptive physics-informed neural networks. J Comput Phys 2023; 474:111722.
[12] Sharma P, Chung WT, Akoush B, Ihme M. A review of physics-informed machine learning in fluid mechanics. Energies 2023; 16(5):2343.
[13] Zhou C, Liu P, Xu P, Iyer S, Sun J, Mao Y, et al. LIMA: less is more for alignment. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023.
[14] Akyürek E, Bolukbasi T, Liu F, Xiong B, Tenney I, Andreas J, et al. Towards tracing factual knowledge in language models back to the training data. 2022. arXiv: 2205.11482.
[15] Shen T, Mao Y, He P, Long G, Trischler A, Chen W. Exploiting structured knowledge in text via graph-guided representation learning. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2020 Nov 16–20; online. Stroudsburg: Association for Computational Linguistics (ACL); 2020. p. 8980–94.
[16] Zhang D, Yuan Z, Liu Y, Zhuang F, Chen H, Xiong H. E-BERT: a phrase and product knowledge enhanced language model for E-commerce. 2020. arXiv: 2009.02835.
[17] Tian H, Gao C, Xiao X, Liu H, He B, Wu H, et al. SKEP: sentiment knowledge enhanced pre-training for sentiment analysis. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics; 2020 Jul 5–10; online. Stroudsburg: Association for Computational Linguistics (ACL); 2020. p. 4067–76.
[18] Gao T. Knowledge authoring and question answering with KALM. 2019. arXiv: 1905.00840.
[19] Wang X, Gao T, Zhu Z, Zhang Z, Liu Z, Li J, et al. KEPLER: a unified model for knowledge embedding and pre-trained language representation. Trans Assoc Comput Linguist 2021; 9:176-194.
[20] Li S, Li X, Shang L, Sun C, Liu B, Ji Z, et al. Pre-training language models with deterministic factual knowledge. In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing; 2022 Dec 7–11; Abu Dhabi, UAE. Stroudsburg: Association for Computational Linguistics (ACL); 2022. p. 11118–31.
[21] Xiong W, Du J, Wang WY, Stoyanov V. Pretrained encyclopedia: weakly supervised knowledge-pretrained language model. 2019. arXiv: 1912.09637.
[22] Ji J, Wang K, Qiu T, Chen B, Zhou J, Li C, et al. Language models resist alignment. 2024. arXiv: 2406.06144.
[23] Zhang S, Dong L, Li X, Zhang S, Sun X, Wang S, et al. Instruction tuning for large language models: a survey. 2023. arXiv: 2308.10792.
[24] Gekhman Z, Yona G, Aharoni R, Eyal M, Feder A, Reichart R, et al. Does fine-tuning LLMs on new knowledge encourage hallucinations? 2024. arXiv: 2405.05904.
[25] Wang J, Huang W, Qiu M, Shi Q, Wang H, Li X, et al. Knowledge prompting in pre-trained language model for natural language understanding. 2022. arXiv: 2210.08536.
[26] Ye H, Zhang N, Deng S, Chen X, Xiong F, Chen X, et al. Ontology-enhanced prompt-tuning for few-shot learning. In: Proceedings of the ACM Web Conference 2022; 2022 Apr 25–29; online. New York City: Association for Computing Machinery (ACM); 2022. p. 778–87.
[27] Luo H, Tang Z, Peng S, Guo Y, Zhang W, Ma C, et al. ChatKBQA: a generate-then-retrieve framework for knowledge base question answering with fine-tuned large language models. 2023. arXiv: 2310.08975.
[28] Luo L, Li YF, Haffari G, Pan S. Reasoning on graphs: faithful and interpretable large language model reasoning. 2023. arXiv: 2310.01061.
[29] Ovadia O, Brief M, Mishaeli M, Elisha O. Fine-tuning or retrieval? Comparing knowledge injection in LLMs. 2023. arXiv: 2312.05934.
[30] Yang D, Rao J, Chen K, Guo X, Zhang Y, Yang J, et al. IM-RAG: multi-round retrieval-augmented generation through learning inner monologues. In: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval; 2024 Jul 14–18; Washington, DC, USA. New York City: Association for Computing Machinery (ACM); 2024. p. 730–40.
[31] Mussmann S, Ermon S. Learning and inference via maximum inner product search. In: Proceedings of the International Conference on Machine Learning; 2016 Jun 20–22; New York City, NY, USA. Seattle: Proceedings of Machine Learning Research; 2016. p. 2587–96.
[32] Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Proceedings of the 34th International Conference on Neural Information Processing Systems; 2020 Dec 6–12; Vancouver, BC, Canada. Red Hook: Curran Associates Inc.; 2020. p. 9459–74.
[33] Wu Y, Zhao Y, Hu B, Minervini P, Stenetorp P, Riedel S. An efficient memory-augmented transformer for knowledge-intensive NLP tasks. 2022. arXiv: 2210.16773.
[34] Guu K, Lee K, Tung Z, Pasupat P, Chang MW. REALM: retrieval augmented language model pre-training. In: Proceedings of the International Conference on Machine Learning; 2020 Jul 13–18; online. Seattle: Proceedings of Machine Learning Research; 2020. p. 3929–38.
[35] Logan R, Liu NF, Peters ME, Gardner M, Singh S. Barack’s wife Hillary: using knowledge graphs for fact-aware language modeling. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; 2019 Jul 28–Aug 2; Florence, Italy. Stroudsburg: Association for Computational Linguistics (ACL); 2019. p. 5962–71.
[36] Zhang Y, Li H, Zhang S, Wang R, He B, Dou H, et al. LLMCO4MR: LLMs-aided neural combinatorial optimization for ancient manuscript restoration from fragments with case studies on Dunhuang. In: Leonardis A, Ricci E, Roth S, Russakovsky O, Sattler T, Varol G, editors. Computer vision—ECCV 2024. Cham: Springer; 2024.
[37] Sun Y, Wang S, Feng S, Ding S, Pang C, Shang J, et al. ERNIE 3.0: large-scale knowledge enhanced pre-training for language understanding and generation. 2021. arXiv: 2107.02137.
[38] Sun T, Shao Y, Qiu X, Guo Q, Hu Y, Huang X, et al. CoLAKE: contextualized language and knowledge embedding. In: Proceedings of the 28th International Conference on Computational Linguistics; 2020 Dec 8–13; Barcelona, Spain. Stroudsburg: Association for Computational Linguistics (ACL); 2020. p. 3660–70.
[39] Zhang T, Wang C, Hu N, Qiu M, Tang C, He X, et al. DKPLM: decomposable knowledge-enhanced pre-trained language model for natural language understanding. Proc Conf AAAI Artif Intell 2022; 36(10):11703-11711.
[40] Yu W, Zhu C, Fang Y, Yu D, Wang S, Xu Y, et al. Dict-BERT: enhancing language model pre-training with dictionary. 2021. arXiv: 2110.06490.
[41] Li S, Gao Y, Jiang H, Yin Q, Li Z, Yan X, et al. Graph reasoning for question answering with triplet retrieval. 2023. arXiv: 2305.18742.
[42] Luo L, Ju J, Xiong B, Li YF, Haffari G, Pan S. ChatRule: mining logical rules with large language models for knowledge graph reasoning. 2023. arXiv: 2309.01538.
[43] Wang J, Sun Q, Chen N, Li X, Gao M. Boosting language models reasoning with chain-of-knowledge prompting. 2023. arXiv: 2306.06427.
[44] Shazeer N, Mirhoseini A, Maziarz K, Davis A, Le Q, Hinton G, et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. 2017. arXiv: 1701.06538.
[45] Wei J, Wang X, Schuurmans D, Bosma M, Ichter B, Xia F, et al. Chain-of-thought prompting elicits reasoning in large language models. In: Proceedings of the 36th International Conference on Neural Information Processing Systems; 2022 Nov 28–Dec 9; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2022.
[46] Kraaijenbrink J, Wijnhoven F. Managing heterogeneous knowledge: a theory of external knowledge integration. Knowl Manag Res Pract 2008; 6(4):274-286.
[47] Dogan A, Birant D. A weighted majority voting ensemble approach for classification. In: Proceedings of the 2019 4th International Conference on Computer Science and Engineering (UBMK); 2019 Sep 11–15; Samsun, Türkiye. New York City: IEEE; 2019. p. 1–6.
[48] Kwon H, Park J, Lee Y. Stacking ensemble technique for classifying breast cancer. Healthc Inform Res 2019; 25(4):283-288.
[49] Du N, Huang Y, Dai AM, Tong S, Lepikhin D, Xu Y, et al. GLaM: efficient scaling of language models with mixture-of-experts. In: Proceedings of the 39th International Conference on Machine Learning; 2022 Jul 17–23; Baltimore, MD, USA. Seattle: Proceedings of Machine Learning Research; 2022. p. 5547–69.
[50] Wang K, Xu Y, Wu Z, Luo S. LLM as prompter: low-resource inductive reasoning on arbitrary knowledge graphs. 2024. arXiv: 2402.11804.
[51] Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, et al. Llama 2: open foundation and fine-tuned chat models. 2023. arXiv: 2307.09288.
[52] Nayak A, Timmapathini HP. LLM2KB: constructing knowledge bases using instruction tuned context aware large language models. 2023. arXiv: 2308.13207.
[53] Wang H, Li R, Jiang H, Tian J, Wang Z, Luo C, et al. BlendFilter: advancing retrieval-augmented large language models via query generation blending and knowledge filtering. 2024. arXiv: 2402.11129.
[54] Parisi A, Zhao Y, Fiedel N. TALM: tool augmented language models. 2022. arXiv: 2205.12255.
[55] Schick T, Dwivedi-Yu J, Raileanu R, Lomeli M, Hambro E, et al. Toolformer: language models can teach themselves to use tools. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023.
[56] Shen Y, Song K, Tan X, Li D, Lu W, Zhuang Y. HuggingGPT: solving AI tasks with ChatGPT and its friends in Hugging Face. 2023. arXiv: 2303.17580.
[57] Yao S, Yu D, Zhao J, Shafran I, Griffiths T, Cao Y, et al. Tree of thoughts: deliberate problem solving with large language models. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023.
[58] Besta M, Blach N, Kubicek A, Gerstenberger R, Podstawski M, Gianinazzi L, et al. Graph of thoughts: solving elaborate problems with large language models. 2024. arXiv: 2308.09687.
[59] Rabby G, Auer S, D’Souza J, Oelen A. Fine-tuning and prompt engineering with cognitive knowledge graphs for scholarly knowledge organization. 2024. arXiv: 2409.06433.
[60] Ein-Dor L, Toledo-Ronen O, Spector A, Gretz S, Dankin L, Halfon A, et al. Conversational prompt engineering. 2024. arXiv: 2408.04560.
[61] Yu Z, Ouyang X, Shao Z, Wang M, Yu J. Prophet: prompting large language models with complementary answer heuristics for knowledge-based visual question answering. 2023. arXiv: 2303.01903.
[62] Lu X, Liao Y, Liu C, Lio P, Hui P. Heterogeneous model fusion federated learning mechanism based on model mapping. IEEE Internet Things J 2022; 9(8):6058-6068.
[63] Wu TH, Lian L, Gonzalez JE, Li B, Darrell T. Self-correcting LLM-controlled diffusion models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024 Jun 16–22; Seattle, WA, USA. New York City: IEEE; 2024. p. 6327–36.
[64] Wang Y, Zhu S, Fu F, Miao X, Zhang J, Zhu J, et al. Efficient multi-task large model training via data heterogeneity-aware model management. 2024. arXiv: 2409.03365.
[65] Sachin DN, Annappa B, Hegde S, Abhijit CS, Ambesange S. FedCure: a heterogeneity-aware personalized federated learning framework for intelligent healthcare applications in IoMT environments. IEEE Access 2024; 12:15867-15883.
[66] Haller M, Lenz C, Nachtigall R, Awaysheh FM, Alawadi S. Handling non-IID data in federated learning: an experimental evaluation towards unified metrics. In: Proceedings of the 2023 IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC); 2023 Nov 14–17; Abu Dhabi, UAE. New York City: IEEE; 2023. p. 0762–70.
[67] Ding K, Dong X, He Y, Cheng L, Fu C, Huan Z, et al. MSSM: a multiple-level sparse sharing model for efficient multi-task learning. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval; 2021 Jul 11–15; online. New York City: Association for Computing Machinery (ACM); 2021. p. 2237–41.
[68] Wang Z, Panda R, Karlinsky L, Feris R, Sun H, Kim Y. Multitask prompt tuning enables parameter-efficient transfer learning. 2023. arXiv: 2303.02861.
[69] Zhang W, Zhai G, Wei Y, Yang X, Ma K. Blind image quality assessment via vision-language correspondence: a multitask learning perspective. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023 Jun 18–22; Vancouver, BC, Canada. New York City: IEEE; 2023. p. 14071–81.
[70] Chen Q, Chen X, Wang J, Zhang S, Yao K, Feng H, et al. Group DETR: fast DETR training with group-wise one-to-many assignment. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023 Oct 2–6; Paris, France. New York City: IEEE; 2023. p. 6633–42.
[71] Ghosh A, Chung J, Yin D, Ramchandran K. An efficient framework for clustered federated learning. In: Proceedings of the 34th International Conference on Neural Information Processing Systems; 2020 Dec 6–12; Vancouver, BC, Canada. Red Hook: Curran Associates Inc.; 2020.
[72] Ye R, Wang W, Chai J, Li D, Li Z, Xu Y, et al. OpenFedLLM: training large language models on decentralized private data via federated learning. In: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2024 Aug 25–29; Barcelona, Spain. New York City: Association for Computing Machinery (ACM); 2024. p. 6137–47.
[73] Yang C, An Z, Zhou H, Zhuang F, Xu Y, Zhang Q. Online knowledge distillation via mutual contrastive learning for visual recognition. IEEE Trans Pattern Anal Mach Intell 2023; 45(8):10212-10227.
[74] Ni J, Tang H, Shang Y, Duan B, Yan Y. Adaptive cross-architecture mutual knowledge distillation. In: Proceedings of the 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG); 2024 May 27–31; Istanbul, Türkiye. New York City: IEEE; 2024. p. 1–5.
[75] Miao Z, Zhang W, Su J, Li X, Luan J, Chen Y, et al. Exploring all-in-one knowledge distillation framework for neural machine translation. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing; 2023 Dec 6–10; Singapore. Stroudsburg: Association for Computational Linguistics (ACL); 2023. p. 2929–40.
[76] Zhao J, Zhao W, Drozdov A, Rozonoyer B, Sultan MA, Lee JY, et al. Multistage collaborative knowledge distillation from a large language model for semi-supervised sequence generation. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics; 2024 Aug 11–16; Bangkok, Thailand. Stroudsburg: Association for Computational Linguistics (ACL); 2024. p. 14201–14.
[77] Starodubcev N, Fedorov A, Babenko A, Baranchuk D. Your student is better than expected: adaptive teacher–student collaboration for text-conditional diffusion models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024 Jun 16–22; Seattle, WA, USA. New York City: IEEE; 2024. p. 9275–85.
[78] Shao J, Wu F, Zhang J. Selective knowledge sharing for privacy-preserving federated distillation without a good teacher. Nat Commun 2024; 15:349.
[79] Wan F, Huang X, Cai D, Quan X, Bi W, Shi S. Knowledge fusion of large language models. 2024. arXiv: 2401.10491.
[80] Wang Y, Agarwal S, Mukherjee S, Liu X, Gao J, Awadallah AH, et al. AdaMix: mixture-of-adaptations for parameter-efficient model tuning. 2022. arXiv: 2205.12410.
[81] Wortsman M, Ilharco G, Gadre SY, Roelofs R, Gontijo-Lopes R, Morcos AS, et al. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. In: Proceedings of the International Conference on Machine Learning; 2022 Jul 17–23; Baltimore, MD, USA. Seattle: Proceedings of Machine Learning Research; 2022. p. 23965–98.
[82] Arpit D, Wang H, Zhou Y, Xiong C. Ensemble of averages: improving model selection and boosting performance in domain generalization. 2022. arXiv: 2110.10832.
[83] Jin X, Ren X, Preotiuc-Pietro D, Cheng P. Dataless knowledge fusion by merging weights of language models. 2022. arXiv: 2212.09849.
[84] Perin G, Chen X, Liu S, Kailkhura B, Wang Z, Gallagher B. RankMean: module-level importance score for merging fine-tuned LLM models. In: Findings of the Association for Computational Linguistics: ACL 2024; 2024 Aug 11–16; Bangkok, Thailand. Stroudsburg: Association for Computational Linguistics (ACL); 2024. p. 1776–82.
[85] Yu L, Bi K, Ni S, Guo J. Contextual dual learning algorithm with listwise distillation for unbiased learning to rank. 2024. arXiv: 2408.09817.
[86] Park S, Van Hentenryck P. Self-supervised primal-dual learning for constrained optimization. Proc Conf AAAI Artif Intell 2023; 37(4):4052-4060.
[87] Fei H, Wu S, Ren Y, Zhang M. Matching structure for dual learning. In: Proceedings of the International Conference on Machine Learning; 2022 Jul 17–23; Baltimore, MD, USA. Seattle: Proceedings of Machine Learning Research; 2022. p. 6373–91.
[88] Ji W, Wang R, Tian Y, Wang X. An attention based dual learning approach for video captioning. Appl Soft Comput 2022; 117:108332.
[89] Li J, Xia Y, Yan R, Sun H, Zhao D, Liu T, et al. Stylized dialogue generation with multi-pass dual learning. In: Proceedings of the 35th International Conference on Neural Information Processing Systems; 2021 Dec 6–14; online. Red Hook: Curran Associates Inc.; 2021.
[90] Chen A, Lou L, Chen K, Bai X, Xiang Y, Yang M, et al. DUAL-REFLECT: enhancing large language models for reflective translation through dual learning feedback mechanisms. 2024. arXiv: 2406.07232.
[91] Dong J, Zhang M, Zhang Z, Chen X, Liu D, Qu X, et al. Dual learning with dynamic knowledge distillation for partially relevant video retrieval. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023 Oct 2–6; Paris, France. New York City: IEEE; 2023. p. 11302–12.
[92] Wang Y, Sun T, Li S, Yuan X, Ni W, Hossain E, et al. Adversarial attacks and defenses in machine learning-empowered communication systems and networks: a contemporary survey. IEEE Commun Surv Tutor 2023; 25(4):2245-2298.
[93] Cheng P, Yang Y, Li J, Dai Y, Hu T, Cao P, et al. Adversarial preference optimization: enhancing your alignment via RM-LLM game. In: Findings of the Association for Computational Linguistics: ACL 2024; 2024 Aug 11–16; Bangkok, Thailand. Stroudsburg: Association for Computational Linguistics (ACL); 2024. p. 3705–16.
[94] Tan K, Luo K, Lan Y, Yuan Z, Shu J. An LLM-enhanced adversarial editing system for lexical simplification. 2024. arXiv: 2402.14704.
[95] Sheshadri A, Ewart A, Guo P, Lynch A, Wu C, Hebbar V, et al. Targeted latent adversarial training improves robustness to persistent harmful behaviors in LLMs. 2024. arXiv: 2407.15549.
[96] Hu X, Chen PY, Ho TY. RADAR: robust AI-text detection via adversarial learning. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023. p. 15077–95.
[97] Thota S, Vangoor VKR, Reddy AK, Ravi CS. Federated learning: privacy-preserving collaborative machine learning. DLBSAR 2019; 5:168-190.
[98] Goddard C, Siriwardhana S, Ehghaghi M, Meyers L, Karpukhin V, Benedict B, et al. Arcee’s MergeKit: a toolkit for merging large language models. 2024. arXiv: 2403.13257.
[99] Yang E, Wang Z, Shen L, Liu S, Guo G, Wang X, et al. AdaMerging: adaptive model merging for multi-task learning. 2024. arXiv: 2310.02575.
[100] Matena M, Raffel C. Merging models with Fisher-weighted averaging. In: Proceedings of the 36th International Conference on Neural Information Processing Systems; 2022 Nov 28–Dec 9; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2022. p. 17703–16.
[101] Yadav P, Tam D, Choshen L, Raffel C, Bansal M. TIES-Merging: resolving interference when merging models. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023.
[102] Yu L, Yu B, Yu H, Huang F, Li Y. Language models are Super Mario: absorbing abilities from homologous models as a free lunch. 2023. arXiv: 2311.03099.
[103] Lu Z, Fan C, Wei W, Qu X, Chen D, Cheng Y. Twin-merging: dynamic integration of modular expertise in model merging. 2024. arXiv: 2406.15479.
[104] Tang A, Shen L, Luo Y, Yin N, Zhang L, Tao D. Merging multi-task models via weight-ensembling mixture of experts. 2024. arXiv: 2402.00433.
[105] Yang E, Shen L, Wang Z, Guo G, Chen X, Wang X, et al. Representation surgery for multi-task model merging. In: Proceedings of the 41st International Conference on Machine Learning; 2024 Jul 21–27; Vienna, Austria. Seattle: Proceedings of Machine Learning Research; 2024.
[106] Zhang J, Yang HF, Li A, Guo X, Wang P, Wang H, et al. MLLM-FL: multimodal large language model assisted federated learning on heterogeneous and long-tailed data. 2024. arXiv: 2409.06067.
[107] Bai J, Chen D, Qian B, Yao L, Li J. Federated fine-tuning of large language models under heterogeneous language tasks and client resources. 2024. arXiv: 2402.11505.
[108] Fan T, Ma G, Kang Y, Gu H, Song Y, Fan L, et al. FedMKT: federated mutual knowledge transfer for large and small language models. 2024. arXiv: 2406.02224.
[109] Li H, Zhao X, Guo D, Gu H, Zeng Z, Han Y, et al. Federated domain-specific knowledge transfer on large language models using synthetic data. 2024. arXiv: 2405.14212.
[110] Fan T, Kang Y, Chen W, Gu H, Song Y, Fan L, et al. PDSS: a privacy-preserving framework for step-by-step distillation of large language models. 2024. arXiv: 2406.12403.
[111] Gholami M, Akbari M, Hu C, Masrani V, Wang J, Zhang Y. GOLD: generalized knowledge distillation via out-of-distribution-guided language data generation. 2024. arXiv: 2403.19754.
[112] Li X, Fang Y, Liu M, Ling Z, Tu Z, Su H. Distilling large vision-language model with out-of-distribution generalizability. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023 Oct 2–6; Paris, France. New York City: IEEE; 2023. p. 2492–503.
[113] Agarwal R, Vieillard N, Zhou Y, Stanczyk P, Ramos S, Geist M, et al. On-policy distillation of language models: learning from self-generated mistakes. In: Proceedings of the Twelfth International Conference on Learning Representations; 2024 May 7–11; Vienna, Austria. London: ICLR; 2024.
[114] Chen Z, Wang W, Zhao Z, Su F, Men A, Meng H. PracticalDG: perturbation distillation on vision-language models for hybrid domain generalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024 Jun 16–22; Seattle, WA, USA. New York City: IEEE; 2024. p. 23501–11.
[115] Feng S, Sun H, Yan X, Zhu H, Zou Z, Shen S, et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 2023; 615(7953):620-627.
[116] Bi K, Xie L, Zhang H, Chen X, Gu X, Tian Q. Accurate medium-range global weather forecasting with 3D neural networks. Nature 2023; 619(7970):533-538.
[117] Chen K, Han T, Gong J, Bai L, Ling F, Luo JJ, et al. FengWu: pushing the skillful global medium-range weather forecast beyond 10 days lead. 2023. arXiv: 2304.02948.
[118] Zhong X, Chen L, Liu J, Lin C, Qi Y, Li H. FuXi-Extreme: improving extreme rainfall and wind forecasts with diffusion model. 2023. arXiv: 2310.19822.
[119] Yue M, Mifdal W, Zhang Y, Suh J, Yao Z. MathVC: an LLM-simulated multi-character virtual classroom for mathematics education. 2024. arXiv: 2404.06711.
[120] Müller J, Zeinhofer M. Achieving high accuracy with PINNs via energy natural gradient descent. In: Proceedings of the International Conference on Machine Learning; 2023 Jul 23–29; Honolulu, HI, USA. Seattle: Proceedings of Machine Learning Research; 2023.
[121] Aymerich E, Pisano F, Cannas B, Sias G, Fanni A, Gao Y, et al. Physics informed neural networks towards the real-time calculation of heat fluxes at W7-X. Nucl Mater Energy 2023; 34:101401.
[122] Yang K, Swope A, Gu A, Chalamala R, Song P, Yu S, et al. LeanDojo: theorem proving with retrieval-augmented language models. In: Proceedings of the 37th International Conference on Neural Information Processing Systems; 2023 Dec 10–16; New Orleans, LA, USA. Red Hook: Curran Associates Inc.; 2023.
[123] Zhan B. AUTO2, a saturation-based heuristic prover for higher-order logic. In: Proceedings of Interactive Theorem Proving; 2016 Aug 22–25; Nancy, France. Cham: Springer; 2016.
[124] Steen A, Sutcliffe G, Scholl T, Benzmüller C. Solving modal logic problems by translation to higher-order logic. In: Proceedings of the International Conference on Logic and Argumentation; 2023 Sep 10–12; Hangzhou, China. Cham: Springer; 2023. p. 25–43.
[125] Foulis DJ, Randall CH. The empirical logic approach to the physical sciences. In: Hartkämper A, Neumann H, editors. Foundations of quantum mechanics and ordered linear spaces: Advanced Study Institute, Marburg 1973. Berlin: Springer; 1974. p. 230–49.
[126] Xin H, Ren ZZ, Song J, Shao Z, Zhao W, Wang H, et al. DeepSeek-Prover-V1.5: harnessing proof assistant feedback for reinforcement learning and Monte-Carlo tree search. 2024. arXiv: 2408.08152.
[127] Zhou JP, Staats C, Li W, Szegedy C, Weinberger KQ, Wu Y. Don’t trust: verify-grounding LLM quantitative reasoning with autoformalization. 2024. arXiv: 2403.18120.
[128] Hong S, Zheng X, Chen J, Cheng Y, Wang J, Zhang C, et al. MetaGPT: meta programming for multi-agent collaborative framework. 2023. arXiv: 2308.00352.
[129] Qian C, Liu W, Liu H, Chen N, Dang Y, Li J, et al. ChatDev: communicative agents for software development. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics; 2024 Aug 11–16; Bangkok, Thailand. Stroudsburg: Association for Computational Linguistics (ACL); 2024. p. 15174–86.
[130] Fang Y, Zhang Q, Zhang N, Chen Z, Zhuang X, Shao X, et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat Mach Intell 2023; 5(5):542-553.
[131] Li Y, Cardoso-Silva J, Kelly JM, Delves MJ, Furnham N, Papageorgiou LG, et al. Optimisation-based modelling for explainable lead discovery in malaria. Artif Intell Med 2024; 147:102700.
[132] Poli M, Massaroli S, Nguyen E, Fu DY, Dao T, Baccus S, et al. Hyena hierarchy: towards larger convolutional language models. In: Proceedings of the International Conference on Machine Learning; 2023 Jul 23–29; Honolulu, HI, USA. Seattle: Proceedings of Machine Learning Research; 2023. p. 28043–78.
[133] Gu A, Dao T. Mamba: linear-time sequence modeling with selective state spaces. 2023. arXiv: 2312.00752.
[134] Sun Y, Dong L, Huang S, Ma S, Xia Y, Xue J, et al. Retentive network: a successor to transformer for large language models. 2023. arXiv: 2307.08621.
[135] Tang Z, Lv Z, Zhang S, Wu F, Kuang K. ModelGPT: unleashing LLM’s capabilities for tailored model generation. 2024. arXiv: 2402.12408.