Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges

Liyuan Chen , Shuoling Liu , Jiangpeng Yan , Xiaoyu Wang , Henglin Liu , Chuang Li , Kecheng Jiao , Jixuan Ying , Yang Veronica Liu , Qiang Yang , Xiu Li

Engineering ›› 202511029. DOI: 10.1016/j.eng.2025.11.029

Review

Abstract

The advent of foundation models (FMs), large-scale pre-trained models with strong generalization capabilities, has opened new frontiers for financial engineering. While general-purpose FMs such as GPT-4 and Gemini have demonstrated promising performance in tasks ranging from financial report summarization to sentiment-aware forecasting, many financial applications remain constrained by unique domain requirements such as multimodal reasoning, regulatory compliance, and data privacy. These challenges have spurred the emergence of financial foundation models (FFMs): a new class of models explicitly designed for finance. This survey presents a comprehensive overview of FFMs, with a taxonomy spanning three key modalities: financial language foundation models (FinLFMs), financial time-series foundation models (FinTSFMs), and financial visual-language foundation models (FinVLFMs). We review their architectures, training methodologies, datasets, and real-world applications. Furthermore, we identify critical challenges related to data availability, algorithmic scalability, and infrastructure constraints, and offer insights into future research opportunities. We hope this survey can serve as both a comprehensive reference for understanding FFMs and a practical roadmap for future innovation.

Keywords

Foundation models / Financial engineering / Artificial intelligence / Multimodal models

Cite this article

Liyuan Chen, Shuoling Liu, Jiangpeng Yan, Xiaoyu Wang, Henglin Liu, Chuang Li, Kecheng Jiao, Jixuan Ying, Yang Veronica Liu, Qiang Yang, Xiu Li. Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges. Engineering 202511029. DOI: 10.1016/j.eng.2025.11.029


