大模型时代我国高密度算力安全及发展研究
Security and Development of High-Density Computing Power in China in the Era of Large Language Models
近年来,以大语言模型(LLMs)和多模态大模型(MLLMs)为代表的生成式人工智能取得了显著进展。这些超大规模和复杂的大模型对计算资源提出了极高的要求,推动了高密度算力发展的迫切需求。本文从人工智能大模型技术发展的视角,探讨了大模型开发阶段和计算优化技术及其对算力需求的特点。围绕大模型对算力的需求,进一步剖析高密度算力的内涵与特征、发展现状及关键组成,并识别了我国高密度算力发展面临的五大关键挑战,包括供应链安全风险、物理硬件层瓶颈、软件栈不完整及高度依赖性、算力功耗与能源安全、网络安全风险等方面。对此,研究提出了未来我国高密度算力的主要发展策略,包括强化产业链自主可控、坚持自研与标准并举、构建开放统一软硬件生态、完善绿色算力创新体系、优化“学研”一体化创新体系等,以为我国高密度算力安全及发展提供参考。
In recent years, generative artificial intelligence represented by large language models (LLMs) and multimodal large language models (MLLMs) has achieved remarkable progress. These ultra-large-scale and complex models impose extremely high demands on computational resources, driving an urgent need for the development of high-density computing power. From the perspective of the technological advancement of LLMs, this study explores the development stages of LLMs, computational optimization techniques, and the characteristics of their computing power requirements. Focusing on the demand of LLMs for computing power, the study further analyzes the connotation and characteristics of high-density computing power, its current development status, and key components, and identifies five major challenges facing the development of high-density computing power in China: supply chain security risks, bottlenecks at the physical hardware layer, incomplete software stacks with high external dependency, computing power consumption and energy security, and cybersecurity risks. In response, the study proposes strategies for the future development of high-density computing power in China, including strengthening the independence and controllability of the industrial chain, pursuing independent research and development alongside standardization, building an open and unified software-hardware ecosystem, improving the green computing power innovation system, and optimizing the integrated academia-research innovation system, so as to provide a reference for the security and development of high-density computing power in China.
高密度算力 / 智能算力 / 大模型 / 计算优化 / 安全发展
high-density computing power / intelligent computing power / large language model / computational optimization / safe development
[1] 新华网. 我国大模型数量超1500个 [EB/OL]. (2025-07-27)[2025-08-07]. http://www.news.cn/tech/20250727/97930c6826c147349fc068894ac6bb96/c.html.
[2] Xinhuanet. The number of large language models in China exceeds 1500 [EB/OL]. (2025-07-27)[2025-08-07]. http://www.news.cn/tech/20250727/97930c6826c147349fc068894ac6bb96/c.html.
[3] 2022中国算力大会. 中国算力白皮书(2022年) [R]. 济南: 中国算力大会, 2022.
[4] 2022 China Computational Power Conference. White paper on China's computing power [R]. Jinan: China Computational Power Conference, 2022.
[5] 中国政府网. 算力基础设施高质量发展行动计划 [EB/OL]. (2023-10-10)[2025-09-17]. https://www.gov.cn/zhengce/zhengceku/202310/P020231009520949915888.pdf.
[6] China Government Website. High-quality development action plan for computing power infrastructure [EB/OL]. (2023-10-10)[2025-09-17]. https://www.gov.cn/zhengce/zhengceku/202310/P020231009520949915888.pdf.
[7] OECD. A blueprint for building national compute capacity for artificial intelligence [EB/OL]. (2023-02-28)[2025-09-17]. https://www.oecd.org/content/dam/oecd/en/publications/reports/2023/02/a-blueprint-for-building-national-compute-capacity-for-artificial-intelligence_c22fbbee/876367e3-en.pdf.
[8] EPOCH AI. Trends in AI supercomputers [EB/OL]. (2025-04-23)[2025-09-17]. https://epoch.ai/blog/trends-in-ai-supercomputers.
[9] OECD. OECD framework for the classification of AI systems [EB/OL]. (2022-02-22)[2025-09-17]. https://www.oecd.org/content/dam/oecd/en/publications/reports/2022/02/oecd-framework-for-the-classification-of-ai-systems_336a8b57/cb6d9eca-en.pdf#page=21.40.
[10] Hu Q, Sun P, Zhang T. Understanding the workload characteristics of large language model development [EB/OL]. (2024-03-19)[2025-09-17]. https://www.usenix.org/publications/loginonline/understanding-workload-characteristics-large-language-model-development.
[11] Bahri Y, Dyer E, Kaplan J, et al. Explaining neural scaling laws [J]. Proceedings of the National Academy of Sciences, 2024, 121(27): e2311878121.
[12] Kaplan J, McCandlish S, Henighan T, et al. Scaling laws for neural language models [EB/OL]. (2020-01-23)[2025-07-16]. https://arxiv.org/abs/2001.08361.
[13] Hoffmann J, Borgeaud S, Mensch A, et al. Training compute-optimal large language models [EB/OL]. (2022-05-29)[2025-07-16]. https://arxiv.org/abs/2203.15556.
[14] Besiroglu T, Erdil E, Barnett M, et al. Chinchilla scaling: A replication attempt [EB/OL]. (2024-04-15)[2025-07-16]. https://arxiv.org/abs/2404.10102.
[15] Sardana N, Portes J, Doubov S, et al. Beyond chinchilla-optimal: Accounting for inference in language model scaling laws [EB/OL]. (2023-12-31)[2025-07-16]. https://arxiv.org/abs/2401.00448.
[16] DeepSeek-AI, Bi X, Chen D L, et al. DeepSeek LLM: Scaling open-source language models with longtermism [EB/OL]. (2024-01-05)[2025-07-16]. https://arxiv.org/abs/2401.02954.
[17] Hu S D, Tu Y G, Han X, et al. MiniCPM: Unveiling the potential of small language models with scalable training strategies [EB/OL]. (2024-04-09)[2025-07-16]. https://arxiv.org/abs/2404.06395.
[18] Meta. Llama 3 [EB/OL]. [2025-11-07]. https://www.llama.com/models/llama-3/.
[19] Snell C, Lee J, Xu K, et al. Scaling LLM test-time compute optimally can be more effective than scaling model parameters [EB/OL]. (2024-08-06)[2025-08-07]. https://arxiv.org/abs/2408.03314.
[20] Zhang Q Y, Lyu F Y, Sun Z X, et al. A survey on test-time scaling in large language models: What, how, where, and how well? [EB/OL]. (2025-03-31)[2025-08-07]. https://arxiv.org/abs/2503.24235.
[21] Lang J D, Guo Z H, Huang S Y. A comprehensive study on quantization techniques for large language models [R]. Xiamen: 2024 4th International Conference on Artificial Intelligence, Robotics, and Communication (ICAIRC), 2025.
[22] Lin J, Tang J M, Tang H T, et al. AWQ: Activation-aware weight quantization for on-device LLM compression and acceleration [J]. GetMobile: Mobile Computing and Communications, 2025, 28(4): 12‒17.
[23] Xiao G X, Lin J, Seznec M, et al. SmoothQuant: Accurate and efficient post-training quantization for large language models [EB/OL]. (2022-11-18)[2025-08-07]. https://arxiv.org/abs/2211.10438.
[24] Frantar E, Ashkboos S, Hoefler T, et al. GPTQ: Accurate post-training quantization for generative pre-trained transformers [EB/OL]. (2022-10-31)[2025-08-07]. https://arxiv.org/abs/2210.17323.
[25] Xu Z H, Xu Y, Xu H L, et al. Lightweight and post-training structured pruning for on-device large language models [EB/OL]. (2025-01-25)[2025-08-07]. https://arxiv.org/abs/2501.15255.
| [26] |
Wang Y X, Ma M H, Wang Z K, et al. CFSP: An efficient structured pruning framework for LLMs with coarse-to-fine activation information [EB/OL]. (2024-09-20)[2025-08-07]. https://arxiv.org/abs/2409.13199. |
| [27] |
Wang Z H, Wohlwend J, Lei T. Structured pruning of large language models [EB/OL]. (2019-10-10)[2025-08-07]. https://arxiv.org/abs/1910.04732. |
| [28] |
Geng X, Gao J X, Zhang Y H, et al. Complex hybrid weighted pruning method for accelerating convolutional neural networks [J]. Scientific Reports, 2024, 14: 5570. |
| [29] |
Gou J P, Yu B S, Maybank S J, et al. Knowledge distillation: A survey [J]. International Journal of Computer Vision, 2021, 129(6): 1789‒1819. |
| [30] |
Busbridge D, Shidani A, Weers F, et al. Distillation scaling laws [EB/OL]. (2025-02-12)[2025-08-07]. https://arxiv.org/abs/2502.08606. |
[31] 史宏志, 赵健, 赵雅倩, 等. 大模型时代的混合专家系统优化综述 [J]. 计算机研究与发展, 2025, 62(5): 1164‒1189.
[32] Shi H Z, Zhao J, Zhao Y Q, et al. Survey on system optimization for mixture of experts in the era of large models [J]. Journal of Computer Research and Development, 2025, 62(5): 1164‒1189.
[33] Lepikhin D, Lee H, Xu Y Z, et al. GShard: Scaling giant models with conditional computation and automatic sharding [EB/OL]. (2020-06-30)[2025-08-07]. https://arxiv.org/abs/2006.16668.
[34] Jiang A Q, Sablayrolles A, Roux A, et al. Mixtral of experts [EB/OL]. (2024-01-08)[2025-08-07]. https://arxiv.org/abs/2401.04088.
[35] DeepSeek-AI, Liu A X, Feng B, et al. DeepSeek-V3 technical report [EB/OL]. (2024-12-27)[2025-08-07]. https://arxiv.org/abs/2412.19437.
[36] Liu J C, Tang P, Wang W F, et al. A survey on inference optimization techniques for mixture of experts models [EB/OL]. (2024-12-18)[2025-08-07]. https://arxiv.org/abs/2412.14219.
[37] 郭园方, 余梓彤, 刘艾杉, 等. 多模态大模型安全研究进展 [J]. 中国图象图形学报, 2025, 30(6): 2051‒2081.
[38] Guo Y F, Yu Z T, Liu A S, et al. Recent progress of the security research for multimodal large models [J]. Journal of Image and Graphics, 2025, 30(6): 2051‒2081.
[39] Radford A, Kim J W, Hallacy C, et al. Learning transferable visual models from natural language supervision [R]. Online: International Conference on Machine Learning, 2021.
[40] 中科算网, 中国科大高研院. AI大模型与异构算力融合技术白皮书 [EB/OL]. (2025-10)[2025-11-07]. https://pdf.dfcfw.com/pdf/H3_AP202510141762072518_1.pdf?1760514880000.pdf.
[41] Zhong Ke Suan Wang, Suzhou Institute for Advanced Research, University of Science and Technology of China. White paper on AI large model and heterogeneous computing power fusion technology [EB/OL]. (2025-10)[2025-11-07]. https://pdf.dfcfw.com/pdf/H3_AP202510141762072518_1.pdf?1760514880000.pdf.
[42] 算力100问. 第60问: 什么是算力密度? [EB/OL]. (2025-05-23)[2025-08-07]. https://mp.weixin.qq.com/s/I9tIKVmtylf371k9juyP-A.
[43] 100 Questions on Computing Power. Question 60: What is computing power density? [EB/OL]. (2025-05-23)[2025-08-07]. https://mp.weixin.qq.com/s/I9tIKVmtylf371k9juyP-A.
[44] 范特科技有限责任公司. 智算未来: 范特科技大模型与智算发展白皮书 [R]. 无锡: 范特科技有限责任公司, 2025.
[45] Fantech. The future of intelligent computing: White paper on Fantech's large models and intelligent computing power development [R]. Wuxi: Fantech, 2025.
[46] 东吴证券. 从GPGPU与ASIC之争——算力芯片看点系列 [EB/OL]. (2025-03-12)[2025-08-07]. https://pdf.dfcfw.com/pdf/H3_AP202503121644311455_1.pdf?1741816606000.pdf.
[47] Soochow Securities. From the battle between GPGPU and ASIC: Highlights of computing power chips series [EB/OL]. (2025-03-12)[2025-08-07]. https://pdf.dfcfw.com/pdf/H3_AP202503121644311455_1.pdf?1741816606000.pdf.
[48] FS. High-density servers: Maximizing efficiency and performance in data centers [EB/OL]. (2024-03-28)[2025-08-07]. https://www.fs.com/blog/highdensity-servers-maximizing-efficiency-and-performance-in-data-centers-7070.html.
[49] 李泽林. 高密度数据中心建设浅析 [J]. 智能建筑与智慧城市, 2025 (S1): 153‒155.
[50] Li Z L. Brief analysis of the construction of high density data center [J]. Intelligent Building & Smart City, 2025 (S1): 153‒155.
[51] 徐建, 郑伟, 郭晓春, 等. 新型数据中心网络安全体系研究 [J]. 信息安全与通信保密, 2022, 20(7): 123‒132.
[52] Xu J, Zheng W, Guo X C, et al. Research on next-generation data center security system [J]. Information Security and Communications Privacy, 2022, 20(7): 123‒132.
[53] 中国信通院. 算力中心冷板式液冷发展研究报告 [EB/OL]. (2024-05)[2025-08-07]. http://www.caict.ac.cn/kxyj/qwfb/ztbg/202405/P020240523566116859176.pdf.
[54] China Academy of Information and Communications Technology. Research report on the development of cold plate liquid cooling in data centers [EB/OL]. (2024-05)[2025-08-07]. http://www.caict.ac.cn/kxyj/qwfb/ztbg/202405/P020240523566116859176.pdf.
[55] 中国信通院. 数据中心白皮书 [EB/OL]. (2022-04)[2025-08-07]. http://www.caict.ac.cn/kxyj/qwfb/bps/202204/P020220422707354529853.pdf#page=33.43.
[56] China Academy of Information and Communications Technology. Data center white paper [EB/OL]. (2022-04)[2025-08-07]. http://www.caict.ac.cn/kxyj/qwfb/bps/202204/P020220422707354529853.pdf#page=33.43.
[57] McKinsey & Company. AI power: Expanding data center capacity to meet growing demand [EB/OL]. (2024-10-29)[2025-08-07]. https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/ai-power-expanding-data-center-capacity-to-meet-growing-demand.
[58] 浪潮信息. i48M6 [EB/OL]. [2025-08-07]. https://www.ieisystem.com/product/server/8335.html.
[59] IEIT SYSTEMS. i48M6 [EB/OL]. [2025-08-07]. https://www.ieisystem.com/product/server/8335.html.
[60] 腾讯网. 百度智能云: 昆仑芯超节点支持1U4卡的超高密度算力交付形式 [EB/OL]. (2025-05-29)[2025-09-25]. https://news.qq.com/rain/a/20250529A08EZK00.
[61] Tencent News. Baidu Intelligent Cloud: Kunlun chip super node supports ultra-high-density computing power delivery in 1U4 card form factor [EB/OL]. (2025-05-29)[2025-09-25]. https://news.qq.com/rain/a/20250529A08EZK00.
[62] 中国新闻网. 国产软硬件一体高密度算力机柜——Shanghai Cube亮相WAIC2025 即将进入量产 [EB/OL]. (2025-07-28)[2025-08-07]. https://www.toutiao.com/article/7532007542914941490/.
[63] ChinaNews. Domestic software-hardware integrated high-density computing cabinet Shanghai Cube debuts at WAIC 2025 and is about to enter mass production [EB/OL]. (2025-07-28)[2025-08-07]. https://www.toutiao.com/article/7532007542914941490/.
| [64] |
Nvidia. Nvidia DGX superPOD [EB/OL].[2025-08-07].https://www.nvidia.com/en-us/data-center/dgx-superpod/. |
| [65] |
Huawei. Atlas 900 A3 superPoD [EB/OL]. [2025-08-07]. https://info.support.huawei.com/computing/qrcode/atlas900a3superpod/index-cn.html. |
[66] 新浪财经. 云栖大会今开幕, 阿里云将展示超大规模集群、分布式训练、推理加速等能力,首次展出高密度AI服务器和高性能网络架构 [EB/OL]. (2025-09-24)[2025-09-25]. https://finance.sina.com.cn/roll/2025-09-24/doc-infrpwtx6867998.shtml.
[67] Sina Finance. The Yunqi Conference opens today, with Alibaba Cloud showcasing capabilities such as ultra-large-scale clusters, distributed training, and inference acceleration, while debuting high-density AI servers and high-performance network architectures for the first time [EB/OL]. (2025-09-24)[2025-09-25]. https://finance.sina.com.cn/roll/2025-09-24/doc-infrpwtx6867998.shtml.
[68] 中国信通院. 中国绿色算力发展研究报告 [EB/OL]. (2024-06)[2025-08-07]. https://www.caict.ac.cn/kxyj/qwfb/ztbg/202407/P020240711551514828756.pdf.
[69] China Academy of Information and Communications Technology. Research report on the development of green computing power in China [EB/OL]. (2024-06)[2025-08-07]. https://www.caict.ac.cn/kxyj/qwfb/ztbg/202407/P020240711551514828756.pdf.
[70] 余佳, 马晓波. 半导体先进封装领域专利技术综述 [J/OL]. 电子与封装, 2025: 1‒12[2025-07-22]. https://link.cnki.net/doi/10.16257/j.cnki.1681-1070.2026.0005.
[71] Yu J, Ma X B. Review on patent technologies in advanced semiconductor packaging [J/OL]. Electronics & Packaging, 2025: 1‒12[2025-07-22]. https://link.cnki.net/doi/10.16257/j.cnki.1681-1070.2026.0005.
[72] 黄翔, 李潼, 褚俊杰. 算力时代数据中心液冷与蒸发冷的融合发展 [J/OL]. 制冷与空调, 1‒10 [2025-08-07]. https://link.cnki.net/urlid/11.4519.tb.20250327.1407.002.
[73] Huang X, Li T, Chu J J. The integrated development of liquid cooling and evaporative cooling in data centers in the era of computing power [J/OL]. Refrigeration and Air-Conditioning, 1‒10 [2025-08-07]. https://link.cnki.net/urlid/11.4519.tb.20250327.1407.002.
[74] 电子工程专辑. 海关公布中国2024年芯片进出口数据, 出口首破万亿元 [EB/OL]. (2025-02-05)[2025-09-25]. https://www.eet-china.com/news/202502059154.html.
[75] EE Times China. Chinese Customs releases China's 2024 chip import and export data, with exports exceeding 1 trillion yuan for the first time [EB/OL]. (2025-02-05)[2025-09-25]. https://www.eet-china.com/news/202502059154.html.
[76] 陈聃. 基于近内存计算的图神经网络加速技术研究 [D]. 武汉: 华中科技大学(博士学位论文), 2024.
[77] Chen D. Research on acceleration techniques of graph neural networks based on near-memory processing [D]. Wuhan: Huazhong University of Science and Technology (Doctoral dissertation), 2024.
[78] Wulf W A, McKee S A. Hitting the memory wall: Implications of the obvious [J]. ACM SIGARCH Computer Architecture News, 1995, 23(1): 20‒24.
[79] TechInsights. China does it again: A NAND memory market first [EB/OL]. (2025-09-04)[2025-09-25]. https://www.techinsights.com/blog/china-does-it-again-nand-memory-market-first.
[80] 电子工程专辑. 长鑫存储HBM2内存获突破, DDR5良率明年可达90% [EB/OL]. (2024-12-30)[2025-09-25]. https://www.eet-china.com/news/202412303389.html.
[81] EE Times China. Changxin Memory achieves breakthrough in HBM2 memory, DDR5 yield expected to reach 90% next year [EB/OL]. (2024-12-30)[2025-09-25]. https://www.eet-china.com/news/202412303389.html.
[82] ZOMI酱, 苏统华. AI系统—原理与架构 [M]. 北京: 科学出版社, 2024.
[83] ZOMI, Su T H. AI systems: Principles and architecture [M]. Beijing: Science Press, 2024.
[84] Medium. AI compilers demystified [EB/OL]. (2022-11-08)[2025-08-07]. https://medium.com/geekculture/ai-compilers-ae28afbc4907.
[85] 段柳成, 肖巧玲, 金怡, 等. 大模型时代国产大算力GPU的关键挑战与发展路径 [J]. 人工智能, 2025, 12(3): 8‒21.
[86] Duan L C, Xiao Q L, Jin Y, et al. Key challenges and development path of domestic large computing GPU in the age of large model [J]. AI-View, 2025, 12(3): 8‒21.
[87] IEA. Energy and AI [R]. Paris: IEA, 2024.
[88] 中国信通院. 算力电力协同发展研究报告 (2025年) [EB/OL]. (2025-05)[2025-09-25]. https://www.caict.ac.cn/kxyj/qwfb/ztbg/202505/P020250509511369626787.pdf#page=2.14.
[89] China Academy of Information and Communications Technology. Research report on the collaborative development of computing power and electricity (2025) [EB/OL]. (2025-05)[2025-09-25]. https://www.caict.ac.cn/kxyj/qwfb/ztbg/202505/P020250509511369626787.pdf#page=2.14.
[90] NIST. SP 800-53 control overlays for securing AI systems [EB/OL]. (2025-08-14)[2025-09-18]. https://csrc.nist.gov/csrc/media/Projects/cosais/documents/NIST-Overlays-SecuringAI-concept-paper.pdf.
[91] Guo Y N, Zhang Z K, Yang J. GPU memory exploitation for fun and profit [R]. Philadelphia: The 33rd USENIX Conference on Security Symposium, 2024.
[92] Hoover J. Analysis of GPU memory vulnerabilities [D]. Fayetteville: University of Arkansas, Fayetteville (Undergraduate honors theses), 2022.
[93] Forbes. Nvidia security warning—Act now as 7 new GPU vulnerabilities confirmed [EB/OL]. [2025-09-17]. https://www.forbes.com/sites/daveywinder/2025/01/28/nvidia-security-warning-act-now-as-7-new-gpu-vulnerabilities-confirmed/.
[94] Oligo. ShadowRay: First known attack campaign targeting AI workloads actively exploited in the wild [EB/OL]. (2024-03-26)[2025-09-17]. https://www.oligo.security/blog/shadowray-attack-ai-workloads-actively-exploited-in-the-wild.
[95] PyTorch. Compromised PyTorch-nightly dependency chain between December 25th and December 30th, 2022 [EB/OL]. (2022-12-31)[2025-08-07]. https://pytorch.org/blog/compromised-nightly-dependency/.
[96] Ridge Security Research Team. Securing your AI: Critical vulnerabilities found in popular Ollama framework [EB/OL]. (2025-03-20)[2025-08-07]. https://ridgesecurity.ai/blog/securing-your-ai-critical-vulnerabilities-found-in-popular-ollama-framework/.
[97] Federal Register. Information security controls: Cybersecurity items [EB/OL]. (2022-05-26)[2025-09-26]. https://www.federalregister.gov/documents/2022/05/26/2022-11282/information-security-controls-cybersecurity-items.
[98] 康旺, 寇竞, 赵巍胜. 存算一体芯片发展现状、趋势与挑战 [J]. 中国科学: 信息科学, 2024, 54(1): 16‒24.
[99] Kang W, Kou J, Zhao W S. In-memory computing technology: Development status, trends and challenges [J]. Scientia Sinica (Informationis), 2024, 54(1): 16‒24.
中国工程院咨询项目“网络空间安全新技术新应用风险研究”(2023-JB-13)
Chinese Academy of Engineering consulting project “Research on Risks of New Technologies and New Applications in Cyberspace Security” (2023-JB-13)
“算力安全及其产业高质量发展战略研究”(2024-HYZD-02)
“Strategic Research on Computing Power Security and High-Quality Development of Its Industry” (2024-HYZD-02)