The Tong Test: Evaluating Artificial General Intelligence Through Dynamic Embodied Physical and Social Interactions

彭玉佳, 韩佳衡, 张振亮, 范丽凤, 刘腾宇, 綦思源, 封雪, 马煜曦, 王亦洲, 朱松纯

Engineering ›› 2024, Vol. 34 ›› Issue (3): 12-23.

DOI: 10.1016/j.eng.2023.07.006
Research Article


Abstract

The release of the generative pre-trained transformer (GPT) series has once again brought artificial general intelligence (AGI) to the forefront of the artificial intelligence (AI) field. However, how to define and evaluate AGI remains an open question. This perspective article proposes that the evaluation of AGI should be rooted in dynamic embodied physical and social interactions (DEPSI). More specifically, we propose five critical characteristics to be considered as AGI benchmarks and suggest the Tong test as an AGI evaluation system. The Tong test describes a value- and ability-oriented testing system that delineates five levels of AGI milestones through a virtual environment with DEPSI, allowing for infinite task generation. We compare the Tong test with classical AI testing systems along several dimensions and propose a systematic evaluation system to promote standardized, quantitative, and objective benchmarking and evaluation of AGI.

Keywords

Artificial general intelligence / Artificial intelligence benchmark / Artificial intelligence evaluation / Embodied artificial intelligence / Value alignment / Turing test / Causality

Cite this article

彭玉佳, 韩佳衡, 张振亮, 范丽凤, 刘腾宇, 綦思源, 封雪, 马煜曦, 王亦洲, 朱松纯. The Tong Test: Evaluating Artificial General Intelligence Through Dynamic Embodied Physical and Social Interactions [J]. Engineering, 2024, 34(3): 12-23. DOI: 10.1016/j.eng.2023.07.006
