The release of the generative pre-trained transformer (GPT) series has once again brought artificial general intelligence (AGI) to the forefront of the artificial intelligence (AI) field. However, how to define and evaluate AGI remains unclear. This special issue publishes research on a systematic evaluation of AGI, called the Tong test. The research proposes that evaluations of AGI should go beyond the classic Turing test and be rooted in dynamic embodied physical and social interactions (DEPSI). The Tong test describes a value- and ability-oriented testing system (the U–V dual system), which delineates five levels of AGI milestones in a dynamic embodied environment. The evaluation system examines AGI through a series of tasks inspired by developmental psychology, such as the building block tower task, which reflects motor control; the mirror test, which reflects self-awareness; and collaboration and autonomous helping tasks. The Tong test promotes standardized, quantitative, and objective benchmarks and evaluation of AGI in the new era.