Engineering, 2020, Volume 6, Issue 3. doi: 10.1016/j.eng.2020.01.011

Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense

a Center for Vision, Cognition, Learning, and Autonomy, University of California, Los Angeles, CA 90095, USA

b Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Received: 2019-09-16; Revised: 2019-12-11; Accepted: 2020-01-03; Published: 2020-02-22

Abstract

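Recent progress in deep learning rests essentially on a "big data for small tasks" paradigm, in which a classifier trained on massive amounts of data solves a single narrow task. In this paper, we propose inverting the relationship between data and tasks in that paradigm. Under the new "small data for big tasks" paradigm, a single artificial intelligence (AI) system with only a small amount of data can develop "common sense" and use it to solve a wide range of tasks. We illustrate the potential of this new paradigm by reviewing models of common sense that synthesize recent breakthroughs in both machine and human vision. We identify functionality, physics, intent, causality, and utility (FPICU) as the five core domains of cognitive AI with humanlike common sense. For visual understanding, FPICU goes beyond the traditional "what" and "where" framework and focuses on the questions of "why" and "how". These aspects are invisible at the pixel level, yet they drive the creation, maintenance, and development of visual scenes. We therefore call them the "dark matter" of vision. Just as the universe cannot be understood by studying observable matter alone, we argue that vision cannot be understood without studying "dark matter" such as FPICU. We demonstrate the power of this perspective for developing cognitive AI with humanlike common sense by showing how FPICU can be observed and applied, with little training data, to solve a broad range of challenging tasks, including tool use, planning, utility inference, and social learning. In summary, to solve tasks more effectively, the next generation of AI must embrace the "dark matter" of humanlike common sense.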

参考文献

[ 1 ] Marr D. Vision: a computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman and Company; 1982. 链接1

[ 2 ] Mishkin M, Ungerleider LG, Macko KA. Object vision and spatial vision: two cortical pathways. Trends Neurosci 1983;6:414–7. 链接1

[ 3 ] Ikeuchi K, Hebert M. Task-oriented vision. In: Landy MS, Maloney LT, Pavel M, editors. Exploratory vision. New York: Springer; 1996. p. 257–77. 链接1

[ 4 ] Land M, Mennie N, Rusted J. The roles of vision and eye movements in the control of activities of daily living. Perception 1999;28(11):1311–28. 链接1

[ 5 ] Fang F, He S. Cortical responses to invisible objects in the human dorsal and ventral pathways. Nat Neurosci 2005;8(10):1380–5. 链接1

[ 6 ] Creem-Regehr SH, Lee JN. Neural representations of graspable objects: are tools special? Brain Res Cogn Brain Res 2005;22(3):457–69. 链接1

[ 7 ] Potter MC. Meaning in visual search. Science 1975;187(4180):965–6. 链接1

[ 8 ] Potter MC. Short-term conceptual memory for pictures. J Exp Psychol Hum Learn 1976;2(5):509–22. 链接1

[ 9 ] Schyns PG, Oliva A. From blobs to boundary edges: evidence for time- and spatial-scale-dependent scene recognition. Psychol Sci 1994;5(4):195–200. 链接1

[10] Thorpe S, Fize D, Marlot C. Speed of processing in the human visual system. Nature 1996;381(6582):520–2. 链接1

[11] Greene MR, Oliva A. The briefest of glances: the time course of natural scene understanding. Psychol Sci 2009;20(4):464–72. 链接1

[12] Greene MR, Oliva A. Recognition of natural scenes from global properties: seeing the forest without representing the trees. Cognit Psychol 2009;58 (2):137–76. 链接1

[13] Li FF, Iyer A, Koch C, Perona P. What do we perceive in a glance of a real-world scene? J Vis 2007;7(1):10. 链接1

[14] Rousselet G, Joubert O, Fabre-Thorpe M. How long to get to the ‘‘gist” of realworld natural scenes? Vis Cognit 2005;12(6):852–77. 链接1

[15] Oliva A, Torralba A. Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 2001;42(3):145–75. 链接1

[16] Delorme A, Richard G, Fabre-Thorpe M. Ultra-rapid categorisation of natural scenes does not rely on colour cues: a study in monkeys and humans. Vision Res 2000;40(16):2187–200. 链接1

[17] Serre T, Oliva A, Poggio T. A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci USA 2007;104(15):6424–9. 链接1

[18] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 2012 Neural Information Processing Systems; 2012 Dec 3–6; Lake Tahoe, NV, USA; 2012.

[19] Kavukcuoglu K, Sermanet P, Boureau YL, Gregor K, Mathieu M, Cun YL. Learning convolutional feature hierarchies for visual recognition. In: Proceedings of the 2010 Neural Information Processing Systems; 2010 Dec 6–11; Vancouver, BC, Canada; 2010.

[20] Deng J, Dong W, Socher R, Li LJ, Li K, Li FF. ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009 Jun 20–25; Miami, FL, USA; 2009.

[21] Rajalingham R, Issa EB, Bashivan P, Kar K, Schmidt K, DiCarlo JJ. Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. J Neurosci 2018;38(33):7255–69. 链接1

[22] Oliva A, Schyns PG. Coarse blobs or fine edges? Evidence that information diagnosticity changes the perception of complex visual stimuli. Cognit Psychol 1997;34(1):72–107. 链接1

[23] Schyns PG. Diagnostic recognition: task constraints, object information, and their interactions. Cognition 1998;67(1–2):147–79. 链接1

[24] Malcolm GL, Nuthmann A, Schyns PG. Beyond gist: strategic and incremental information accumulation for scene categorization. Psychol Sci 2014;25 (5):1087–97. 链接1

[25] Qi S, Huang S, Wei P, Zhu SC. Predicting human activities using stochastic grammar. In: Proceedings of the 2017 IEEE International Conference on Computer Vision; 2017 Oct 22–29; Venice, Italy; 2017. p. 1164–72.

[26] Pei M, Jia Y, Zhu SC. Parsing video events with goal inference and intent prediction. In: Proceedings of the 2011 IEEE International Conference on Computer Vision; 2011 Nov 6–13; Barcelona, Spain; 2011.

[27] Gosselin F, Schyns PG. Bubbles: a technique to reveal the use of information in recognition tasks. Vision Res 2001;41(17):2261–71. 链接1

[28] Ikeuchi K, Hebert M. Task oriented vision. In: Proceedings of the 1992 IEEE/ RSJ International Conference on Intelligent Robots and Systems; 1992 Jul 7– 10; Raleigh, NC, USA; 1992. p. 2187–94.

[29] Hartley R, Zisserman A. Multiple view geometry in computer vision. 2nd ed. Cambridge: Cambridge University Press; 2003. 链接1

[30] Ma Y, Soatto S, Kosecka J, Sastry SS. An invitation to 3-D vision: from images to geometric models. New York: Springer Science & Business Media; 2012. 链接1

[31] Gupta A, Hebert M, Kanade T, Blei DM. Estimating spatial layout of rooms using volumetric reasoning about objects and surfaces. In: Proceedings of the 2010 Neural Information Processing Systems; 2010 Dec 6–11; Vancouver, BC, Canada; 2010.

[32] Schwing AG, Fidler S, Pollefeys M, Urtasun R. Box in the box: joint 3D layout and object reasoning from single images. In: In: Proceedings of the 2013 IEEE International Conference on Computer Vision; 2013 Dec 1–8; Sydney, Australia. p. 353–60. 链接1

[33] Choi W, Chao YW, Pantofaru C, Savarese S. Understanding indoor scenes using 3D geometric phrases. In: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition; 2013 Jun 25–27; Portland, OR, USA; 2013. p. 33–40.

[34] Zhao Y, Zhu SC. Scene parsing by integrating function, geometry and appearance models. In: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition; 2013 Jun 25–27; Portland, OR, USA; 2013. p. 3119–26.

[35] Liu X, Zhao Y, Zhu SC. Single-view 3D scene reconstruction and parsing by attribute grammar. IEEE Trans Pattern Anal Mach Intell 2018;40(3):710–25. 链接1

[36] Huang S, Qi S, Zhu Y, Xiao Y, Xu Y, Zhu SC. Holistic 3D scene parsing and reconstruction from a single RGB image. In: Proceedings of the 2018 European Conference on Computer Vision; 2018 Sep 8–14; Munich, Germany; 2018.

[37] Chen Y, Huang S, Yuan T, Qi S, Zhu Y, Zhu SC. Holistic++ scene understanding: single-view 3D holistic scene parsing and human pose estimation with human–object interaction and physical commonsense. In: Proceedings of the 2019 IEEE International Conference on Computer Vision; 2019 Oct 27–Nov 2; Seoul, Korea. p. 8648–57. 链接1

[38] Huang S, Chen Y, Yuan T, Qi S, Zhu Y, Zhu SC. PerspectiveNet: 3D object detection from a single RGB image via perspective points. In: Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R, editors. Advances in neural information processing systems 32: proceedings of Neural Information Processing Systems 2019; 2019 Dec 8 14; Vancouver, BC, Canada; 2019. p. 8903 15.

[39] Tolman EC. Cognitive maps in rats and men. Psychol Rev 1948;55 (4):189–208. 链接1

[40] Wang RF, Spelke ES. Comparative approaches to human navigation. In: Jeffery KJ, editor. The neurobiology of spatial behaviour. Oxford: Oxford University Press; 2003. p. 119–43. 链接1

[41] Koenderink JJ, van Doorn AJ, Kappers AM, Lappin JS. Large-scale visual frontoparallels under full-cue conditions. Perception 2002;31(12):1467–75. 链接1

[42] Warren WH, Rothman DB, Schnapp BH, Ericson JD. Wormholes in virtual space: from cognitive maps to cognitive graphs. Cognition 2017;166:152–63. 链接1

[43] Gillner S, Mallot HA. Navigation and acquisition of spatial knowledge in a virtual maze. J Cogn Neurosci 1998;10(4):445–63. 链接1

[44] Foo P, Warren WH, Duchon A, Tarr MJ. Do humans integrate routes into a cognitive map? Map-versus landmark-based navigation of novel shortcuts. J Exp Psychol Learn Mem Cogn 2005;31(2):195–215. 链接1

[45] Chrastil ER, Warren WH. From cognitive maps to cognitive graphs. PLoS ONE 2014;9(11):e112544. 链接1

[46] Byrne RW. Memory for urban geography. Q J Exp Psychol 1979;31(1):147–54. 链接1

[47] Tversky B. Distortions in cognitive maps. Geoforum 1992;23(2):131–8. 链接1

[48] Ogle KN. Researches in binocular vision. Philadelphia: WB Saunders; 1950. 链接1

[49] Foley JM. Binocular distance perception. Psychol Rev 1980;87(5):411–34. 链接1

[50] Luneburg RK. Mathematical analysis of binocular vision. Princeton: Princeton University Press; 1947. 链接1

[51] Indow T. A critical review of Luneburg’s model with regard to global structure of visual space. Psychol Rev 1991;98(3):430–53. 链接1

[52] Gogel WC. A theory of phenomenal geometry and its applications. Percept Psychophys 1990;48(2):105–23. 链接1

[53] Glennerster A, Tcheang L, Gilson SJ, Fitzgibbon AW, Parker AJ. Humans ignore motion and stereo cues in favor of a fictional stable world. Curr Biol 2006;16 (4):428–32. 链接1

[54] Hafting T, Fyhn M, Molden S, Moser MB, Moser EI. Microstructure of a spatial map in the entorhinal cortex. Nature 2005;436(7052):801–6. 链接1

[55] Killian NJ, Jutras MJ, Buffalo EA. A map of visual space in the primate entorhinal cortex. Nature 2012;491(7426):761–4. 链接1

[56] O’Keefe J, Nadel L. The hippocampus as a cognitive map. Oxford: Clarendon Press; 1978. 链接1

[57] Jacobs J, Weidemann CT, Miller JF, Solway A, Burke JF, Wei XX, et al. Direct recordings of grid-like neuronal activity in human spatial navigation. Nat Neurosci 2013;16(9):1188–90. 链接1

[58] Fyhn M, Hafting T, Witter MP, Moser EI, Moser MB. Grid cells in mice. Hippocampus 2008;18(12):1230–8. 链接1

[59] Doeller CF, Barry C, Burgess N. Evidence for grid cells in a human memory network. Nature 2010;463(7281):657–61. 链接1

[60] Yartsev MM, Witter MP, Ulanovsky N. Grid cells without theta oscillations in the entorhinal cortex of bats. Nature 2011;479(7371):103–7. 链接1

[61] Gao R, Xie J, Zhu SC, Wu Y. Learning grid cells as vector representation of selfposition coupled with matrix representation of self-motion. In: Proceedings of the 2019 International Conference on Learning Representations; 2019 May 6–9; New Orleans, LA, USA; 2019.

[62] Xie J, Gao R, Nijkamp E, Zhu S, Wu YN. Representation learning: a statistical perspective. Annu Rev Stat Appl 2020:7. 链接1

[63] Gootjes-Dreesbach L, Pickup LC, Fitzgibbon AW, Glennerster A. Comparison of view-based and reconstruction-based models of human navigational strategy. J Vis 2017;17(9):11. 链接1

[64] Vuong J, Fitzgibbon AW, Glennerster A. Human pointing errors suggest a flattened, task-dependent representation of space. bioRxiv 2018:390088. 链接1

[65] Choi H, Scholl BJ. Perceiving causality after the fact: postdiction in the temporal dynamics of causal perception. Perception 2006;35(3):385–99. 链接1

[66] Scholl BJ, Nakayama K. Illusory causal crescents: misperceived spatial relations due to perceived causality. Perception 2004;33(4):455–69. 链接1

[67] Scholl BJ, Gao T. Perceiving animacy and intentionality: visual processing or higher-level judgment. In: Rutherford MD, Kuhlmeier VA, editors. Social perception: detection and interpretation of animacy, agency, and intention. Cambridge: The MIT Press; 2013. p. 197–229. 链接1

[68] Scholl BJ. Objects and attention: the state of the art. Cognition 2001;80(1– 2):1–46. 链接1

[69] Vul E, Alvarez G, Tenenbaum JB, Black MJ. Explaining human multiple object tracking as resource-constrained approximate inference in a dynamic probabilistic model. In: Proceedings of the 2009 Neural Information Processing Systems; 2009 Dec 7–10; Vancouver, BC, Canada; 2009.

[70] Battaglia PW, Hamrick JB, Tenenbaum JB. Simulation as an engine of physical scene understanding. Proc Natl Acad Sci USA 2013;110(45):18327–32. 链接1

[71] Hamrick J, Battaglia P, Tenenbaum JB. Internal physics models guide probabilistic judgments about object dynamics. In: Proceedings of the 2011 Annual Meeting of the Cognitive Science Society; 2011 Jul 20–23; Boston, MA, USA; 2011.

[72] Xie D, Shu T, Todorovic S, Zhu SC. Learning and inferring ‘‘dark matter” and predicting human intents and trajectories in videos. IEEE Trans Pattern Anal Mach Intell 2018;40(7):1639–52. 链接1

[73] Ullman T, Stuhlmüller A, Goodman N, Tenenbaum JB. Learning physics from dynamical scenes. In: Proceedings of the 2014 Annual Meeting of the Cognitive Science Society; 2014 Jul 23–26; Quebec City, QC, Canada; 2014.

[74] Gerstenberg T, Tenenbaum JB. Intuitive theories. In: Waldmann MR, editor. Oxford handbook of causal reasoning. New York: Oxford University Press; 2017. p. 515–48. 链接1

[75] Newton I, Colson J. The method of fluxions and infinite series; with its application to the geometry of curve-lines. London: Henry Woodfall; 1736. 链接1

[76] Maclaurin C. A treatise of fluxions: in two books. München: Ruddimans; 1742. 链接1

[77] Mueller ET. Commonsense reasoning: an event calculus based approach. 2nd ed. Amsterdam: Morgan Kaufmann; 2014. 链接1

[78] Mueller ET. Daydreaming in humans and machines: a computer model of the stream of thought. Norwood: Ablex Publishing Corporation; 1990. 链接1

[79] Michotte A. The perception of causality. 2nd ed. London: Methuen & Co; 1963. 链接1

[80] Carey S. The origin of concepts. New York: Oxford University Press; 2009. 链接1

[81] Farhadi A, Endres I, Hoiem D, Forsyth D. Describing objects by their attributes. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009 Jun 20–25; Miami, FL, USA; 2009. p. 1778–85.

[82] Parikh D, Grauman K. Relative attributes. In: Proceedings of the 2011 International Conference on Computer Vision; 2011 Nov 6–13; Barcelona, Spain; 2011. p. 503–10.

[83] Laptev I, Marszałek M, Schmid C, Rozenfeld B. Learning realistic human actions from movies. In: Proceedings of the 2008 Conference on Computer Vision and Pattern Recognition; 2008 Jun 24–26; Anchorage, AK, USA; 2008.

[84] Yao B, Zhu SC. Learning deformable action templates from cluttered videos. Proceedings of the 2009 International Conference on Computer Vision; 2009 Sep 29–Oct 2; Kyoto, Japan, 2009. 链接1

[85] Yao BZ, Nie BX, Liu Z, Zhu SC. Animated pose templates for modeling and detecting human actions. IEEE Trans Pattern Anal Mach Intell 2013;36 (3):436–52. 链接1

[86] Wang J, Liu Z, Wu Y, Yuan J. Mining actionlet ensemble for action recognition with depth cameras. Proceedings of the 2012 Conference on Computer Vision and Pattern Recognition; 2012 Jun 16–21; Providence, RI, USA, 2012. 链接1

[87] Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Proceedings of the 2005 Conference on Computer Vision and Pattern Recognition; 2005 Jun 20–26; San Diego, CA, USA; 2005.

[88] Sadanand S, Corso JJ. Action bank: a high-level representation of activity in video. Proceedings of the 2012 Conference on Computer Vision and Pattern Recognition; 2012 Jun 16–21; Providence, RI, USA, 2012. 链接1

[89] Fleming RW, Barnett-Cowan M, Bülthoff HH. Perceived object stability is affected by the internal representation of gravity. Perception 2010;39:109. 链接1

[90] Zago M, Lacquaniti F. Visual perception and interception of falling objects: a review of evidence for an internal model of gravity. J Neural Eng 2005;2(3): S198–208. 链接1

[91] Kellman PJ, Spelke ES. Perception of partly occluded objects in infancy. Cognit Psychol 1983;15(4):483–524. 链接1

[92] Baillargeon R, Spelke ES, Wasserman S. Object permanence in five-month-old infants. Cognition 1985;20(3):191–208. 链接1

[93] Johnson SP, Aslin RN. Perception of object unity in 2-month-old infants. Dev Psychol 1995;31(5):739–45. 链接1

[94] Needham A. Factors affecting infants’ use of featural information in object segregation. Curr Dir Psychol Sci 1997;6(2):26–33. 链接1

[95] Baillargeon R. Infants’ physical world. Curr Dir Psychol Sci 2004;13(3):89–94. 链接1

[96] Zheng B, Zhao Y, Yu JC, Ikeuchi K, Zhu SC. Detecting potential falling objects by inferring human action and natural disturbance. In: Proceedings of the 2014 International Conference on Robotics and Automation; 2014 May 31– Jun 7; Hong Kong, China; 2014.

[97] Zheng B, Zhao Y, Yu JC, Ikeuchi K, Zhu SC. Beyond point clouds: scene understanding by reasoning geometry and physics. In: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition; 2013 Jun 23–28; Portland, OR, USA; 2013. p. 3127–34.

[98] Zheng B, Zhao Y, Yu JC, Ikeuchi K, Zhu SC. Scene understanding by reasoning stability and safety. Int J Comput Vis 2015;112(2):221–38. 链接1

[99] Qi S, Zhu Y, Huang S, Jiang C, Zhu SC. Human-centric indoor scene synthesis using stochastic grammar. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018.

[100] Huang S, Qi S, Xiao Y, Zhu Y, Wu YN, Zhu SC. Cooperative holistic scene understanding: unifying 3D object, layout, and camera pose estimation. In: Proceedings of the 2018 Neural Information Processing Systems; 2018 Dec 3– 8; Montreal, QC, Canada; 2018.

[101] Gupta A, Satkin S, Efros AA, Hebert M. From 3D scene geometry to human workspace. In: Proceedings of the 2011 Conference on Computer Vision and Pattern Recognition; 2011 Jun 20–25; Providence, RI, USA; 2011.

[102] Iacoboni M, Molnar-Szakacs I, Gallese V, Buccino G, Mazziotta JC, Rizzolatti G. Grasping the intentions of others with one’s own mirror neuron system. PLoS Biol 2005;3(3):e79. 链接1

[103] Csibra G, Gergely G. ‘Obsessed with goals’: functions and mechanisms of teleological interpretation of actions in humans. Acta Psychol 2007;124 (1):60–78. 链接1

[104] Baker CL, Tenenbaum JB, Saxe RR. Goal inference as inverse planning. In: Proceedings of the 2007 Annual Meeting of the Cognitive Science Society; 2007 Aug 1–4; Austin, TX, USA; 2007.

[105] Baker CL, Goodman ND, Tenenbaum JB. Theory-based social goal inference. In: Proceedings of the 2008 Annual Meeting of the Cognitive Science Society; 2008 Jul 23–27; Washington, DC, USA; 2008. p. 1447–52.

[106] Hoai M, De la Torre F. Max-margin early event detectors. Int J Comput Vis 2014;107(2):191–202. 链接1

[107] Turek MW, Hoogs A, Collins R. Unsupervised learning of functional categories in video scenes. In: Proceedings of the 2010 European Conference on Computer Vision; 2010 Sep 5–11; Heraklion, Greece. p. 664–77. 链接1

[108] Grabner H, Gall J, van Gool L. What makes a chair a chair? In: Proceedings of the 2011 Conference on Computer Vision and Pattern Recognition; 2011 Jun 20–25; Providence, RI, USA; 2011. p. 1529–36.

[109] Jia Z, Gallagher A, Saxena A, Chen T. 3D-based reasoning with blocks, support, and stability. In: Proceedings of the 2013 Conference on Computer Vision and Pattern Recognition; 2013 Jun 23–28; Portland, OR, USA; 2013. p. 1–8.

[110] Jiang Y, Koppula H, Saxena A. Hallucinated humans as the hidden context for labeling 3D scenes. In: Proceedings of the 2013 Conference on Computer Vision and Pattern Recognition; 2013 Jun 23–28; Portland, OR, USA; 2013. p. 2993–3000.

[111] Shu T, Thurman SM, Chen D, Zhu SC, Lu H. Critical features of joint actions that signal human interaction. In: Proceedings of the 2016 Annual Meeting of the Cognitive Science Society; 2016 Aug 10–13; Philadelphia, PA, USA; 2016.

[112] Shu T, Peng Y, Fan L, Lu H, Zhu SC. Perception of human interaction based on motion trajectories: from aerial videos to decontextualized animations. Top Cogn Sci 2018;10(1):225–41. 链接1

[113] Shu T, Peng Y, Lu H, Zhu SC. Partitioning the perception of physical and social events within a unified psychological space. In: Proceedings of the 2019 Annual Meeting of the Cognitive Science Society; 2019 Jul 24–27; Montreal, QC, Canada; 2019.

[114] Baker C, Saxe R, Tenenbaum J. Bayesian theory of mind: modeling joint beliefdesire attribution. In: Proceedings of the 2011 Annual Meeting of the Cognitive Science Society; 2011 Jul 20–23; Boston, MA, USA; 2011.

[115] Zhao Y, Holtzen S, Gao T, Zhu SC. Represent and infer human theory of mind for human–robot interaction. Proceedings of the 2015 AAAI Fall Symposium Series; 2015 Nov 12–14; Arlington, VA, USA, 2015. 链接1

[116] Nisan N, Ronen A. Algorithmic mechanism design. Games Econ Behav 2001;35(1–2):166–96. 链接1

[117] Bentham J. An introduction to the principles of morals. London: Athlone; 1935. 链接1

[118] Nishant S. Utility learning, non-Markovian planning, and task-oriented programming language [dissertation]. Los Angeles: University of California; 2019. 链接1

[119] Robb AA. Optical geometry of motion: a new view of the theory of relativity. W Heffer 1911. 链接1

[120] Malament DB. The class of continuous timelike curves determines the topology of spacetime. J Math Phys 1977;18(7):1399–404. 链接1

[121] Robb AA. Geometry of time and space. New York: Cambridge University Press; 2014. 链接1

[122] Corrigan R, Denton P. Causal understanding as a developmental primitive. Dev Rev 1996;16(2):162–202. 链接1

[123] White PA. Causal processing: origins and development. Psychol Bull 1988;104(1):36–52. 链接1

[124] Chen YC, Scholl BJ. The perception of history: seeing causal history in static shapes induces illusory motion perception. Psychol Sci 2016;27(6):923–30. 链接1

[125] Holyoak KJ, Cheng PW. Causal learning and inference as a rational process: the new synthesis. Annu Rev Psychol 2011;62(1):135–63. 链接1

[126] Shanks DR, Dickinson A. Associative accounts of causality judgment. Psychol Learn Motiv 1988;21:229–61. 链接1

[127] Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical conditioning II: current research and theory. New York: Appleton-Century-Crofts; 1972. p. 64–99. 链接1

[128] Lu H, Yuille AL, Liljeholm M, Cheng PW, Holyoak KJ. Bayesian generic priors for causal learning. Psychol Rev 2008;115(4):955–84. 链接1

[129] Edmonds M, Qi S, Zhu Y, Kubricht J, Zhu SC, Lu H. Decomposing human causal learning: bottom-up associative learning and top-down schema reasoning. In: Proceedings of the 2019 Annual Meeting of the Cognitive Science Society; 2019 Jul 24–27; Montreal, QC, Canada; 2019.

[130] Waldmann MR, Holyoak KJ. Predictive and diagnostic learning within causal models: asymmetries in cue competition. J Exp Psychol Gen 1992;121 (2):222–36. 链接1

[131] Edmonds M, Kubricht J, Summers C, Zhu Y, Rothrock B, Zhu SC, et al. Human causal transfer: challenges for deep reinforcement learning. In: Proceedings of the 2018 Annual Meeting of the Cognitive Science Society; 2018 Jul 25–28; Madison, CT, USA; 2018.

[132] Cheng PW. From covariation to causation: a causal power theory. Psychol Rev 1997;104(2):367–405. 链接1

[133] Scholl BJ, Tremoulet PD. Perceptual causality and animacy. Trends Cogn Sci 2000;4(8):299–309. 链接1

[134] Rolfs M, Dambacher M, Cavanagh P. Visual adaptation of the perception of causality. Curr Biol 2013;23(3):250–4. 链接1

[135] McCollough C. Color adaptation of edge-detectors in the human visual system. Science 1965;149(3688):1115–6. 链接1

[136] Kominsky JF, Scholl BJ. Retinotopically specific visual adaptation reveals the structure of causal events in perception. In: Proceedings of the 2018 Annual Meeting of the Cognitive Science Society; 2018 Jul 25–28; Madison, CT, USA; 2018.

[137] Gerstenberg T, Peterson MF, Goodman ND, Lagnado DA, Tenenbaum JB. Eyetracking causality. Psychol Sci 2017;28(12):1731–44. 链接1

[138] Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, et al. Human-level control through deep reinforcement learning. Nature 2015;518 (7540):529–33. 链接1

[139] Schulman J, Levine S, Abbeel P, Jordan M, Moritz P. Trust region policy optimization. In: Proceedings of the 2015 International Conference on Machine Learning; 2015 Jul 6–11; Lille, France; 2015.

[140] Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Driessche G, et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016;529(7587):484–9. 链接1

[141] Levine S, Finn C, Darrell T, Abbeel P. End-to-end training of deep visuomotor policies. J Mach Learn Res 2016;17(1):1334–73. 链接1

[142] Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. 2017. arXiv:1707.06347.

[143] Zhang C, Vinyals O, Munos R, Bengio S. A study on overfitting in deep reinforcement learning. 2018. arXiv:1804.06893.

[144] Kansky K, Silver T, Mély DA, Eldawy M, Lázaro-Gredilla M, Lou X, et al. Schema networks: zero-shot transfer with a generative causal model of intuitive physics. 2017. arXiv:1706.04317.

[145] Edmonds M, Ma X, Qi S, Zhu Y, Lu H, Zhu SC. Theory-based causal transfer: integrating instance-level induction and abstract-level structure learning. 2019. arXiv:1911.11185.

[146] Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974;66(5):688–701. 链接1

[147] Imbens GW, Rubin DB. Causal inference for statistics, social, and biomedical sciences. New York: Cambridge University Press; 2015. 链接1

[148] Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983;70(1):41–55. 链接1

[149] Pearl J. Causality: models, reasoning and inference. New York: Cambridge University Press; 2000. 链接1

[150] Spirtes P, Glymour C, Scheines R, Heckerman D, Meek C, Cooper GF, et al. Causation, prediction, and search. 2nd ed. Cambridge: MIT Press; 2000. 链接1

[151] Chickering DW. Optimal structure identification with greedy search. J Mach Learn Res 2002;3:507–54. 链接1

[152] Peters J, Mooij JM, Janzing D, Schölkopf B. Causal discovery with continuous additive noise models. J Mach Learn Res 2014;15(1):2009–53. 链接1

[153] He YB, Geng Z. Active learning of causal networks with intervention experiments and optimal designs. J Mach Learn Res 2008;9(11):2523–47. 链接1

[154] Bramley NR, Dayan P, Griffiths TL, Lagnado DA. Formalizing Neurath’s ship: approximate algorithms for online causal learning. Psychol Rev 2017;124 (3):301–38. 链接1

[155] Fisher RA. The design of experiments. London: Oliver and Boyd; 1935. 链接1

[156] Fire A, Zhu SC. Learning perceptual causality from video. ACM Trans Intell Syst Technol 2016;7(2):23. 链接1

[157] Fire A, Zhu SC. Using causal induction in humans to learn and infer causality from video. In: Proceedings of the 2013 Annual Meeting of the Cognitive Science Society; 2013 Jul 31–Aug 3; Berlin, Germany; 2013.

[158] Zhu SC, Wu YN, Mumford D. Minimax entropy principle and its application to texture modeling. Neural Comput 1997;9(8):1627–60. 链接1

[159] Xu Y, Qin L, Liu X, Xie J, Zhu SC. A causal and–or graph model for visibility fluent reasoning in tracking interacting objects. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018. p. 2178–87.

[160] Xiong C, Shukla N, Xiong W, Zhu SC. Robot learning with a spatial, temporal, and causal and–or graph. In: Proceedings of the 2016 IEEE International Conference on Robotics and Automation; 2016 May 16–21; Stockholm, Sweden; 2016.

[161] McCloskey M, Washburn A, Felch L. Intuitive physics: the straight-down belief and its origin. J Exp Psychol Learn Mem Cogn 1983;9(4):636–49. 链接1

[162] McCloskey M, Caramazza A, Green B. Curvilinear motion in the absence of external forces: naive beliefs about the motion of objects. Science 1980;210 (4474):1139–41. 链接1

[163] DiSessa AA. Unlearning Aristotelian physics: a study of knowledge-based learning. Cogn Sci 1982;6(1):37–75. 链接1

[164] Kaiser MK, Jonides J, Alexander J. Intuitive reasoning about abstract and familiar physics problems. Mem Cognit 1986;14(4):308–12. 链接1

[165] Smith KA, Battaglia P, Vul E. Consistent physics underlying ballistic motion prediction. In: Proceedings of the 2013 Annual Meeting of the Cognitive Science Society; 2013 Jul 31–Aug 3; Berlin, Germany; 2013.

[166] Kaiser MK, Proffitt DR, Whelan SM, Hecht H. Influence of animation on dynamical judgments. J Exp Psychol Hum Percept Perform 1992;18 (3):669–89. 链接1

[167] Kaiser MK, Proffitt DR, Anderson K. Judgments of natural and anomalous trajectories in the presence and absence of motion. J Exp Psychol Learn Mem Cogn 1985;11(4):795–803. 链接1

[168] Kim IK, Spelke ES. Perception and understanding of effects of gravity and inertia on object motion. Dev Sci 1999;2(3):339–62. 链接1

[169] Piaget J, Cook MT. The origins of intelligence in children. New York: International Universities Press; 1952. 链接1

[170] Piaget J, Cook MT. The construction of reality in the child. New York: Basic Books; 1954. 链接1

[171] Hespos SJ, Baillargeon R. Décalage in infants’ knowledge about occlusion and containment events: converging evidence from action tasks. Cognition 2006;99(2):B31–41. 链接1

[172] Hespos SJ, Baillargeon R. Young infants’ actions reveal their developing knowledge of support variables: converging evidence for violation-ofexpectation findings. Cognition 2008;107(1):304–16. 链接1

[173] Bower TGR. Development in infancy. New York: WH Freeman; 1974. 链接1

[174] Leslie AM, Keeble S. Do six-month-old infants perceive causality? Cognition 1987;25(3):265–88. 链接1

[175] Luo Y, Baillargeon R, Brueckner L, Munakata Y. Reasoning about a hidden object after a delay: evidence for robust representations in 5-month-old infants. Cognition 2003;88(3):B23–32. 链接1

[176] Baillargeon R, Li J, Ng W, Yuan S. An account of infants’ physical reasoning. In: Woodward A, Needham A, editors. Learning and the infant mind. New York: Oxford University Press; 2009. p. 66–116. 链接1

[177] Baillargeon R. The acquisition of physical knowledge in infancy: a summary in eight lessons. Blackwell Handb Child Cognit Dev 2002;1:46–83. 链接1

[178] Achinstein P. The nature of explanation. New York: Oxford University Press; 1983. 链接1

[179] Fischer J, Mikhael JG, Tenenbaum JB, Kanwisher N. Functional neuroanatomy of intuitive physical inference. Proc Natl Acad Sci USA 2016;113(34): E5072–81. 链接1

[180] Ullman TD, Spelke E, Battaglia P, Tenenbaum JB. Mind games: game engines as an architecture for intuitive physics. Trends Cogn Sci 2017;21 (9):649–65. 链接1

[181] Bates C, Yildirim I, Tenenbaum JB, Battaglia PW. Humans predict liquid dynamics using probabilistic simulation. In: Proceedings of the 2015 Annual Meeting of the Cognitive Science Society; 2015 Jul 23–25; Pasadena, CA, USA; 2015.

[182] Kubricht J, Jiang C, Zhu Y, Zhu SC, Terzopoulos D, Lu H. Probabilistic simulation predicts human performance on viscous fluid-pouring problem. In: Proceedings of the 2016 Annual Meeting of the Cognitive Science Society; 2016 Aug 10–13; Philadelphia, PA, USA; 2016.

[183] Kubricht J, Zhu Y, Jiang C, Terzopoulos D, Zhu SC, Lu H. Consistent probabilistic simulation underlying human judgment in substance dynamics. In: Proceedings of the 2017 Annual Meeting of the Cognitive Science Society; 2017 Jul 26–29; London, UK; 2017.

[184] Kubricht JR, Holyoak KJ, Lu H. Intuitive physics: current research and controversies. Trends Cogn Sci 2017;21(10):749–59. 链接1

[185] Mumford D, Desolneux A. Pattern theory: the stochastic analysis of real-world signals. Boca Raton: CRC Press; 2010.

[186] Mumford D. Pattern theory: a unifying perspective. In: Joseph A, Mignot F, Murat F, Prum B, Rentschler R, editors. First European congress of mathematics. Heidelberg: Springer; 1994. p. 187–224. 链接1

[187] Julesz B. Visual pattern discrimination. IRE Trans Inf Theory 1962;8(2):84–92. 链接1

[188] Zhu SC, Wu Y, Mumford D. Filters, random fields and maximum entropy (frame): towards a unified theory for texture modeling. Int J Comput Vis 1998;27(2):107–26. 链接1

[189] Julesz B. Textons, the elements of texture perception, and their interactions. Nature 1981;290(5802):91–7. 链接1

[190] Zhu SC, Guo CE, Wang Y, Xu Z. What are textons? Int J Comput Vis 2005;62(1–2):121–43.

[191] Guo C, Zhu SC, Wu YN. Towards a mathematical theory of primal sketch and sketchability. In: Proceedings of the 9th IEEE International Conference on Computer Vision; 2003 Oct 13–16; Nice, France; 2003.

[192] Guo C, Zhu SC, Wu YN. Primal sketch: integrating structure and texture. Comput Vis Image Underst 2007;106(1):5–19. 链接1

[193] Nitzberg M, Mumford DB. The 2.1-D sketch. In: Proceedings of the 3rd International Conference on Computer Vision; 1990 Dec 4–7; Osaka, Japan; 1990.

[194] Wang JYA, Adelson EH. Layered representation for motion analysis. In: Proceedings of the 1993 IEEE Conference on Computer Vision and Pattern Recognition; 1993 Jun 15–17; New York, NY, USA; 1993.

[195] Wang JA, Adelson EH. Representing moving images with layers. IEEE Trans Image Process 1994;3(5):625–38. 链接1

[196] Marr D, Nishihara HK. Representation and recognition of the spatial organization of three-dimensional shapes. Proc R Soc Lond B Biol Sci 1978;200(1140):269–94. 链接1

[197] Binford TO. Visual perception by computer. In: Proceedings of the 1971 IEEE Conference of Systems and Control; 1971 Dec 15–17; Miami Beach, FL, USA; 1971.

[198] Brooks RA. Symbolic reasoning among 3-D models and 2-D images. Artif Intell 1981;17(1–3):285–348. 链接1

[199] Kanade T. Recovery of the three-dimensional shape of an object from a single view. Artif Intell 1981;17(1–3):409–60. 链接1

[200] Broadbent D. A question of levels: comment on McClelland and Rumelhart. J Exp Psychol Gen 1985;114(2):189–92. 链接1

[201] Lowe D. Perceptual organization and visual recognition. Boston: Springer Science & Business Media; 1985.

[202] Pentland AP. Perceptual organization and the representation of natural form. In: Fischler MA, Firschein O, editors. Readings in computer vision. Amsterdam: Elsevier; 1987. p. 680–99. 链接1

[203] Wertheimer M. [Experimental studies on the seeing of motion]. Z Psychol Z Angew Psychol 1912;61(3):161–265. German.

[204] Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, et al. A century of Gestalt psychology in visual perception: I. perceptual grouping and figure–ground organization. Psychol Bull 2012;138(6):1172–217. 链接1

[205] Wagemans J, Feldman J, Gepshtein S, Kimchi R, Pomerantz JR, van der Helm PA, et al. A century of Gestalt psychology in visual perception: II. Conceptual and theoretical foundations. Psychol Bull 2012;138(6):1218–52. 链接1

[206] Köhler W. [The physical Gestalten at rest and in steady state]. Braunschweig: Vieweg und Sohn; 1920. German.

[207] Köhler W. Physical Gestalten. In: Ellis WD, editor. A source book of Gestalt psychology. London: Routledge & Kegan Paul; 1938. p. 17–54. 链接1

[208] Wertheimer M. [Investigations in gestalt theory: II. laws of organization in perceptual forms]. Psychol Forsch 1923;4(1):301–50. German.

[209] Wertheimer M. Laws of organization in perceptual forms. In: Ellis WD, editor. A source book of Gestalt psychology. London: Routledge & Kegan Paul; 1938. p. 71–94. 链接1

[210] Koffka K. Principles of Gestalt psychology. London: Routledge; 1935. 链接1

[211] Waltz D. Understanding line drawings of scenes with shadows. In: Winston PH, Horn B, editors. The psychology of computer vision. New York: McGraw-Hill Companies; 1975.

[212] Barrow HG, Tenenbaum JM. Interpreting line drawings as three-dimensional surfaces. Artif Intell 1981;17(1–3):75–116. 链接1

[213] Lowe DG. Three-dimensional object recognition from single two-dimensional images. Artif Intell 1987;31(3):355–95. 链接1

[214] Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004;60(2):91–110. 链接1

[215] Solso RL, MacLin MK, MacLin OH. Cognitive psychology. 7th ed. New York: Pearson Education; 2005. 链接1

[216] Dayan P, Hinton GE, Neal RM, Zemel RS. The Helmholtz machine. Neural Comput 1995;7(5):889–904. 链接1

[217] Roberts LG. Machine perception of three-dimensional solids [dissertation]. Cambridge: Massachusetts Institute of Technology; 1963. 链接1

[218] Biederman I, Mezzanotte RJ, Rabinowitz JC. Scene perception: detecting and judging objects undergoing relational violations. Cognit Psychol 1982;14(2):143–77.

[219] Blum M, Griffith A, Neumann B. A stability test for configurations of blocks. Technical report. Cambridge: Massachusetts Institute of Technology; 1970.

[220] Brand M, Cooper P, Birnbaum L. Seeing physics, or: physics is for prediction. In: Proceedings of the Workshop on Physics-based Modeling in Computer Vision; 1995 Jun 18–19; Cambridge, MA, USA; 1995. p. 144–50.

[221] Gupta A, Efros AA, Hebert M. Blocks world revisited: image understanding using qualitative geometry and mechanics. In: Proceedings of the 2010 European Conference on Computer Vision; 2010 Sep 5–11; Heraklion, Greece; 2010. p. 482–96.

[222] Hedau V, Hoiem D, Forsyth D. Recovering the spatial layout of cluttered rooms. In: Proceedings of the 2009 International Conference on Computer Vision; 2009 Sep 29–Oct 2; Kyoto, Japan; 2009. p. 1849–56.

[223] Lee DC, Hebert M, Kanade T. Geometric reasoning for single image structure recovery. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009 Jun 20–25; Miami, FL, USA; 2009. p. 2136–43.

[224] Hedau V, Hoiem D, Forsyth D. Recovering free space of indoor scenes from a single image. In: Proceedings of the 2012 Conference on Computer Vision and Pattern Recognition; 2012 Jun 16–21; Providence, RI, USA; 2012. p. 2807–14.

[225] Silberman N, Hoiem D, Kohli P, Fergus R. Indoor segmentation and support inference from RGBD images. In: Proceedings of the 2012 European Conference on Computer Vision; 2012 Oct 7–13; Florence, Italy; 2012. p. 746–60.

[226] Schwing AG, Hazan T, Pollefeys M, Urtasun R. Efficient structured prediction for 3D indoor scene understanding. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition; 2012 Jun 16–21; Providence, RI, USA; 2012. p. 2815–22.

[227] Guo R, Hoiem D. Support surface prediction in indoor scenes. In: Proceedings of the 2013 IEEE International Conference on Computer Vision; 2013 Dec 1–8; Sydney, NSW, Australia; 2013. p. 2144–51.

[228] Shao T, Monszpart A, Zheng Y, Koo B, Xu W, Zhou K, et al. Imagining the unseen: stability-based cuboid arrangements for scene understanding. ACM Trans Graph 2014;33(6):1–11. 链接1

[229] Du Y, Liu Z, Basevi H, Leonardis A, Freeman B, Tenenbaum J, et al. Learning to exploit stability for 3D scene parsing. In: Proceedings of the 2018 Neural Information Processing Systems; 2018 Dec 3–8; Montreal, QC, Canada; 2018.

[230] Wu J, Yildirim I, Lim JJ, Freeman B, Tenenbaum J. Galileo: perceiving physical object properties by integrating a physics engine with deep learning. In: Proceedings of the 2015 Neural Information Processing Systems; 2015 Dec 7–12; Montreal, QC, Canada; 2015.

[231] Wu J, Lim JJ, Zhang H, Tenenbaum JB, Freeman WT. Physics 101: learning physical object properties from unlabeled videos. In: Proceedings of the 2016 British Machine Vision Conference; 2016 Sep 19–22; York, UK; 2016.

[232] Zhu Y, Zhao Y, Zhu SC. Understanding tools: task-oriented object modeling, learning and recognition. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition; 2015 Jun 7–12; Boston, MA, USA; 2015. p. 2855–64.

[233] Zhu Y, Jiang C, Zhao Y, Terzopoulos D, Zhu SC. Inferring forces and learning human utilities from videos. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition; 2016 Jun 26–Jul 1; Las Vegas, NV, USA; 2016.

[234] Brubaker MA, Fleet DJ. The kneed walker for human pose tracking. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition; 2008 Jun 23–28; Anchorage, AK, USA; 2008. p. 1–8.

[235] Brubaker MA, Sigal L, Fleet DJ. Estimating contact dynamics. In: Proceedings of the 2009 IEEE International Conference on Computer Vision; 2009 Sep 29–Oct 2; Kyoto, Japan; 2009. p. 2389–96.

[236] Brubaker MA, Fleet DJ, Hertzmann A. Physics-based person tracking using the anthropomorphic walker. Int J Comput Vis 2010;87(1–2):140–55. 链接1

[237] Pham TH, Kheddar A, Qammaz A, Argyros AA. Towards force sensing from vision: observing hand-object interactions to infer manipulation forces. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition; 2015 Jun 7–12; Boston, MA, USA. p. 2810–9. 链接1

[238] Wang Y, Min J, Zhang J, Liu Y, Xu F, Dai Q, et al. Video-based hand manipulation capture through composite motion control. ACM Trans Graph 2013;32(4):43. 链接1

[239] Zhao W, Zhang J, Min J, Chai J. Robust realtime physics-based motion control for human grasping. ACM Trans Graph 2013;32(6):207. 链接1

[240] Gibson JJ. The perception of the visual world. Boston: Houghton Mifflin; 1950. 链接1

[241] Gibson JJ. The senses considered as perceptual systems. Boston: Houghton Mifflin; 1966. 链接1

[242] Nelson K. Concept, word, and sentence: interrelations in acquisition and development. Psychol Rev 1974;81(4):267–85. 链接1

[243] Gibson JJ. The theory of affordances. In: Gieseking JJ, Mangold W, Katz C, Low S, Saegert S, editors. The people, place, and space reader. New York: Routledge; 2014. 链接1

[244] Hassanin M, Khan S, Tahtali M. Visual affordance and function understanding: a survey. 2018. arXiv:1807.06775.

[245] Min H, Yi C, Luo R, Zhu J, Bi S. Affordance research in developmental robotics: a survey. IEEE Trans Cogn Dev Syst 2016;8(4):237–55. 链接1

[246] Bohg J, Morales A, Asfour T, Kragic D. Data-driven grasp synthesis—a survey. IEEE Trans Robot 2014;30(2):289–309. 链接1

[247] Yamanobe N, Wan W, Ramirez-Alpizar IG, Petit D, Tsuji T, Akizuki S, et al. A brief review of affordance in robotic manipulation research. Adv Robot 2017;31(19–20):1086–101. 链接1

[248] Köhler W. The mentality of apes. New York: Routledge; 1925.

[249] Thorpe WH. Learning and instinct in animals. Cambridge: Harvard University Press; 1956. 链接1

[250] Oakley KP. Man the tool-maker. Chicago: University of Chicago Press; 1968. 链接1

[251] Goodall J. The chimpanzees of Gombe: patterns of behavior. Cambridge: Belknap Press of the Harvard University Press; 1986.

[252] Whiten A, Goodall J, McGrew WC, Nishida T, Reynolds V, Sugiyama Y, et al. Cultures in chimpanzees. Nature 1999;399(6737):682–5. 链接1

[253] Byrne R, Whiten A, editors. Machiavellian intelligence: social expertise and the evolution of intellect in monkeys, apes, and humans. New York: Oxford University Press; 1988. 链接1

[254] Santos LR, Rosati A, Sproul C, Spaulding B, Hauser MD. Means-means-end tool choice in cotton-top tamarins (Saguinus oedipus): finding the limits on primates’ knowledge of tools. Anim Cogn 2005;8:236–46. 链接1

[255] Hunt GR. Manufacture and use of hook-tools by New Caledonian crows. Nature 1996;379(6562):249–51. 链接1

[256] Weir AA, Chappell J, Kacelnik A. Shaping of hooks in New Caledonian crows. Science 2002;297(5583):981. 链接1

[257] McCoy DE, Schiestl M, Neilands P, Hassall R, Gray RD, Taylor AH. New Caledonian crows behave optimistically after using tools. Curr Biol 2019;29(16):2737–42.

[258] Beck BB. Animal tool behavior: the use and manufacture of tools by animals. New York: Garland STPM Press; 1980. 链接1

[259] Bird CD, Emery NJ. Insightful problem solving and creative tool modification by captive nontool-using rooks. Proc Natl Acad Sci USA 2009;106(25):10370–5.

[260] Freeman P, Newell A. A model for functional reasoning in design. In: Proceedings of the 1971 International Joint Conference on Artificial Intelligence; 1971 Sep 1–3; London, England; 1971.

[261] Winston PH. Learning structural descriptions from examples. Technical report. Cambridge: Massachusetts Institute of Technology; 1970.

[262] Winston PH, Binford TO, Katz B, Lowry M. Learning physical descriptions from functional definitions, examples, and precedents. In: Proceedings of the 1983 AAAI Conference on Artificial Intelligence; 1983 Aug 22–26; Washington, DC, USA; 1983.

[263] Brady M, Agre PE. The mechanic’s mate. In: Proceedings of the 6th European Conference on Artificial Intelligence; 1984 Sep 5–7; Pisa, Italy; 1984. p. 79–94.

[264] Connell JH, Brady M. Generating and generalizing models of visual objects. Artif Intell 1987;31(2):159–83. 链接1

[265] Ho SB. Representing and using functional definitions for visual recognition [dissertation]. Madison: The University of Wisconsin-Madison; 1987. 链接1

[266] DiManzo M, Trucco E, Giunchiglia F, Ricci F. FUR: understanding functional reasoning. Int J Intell Syst 1989;4(4):431–57. 链接1

[267] Minsky M. The society of mind. New York: Simon and Schuster Paperbacks; 1988. 链接1

[268] Stark L, Bowyer K. Achieving generalized object recognition through reasoning about association of function to structure. IEEE Trans Pattern Anal Mach Intell 1991;13(10):1097–104. 链接1

[269] Liu Z, Freeman WT, Tenenbaum JB, Wu J. Physical primitive decomposition. In: Proceedings of the 2018 European Conference on Computer Vision; 2018 Sep 8–14; Munich, Germany; 2018.

[270] Baber C. Cognition and tool use: forms of engagement in human and animal use of tools. London: CRC Press; 2003. 链接1

[271] Inhelder B, Piaget J. The growth of logical thinking from childhood to adolescence: an essay on the construction of formal operational structures. London: Psychology Press; 1958. 链接1

[272] Hespos SJ, Baillargeon R. Reasoning about containment events in very young infants. Cognition 2001;78(3):207–45. 链接1

[273] Wang SH, Baillargeon R, Paterson S. Detecting continuity violations in infancy: a new account and new evidence from covering and tube events. Cognition 2005;95(2):129–73. 链接1

[274] Hespos SJ, Spelke ES. Precursors to spatial language: the case of containment. In: Aurnague M, Hickmann M, editors. The categorization of spatial entities in language and cognition. Amsterdam: John Benjamins Publishing; 2007. p. 233–45. 链接1

[275] Strickland B, Scholl BJ. Visual perception involves event-type representations: the case of containment versus occlusion. J Exp Psychol Gen 2015;144(3):570–80. 链接1

[276] Casasola M, Cohen LB. Infant categorization of containment, support and tight-fit spatial relationships. Dev Sci 2002;5(2):247–64. 链接1

[277] Davis E, Marcus G, Frazier-Logue N. Commonsense reasoning about containers using radically incomplete information. Artif Intell 2017;248:46–84. 链接1

[278] Davis E. How does a box work? A study in the qualitative dynamics of solid objects. Artif Intell 2011;175(1):299–345. 链接1

[279] Davis E. Pouring liquids: a study in commonsense physical reasoning. Artif Intell 2008;172(12–13):1540–78. 链接1

[280] Cohn AG. Qualitative spatial representation and reasoning techniques. In: Proceedings of the 1997 Annual Conference on Artificial Intelligence; 1997 Sep 9–12; Freiburg, Germany; 1997. p. 1–30.

[281] Cohn AG, Hazarika SM. Qualitative spatial representation and reasoning: an overview. Fundam Inform 2001;46(1–2):1–29. 链接1

[282] Liang W, Zhao Y, Zhu Y, Zhu SC. Evaluating human cognition of containing relations with physical simulation. In: Proceedings of the 2015 Annual Meeting of the Cognitive Science Society; 2015 Jul 23–25; Pasadena, CA, USA; 2015.

[283] Yu LF, Duncan N, Yeung SK. Fill and transfer: a simple physics-based approach for containability reasoning. In: Proceedings of the 2015 International Conference on Computer Vision; 2015 Dec 11–18; Santiago, Chile; 2015.

[284] Mottaghi R, Schenck C, Fox D, Farhadi A. See the glass half full: reasoning about liquid containers, their volume and content. In: Proceedings of the 2017 International Conference on Computer Vision; 2017 Oct 22–29; Venice, Italy; 2017.

[285] Liang W, Zhao Y, Zhu Y, Zhu SC. What is where: inferring containment relations from videos. In: Proceedings of the 2016 International Joint Conference on Artificial Intelligence; 2016 Jul 9–15; New York, NY, USA; 2016.

[286] Liang W, Zhu Y, Zhu SC. Tracking occluded objects and recovering incomplete trajectories by reasoning about containment relations and human actions. In: Proceedings of the 2018 AAAI Conference on Artificial Intelligence; 2018 Feb 2–7; New Orleans, LA, USA; 2018.

[287] Jiang Y, Lim M, Saxena A. Learning object arrangements in 3D scenes using human context. In: Proceedings of the 29th International Conference on Machine Learning; 2012 Jun 26–Jul 1; Edinburgh, Scotland. p. 907–14. 链接1

[288] Jiang C, Qi S, Zhu Y, Huang S, Lin J, Yu LF, et al. Configurable 3D scene synthesis and 2D image rendering with per-pixel ground truth using stochastic grammars. Int J Comput Vis 2018;126(9):920–41. 链接1

[289] Dautenhahn K, Nehaniv CL, editors. Imitation in animals and artifacts. Cambridge: MIT Press; 2002. 链接1

[290] Argall BD, Chernova S, Veloso M, Browning B. A survey of robot learning from demonstration. Robot Auton Syst 2009;57(5):469–83. 链接1

[291] Osa T, Pajarinen J, Neumann G, Bagnell JA, Abbeel P, Peters J. An algorithmic perspective on imitation learning. Found Trends Rob 2018;7(1–2):1–179. 链接1

[292] Gu Y, Sheng W, Liu M, Ou Y. Fine manipulative action recognition through sensor fusion. In: Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2015 Sep 28–Oct 2; Hamburg, Germany; 2015.

[293] Hammond FL, Mengüç Y, Wood RJ. Toward a modular soft sensor-embedded glove for human hand motion and tactile pressure measurement. In: Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2014 Sep 14–18; Chicago, IL, USA. p. 4000–7. 链接1

[294] Liu H, Xie X, Millar M, Edmonds M, Gao F, Zhu Y, et al. A glove-based system for studying hand-object manipulation via joint pose and force sensing. In: Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2017 Sep 24–28; Vancouver, BC, USA. p. 6617–24. 链接1

[295] Edmonds M, Gao F, Xie X, Liu H, Qi S, Zhu Y, et al. Feeling the force: integrating force and pose for fluent discovery through imitation learning to open medicine bottles. In: Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2017 Sep 24–28; Vancouver, BC, USA. p. 3530–7. 链接1

[296] Liu H, Zhang Y, Si W, Xie X, Zhu Y, Zhu SC. Interactive robot knowledge patching using augmented reality. In: Proceedings of the 2018 IEEE International Conference on Robotics and Automation; 2018 May 21–25; Brisbane, QLD, Australia. p. 1947–54. 链接1

[297] Edmonds M, Gao F, Liu H, Xie X, Qi S, Rothrock B, et al. A tale of two explanations: enhancing human trust by explaining robot behavior. Sci Robot 2019;4(37):eaay4663. 链接1

[298] Liu H, Zhang C, Zhu Y, Jiang C, Zhu SC. Mirroring without overimitation: learning functionally equivalent manipulation actions. In: Proceedings of the 2019 AAAI Conference on Artificial Intelligence; 2019 Jan 27–Feb 1; Honolulu, HI, USA; 2019.

[299] Dennett DC. The intentional stance. Cambridge: MIT Press; 1989. 链接1

[300] Heider F. The psychology of interpersonal relations. London: Psychology Press; 2013. 链接1

[301] Gergely G, Nádasdy Z, Csibra G, Bíró S. Taking the intentional stance at 12 months of age. Cognition 1995;56(2):165–93. 链接1

[302] Premack D, Woodruff G. Does the chimpanzee have a theory of mind? Behav Brain Sci 1978;1(4):515–26. 链接1

[303] Baldwin DA, Baird JA. Discerning intentions in dynamic human action. Trends Cogn Sci 2001;5(4):171–8. 链接1

[304] Woodward AL. Infants selectively encode the goal object of an actor’s reach. Cognition 1998;69(1):1–34. 链接1

[305] Meltzoff AN, Brooks R. “Like me” as a building block for understanding other minds: bodily acts, attention, and intention. In: Malle BF, Moses LJ, Baldwin DA, editors. Intentions and intentionality: foundations of social cognition. Cambridge: MIT Press; 2001. p. 171–92.

[306] Baldwin DA, Baird JA, Saylor MM, Clark MA. Infants parse dynamic action. Child Dev 2001;72(3):708–17. 链接1

[307] Tomasello M, Carpenter M, Call J, Behne T, Moll H. Understanding and sharing intentions: the origins of cultural cognition. Behav Brain Sci 2005;28(5):675–91.

[308] Biro S, Hommel B. Becoming an intentional agent: introduction to the special issue. Acta Psychol 2007;124(1):1–7. 链接1

[309] Gergely G, Bekkering H, Király I. Rational imitation in preverbal infants. Nature 2002;415(6873):755. 链接1

[310] Woodward AL, Sommerville JA, Gerson S, Henderson AME, Buresh J. The emergence of intention attribution in infancy. Psychol Learn Motiv 2009;51:187–222. 链接1

[311] Zelazo PD, Astington JW, Olson DR, editors. Developing theories of intention: social understanding and self-control. Mahwah: Lawrence Erlbaum Associates Publishers; 1999. 链接1

[312] Bloom P. Intention, history, and artifact concepts. Cognition 1996;60(1):1–29.

[313] Heider F, Simmel M. An experimental study of apparent behavior. Am J Psychol 1944;57(2):243–59. 链接1

[314] Berry DS, Misovich SJ. Methodological approaches to the study of social event perception. Pers Soc Psychol Bull 1994;20(2):139–52. 链接1

[315] Bassili JN. Temporal and spatial contingencies in the perception of social events. J Pers Soc Psychol 1976;33(6):680–5. 链接1

[316] Dittrich WH, Lea SE. Visual perception of intentional motion. Perception 1994;23(3):253–68. 链接1

[317] Dennett DC. Précis of the intentional stance. Behav Brain Sci 1988;11(3):495–505.

[318] Liu S, Brooks NB, Spelke ES. Origins of the concepts cause, cost, and goal in prereaching infants. Proc Natl Acad Sci USA 2019;116(36):17747–52. 链接1

[319] Gao T, Newman GE, Scholl BJ. The psychophysics of chasing: a case study in the perception of animacy. Cognit Psychol 2009;59(2):154–79. 链接1

[320] Liu S, Spelke ES. Six-month-old infants expect agents to minimize the cost of their actions. Cognition 2017;160:35–42. 链接1

[321] Gergely G, Csibra G. Teleological reasoning in infancy: the naïve theory of rational action. Trends Cogn Sci 2003;7(7):287–92. 链接1

[322] Baker CL, Saxe R, Tenenbaum JB. Action understanding as inverse planning. Cognition 2009;113(3):329–49. 链接1

[323] Pereira LM, Anh HT. Intention recognition via causal Bayes networks plus plan generation. In: Proceedings of the 14th Portuguese Conference on Artificial Intelligence; 2009 Oct 12–15; Aveiro, Portugal; 2009. p. 138–49.

[324] Narang S, Best A, Manocha D. Inferring user intent using Bayesian theory of mind in shared avatar-agent virtual environments. IEEE Trans Vis Comput Graph 2019;25(5):2113–22. 链接1

[325] Nakahashi R, Baker CL, Tenenbaum JB. Modeling human understanding of complex intentional action with a Bayesian nonparametric subgoal model. In: Proceedings of the 2016 AAAI Conference on Artificial Intelligence; 2016 Feb 12–17; Phoenix, AZ, USA; 2016.

[326] Holtzen S, Zhao Y, Gao T, Tenenbaum JB, Zhu SC. Inferring human intent from video by sampling hierarchical plans. In: Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2016 Oct 9–14; Daejeon, Korea. p. 1489–96. 链接1

[327] Kong Y, Fu Y. Human action recognition and prediction: a survey. 2018. arXiv:1806.11230.

[328] Blakemore SJ, Decety J. From the perception of action to the understanding of intention. Nat Rev Neurosci 2001;2(8):561–7. 链接1

[329] Elsner B, Hommel B. Effect anticipation and action control. J Exp Psychol Hum Percept Perform 2001;27(1):229–40. 链接1

[330] Elsner B. Infants’ imitation of goal-directed actions: the role of movements and action effects. Acta Psychol 2007;124(1):44–59. 链接1

[331] Rizzolatti G, Craighero L. The mirror–neuron system. Annu Rev Neurosci 2004;27(1):169–92. 链接1

[332] Kaplan JT, Iacoboni M. Getting a grip on other minds: mirror neurons, intention understanding, and cognitive empathy. Soc Neurosci 2006;1(3–4):175–83.

[333] Reid VM, Csibra G, Belsky J, Johnson MH. Neural correlates of the perception of goal-directed action in infants. Acta Psychol 2007;124(1):129–38. 链接1

[334] Csibra G, Gergely G. The teleological origins of mentalistic action explanations: a developmental hypothesis. Dev Sci 1998;1(2):255–9.

[335] Gergely G. The development of understanding self and agency. In: Goswami U, editor. Blackwell handbook of childhood cognitive development. Oxford: Blackwell Publishers Ltd.; 2002. p. 26–46.

[336] Kleinke CL. Gaze and eye contact: a research review. Psychol Bull 1986;100(1):78–100.

[337] Emery NJ. The eyes have it: the neuroethology, function and evolution of social gaze. Neurosci Biobehav Rev 2000;24(6):581–604. 链接1

[338] Burgoon JK, Guerrero LK, Floyd K. Nonverbal communication. New York: Routledge; 2016. 链接1

[339] Wei P, Liu Y, Shu T, Zheng N, Zhu SC. Where and why are they looking? Jointly inferring human attention and intentions in complex tasks. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018. p. 6801–9.

[340] Melis AP, Tomasello M. Chimpanzees (Pan troglodytes) coordinate by communicating in a collaborative problem-solving task. Proc R Soc B 2019;286(1901):20190408.

[341] Fan L, Chen Y, Wei P, Wang W, Zhu SC. Inferring shared attention in social scene videos. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018. p. 6460–8.

[342] Fan L, Wang W, Huang S, Tang X, Zhu SC. Understanding human gaze communication by spatio-temporal graph reasoning. In: Proceedings of the 2019 International Conference on Computer Vision; 2019 Oct 27–Nov 2; Seoul, Korea. p. 5724–33. 链接1

[343] Trick S, Koert D, Peters J, Rothkopf C. Multimodal uncertainty reduction for intention recognition in human–robot interaction. 2019. arXiv:1907.02426.

[344] Shu T, Ryoo MS, Zhu SC. Learning social affordance for human–robot interaction. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence; 2016 Jul 9–15; New York, NY, USA; 2016. p. 3454–61.

[345] Shu T, Gao X, Ryoo MS, Zhu SC. Learning social affordance grammar from videos: transferring human interactions to human–robot interactions. In: Proceedings of the 2017 IEEE International Conference on Robotics and Automation; 2017 May 29–Jun 3; Singapore, Singapore; 2017.

[346] Russell SJ, Norvig P. Artificial intelligence: a modern approach. 3rd ed. New York: Pearson Education Limited; 2016. 链接1

[347] Hutcheson F. An inquiry into the original of our ideas of beauty and virtue: in two treatises. 2nd ed. London: Darby J, Bettesworth A, Fayram F, Pemberton J, Rivington C, Hooke J, Clay F, Batley J, Symon E; 1726.

[348] Mill JS. Utilitarianism. 12th ed. New York: Longmans, Green and Company; 1895. 链接1

[349] Shukla N, He Y, Chen F, Zhu SC. Learning human utility from video demonstrations for deductive planning in robotics. In: Proceedings of the 1st Annual Conference on Robot Learning; 2017 Nov 13–15; Mountain View, CA, USA. p. 448–57. 链接1

[350] Grice HP, Cole P, Morgan J. Logic and conversation. In: Ezcurdia M, Stainton RJ, editors. The semantics–pragmatics boundary in philosophy. Toronto: Broadview Press; 2013. 链接1

[351] Goodman ND, Frank MC. Pragmatic language interpretation as probabilistic inference. Trends Cogn Sci 2016;20(11):818–29. 链接1

[352] Lewis D. Convention: a philosophical study. Oxford: Blackwell Publishers; 2002. 链接1

[353] Sperber D, Wilson D. Relevance: communication and cognition. Cambridge: Harvard University Press; 1986. 链接1

[354] Wittgenstein L. Philosophical investigations. New York: Macmillan; 1953. 链接1

[355] Clark HH. Using language. Cambridge: Cambridge University Press; 1996. 链接1

[356] Qing C, Franke M. Variations on a Bayesian theme: comparing Bayesian models of referential reasoning. In: Zeevat H, Schmitz HC, editors. Bayesian natural language semantics and pragmatics. Heidelberg: Springer; 2015. p. 201–20. 链接1

[357] Goodman ND, Stuhlmüller A. Knowledge and implicature: modeling language understanding as social cognition. Top Cogn Sci 2013;5(1):173–84. 链接1

[358] Dale R, Reiter E. Computational interpretations of the Gricean maxims in the generation of referring expressions. Cogn Sci 1995;19(2):233–63. 链接1

[359] Benz A, Jäger G, van Rooij R. An introduction to game theory for linguists. In: Benz A, Jäger G, van Rooij R, editors. Game theory and pragmatics. London: Palgrave Macmillan; 2006. p. 1–82. 链接1

[360] Jäger G. Applications of game theory in linguistics. Lang Linguist Compass 2008;2(3):406–21. 链接1

[361] Frank MC, Goodman ND. Predicting pragmatic reasoning in language games. Science 2012;336(6084):998. 链接1

[362] Kleiman-Weiner M, Gerstenberg T, Levine S, Tenenbaum JB. Inference of intention and permissibility in moral decision making. In: Proceedings of the 2015 Annual Meeting of the Cognitive Science Society; 2015 Jul 23–25; Pasadena, CA, USA; 2015.

[363] Kleiman-Weiner M, Ho MK, Austerweil JL, Littman ML, Tenenbaum JB. Coordinate to cooperate or compete: abstract goals and joint intentions in social interaction. In: Proceedings of the 2016 Annual Meeting of the Cognitive Science Society; 2016 Aug 10–13; Philadelphia, PA, USA; 2016.

[364] Shum M, Kleiman-Weiner M, Littman ML, Tenenbaum JB. Theory of minds: understanding behavior in groups through inverse planning. In: Proceedings of the 2019 AAAI Conference on Artificial Intelligence; 2019 Jan 27–Feb 1; Honolulu, HI, USA; 2019.

[365] Kleiman-Weiner M, Shaw A, Tenenbaum JB. Constructing social preferences from anticipated judgments: when impartial inequity is fair and why? In: Proceedings of the 2017 Annual Meeting of the Cognitive Science Society; 2017 Jul 26–29; London, UK; 2017.

[366] Kleiman-Weiner M, Saxe R, Tenenbaum JB. Learning a commonsense moral theory. Cognition 2017;167:107–23. 链接1

[367] Kinney M, Tsatsoulis C. Learning communication strategies in multiagent systems. Appl Intell 1998;9(1):71–91. 链接1

[368] Lowe R, Wu Y, Tamar A, Harb J, Abbeel P, Mordatch I. Multi-agent actor-critic for mixed cooperative–competitive environments. In: Proceedings of the 2017 Neural Information Processing Systems; 2017 Dec 3–9; Long Beach, CA, USA; 2017.

[369] Foerster J, Assael IA, de Freitas N, Whiteson S. Learning to communicate with deep multi-agent reinforcement learning. In: Proceedings of the 2016 Neural Information Processing Systems; 2016 Dec 5–10; Barcelona, Spain; 2016.

[370] Foerster J, Nardelli N, Farquhar G, Afouras T, Torr PHS, Kohli P, et al. Stabilising experience replay for deep multi-agent reinforcement learning. In: Proceedings of the 34th International Conference on Machine Learning; 2017 Aug 6–11; Sydney, NSW, Australia. p. 1146–55. 链接1

[371] Holyoak KJ. Analogy and relational reasoning. In: Holyoak KJ, Morrison RG, editors. The Oxford handbook of thinking and reasoning. New York: Oxford University Press; 2012. p. 234–59. 链接1

[372] Raven JC. Raven progressive matrices. Torrance: Western Psychological Services; 1938. 链接1

[373] Zhang C, Gao F, Jia B, Zhu Y, Zhu SC. RAVEN: a dataset for relational and analogical visual reasoning. In: Proceedings of the 2019 Conference on Computer Vision and Pattern Recognition; 2019 Jun 16–20; Long Beach, CA, USA. p. 5317–27. 链接1

[374] Legg S, Hutter M. Universal intelligence: a definition of machine intelligence. Minds Mach 2007;17(4):391–444. 链接1

[375] Mo K, Zhu S, Chang AX, Yi L, Tripathi S, Guibas LJ, et al. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. In: Proceedings of the 2019 Conference on Computer Vision and Pattern Recognition; 2019 Jun 16–20; Long Beach, CA, USA. p. 909–18. 链接1

[376] Chang AX, Funkhouser T, Guibas L, Hanrahan P, Huang Q, Li Z, et al. ShapeNet: an information-rich 3D model repository. 2015. arXiv:1512.03012.

[377] Feng T, Yu LF, Yeung SK, Yin K, Zhou K. Crowd-driven mid-scale layout design. ACM Trans Graph 2016;35(4):132.

[378] Savva M, Chang AX, Dosovitskiy A, Funkhouser T, Koltun V. MINOS: multimodal indoor simulator for navigation in complex environments. 2017. arXiv:1712.03931.

[379] Brodeur S, Perez E, Anand A, Golemo F, Celotti L, Strub F, et al. HoME: a household multimodal environment. 2017. arXiv:1711.11017.

[380] Xia F, Zamir AR, He Z, Sax A, Malik J, Savarese S. Gibson Env: real-world perception for embodied agents. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018. p. 9068–79.

[381] Wu Y, Wu YX, Gkioxari G, Tian Y. Building generalizable agents with a realistic and rich 3D environment. 2018. arXiv:1801.02209.

[382] Kolve E, Mottaghi R, Han W, VanderBilt E, Weihs L, Herrasti A, et al. AI2-THOR: an interactive 3D environment for visual AI. 2017. arXiv:1712.05474.

[383] Puig X, Ra K, Boben M, Li J, Wang T, Fidler S, et al. VirtualHome: simulating household activities via programs. In: Proceedings of the 2018 Conference on Computer Vision and Pattern Recognition; 2018 Jun 18–22; Salt Lake City, UT, USA; 2018. p. 8494–502.

[384] Xie X, Liu H, Zhang Z, Qiu Y, Gao F, Qi S, et al. VRGym: a virtual testbed for physical and interactive AI. In: Proceedings of the ACM TURC; 2019 May 17–19; Chengdu, China; 2019.

[385] Gao X, Gong R, Shu T, Xie X, Wang S, Zhu SC. VRKitchen: an interactive 3D virtual environment for task-oriented learning. 2019. arXiv:1903.05757.

[386] Shah S, Dey D, Lovett C, Kapoor A. AirSim: high-fidelity visual and physical simulation for autonomous vehicles. In: Hutter M, Siegwart R, editors. Field and service robotics. Cham: Springer; 2018. p. 621–35.

[387] Gao M, Wang X, Wu K, Pradhana A, Sifakis E, Yuksel C, et al. GPU optimization of material point methods. ACM Trans Graph 2018;37(6):254.

[388] Terzopoulos D, Platt J, Barr A, Fleischer K. Elastically deformable models. In: Stone MC, editor. Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques; 1987 Jul 27–31; Anaheim, CA, USA. New York: Association for Computing Machinery; 1987. p. 205–14.

[389] Terzopoulos D, Fleischer K. Modeling inelastic deformation: viscoelasticity, plasticity, fracture. In: Beach RJ, editor. Proceedings of the 15th Annual Conference on Computer Graphics and Interactive Techniques; 1988 Aug 1–5; Atlanta, GA, USA. New York: Association for Computing Machinery; 1988. p. 269–78.

[390] Foster N, Metaxas D. Realistic animation of liquids. Graph Models Image Proc 1996;58(5):471–83.

[391] Stam J. Stable fluids. ACM Trans Graph 1999;99:121–8.

[392] Bridson R. Fluid simulation for computer graphics. London: CRC Press; 2015.

[393] Bonet J, Wood RD. Nonlinear continuum mechanics for finite element analysis. New York: Cambridge University Press; 1997.

[394] Blemker S, Teran J, Sifakis E, Fedkiw R, Delp S. Fast 3D muscle simulations using a new quasistatic invertible finite-element algorithm. In: Proceedings of the 2005 International Symposium on Computer Simulation in Biomechanics; 2005 Jul 28–30; Cleveland, OH, USA; 2005.

[395] Hegemann J, Jiang C, Schroeder C, Teran JM. A level set method for ductile fracture. In: Proceedings of the 12th ACM SIGGRAPH/Eurographics Symposium on Computer Animation; 2013 Jul 19–21; Anaheim, CA, USA; 2013. p. 193–201.

[396] Gast TF, Schroeder C, Stomakhin A, Jiang C, Teran JM. Optimization integrator for large time steps. IEEE Trans Vis Comput Graph 2015;21(10):1103–15.

[397] Li M, Gao M, Langlois T, Jiang C, Kaufman DM. Decomposed optimization time integrator for large-step elastodynamics. ACM Trans Graph 2019;38(4):70.

[398] Wang Y, Jiang C, Schroeder C, Teran J. An adaptive virtual node algorithm with robust mesh cutting. In: Proceedings of the 2014 ACM SIGGRAPH/Eurographics Symposium on Computer Animation; 2014 Jul 21–23; Copenhagen, Denmark; 2014. p. 77–85.

[399] Monaghan JJ. Smoothed particle hydrodynamics. Annu Rev Astron Astrophys 1992;30(1):543–74.

[400] Liu WK, Jun S, Zhang YF. Reproducing kernel particle methods. Int J Numer Methods Fluids 1995;20(8–9):1081–106.

[401] Li S, Liu WK. Meshfree and particle methods and their applications. Appl Mech Rev 2002;55(1):1–34.

[402] Donea J, Giuliani S, Halleux JP. An arbitrary Lagrangian-Eulerian finite element method for transient dynamic fluid–structure interactions. Comput Methods Appl Mech Eng 1982;33(1–3):689–723.

[403] Brackbill JU, Ruppel HM. FLIP: a method for adaptively zoned, particle-in-cell calculations of fluid flows in two dimensions. J Comput Phys 1986;65(2):314–43.

[404] Jiang C, Schroeder C, Selle A, Teran J, Stomakhin A. The affine particle-in-cell method. ACM Trans Graph 2015;34(4):51.

[405] Sulsky D, Chen Z, Schreyer HL. A particle method for history-dependent materials. Comput Methods Appl Mech Eng 1994;118(1–2):179–96.

[406] Sulsky D, Zhou SJ, Schreyer HL. Application of a particle-in-cell method to solid mechanics. Comput Phys Commun 1995;87(1–2):236–52.

[407] Stomakhin A, Schroeder C, Chai L, Teran J, Selle A. A material point method for snow simulation. ACM Trans Graph 2013;32(4):102.

[408] Gaume J, Gast T, Teran J, van Herwijnen A, Jiang C. Dynamic anticrack propagation in snow. Nat Commun 2018;9(1):3047.

[409] Ram D, Gast T, Jiang C, Schroeder C, Stomakhin A, Teran J, et al. A material point method for viscoelastic fluids, foams and sponges. In: Proceedings of the 14th ACM SIGGRAPH/Eurographics Symposium on Computer Animation; 2015 Aug 7–9; Los Angeles, CA, USA; 2015. p. 157–63.

[410] Yue Y, Smith B, Batty C, Zheng C, Grinspun E. Continuum foam: a material point method for shear-dependent flows. ACM Trans Graph 2015;34(5):160.

[411] Fang Y, Li M, Gao M, Jiang C. Silly rubber: an implicit material point method for simulating non-equilibrated viscoelastic and elastoplastic solids. ACM Trans Graph 2019;38(4):118.

[412] Klár G, Gast T, Pradhana A, Fu C, Schroeder C, Jiang C, et al. Drucker-Prager elastoplasticity for sand animation. ACM Trans Graph 2016;35(4):103.

[413] Daviet G, Bertails-Descoubes F. A semi-implicit material point method for the continuum simulation of granular materials. ACM Trans Graph 2016;35(4):102.

[414] Hu Y, Fang Y, Ge Z, Qu Z, Zhu Y, Pradhana A, et al. A moving least squares material point method with displacement discontinuity and two-way rigid body coupling. ACM Trans Graph 2018;37(4):150.

[415] Wang S, Ding M, Gast TF, Zhu L, Gagniere S, Jiang C, et al. Simulation and visualization of ductile fracture with the material point method. ACM Trans Graph 2019;2(2):18.

[416] Wolper J, Fang Y, Li M, Lu J, Gao M, Jiang C. CD-MPM: continuum damage material point methods for dynamic fracture animation. ACM Trans Graph 2019;38(4):119.

[417] Jiang C, Gast T, Teran J. Anisotropic elastoplasticity for cloth, knit and hair frictional contact. ACM Trans Graph 2017;36(4):152.

[418] Han X, Gast TF, Guo Q, Wang S, Jiang C, Teran J. A hybrid material point method for frictional contact with diverse materials. ACM Trans Graph 2019;2(2):17.

[419] Fu C, Guo Q, Gast T, Jiang C, Teran J. A polynomial particle-in-cell method. ACM Trans Graph 2017;36(6):222.

[420] Stomakhin A, Schroeder C, Jiang C, Chai L, Teran J, Selle A. Augmented MPM for phase-change and varied materials. ACM Trans Graph 2014;33(4):138.

[421] Tampubolon AP, Gast T, Klár G, Fu C, Teran J, Jiang C, et al. Multi-species simulation of porous sand and water mixtures. ACM Trans Graph 2017;36(4):105.

[422] Gao M, Pradhana A, Han X, Guo Q, Kot G, Sifakis E, et al. Animating fluid sediment mixture in particle-laden flows. ACM Trans Graph 2018;37(4):149.

[423] Nairn JA. Material point method calculations with explicit cracks. Comput Model Eng Sci 2003;4(6):649–64.

[424] Chen Z, Shen L, Mai YW, Shen YG. A bifurcation-based decohesion model for simulating the transition from localization to decohesion with the MPM. Z Angew Math Phys 2005;56(5):908–30.

[425] Schreyer HL, Sulsky DL, Zhou SJ. Modeling delamination as a strong discontinuity with the material point method. Comput Methods Appl Mech Eng 2002;191(23–24):2483–507.

[426] Sulsky D, Schreyer HL. Axisymmetric form of the material point method with applications to upsetting and Taylor impact problems. Comput Methods Appl Mech Eng 1996;139(1–4):409–29.

[427] Huang P, Zhang X, Ma S, Wang HK. Shared memory OpenMP parallelization of explicit MPM and its application to hypervelocity impact. Comput Model Eng Sci 2008;38(2):119–48.

[428] Hu W, Chen Z. Model-based simulation of the synergistic effects of blast and fragmentation on a concrete wall using the MPM. Int J Impact Eng 2006;32(12):2066–96.

[429] York AR, Sulsky D, Schreyer HL. Fluid-membrane interaction based on the material point method. Int J Numer Methods Eng 2000;48(6):901–24.

[430] Bandara S, Soga K. Coupling of soil deformation and pore fluid flow using material point method. Comput Geotech 2015;63:199–214.

[431] Guilkey JE, Hoying JB, Weiss JA. Computational modeling of multicellular constructs with the material point method. J Biomech 2006;39(11):2074–86.

[432] Huang P. Material point method for metal and soil impact dynamics problems. Beijing: Tsinghua University; 2010.

[433] Fang Y, Hu Y, Hu SM, Jiang C. A temporally adaptive material point method with regional time stepping. Comput Graph Forum 2018;37(8):195–204.

[434] Bardenhagen SG, Kober EM. The generalized interpolation material point method. Comput Model Eng Sci 2004;5(6):477–96.

[435] Gao M, Tampubolon AP, Jiang C, Sifakis E. An adaptive generalized interpolation material point method for simulating elastoplastic materials. ACM Trans Graph 2017;36(6):223.

[436] Sadeghirad A, Brannon RM, Burghardt J. A convected particle domain interpolation technique to extend applicability of the material point method for problems involving massive deformations. Int J Numer Methods Eng 2011;86(12):1435–56.

[437] Zhang DZ, Ma X, Giguere PT. Material point method enhanced by modified gradient of shape function. J Comput Phys 2011;230(16):6379–98.

[438] Bernstein DS, Givan R, Immerman N, Zilberstein S. The complexity of decentralized control of Markov decision processes. Math Oper Res 2002;27(4):819–40.

[439] Goldman CV, Zilberstein S. Optimizing information exchange in cooperative multi-agent systems. In: Proceedings of the 2nd International Joint Conference on Autonomous Agents and Multiagent Systems; 2003 Jul 14–18; Melbourne, VIC, Australia. p. 137–44.

[440] Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, et al. Playing Atari with deep reinforcement learning. 2013. arXiv:1312.5602.

[441] Tampuu A, Matiisen T, Kodelja D, Kuzovkin I, Korjus K, Aru J, et al. Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 2017;12(4):e0172395.

[442] Foerster JN, Farquhar G, Afouras T, Nardelli N, Whiteson S. Counterfactual multi-agent policy gradients. In: Proceedings of the 2018 AAAI Conference on Artificial Intelligence; 2018 Feb 2–7; New Orleans, LA, USA; 2018.

[443] Sukhbaatar S, Fergus R. Learning multiagent communication with backpropagation. In: Proceedings of the 2016 Neural Information Processing Systems; 2016 Dec 5–10; Barcelona, Spain; 2016. p. 2244–52.

[444] Mordatch I, Abbeel P. Emergence of grounded compositional language in multi-agent populations. In: Proceedings of the 2018 AAAI Conference on Artificial Intelligence; 2018 Feb 2–7; New Orleans, LA, USA; 2018.

[445] Lazaridou A, Peysakhovich A, Baroni M. Multi-agent cooperation and the emergence of (natural) language. In: Proceedings of the 5th International Conference on Learning Representations; 2017 Apr 24–26; Toulon, France; 2017.

[446] Havrylov S, Titov I. Emergence of language with multi-agent games: learning to communicate with sequences of symbols. In: Proceedings of the 2017 Neural Information Processing Systems; 2017 Dec 3–9; Long Beach, CA, USA; 2017.

[447] Evtimova K, Drozdov A, Kiela D, Cho K. Emergent language in a multi-modal, multi-step referential game. 2017. arXiv:1705.10369.

[448] Lazaridou A, Hermann KM, Tuyls K, Clark S. Emergence of linguistic communication from referential games with symbolic and pixel input. In: Proceedings of the 2018 International Conference on Learning Representations; 2018 Apr 30–May 3; Vancouver, BC, Canada; 2018.

[449] Wagner K, Reggia JA, Uriagereka J, Wilkinson GS. Progress in the simulation of emergent communication and language. Adapt Behav 2003;11(1):37–69.

[450] Ibsen-Jensen R, Tkadlec J, Chatterjee K, Nowak MA. Language acquisition with communication between learners. J R Soc Interface 2018;15(140):20180073.

[451] Graesser L, Cho K, Kiela D. Emergent linguistic phenomena in multi-agent communication games. 2019. arXiv:1901.08706.

[452] Dupoux E, Jacob P. Universal moral grammar: a critical appraisal. Trends Cogn Sci 2007;11(9):373–8.

[453] Mikhail J. Elements of moral cognition: Rawls’ linguistic analogy and the cognitive science of moral and legal judgment. New York: Cambridge University Press; 2011.

[454] Blake PR, McAuliffe K, Corbit J, Callaghan TC, Barry O, Bowie A, et al. The ontogeny of fairness in seven societies. Nature 2015;528(7581):258–61.

[455] Henrich J, Boyd R, Bowles S, Camerer C, Fehr E, Gintis H, et al. In search of homo economicus: behavioral experiments in 15 small-scale societies. Am Econ Rev 2001;91(2):73–8.

[456] House BR, Silk JB, Henrich J, Barrett HC, Scelza BA, Boyette AH, et al. Ontogeny of prosocial behavior across diverse societies. Proc Natl Acad Sci USA 2013;110(36):14586–91.

[457] Graham J, Meindl P, Beall E, Johnson KM, Zhang L. Cultural differences in moral judgment and behavior, across and within societies. Curr Opin Psychol 2016;8:125–30.

[458] Hurka T. Virtue, vice, and value. Cambridge: Oxford University Press; 2000.

[459] Rawls J. A theory of justice. Cambridge: Harvard University Press; 1971.

[460] Haidt J. The new synthesis in moral psychology. Science 2007;316(5827):998–1002.

[461] Hamlin JK. Moral judgment and action in preverbal infants and toddlers: evidence for an innate moral core. Curr Dir Psychol Sci 2013;22(3):186–93.

[462] Kim R, Kleiman-Weiner M, Abeliuk A, Awad E, Dsouza S, Tenenbaum JB, et al. A computational model of commonsense moral decision making. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society; 2018 Feb 2–3; New Orleans, LA, USA; 2018. p. 197–203.

[463] Holyoak KJ, Thagard P. The analogical mind. Am Psychol 1997;52(1):35–44.

[464] Buehner MJ, Cheng PW. Causal learning. In: Holyoak KJ, Morrison RG, editors. The Oxford handbook of thinking and reasoning. New York: Oxford University Press; 2012. p. 210–33.

[465] Hesse MB. Models and analogies in science. South Bend: Notre Dame University Press; 1966.

[466] Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 2013 Neural Information Processing Systems; 2013 Dec 5–8; Lake Tahoe, NV, USA; 2013.

[467] Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. 2013. arXiv:1301.3781.

[468] Carpenter PA, Just MA, Shell P. What one intelligence test measures: a theoretical account of the processing in the Raven progressive matrices test. Psychol Rev 1990;97(3):404–31.

[469] Antol S, Agrawal A, Lu J, Mitchell M, Batra D, Zitnick CL, et al. VQA: visual question answering. In: Proceedings of the 2015 International Conference on Computer Vision; 2015 Dec 11–18; Santiago, Chile; 2015. p. 2425–33.

[470] Snow RE, Kyllonen PC, Marshalek B. The topography of ability and learning correlations. Adv Psychol Hum Intell 1984;2(S47):103.

[471] Jaeggi SM, Buschkuehl M, Jonides J, Perrig WJ. Improving fluid intelligence with training on working memory. Proc Natl Acad Sci USA 2008;105(19):6829–33.

[472] Bower GH. A contrast effect in differential conditioning. J Exp Psychol 1961;62(2):196–9.

[473] Meyer DR. The effects of differential rewards on discrimination reversal learning by monkeys. J Exp Psychol 1951;41(4):268–74.

[474] Schrier AM, Harlow HF. Effect of amount of incentive on discrimination learning by monkeys. J Comp Physiol Psychol 1956;49(2):117–22.

[475] Shapley RM, Victor JD. The effect of contrast on the transfer properties of cat retinal ganglion cells. J Physiol 1978;285(1):275–98.

[476] Lawson R. Brightness discrimination performance and secondary reward strength as a function of primary reward amount. J Comp Physiol Psychol 1957;50(1):35–9.

[477] Amsel A. Frustrative nonreward in partial reinforcement and discrimination learning: some recent history and a theoretical extension. Psychol Rev 1962;69(4):306–28.

[478] Gibson JJ, Gibson EJ. Perceptual learning: differentiation or enrichment? Psychol Rev 1955;62(1):32–41.

[479] Gibson JJ. The ecological approach to visual perception: classic edition. London: Psychology Press; 2014.

[480] Catrambone R, Holyoak KJ. Overcoming contextual limitations on problem-solving transfer. J Exp Psychol Learn Mem Cogn 1989;15(6):1147–56.

[481] Gentner D, Gunn V. Structural alignment facilitates the noticing of differences. Mem Cognit 2001;29(4):565–77.

[482] Hammer R, Diesendruck G, Weinshall D, Hochstein S. The development of category learning strategies: what makes the difference? Cognition 2009;112(1):105–19.

[483] Gick ML, Paterson K. Do contrasting examples facilitate schema acquisition and analogical transfer? Can J Psychol 1992;46(4):539.

[484] Haryu E, Imai M, Okada H. Object similarity bootstraps young children to action-based verb extension. Child Dev 2011;82(2):674–86.

[485] Smith L, Gentner D. The role of difference-detection in learning contrastive categories. In: Proceedings of the 2014 Annual Meeting of the Cognitive Science Society; 2014 Jul 23–26; Quebec City, QC, Canada; 2014.

[486] Gentner D. Structure-mapping: a theoretical framework for analogy. Cogn Sci 1983;7(2):155–70.

[487] Gentner D, Markman AB. Structural alignment in comparison: no difference without similarity. Psychol Sci 1994;5(3):152–8.

[488] Schwartz DL, Chase CC, Oppezzo MA, Chin DB. Practicing versus inventing with contrasting cases: the effects of telling first on learning and transfer. J Educ Psychol 2011;103(4):759–75.

[489] Zhang C, Jia B, Gao F, Zhu Y, Lu H, Zhu SC. Learning perceptual inference by contrasting. In: Proceedings of the 2019 Neural Information Processing Systems; 2019 Dec 8–14; Vancouver, BC, Canada; 2019.

[490] Dehaene S. The number sense: how the mind creates mathematics. New York: Oxford University Press; 2011.

[491] Zhang W, Zhang C, Zhu Y, Zhu SC. Machine number sense: a dataset of visual arithmetic problems for abstract and relational reasoning. In: Proceedings of the 2020 AAAI Conference on Artificial Intelligence; 2020 Feb 7–12; New York, NY, USA; 2020.
