
Engineering, 2023, Volume 25, Issue 6. doi: 10.1016/j.eng.2022.10.017

Communicative Learning: A Unified Machine Learning Paradigm

a Beijing Institute for General Artificial Intelligence, Beijing 100086, China
b Department of Computer Science, University of California, Los Angeles, CA 90024, USA
c Department of Automation, Tsinghua University, Beijing 100084, China
d Institute for Artificial Intelligence, Peking University, Beijing 100871, China

Received: 2022-02-04 Revised: 2022-09-05 Accepted: 2022-10-27 Published: 2023-03-31


Abstract

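In this paper, we propose a communicative learning framework that unifies existing machine learning paradigms, such as passive learning, active learning, and algorithmic teaching, and that facilitates the development of new learning methods. Rooted in human cooperative communication, this paradigm characterizes learning as a communicative process and brings the idea of pedagogy into machine learning. Introducing pedagogy lets machines learn from multiple sources of information: in addition to conventional randomly sampled data, these include purposeful information from a teacher who adapts the teaching to the student. Specifically, in communicative learning, a teacher and a student communicate to complete the learning of specific knowledge. Each agent has a mind, which comprises its knowledge, its utility, and the dynamics that govern how the mind changes. Each agent also estimates its partner's mind in order to communicate efficiently. We give a teacher-student mental representation and a learning formulation that support this recursive process; these structures give communicative learning a learning efficiency similar to that of humans. We further demonstrate the feasibility of communicative learning on several typical human-machine cooperation tasks, and show that the model can go beyond Shannon's communication limit. Finally, we present the contributions of the communicative learning framework to fundamental learning theory, including a proposed hierarchy of learning and a definition of the halting problem of learning.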


