
Theory of Collective Intelligence Evolution and Its Applications in Intelligent Robots
Xiaoya Qi, Chuang Liu, Chen Fu, Zhongxue Gan
Strategic Study of CAE, 2018, Vol. 20, Issue (4): 101-111.
Collective intelligence (CI) has been widely studied over the past few decades. The best-known CI algorithm is ant colony optimization (ACO), which solves complex path-searching problems through CI emergence. Recently, DeepMind announced the AlphaZero program, which achieved superhuman performance in the games of Go, chess, and shogi through tabula rasa reinforcement learning from self-play. By implementing an AlphaZero-style program for the game of Gomoku, and by analyzing and comparing the Monte-Carlo tree search (MCTS) and ACO algorithms, we find that the success of AlphaZero rests not only on deep neural networks and reinforcement learning but also on MCTS, which we identify as a CI emergence algorithm. We therefore propose a theory of CI evolution as a general framework toward artificial general intelligence (AGI). By combining the strengths of deep learning, reinforcement learning, and CI algorithms, the theory enables individual intelligence to evolve with high efficiency and low cost through CI emergence. The theory has natural applications in intelligent robots: we develop a cloud-terminal platform that helps intelligent robots evolve their intelligence models. As a proof of concept, an intelligent model that optimizes a welding robot's welding parameters is implemented on the platform.
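The CI emergence mechanism in ACO that the abstract refers to can be illustrated with a minimal sketch: ants build tours probabilistically, biased by shared pheromone trails, and shorter tours deposit more pheromone, creating the positive feedback through which a good solution emerges collectively. This is an illustrative toy (a hypothetical 4-city TSP instance with made-up distances and parameters), not the article's implementation.

```python
import random

# Toy 4-city TSP instance (distances are invented for illustration).
DIST = {
    (0, 1): 2.0, (0, 2): 9.0, (0, 3): 10.0,
    (1, 2): 6.0, (1, 3): 4.0, (2, 3): 8.0,
}
N = 4
ALPHA, BETA, RHO, Q = 1.0, 2.0, 0.5, 10.0  # pheromone weight, heuristic weight, evaporation, deposit scale

def edge(a, b):
    return (min(a, b), max(a, b))

def d(a, b):
    return DIST[edge(a, b)]

def build_tour(pher, rng):
    """One ant builds a tour; next city chosen with probability
    proportional to pheromone^alpha * (1/distance)^beta."""
    tour, current = [0], 0
    unvisited = set(range(1, N))
    while unvisited:
        cities = list(unvisited)
        weights = [pher[edge(current, c)] ** ALPHA * (1.0 / d(current, c)) ** BETA
                   for c in cities]
        current = rng.choices(cities, weights=weights)[0]
        tour.append(current)
        unvisited.remove(current)
    return tour

def tour_length(tour):
    return sum(d(tour[i], tour[(i + 1) % N]) for i in range(N))

def aco(iters=50, ants=10, seed=0):
    rng = random.Random(seed)
    pher = {e: 1.0 for e in DIST}  # shared pheromone: the collective memory
    best, best_len = None, float("inf")
    for _ in range(iters):
        tours = [build_tour(pher, rng) for _ in range(ants)]
        for e in pher:                      # evaporation forgets stale trails
            pher[e] *= (1.0 - RHO)
        for t in tours:                     # shorter tours deposit more pheromone
            length = tour_length(t)
            for i in range(N):
                pher[edge(t[i], t[(i + 1) % N])] += Q / length
            if length < best_len:
                best, best_len = t, length
    return best, best_len

best_tour, best_len = aco()
```

No single ant computes the optimum; it emerges from the pheromone-mediated positive feedback across the colony, which is the emergence property the abstract highlights.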
collective intelligence / emergence / evolution / positive feedback / ant colony optimization / Monte-Carlo tree search / distributed AI cloud-terminal platform / intelligent robot
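The abstract's claim that MCTS is itself a CI emergence algorithm rests on the same structural pattern: many simulations share accumulated visit and value statistics, and the UCB1 selection rule balances exploiting well-scoring moves against exploring under-sampled ones, so a strong move emerges from the aggregate. A minimal sketch of that selection step (the statistics below are invented for illustration):

```python
import math

def uct_select(children, c=1.41):
    """Pick the child index maximizing UCB1:
    Q/N + c * sqrt(ln(N_parent) / N_child).
    children: list of (total_value, visit_count) pairs."""
    parent_visits = sum(n for _, n in children)
    def score(child):
        q, n = child
        if n == 0:
            return float("inf")  # unvisited moves are tried first
        return q / n + c * math.sqrt(math.log(parent_visits) / n)
    return max(range(len(children)), key=lambda i: score(children[i]))

# A well-explored strong move, a barely-tried weaker one, and an unvisited one:
stats = [(6.0, 10), (1.0, 2), (0.0, 0)]
```

Here `uct_select(stats)` returns the unvisited move (index 2); with only the first two children it prefers the under-explored index 1 despite its lower mean value, which is the exploration pressure that lets the search's collective statistics converge on good moves.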