
Strategic Study of CAE, 2018, Volume 20, Issue 4. doi: 10.15302/J-SSCAE-2018.04.017

Theory of Collective Intelligence Evolution and Its Applications in Intelligent Robots

1. Beijing Deep Singularity Technology Co., Ltd., Beijing 100086, China;

2. Intelligent Robot Research Institute, Fudan University, Shanghai 200433, China

Funding project: CAE Advisory Project "Research on Intelligent Manufacturing Led by New-Generation Artificial Intelligence" (2017-ZD-08-03)

Received: 2018-08-10; Revised: 2018-08-15


Abstract

Collective intelligence (CI) has been widely studied over the past few decades. The best-known CI algorithm is ant colony optimization (ACO), which solves complex path-searching problems through CI emergence. Recently, DeepMind announced the AlphaZero program, which achieved superhuman performance in the games of Go, chess, and shogi through tabula rasa reinforcement learning from self-play. By implementing an AlphaZero-style program for the game of Gomoku, and by analyzing and comparing the Monte Carlo tree search (MCTS) and ACO algorithms, we find that the success of AlphaZero rests not only on deep neural networks and reinforcement learning but also on the MCTS algorithm, which is itself a CI emergence algorithm. We therefore propose a theory of CI evolution as a general framework toward artificial general intelligence (AGI). Combining the strengths of deep learning, reinforcement learning, and CI algorithms, the theory enables individual intelligence to evolve efficiently and at low cost through CI emergence. The theory has natural applications in intelligent robots: we develop a cloud-terminal platform that helps robots evolve their intelligence models and, as a proof of concept, implement a welding robot's welding-parameter optimization model on the platform.
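To make the emergence mechanism the abstract describes concrete, the following is a minimal sketch (in Python) of ACO applied to a shortest-path search: many simple agents deposit and follow "pheromone", and a good path emerges from their accumulated statistics rather than from any single agent. The graph, the parameter values, and all function names are illustrative assumptions for this sketch, not the implementation described in the paper.

    import random

    # Toy weighted graph: node -> {neighbor: edge_cost}. Illustrative only.
    GRAPH = {
        "A": {"B": 1.0, "C": 4.0},
        "B": {"C": 1.0, "D": 5.0},
        "C": {"D": 1.0},
        "D": {},
    }
    ALPHA, RHO, Q = 1.0, 0.1, 1.0  # pheromone weight, evaporation rate, deposit scale

    def walk(pheromone, start="A", goal="D"):
        """One ant builds a path probabilistically, biased by pheromone and cost."""
        path, node = [start], start
        while node != goal:
            nbrs = [n for n in GRAPH[node] if n not in path]  # avoid cycles
            if not nbrs:
                return None  # dead end, discard this ant
            weights = [(pheromone[(node, n)] ** ALPHA) / GRAPH[node][n] for n in nbrs]
            node = random.choices(nbrs, weights=weights)[0]
            path.append(node)
        return path

    def cost(path):
        return sum(GRAPH[a][b] for a, b in zip(path, path[1:]))

    def aco(iterations=50, n_ants=10):
        pheromone = {(u, v): 1.0 for u in GRAPH for v in GRAPH[u]}
        best = None
        for _ in range(iterations):
            paths = [p for p in (walk(pheromone) for _ in range(n_ants)) if p]
            for key in pheromone:          # evaporation: old trails fade
                pheromone[key] *= (1.0 - RHO)
            for p in paths:                # deposit: short paths get reinforced
                for edge in zip(p, p[1:]):
                    pheromone[edge] += Q / cost(p)
            for p in paths:
                if best is None or cost(p) < cost(best):
                    best = p
        return best

    if __name__ == "__main__":
        print(aco())  # typically converges to A -> B -> C -> D (cost 3.0)

The analogy to MCTS, as drawn in the abstract, is that the visit counts and value estimates stored at tree nodes play the role of the pheromone table: the standard UCB1 selection rule, a = argmax_a [ Q(s,a) + c * sqrt(ln N(s) / N(s,a)) ], reinforces branches that have performed well while still exploring rarely visited ones. In both algorithms the solution emerges from statistics accumulated over many cheap stochastic runs, which is why the paper treats MCTS as a CI emergence algorithm.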

