
Engineering >> 2019, Volume 5, Issue 5 doi: 10.1016/j.eng.2019.03.010

Brain Encoding and Decoding in fMRI with Bidirectional Deep Generative Models

a Research Center for Brain-Inspired Intelligence and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
b School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
c Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China

Received: 2017-09-29 Revised: 2018-06-08 Accepted: 2019-03-28 Available online: 2019-06-01


Abstract

Brain encoding and decoding via functional magnetic resonance imaging (fMRI) are two important aspects of visual perception neuroscience. Although previous researchers have made significant advances in brain encoding and decoding models, existing methods could still be improved with advanced machine learning techniques. For example, traditional methods usually build the encoding and decoding models separately, and are prone to overfitting on small datasets. Effectively unifying the encoding and decoding procedures may allow for more accurate predictions. In this paper, we first review the existing encoding and decoding methods and discuss the potential advantages of a "bidirectional" modeling strategy. Next, we show that there are correspondences between deep neural networks and human visual streams in terms of architecture and computational rules. Furthermore, deep generative models (e.g., variational autoencoders (VAEs) and generative adversarial networks (GANs)) have produced promising results in studies on brain encoding and decoding. Finally, we propose that the dual learning method, which was originally designed for machine translation tasks, could help to improve the performance of encoding and decoding models by leveraging large-scale unpaired data.
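To make the idea of using unpaired data concrete, the following is a minimal sketch rather than the method of the paper. It assumes linear encoding and decoding maps and replaces the original machine-translation formulation of dual learning with a simple reconstruction (cycle) loss. The function name `dual_loss`, the weight matrices `W_enc` and `W_dec`, and the weighting factor `lam` are hypothetical names introduced only for illustration.

```python
import numpy as np

def dual_loss(W_enc, W_dec, X_pair, Y_pair, X_unpaired, Y_unpaired, lam=1.0):
    """Supervised loss on paired data plus a cycle loss on unpaired data.

    Assumed shapes (illustrative only):
      X_*: (n_samples, n_stimulus_features), Y_*: (n_samples, n_voxels)
      W_enc: (n_stimulus_features, n_voxels), W_dec: (n_voxels, n_stimulus_features)
    """
    # Supervised terms: the encoder predicts voxel responses from stimuli,
    # and the decoder predicts stimuli from voxel responses.
    enc_err = np.mean((X_pair @ W_enc - Y_pair) ** 2)
    dec_err = np.mean((Y_pair @ W_dec - X_pair) ** 2)
    # Cycle terms need no pairing: stimulus -> predicted activity -> stimulus,
    # and activity -> predicted stimulus -> activity.
    cycle_x = np.mean((X_unpaired @ W_enc @ W_dec - X_unpaired) ** 2)
    cycle_y = np.mean((Y_unpaired @ W_dec @ W_enc - Y_unpaired) ** 2)
    return enc_err + dec_err + lam * (cycle_x + cycle_y)

# Toy data: a small paired set and a larger unpaired set (synthetic, not fMRI).
rng = np.random.default_rng(0)
W_true = 2.0 * np.eye(3)
X_pair = rng.normal(size=(5, 3))
Y_pair = X_pair @ W_true
X_unpaired = rng.normal(size=(50, 3))
Y_unpaired = rng.normal(size=(50, 3))

# An encoder/decoder pair that are mutual inverses incurs (near) zero loss;
# a decoder that does not invert the encoder is penalized.
consistent = dual_loss(W_true, np.linalg.inv(W_true),
                       X_pair, Y_pair, X_unpaired, Y_unpaired)
mismatched = dual_loss(W_true, np.eye(3),
                       X_pair, Y_pair, X_unpaired, Y_unpaired)
```

In this toy setting the cycle terms constrain the two models against each other even on samples that have no matching partner, which is the intuition behind pairing a small labeled set with a large unpaired one.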


