
Engineering >> 2023, Volume 25, Issue 6. doi: 10.1016/j.eng.2021.12.012

Memory Imaging

a Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
b Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
c Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518071, China
d Institute for Brain and Cognitive Science, Tsinghua University, Beijing 100084, China
e Department of Automation, Tsinghua University, Beijing 100084, China
f Beijing Laboratory of Brain and Cognitive Intelligence, Beijing Municipal Education Commission, Beijing 100010, China
g Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310012, China

Received: 2021-06-10; Revised: 2021-12-03; Accepted: 2021-12-08; Published online: 2022-02-17


Abstract

Wide-field-of-view, high-resolution light-field imaging is key to perceiving and understanding multiple objects in large scenes. Existing structured camera-array systems follow the uniform-sensing principle of a single image sensor, which makes them rigid and hard to scale, and makes their stitching-based reconstruction insufficiently robust. Meanwhile, existing visual-understanding architectures strictly follow a single feed-forward path of imaging first and processing afterwards, so the perceptual reconstruction and the semantic understanding of images remain independent of each other. By contrast, the human visual system perceives and understands through two pathways, feed-forward and feedback: the feed-forward pathway extracts object representations (termed memory engrams) from visual input, while in the feedback pathway the relevant engrams are reactivated to generate a high-resolution imagination of the objects. Inspired by this, we propose a dual-pathway imaging mechanism, which we call memory imaging. We first extract a holistic representation of the large scene and abstract instance-level local objects into memory engrams; we then establish bidirectional associations between the scene and the local objects to achieve joint perception and understanding. The imaging system alternates between an excitation–inhibition state and an association state. In the former, pixel-level details are either dynamically consolidated into memory engrams or suppressed. In the association state, the engrams corresponding to an object are activated according to its currently observed image information, and the high-resolution image details of the current moment are synthesized through spatiotemporally consistent mapping. Experimental results show that the memory-imaging system has the potential to change the conventional imaging paradigm and holds great promise for perceiving and understanding the complex relationships among multiple objects in large scenes.
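The alternation between the two states described above can be sketched as a toy memory bank. This is an illustrative sketch only: the class and method names (`EngramStore`, `consolidate`, `associate`), the salience threshold, and the simple averaging used as a stand-in for spatiotemporally consistent warping and fusion are all assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class EngramStore:
    """Toy memory bank holding instance-level object representations ("engrams")."""
    engrams: dict = field(default_factory=dict)  # object_id -> stored detail vector

    def consolidate(self, object_id, detail, salience, threshold=0.5):
        """Excitation-inhibition state: salient pixel-level details are
        consolidated into an engram; low-salience details are suppressed."""
        if salience >= threshold:
            self.engrams[object_id] = detail   # excitation: store or refresh the engram
        # else: inhibition -- the detail is not committed to memory

    def associate(self, object_id, observation):
        """Association state: reactivate the engram matching the current
        observation and fuse it into a high-resolution estimate."""
        engram = self.engrams.get(object_id)
        if engram is None:
            return observation                 # no engram yet: fall back to raw observation
        # Stand-in for spatiotemporally consistent mapping + detail synthesis:
        return [(e + o) / 2 for e, o in zip(engram, observation)]

# Usage: one salient object is consolidated, one is inhibited,
# then the stored engram is reactivated against a new observation.
store = EngramStore()
store.consolidate("person_1", [0.9, 0.8, 0.7], salience=0.9)  # consolidated
store.consolidate("person_2", [0.1, 0.1, 0.1], salience=0.2)  # inhibited
fused = store.associate("person_1", [0.5, 0.5, 0.5])
```

The key design point the sketch illustrates is the bidirectional association: the forward pass writes object representations into memory, and the backward (association) pass reads them out to enrich the current low-resolution observation.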

Figures: Fig. 1 through Fig. 9 (thumbnails; captions not shown on this page).

