
1000× Faster Camera and Machine Vision with Ordinary Devices
Tiejun Huang, Yajing Zheng, Zhaofei Yu, Rui Chen, Yuan Li, Ruiqin Xiong, Lei Ma, Junwei Zhao, Siwei Dong, Lin Zhu, Jianing Li, Shanshan Jia, Yihua Fu, Boxin Shi, Si Wu, Yonghong Tian
Engineering, 2023, Vol. 25, Issue 6: 110-119.
Digital cameras have a major limitation: the image and video forms they inherited from film cameras prevent them from capturing the rapidly changing photonic world. Here, we present vidar, a bit sequence array in which each bit indicates whether the accumulation of photons has reached a threshold, so that the scene radiance can be recorded and reconstructed at any moment. Using only consumer-level complementary metal-oxide-semiconductor (CMOS) sensors and integrated circuits, we have developed a vidar camera that is 1000× faster than conventional cameras. By treating vidar as spike trains in biological vision, we have further developed a spiking neural network (SNN)-based machine vision system that combines the speed of the machine with the mechanism of biological vision, achieving high-speed object detection and tracking 1000× faster than human vision. We demonstrate the utility of the vidar camera and the super vision system in an assistant referee and a target pointing system. Our study is expected to fundamentally change the concepts of image and video and the related industries, including photography, movies, and visual media, and to open a new era of SNN-enabled, speed-free machine vision.
Vidar camera / Spiking neural networks / Super vision system / Full-time imaging
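The abstract describes vidar as a bit sequence array in which each bit marks whether accumulated photons have reached a threshold. A minimal sketch of that idea in Python is given below for illustration only. The function names `encode_vidar` and `reconstruct_by_count`, the threshold value, the subtract-on-fire reset, and the windowed spike-count estimate are all assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def encode_vidar(intensity, threshold=1.0):
    """Turn a light sequence into a per-pixel bit sequence array.

    intensity: array of shape (T, H, W) holding the non-negative light
               collected by each pixel in each time step (arbitrary units).
    Returns a uint8 array of the same shape; a 1 means the accumulator of
    that pixel reached `threshold` in that step.

    Assumption: after firing, the accumulator is reduced by `threshold`
    so that any surplus carries over to the next step.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    acc = np.zeros(intensity.shape[1:], dtype=np.float64)
    bits = np.zeros(intensity.shape, dtype=np.uint8)
    for t in range(intensity.shape[0]):
        acc += intensity[t]           # accumulate incoming light
        fired = acc >= threshold      # pixels whose accumulator reached the threshold
        bits[t][fired] = 1            # record one bit per fired pixel
        acc[fired] -= threshold       # hypothetical reset rule (see docstring)
    return bits


def reconstruct_by_count(bits, t, window, threshold=1.0):
    """Estimate per-step intensity near step t from the spike count in a window.

    This is one simple estimator chosen for illustration: the number of
    spikes in the window times the threshold, divided by the window length.
    """
    lo = max(0, t - window // 2)
    hi = min(bits.shape[0], lo + window)
    return bits[lo:hi].sum(axis=0) * threshold / (hi - lo)


if __name__ == "__main__":
    # Two pixels under constant light: one dim (0.25), one brighter (0.5).
    frames = np.full((40, 1, 2), 0.25)
    frames[:, 0, 1] = 0.5
    bits = encode_vidar(frames)
    print(bits[:8, 0, :].T)                            # first 8 bits of each pixel
    print(reconstruct_by_count(bits, t=20, window=8))  # estimated intensities
```

In this sketch a brighter pixel fires more often, so the density of 1 bits around any time step carries the intensity at that moment, and a shorter window trades estimation accuracy for temporal resolution.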