
Strategic Study of CAE, 2024, Volume 26, Issue 1. DOI: 10.15302/J-SSCAE-2024.01.012

Spike-Based Vision for Autonomous Driving Scenarios

1. National Engineering Research Center of Visual Technology, Beijing 100871, China;

2. Institute of Artificial Intelligence, Peking University, Beijing 100871, China

Funding project: Chinese Academy of Engineering consulting project "Research on the Development Strategy of New-Generation Artificial Intelligence and Industrial Clusters" (2022-PP-07). Received: 2023-11-14; Revised: 2024-01-05; Available online: 2024-02-02


Abstract

Autonomous driving is an important research direction in computer vision with broad application prospects, and pure-vision perception schemes have significant research value in autonomous driving scenarios. Unlike traditional cameras, the spike vision sensor offers imaging speeds more than a thousand times faster, along with advantages such as high temporal resolution, high dynamic range, low data redundancy, and low power consumption. Focusing on autonomous driving scenarios, this study introduces the imaging principles, perception capabilities, and advantages of the spike camera. Centering on vision tasks related to autonomous driving, it elaborates on the principles and methods of spike-based image/video reconstruction, discusses approaches to image enhancement based on sensor fusion with spike cameras, and describes in detail the algorithms and technical routes for optical flow estimation, object recognition, detection, segmentation, and tracking, as well as depth estimation of three-dimensional scenes based on spike cameras. It also summarizes the development of spike camera datasets and systems. Finally, it analyzes the challenges, potential solutions, and future directions for spike vision research. Spike cameras, together with their algorithms and systems, hold great potential in the field of autonomous driving and represent one of the future research directions in computer vision.
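To make the imaging principle concrete, below is a minimal illustrative sketch, not the authors' implementation, of two commonly used spike-to-image reconstruction baselines: estimating brightness from the firing rate within a short window, and from the inter-spike interval at each pixel. The binary spike-tensor layout, function names, and parameters are assumptions chosen for illustration only.

```python
# Illustrative sketch (assumptions, not the paper's reference code):
# a spike camera fires a binary spike whenever a pixel's accumulated
# brightness crosses a threshold (integrate-and-fire sampling), so
# brighter pixels fire more often and with shorter intervals.
import numpy as np

def reconstruct_rate(spikes, window=32):
    """Rate-based reconstruction: mean firing rate over a recent window
    approximates scene brightness (higher rate -> brighter pixel)."""
    t = spikes.shape[0]
    w = min(window, t)
    return spikes[t - w:].mean(axis=0)  # values in [0, 1]

def reconstruct_interval(spikes, t_ref=None):
    """Interval-based reconstruction: brightness is inversely proportional
    to the inter-spike interval bracketing a reference time."""
    t, h, w = spikes.shape
    if t_ref is None:
        t_ref = t // 2
    img = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            ts = np.flatnonzero(spikes[:, y, x])  # spike time indices
            if ts.size < 2:
                continue
            idx = np.searchsorted(ts, t_ref)
            idx = np.clip(idx, 1, ts.size - 1)
            img[y, x] = 1.0 / (ts[idx] - ts[idx - 1])
    return img / img.max() if img.max() > 0 else img

# Usage on a synthetic stream whose right half is twice as bright.
rng = np.random.default_rng(0)
rate = np.full((64, 64), 0.1)
rate[:, 32:] = 0.2
spikes = (rng.random((256, 64, 64)) < rate).astype(np.uint8)
print(reconstruct_rate(spikes)[:, :32].mean(),   # ~0.1
      reconstruct_rate(spikes)[:, 32:].mean())   # ~0.2
```

The rate-based estimate trades temporal resolution for noise robustness, whereas the interval-based estimate can recover texture from as few as two spikes per pixel, which is one reason spike cameras suit very high-speed imaging.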


