Visual knowledge: an attempt to explore machine creativity Perspectives

Yueting Zhuang, Siliang Tang,yzhuang@zju.edu.cn,siliang@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2100116

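Abstract: A long-standing question in artificial intelligence (AI) is whether AI can be creative, or, put differently, whether an algorithm's reasoning process can be creative. This paper explores AI creativity from the perspective of the science of thinking. First, it lists related research on reasoning with imagery thinking; then it focuses on a special form of visual knowledge representation, the visual scene graph; finally, it describes in detail the problem of visual scene graph construction and its potential applications. All the evidence indicates that visual knowledge and visual thinking can not only improve the performance of current AI tasks but also be used in the practice of machine creativity.

Keywords: Thinking science; Imagery thinking reasoning; Visual knowledge representation; Visual scene graph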

Unsupervised object detection with scene-adaptive concept learning Research Articles

Shiliang Pu, Wei Zhao, Weijie Chen, Shicai Yang, Di Xie, Yunhe Pan,xiedi@hikvision.com

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000567

Abstract: Object detection, one of the hottest research directions in computer vision, has already made impressive progress in academia and has many valuable applications in industry. However, the mainstream detection methods still have two shortcomings: (1) even a model that is well trained using large amounts of data still cannot generally be used across different kinds of scenes; (2) once a model is deployed, it cannot autonomously evolve along with the accumulated unlabeled scene data. To address these problems, and inspired by visual knowledge theory, we propose a novel scene-adaptive evolution algorithm that can decrease the impact of scene changes through the concept of object groups. We first extract a large number of object proposals from unlabeled data through a pre-trained detection model. Second, we build the dictionary of object concepts by clustering the proposals, in which each cluster center represents an object prototype. Third, we look into the relations between different clusters and the object information of different groups, and propose a graph-based group information propagation strategy to determine the category of an object concept, which can effectively distinguish positive and negative proposals. With these pseudo labels, we can easily fine-tune the pre-trained model. The effectiveness of the proposed method is verified through a range of experiments, which show significant improvements.

Keywords: Visual knowledge; Unsupervised video object detection; Scene-adaptive learning

Miniaturized five fundamental issues about visual knowledge Perspectives

Yun-he Pan,panyh@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2040000

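Abstract: Cognitive psychology has long pointed out that an important part of human knowledge memory is visual knowledge, which is used for imagery thinking. Vision-based artificial intelligence (AI) is therefore an unavoidable and significant topic for AI. Following the paper "On visual knowledge," this paper discusses five related fundamental issues: (1) visual knowledge representation; (2) visual recognition; (3) simulation of visual imagery thinking; (4) learning of visual knowledge; and (5) multiple knowledge representation. The unique advantages of visual knowledge are its capabilities for comprehensive generation of imagery, spatio-temporal evolution, and visual display. These are exactly what character-based knowledge and deep neural networks lack. Combining AI with computer-aided design, graphics, and vision technologies will provide an important foundational impetus for new AI developments in creation, prediction, and human-machine integration. Research on visual knowledge and multiple knowledge representation is the key to developing new visual intelligence, and also the key theory and technology for achieving important breakthroughs in AI 2.0. This is a barren, cold, and damp yet fertile "Great Northern Wilderness," and also a promising "no-man's land" worthy of bold multidisciplinary exploration.

Keywords: Visual knowledge representation; Visual recognition; Visual imagery thinking simulation; Visual knowledge learning; Multiple knowledge representation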

Toward the Next Generation of Retinal Neuroprosthesis: Visual Computation with Spikes Review

Zhaofei Yu, Jian K. Liu, Shanshan Jia, Yichen Zhang, Yajing Zheng, Yonghong Tian, Tiejun Huang

Engineering 2020, Volume 6, Issue 4,   Pages 449-461 doi: 10.1016/j.eng.2020.02.004

Abstract:

A neuroprosthesis is a type of precision medical device that is intended to manipulate the neuronal signals of the brain in a closed-loop fashion, while simultaneously receiving stimuli from the environment and controlling some part of a human brain or body. Incoming visual information can be processed by the brain in millisecond intervals. The retina computes visual scenes and sends its output to the cortex in the form of neuronal spikes for further computation. Thus, the neuronal signal of interest for a retinal neuroprosthesis is the neuronal spike. Closed-loop computation in a neuroprosthesis includes two stages: encoding a stimulus as a neuronal signal, and decoding it back into a stimulus. In this paper, we review some of the recent progress that has been achieved in visual computation models that use spikes to analyze natural scenes that include static images and dynamic videos. We hypothesize that in order to obtain a better understanding of the computational principles in the retina, a hypercircuit view of the retina is necessary, in which the different functional network motifs that have been revealed in the cortex neuronal network are taken into consideration when interacting with the retina. The different building blocks of the retina, which include a diversity of cell types and synaptic connections—both chemical synapses and electrical synapses (gap junctions)—make the retina an ideal neuronal network for adapting the computational techniques that have been developed in artificial intelligence to model the encoding and decoding of visual scenes. An overall systems approach to visual computation with neuronal spikes is necessary in order to advance the next generation of retinal neuroprosthesis as an artificial visual system.

Keywords: Visual coding     Retina     Neuroprosthesis     Brain–machine interface     Artificial intelligence     Deep learning     Spiking neural network     Probabilistic graphical model    

Novel robust simultaneous localization and mapping for long-term autonomous robots Research Articles

Wei WEI, Xiaorui ZHU, Yi WANG,weirui9003@gmail.com,xiaoruizhu@hit.edu.cn,wangyi601@aliyun.com

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 2,   Pages 234-245 doi: 10.1631/FITEE.2000358

Abstract: A fundamental task for mobile robots is simultaneous localization and mapping (SLAM). Moreover, long-term robustness is an important property for SLAM. When vehicles or robots steer fast or steer in certain scenarios, such as low-texture environments, long corridors, tunnels, or other duplicated structural environments, most SLAM systems might fail. In this paper, we propose a novel robust visual inertial LiDaR navigation (VILN) SLAM system, including stereo visual-inertial LiDaR odometry and visual-LiDaR loop closure. The proposed VILN SLAM system can perform well with low drift after long-term experiments, even when the LiDaR or visual measurements are degraded occasionally in complex scenes. Extensive experimental results show that the robustness has been greatly improved in various scenarios compared to state-of-the-art SLAM systems.

Keywords: Simultaneous localization and mapping (SLAM)     Long-term     Robustness     Light detection and ranging (LiDaR)     Visual inertial LiDaR navigation (VILN)    

Three-dimensional shape space learning for visual concept construction: challenges and research progress Perspective

Xin TONG

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 9,   Pages 1290-1297 doi: 10.1631/FITEE.2200318

Abstract: Human beings can easily categorize three-dimensional (3D) objects with similar shapes and functions into a set of “visual concepts” and learn “visual knowledge” of the surrounding 3D real world. Developing efficient methods to learn the computational representation of the visual concept and the visual knowledge is a critical task in artificial intelligence. A crucial step to this end is to learn the shape space spanned by all 3D objects that belong to one visual concept. In this paper, we present the key technical challenges and recent research progress in 3D shape space learning and discuss the open problems and research opportunities in this area.

Keywords: Visual concept; Visual knowledge; 3D geometry learning; 3D shape space; 3D structure

Visual-feature-assisted mobile robot localization in a long corridor environment Research Article

Gengyu GE, Yi ZHANG, Wei WANG, Lihe HU, Yang WANG, Qin JIANG,gegengyu_2021@163.com,zhangyi@cqupt.edu.cn

Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 6,   Pages 876-889 doi: 10.1631/FITEE.2200208

Abstract: Localization plays a vital role in the navigation system and is a fundamental capability for autonomous movement. In an indoor environment, the current mainstream scheme uses two-dimensional (2D) laser light detection and ranging (LiDAR) to build an occupancy grid map with simultaneous localization and mapping (SLAM) technology; it then locates the robot based on the known grid map. However, such solutions work effectively only in those areas with salient geometrical features. For areas with repeated, symmetrical, or similar structures, such as a long corridor, the conventional particle filtering method will fail. To solve this crucial problem, this paper presents a novel coarse-to-fine paradigm that uses visual features to assist mobile robot localization in a long corridor. First, the mobile robot is remote-controlled to move from the starting position to the end along a middle line. In the moving process, a grid map is built using the laser-based SLAM method. At the same time, a visual map consisting of special images which are keyframes is created according to a keyframe selection strategy. The keyframes are associated with the robot’s poses through timestamps. Second, a moving strategy is proposed, based on the extracted range features of the laser scans, to decide on an initial rough position. This is vital for the mobile robot because it gives instructions on where the robot needs to move to adjust its pose. Third, the mobile robot captures images in a proper perspective according to the moving strategy and matches them with the image map to achieve a coarse localization. Finally, an improved particle filtering method is presented to achieve fine localization. Experimental results show that our method is effective and robust for global localization. The success rate reaches 98.8% while the average moving distance is only 0.31 m. In addition, the method works well when the mobile robot is kidnapped to another position in the corridor.

Keywords: Mobile robot     Localization     Simultaneous localization and mapping (SLAM)     Corridor environment     Particle filter     Visual features    

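Hybrid-augmented intelligence: collaboration and cognition Review

Nanning Zheng, Ziyi Liu, Pengju Ren, Yongqiang Ma, Shitao Chen, Siyu Yu, Jianru Xue, Badong Chen, Feiyue Wang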

Frontiers of Information Technology & Electronic Engineering 2017, Volume 18, Issue 2,   Pages 153-179 doi: 10.1631/FITEE.1700053

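Abstract: This paper discusses the basic framework of human-machine collaborative hybrid-augmented intelligence and the basic elements of hybrid-augmented intelligence based on cognitive computing: intuitive reasoning and causal models, and the evolution of memory and knowledge. In particular, it discusses the role and basic principles of intuitive reasoning in solving complex problems, as well as cognitive learning networks for visual scene understanding based on memory and reasoning.

Keywords: Human-machine collaboration; Hybrid-augmented intelligence; Cognitive computing; Intuitive reasoning; Causal model; Cognitive mapping; Visual scene understanding; Self-driving cars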

A quantitative attribute-based benchmark methodology for single-target visual tracking Article

Wen-jing KANG, Chang LIU, Gong-liang LIU

Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 3,   Pages 405-421 doi: 10.1631/FITEE.1900245

Abstract: In the past several years, various visual object tracking benchmarks have been proposed, and some of them have been used widely in numerous recently proposed trackers. However, most of the discussions focus on the overall performance, and cannot describe the strengths and weaknesses of the trackers in detail. Meanwhile, several benchmark measures that are often used in tests lack convincing interpretation. In this paper, 12 frame-wise visual attributes that reflect different aspects of the characteristics of image sequences are collated, and a normalized quantitative formulaic definition has been given to each of them for the first time. Based on these definitions, we propose two novel test methodologies, a correlation-based test and a weight-based test, which can provide a more intuitive and easier demonstration of the trackers’ performance for each aspect. Then these methods have been applied to the raw results from one of the most famous tracking challenges, the Video Object Tracking (VOT) Challenge 2017. From the tests, most trackers did not perform well when the size of the target changed rapidly or intensely, and even the advanced deep learning based trackers did not perfectly solve the problem. The scale of the targets was not considered in the calculation of the center location error; however, in a practical test, the center location error is still sensitive to the targets’ changes in size.

Keywords: Visual tracking     Performance evaluation     Visual attributes     Computer vision    

Visual commonsense reasoning with directional visual connections Research Articles

Yahong Han, Aming Wu, Linchao Zhu, Yi Yang,yahong@tju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000722

Abstract: To boost research into cognition-level visual understanding, i.e., making an accurate inference based on a thorough understanding of visual details, visual commonsense reasoning (VCR) has been proposed. Compared with traditional visual question answering which requires models to select correct answers, VCR requires models to select not only the correct answers, but also the correct rationales. Recent research into human cognition has indicated that brain function or cognition can be considered as a global and dynamic integration of local neuron connectivity, which is helpful in solving specific cognition tasks. Inspired by this idea, we propose a directional connective network to achieve VCR by dynamically reorganizing the visual neuron connectivity that is contextualized using the meaning of questions and answers and leveraging the directional information to enhance the reasoning ability. Specifically, we first develop a GraphVLAD module to capture visual neuron connectivity to fully model visual content correlations. Then, a contextualization process is proposed to fuse sentence representations with visual neuron representations. Finally, based on the output of contextualized connectivity, we propose directional connectivity to infer answers and rationales, which includes a ReasonVLAD module. Experimental results on the VCR dataset and visualization analysis demonstrate the effectiveness of our method.

Keywords: Visual commonsense reasoning; Directional connective network; Visual neuron connectivity; Contextualized connectivity; Directional connectivity

Soft-HGRNs: soft hierarchical graph recurrent networks for multi-agent partially observable environments Research Article

Yixiang REN, Zhenhui YE, Yining CHEN, Xiaohong JIANG, Guanghua SONG,yixiangren@zju.edu.cn,zhenhuiye@zju.edu.cn,ch19930611@zju.edu.cn,ghsong@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 1,   Pages 117-130 doi: 10.1631/FITEE.2200073

Abstract: The recent progress in multi-agent deep reinforcement learning (MADRL) makes it more practical in real-world tasks, but its relatively poor scalability and the partially observable constraint raise more challenges for its performance and deployment. Based on our intuitive observation that human society could be regarded as a large-scale partially observable environment, where everyone has the functions of communicating with neighbors and remembering his/her own experience, we propose a novel network structure called the hierarchical graph recurrent network (HGRN) for multi-agent cooperation under partial observability. Specifically, we construct the multi-agent system as a graph, use a novel graph convolution structure to achieve communication between heterogeneous neighboring agents, and adopt a recurrent unit to enable agents to record historical information. To encourage exploration and improve robustness, we design a maximum-entropy learning method that can learn stochastic policies of a configurable target action entropy. Based on the above technologies, we propose a value-based MADRL algorithm called Soft-HGRN and its actor-critic variant called SAC-HGRN. Experimental results based on three homogeneous tasks and one heterogeneous environment not only show that our approach achieves clear improvements compared with four MADRL baselines, but also demonstrate the interpretability, scalability, and transferability of the proposed model.

Keywords: Deep reinforcement learning     Graph-based communication     Maximum-entropy learning     Partial observability     Heterogeneous settings    

Visual Inspection Technology and its Application

Ye Shenghua,Zhu Jigui,Wang Zhong,Yang Xueyou

Strategic Study of CAE 1999, Volume 1, Issue 1,   Pages 49-52

Abstract:

Visual inspection, especially active and passive visual inspection based on the triangulation method, has the advantages of being non-contact, fast, and flexible. It is an advanced inspection technology that satisfies the demands of modern manufacturing. This paper discusses the principle of visual inspection and examines several applied visual inspection systems that have been developed; these systems demonstrate the broad application prospects of visual inspection from different points of view.

Keywords: active visual inspection     passive visual inspection     inspection system     modern manufacturing    

Study on Fire Design in Performance-based Design

Xu Liang,Zhang Heping,Yang Yun,Zhu Wuba

Strategic Study of CAE 2004, Volume 6, Issue 1,   Pages 64-67

Abstract:

Fire design is a key step in performance-based design. In this paper, several fire design methods have been introduced and fire design in a high-rack warehouse has been presented as an example to show how to apply fire design methods.

Keywords: fire design     fire growth curve     heat release rate    

Large-scale graph processing systems: a survey Review

Ning LIU, Dong-sheng LI, Yi-ming ZHANG, Xiong-lve LI

Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 3,   Pages 384-404 doi: 10.1631/FITEE.1900127

Abstract: The graph is a significant data structure that describes the relationship between entries. Many application domains in the real world are heavily dependent on graph data. However, graph applications are vastly different from traditional applications. It is inefficient to use general-purpose platforms for graph applications, which has motivated research on specific graph processing platforms. In this survey, we systematically categorize the graph workloads and applications, and provide a detailed review of existing graph processing platforms by dividing them into general-purpose and specialized systems. We thoroughly analyze the implementation technologies including programming models, partitioning strategies, communication models, execution models, and fault tolerance strategies. Finally, we analyze recent advances and present four open problems for future research.

Keywords: Graph workloads     Graph applications     Graph processing systems    

Paper evolution graph: multi-view structural retrieval for academic literature

Dan-ping LIAO, Yun-tao QIAN

Frontiers of Information Technology & Electronic Engineering 2019, Volume 20, Issue 2,   Pages 187-205 doi: 10.1631/FITEE.1700105

Abstract:

Academic literature retrieval concerns the selection of papers that are most likely to match a user’s information needs. Most retrieval systems are limited to list-output models, in which the retrieval results are isolated from each other. In this paper, we aim to uncover the relationships between the retrieval results and propose a method to build structural retrieval results for academic literature, which we call a paper evolution graph (PEG). The PEG describes the evolution of diverse aspects of input queries through several evolution chains of papers. By using the author, citation, and content information, PEGs can uncover various underlying relationships among the papers and present the evolution of articles from multiple viewpoints. Our system supports three types of input queries: keyword query, single-paper query, and two-paper query. The construction of a PEG consists mainly of three steps. First, the papers are soft-clustered into communities via metagraph factorization, during which the topic distribution of each paper is obtained. Second, topically cohesive evolution chains are extracted from the communities that are relevant to the query. Each chain focuses on one aspect of the query. Finally, the extracted chains are combined to generate a PEG, which fully covers all the topics of the query. Experimental results on a real-world dataset demonstrate that the proposed method can construct meaningful PEGs.

Keywords: Paper evolution graph     Academic literature retrieval     Metagraph factorization     Topic coherence    
