Resource Type

Journal Article 569

Conference Videos 38

Conference Information 24

Year

2023 64

2022 92

2021 68

2020 64

2019 56

2018 34

2017 41

2016 15

2015 5

2014 11

2013 4

2012 7

2011 9

2010 6

2009 13

2008 10

2007 23

2006 19

2005 16

2004 20

Keywords

Machine learning 42

Deep learning 34

numerical simulation 21

Artificial intelligence 16

Reinforcement learning 14

pattern recognition 6

Computer vision 5

simulation 5

Active learning 4

Additive manufacturing 4

Big data 4

noetic science 4

Autonomous driving 3

Bayesian optimization 3

Random forest 3

computer simulation 3

Adaptive dynamic programming 2

Attention 2

Autonomous learning 2


Five fundamental issues about visual knowledge Perspectives

Yun-he Pan,panyh@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2040000

Abstract: Cognitive psychology has long pointed out that an important part of human knowledge memory is visual knowledge, which is used for imaginal thinking. Vision-based artificial intelligence (AI) is therefore an unavoidable topic of great significance for AI. Following the paper "On visual knowledge," this article discusses five related fundamental issues: (1) visual knowledge representation; (2) visual recognition; (3) simulation of visual imaginal thinking; (4) visual knowledge learning; (5) multiple knowledge representation. The unique advantages of visual knowledge are its capabilities for comprehensive generation of imagery, spatio-temporal evolution, and imagery display, which are precisely what symbolic knowledge and deep neural networks lack. Combining AI with computer-aided design, computer graphics, and computer vision will provide an important foundation for new developments of AI in creation, prediction, and human-machine fusion. Research on visual knowledge and multiple knowledge representation is the key to developing new visual intelligence, and the key theory and technology for achieving major breakthroughs toward AI 2.0. This is a barren, cold, and damp yet fertile "Great Northern Wilderness," and a promising "no-man's land" that deserves bold multidisciplinary exploration.

Keywords: Visual knowledge representation; visual recognition; simulation of visual imaginal thinking; visual knowledge learning; multiple knowledge representation

Visual knowledge: an attempt to explore machine creativity Perspectives

Yueting Zhuang, Siliang Tang,yzhuang@zju.edu.cn,siliang@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2100116

Abstract: A question that has long puzzled the field of artificial intelligence is whether AI can be creative, or in other words, whether an algorithm's reasoning process can be creative. This article explores the creativity of AI from the perspective of noetic science. First, related research on imaginal-thinking reasoning is reviewed; then, a special form of visual knowledge representation, the visual scene graph, is highlighted; finally, the construction of visual scene graphs and their potential applications are described in detail. All the evidence indicates that visual knowledge and visual thinking can not only improve the performance of current AI tasks but also be used in the practice of machine creativity.

Keywords: Noetic science; imaginal-thinking reasoning; visual knowledge representation; visual scene graph

Three-dimensional shape space learning for visual concept construction: challenges and research progress Perspective

Xin TONG

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 9,   Pages 1290-1297 doi: 10.1631/FITEE.2200318

Abstract: Human beings can easily categorize three-dimensional (3D) objects with similar shapes and functions into a set of "visual concepts" and thereby learn "visual knowledge" of the surrounding 3D real world. Developing efficient methods to learn computational representations of visual concepts and visual knowledge is a critical task in artificial intelligence (AI). A crucial step to this end is to learn the shape space spanned by all 3D objects that belong to one visual concept. In this paper, we present the key technical challenges and recent research progress in 3D shape space learning, and discuss the open problems and research opportunities in this area.

Keywords: Visual concept; visual knowledge; 3D geometry learning; 3D shape space; 3D structure

On visual knowledge Perspective

Yun-he PAN

Frontiers of Information Technology & Electronic Engineering 2019, Volume 20, Issue 8,   Pages 1021-1025 doi: 10.1631/FITEE.1910001

Abstract: The concept of "visual knowledge" is proposed. Visual knowledge is a new form of knowledge representation, different from the representation methods used in artificial intelligence (AI) to date. A visual concept has elements such as a prototype and categorical structure, a hierarchical structure, and an action structure. Visual concepts can form visual propositions, including scene structures and dynamic structures, and visual propositions can in turn form visual narratives. It is pointed out that reworking the results of computer graphics can realize visual knowledge representation together with its reasoning and operations, and reworking the results of computer vision can realize visual knowledge learning. Techniques for visual knowledge representation, reasoning, learning, and application will be one of the important directions for breakthroughs in AI 2.0.

Keywords: None    

Multiple Knowledge Representation of Artificial Intelligence

Yunhe Pan

Engineering 2020, Volume 6, Issue 3,   Pages 216-217 doi: 10.1016/j.eng.2019.12.011

Multiple knowledge representation for big data artificial intelligence: framework, applications, and case studies Perspective

Yi Yang, Yueting Zhuang, Yunhe Pan,yangyics@zju.edu.cn,yzhuang@zju.edu.cn,panyh@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 12,   Pages 1551-1684 doi: 10.1631/FITEE.2100463

Abstract: In this paper, we present a multiple knowledge representation (MKR) framework and discuss its potential for developing big data artificial intelligence (AI) techniques with possible broader impacts across different AI areas. Typically, canonical knowledge representations and modern representations each emphasize a particular aspect of transforming inputs into symbolic encoding or vectors. For example, knowledge graphs focus on depicting semantic connections among concepts, whereas deep neural networks (DNNs) are more of a tool to perceive raw signal inputs. MKR is an advanced AI representation framework for more complete intelligent functions, such as raw signal perception, feature extraction and vectorization, knowledge symbolization, and logical reasoning. MKR has two benefits: (1) it makes the current AI techniques (dominated by deep learning) more explainable and generalizable, and (2) it expands current AI techniques by integrating MKR to facilitate the mutual benefits of the complementary capacity of each representation, e.g., raw signal perception and symbolic encoding. We expect that MKR research and its applications will drive the evolution of AI 2.0 and beyond.

Keywords: Multiple knowledge representation; artificial intelligence; big data

Visual recognition of cardiac pathology based on 3D parametric model reconstruction Research Article

Jinxiao XIAO, Yansong LI, Yun TIAN, Dongrong XU, Penghui LI, Shifeng ZHAO, Yunhe PAN

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 9,   Pages 1324-1337 doi: 10.1631/FITEE.2200102

Abstract: Visual recognition of cardiac images is important for diagnosis and treatment. Due to the limited availability of annotated datasets, traditional methods usually extract features directly from two-dimensional slices of three-dimensional (3D) heart images, followed by pathological classification. This process may not ensure overall anatomical consistency in the 3D heart. A new method for the classification of cardiac pathology is therefore proposed, based on 3D parametric model reconstruction. First, 3D heart models are reconstructed from multiple 3D volumes of cardiac imaging data at the end-systole (ES) and end-diastole (ED) phases. Next, based on these reconstructed 3D hearts, 3D parametric models are constructed through the statistical shape model (SSM), and the heart data are then augmented by varying the shape parameters of one 3D parametric model under visual knowledge constraints. Finally, shape and motion features of the 3D heart models across the two phases are extracted to classify cardiac pathology. Comprehensive experiments on the automated cardiac diagnosis challenge (ACDC) dataset of the Statistical Atlases and Computational Modelling of the Heart (STACOM) workshop confirm the superior performance and efficiency of the proposed approach.

Keywords: 3D visual knowledge     3D parametric model     Cardiac pathology diagnosis     Data augmentation    
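The SSM-based augmentation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the mean shape, variation modes, and eigenvalues are toy values, and the hard clamp on the mode weights merely stands in for the paper's visual-knowledge constraints.

```python
import random

def augment_shape(mean_shape, modes, eigvals, rng, k=3.0):
    """Generate one plausible shape by perturbing SSM mode weights.

    mean_shape : list[float]        flattened mean landmark coordinates
    modes      : list[list[float]]  principal modes of shape variation
    eigvals    : list[float]        variance captured by each mode
    k          : plausibility bound; weights stay within +/- k*sqrt(eigval)
    """
    shape = list(mean_shape)
    for mode, lam in zip(modes, eigvals):
        bound = k * lam ** 0.5
        b = rng.uniform(-bound, bound)   # constrained mode weight
        for i, phi in enumerate(mode):
            shape[i] += b * phi          # shape = mean + sum_i b_i * phi_i
    return shape

rng = random.Random(0)
mean = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]      # toy 3-landmark 2D shape
modes = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]   # one hypothetical mode
sample = augment_shape(mean, modes, [0.04], rng)
```

Each synthetic shape stays within the statistically plausible region of the shape space, which is what makes the augmented hearts anatomically consistent.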

Unsupervised object detection with scene-adaptive concept learning Research Articles

Shiliang Pu, Wei Zhao, Weijie Chen, Shicai Yang, Di Xie, Yunhe Pan,xiedi@hikvision.com

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000567

Abstract: Object detection is one of the hottest research directions in computer vision; it has already made impressive progress in academia and has many valuable applications in industry. However, the mainstream detection methods still have two shortcomings: (1) even a model well trained on large amounts of data generally cannot be used across different kinds of scenes; (2) once a model is deployed, it cannot autonomously evolve along with the accumulated unlabeled scene data. To address these problems, and inspired by visual knowledge theory, we propose a novel scene-adaptive evolution algorithm that can decrease the impact of scene changes through the concept of object groups. We first extract a large number of object proposals from unlabeled data through a pre-trained detection model. Second, we build a dictionary of object concepts by clustering the proposals, in which each cluster center represents an object prototype. Third, we look into the relations between different clusters and the object information of different groups, and propose a graph-based group information propagation strategy to determine the category of an object concept, which can effectively distinguish positive and negative proposals. With these pseudo labels, we can easily fine-tune the pre-trained model. The effectiveness of the proposed method is verified through different experiments, and significant improvements are achieved.

Keywords: Visual knowledge; unsupervised video object detection; scene-adaptive learning
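The concept-dictionary step above can be sketched with a toy one-dimensional k-means. The feature values and cluster count here are hypothetical; a real system would cluster high-dimensional proposal embeddings from the pre-trained detector.

```python
def build_concept_dictionary(features, k, iters=10):
    """Toy k-means: cluster scalar proposal features into k prototypes."""
    centers = features[:k]  # naive initialization from the first k proposals
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda c: abs(f - centers[c]))
            groups[nearest].append(f)
        # recompute each center; keep the old one if its cluster emptied
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def pseudo_label(feature, centers):
    """Assign a proposal the index of its nearest object prototype."""
    return min(range(len(centers)), key=lambda c: abs(feature - centers[c]))

centers = build_concept_dictionary([0.1, 0.2, 0.9, 1.0], k=2)
```

The pseudo labels produced this way are what allow the deployed model to be fine-tuned without any human annotation.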

A quantitative attribute-based benchmark methodology for single-target visual tracking Article

Wen-jing KANG, Chang LIU, Gong-liang LIU

Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 3,   Pages 405-421 doi: 10.1631/FITEE.1900245

Abstract: In the past several years, various visual object tracking benchmarks have been proposed, and some of them have been widely used in numerous recently proposed trackers. However, most of the discussions focus on overall performance and cannot describe the strengths and weaknesses of the trackers in detail. Meanwhile, several benchmark measures that are often used in tests lack a convincing interpretation. In this paper, 12 frame-wise visual attributes that reflect different aspects of the characteristics of image sequences are collated, and a normalized quantitative formulaic definition is given to each of them for the first time. Based on these definitions, we propose two novel test methodologies, a correlation-based test and a weight-based test, which provide a more intuitive and accessible demonstration of the trackers' performance on each aspect. These methods are then applied to the raw results from one of the most famous tracking challenges, the Visual Object Tracking (VOT) Challenge 2017. In the tests, most trackers did not perform well when the size of the target changed rapidly or intensely, and even the advanced deep-learning-based trackers did not solve the problem perfectly. Although the scale of the targets is not considered in the calculation of the center location error, in practical tests the center location error is still sensitive to changes in target size.

Keywords: Visual tracking     Performance evaluation     Visual attributes     Computer vision    
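The center location error discussed in the abstract is conventionally the Euclidean distance between predicted and ground-truth box centers. The sketch below also shows one possible scale normalization; the normalization choice is an assumption for illustration, not the paper's definition.

```python
import math

def center_location_error(pred_box, gt_box):
    """Euclidean distance between predicted and ground-truth box centers.

    Boxes are (x, y, w, h). The raw CLE ignores target size entirely,
    which is why it is sensitive to changes in target scale.
    """
    px, py = pred_box[0] + pred_box[2] / 2, pred_box[1] + pred_box[3] / 2
    gx, gy = gt_box[0] + gt_box[2] / 2, gt_box[1] + gt_box[3] / 2
    return math.hypot(px - gx, py - gy)

def normalized_cle(pred_box, gt_box):
    """CLE divided by the ground-truth diagonal: one way to remove
    the scale sensitivity of the raw measure."""
    diag = math.hypot(gt_box[2], gt_box[3])
    return center_location_error(pred_box, gt_box) / diag
```

A 5-pixel error on a 10-pixel target and on a 100-pixel target are very different failures; the normalized variant makes that distinction explicit.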

Visual commonsense reasoning with directional visual connections Research Articles

Yahong Han, Aming Wu, Linchao Zhu, Yi Yang,yahong@tju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000722

Abstract: To boost research into cognition-level visual understanding, i.e., making accurate inferences based on a thorough understanding of visual details, visual commonsense reasoning (VCR) has been proposed. Compared with traditional visual question answering, which requires models to select correct answers, VCR requires models to select not only the correct answers but also the correct rationales. Recent research into human cognition has indicated that brain function or cognition can be considered as a global and dynamic integration of local neuron connectivity, which is helpful in solving specific cognition tasks. Inspired by this idea, we propose a directional connective network to achieve VCR by dynamically reorganizing the visual neuron connectivity that is contextualized using the meaning of questions and answers, and by leveraging directional information to enhance the reasoning ability. Specifically, we first develop a GraphVLAD module to capture visual neuron connectivity and fully model visual content correlations. Then, a contextualization process is proposed to fuse sentence representations with visual neuron representations. Finally, based on the output of the contextualized connectivity, we propose directional connectivity to infer answers and rationales, which includes a ReasonVLAD module. Experimental results on the VCR dataset and visualization analysis demonstrate the effectiveness of our method.

Keywords: Visual commonsense reasoning; directional connective network; visual neuron connectivity; contextualized connectivity; directional connectivity

Visual Inspection Technology and its Application

Ye Shenghua,Zhu Jigui,Wang Zhong,Yang Xueyou

Strategic Study of CAE 1999, Volume 1, Issue 1,   Pages 49-52

Abstract:

Visual inspection, especially active visual inspection and passive visual inspection based on the triangulation method, has the advantages of being non-contact, fast, and flexible. It is an advanced inspection technology that satisfies modern manufacturing demands. This paper discusses the principle of visual inspection and studies several applied visual inspection systems that have been developed; these systems demonstrate, from different points of view, the wide application prospects of visual inspection.

Keywords: active visual inspection     passive visual inspection     inspection system     modern manufacturing    
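The triangulation principle underlying both active and passive visual inspection reduces, in the stereo case, to the classic depth relation z = f * b / d. A minimal sketch, with hypothetical camera parameters:

```python
def triangulation_depth(focal_len, baseline, disparity):
    """Depth from triangulation: z = f * b / d.

    focal_len : camera focal length, in pixels
    baseline  : distance between the two viewpoints
                (or between camera and projector in active setups)
    disparity : offset of the feature between the two views, in pixels
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_len * baseline / disparity

# Hypothetical setup: 800 px focal length, 0.1 m baseline, 40 px disparity.
depth = triangulation_depth(800, 0.1, 40)
```

The inverse dependence on disparity is why depth precision degrades quadratically with distance, a key design constraint for inspection systems.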

Visual interpretability for deep learning: a survey Review

Quan-shi ZHANG, Song-chun ZHU

Frontiers of Information Technology & Electronic Engineering 2018, Volume 19, Issue 1,   Pages 27-39 doi: 10.1631/FITEE.1700808

Abstract: This paper reviews recent studies in understanding neural-network representations and in learning neural networks with interpretable/disentangled middle-layer representations. Although deep neural networks have exhibited superior performance in various tasks, interpretability has always been the Achilles' heel of deep neural networks. At present, deep neural networks obtain high discrimination power at the cost of a low interpretability of their black-box representations. We believe that high model interpretability may help people break several bottlenecks of deep learning, e.g., learning from a few annotations, learning via human-computer communication at the semantic level, and semantically debugging network representations. We focus on convolutional neural networks (CNNs), and revisit the visualization of CNN representations, methods of diagnosing representations of pre-trained CNNs, approaches for disentangling pre-trained CNN representations, learning of CNNs with disentangled representations, and middle-to-end learning based on model interpretability. Finally, we discuss prospective trends in explainable artificial intelligence.

Keywords: Artificial intelligence     Deep learning     Interpretable model    

Advances in Computer Vision-Based Civil Infrastructure Inspection and Monitoring Review

Billie F. Spencer Jr.,Vedhus Hoskere,Yasutaka Narazaki

Engineering 2019, Volume 5, Issue 2,   Pages 199-222 doi: 10.1016/j.eng.2018.11.030

Abstract:

Computer vision techniques, in conjunction with image acquisition through remote cameras and unmanned aerial vehicles (UAVs), offer promising non-contact solutions to civil infrastructure condition assessment. The ultimate goal of such a system is to automatically and robustly convert image or video data into actionable information. This paper provides an overview of recent advances in computer vision techniques as they apply to the problem of civil infrastructure condition assessment. In particular, relevant research in the fields of computer vision, machine learning, and structural engineering is presented. The work reviewed is classified into two types: inspection applications and monitoring applications. The inspection applications reviewed include identifying context such as structural components, characterizing local and global visible damage, and detecting changes from a reference image. The monitoring applications discussed include static measurement of strain and displacement, as well as dynamic measurement of displacement for modal analysis. Subsequently, some of the key challenges that persist toward the goal of automated vision-based civil infrastructure inspection and monitoring are presented. The paper concludes with ongoing work aimed at addressing some of these stated challenges.

Keywords: Structural inspection and monitoring     Artificial intelligence     Computer vision     Machine learning     Optical flow    

Performance analysis of visual markers for indoor navigation systems Article

Gaetano C. LA DELFA,Salvatore MONTELEONE,Vincenzo CATANIA,Juan F. DE PAZ,Javier BAJO

Frontiers of Information Technology & Electronic Engineering 2016, Volume 17, Issue 8,   Pages 730-740 doi: 10.1631/FITEE.1500324

Abstract: The massive diffusion of smartphones, the growing interest in wearable devices and the Internet of Things, and the exponential rise of location-based services (LBSs) have made the problem of localization and navigation inside buildings one of the most important technological challenges of recent years. Indoor positioning systems have a huge market in the retail sector and in contextual advertising; in addition, they can be fundamental to increasing the quality of life for citizens if deployed inside public buildings such as hospitals, airports, and museums. Sometimes, in emergency situations, they can make the difference between life and death. Various approaches have been proposed in the literature. Recently, thanks to the high performance of smartphones' cameras, marker-less and marker-based computer vision approaches have been investigated. In a previous paper, we proposed a technique for indoor localization and navigation using both Bluetooth low energy (BLE) and a 2D visual marker system deployed on the floor. In this paper, we present a qualitative performance evaluation of three 2D visual marker systems, Vuforia, ArUco, and AprilTag, which are suitable for real-time applications. Our analysis focuses on the specific case study of visual markers placed onto floor tiles, to improve the efficiency of our indoor localization and navigation approach by choosing the best visual marker system.

Keywords: Indoor localization     Visual markers     Computer vision    

Grasp Planning and Visual Servoing for an Outdoors Aerial Dual Manipulator Article

Pablo Ramon-Soria, Begoña C. Arrue, Anibal Ollero

Engineering 2020, Volume 6, Issue 1,   Pages 77-88 doi: 10.1016/j.eng.2019.11.003

Abstract:

This paper describes a system for grasping known objects with unmanned aerial vehicles (UAVs) equipped with dual manipulators, using an RGB-D camera. Aerial manipulation remains a very challenging task. This paper covers three principal aspects of this task: object detection and pose estimation, grasp planning, and in-flight grasp execution. First, an artificial neural network (ANN) is used to obtain clues regarding the object's position. Next, an alignment algorithm is used to obtain the object's six-dimensional (6D) pose, which is filtered with an extended Kalman filter. A three-dimensional (3D) model of the object is then used to estimate a ranked list of good grasps for the aerial manipulator. The results from the detection algorithm (that is, the object's pose) are used to update the trajectories of the arms toward the object. If the target poses are not reachable due to the UAV's oscillations, the algorithm switches to the next feasible grasp. This paper introduces the overall methodology and provides the experimental results of both simulations and real experiments for each module, in addition to a video showing the results.

Keywords: Aerial manipulation     Grasp planning     Visual servoing    
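The pose-filtering step can be illustrated with a scalar Kalman measurement update. This toy one-dimensional version only hints at the 6D extended Kalman filter the paper uses; the state, variances, and measurement here are hypothetical.

```python
def kalman_update(x, p, z, r):
    """Scalar Kalman filter measurement update.

    x, p : prior state estimate and its variance
    z, r : new measurement and its noise variance
    Returns the posterior (x, p).
    """
    k = p / (p + r)           # Kalman gain: trust measurement vs. prior
    x_post = x + k * (z - x)  # blend prior with the new observation
    p_post = (1 - k) * p      # uncertainty shrinks after the update
    return x_post, p_post
```

Smoothing the estimated pose this way keeps the arm trajectories from chasing the frame-to-frame jitter of the raw detector output while the UAV oscillates.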
