Resource Type

Journal Article 730

Conference Videos 45

Conference Information 21

Year

2024 1

2023 87

2022 117

2021 101

2020 69

2019 72

2018 53

2017 62

2016 32

2015 11

2014 8

2013 9

2012 10

2011 16

2010 8

2009 11

2008 9

2007 19

2006 20

2005 14

Keywords

Machine learning 43

Deep learning 34

Artificial intelligence 16

Reinforcement learning 14

Computer vision 5

Active learning 4

Adaptive control 4

COVID-19 4

Neural network 4

multi-objective optimization 4

resource utilization 4

solid waste 4

Additive manufacturing 3

Big data 3

Multi-objective optimization 3

neural network 3

2035 2

Adaptive dynamic programming 2

Anomaly detection 2

Unsupervised object detection with scene-adaptive concept learning Research Articles

Shiliang Pu, Wei Zhao, Weijie Chen, Shicai Yang, Di Xie, Yunhe Pan,xiedi@hikvision.com

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000567

Abstract: Object detection is one of the hottest research directions in computer vision, has already made impressive progress in academia, and has many valuable applications in industry. However, the mainstream detection methods still have two shortcomings: (1) even a model that is well trained using large amounts of data still cannot generally be used across different kinds of scenes; (2) once a model is deployed, it cannot autonomously evolve along with the accumulated unlabeled scene data. To address these problems, and inspired by visual knowledge theory, we propose a novel scene-adaptive evolution algorithm that can decrease the impact of scene changes through the concept of object groups. We first extract a large number of object proposals from unlabeled data through a pre-trained detection model. Second, we build the dictionary of object concepts by clustering the proposals, in which each cluster center represents an object prototype. Third, we look into the relations between different clusters and the object information of different groups, and propose a graph-based group information propagation strategy to determine the category of an object concept, which can effectively distinguish positive and negative proposals. With these pseudo labels, we can easily fine-tune the pre-trained model. The effectiveness of the proposed method is verified by different experiments, and significant improvements are achieved.

Keywords: Visual knowledge; Unsupervised video object detection; Scene-adaptive learning
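
For a concrete picture of the clustering and pseudo-labeling steps sketched in the abstract, here is a minimal Python illustration. It is not the authors' implementation: the proposal features are random stand-ins, KMeans replaces whatever clustering the paper actually uses, and concept_is_positive stands in for the output of the graph-based group information propagation.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_concept_dictionary(proposal_feats, n_concepts=50, seed=0):
    """Cluster proposal features; each cluster center acts as an object prototype."""
    km = KMeans(n_clusters=n_concepts, n_init=10, random_state=seed).fit(proposal_feats)
    return km.cluster_centers_, km.labels_

def assign_pseudo_labels(proposal_feats, centers, concept_is_positive, margin=0.1):
    """Label each proposal by its nearest prototype; keep only confident assignments."""
    # distance of every proposal to every prototype
    d = np.linalg.norm(proposal_feats[:, None, :] - centers[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    # confidence: gap between the two closest prototypes
    sorted_d = np.sort(d, axis=1)
    confident = (sorted_d[:, 1] - sorted_d[:, 0]) > margin
    labels = np.where(concept_is_positive[nearest], 1, 0)
    return labels, confident

# toy usage with random features standing in for detector proposals
feats = np.random.rand(200, 128)
centers, _ = build_concept_dictionary(feats, n_concepts=10)
is_pos = np.random.rand(10) > 0.5          # assumed output of group propagation
labels, keep = assign_pseudo_labels(feats, centers, is_pos)
print(labels[keep][:10])
```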

Dynamic parameterized learning for unsupervised domain adaptation Research Article

Runhua JIANG, Yahong HAN

Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 11,   Pages 1616-1632 doi: 10.1631/FITEE.2200631

Abstract: Unsupervised domain adaptation enables neural networks to transfer from a labeled source domain to an unlabeled target domain by learning domain-invariant representations. Recent approaches achieve this by directly matching the marginal distributions of the two domains. Most of them, however, ignore exploration of the dynamic trade-off between domain alignment and semantic discrimination learning, thus rendering them susceptible to the problems of negative transfer and outlier samples. To address these issues, we introduce the dynamic parameterized learning framework. First, by exploring domain-level semantic knowledge, the dynamic alignment parameter is proposed to adaptively adjust the optimization steps of domain alignment and semantic discrimination learning. In addition, to obtain semantic-discriminative and domain-invariant representations, we propose to align training trajectories on both source and target domains. Comprehensive experiments are conducted to validate the effectiveness of the proposed methods, and extensive comparisons are conducted on seven datasets of three visual tasks to demonstrate their practicability.

Keywords: Unsupervised domain adaptation     Optimization steps     Domain alignment     Semantic discrimination    
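
The core idea of a dynamic trade-off between alignment and discrimination can be pictured as a single scalar weight, derived from an estimated domain gap, that reweights the two loss terms. The sigmoid mapping and the names below are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def dynamic_weight(domain_gap, temperature=1.0):
    """Map an estimated domain gap to a weight in (0, 1): a larger gap
    puts more emphasis on alignment, a smaller gap on discrimination."""
    return 1.0 / (1.0 + np.exp(-domain_gap / temperature))

def total_loss(align_loss, discrim_loss, domain_gap):
    w = dynamic_weight(domain_gap)
    return w * align_loss + (1.0 - w) * discrim_loss

# toy usage: as the measured gap shrinks, the weight shifts toward discrimination
for gap in (2.0, 0.5, -1.0):
    print(gap, round(dynamic_weight(gap), 3),
          round(total_loss(align_loss=0.8, discrim_loss=0.3, domain_gap=gap), 3))
```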

Layer-wise domain correction for unsupervised domain adaptation Article

Shuang LI, Shi-ji SONG, Cheng WU

Frontiers of Information Technology & Electronic Engineering 2018, Volume 19, Issue 1,   Pages 91-103 doi: 10.1631/FITEE.1700774

Abstract: Deep neural networks have been successfully applied to numerous machine learning tasks because of their impressive feature abstraction capabilities. However, conventional deep networks assume that the training and test data are sampled from the same distribution, and this assumption is often violated in real-world scenarios. To address the domain shift or data bias problems, we introduce layer-wise domain correction (LDC), a new unsupervised domain adaptation algorithm that adapts an existing deep network through additive correction layers spaced throughout the network. Through the additive layers, the representations of the source and target domains can be perfectly aligned. The correction layers, which are trained via maximum mean discrepancy, adapt to the target domain while increasing the representational capacity of the network. LDC requires no target labels, achieves state-of-the-art performance across several adaptation benchmarks, and requires significantly less training time than existing adaptation methods.

Keywords: Unsupervised domain adaptation     Maximum mean discrepancy     Residual network     Deep learning    
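
Maximum mean discrepancy itself is standard, so a small self-contained estimate is easy to sketch. The additive mean shift below is only a crude illustration of a correction term, not the paper's learned correction layers.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(source, target, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

# a crude additive correction: shift the target features toward the source mean
src = np.random.randn(64, 16)
tgt = np.random.randn(64, 16) + 0.5
shift = src.mean(0) - tgt.mean(0)
print(mmd2(src, tgt), mmd2(src, tgt + shift))   # the shifted target has a smaller MMD
```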

Self-supervised graph learning with target-adaptive masking for session-based recommendation Research Article

Yitong WANG, Fei CAI, Zhiqiang PAN, Chengyu SONG,wangyitong20@nudt.edu.cn,caifei08@nudt.edu.cn,panzhiqiang@nudt.edu.cn,songchengyu@nudt.edu.cn

Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 1,   Pages 73-87 doi: 10.1631/FITEE.2200137

Abstract: Session-based recommendation aims to predict the next item based on a user's limited interactions within a short period. Existing approaches use mainly recurrent neural networks (RNNs) or graph neural networks (GNNs) to model the sequential patterns or the transition relationships between items. However, such models either ignore the over-smoothing issue of GNNs, or directly use cross-entropy loss with a softmax layer for model optimization, which easily results in the over-fitting problem. To tackle the above issues, we propose a self-supervised graph learning with target-adaptive masking (SGL-TM) method. Specifically, we first construct a global graph based on all involved sessions and subsequently capture the self-supervised signals from the global connections between items, which helps supervise the model in generating accurate representations of items in the ongoing session. After that, we calculate the main supervised loss by comparing the ground truth with the predicted scores of items adjusted by our designed target-adaptive masking module. Finally, we combine the main supervised component with the auxiliary self-supervision module to obtain the final loss for optimizing the model parameters. Extensive experimental results from two benchmark datasets, Gowalla and Diginetica, indicate that SGL-TM can outperform state-of-the-art baselines in terms of Recall@20 and MRR@20, especially in short sessions.

Keywords: Session-based recommendation     Self-supervised learning     Graph neural networks     Target-adaptive masking    
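
The final objective described above combines a main supervised loss with a weighted auxiliary self-supervised term. The sketch below shows that combination together with a masked cross-entropy over item scores; the mask, the weight beta, and the toy scores are assumptions, not the paper's actual masking module.

```python
import numpy as np

def masked_softmax_ce(scores, target_idx, mask):
    """Cross-entropy over item scores where masked-out items receive -inf logits.
    `mask` is 1 for items allowed as candidates, 0 otherwise."""
    logits = np.where(mask > 0, scores, -1e9)
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[target_idx] + 1e-12)

def total_loss(main_loss, ssl_loss, beta=0.1):
    """Main supervised term plus a weighted auxiliary self-supervised term."""
    return main_loss + beta * ssl_loss

scores = np.random.randn(100)           # predicted scores over the item catalogue
mask = np.ones(100); mask[:10] = 0      # assumed output of the masking module
print(total_loss(masked_softmax_ce(scores, 42, mask), ssl_loss=0.7))
```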

Miniaturized five fundamental issues about visual knowledge Perspectives

Yun-he Pan,panyh@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2040000

Abstract: Cognitive psychology has long pointed out that an important part of human knowledge memory is visual knowledge, which is used for imagery thinking. Therefore, vision-based artificial intelligence (AI) is an unavoidable and highly significant topic for AI. Following the earlier article "On visual knowledge," this paper discusses five related fundamental issues: (1) visual knowledge representation; (2) visual recognition; (3) simulation of visual imagery thinking; (4) learning of visual knowledge; and (5) multiple knowledge representation. The unique advantages of visual knowledge are its capabilities for comprehensive generation of imagery, spatio-temporal evolution, and imagery display, which are exactly what character-based knowledge and deep neural networks lack. The combination of AI with computer-aided design, computer graphics, and computer vision will provide an important foundational driving force for new developments of AI in creation, prediction, and human-machine fusion. Research on visual knowledge and multiple knowledge representation is the key to developing new visual intelligence, and also the key theory and technology for major breakthroughs toward AI 2.0. It is a barren, cold, and wet yet fertile "Great Northern Wilderness," and a promising "no man's land" worth bold, multidisciplinary exploration.

Keywords: Visual knowledge representation; Visual recognition; Simulation of visual imagery thinking; Visual knowledge learning; Multiple knowledge representation

Visual knowledge: an attempt to explore machine creativity Perspectives

Yueting Zhuang, Siliang Tang,yzhuang@zju.edu.cn,siliang@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2100116

Abstract: A question that has long puzzled the field of artificial intelligence is whether artificial intelligence can be creative, or in other words, whether the reasoning process of an algorithm can be creative. This paper explores the problem of artificial intelligence creativity from the perspective of noetic science. First, related studies on imagery thinking and reasoning are reviewed; then, a special form of visual knowledge representation, the visual scene graph, is highlighted; finally, the construction problems and potential applications of visual scene graphs are described in detail. All the evidence shows that visual knowledge and visual thinking can not only improve the performance of current artificial intelligence tasks, but can also be used in the practice of machine creativity.

Keywords: Noetic science; Imagery thinking and reasoning; Visual knowledge representation; Visual scene graph

Unsupervised feature selection via joint local learning and group sparse regression Regular Papers

Yue WU, Can WANG, Yue-qing ZHANG, Jia-jun BU

Frontiers of Information Technology & Electronic Engineering 2019, Volume 20, Issue 4,   Pages 538-553 doi: 10.1631/FITEE.1700804

Abstract:

Feature selection has attracted a great deal of interest over the past decades. By selecting meaningful feature subsets, the performance of learning algorithms can be effectively improved. Because label information is expensive to obtain, unsupervised feature selection methods are more widely used than the supervised ones. The key to unsupervised feature selection is to find features that effectively reflect the underlying data distribution. However, due to the inevitable redundancies and noise in a dataset, the intrinsic data distribution is not best revealed when using all features. To address this issue, we propose a novel unsupervised feature selection algorithm via joint local learning and group sparse regression (JLLGSR). JLLGSR incorporates local learning based clustering with group sparsity regularized regression in a single formulation, and seeks features that respect both the manifold structure and group sparse structure in the data space. An iterative optimization method is developed in which the weights finally converge on the important features and the selected features are able to improve the clustering results. Experiments on multiple real-world datasets (images, voices, and web pages) demonstrate the effectiveness of JLLGSR.

Keywords: Unsupervised     Local learning     Group sparse regression     Feature selection    
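
Group sparse regression typically scores features by the l2 norms of the rows of a weight matrix (the l2,1 penalty). A minimal sketch of that scoring step follows, with a random weight matrix standing in for the one produced by the paper's iterative optimization.

```python
import numpy as np

def l21_norm(W):
    """Sum of the l2 norms of the rows of W (the group-sparsity penalty)."""
    return np.linalg.norm(W, axis=1).sum()

def rank_features(W, top_k=10):
    """Features whose weight rows have the largest l2 norm are kept."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:top_k]

W = np.random.randn(50, 5)              # toy: 50 features, 5 cluster indicators
print(l21_norm(W), rank_features(W, top_k=5))
```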

A novel robotic visual perception framework for underwater operation Research Article

Yue LU, Xingyu CHEN, Zhengxing WU, Junzhi YU, Li WEN

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 11,   Pages 1602-1619 doi: 10.1631/FITEE.2100366

Abstract:

Underwater robotic operation usually requires visual perception (e.g., object detection and tracking), but underwater scenes have poor visual quality and represent a special domain which can affect the accuracy of visual perception. In addition, detection continuity and stability are important for underwater operation, but the commonly used static accuracy based evaluation (i.e., average precision) is insufficient to reflect detector performance across time. In response to these two problems, we present a design for a novel robotic visual perception framework. First, we generally investigate the relationship between a quality-diverse data domain and visual restoration in detection performance. As a result, although domain quality has a negligible effect on within-domain detection accuracy, visual restoration is beneficial to detection in real sea scenarios by reducing the domain shift. Moreover, non-reference assessments are proposed for detection continuity and stability based on object tracklets. Further, online tracklet refinement is developed to improve the temporal performance of detectors. Finally, combined with visual restoration, an accurate and stable underwater robotic visual perception framework is established. Small-overlap suppression is proposed to extend video object detection (VID) methods to a single-object tracking task, leading to the flexibility to switch between detection and tracking. Extensive experiments were conducted on the ImageNet VID dataset and real-world robotic tasks to verify the correctness of our analysis and the superiority of our proposed approaches. The codes are available at https://github.com/yrqs/VisPerception.

Keywords: Underwater operation     Robotic perception     Visual restoration     Video object detection    
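
As one rough reading of "small-overlap suppression," detections whose IoU with the previous target box is too small can simply be dropped, which turns a frame-level detector into a single-object tracker. The threshold and box format below are assumptions, not the paper's specification.

```python
def iou(box_a, box_b):
    """Boxes as [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def small_overlap_suppression(detections, prev_target, thr=0.3):
    """Drop detections that overlap too little with the previous target box,
    so the detector behaves like a single-object tracker."""
    return [d for d in detections if iou(d, prev_target) >= thr]

dets = [[10, 10, 60, 60], [200, 200, 260, 260], [15, 12, 70, 65]]
print(small_overlap_suppression(dets, prev_target=[12, 11, 64, 62]))
```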

Three-dimensional shape space learning for visual concept construction: challenges and research progress Perspective

Xin TONG

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 9,   Pages 1290-1297 doi: 10.1631/FITEE.2200318

Abstract: Human beings can easily categorize three-dimensional (3D) objects with similar shapes and functions into a set of "visual concepts" and thus learn "visual knowledge" of the surrounding 3D real world. Developing efficient methods to learn the computational representation of visual concepts and visual knowledge is a critical task in artificial intelligence (AI). A crucial step to this end is to learn the shape space spanned by all 3D objects that belong to one visual concept. In this paper, we present the key technical challenges and recent research progress in 3D shape space learning, and discuss the open problems and research opportunities in this area.

Keywords: Visual concept; Visual knowledge; 3D geometry learning; 3D shape space; 3D structure

Federated unsupervised representation learning Research Article

Fengda ZHANG, Kun KUANG, Long CHEN, Zhaoyang YOU, Tao SHEN, Jun XIAO, Yin ZHANG, Chao WU, Fei WU, Yueting ZHUANG, Xiaolin LI,fdzhang@zju.edu.cn,kunkuang@zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2023, Volume 24, Issue 8,   Pages 1181-1193 doi: 10.1631/FITEE.2200268

Abstract: To leverage the enormous amount of unlabeled data on distributed edge devices, we formulate a new problem in federated learning called federated unsupervised representation learning (FURL), to learn a common representation model without supervision while preserving data privacy. FURL poses two new challenges: (1) data distribution shift (non-independent and identically distributed, non-IID) among clients would make local models focus on different categories, leading to the inconsistency of representation spaces; (2) without unified information among the clients in FURL, the representations across clients would be misaligned. To address these challenges, we propose the federated contrastive averaging with dictionary and alignment (FedCA) algorithm. FedCA is composed of two key modules: a dictionary module to aggregate the representations of samples from each client, which can be shared with all clients for consistency of the representation space, and an alignment module to align the representation of each client on a base model trained on public data. We adopt the contrastive learning approach for local model training. Through extensive experiments with three evaluation protocols in IID and non-IID settings, we demonstrate that FedCA outperforms all baselines with significant margins.

Keywords: Federated learning     Unsupervised learning     Representation learning     Contrastive learning    
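
The aggregation step in federated learning is commonly a size-weighted average of client parameters (FedAvg-style). The sketch below shows only that generic step, not FedCA's dictionary or alignment modules.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors, as in FedAvg-style aggregation."""
    sizes = np.asarray(client_sizes, dtype=float)
    w = sizes / sizes.sum()
    stacked = np.stack(client_weights)          # (n_clients, n_params)
    return (w[:, None] * stacked).sum(axis=0)

# toy usage: four clients with different amounts of local data
clients = [np.random.randn(1000) for _ in range(4)]   # stand-ins for local model parameters
print(federated_average(clients, client_sizes=[120, 80, 200, 50])[:5])
```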

Automatic image enhancement by learning adaptive patch selection

Na LI, Jian ZHAN

Frontiers of Information Technology & Electronic Engineering 2019, Volume 20, Issue 2,   Pages 206-221 doi: 10.1631/FITEE.1700125

Abstract:

Today, digital cameras are widely used in taking photos. However, some photos lack detail and need enhancement. Many existing image enhancement algorithms are patch based and the patch size is always fixed throughout the image. Users must tune the patch size to obtain the appropriate enhancement. In this study, we propose an automatic image enhancement method based on adaptive patch selection using both dark and bright channels. The double channels enhance images with various exposure problems. The patch size used for channel extraction is selected automatically by thresholding a contrast feature, which is learned systematically from a set of natural images crawled from the web. Our proposed method can automatically enhance foggy or under-exposed/backlit images without any user interaction. Experimental results demonstrate that our method can provide a significant improvement in existing patch-based image enhancement algorithms.

Keywords: Image enhancement     Contrast enhancement     Dark channel     Bright channel     Adaptive patch based processing    
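
The dark and bright channels referenced above are patch-wise minima and maxima over colour channels. A brute-force sketch follows, with an assumed contrast threshold standing in for the one the paper learns from web images.

```python
import numpy as np

def dark_channel(img, patch):
    """Minimum over colour channels and a patch x patch neighbourhood."""
    h, w, _ = img.shape
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode='edge')
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def bright_channel(img, patch):
    """Maximum counterpart of the dark channel."""
    return -dark_channel(-img, patch)

def select_patch_size(img, sizes=(3, 7, 15), thr=0.35):
    """Pick the smallest patch whose dark-channel contrast exceeds a threshold
    (the threshold value here is an assumption, not the learned one)."""
    for p in sizes:
        if dark_channel(img, p).std() > thr:
            return p
    return sizes[-1]

img = np.random.rand(32, 32, 3)
print(select_patch_size(img))
```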

Interactive image segmentation with a regression based ensemble learning paradigm Article

Jin ZHANG, Zhao-hui TANG, Wei-hua GUI, Qing CHEN, Jin-ping LIU

Frontiers of Information Technology & Electronic Engineering 2017, Volume 18, Issue 7,   Pages 1002-1020 doi: 10.1631/FITEE.1601401

Abstract: To achieve fine segmentation of complex natural images, people often resort to an interactive segmentation paradigm, since fully automatic methods often fail to obtain a result consistent with the ground truth. However, when the foreground and background share some similar areas in color, the fine segmentation result of conventional interactive methods usually relies on the increase of manual labels. This paper presents a novel interactive image segmentation method via a regression-based ensemble model with semi-supervised learning. The task is formulated as a non-linear problem integrating two complementary spline regressors and strengthening the robustness of each regressor via semi-supervised learning. First, two spline regressors with a complementary nature are constructed based on multivariate adaptive regression splines (MARS) and smooth thin plate spline regression (TPSR). Then, a regressor boosting method based on a clustering hypothesis and semi-supervised learning is proposed to assist the training of MARS and TPSR by using the region segmentation information contained in unlabeled pixels. Next, a support vector regression (SVR) based decision fusion model is adopted to integrate the results of MARS and TPSR. Finally, GraphCut is introduced and combined with the SVR ensemble results to achieve image segmentation. Extensive experimental results on the BSDS500 and Pascal VOC benchmark datasets have demonstrated the effectiveness of our method, and comparison with experimental results has validated that the proposed method is comparable with the state-of-the-art methods for interactive natural image segmentation.

Keywords: Interactive image segmentation     Multivariate adaptive regression splines (MARS)     Ensemble learning     Thin-plate spline regression (TPSR)     Semi-supervised learning     Support vector regression (SVR)    
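
The decision-fusion step can be pictured as a regressor trained on the two base predictions. The sketch below uses scikit-learn's SVR on toy stand-ins for the MARS and TPSR outputs; it is illustrative only, not the paper's training procedure.

```python
import numpy as np
from sklearn.svm import SVR

def train_fusion(pred_mars, pred_tpsr, labels):
    """Fit an SVR that maps the two base predictions to the final foreground score."""
    X = np.column_stack([pred_mars, pred_tpsr])
    return SVR(kernel='rbf', C=1.0).fit(X, labels)

def fuse(model, pred_mars, pred_tpsr):
    return model.predict(np.column_stack([pred_mars, pred_tpsr]))

# toy stand-ins for the outputs of the two spline regressors on labelled pixels
p1, p2 = np.random.rand(300), np.random.rand(300)
y = ((p1 + p2) / 2 > 0.5).astype(float)
model = train_fusion(p1, p2, y)
print(fuse(model, p1[:5], p2[:5]))
```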

Learning to select pseudo labels: a semi-supervised method for named entity recognition Research Articles

Zhen-zhen Li, Da-wei Feng, Dong-sheng Li, Xi-cheng Lu,lizhenzhen14@nudt.edu.cn,davyfeng.c@gmail.com,dsli@nudt.edu.cn,xclu@nudt.edu.cn

Frontiers of Information Technology & Electronic Engineering 2020, Volume 21, Issue 6,   Pages 809-962 doi: 10.1631/FITEE.1800743

Abstract: Deep learning models have achieved state-of-the-art performance in named entity recognition (NER); the good performance, however, relies heavily on substantial amounts of labeled data. In some specific areas such as medical, financial, and military domains, labeled data is very scarce, while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but a large amount of entity information in unlabeled data is neglected, which may be beneficial to the NER task. In this study, we propose a semi-supervised method for NER tasks, which learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework includes three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and creating new labeled data and improving the NER model iteratively. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves comparable performance to those state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.

Keywords: Named entity recognition; Unlabeled data; Deep learning; Semi-supervised learning method
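
The iterative framework (train, pseudo-label, filter by a learned selector, retrain) can be captured in a few lines. Every callable below is a placeholder for the paper's NER model, pseudo-label generator, and label-quality module, and the threshold is an assumption.

```python
def semi_supervised_ner(train_fn, label_fn, score_fn, labeled, unlabeled,
                        rounds=3, threshold=0.9):
    """Iteratively grow the training set with pseudo-labelled sentences whose
    selector score exceeds a threshold, then retrain the model."""
    data = list(labeled)
    model = train_fn(data)
    for _ in range(rounds):
        pseudo = [(x, label_fn(model, x)) for x in unlabeled]
        kept = [(x, y) for x, y in pseudo if score_fn(x, y) >= threshold]
        if not kept:
            break
        data.extend(kept)
        model = train_fn(data)
    return model

# toy usage with trivial stand-ins
model = semi_supervised_ner(
    train_fn=lambda data: len(data),                  # "model" = training-set size
    label_fn=lambda m, x: ["O"] * len(x.split()),
    score_fn=lambda x, y: 0.95,
    labeled=[("John lives in Paris", ["B-PER", "O", "O", "B-LOC"])],
    unlabeled=["Mary works in London"],
)
print(model)
```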

A Motion-Adaptive Algorithm for Video Scan Format Conversion and the Hardware Implementation

Zhang Guanglie,Zheng Nanning,Wu Yong,Zhang Xia

Strategic Study of CAE 2001, Volume 3, Issue 6,   Pages 41-47

Abstract:

With the development of digital TV processing and fully digitalized new-generation TV, video scan format conversion has become an important technology. In this paper, a new algorithm for scan format conversion is proposed by incorporating an edge-preserving noise-reduction filter into a motion-adaptive deinterlacing algorithm. The principle and structure for implementing this algorithm in hardware are discussed, and a simulation experiment on an FPGA (field-programmable gate array) is designed accordingly. The experimental results show that the proposed algorithm is very efficient.

Keywords: scan format conversion     motion adaptive deinterlacing     edge-preserved noise-reduced filter    
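
Motion-adaptive deinterlacing, in its simplest form, weaves lines from the previous frame where little motion is detected and interpolates within the current field otherwise. The sketch below shows only that generic idea; the paper's edge-preserving noise-reduction filter and hardware structure are not modelled.

```python
import numpy as np

def motion_adaptive_deinterlace(cur_frame, prev_frame, field='even', thr=10.0):
    """Fill the missing lines of `cur_frame` (which carries only one field):
    weave lines from `prev_frame` where little motion is detected, otherwise
    interpolate vertically within the current field (bob)."""
    out = cur_frame.astype(float).copy()
    h, w = out.shape
    start = 1 if field == 'even' else 0          # rows that are missing
    for r in range(start, h, 2):
        above = out[r - 1] if r > 0 else out[r + 1]
        below = out[r + 1] if r + 1 < h else out[r - 1]
        interp = 0.5 * (above + below)           # intra-field interpolation
        motion = np.abs(above - prev_frame[r - 1 if r > 0 else r + 1])
        weave = prev_frame[r].astype(float)
        out[r] = np.where(motion < thr, weave, interp)
    return out

cur = np.random.randint(0, 256, (8, 8)).astype(float)
prev = cur + np.random.randint(-3, 4, (8, 8))
print(motion_adaptive_deinterlace(cur, prev)[1])
```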

Shot classification and replay detection for sports video summarization Research Article

Ali JAVED, Amen ALI KHAN,ali.javed@uettaxila.edu.pk

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 5,   Pages 790-800 doi: 10.1631/FITEE.2000414

Abstract: Automated analysis of sports videos is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective framework based on shot classification and replay detection for field sports videos. Accurate shot classification is mandatory to better structure the input video for further processing, i.e., key event or replay detection. Therefore, we present a lightweight convolutional neural network based method for shot classification. Then we analyze each shot for replay detection and specifically detect the successive batch of logo transition frames that identify the replay segments from the sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train the extreme learning machine for classification as replay or non-replay frames. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.

Keywords: Extreme learning machine     Lightweight convolutional neural network     Local octa-patterns     Shot classification     Replay detection     Video summarization    
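
An extreme learning machine is a single hidden layer with random input weights and closed-form output weights. A minimal version follows, with random features standing in for the local octa-pattern descriptors.

```python
import numpy as np

def train_elm(X, y, n_hidden=64, seed=0):
    """Basic extreme learning machine: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy binary task standing in for replay vs. non-replay frame features
X = np.random.rand(200, 24)
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)
W, b, beta = train_elm(X, y)
print((predict_elm(X, W, b, beta) > 0.5).mean())
```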
