Shot classification and replay detection for sports video summarization Research Article

Ali JAVED, Amen ALI KHAN,ali.javed@uettaxila.edu.pk

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 5,   Pages 790-800 doi: 10.1631/FITEE.2000414

Abstract: Automated analysis of sports is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective framework based on shot classification and replay detection for field sports videos. Accurate shot classification is mandatory to better structure the input video for further processing, i.e., key events or replay detection. Therefore, we present a lightweight convolutional neural network based method for shot classification. Then we analyze each shot for replay detection and specifically detect the successive batch of logo transition frames that identify the replay segments from the sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train the extreme learning machine for classification as replay or non-replay frames. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.

Keywords: Extreme learning machine     Lightweight convolutional neural network     Local octa-patterns     Shot classification     Replay detection     Video summarization    

Video summarization with a graph convolutional attention network Research Articles

Ping Li, Chao Tang, Xianghua Xu,patriclouis.lee@gmail.com

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 6,   Pages 902-913 doi: 10.1631/FITEE.2000429

Abstract: Video summarization has established itself as a fundamental technique for generating compact and concise video, which alleviates managing and browsing large-scale video data. Existing methods fail to fully consider the local and global relations among frames of video, leading to a deteriorated summarization performance. To address the above problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning includes the temporal branch and graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues for video frames. It learns graph embedding via a multi-layer graph convolutional network to reveal the intrinsic structure of frame samples. The context fusion part combines the output streams from the temporal branch and graph branch to create the context-aware representation of frames, on which the importance scores are evaluated for selecting representative frames to generate video summary. Experiments are carried out on two benchmark databases, SumMe and TVSum, showing that the proposed GCAN approach enjoys superior performance compared to several state-of-the-art alternatives in three evaluation settings.

Keywords: Temporal learning     Self-attention mechanism     Graph convolutional network     Context fusion     Video summarization    

A video conferencing system based on SDN-enabled SVC multicast (Project supported by the National Natural Science Foundation of China (Nos. 61573329 and 61233003), the Youth Innovation Promotion Association CAS, and the Fundamental Research Funds for the Central Universities, China) Article

En-zhong YANG,Lin-kai ZHANG,Zhen YAO,Jian YANG

Frontiers of Information Technology & Electronic Engineering 2016, Volume 17, Issue 7,   Pages 672-681 doi: 10.1631/FITEE.1601087

Abstract: A typical video conferencing connection is currently bridged by a multipoint control unit (MCU), which may cause large delays and a communication bottleneck for the whole system. With the development of network technology, a video conferencing system can be implemented based on software-defined networking (SDN), which makes the service controllable and improves scalability and flexibility. Additionally, a video encoding method called scalable video coding (SVC) can also help. In this paper, we propose a video conferencing architecture based on SDN-enabled SVC multicasting, which discards the traditional Internet group management protocol (IGMP) and MCU. The system implements SVC multicast streaming to satisfy the different device capabilities of various conference terminals. The SDN controller is responsible for dynamically managing and controlling the layers of a video stream when a conference member faces network congestion. Also, a conference manager is designed to facilitate the management of the conference members. Experimental results show that our system can not only provide flexible and controllable video delivery, but also reduce network usage while guaranteeing the quality of service (QoS) of video conferencing.

Keywords: Software-defined networking (SDN)     Multicast     Scalable video coding     Video conferencing system    

Scheme and Techniques for Hierarchical Organization of Video

Zhang Yujin,Lu Haibin

Strategic Study of CAE 2000, Volume 2, Issue 3,   Pages 18-22

Abstract:

Digital video is an important data format in multimedia information systems. The traditional video representation is just a time sequence (a video stream), so it is difficult for a computer to recognize or perceive video at the content level. To efficiently access and utilize video information, a suitable organization of video data is critical. This paper proposes a video organization scheme that arranges video into four layers: video program, episode, shot, and image frame. This hierarchical structure provides a compact and meaningful video catalogue, which can be easily used for non-linear browsing and content-based retrieval of video data. To achieve such an organization, one must not only detect the boundaries of shots and episodes, but also extract the key frames of shots and select the representative shots and frames for episodes. This paper proposes a number of suitable criteria and techniques for video segmentation and organization, and integrates these techniques into a prototype system. Some organization results using real video data are presented, which show the effectiveness of this organization scheme.

Keywords: video     organization     browsing     shot     episode     key frame     representative frame    
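The four-layer organization described in this abstract (video program, episode, shot, image frame) can be pictured as a nested data structure. The following is only a minimal sketch under stated assumptions: all class and function names are hypothetical, and the "middle frame" key-frame rule is a placeholder, not the selection criterion proposed in the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    index: int  # position of the image frame in the video stream


@dataclass
class Shot:
    frames: List[Frame]
    key_frame: int = -1  # stream index of the key frame chosen for this shot


@dataclass
class Episode:
    shots: List[Shot]


@dataclass
class VideoProgram:
    title: str
    episodes: List[Episode] = field(default_factory=list)

    def catalogue(self) -> List[List[int]]:
        """Return, for each episode, the key-frame index of every shot."""
        return [[s.key_frame for s in e.shots] for e in self.episodes]


def build_shot(start: int, end: int) -> Shot:
    """Build a shot covering frames [start, end) of the stream."""
    shot = Shot([Frame(i) for i in range(start, end)])
    # Hypothetical placeholder rule: use the middle frame as the key frame.
    shot.key_frame = shot.frames[len(shot.frames) // 2].index
    return shot


program = VideoProgram(
    "demo",
    [Episode([build_shot(0, 10), build_shot(10, 14)]),
     Episode([build_shot(14, 20)])],
)
print(program.catalogue())  # → [[5, 12], [17]]
```

A catalogue of this shape is what would support non-linear browsing: a viewer can jump from an episode to a shot to its key frame without scanning the stream.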

Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots

Xu-guang ZUO, Lu YU

Frontiers of Information Technology & Electronic Engineering 2018, Volume 19, Issue 3,   Pages 459-470 doi: 10.1631/FITEE.1601552

Abstract: The latest video coding standard High Efficiency Video Coding (HEVC) can achieve much higher coding efficiency than previous video coding standards. Particularly, by exploiting the hierarchical B-picture prediction structure, temporal redundancy among neighbor frames is eliminated remarkably well. In practice, videos available to consumers usually contain many repeated shots, such as TV series, movies, and talk shows. According to our observations, when these videos are encoded by HEVC with the hierarchical B-picture structure, the temporal correlation in each shot is well exploited. However, the long-term correlation between repeated shots has not been used. We propose a long-term prediction (LTP) scheme to use the long-term temporal correlation between correlated shots in a video. The long-term reference (LTR) frames of a source video are chosen by clustering similar shots and extracting the representative frames, and a modified hierarchical B-picture coding structure based on an LTR frame is introduced to support long-term temporal prediction. An adaptive quantization method is further designed for LTR frames to improve the overall video coding efficiency. Experimental results show that up to 22.86% coding gain can be achieved using the new coding scheme.

Keywords: High Efficiency Video Coding (HEVC)     Long-term temporal correlation     Long-term prediction     Hierarchical B-picture structure    

High-resolution spectral video acquisition Review

Lin-sen CHEN, Tao YUE, Xun CAO, Zhan MA, David J. BRADY

Frontiers of Information Technology & Electronic Engineering 2017, Volume 18, Issue 9,   Pages 1250-1260 doi: 10.1631/FITEE.1700098

Abstract: Compared with conventional cameras, spectral imagers provide many more features in the spectral domain. They have been used in various fields such as material identification, remote sensing, precision agriculture, and surveillance. Traditional imaging spectrometers generally use scanning systems. They cannot meet the demands of dynamic scenarios. This limits the practical applications for spectral imaging. Recently, with the rapid development in computational photography theory and semiconductor techniques, spectral video acquisition has become feasible. This paper aims to offer a review of the state-of-the-art spectral imaging technologies, especially those capable of capturing spectral videos. Finally, we evaluate the performances of the existing spectral acquisition systems and discuss the trends for future work.

Keywords: Multispectral/hyperspectral video acquisition     Snapshot     Under-sampling and reconstruction    

Design of Real-time Video Processing System Based on Multimedia Processor TMS320DM642

Zhao Zhen,Zhang Weining,Tian Honglei

Strategic Study of CAE 2007, Volume 9, Issue 3,   Pages 87-91

Abstract:

A real-time video processing system based on the TMS320DM642 multimedia processor is designed. The functions and working modes of the TVP5150 video decoder, the SAA7121 encoder, and the DM642 video port are presented. The working principle of the software modules is discussed. Finally, the software design and the characteristics of the system are analyzed.

Keywords: TMS320DM642     TVP5150     SAA7121     video port    

A novel robotic visual perception framework for underwater operation Research Article

Yue LU, Xingyu CHEN, Zhengxing WU, Junzhi YU, Li WEN

Frontiers of Information Technology & Electronic Engineering 2022, Volume 23, Issue 11,   Pages 1602-1619 doi: 10.1631/FITEE.2100366

Abstract:

Underwater robotic operation usually requires visual perception (e.g., object detection and tracking), but underwater scenes have poor visual quality and represent a special domain which can affect the accuracy of visual perception. In addition, detection continuity and stability are important for robotic perception, but the commonly used static accuracy based evaluation (i.e., average precision) is insufficient to reflect detector performance across time. In response to these two problems, we present a design for a novel robotic visual perception framework. First, we generally investigate the relationship between a quality-diverse data domain and visual restoration in detection performance. As a result, although domain quality has an ignorable effect on within-domain detection accuracy, visual restoration is beneficial to detection in real sea scenarios by reducing the domain shift. Moreover, non-reference assessments are proposed for detection continuity and stability based on object tracklets. Further, online tracklet refinement is developed to improve the temporal performance of detectors. Finally, combined with visual restoration, an accurate and stable underwater robotic visual perception framework is established. Small-overlap suppression is proposed to extend video object detection (VID) methods to a single-object tracking task, leading to the flexibility to switch between detection and tracking. Extensive experiments were conducted on the ImageNet VID dataset and real-world robotic tasks to verify the correctness of our analysis and the superiority of our proposed approaches. The codes are available at https://github.com/yrqs/VisPerception.

Keywords: Underwater operation     Robotic perception     Visual restoration     Video object detection    

A Motion-Adaptive Algorithm for Video Scan Format Conversion and the Hardware Implementation

Zhang Guanglie,Zheng Nanning,Wu Yong,Zhang Xia

Strategic Study of CAE 2001, Volume 3, Issue 6,   Pages 41-47

Abstract:

With the development of digital-processing TV and fully digital new-generation TV, video scan format conversion has become an important technology. In this paper, a new algorithm for scan format conversion is proposed by incorporating an edge-preserved noise-reduced filter into a motion-adaptive deinterlacing algorithm. The principle and structure for implementing this algorithm in hardware are discussed. Accordingly, a simulation experiment on an FPGA (field-programmable gate array) is designed. The experimental results show that the proposed algorithm is very efficient.

Keywords: scan format conversion     motion adaptive deinterlacing     edge-preserved noise-reduced filter    
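The general idea of motion-adaptive deinterlacing mentioned in this abstract can be sketched as follows: missing lines are copied from the previous frame where the picture is static and interpolated within the current field where motion is detected. This is a textbook-style illustration under stated assumptions, not the paper's algorithm; the motion measure, the threshold value, and the function name are hypothetical, and the edge-preserved noise-reduced filter is not reproduced.

```python
def deinterlace(curr, prev, threshold=10):
    """Fill the odd rows of `curr` (only even rows hold valid samples).

    `curr` and `prev` are lists of rows of integer pixel values with the
    same dimensions; `prev` is a complete, already deinterlaced frame.
    """
    height = len(curr)
    out = [row[:] for row in curr]
    for y in range(1, height, 2):
        up = y - 1
        # At the bottom edge, reuse the line above as the lower neighbour.
        down = y + 1 if y + 1 < height else y - 1
        for x in range(len(curr[y])):
            # Assumed motion measure: change of the valid neighbours since `prev`.
            motion = max(abs(curr[up][x] - prev[up][x]),
                         abs(curr[down][x] - prev[down][x]))
            if motion > threshold:
                # Moving area: interpolate within the current field.
                out[y][x] = (curr[up][x] + curr[down][x]) // 2
            else:
                # Static area: copy (weave) the pixel from the previous frame.
                out[y][x] = prev[y][x]
    return out


prev_frame = [[10, 10], [20, 20], [30, 30], [40, 40]]
curr_field = [[10, 100], [0, 0], [30, 120], [0, 0]]
result = deinterlace(curr_field, prev_frame)
print(result)  # → [[10, 100], [20, 110], [30, 120], [40, 120]]
```

In the example, the left column is unchanged between frames and is woven from `prev_frame`, while the right column changed strongly and is interpolated from the lines above and below.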

Unsupervised object detection with scene-adaptive concept learning Research Articles

Shiliang Pu, Wei Zhao, Weijie Chen, Shicai Yang, Di Xie, Yunhe Pan,xiedi@hikvision.com

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 5,   Pages 615-766 doi: 10.1631/FITEE.2000567

Abstract: Object detection is one of the hottest research directions in computer vision, has already made impressive progress in academia, and has many valuable applications in the industry. However, the mainstream detection methods still have two shortcomings: (1) even a model that is well trained using large amounts of data still cannot generally be used across different kinds of scenes; (2) once a model is deployed, it cannot autonomously evolve along with the accumulated unlabeled scene data. To address these problems, and inspired by visual knowledge theory, we propose a novel scene-adaptive evolution algorithm that can decrease the impact of scene changes through the concept of object groups. We first extract a large number of object proposals from unlabeled data through a pre-trained detection model. Second, we build the dictionary of object concepts by clustering the proposals, in which each cluster center represents an object prototype. Third, we look into the relations between different clusters and the object information of different groups, and propose a graph-based group information propagation strategy to determine the category of an object concept, which can effectively distinguish positive and negative proposals. With these pseudo labels, we can easily fine-tune the pre-trained model. The effectiveness of the proposed method is verified by performing different experiments, and significant improvements are achieved.

Keywords: Visual knowledge     Unsupervised video object detection     Scene-adaptive learning    

Crowd modeling based on purposiveness and a destination-driven analysis method Research Articles

Ning Ding, Weimin Qi, Huihuan Qian,hhqian@cuhk.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 10,   Pages 1351-1369 doi: 10.1631/FITEE.2000312

Abstract: This study focuses on the multiphase flow properties of crowd motions. Stability is a crucial forewarning factor for the crowd. To evaluate the behaviors of newly arriving pedestrians and the stability of a crowd, a novel motion structure analysis model is established based on purposiveness, and is used to describe the continuity of pedestrians’ pursuing their own goals. We represent the crowd with self-driven particles using a destination-driven analysis method. These self-driven particles are trackable feature points detected from human bodies. Then we use trajectories to calculate these self-driven particles’ purposiveness and select trajectories with high purposiveness to estimate the common destinations and the inherent structure of the crowd. Finally, we use these common destinations and the crowd structure to evaluate the behavior of newly arriving pedestrians and crowd stability. Our studies show that the purposiveness parameter is a suitable descriptor for middle-density human crowds, and that the proposed destination-driven analysis method is capable of representing complex crowd motion behaviors. Experiments using synthetic and real data and videos of both human and animal crowds have been conducted to validate the proposed method.

Keywords: Crowd modeling     Intelligent video surveillance     Crowd stability    

Learning-based parameter prediction for quality control in three-dimensional medical image compression Research Articles

Yuxuan Hou, Zhong Ren, Yubo Tao, Wei Chen,3140104190@zju.edu.cn,renzhong@cad.zju.edu.cn

Frontiers of Information Technology & Electronic Engineering 2021, Volume 22, Issue 9,   Pages 1169-1178 doi: 10.1631/FITEE.2000234

Abstract: Quality control is of vital importance in compressing three-dimensional (3D) medical imaging data. Optimal compression parameters need to be determined based on the specific quality requirement. In High Efficiency Video Coding (HEVC), regarded as the state-of-the-art compression tool, the quantization parameter (QP) plays a dominant role in controlling quality. The direct application of a video-based scheme in predicting the ideal parameters for 3D medical image compression cannot guarantee satisfactory results. In this paper we propose a parameter prediction scheme to achieve efficient quality control. Its kernel is a support vector regression (SVR) based learning model that is capable of predicting the optimal QP from both video-based and structural image features extracted directly from raw data, avoiding time-consuming processes such as pre-encoding and iteration, which are often needed in existing techniques. Experimental results on several datasets verify that our approach outperforms current video-based methods.

Keywords: Medical image compression     High Efficiency Video Coding (HEVC)     Quality control     Learning-based method    

A Weighted Block-matching Criterion for the Hardware Implementation of Motion Estimators

Zhang Xia,Zheng Nanning,Zhang Guanglie,Wu Yong,Wang Shaorui,Xu Weipu

Strategic Study of CAE 2002, Volume 4, Issue 1,   Pages 47-53

Abstract:

Among digital video processing algorithms, motion-compensated algorithms perform well because they take the motion information in the video signal into account. The hardware implementation of the motion estimator is the core of the various motion-compensated digital video processing methods that will be applied in real systems. The block-matching motion estimation algorithm is widely used in real systems because of its low computational complexity and ease of implementation, and because the block-matching criterion is invoked very frequently in hardware systems. A new criterion called weighted minimized maximum error is proposed in this paper. This criterion can reduce the complexity of the motion estimator, decrease the hardware area, and increase the hardware speed. In addition, the criterion is well suited to the recursive searching strategy, which has an inherent weakness called error propagation.

Keywords: video processing     motion compensation     motion estimation     block-matching criterion    
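Block matching with a minimized-maximum-error style criterion can be illustrated with a small full-search sketch. The abstract does not define the weighting, so the optional per-pixel `weights` table (defaulting to 1 everywhere) is an assumption, as are the function names and the exhaustive search; the paper targets a recursive search in hardware, which is not reproduced here.

```python
def max_error(cur, ref, bx, by, dx, dy, size, weights=None):
    """Largest (optionally weighted) absolute pixel difference of one candidate."""
    worst = 0
    for j in range(size):
        for i in range(size):
            w = 1 if weights is None else weights[j][i]  # assumed weighting
            err = w * abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
            worst = max(worst, err)
    return worst


def best_vector(cur, ref, bx, by, size, search):
    """Full search: return (cost, dx, dy) minimizing the maximum error."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = max_error(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


ref = [[0, 0, 0, 0],
       [0, 9, 8, 0],
       [0, 7, 6, 0],
       [0, 0, 0, 0]]
cur = [[9, 8, 0, 0],
       [7, 6, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(best_vector(cur, ref, 0, 0, 2, 1))  # → (0, 1, 1)
```

Compared with a sum of absolute differences, a maximum-based cost needs only a comparator per pixel instead of an accumulator, which is one plausible reason such a criterion suits hardware.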

Summary of the Symposium on the Current Situation and Development Trend of New Materials in the 21st Century

Journal Article
