1 Engineering research fronts

1.1 Development trends in the top 10 engineering research fronts

The top 10 engineering research fronts reviewed by the Information and Electronic Engineering Group are summarized in Table 1.1.1. They span electronic science and technology, optical engineering and technology, instrument science and technology, information and communication engineering, computer science and technology, and control science. Among these fronts, "radar stealth technology," "new-generation mobile communication technology," "quantum coherence measurement and decoherence control," "soft robot control method," "high-resolution remote sensing scene classification and image processing technology," and "human body gesture recognition method based on deep neural network" are based on the popular topics provided by Clarivate Analytics, whereas "interpretable deep learning," "networked collaborative sensing and control theory," "blockchain technology," and "silicon-based optical interconnect chip technology" were recommended by experts.

The number of core papers related to each front published from 2012 to 2017 is shown in Table 1.1.2. Judging by the number of core papers published in recent years, "high-resolution remote sensing scene classification and image processing technology" is the most active front.

(1)  Radar stealth technology

Radar stealth technology is also known as radar low observable technology or radar target signature control technology. It controls the scattering direction, polarization mode, and radiation intensity and pattern of the electromagnetic wave incident on the target surface through the integrated design of the target's shape, materials, and circuits, thereby reducing the probability of detection and recognition by enemy radar systems. Using radar is the

Table 1.1.1 Top 10 engineering research fronts in information and electronic engineering

| No. | Engineering research front | Core papers | Citations | Citations per paper | Mean year | Percentage of consistently-cited papers | Patent-cited papers |
|---|---|---|---|---|---|---|---|
| 1 | Radar stealth technology | 24 | 788 | 32.83 | 2015.42 | 25.00% | 0.00 |
| 2 | Interpretable deep learning | 5 | 148 | 29.60 | 2015.40 | | |
| 3 | New-generation mobile communication technology | 9 | 386 | 42.89 | 2016.00 | 33.30% | 0.00 |
| 4 | Networked collaborative sensing and control theory | 139 | 5602 | 40.30 | 2013.45 | | |
| 5 | Blockchain technology | 16 | 213 | 13.31 | 2015.88 | | |
| 6 | Quantum coherence measurement and decoherence control | 37 | 1015 | 27.43 | 2015.81 | 40.50% | 0.00 |
| 7 | Soft robot control method | 24 | 1498 | 62.42 | 2014.67 | 20.80% | 0.00 |
| 8 | High-resolution remote sensing scene classification and image processing technology | 69 | 2327 | 33.72 | 2015.55 | 36.20% | 0.00 |
| 9 | Silicon-based optical interconnect chip technology | 21 | 790 | 37.62 | 2013.52 | | |
| 10 | Human body gesture recognition method based on deep neural network | 7 | 229 | 32.71 | 2016.29 | 28.60% | 0.00 |

Table 1.1.2 Annual number of core papers published for each of the top 10 engineering research fronts in information and electronic engineering

| No. | Engineering research front | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
|---|---|---|---|---|---|---|---|
| 1 | Radar stealth technology | 0 | 1 | 3 | 6 | 13 | 1 |
| 2 | Interpretable deep learning | 0 | 1 | 1 | 0 | 1 | 2 |
| 3 | New-generation mobile communication technology | 0 | 0 | 1 | 2 | 2 | 4 |
| 4 | Networked collaborative sensing and control theory | 43 | 37 | 24 | 23 | 12 | 0 |
| 5 | Blockchain technology | 0 | 0 | 2 | 3 | 6 | 5 |
| 6 | Quantum coherence measurement and decoherence control | 0 | 0 | 0 | 11 | 22 | 4 |
| 7 | Soft robot control method | 1 | 1 | 8 | 10 | 3 | 1 |
| 8 | High-resolution remote sensing scene classification and image processing technology | 1 | 3 | 7 | 18 | 26 | 14 |
| 9 | Silicon-based optical interconnect chip technology | 5 | 4 | 8 | 4 | 0 | 0 |
| 10 | Human body gesture recognition method based on deep neural network | 0 | 0 | 0 | 2 | 1 | 4 |

most efficient method of detecting targets from a long distance; thus, radar stealth technology has been the focus of stealth technology development since its inception.

Considering electromagnetic wave manipulation in technology implementation, radar stealth technology mainly includes shape stealth, material stealth, active stealth, and stealth-integrated design and evaluation. In shape stealth, the incident direction and scattering mode of the electromagnetic wave are adjusted through shape design to achieve the stealth effect; the aim is to reflect the incident wave toward nonthreatening directions and to reduce or block the strong scattering sources of the target surface, such as dihedral/trihedral corners and discontinuous structures. In material stealth, electromagnetic waves are attenuated, or their radiation is altered, by applying or mounting special functional materials on the target surface or its key parts. Active stealth refers to the real-time generation of electromagnetic field signals opposite to the target's scattered field, according to the incident electromagnetic wave and the target's state, thereby realizing zero scattering through spatial cancellation and reducing the target's radar cross section (RCS). Stealth-integrated design and evaluation must be emphasized in the application of radar stealth technology; this is one of the primary reasons that, although the principles of stealth technology are openly known, only a few countries have mastered it.

As detection and anti-detection technologies continue to progress, the main directions of future radar stealth technology development are: ① very low and ultra-low observables; ② wideband and omnidirectional stealth; ③ dual-/multi-static radar stealth; ④ integrated and lightweight stealth design; and ⑤ reduction of the disturbance/attenuation fields.

(2)  Interpretable deep learning

In recent years, deep learning methods, represented by deep convolutional neural networks, recurrent neural networks, generative adversarial networks, and deep reinforcement learning, have been applied in fields such as image classification and target detection, speech recognition and synthesis, and natural language processing, and their performance has improved dramatically. Although deep learning exhibits excellent performance in various artificial intelligence (AI) applications, its interpretability has always been its weakness. Deep neural networks currently achieve high discriminative ability by stacking multiple layers of nonlinear mapping functions for layer-by-layer abstraction; this produces a black-box effect, making it difficult to establish the relationship between the internal network structure and learned parameters on one hand and the decision output on the other. Interpretable AI can break through the primary bottlenecks of deep learning, such as effective learning from small samples or weakly labeled data, human–computer interaction learning at the semantic level, and semantic debugging of neural network representations.

Currently, the study of understanding or disentangling complex neural network representations to improve their interpretability mainly covers five aspects: ① visualization of the convolutional neural network (CNN) representations in intermediate network layers; ② diagnosis of CNN representations by relating convolutional features and their mapping space to different semantic categories; ③ disentangling the mixture of patterns encoded in different convolutional layers; ④ building interpretable deep network models, such as interpretable CNNs and capsule networks; and ⑤ semantic-level learning via human–computer interaction. Moreover, adversarial machine learning probes the vulnerability of deep learning models by constructing adversarial examples. Preliminary studies have shown that adding perturbation-based adversarial examples to a meta-predictor, or constructing influence functions that trace a model's predictions back to its training data, helps open the black box of a deep model and, to a certain extent, interpret its predicted behavior or decision boundaries.
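
As a concrete illustration of direction ①, the following is a minimal sketch (assuming PyTorch and a pretrained torchvision ResNet-18; the chosen layer and the channel-ranking heuristic are illustrative) of how intermediate feature maps can be captured with a forward hook for visual inspection:

```python
# Minimal sketch of direction ①: capture the feature maps of an intermediate
# CNN layer with a forward hook (PyTorch/torchvision assumed).
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()

activations = {}

def hook(module, inputs, output):
    # Store the layer output (feature maps) for later inspection.
    activations["layer2"] = output.detach()

# Register the hook on an intermediate layer of interest.
handle = model.layer2.register_forward_hook(hook)

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed input image
with torch.no_grad():
    model(x)
handle.remove()

fmap = activations["layer2"][0]          # shape: (channels, H, W)
print(fmap.shape)
# Rank channels by mean activation; strongly responding channels are the
# usual starting point for visual inspection.
print(fmap.mean(dim=(1, 2)).topk(5).indices)
```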

Combining rule-based symbolic reasoning with data-driven learning can enhance the interpretability of intelligent learning processes. For example, abductive learning combines neural perception with logical reasoning so that the learning process can directly use domain knowledge, thereby enhancing its interpretability. Both attention and memory play an important role in human cognitive reasoning, especially for knowledge acquisition, understanding, and reasoning over sequence data such as text, speech, and video. Constructing deep neural reasoning mechanisms supported by attention mechanisms and memory structures to enhance the interpretability of intelligent learning and reasoning is thus an important research direction of interpretable deep learning.

(3)  New-generation mobile communication technology

To cope with the explosive growth of mobile data traffic, the simultaneous access of massive numbers of devices, and the emergence of various future services, the new generation of mobile communication technology needs to provide millisecond-level end-to-end latency and connectivity for hundreds of billions of devices, deliver peak transmission rates at the Gbit/s level, and support ultra-high mobility, traffic density, connection density, and other diverse application scenarios. The new generation of mobile communication technology will integrate various wireless access modes to achieve flexible network deployment, operation, and maintenance; comprehensively enhance spectrum, energy, and cost efficiency; and promote the sustainable development of the mobile communication industry. It involves the design, development, and production of new-generation mobile chips and related core technologies, and more broadly the development of the entire information industry. It is directly related to industrial transformation and upgrading, and has a direct and significant impact on both manufacturing and services. Furthermore, developing a communication technology involves global standards, numerous patents, and huge network construction costs. Thus, for every country, the development of a new mobile communication technology brings great economic and social benefits.
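
For scale, the Gbit/s-level peak rates quoted above can be related to the Shannon capacity; the bandwidth and SNR below are illustrative assumptions, not figures from this report:

```latex
C = B \log_2(1 + \mathrm{SNR})
  = 4\times 10^{8}\,\mathrm{Hz} \times \log_2\!\left(1 + 10^{20/10}\right)
  \approx 2.66\ \mathrm{Gbit/s}
```

With an assumed 400 MHz of high-frequency bandwidth and 20 dB SNR, a single link already approaches 2.7 Gbit/s, which is why wide millimeter-wave bandwidths and high spectral efficiency (e.g., via large-scale antennas) are central to the new generation of systems.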

(4)  Networked collaborative sensing and control theory

A networked collaborative sensing and control system usually consists of a set of sensors and controllers connected via communication networks: the sensors acquire information from the environment to be monitored, while the controllers act on the environment via control signals. Such a system can both sense and change the physical world and is applied in various areas, such as disaster relief, intelligent buildings, factory automation, underground safety control, and biochemical attack detection.

Compared with conventional centralized sensing and control methods, networked distributed cooperative sensing and control methods have the following advantages. ① Decentralizing the sensing and control functions improves the robustness of the system to failures of sensing and control nodes. ② In large-scale distributed systems, achieving full network synchronization is difficult, and nodes may work asynchronously. Conventional centralized methods cannot deal with asynchronous information, but in a network-based distributed structure, the asynchronous sensing and control problem can be solved via hierarchical information processing. ③ The computational efficiency of a system can be greatly improved by decentralizing the sensing and control functions: through information interaction among subsystems, computational complexity can be reduced while achieving performance comparable to centralized sensing and control.

In recent years, research on the theory and methods of distributed sensing and control based on information interaction has received great attention. However, many theoretical and application-related problems remain open, including distributed sensing and control under cooperative and noncooperative modes of nodes (such as controllers and sensors), asynchronous multi-rate distributed sensing and control when subsystems work asynchronously, the extensibility of distributed cooperative sensing and control (allowing arbitrary entry and exit of nodes), and the influence of network topology and vulnerability on distributed sensing and control. Breakthroughs in these areas are expected to effectively promote the development of intelligent manufacturing, networked autonomous systems, smart grids, and other fields under the industrial Internet.
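
As a minimal sketch of one building block of distributed cooperative sensing, the following numpy example runs average consensus over a communication graph: each node repeatedly exchanges values with its neighbors only, yet all nodes converge to the network-wide mean without a fusion center. The ring topology and step size are illustrative assumptions:

```python
# Distributed average consensus: a basic building block of networked
# cooperative sensing. Each node averages with its neighbors; all local
# values converge to the global mean without any central fusion center.
import numpy as np

n = 6
A = np.zeros((n, n))                     # adjacency matrix of a ring graph
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1

x = np.random.randn(n)                   # local sensor measurements
target = x.mean()

eps = 0.3                                # step size; eps < 1/max_degree suffices
for _ in range(200):
    # x_i <- x_i + eps * sum_j a_ij (x_j - x_i), applied synchronously
    x = x + eps * (A @ x - A.sum(axis=1) * x)

print(np.allclose(x, target))            # True: all nodes agree on the mean
```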

(5)  Blockchain technology

Blockchain technology, which combines peer-to-peer (P2P) networking, cryptography, and consensus algorithms, aims to provide a mechanism for transferring information and value in untrusted environments and is a cornerstone for the development of the future Internet. Its features include tamper-resistant data, collective system maintenance, and information disclosure. As a versatile technique, blockchain is expanding distributed-technology applications from digital currencies into other areas and integrating innovations across industries. It has the following three advantages. ① The consensus algorithm ensures that data on the blockchain are secure and difficult to tamper with. ② Each node holds a copy of the data, so the system is redundant and reliable. ③ Smart contracts allow decentralized applications to execute automatically.
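
The tamper-resistance claim in ① can be made concrete with a minimal hash-linked chain (consensus and P2P networking omitted; the transactions are illustrative): altering any historical block breaks every subsequent hash link.

```python
# Minimal sketch of blockchain tamper-evidence: each block stores the hash
# of its predecessor, so modifying any historical record invalidates all
# later links. Consensus and networking are deliberately omitted.
import hashlib
import json

def block_hash(block):
    # Hash a canonical JSON serialization of the block contents.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "prev": "0" * 64, "tx": "genesis"}]
for i, tx in enumerate(["A pays B 5", "B pays C 2"], start=1):
    chain.append({"index": i, "prev": block_hash(chain[-1]), "tx": tx})

def verify(chain):
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

print(verify(chain))              # True
chain[1]["tx"] = "A pays B 500"   # tamper with history
print(verify(chain))              # False: the altered block breaks the links
```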

From single applications to multiple fields, blockchain technology still has a long way to go. It will continue to evolve, and technical elements such as consensus algorithms, sharded services, processing methods, and organizational forms will keep changing. Several challenges remain. First, performance and scalability cannot yet meet requirements: transaction throughput and storage bandwidth are far below what practical applications demand. Currently, Bitcoin's throughput is about seven transactions per second, and Ethereum's is about 14 transactions per second. How to improve transaction throughput without compromising overall system security is worth studying; compressing the block interval, increasing the block size, and sharding can all effectively improve throughput. Second, data privacy and access control require improvement. In existing public blockchains, every participant obtains a complete data backup; all data are transparent to participants, and it is impossible to reveal only specific information to specific participants. Ensuring transaction privacy without degrading public-blockchain execution efficiency is still a challenge; current research includes coin-mixing, zero-knowledge-proof, and ring-signature mechanisms for protecting users' transaction privacy. Third, the governance mechanism requires improvement. Public blockchain communities have explored various upgrade mechanisms, such as "hard forks" and "soft forks," but the remaining problems cannot be ignored: public blockchains cannot be "shut down," and their bug fixes are extremely troublesome, so a security breach can be fatal.
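
The throughput figures quoted above follow from simple arithmetic; for Bitcoin, with roughly 1 MB blocks every 600 s and an assumed average transaction size of about 250 bytes:

```latex
\text{throughput} \approx \frac{\text{block size}/\text{tx size}}{\text{block interval}}
= \frac{10^{6}\,\mathrm{B} / 250\,\mathrm{B}}{600\,\mathrm{s}} \approx 6.7\ \mathrm{tx/s}
```

This back-of-envelope calculation is the origin of the oft-quoted ~7 tx/s figure, and it shows directly why larger blocks, shorter block intervals, or sharding raise throughput.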

(6)   Quantum coherence measurement and decoherence control

Quantum coherence and quantum phase transitions play an important role in quantum information. Although an ideal closed system remains coherent throughout its evolution, an actual system is never strictly closed; it inevitably becomes entangled with its environment, causing decoherence. In quantum mechanics, the quantum coherence of an open quantum system is gradually lost over time owing to quantum entanglement with the external environment; this effect is called "quantum decoherence." Quantum decoherence is thus a consequence of entanglement between a quantum system and its environment, and the interference phenomena caused by quantum coherence disappear because of it. Quantum decoherence turns the quantum behavior of a system into classical behavior, a process called the quantum-to-classical transition. The German physicist H. Dieter Zeh first proposed the concept of quantum decoherence in 1970, and since the 1980s it has become a popular research topic.
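
For a single qubit, the effect can be stated compactly: entanglement with the environment damps the off-diagonal (coherence) terms of the density matrix while leaving the populations intact (standard textbook notation; T2 denotes the dephasing time):

```latex
\rho(0) = \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix}
\quad\longrightarrow\quad
\rho(t) = \begin{pmatrix} \rho_{00} & \rho_{01}\,e^{-t/T_2} \\ \rho_{10}\,e^{-t/T_2} & \rho_{11} \end{pmatrix}
```

As t grows well beyond T2, the state becomes a classical statistical mixture, which is precisely the quantum-to-classical transition described above.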

Quantum decoherence usually occurs very quickly; therefore, placing macroscopic or mesoscopic objects in superposition is difficult. To experimentally verify the quantum decoherence effect, observe the smooth boundary between quantum and classical behaviors, test and improve theoretical models of quantum decoherence, and detect any deviation from quantum mechanical evolution, the following challenging tasks must be performed: ① preparing quantum superpositions of several distinguishable macroscopic or mesoscopic states; ② designing methods to confirm the quantum superposition; ③ ensuring the decoherence time is long enough for correct observation; and ④ designing methods to monitor quantum decoherence.

The influence of decoherence on quantum information science can be roughly divided into two parts: quantum computing and quantum communication. In quantum information science, the state of a quantum system carries the information. Quantum decoherence causes the loss of some or all of this information and thus produces errors during quantum computation. In quantum communication, information in the transmission channel is prone to disturbance and interference, and the receiver at the end of the channel may receive noisy and erroneous information; assistance from an error-correction system, such as an encoder, is therefore required.

(7)  Soft robot control method

Research on the control of soft robots essentially addresses inverse kinematics problems. Unlike conventional rigid-bodied robots driven by electric motors, soft robots generate motion using soft materials similar to natural muscles, which endow them with animal-like agility and flexibility, increase adaptability in complex working environments, and reduce hazards in human–machine interactions. Theoretically, a soft robot has an infinite number of degrees of freedom, which makes extremely complicated motions possible, such as stretching, bending, and twisting; motion control of soft robots therefore faces enormous challenges. Soft materials currently employed as artificial muscles include pneumatic artificial muscles, shape memory alloys, and electroactive polymers. Although these materials exhibit intriguing attributes, such as high energy density, large deformation, and low weight, they are usually highly nonlinear with strong viscoelasticity. Moreover, owing to their inherent compliance, soft robots are likely to work in environments with remarkably high uncertainties. Therefore, fine adaptability and strong robustness are preferred in the control of soft robots. Generally, control of soft robots can be achieved using two types of approaches: ① model-based and ② learning-based control approaches.

The former develops a dynamic model of the system through first principles or data-driven methods and achieves motion control using conventional feedback control schemes. Its drawback is its dependence on the model: kinematic/dynamic models of soft robots carry large uncertainties during interaction with the environment. To overcome this limitation, the latter approach, inspired by the similarities between soft actuators and natural muscles, draws lessons from the motor control of natural muscles and compensates for model uncertainties through online learning, typically realized with an artificial neural network. However, much exploration is still required to optimize the network structure and tune the parameters.
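
A minimal numpy sketch of the learning-based idea follows: a proportional feedback loop is augmented with a feedforward term learned online to compensate an unknown, saturating actuator nonlinearity. The plant, feature basis, gains, and learning rate are illustrative assumptions, not a published soft robot controller:

```python
# Learning-based control sketch: proportional feedback plus an online-learned
# feedforward term that compensates the unknown, nonlinear actuator gain.
import numpy as np

def plant(x, u):
    # Unknown-to-the-controller dynamics with a soft, saturating nonlinearity.
    return 0.9 * x + 0.5 * np.tanh(u)

def features(r):
    # Simple polynomial basis for the learned inverse model.
    return np.array([r, r**3, 1.0])

w = np.zeros(3)                    # online-learned feedforward weights
kp, eta = 2.0, 0.05                # feedback gain and learning rate
x = 0.0
for k in range(2000):
    r = np.sin(0.01 * k)           # slowly varying reference trajectory
    e = r - x
    u = kp * e + w @ features(r)   # feedback + learned feedforward
    x = plant(x, u)
    # Gradient-style update: shift the feedforward to reduce tracking error.
    w += eta * e * features(r)

print(f"final tracking error: {abs(r - x):.3f}")   # shrinks as w adapts
```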

(8)   High-resolution remote sensing scene classification and image processing technology

High-resolution remote sensing imagery is a type of large-scale spatial data used in applications such as emergency response and disaster mitigation, which affect the national economy and people's livelihoods. Remote sensing scene classification belongs to the research category of holistic image understanding: it analyzes, judges, interprets, and labels image scenes, i.e., it is a mapping process that learns and discovers the semantic content tags of images and scenes. Scenes usually contain multiple targets, and scene classification is a key issue in image understanding. Common scene classification algorithms include classification based on local and middle-level semantics and classification based on semantic topic models. By feature hierarchy, scene classification methods can be divided into two main groups: low-level and middle-level feature descriptions. Low-level feature descriptions tend to generalize poorly and have difficulty classifying images outside the training set; thus, most current scene classification algorithms focus on middle-level semantic modeling.

The middle-level feature is an aggregation and integration of low-level features; its essence is to use statistical distributions to establish a relationship between features and categories. Global low-level features generally cannot reflect local objects, whereas with local low-level feature descriptions, fusing multiple local features and applying ensemble learning can improve the recognition rate of scene classification. Scene classification based on the bag-of-visual-words (BoVW) model is a widely used middle-level semantic algorithm. The BoVW model does not need to analyze the specific target composition of the scene; it builds visual words from the statistical characteristics of the scene's low-level features and then represents the image scene by the distribution of visual words over the image, and incorporating the spatial co-occurrence and context of the vocabulary further helps interpret the semantic structure of the scene. However, the vocabulary size of the BoVW model must be set in advance and is not known a priori, and the generated vocabulary tends to be strongly correlated with the training samples, which is an important factor affecting the robustness of the algorithm.
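
The BoVW pipeline described above can be sketched in a few lines (assuming scikit-learn; the random descriptors stand in for SIFT/HOG-style local features extracted from real scenes): cluster descriptors into a vocabulary, encode each image as a word histogram, and train a classifier:

```python
# Minimal BoVW scene classification sketch: vocabulary by k-means clustering,
# images as normalized visual-word histograms, then a linear SVM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_images, n_desc, dim, k = 40, 100, 32, 16

# Two synthetic scene classes with differently distributed local descriptors.
descriptors = [rng.normal(loc=(i % 2), scale=1.0, size=(n_desc, dim))
               for i in range(n_images)]
labels = np.array([i % 2 for i in range(n_images)])

# 1) Build the visual vocabulary by clustering all local descriptors.
vocab = KMeans(n_clusters=k, n_init=10, random_state=0).fit(np.vstack(descriptors))

# 2) Encode each image as a normalized histogram of visual-word occurrences.
def encode(desc):
    words = vocab.predict(desc)
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / hist.sum()

X = np.array([encode(d) for d in descriptors])

# 3) Train and evaluate a linear SVM on the BoVW histograms.
clf = LinearSVC().fit(X[:30], labels[:30])
print("test accuracy:", clf.score(X[30:], labels[30:]))
```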

Statistical learning theory provides relatively mature algorithms. For scene classification of multi-sensor fused remote sensing images, support vector machines based on the structural risk minimization principle and random forest algorithms using bootstrap resampling have been reported. To classify scenes in airborne radar and multispectral images, random forests are often combined with the Markov random field (MRF), a classical prior model used as an unsupervised image segmentation tool. Recently, semantic topic models originating from text classification research, such as pLSA and LDA, have achieved good experimental results in scene classification of spaceborne imagery (e.g., landslide scenes) and airborne aerial photography.

Middle-level semantic scene classification alleviates the semantic-gap problem to a certain extent; however, there are still no good solutions for changes in scene scale, differences in sensor viewing angle and acquisition time, or variations in the combination of semantic objects.

(9)  Silicon-based optical interconnect chip technology

Silicon is not only an electronic material but also a photonic material; using silicon as the substrate, photonic devices can be fabricated with existing integrated circuit processes. Silicon-based optical interconnect chip technology uses silicon and silicon-based substrate materials as optical media to fabricate photonic and optoelectronic devices (including silicon-based optical transceivers, optical modulators, and optical waveguides) through integrated circuit processes, and then uses these devices to process and manipulate photons to achieve optical interconnection between systems, motherboards, chips, CPUs, and CPU cores. Compared with electrical interconnection, inter-chip and on-chip optical interconnection has the fundamental advantages of ultra-high bandwidth, high speed, low power consumption, low distortion, low crosstalk, and immunity to electromagnetic interference. The main research directions of silicon-based photonic integration for optical interconnection are as follows: ① light sources that emit light waves as the information carrier; ② optical waveguides that transmit optical signals; ③ modulators that load the electrical signal from the computing unit onto the optical carrier; ④ optical receivers that receive the optical signal and convert it into an electrical signal fed back to the computing unit; and ⑤ system integration.
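
As an illustration of direction ③, the classic Mach–Zehnder modulator loads an electrical signal onto the optical carrier through voltage-controlled phase interference; this is a textbook relation, not a result specific to this report:

```latex
\frac{I_{\mathrm{out}}}{I_{\mathrm{in}}} = \cos^{2}\!\left(\frac{\Delta\varphi}{2}\right),
\qquad \Delta\varphi = \pi \frac{V}{V_{\pi}}
```

Here V_π is the drive voltage producing a π phase shift between the two interferometer arms, so swinging the drive signal between 0 and V_π switches the output intensity between its maximum and minimum, encoding the data onto light.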

Although silicon has many unique advantages as an optical interconnect material, it cannot by itself realize all the functions of an optical interconnect device. For example, optical transmission and detection impose contradictory requirements: transmission requires that the material be transparent to photons, whereas detection requires that the material absorb them. Silicon is an indirect-bandgap semiconductor in which stimulated emission is difficult to achieve; thus, using it as a light-source material is difficult. Silicon also faces technical bottlenecks in optical interconnect applications; for example, the silicon-based photonic processors developed in 2015 and 2018 by the University of California, Berkeley, and other institutes, as reported in Nature, used external light sources. Therefore, an important development trend is to find other materials compatible with silicon-based CMOS processes to compensate for the deficiencies of silicon in optical interconnect chip technology.

(10)  Human body gesture recognition method based on deep neural network

Human motion recognition based on smartphones and wearable devices, such as wristbands and watches, has become a mainstream recognition approach. Traditional machine learning methods, such as support vector machines, Bayesian networks, and time- and frequency-domain analysis, require features to be extracted using expert knowledge of human motion. Neural network-based methods are still few, and even these often rely on manually extracted features. Feature extraction is a key step in machine learning and deep learning; likewise, for human motion recognition, feature extraction from sensor data is extremely important.

Human motion recognition is an important research topic, especially given the current popularity of smartphones and smart wearable devices. The machine learning methods used for human motion recognition mainly include traditional support vector machines, decision trees, k-nearest neighbors (KNN), naive Bayes, neural networks, and deep learning. Training data sources include a single acceleration sensor or a combination of gyroscopes, magnetometers, and even sound information. Sensor placement in motion recognition is mainly either fixed (multiple sensors are generally placed at fixed positions) or nonfixed. Feature extraction is conducted mainly in the time domain and, in fewer cases, in the frequency domain.
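
A minimal sketch of the deep learning alternative (assuming PyTorch; the window length, channel counts, and class count are illustrative): a 1D CNN that learns features directly from raw tri-axial accelerometer windows instead of hand-crafted time/frequency features:

```python
# 1D CNN for human activity recognition from raw accelerometer windows.
import torch
import torch.nn as nn

class HARNet(nn.Module):
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(3, 32, kernel_size=5, padding=2),  # 3 accel axes in
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                     # global pooling
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                                # x: (batch, 3, T)
        z = self.features(x).squeeze(-1)                 # (batch, 64)
        return self.classifier(z)

model = HARNet()
window = torch.randn(8, 3, 128)      # e.g., 2.56 s windows sampled at 50 Hz
logits = model(window)
print(logits.shape)                  # torch.Size([8, 6])
```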

1.2 Interpretations for three key engineering research fronts

1.2.1 Radar stealth technology

Radar stealth (invisibility) technology, also known as radar low observability technology, is formally called radar target signature control technology. It manipulates the scattering direction, polarization, intensity, and pattern of the electromagnetic wave illuminating the target surface through the integrated design of shape, materials, circuits, and other aspects, thereby reducing the probability of detection and recognition by enemy radar systems. Radar is the most efficient method of detecting remote targets; thus, radar stealth has always been the focus of stealth technology.

Considering electromagnetic wave manipulation in technology implementation, radar stealth technology includes mainly shape stealth, material stealth, active stealth, and stealth-integrated design and evaluation.

Shape stealth was the first radar stealth technique to receive attention and development. It regulates the incident direction and scattering mode of electromagnetic waves through shape design to achieve stealth; the focus is to reflect incident waves toward nonthreatening directions and to reduce or block strong scattering sources, such as dihedral/trihedral corners of the target surface, as well as scattering from discontinuous structures. Taking a stealth aircraft as an example, the critical technologies to be solved in shape stealth design are the integrated design of stealth shape and aerodynamic layout, real-time electromagnetic computation for electrically large targets, and integrated rapid forming technology.

Material stealth attenuates electromagnetic waves or alters their radiation by applying or mounting special functional materials on the target surface or its key parts (such as electronic-system antennas and some strong scattering points). According to the electromagnetic regulation mechanism of the materials, the stealth materials currently under research and development mainly include absorbing-type stealth materials, which guide radar waves into the material and dissipate their energy, and surface-type stealth materials, represented by metamaterials, which change the scattering pattern and polarization mode of the electromagnetic wave through sub-wavelength periodic geometric and circuit structure design. The key technologies to be addressed in material stealth include integrated electromagnetic/thermal/mechanical design and analysis, micro-nano processing, and metamaterial technology. Surface electromagnetic control materials represented by metamaterials are the focus of future stealth material development.
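
For absorbing-type materials, the standard single-layer, metal-backed transmission-line model (a textbook relation, quoted here for concreteness) shows how permittivity εr, permeability μr, and thickness d jointly set the reflection loss that absorber design must minimize:

```latex
Z_{\mathrm{in}} = Z_{0}\sqrt{\mu_{r}/\varepsilon_{r}}\,
\tanh\!\left(j\frac{2\pi f d}{c}\sqrt{\mu_{r}\varepsilon_{r}}\right),
\qquad
\mathrm{RL}\,[\mathrm{dB}] = 20\log_{10}\left|\frac{Z_{\mathrm{in}}-Z_{0}}{Z_{\mathrm{in}}+Z_{0}}\right|
```

Matching Z_in to the free-space impedance Z_0 over a wide band of frequencies f is precisely why the electromagnetic/thermal/mechanical trade-offs mentioned above must be designed jointly.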

Active stealth refers to the real-time generation of electromagnetic field signals opposite to the target's scattered field, according to the state of the incident electromagnetic wave (incident direction and polarization) and the target's state (attitude and velocity), thereby achieving zero scattering through spatial cancellation and reducing the target's RCS. Active stealth is still in the exploration and development stage. The key technologies to be solved include real-time electromagnetic spectrum sensing and measurement, real-time generation and precise control of electromagnetic field signals, and information metamaterials. Intelligent skins for active stealth are an important direction for future stealth technology.
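
The zero-scattering condition of active stealth can be written in one line (standard notation, for illustration): the actively generated field must match the target's scattered field in amplitude and polarization while being opposite in phase:

```latex
\vec{E}_{\mathrm{total}} = \vec{E}_{\mathrm{s}} + \vec{E}_{\mathrm{a}} = 0
\quad\Longleftrightarrow\quad
\vec{E}_{\mathrm{a}} = -\vec{E}_{\mathrm{s}}
```

This is why real-time sensing of the incident wave's direction and polarization is listed above as a prerequisite key technology: any error in the sensed state leaves a residual field and thus a residual RCS.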

Stealth-integrated design and evaluation is a technical field that must be emphasized; it is one of the important reasons why the theoretical principles are basically open, yet stealth technology has been mastered by only a few countries. With the development of multi-sensor battlefield sensing technology, stealth technology must also emphasize the integrated control of active and passive signatures, such as electromagnetic, optical, infrared, and acoustic signatures. The key technologies to be solved and developed therefore include multi-field coupled parallel computing, simulation-based testing and evaluation of stealth performance, and the related design tools and instrumentation techniques.

After more than 50 years of development, radar stealth technology has been successfully applied to multiple types of weapons and equipment. Taking stealth aircraft as an example, designs have evolved from the earliest representative, the F-117, to the current F-22, F-35, and B-2. China and Russia have also become two of the few countries that have mastered stealth aircraft design techniques. As the contest between detection and anti-detection technologies escalates, the main directions of future radar stealth technology include:

(1)   Development in a very low observable and ultra-low observable direction;

(2)   Development in the broadband and omnidirectional stealth direction;

(3)  Development from mono-static radar stealth to dual-/multi-static radar stealth;

(4)  Stealth design in the integrated and lightweight direction;

(5) Development from scattering field reduction to disturbance field/attenuation field reduction.

The countries or regions with the greatest output of core papers, institutions with the greatest output of core papers, countries or regions with the greatest output of citing papers, and institutions with the greatest output of citing papers on “radar stealth technology” are shown in Tables 1.2.1–1.2.4. The collaboration networks among major countries or regions and among major institutions are shown in Figure 1.2.1 and Figure 1.2.2, respectively.

1.2.2 Interpretable deep learning

(1)  Core of artificial intelligence: learning and reasoning

The intrinsic feature of AI lies in its abilities of never-ending or self-taught learning from data and experience, intuitive reasoning, and self-adaptation. Reasoning, a basic form of thinking, is the process of obtaining new judgments or conclusions from one or several given judgments or premises. Reasoning can be categorized into deductive, inductive, analogical, presumptive or abductive, causal, and synthetic reasoning, among others. Early studies of reasoning started in the areas of logicism and knowledge engineering; logicism resorts to formalization to represent the objective world, e.g., using first-order or predicate logic to conduct reasoning. Recently developed knowledge-graph reasoning, memory-driven reasoning, multi-agent reasoning, and cross-media synthesis reasoning are receiving increasing attention from researchers.

Intelligent learning follows three main paradigms: ① learning with formalization methods, which first represent rules in symbolic logic and then conduct reasoning; ② statistical learning, which can be considered learning from data, i.e., data-driven learning: supervised learning can be constructed on large-scale labeled data samples, as in deep learning, and Bayesian learning can use data distributions or prior knowledge to learn effectively from small data samples; and ③ learning based on cybernetics, i.e., self-improvement from experience, which could

Table 1.2.1 Countries or regions with the greatest output of core papers on “radar stealth technology”

| No. | Country/Region | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | China | 19 | 79.17% | 606 | 76.90% | 31.89 |
| 2 | USA | 4 | 16.67% | 146 | 18.53% | 36.50 |
| 3 | Australia | 2 | 8.33% | 52 | 6.60% | 26.00 |
| 4 | Netherlands | 1 | 4.17% | 66 | 8.38% | 66.00 |
| 5 | Spain | 1 | 4.17% | 66 | 8.38% | 66.00 |
| 6 | Iran | 1 | 4.17% | 12 | 1.52% | 12.00 |

Table 1.2.2 Institutions with the greatest output of core papers on “radar stealth technology”

| No. | Institution | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | Southeast Univ | 11 | 45.83% | 469 | 59.52% | 42.64 |
| 2 | Cooperat Innovat Ctr Terahertz Sci | 5 | 20.83% | 165 | 20.94% | 33.00 |
| 3 | Nankai Univ | 4 | 16.67% | 173 | 21.95% | 43.25 |
| 4 | Tianjin Univ | 4 | 16.67% | 171 | 21.70% | 42.75 |
| 5 | Xidian Univ | 4 | 16.67% | 76 | 9.64% | 19.00 |
| 6 | Nanjing Univ | 3 | 12.50% | 144 | 18.27% | 48.00 |
| 7 | Arizona State Univ | 2 | 8.33% | 48 | 6.09% | 24.00 |
| 8 | Fudan Univ | 2 | 8.33% | 113 | 14.34% | 56.50 |
| 9 | Univ Elect Sci & Technol China | 2 | 8.33% | 129 | 16.37% | 64.50 |
| 10 | Peking Univ | 2 | 8.33% | 42 | 5.33% | 21.00 |

Figure 1.2.1 Collaboration network among major countries or regions in the engineering research front of “radar stealth technology”

be considered question-guided or feedback-guided learning, such as reinforcement learning.

(2)  Deep learning: black box versus interpretability

Deep learning methods, such as deep convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), and deep reinforcement learning (DRL), have recently achieved outstanding predictive performance in a wide range of applications, including visual object recognition, speech recognition and synthesis, and natural language processing. There has been an explosion of interest in interpreting the representations learned by these models.

Currently, deep neural networks obtain high discrimination

Figure 1.2.2 Collaboration network among major institutions in the engineering research front of “radar stealth technology”

Table 1.2.3 Countries or regions with the greatest output of citing papers on “radar stealth technology”

| No. | Country/Region | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | China | 255 | 71.63% | 2016.48 |
| 2 | USA | 36 | 10.11% | 2016.44 |
| 3 | Singapore | 19 | 5.34% | 2016.58 |
| 4 | India | 9 | 2.53% | 2016.56 |
| 5 | Iran | 9 | 2.53% | 2016.33 |
| 6 | Canada | 7 | 1.97% | 2016.14 |
| 7 | UK | 6 | 1.69% | 2016.33 |
| 8 | Australia | 6 | 1.69% | 2016.50 |
| 9 | Italy | 5 | 1.40% | 2016.60 |
| 10 | Denmark | 4 | 1.12% | 2017.00 |

Table 1.2.4 Institutions with the greatest output of citing papers on “radar stealth technology”

| No. | Institution | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | Southeast Univ | 53 | 24.65% | 2016.32 |
| 2 | Air Force Eng Univ | 51 | 23.72% | 2016.49 |
| 3 | Xidian Univ | 19 | 8.84% | 2016.79 |
| 4 | Cooperat Innovat Ctr Terahertz Sci | 16 | 7.44% | 2016.19 |
| 5 | Tianjin Univ | 16 | 7.44% | 2016.06 |
| 6 | Nanjing Univ | 15 | 6.98% | 2016.20 |
| 7 | Chinese Acad Sci | 12 | 5.58% | 2016.50 |
| 8 | Nankai Univ | 12 | 5.58% | 2015.83 |
| 9 | Nat Univ Singapore | 11 | 5.12% | 2016.55 |
| 10 | Commun Univ China | 10 | 4.65% | 2016.40 |

power at the cost of low interpretability of their black-box representations. For example, the end-to-end learning strategy makes CNN representations a black box: except for the final network output, it is difficult for people to understand the logic of CNN predictions hidden inside the network. Although dramatic successes in deep learning have led to a torrent of AI applications and systems that can perceive, learn, decide, and act on their own, the effectiveness of these systems is limited by their current inability to explain their rationale, characterize their strengths and weaknesses, predict their applicability to new tasks, or convey an understanding of how they will behave in the future. High model interpretability may help people break several bottlenecks of deep learning, such as learning from very few annotations, learning via human–computer interaction at the semantic level, and semantically debugging network representations.

(3)  Interpretable deep learning: data-driven and knowledge-guided

As discussed above, the low interpretability of deep neural networks lies in their black-box representations: it is difficult for people to relate the decision outputs to the prediction logic and parameters hidden inside the network. Current studies on endowing neural-network representations with interpretable or disentangled forms fall mainly into five directions: ① visualization of CNN representations in intermediate network layers; ② diagnosis of CNN representations; ③ disentanglement of the mixture of patterns encoded in each filter of a CNN; ④ building explainable models, such as interpretable CNNs and capsule networks; and ⑤ semantic-level middle-to-end learning via human–computer interaction. Moreover, adversarial machine learning explores the vulnerability and robustness of deep learning methods under adversarial attacks. Recent studies show that learning a meta-predictor by imposing explainable rules on the process of constructing adversarial examples can help open up the black box of deep neural networks and explain the decision boundaries of their prediction models.
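
A minimal sketch of adversarial-example construction is given below using the fast gradient sign method (FGSM), one standard technique in this area (PyTorch assumed; the untrained toy classifier, input, label, and ε are illustrative; with a trained model the perturbed prediction typically flips):

```python
# FGSM sketch: perturb the input one small step in the direction that
# increases the loss, probing the model's decision boundary.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 1, 28, 28, requires_grad=True)   # stand-in input image
y = torch.tensor([3])                              # its (assumed) true label

loss = loss_fn(model(x), y)
loss.backward()                                    # populates x.grad

eps = 0.1
# Each pixel moves eps in the sign of the loss gradient, then is re-clamped
# to the valid image range.
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

print("prediction before:", model(x).argmax(dim=1).item())
print("prediction after: ", model(x_adv).argmax(dim=1).item())
```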

Deep learning is a data-driven paradigm; incorporating explicit reasoning rules into deep models is difficult, which limits their interpretability. Symbolic AI refers to a set of methods based on high-level symbolic problem representations; its goal is to define intelligent systems in an explicit way that is understandable by humans and is thus interpretable. The recently developed abductive learning connects symbolic reasoning with neural perception; owing to the expressive power of first-order logic, abductive learning can directly exploit general domain knowledge and thereby enhance the interpretability of the learning process. Moreover, inspired by the human brain, perception and cognition require not only handling the data of the target task but also activating related information stored in memory. Attention and memory therefore play an important role in cognitive reasoning, especially for knowledge extraction, understanding, and reasoning over sequential media data such as text, audio, and video. On this basis, learning from external knowledge can be combined with the data-driven paradigm by appropriately incorporating attention mechanisms and memory structures into end-to-end deep learning. Representative methods include the neural Turing machine, memory networks, adaptive computation time, the neural GPU, neural random-access machines, and random access to external memories via reinforcement training; these can be taken as new intelligent learning frameworks combining the data-driven and knowledge-guided paradigms. Thus, constructing deep neural reasoning on top of attention mechanisms or memory structures is a promising direction for improving the interpretability of AI's learning and reasoning abilities.
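
The attention mechanism referred to here is typically the scaled dot-product form (standard formulation, quoted for concreteness):

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_{k}}}\right)V
```

A query Q softly retrieves from key-value memory slots (K, V); because the softmax weights are directly inspectable, attention-based reasoning is comparatively interpretable.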

(4)  Related academic endeavors and projects

Interpretable machine learning has received increasing attention from the academic community. A tutorial, "Interpretable Machine Learning: The fuss, the concrete and the questions," was given at the International Conference on Machine Learning 2017 (ICML 2017), organized by researchers from Harvard University and Google Brain. At the 2017 Conference on Neural Information Processing Systems, researchers from MIT, MPI, Microsoft, Cornell, UW-Madison, Johns Hopkins, JPL, DeepMind, UCSD, and NYU organized an Interpretable ML Symposium. The symposium solicited more than 30 papers from areas including deep learning, kernel and probabilistic methods, automatic scientific discovery, safe AI and AI ethics, causality, human–computer interaction, quantification and visualization of interpretability, and symbolic regression. It was designed to broadly engage the machine learning community in discussing how to enhance the interpretability of machine learning.

The Defense Advanced Research Projects Agency (DARPA) launched the Explainable Artificial Intelligence (XAI) program on Oct. 10, 2017. Its goal is to create a suite of machine learning techniques that produce more explainable models while maintaining a high level of learning performance, and that enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners. XAI is one of a handful of current DARPA programs expected to enable “third-wave AI systems,” in which machines understand the context and environment in which they operate and, over time, build underlying explanatory models that allow them to characterize real-world phenomena.

Very recently, on July 20, 2018, DARPA announced its Artificial Intelligence Exploration (AIE) program, a key component of the agency's broader AI investment strategy. AIE continues DARPA's five-decade record of pioneering groundbreaking research and development in AI. Past DARPA investments facilitated the advancement of “first wave” (rule-based) and “second wave” (statistical-learning-based) AI technologies: DARPA-funded projects enabled some of the first successes in AI, such as expert systems and search, and more recently the agency has advanced machine learning algorithms and hardware. DARPA is now interested in researching and developing “third wave” AI theory and applications that address the limitations of the first- and second-wave technologies by making it possible for machines to contextually adapt to changing situations.

The countries or regions with the greatest output of core papers, institutions with the greatest output of core papers, countries or regions with the greatest output of citing papers, and institutions with the greatest output of citing papers on “interpretable deep learning” are shown in Tables 1.2.5–1.2.8. The collaboration network among major countries or regions and the collaboration network among major institutions are shown in Figure 1.2.3 and Figure 1.2.4, respectively.

1.2.3 New-generation mobile communication technology

In terms of research paradigm, the new generation of mobile communication technology will continue to advance along the three main paradigms of the past: measurement and modeling, performance analysis, and system design. Measurement and modeling refers to measuring, characterizing, and modeling the objective physical world; performance analysis refers to analyzing the system performance of a given wireless communication system and transmission mechanism; and system design refers to designing and optimizing the mobile communication network architecture for given design targets. Moreover, with the advancement of big data, machine learning, and AI technology, combining these with mobile communication to achieve interdisciplinary development is one of the key research directions of next-generation mobile communication technology and may even become a mainstream research branch in the next five years. The key technologies to be solved are: ① integrated storage, computing, and communication technology; ② ubiquitous network architecture and ultra-dense heterogeneous network technology supporting massive terminal access; ③ theory and implementation of ultra-reliable low-latency mobile communication systems; ④ intelligent mobile communication system design for specific scenes and services; ⑤ theory and implementation of data-driven mobile communication systems; and ⑥ optimization theory and real-time implementation of ultra-large-scale mobile communication networks.

Table 1.2.5 Countries or regions with the greatest output of core papers on “interpretable deep learning”

| No. | Country/Region | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | USA | 2 | 40.00% | 107 | 72.30% | 53.50 |
| 2 | Colombia | 1 | 20.00% | 60 | 40.54% | 60.00 |
| 3 | France | 1 | 20.00% | 47 | 31.76% | 47.00 |
| 4 | UK | 1 | 20.00% | 47 | 31.76% | 47.00 |
| 5 | Austria | 1 | 20.00% | 17 | 11.49% | 17.00 |
| 6 | China | 1 | 20.00% | 13 | 8.78% | 13.00 |
| 7 | Denmark | 1 | 20.00% | 13 | 8.78% | 13.00 |
| 8 | Germany | 1 | 20.00% | 11 | 7.43% | 11.00 |

Table 1.2.6 Institutions with the greatest output of core papers on “interpretable deep learning”

| No. | Institution | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | Case Western Reserve Univ | 1 | 20.00% | 60 | 40.54% | 60.00 |
| 2 | Univ Nacl Colombia | 1 | 20.00% | 60 | 40.54% | 60.00 |
| 3 | Cent Supelec INRIA Saclay | 1 | 20.00% | 47 | 31.76% | 47.00 |
| 4 | Univ Massachusetts | 1 | 20.00% | 47 | 31.76% | 47.00 |
| 5 | Univ Oxford | 1 | 20.00% | 47 | 31.76% | 47.00 |
| 6 | IST Austria | 1 | 20.00% | 17 | 11.49% | 17.00 |
| 7 | Shanghai Jiao Tong Univ | 1 | 20.00% | 13 | 8.78% | 13.00 |
| 8 | Univ Copenhagen | 1 | 20.00% | 13 | 8.78% | 13.00 |
| 9 | Staatliche Berufliche Oberschule Kaufbeuren | 1 | 20.00% | 11 | 7.43% | 11.00 |
| 10 | Univ Appl Sci Mittweida | 1 | 20.00% | 11 | 7.43% | 11.00 |

Figure 1.2.3 Collaboration network among major countries in the engineering research front of “interpretable deep learning”

Currently, the development priorities of major countries and regions are as follows: ① the United States is committed to intellectual property protection for fifth-generation mobile communication (5G) core technologies and focuses on data-driven, AI-based mobile communication technology; ② Europe is promoting the application of 5G core technologies, such as tackling the core challenges and commercialization of large-scale antennas, while several countries, led by Germany, pay special attention to research on ultra-reliable low-latency mobile communication systems in support of Industry 4.0; ③ in Japan, represented by NTT Docomo, the focus is on a series of core issues in 5G commercialization, such as the implementation of large-scale antennas.

Figure 1.2.4 Collaboration network among major institutions in the engineering research front of “interpretable deep learning”

Table 1.2.7 Countries or regions with the greatest output of citing papers on “interpretable deep learning”

| No. | Country/Region | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | China | 53 | 26.77% | 2017.11 |
| 2 | USA | 49 | 24.75% | 2016.73 |
| 3 | UK | 16 | 8.08% | 2017.38 |
| 4 | Germany | 14 | 7.07% | 2017.29 |
| 5 | Netherlands | 13 | 6.57% | 2017.00 |
| 6 | Colombia | 13 | 6.57% | 2015.62 |
| 7 | Canada | 12 | 6.06% | 2016.67 |
| 8 | Australia | 11 | 5.56% | 2016.55 |
| 9 | Singapore | 9 | 4.55% | 2017.00 |
| 10 | France | 8 | 4.04% | 2016.75 |

Table 1.2.8 Institutions with the greatest output of citing papers on “interpretable deep learning”

| No. | Institution | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | Univ Nacl Colombia | 13 | 20.00% | 2015.62 |
| 2 | Case Western Reserve Univ | 12 | 18.46% | 2015.83 |
| 3 | Chinese Acad Sci | 8 | 12.31% | 2017.63 |
| 4 | Radboud Univ Nijmegen | 5 | 7.69% | 2016.20 |
| 5 | Nanyang Technol Univ | 5 | 7.69% | 2017.00 |
| 6 | Univ Adelaide | 5 | 7.69% | 2016.40 |
| 7 | Wuhan Univ | 5 | 7.69% | 2017.20 |
| 8 | Nanjing Univ Informat Sci & Technol | 4 | 6.15% | 2016.25 |
| 9 | Univ Florida | 4 | 6.15% | 2016.75 |
| 10 | Shanghai Univ | 4 | 6.15% | 2016.75 |

Currently, research on next-generation mobile communication technologies based on 5G focuses mainly on the following mainstream directions and branches:

(1)  Channel measurement and modeling

To meet the needs of enhanced mobile broadband (eMBB) services, 5G systems communicate in high-frequency bands, so channel measurement and modeling for these bands is one of the key research directions; high-band channel modeling based on big data will also be important in the future.
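
High-band measurement campaigns are commonly distilled into compact path-loss models; a widely used form for mmWave bands is the close-in free-space reference distance model (a textbook form, not a result of this report):

```latex
\mathrm{PL}(f, d)\,[\mathrm{dB}] = 32.4 + 20\log_{10}\!\left(\frac{f}{1\,\mathrm{GHz}}\right)
+ 10\,n\log_{10}\!\left(\frac{d}{1\,\mathrm{m}}\right) + X_{\sigma}
```

Here n is the path-loss exponent fitted from measured data and X_σ is lognormal shadowing; fitting n and σ per band and environment is exactly what the measurement-and-modeling work described above produces.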

(2)  Large-scale antenna

5G systems use large-scale antenna arrays to improve spectrum efficiency and support more user access. Large-scale antennas, especially high-frequency large-scale antennas, still face many problems. As 5G undergoes large-scale commercialization for the first time, meeting these engineering-level challenges is also one of the key research directions.

(3)  Massive connection

To meet the needs of the Internet of Things, smart cities, and smart grids, wireless communication under massive-connection scenarios will remain one of the key research directions.

(4)  Ultra-reliable low-latency wireless communication

Currently, there is no theory of ultra-reliable low-latency wireless communication to guide the concrete design of such systems, so this direction will become one of the research priorities of future sixth-generation (6G) mobile communication.

Other wireless communication technologies, such as drone-based communication and communication based on energy harvesting and wireless energy transfer, especially inter-satellite communication based on wireless energy transfer in space scenarios, are also research foci of next-generation wireless communication technology.

Future mobile communication technologies will develop in the following key directions: ① integrated storage, computing, and communication technology; ② ubiquitous network architecture supporting massive terminal access and ultra-dense heterogeneous network technology; ③ theory and implementation of ultra-reliable low-latency mobile communication systems; ④ intelligent mobile communication system design for specific scenes and services; ⑤ theory and implementation of data-driven mobile communication systems; ⑥ optimization theory and real-time implementation of ultra-large-scale mobile communication networks; and ⑦ the design and implementation of integrated communication systems, especially the integration of satellite and terrestrial networks.

The countries or regions with the greatest output of core papers, institutions with the greatest output of core papers, countries or regions with the greatest output of citing papers, and institutions with the greatest output of citing papers on “new-generation mobile communication technology” are shown in Tables 1.2.9–1.2.12. The collaboration network among major countries or regions and the collaboration network among major institutions are shown in Figure 1.2.5 and Figure 1.2.6, respectively.

Table 1.2.9 Countries or regions with the greatest output of core papers on “new-generation mobile communication technology”

| No. | Country/Region | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | China | 8 | 88.89% | 381 | 98.70% | 47.63 |
| 2 | Canada | 7 | 77.78% | 228 | 59.07% | 32.57 |
| 3 | UK | 5 | 55.56% | 295 | 76.42% | 59.00 |
| 4 | Singapore | 1 | 11.11% | 115 | 29.79% | 115.00 |
| 5 | Taiwan of China | 1 | 11.11% | 53 | 13.73% | 53.00 |
| 6 | Australia | 1 | 11.11% | 18 | 4.66% | 18.00 |
| 7 | USA | 1 | 11.11% | 18 | 4.66% | 18.00 |

Table 1.2.10 Institutions with the greatest output of core papers on “new-generation mobile communication technology”

| No. | Institution | Core papers | Percentage of core papers | Citations | Percentage of citations | Citations per paper |
|---|---|---|---|---|---|---|
| 1 | Beijing Univ Chem Technol | 7 | 77.78% | 366 | 94.82% | 52.29 |
| 2 | Univ British Columbia | 6 | 66.67% | 223 | 57.77% | 37.17 |
| 3 | Tsinghua Univ | 5 | 55.56% | 344 | 89.12% | 68.80 |
| 4 | Univ Sheffield | 4 | 44.44% | 278 | 72.02% | 69.50 |
| 5 | Beijing Univ Posts & Telecommun | 3 | 33.33% | 175 | 45.34% | 58.30 |
| 6 | Kings Coll London | 3 | 33.33% | 43 | 11.14% | 14.33 |
| 7 | Univ Sci & Technol Beijing | 3 | 33.33% | 50 | 12.95% | 16.67 |
| 8 | Minist Educ China | 1 | 11.11% | 137 | 35.49% | 137.00 |
| 9 | Shanghai Jiao Tong Univ | 1 | 11.11% | 137 | 35.49% | 137.00 |
| 10 | Univ Elect Sci & Technol China | 1 | 11.11% | 137 | 35.49% | 137.00 |

Figure 1.2.5 Collaboration network among major countries in the engineering research front of “new-generation mobile communication technology”

Figure 1.2.6 Collaboration network among major institutions in the engineering research front of “new-generation mobile communication technology”

Table 1.2.11 Countries or regions with the greatest output of citing papers on “new-generation mobile communication technology”

| No. | Country/Region | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | China | 157 | 48.61% | 2016.36 |
| 2 | Canada | 45 | 13.93% | 2016.02 |
| 3 | USA | 26 | 8.05% | 2016.31 |
| 4 | South Korea | 25 | 7.74% | 2016.68 |
| 5 | UK | 23 | 7.12% | 2016.00 |
| 6 | Iran | 12 | 3.72% | 2016.42 |
| 7 | Australia | 11 | 3.41% | 2016.27 |
| 8 | India | 9 | 2.79% | 2016.67 |
| 9 | Singapore | 8 | 2.48% | 2016.25 |
| 10 | Taiwan of China | 7 | 2.17% | 2016.43 |

Table 1.2.12 Institutions with the greatest output of citing papers on “new-generation mobile communication technology”

| No. | Institution | Citing papers | Percentage of citing papers | Mean year |
|---|---|---|---|---|
| 1 | Beijing Univ Posts & Telecommun | 34 | 24.46% | 2016.18 |
| 2 | Univ British Columbia | 23 | 16.55% | 2015.96 |
| 3 | Tsinghua Univ | 17 | 12.23% | 2016.12 |
| 4 | Southeast Univ | 17 | 12.23% | 2016.29 |
| 5 | Xidian Univ | 12 | 8.63% | 2016.33 |
| 6 | China Univ Min & Technol | 9 | 6.47% | 2016.11 |
| 7 | Nanjing Univ Posts & Telecommun | 8 | 5.76% | 2016.63 |
| 8 | Chinese Acad Sci | 7 | 5.04% | 2016.43 |
| 9 | Univ Essex | 6 | 4.32% | 2015.83 |
| 10 | Beijing Univ Chem Technol | 6 | 4.32% | 2016.33 |

2 Engineering development fronts

2.1 Development trends in the top 10 engineering development fronts

The top 10 engineering development fronts reviewed by the Information and Electronic Engineering Group are summarized in Table 2.1.1, covering electronic science and technology, optical engineering and technology, instrument science and technology, information and communication engineering, computer science and technology, and control science. The annual numbers of core patents disclosed in each development front from 2012 to 2017 are shown in Table 2.1.2.

(1)    Unmanned aerial vehicles and autonomous driving technology

An unmanned aerial vehicle is an aircraft that relies on the system's own perception, processing, and operation capabilities and adjusts its own decisions in real time according to the external environment to complete assigned tasks. Autonomous driving technology for vehicles is based on high-precision maps, supplemented by in-vehicle sensing equipment to collect data; it makes

《Table 2.1.1》

Table 2.1.1 Top 10 engineering development fronts in information and electronic engineering

No. Engineering development front Core patents Citations Citations per patent Mean year
1  Unmanned aerial vehicles and autonomous driving technology 177 14988 84.68 2013.8
2  Multi-dimensional image information acquisition, processing, and fusion technology 33 1200 36.36 2013.97
3  Display, interaction, and manipulation techniques for virtual reality and augmented reality systems 45 1980 44 2014.33
4  Optical fiber communication and all-optical network 156 6166 39.53 2013.28
5  Identity authentication and access control in network security 23 1278 55.57 2013.7
6  Cloud computing platform 111 9269 83.5 2013.05
7  Human–computer interaction sensing method and application 39 1534 39.33 2013.38
8  Array sensor and array sensing big data processing technology 64 2800 43.75 2012.83
9  Broadband wireless communication system 83 7121 85.8 2013.14
10  New storage system based on nonvolatile memory 50 4480 89.6 2013.02

《Table 2.1.2》

Table 2.1.2 Annual number of core patents published for the top 10 engineering development fronts in information and electronic engineering

No. Engineering development front 2012 2013 2014 2015 2016 2017
1  Unmanned aerial vehicles and autonomous driving technology 35 26 67 38 10 1
2  Multi-dimensional image information acquisition, processing, and fusion technology 5 9 7 8 2 2
3  Display, interaction, and manipulation techniques for virtual reality and augmented reality systems 8 5 9 12 9 2
4  Optical fiber communication and all-optical network 60 45 21 13 11 6
5  Identity authentication and access control in network security 2 10 6 3 2 0
6  Cloud computing platform 43 33 26 6 2 1
7  Human–computer interaction sensing method and application 11 9 14 3 2 0
8  Array sensor and array sensing big data processing technology 30 19 12 2 1 0
9  Broadband wireless communication system 35 25 8 9 3 3
10  New storage system based on nonvolatile memory 18 18 10 3 1 0

decisions based on the recognition and computation of intelligent algorithms with deep learning capabilities, and controls vehicle driving independently. Unmanned operation with autonomous control can greatly improve system performance and reduce personnel burden, and it is the future development direction of aircraft and vehicles. With the rapid development of sensor, network, information, and AI technologies, drone and autonomous driving technologies continue to make important breakthroughs, system performance continues to improve, and practical applications are gradually being implemented.

(2)    Multi-dimensional image information acquisition, processing, and fusion technology

Image information fusion enables software to combine different images of the same target or scene into a single, accurate description of that target or scene. Application requirements in the military, medical science, natural resource exploration, marine resource management, environment and land use management, topographical analysis, and biology have strongly stimulated the development of image processing and image fusion technologies. With the development of remote sensing technology, the means of obtaining remote sensing data have become increasingly abundant, and the image data obtained by various sensors form image pyramids over the same area. Image fusion technology realizes the complementary advantages of multi-source data and provides an effective way to improve the utilization efficiency of these data. A good image fusion method can lay a solid foundation for subsequent automated computer processing.

(3)   Display, interaction, and manipulation techniques for virtual reality and augmented reality systems

Virtual reality technology takes advantage of computer simulation to generate a purely virtual three-dimensional environment that provides people with a perceptual experience consistent with the objective world. Augmented reality technology integrates the computer-simulated virtual environment and the real environment at multiple levels. It enhances people's perception of the real environment by superimposing virtual information on it, or enhances users' sense of reality in virtual object experiences by integrating real objects or real scenes. Both virtual reality and augmented reality technologies greatly expand the human ability to understand the world and space, especially in domains that human physiological activities cannot easily reach. For example, people can be freed from the limitations of time and space to experience events that have or have not occurred, to observe and study how the same events occur and develop under various hypothetical conditions, and to explore the macroscopic or microscopic world. Thus, these technologies provide new tools for people to know and change the world. For people to perceive the virtual environment/objects as they do the real world, realistic modeling of the virtual environment/objects must take geometry, appearance, physics, behavior, and other aspects into consideration. At the same time, the virtual environment/objects must be generated in real time to supply simulated information to people's visual, auditory, tactile, and other sensory channels. Therefore, a virtual reality and augmented reality system is a highly integrated computer simulation system that combines computer graphics, computer simulation, AI, sensor technology, display technology, interaction technology, and other technical fields.

(4)  Optical fiber communication and all-optical network

Optical fiber communication technology refers to the transmission of information by optical signals carried over optical fibers. Optical signals can carry large amounts of information after being modulated by different modulation methods. The main material of the optical fiber is glass, an electrical insulator, so ground-loop problems need not be considered. Because optical fiber communication systems offer large communication capacity (currently tens of Tbps), long transmission distance, strong anti-interference performance, good confidentiality, and low cost, they have developed rapidly and received wide attention from industry, especially in today's era of information explosion. The application of optical fiber communication technology has made great contributions to the development of the communication industry and the transformation of the entire society.

An all-optical network is one in which signal processing during network transmission and switching is performed entirely in the optical domain. Because no electro-optical or photoelectric conversion is required, transmission bandwidth can be greatly increased while power consumption and cost are reduced. The main technologies of all-optical networks include optical fiber technology, wavelength division multiplexing, optical switching (ROADM, OXC, etc.), passive optical networks (FTTx), optical fiber amplifiers (EDFA, Raman amplification), and so on.

The rapid development of optical fiber communication and all-optical networks has laid the foundation for the national strategy of “speeding up and reducing fees.”

(5)   Identity authentication and access control in network security

Identity authentication, also known as “authentication” or “identification,” refers to the process of confirming an operator's identity in a computer or computer network system to determine whether the user has the right to access and use a certain resource, thereby ensuring that the access policies of the computer and network system are executed reliably and effectively. It also prevents attackers from impersonating legitimate users to obtain access rights to resources, ensures the security of the system and its data, and safeguards the legitimate interests of authorized visitors.

Currently, mainstream identity authentication and access control technologies include static-password-based, smart-card-based, cryptography-based, blockchain-based, and biometric-based identity authentication and access control technologies.

Identity authentication and access control technology is expected to break through and develop in the following aspects. ① Combinations of multiple identity authentication and access control technologies will provide greater authentication security and effectiveness. For example, static-password authentication is simple and easy to use, cryptography-based authentication is mature and stable, smart cards have strong security features, and biometric features such as fingerprints and irises are unique and cannot be lost or forged; combining their respective advantages will yield better results (see the sketch below). ② Decentralized distributed verification technology based on smart contracts can use tokens to intelligently control user identities, permissions, and access in a system. ③ Identity authentication and access control based on user attributes relies on identity-based encryption, using attributes such as the user's email address and identity ID as input credentials, to solve the problem that traditional password mechanisms are unfriendly and unrecoverable. ④ Standardization of authentication: by specifying a unified standard, the authentication mechanisms of different application systems become compatible.
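As a minimal illustration of aspect ① above, the following Python sketch combines a knowledge factor (a salted, iterated password hash) with a possession factor (an RFC 6238 time-based one-time password). It is an illustrative sketch only: the user record, key material, and iteration count are hypothetical, and a production system would add rate limiting, secure key storage, and clock-drift tolerance.

```python
import hashlib, hmac, struct, time

def totp(key, period=30, digits=6, t=None):
    """RFC 6238 time-based one-time password using HMAC-SHA1."""
    counter = int((time.time() if t is None else t) // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = (struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
    return "%0*d" % (digits, code)

def verify_login(stored_hash, salt, password, otp_key, submitted_code):
    """Two-factor check: salted password hash plus a one-time code."""
    pw_ok = hmac.compare_digest(
        hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000), stored_hash)
    otp_ok = hmac.compare_digest(totp(otp_key), submitted_code)
    return pw_ok and otp_ok

# Hypothetical enrollment record for one user
salt, otp_key = b"per-user-salt", b"per-user-otp-seed"
stored = hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 100_000)

# A login attempt must supply both factors
print(verify_login(stored, salt, "correct horse", otp_key, totp(otp_key)))  # True
```

Constant-time comparisons (hmac.compare_digest) are used in both checks so that neither factor leaks information through timing.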

(6)  Cloud computing platform

The cloud computing platform enables users to easily access configurable, shared resource pools on demand in the cloud system, such as servers, networks, storage, applications, and other services, often over networks including the Internet. At the same time, the cloud platform can provide rapid resource provisioning and release with minimal administrative overhead and minimal interaction with vendors. The ultimate goal of cloud computing is to provide all kinds of IT resources to the public as a utility, enabling people to use them on demand like water and electricity and pay for actual usage without building their own infrastructure. Cloud computing is therefore recognized as the third IT revolution after personal computing and the Internet; the major countries of the world have incorporated it into their overall national development strategies and promoted it as a strategic commanding height of future informatization.

The services provided by the cloud computing platform can be divided into three levels: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). According to the deployment form, clouds can be divided into public, private, and hybrid clouds. Key technologies for cloud computing platform implementation include server virtualization, distributed computing, distributed storage, software-defined networking, and cloud security technologies. The cloud computing platform has become the main form of information technology infrastructure in the Internet era, and the hybrid cloud composed of a public cloud plus a private cloud, or of multiple private clouds, is becoming the preferred choice for most enterprises and organizations in their transformation to cloud computing. The cloud computing platform itself is also evolving, and new modes such as the mobile cloud and edge cloud have appeared. Meanwhile, the cloud computing platform is undergoing fused development with emerging applications such as big data, the Internet of Things, AI, and blockchain, providing basic resource support for them.

(7)   Human–computer interaction sensing methods and applications

Human–computer interaction technology establishes an interface and channel for information exchange between the user and the computing device. From the early manual paper tape, command-line interface, and graphical user interface to today's multi-touch and physical user interfaces, human–computer interaction technology is entering a stage of high-efficiency, intelligent, and invisible natural interaction. In the human-centered interaction process, the computer accurately recognizes the user's interactive instructions by sensing information such as the user's voice, expression, posture, and physiological data; models the user's cultural background, personal habits, and emotional preferences; understands the user's interaction intentions more accurately; predicts interaction behavior; and greatly reduces the user's cognitive burden during interaction. The key methods of human–computer interaction sensing include interactive sensing based on voice and other sound signals, interactive sensing based on handwriting or contact data, interaction based on behavioral and expression understanding, interactive sensing based on affective computing, interactive sensing based on physiological computing, and so on.

New interactive devices will adopt one or more of these key sensing methods to improve the accuracy of understanding the user's interaction intentions and to construct a realistic virtual environment for the user experience. Through various artificial sensory technologies involving projection, body, sound, and smell, increasing the bandwidth of information exchange with the machine and establishing new human–computer interaction interfaces bring a new service experience to the new generation of computing systems. Human–computer interaction sensing is a key supporting technology for wearable computing systems, intelligent living-space systems, game and entertainment systems, and virtual/augmented/mixed reality systems, and it is widely used in important fields such as media, entertainment, education, medical care, and national defense.

(8)   Array sensor and array sensing big data processing technology

Sensor arrays have been widely applied in radar, sonar, microphones, communication, navigation, seismology, and other fields. A sensor array is formed by sensors placed at specific positions in a specific architecture, enabling the reception of multi-dimensional spatial signals. Sensor array processing technology has been developed to analyze these spatial signals and extract useful information. It offers more flexible beamforming control, higher signal gain, and better spatial resolution than the processing of one-dimensional signals acquired by a single sensor. The arrangement, size, and number of sensors in an array, which constitute the hardware, play a significant role in the acquisition of valid signals; the effectiveness and accuracy of the extracted signals depend even more on array signal processing. Theoretical studies of array signal processing started in the 1960s, focusing first on adaptive beamforming and subsequently on spatial spectrum estimation. In recent years, high-resolution spatial spectrum estimation has attracted much attention; however, it requires expensive computation, and thus real-time algorithms, in terms of both hardware and software, are increasingly in demand.
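As a concrete illustration of the spatial spectrum estimation discussed above, the sketch below implements the classical MUSIC algorithm for a uniform linear array in Python with NumPy. The array size, source directions, and noise level are assumed values chosen only for the demonstration, not figures from this report.

```python
import numpy as np

def music_spectrum(X, n_sources, d=0.5, angles=np.linspace(-90, 90, 361)):
    """MUSIC pseudo-spectrum for a uniform linear array.

    X         : (n_sensors, n_snapshots) complex snapshot matrix
    n_sources : assumed number of incident narrowband sources
    d         : sensor spacing in wavelengths
    """
    n_sensors = X.shape[0]
    R = X @ X.conj().T / X.shape[1]             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    En = eigvecs[:, : n_sensors - n_sources]    # noise subspace
    k = np.arange(n_sensors)
    spectrum = np.empty(len(angles))
    for i, theta in enumerate(np.radians(angles)):
        a = np.exp(2j * np.pi * d * k * np.sin(theta))   # steering vector
        spectrum[i] = 1.0 / np.real(a.conj() @ En @ En.conj().T @ a)
    return angles, spectrum

# Demo: two uncorrelated sources at -20 and +30 degrees, 8-sensor half-wavelength array
rng = np.random.default_rng(0)
n_sensors, n_snapshots = 8, 200
k = np.arange(n_sensors)
A = np.exp(2j * np.pi * 0.5 * np.outer(k, np.sin(np.radians([-20.0, 30.0]))))
S = rng.standard_normal((2, n_snapshots)) + 1j * rng.standard_normal((2, n_snapshots))
noise = 0.1 * (rng.standard_normal((n_sensors, n_snapshots))
               + 1j * rng.standard_normal((n_sensors, n_snapshots)))
angles, p = music_spectrum(A @ S + noise, n_sources=2)
print("strongest angles:", angles[np.argsort(p)[-4:]])   # peaks near -20 and 30
```

The grid search over steering vectors is what makes high-resolution estimation expensive at scale, which is why the text above stresses the demand for real-time hardware/software implementations.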

The types and numbers of sensors are greatly increasing and signal acquisition technology is rapidly improving; thus, the amount of acquired data grows dramatically, with widening bandwidths in both the frequency and spatial domains. Sensor and signal processing techniques have extended their applications from traditional fields to new fields such as astronomy, energy, finance, geography, safety control, and social networks. The rapid growth of acquired signals in bandwidth, dimension, resolution, and networking has pushed the growth rate of data acquisition above that of data storage capacity and signal processing speed, indicating that signal processing has entered the big data era.

Signal processing with big data requires effort in several aspects. First, smart sensors with capabilities of signal acquisition, data compression, and signal pre-processing are required to reduce the amount of transmitted data and facilitate the extraction of valuable information. Second, new advanced algorithms for multi-sensor signals are required for the information fusion of diverse and complex signals with matched time-space dimensions. Third, high-speed signal processing technology is necessary to extend I/O bandwidth and enable real-time, high-performance, high-speed data processing. Finally, introducing AI techniques into big-data signal processing is a promising way to improve the efficiency of data mining, given AI's abilities in information integration, fusion, and cooperative analysis.

(9)  Broadband wireless communication system

The new-generation mobile communication system is a stage and goal in the evolution of mobile communication systems. It not only adopts new wireless transmission technologies to improve the performance of existing communication systems, but also integrates with various existing wired and wireless networks. Further, it not only includes existing mobile cellular systems and network structures, but also adopts ad-hoc networking, or a combination of the two, to form a two-hop or multi-hop network structure under the cellular network. In general, a cellular network is a wide-coverage networking approach whose goal is wide-area wireless coverage with limited frequency and power resources. Compared with the cellular network structure, the ad-hoc mobile network structure is more flexible: it adopts distributed management, and a group of autonomous wireless nodes cooperate to form a mobile communication network. A wireless node is a mobile terminal in the general sense; it can also serve as a wireless relay and routing device to forward other users' data. Thus, it supports dynamic search, rapid network construction, and network self-recovery, and has broad application prospects. Considering the importance of ad-hoc networking in wireless and next-generation wireless networks, the Internet Engineering Task Force (IETF) has established the MANET working group to conduct ad-hoc network research. In addition, with the development and application of ultra-wideband (UWB) technology in recent years, nodes in wireless networks can transmit large numbers of very short, fast energy pulses at low transmitted power spectral density, making UWB a suitable implementation technology for personal area networks in new-generation mobile communication systems.

With the continuous development and maturation of soft-switch technology based on the separation of control and bearer, and the wide application of intelligent network technologies that can quickly realize various value-added services based on the separation of switching and service control, both are being incorporated into new-generation mobile communication systems. Furthermore, with the wide application of IP technology, the industry generally believes that the communication network structure of next-generation mobile communication will develop on the basis of IP networks. At the same time, with the rapid growth of network capacity and user numbers, IPv6 will become the core protocol of the next-generation network.

Wireless communication systems tend toward integration for the following reasons: ① the different applicable standards of each wireless communication system have begun to seek common ground while reserving differences, complementing each other and tending to merge; ② systems are merging through integration and constantly improving; and ③ the convergence of wireless communication systems and the Internet is conducive to the transparency of IP service transmission.

(10)  New storage system based on nonvolatile memory

The performance of traditional memory lags far behind that of the CPU and has become the biggest bottleneck restricting entire computing systems. Moreover, with the rapid development of mobile terminals, big data, cloud computing, machine learning, and other emerging applications, the requirements on the read-write performance, I/O speed, bandwidth, and capacity of memory keep rising. The existing multi-layer storage architecture of cache (SRAM)/main memory (DRAM)/hard disk (Flash or HDD) is far from meeting these requirements. As the contradiction between memory and CPU becomes increasingly prominent, a new generation of nonvolatile memory with superior performance is urgently needed to replace DRAM and Flash. In response to this demand, IBM proposed a new memory concept named storage-class memory (SCM) in 2008, defined as combining the nonvolatility, high capacity, and low cost of hard drives with the fast read/write speed, high reliability, and byte addressability of DRAM. After decades of research and development, a variety of nonvolatile memory technologies have become potential candidates for SCM, including STT-MRAM, PCRAM (including Intel's latest 3D XPoint), ReRAM, FRAM, and so on. SCM is divided into two categories: Memory-SCM for main memory and Storage-SCM for data storage. In terms of performance metrics, Memory-SCM requires an endurance greater than 10⁹ write cycles and a read-write latency of less than 200 ns to serve as a peer of DRAM (around 50 ns) without passing through the I/O controller. Nonvolatile Memory-SCM with these performance metrics can solve a series of traditional storage system problems, fully optimize the storage system, and even completely change the entire computing architecture.

As the amount of data increases, a major development trend in computing architectures is a significant increase in cache and main memory capacity. If traditional SRAM and DRAM are used to achieve this expansion, their volatility requires pairing them with external storage (Flash or HDD) several orders of magnitude larger, which in turn lowers the performance of the entire system. The fast, nonvolatile nature of Memory-SCM resolves this contradiction well: it not only expands memory and reduces internal processing overhead, but also greatly improves overall performance. A variety of new architectures can be developed. First, Memory-SCM can replace the DRAM in main memory. The low cost and high density of SCM allow memory capacity to be greatly increased; further, because the data are nonvolatile, data-exchange efficiency with the CPU can be greatly improved and data exchange with Flash/HDD reduced. In some specific application scenarios, all the DRAM in main memory can be replaced by SCM, especially STT-MRAM with an endurance of 10¹⁴ cycles. In most cases, however, because the endurance and read/write speed of the new nonvolatile memories are worse than those of DRAM, it is still necessary to retain part of the DRAM in main memory, forming a hybrid main memory system with the SCM. Such a hybrid main memory system requires the development of a new main memory management system, buffer management algorithms, memory interfaces, and interface/interconnect technology to manage the two different memories simultaneously. At the same time, it is necessary to develop fault-management algorithms suited to the characteristics of SCM technology, as well as wear-leveling algorithms that compensate for the low endurance of SCM by migrating frequently accessed “hot data” to DRAM, reducing SCM wear and thus extending SCM lifetime. For example, the new start-gap algorithm can extend the service life of SCM with an endurance of only 10⁷ cycles to three years. The industry has developed a variety of such SCM/DRAM hybrid main memory systems, such as the 16 TB of hybrid memory used in Amazon's x1e cloud computing system. Another advantage of replacing DRAM with SCM is reduced power consumption, because SCM does not need to be refreshed as DRAM does: at least one-third of the power consumption of large computing systems goes to the memory system, and SCM can greatly reduce this by replacing DRAM. The application goal of the other category, Storage-SCM, is to replace the hard disk; compared with Flash, SCM offers faster speed and lower power consumption.
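As a sketch of the wear-leveling idea behind the start-gap algorithm mentioned above, the following Python fragment maintains one spare physical line, a gap pointer that shifts by one line after a fixed number of writes, and a start register that rotates the whole logical-to-physical mapping once the gap wraps around. This is a simplified rendering under assumed parameters (the line count and gap interval are arbitrary); the published design operates inside the memory controller at cache-line granularity.

```python
class StartGapMemory:
    """Simplified start-gap wear leveling over n logical lines."""

    def __init__(self, n_lines, gap_interval=4):
        self.n = n_lines                      # logical lines 0 .. n-1
        self.phys = [None] * (n_lines + 1)    # one spare physical line for the gap
        self.start = 0                        # slow rotation of the whole mapping
        self.gap = n_lines                    # physical index of the unused line
        self.writes = 0
        self.gap_interval = gap_interval      # move the gap every N writes

    def _phys_addr(self, logical):
        p = (logical + self.start) % self.n
        return p + 1 if p >= self.gap else p  # skip over the gap line

    def write(self, logical, value):
        self.phys[self._phys_addr(logical)] = value
        self.writes += 1
        if self.writes % self.gap_interval == 0:
            self._move_gap()

    def read(self, logical):
        return self.phys[self._phys_addr(logical)]

    def _move_gap(self):
        if self.gap == 0:
            # Gap wrapped around: rotate the mapping by one and reuse the spare line.
            self.phys[0] = self.phys[self.n]
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            self.phys[self.gap] = self.phys[self.gap - 1]  # shift neighbor into gap
            self.gap -= 1

# A "hot" logical line is gradually spread across physical lines as the gap rotates.
mem = StartGapMemory(8)
for i in range(100):
    mem.write(0, i)
    assert mem.read(0) == i        # data stays correct while its address migrates
print("logical 0 now lives at physical line", mem._phys_addr(0))
```

Because every logical line slowly visits every physical line, repeated writes to one hot address are spread across the whole device, which is what lets low-endurance SCM survive for years.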

Although the new architectures described above are much better than the SRAM/DRAM/Flash system, one flaw limits their overall performance: data must be transferred between different functional blocks, consuming too much time and power. A more radical improvement is to overturn the existing von Neumann computing architecture, abandon the CPU-centric design, transform into a new “data-centric” computing architecture (a non-von Neumann architecture), and redesign the software and hardware interfaces from scratch. In a data-centric computing architecture, data are stored in a large-capacity nonvolatile SCM memory array (called “general-purpose memory”), and the system no longer sends data to the CPU for computation; instead, it uses distributed computing, laying out computing functions around the data so that computation is performed near the data. IBM's simulation studies show that the performance of a data-centric computing architecture using nonvolatile SCM memory technology will be several orders of magnitude better than that of the von Neumann architecture in terms of speed, power consumption, and floor space.

《2.2 Interpretations for three key engineering development fronts》

2.2 Interpretations for three key engineering development fronts

2.2.1 Unmanned aerial vehicles and autonomous driving technology

The autonomous control of unmanned aerial vehicles (UAVs) is divided into three layers: autonomous motion, autonomous task, and autonomous cooperative control. The key technologies are adaptive control for complex environments, task-oriented autonomous control, and cooperative control of UAVs. Adaptive control technology for complex environments mainly solves the problem of autonomous motion control of UAVs under complex conditions. Task-oriented autonomous control technology mainly solves the problem of autonomous decision-making and control when UAVs perform complex tasks in confrontational environments. Cooperative control technology for unmanned systems mainly solves the control and decision-making problems of multiple unmanned systems executing tasks cooperatively.

Autonomous driving is classified into levels 0–5, as proposed by the Society of Automotive Engineers (Table 2.2.1). At L0, the vehicle is operated entirely by the driver without assistance. At L1, the driver still operates the vehicle, but individual assistance systems are embedded. At L2, the driver remains in control, but the burden can be reduced by active driving-safety assistance systems. At L3, the system may take over operation of the vehicle under certain conditions, but the driver must take over when the system determines that driver intervention is required. At L4 and above, the assistance system holds the main control of the vehicle, although in extreme cases the driver still needs to take over. L5 denotes a self-driving car that requires no driver at all; the people aboard are purely passengers.

The key technologies of autonomous driving mainly include sensors, high-precision maps, vehicle–environment interaction, and autonomous decision-making. Sensors are the eyes of an autonomous car: the vehicle uses them to identify roads, other vehicles, pedestrians and obstacles, and basic transportation facilities. Sensors are usually divided into lidar, traditional radar, and cameras. High-precision maps allow the vehicle to position itself accurately and to reconstruct its surroundings within a dynamically changing three-dimensional traffic environment; to achieve safe autonomous driving, such maps need to be accurate to the centimeter level. Vehicle–environment interaction technology exchanges information between the vehicle and its surroundings, such as other vehicles, transportation facilities, and cloud databases, helping the self-driving vehicle grasp real-time driving and road-condition information and providing information support for decision-making. Autonomous decision-making, the most critical technology supporting autonomous driving, is currently implemented with machine learning and AI algorithms, and the sensing it relies on is sketched below.
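To make the sensor-fusion side of this pipeline concrete, the sketch below fuses a lidar-like and a radar-like range sensor with a constant-velocity Kalman filter, a standard building block in tracking front-ends. All numbers (time step, noise variances, scenario) are assumed for illustration and are not drawn from this report.

```python
import numpy as np

def kalman_track(zs_lidar, zs_radar, dt=0.1, r_lidar=0.05, r_radar=0.4):
    """Fuse two range sensors with a constant-velocity Kalman filter.

    zs_lidar, zs_radar : per-step range readings (same length)
    r_lidar, r_radar   : measurement noise variances (assumed values)
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity state transition
    Q = np.diag([1e-4, 1e-3])                  # process noise (assumed)
    H = np.array([[1.0, 0.0]])                 # both sensors observe range only
    x = np.array([zs_lidar[0], 0.0])           # state: [range, range rate]
    P = np.eye(2)
    estimates = []
    for z_l, z_r in zip(zs_lidar, zs_radar):
        x = F @ x                              # predict
        P = F @ P @ F.T + Q
        for z, r in ((z_l, r_lidar), (z_r, r_radar)):   # sequential updates
            innovation = z - (H @ x)[0]
            S = (H @ P @ H.T)[0, 0] + r
            K = (P @ H.T).ravel() / S          # Kalman gain
            x = x + K * innovation
            P = (np.eye(2) - np.outer(K, H[0])) @ P
        estimates.append(x[0])
    return np.array(estimates)

# Demo: an obstacle approaching at 2 m/s, observed by two noisy sensors
rng = np.random.default_rng(1)
t = np.arange(50) * 0.1
true_range = 10.0 - 2.0 * t
est = kalman_track(true_range + rng.normal(0, 0.05 ** 0.5, 50),
                   true_range + rng.normal(0, 0.4 ** 0.5, 50))
print("final estimate %.2f m (truth %.2f m)" % (est[-1], true_range[-1]))
```

The filter weights each sensor by its noise variance, so the precise lidar dominates while the noisier radar still contributes; real stacks extend the same idea to multi-dimensional states and asynchronous sensors.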

UAV technology has made breakthroughs in recent years, and various types of UAVs are widely used in production, daily life, and combat. In the area of high-performance UAVs, the United States has been at the forefront of the world, developing and deploying a series of high-performance military UAVs. The US MQ-9 combines reconnaissance, surveillance, and strike capabilities and has achieved good results in actual combat. The X-47B has taken off from and landed on an aircraft carrier autonomously and has multi-platform coordination capability. Going forward, UAV control will change from simple remote control and program control to an integrated human–machine intelligent interactive control mode and gradually develop toward fully autonomous control; system architecture will develop from specialized and simplified toward generalized and standardized; and the application mode will change from single-platform independent use toward multi-platform cooperation and even swarm applications.

The development of autonomous driving technology lags relatively behind and is still in the L2 assisted-driving phase. However, major automakers and technology companies have intensified their development of autonomous driving technology in recent years and are expected to make significant progress in the near future. Google began

《Table 2.2.1》

Table 2.2.1 List of autonomous driving system levels

Level Name Steering, acceleration, and deceleration control Monitoring of the driving environment Fallback response to dynamic driving tasks Applicable driving modes
L0 Manual driving Driver Driver Driver None
L1 Assisted driving Driver + system Driver Driver Partial
L2 Semi-automated driving System Driver Driver Partial
L3 Conditionally automated driving System System Driver Partial
L4 Highly automated driving System System System Partial
L5 Fully automated driving System System System All

autonomous driving technology research in 2009, and its driverless cars have completed 6 million miles of road tests and 5 billion miles of virtual road tests. The Tesla Autopilot system has reached L2 and is being updated iteratively. In China, Baidu launched its driverless vehicle project in 2013 and in 2015 established a dedicated autonomous driving division to develop L4 fully automatic driving technology directly. In April 2018, the world's first L4-class mass-produced self-driving bus, “Apolong,” went into mass production. Improving system reliability, adapting to the environment, and reducing system costs are the future development directions.

The countries or regions with the greatest output of core patents, institutions with the greatest output of core patents, collaboration network among major countries, and the collaboration network among major institutions regarding “UAV and autonomous driving technology” are shown in Table 2.2.2, Table 2.2.3, Figure 2.2.1, and Figure 2.2.2, respectively.

2.2.2 Multi-dimensional image information acquisition, processing, and fusion technology

In recent years, with the development of sensor technology, the variety of information representations, the huge amount of information, the complex relationships among information, and the required timeliness, accuracy, and reliability of information processing have become unprecedented. This has driven the rapid development of multi-sensor information fusion technology, which uses computer technology to analyze and optimize the multi-source information obtained, under certain criteria, to complete the required estimation and decision-making. Information fusion can be described as synthesizing multi-source information to obtain high-quality, useful information. Generally, a single sensor cannot extract enough information from a scene, so it is difficult or even impossible to obtain a comprehensive description of the scene independently; multiple sensors are required to acquire target data for fusion analysis so that classification and recognition decisions can be performed effectively.

As a type of information fusion, image fusion is a synthesis of multiple sources of scene information. Image fusion technology is an advanced image processing technology that combines information from multiple source images. So-called multi-source or multi-dimensional image fusion applies appropriate fusion processing to multiple source images of the same scene or target collected by multiple sensors, to obtain a more accurate, more comprehensive, and more reliable image description of that scene. An image is a two-dimensional signal, and image fusion technology is an important branch of multi-source information fusion technology; image fusion therefore shares the advantages of multi-sensor information fusion. Image fusion can enhance the useful information in images, increase the reliability of image understanding, and yield more accurate results, making systems more practical. At the same time, it gives systems good robustness, such as increased confidence, reduced ambiguity, and improved classification performance.

Currently, the main purposes of applying image fusion technology to digital image processing are as follows:

《Table 2.2.2》

Table 2.2.2 Countries or regions with the greatest output of core patents on the “UAV and autonomous driving technology”

No. Country/Region Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 USA 154 87.01% 13112 87.48% 85.14
2 Germany 6 3.39% 435 2.90% 72.5
3 Japan 6 3.39% 650 4.34% 108.33
4 China 5 2.82% 318 2.12% 63.6
5 Canada 4 2.26% 331 2.21% 82.75
6 UK 2 1.13% 131 0.87% 65.5
7 Israel 2 1.13% 148 0.99% 74
8 Ireland 1 0.56% 101 0.67% 101
9 South Korea 1 0.56% 85 0.57% 85

《Table 2.2.3》

Table 2.2.3 Institutions with the greatest output of core patents on the “UAV and autonomous driving technology”

No. Institution Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 FORD 26 14.69% 1700 11.34% 65.38
2 GOOG 20 11.30% 1386 9.25% 69.3
3 FLXT 15 8.47% 1964 13.10% 130.93
4 GENK 9 5.08% 632 4.22% 70.22
5 MGIN 7 3.95% 553 3.69% 79
6 Autoconnect Holdings LLC 6 3.39% 1102 7.35% 183.67
7 DJII 5 2.82% 318 2.12% 63.6
8 HOND 4 2.26% 510 3.40% 127.5
9 BOSC 3 1.69% 233 1.55% 77.67
10 HONE 3 1.69% 341 2.28% 113.67

FORD: Ford Global Technologies Inc.; GOOG: Google Inc.; FLXT: Flextronics AP LLC; GENK: GM Global Technologies Operations Inc.; MGIN: Magna Electronics Inc.; AUTO: Autoconnect Holdings LLC; DJII: SZ DJI Technology Co., Ltd.; HOND: Honda Motor Co., Ltd.; BOSC: Robert Bosch Gmbh; HONE: Honeywell Int. Inc.

《Figure 2.2.1》

Figure 2.2.1 Collaboration network among major countries in the engineering development front of “UAV and autonomous driving technology”

《Figure 2.2.2》

Figure 2.2.2 Collaboration network among major institutions in the engineering development front of “UAV and autonomous driving technology”

① increasing the content of useful information in the image, improving image sharpness, and enhancing certain features that are invisible in any single-sensor image; ② improving the spatial resolution of the image and increasing its spectral information content, thereby obtaining supplementary image information that improves detection/classification/understanding/recognition performance; ③ detecting changes in a scene/target through the fusion of image sequences taken at different moments; ④ producing three-dimensional images by fusing multiple two-dimensional images with stereoscopic vision, which can be used for three-dimensional image reconstruction, stereo photography, measurements, etc.; and ⑤ using images from other sensors to replace or compensate for missing or faulty information in a given sensor image.

In general, image fusion can be performed at the following three levels. ① Pixel level: fusion is performed directly on the acquired image data, which retains the most information and achieves high fusion precision. However, because the amount of processed information is large, fusion efficiency is low and real-time performance is poor, and pixel-level fusion requires accurately registered images; otherwise the fusion result is prone to large errors. ② Feature level: features are first extracted from the images, and the data are then analyzed and processed by a feature-level fusion algorithm based on the extracted feature information. In this process, the amount of information is greatly compressed, which is conducive to real-time processing, while the fusion result preserves, as far as possible, the information required for decision analysis. ③ Decision level: the highest level of fusion, whose result provides the basis for command, control, and decision-making; the fusion result therefore directly affects decisions. Decision-level fusion can still make decisions when some data sources are lost, so it is fault-tolerant. In addition, compared with the first two levels, decision-level fusion has good real-time performance, low data requirements, and strong analytical ability; however, it imposes high requirements on preprocessing and feature extraction, so its cost is higher. Usually, the three levels of fusion are used together to achieve better results.

The information fusion methods used in image fusion can be divided into six categories: algebra-based, component substitution, multi-scale decomposition, statistical, variational, and learning-based methods.
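As one concrete instance of the multi-scale decomposition category, the following Python sketch fuses two registered grayscale images through a Laplacian pyramid, keeping the stronger detail coefficient at each level and averaging the coarse residual. The pyramid depth, fusion rule, and synthetic demo images are assumptions made for illustration, not a method prescribed by this report.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def down(im):
    """Blur then decimate by 2 (one Gaussian pyramid step)."""
    return gaussian_filter(im, sigma=1.0)[::2, ::2]

def up(im, shape):
    """Bilinear upsample back to a target shape."""
    return zoom(im, (shape[0] / im.shape[0], shape[1] / im.shape[1]), order=1)

def laplacian_pyramid(im, levels):
    pyr, cur = [], im.astype(float)
    for _ in range(levels):
        low = down(cur)
        pyr.append(cur - up(low, cur.shape))   # detail (band-pass) layer
        cur = low
    pyr.append(cur)                            # coarse residual
    return pyr

def reconstruct(pyr):
    cur = pyr[-1]
    for detail in reversed(pyr[:-1]):
        cur = up(cur, detail.shape) + detail
    return cur

def fuse(im_a, im_b, levels=3):
    pa, pb = laplacian_pyramid(im_a, levels), laplacian_pyramid(im_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)   # keep the stronger detail
             for a, b in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))             # average the coarse layer
    return reconstruct(fused)

# Demo with two synthetic 64x64 "sensor" images of the same scene
rng = np.random.default_rng(2)
scene = rng.random((64, 64))
im_a = gaussian_filter(scene, 2.0)            # sensor A keeps coarse structure
im_b = scene - gaussian_filter(scene, 2.0)    # sensor B keeps fine detail
print(fuse(im_a, im_b).shape)                 # (64, 64) fused result
```

This is a pixel-level method in the taxonomy above: it assumes accurate registration, which is exactly the precondition noted for pixel-level fusion.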

Image fusion is an important topic in the field of image analysis. It has a wide range of applications, such as medical research, map drawing, and concealed weapon detection. A good fused image can lay a solid foundation for automated computer processing, such as target recognition and classification.

The countries or regions with the greatest output of core patents, institutions with the greatest output of core patents, collaboration network among major countries, and the collaboration network among major institutions regarding “multi-dimensional image information acquisition, processing, and fusion technology” are shown in Table 2.2.4, Table 2.2.5, Figure 2.2.3, and Figure 2.2.4, respectively.

2.2.3 Display, interaction, and manipulation techniques for virtual reality and augmented reality systems

Both virtual reality and augmented reality technologies model the real world in a virtual environment, which depends on data acquisition and modeling technology for the real environment. The present development trends of this technology are to model more categories of objects, collect data of higher dimensions, and create scenes more precisely and efficiently. Specifically, modeled objects are no longer limited to geometric surfaces; they also include features such as appearance, illumination, texture, and bidirectional reflectance. In terms of data dimensions, dynamic geometry, illumination, and texture along the time dimension are included, in addition to static data. The speed and precision of collection also continue to increase. The geometric

《Table 2.2.4》

Table 2.2.4 Countries or regions with the greatest output of core patents on the “multi-dimensional image information acquisition, processing, and fusion technology”

No. Country/Region Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 USA 24 72.73% 940 78.33% 39.17
2 Japan 3 9.09% 80 6.67% 26.67
3 Canada 1 3.03% 33 2.75% 33
4 Switzerland 1 3.03% 19 1.58% 19
5 Germany 1 3.03% 40 3.33% 40
6 Finland 1 3.03% 24 2.00% 24
7 UK 1 3.03% 35 2.92% 35
8 Israel 1 3.03% 27 2.25% 27
9 South Korea 1 3.03% 24 2.00% 24
10 Sweden 1 3.03% 33 2.75% 33

《Table 2.2.5》

Table 2.2.5 Institutions with the greatest output of core patents on the “multi-dimensional image information acquisition, processing, and fusion technology”

No. Institution Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 Pelican Imaging Corp 9 27.27% 428 35.67% 47.56
2 Fotonation Cayman Ltd 5 15.15% 241 20.08% 48.2
3 Lytro Inc 4 12.12% 157 13.08% 39.25
4 MICT 4 12.12% 165 13.75% 41.25
5 CANO 2 6.06% 57 4.75% 28.5
6 AMSH 1 3.03% 33 2.75% 33
7 APPY 1 3.03% 31 2.58% 31
8 Aspect Imaging Ltd 1 3.03% 27 2.25% 27
9 ETRO 1 3.03% 46 3.83% 46
10 GENE 1 3.03% 33 2.75% 33

MICT: Microsoft Corp.; CANO: Canon KK or Canon Kabushiki Kaisha; AMSH: Amersham Pharmacia Biotech Inc.; APPY: Apple Inc.; ETRO: Etron Technologies Inc.; GENE: Ge Healthcare Bio-Sci Corp.

《Figure 2.2.3》

Figure 2.2.3 Collaboration network among major countries in the engineering development front of “multi-dimensional image information acquisition, processing, and fusion technology”

《Figure 2.2.4》

Figure 2.2.4 Collaboration network among major institutions in the engineering development front of “multi-dimensional image information acquisition, processing, and fusion technology”

calculation is now accurate to the micron level, and illumination reconstruction realizes the collection of dynamic light fields. As for modeling methods, the means of modeling have been expanded through the comprehensive use of the latest acquisition equipment, including new laser scanners, depth cameras, and femtosecond cameras. New methods and technologies based on vision, interaction, and AI provide more convenient and intelligent modeling approaches. For example, over the past few years, modeling methods based on machine learning and geometric modeling design methods have been explored from the perspective of achieving high-level semantic understanding, enabling the rapid batch design of models with similar styles but different sizes and details. The emergence of these technologies provides a new technological approach for the high-fidelity, high-efficiency reconstruction of real environments.

Both virtual reality and augmented reality technologies must present the virtual environment/objects in real time and with high fidelity to provide people with a realistic visual experience, a problem that computer rendering technology continuously tries to solve. Though traditional graphics technology can construct complex and interactive virtual scenes, the accuracy and efficiency of presentation still need improvement. As applications deepen, the presentation of virtual scenes becomes more sophisticated, their complexity grows rapidly, and serious challenges are posed to the interactive processing of the virtual environment. Although many effective accelerated processing technologies have been invented, such as scene simplification, sampling prediction, and parallel distributed processing, the contradiction between authenticity and real-time performance in virtual environment perception remains the bottleneck hindering the popularization of virtual reality technology. Currently, there are two new development trends in this field. One is that, for some special application requirements, methods of presenting and driving virtual scenes with low complexity and high authenticity are explored by appropriately limiting interactive freedom. The other is that new rendering technologies, such as visibility prediction, out-of-core (external memory) rendering, parallel distributed computing, GPU computing, and real-time ray tracing, are being developed specifically to achieve high-fidelity rendering of virtual environments/objects.

To provide an immersive virtual experience, the display output technology of virtual reality and augmented reality differs from traditional monitor-based display methods. In recent years, a series of high-tech display devices have been developed by exploring new display mechanisms and methods, such as high-resolution large projection displays, lightweight helmets, and true 3D light-field displays. Recent progress in the virtual reality industry has also been driven by the launch of Oculus' consumer virtual reality headset. Further, with the release of a series of products such as the HTC Vive virtual reality headset, the Microsoft HoloLens see-through augmented reality headset, and the Magic Leap light-field display helmet, and with the emergence of display helmets using holographic technology, new displays and devices have become available, greatly improving the immersive experience of virtual and augmented reality. In addition, in terms of 3D displays, the new generation of true 3D light-field displays overcomes the defect of traditional naked-eye 3D displays, which have only fixed optimal observation points; they present the light field of virtual objects naturally and realize continuous motion parallax and binocular parallax. Nevertheless, for realistic rendering of virtual environments/objects, display technology still needs to resolve the following problems: the contradiction between consistent virtual and real experiences, and limited computation, processing, and display bandwidth.

In terms of human–computer interaction, traditional interactive technologies based on voice, pens, data gloves, and 3D mice are increasingly mature. However, many problems remain in the naturalness and efficiency of interaction, especially in tactile perception. Though many researchers continue to perfect these interactive technologies, new natural and harmonious human–computer interaction technologies and virtual–real fusion interfaces have rapidly become the mainstream development direction. The new technologies try to map people's real-life interactions with objects and environments onto the user interface of the interactive process in information space, applying life experience to the human–computer interaction system as closely as possible, so as to lower the learning threshold of computer technology and increase the naturalness of interaction. In recent years, multi-touch user interfaces and interactive methods based on hand gestures have been studied in depth as human–computer interaction interfaces. Although these interactive technologies have preliminary applications, they are often limited by reliability, speed, flexibility, accuracy, convenience, and other such factors, and are still quite far from the natural and intuitive human–computer interaction technology that people expect.

The object of both virtual reality and augmented reality technologies is the human; thus, it is necessary to realize a consistent experience of virtual and real environments in terms of human perception, and research on real perception should be conducted covering both the psychological and physiological aspects of people. Among these, the first problem to solve is the physical discomfort brought by virtual environment simulation, usually referred to as “virtual reality motion sickness.” Many factors lead to it. The first is that the visual information (the virtual image seen by the eyes) does not match the position information perceived by the vestibular system in the ear, causing a feeling of vertigo. Another is computational latency, which delays the visual image and makes it lag behind movement. In addition, everyone's pupillary distance is different, and the pupil center, lens center, and image center may not lie on the same line, producing ghosting that can also cause discomfort. Finally, vertigo can arise when depth of field is out of sync in a virtual scene. Currently, there are several ways to alleviate motion sickness: first, biosynchronous feedback technology, such as omnidirectional treadmills that realistically simulate real motion responses in the virtual world; second, electrical stimulation technology, such as vestibular stimulation that excites users' vestibular receptors through electrodes placed behind the ears, giving a strong sense of visual and body-position immersion in the virtual environment; third, improving computing capacity, for example by reducing latency; and fourth, adding a virtual reference object, such as a virtual nose, to reduce perceived discomfort.

The conception of virtual reality and augmented reality can be traced back to science fiction of the 1930s. The first laboratory virtual reality headset, which turned the concept into reality, appeared in the 1960s, while the Oculus headset that entered the consumer market and led the development of the virtual reality industry was officially released in 2016. After more than half a century of progress in modeling, rendering, display, and interaction technologies, virtual and augmented reality have finally entered ordinary homes and begun to affect people's lives. The rapid development of the consumer market will further promote the improvement and development of the technology. Currently, internationally renowned IT enterprises (such as Microsoft, Facebook (Oculus), Google, and Apple) have increased their investment in virtual and augmented reality technologies. Mutual promotion between market and technology will be evident over the next five years. Further progress will be achieved in the manner and authenticity of virtual environment presentation and in the convenience and accuracy of human–machine interaction. In the future, larger breakthroughs are expected in head-mounted displays, realistic rendering algorithms based on ray tracing, natural gesture interaction, mobile precise positioning and position perception, and so on. The authenticity of the virtual environments created by virtual and augmented reality will be further improved, providing people with barrier-free, accessible, and comfortable virtual and augmented environments.

The countries or regions with the greatest output of core patents, institutions with the greatest output of core patents, collaboration network among major countries, and the collaboration network among major institutions regarding “display, interaction, and manipulation techniques for virtual reality and augmented reality systems” are shown in Table 2.2.6, Table 2.2.7, Figure 2.2.5, and Figure 2.2.6, respectively.

《Table 2.2.6》

Table 2.2.6 Countries or regions with the greatest output of core patents on the “display, interaction, and manipulation techniques for virtual reality and augmented reality systems”

No. Country/Region Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 USA 34 75.56% 1507 76.11% 44.32
2 Japan 7 15.56% 208 10.51% 29.71
3 Canada 2 4.44% 82 4.14% 41
4 Germany 1 2.22% 121 6.11% 121
5 South Korea 1 2.22% 62 3.13% 62

《Table 2.2.7》

Table 2.2.7 Institutions with the greatest output of core patents on the “display, interaction, and manipulation techniques for virtual reality and augmented reality systems”

No. Institution Published patents Percentage of published patents Citations Percentage of citations Citations per patent
1 MICT 10 22.22% 735 37.12% 73.5
2 Magic Leap Inc 7 15.56% 309 15.61% 44.14
3 DAQRI LLC 3 6.67% 54 2.73% 18
4 SHIH 3 6.67% 71 3.59% 23.67
5 SONY 3 6.67% 94 4.75% 31.33
6 GOOG 2 4.44% 86 4.34% 43
7 Microsoft Technology Licensing LLC 2 4.44% 49 2.47% 24.5
8 BOSC 1 2.22% 121 6.11% 121
9 ETRI 1 2.22% 62 3.13% 62
10 Eye Labs LLC 1 2.22% 15 0.76% 15

MICT: Microsoft Corp.; SHIH: Seiko Epson Corp.; SONY: Sony Corp.; GOOG: Google Inc.; BOSC: Robert Bosch Gmbh; ETRI: Electronics & Telecom Res Inst.

《Figure 2.2.5》

Figure 2.2.5 Collaboration network among major countries in the engineering development front of “display, interaction, and manipulation techniques for virtual reality and augmented reality systems”

《Figure 2.2.6》

Figure 2.2.6 Collaboration network among major institutions in the engineering development front of “display, interaction, and manipulation techniques for virtual reality and augmented reality systems”


Participants of the Field Group

Leaders

PAN Yunhe, LU Xicheng

Deputy Leaders

WU Jiangxing, JIANG Huilin

Members

Academicians

Group 1: LI Tianchu, CHEN Lianghui, GONG Huixing, JIANG Huilin, ZHOU Shouheng, WEI Yu, LIU Zejin

Group 2: WU Jiangxing, DUAN Baoyan, CHEN Zhijie, LIU Yunjie, FAN Bangkui, WU Weiren

Group 3: PAN Yunhe, LU Xicheng, ZHENG Nanning, LI Bohu, FEI Aiguo, ZHAO Qinping, WU Jianping

Library and Information Specialists: YANG Weiqiang, LIU Shulei, GENG Guotong, HUO Ningkun, QU Tingting,

WU Ji, YANG Xiao, LIU Baolin, CHEN Zhenying, YE Wenying

Liaisons: FAN Guimei, WANG Bing, ZHANG Jia, ZENG Jianlin

Secretaries: ZHAI Ziyang, HU Xiaonv, YANG Weiqiang

Report Writers (in alphabetic order of the last name)

AI Bo, CHENG Zhiyuan, DENG Qiwen, FAN Hongqi, HAN Yahong, HU Xiaonv, HUANG Chenlin, HUANG Tao,

HUANG Yuzhen, JIN Zhonghe, LI Bao, LIU Ken, LIU Meiqin, REN Qinyuan, TAN Shuang, WANG Rui, WU Fei,

XI Peng, XIE Renchao, YANG Yi, ZHANG Chengliang, ZHENG Nenggan