1. Introduction

During the past decade, considerable attention has been paid to metasurfaces, due to their strong capability of manipulating electromagnetic (EM) waves [1–5]. A metasurface is a two-dimensional (2D) artificial structure array composed of periodic or nonperiodic subwavelength unit cells. Compared with three-dimensional (3D) metamaterials, metasurfaces have the advantages of being ultrathin, having low loss, and enabling easy integration. However, the functionalities of a passive metasurface are unchangeable once the metasurface is fabricated.

Recent trends in active metasurfaces have led to a proliferation of studies on attaining tunable control of EM waves. Thus far, active metasurfaces with incorporated diodes [6–9], graphene [10,11], phase-change materials [12], and transparent conducting oxides [13] have been proposed and investigated. Benefiting from the flexible tunability of active metasurfaces, the asymmetric transmission of a chiral metasurface can be dynamically controlled [6], and the metasurface-assisted Fabry–Pérot cavity antenna can change its operating frequencies and radiation angles [7]. Furthermore, a reconfigurable metasurface can switch between several functions and produce intricate beams. Li et al. [8] presented an active metasurface that can reconfigure the number of reflected beams from one to five. In Ref. [9], wave polarization is designed to be dynamically switched among linear, elliptical, and circular polarization by the varactor diodes embedded on a metasurface.

To realize a wider range of functions in real time, Cui et al. [14] proposed digital coding and programmable metasurfaces in 2014, in which the metasurface is represented from a digital perspective, and a link between a physical metasurface and information science is established. By engineering the coding sequences of “0” and “1,” a programmable metasurface with independently adjustable elements has the capacity to implement more advanced and practical functional devices [15,16]. More recently, several attempts have been made to establish digital metasurfaces that can be programmed by light [17,18] and temperature [19,20]. Thanks to the rapid development of information technology promoting the integration of many disciplines, metasurfaces have been developed that sense vision and motion [21,22]. Sound can also carry information. By integrating a speech-recognition module into a programmable metasurface, a link between voice signals and EM signals can be established. We envision that the combination of metasurfaces and acoustics will have broad application prospects in the fields of intelligent communication, human–computer interaction, and smart cities.

Here, we propose, design, and demonstrate a digital metasurface for programming EM manipulations based on speech recognition. By altering the voice commands in real time, different kinds of functionalities can be implemented in a contact-free manner. Furthermore, a genetic algorithm (GA) is adopted to assist in the design of the metasurface by optimizing the coding sequences for different functionalities. As a proof of concept, three different functions are demonstrated: radar cross-section (RCS) reduction, vortex beam generation, and beam splitting. The measured and simulated results verify the good performance of the smart metasurface system. This work provides a new method of controlling EM waves through speech recognition based on a smart metasurface.

2. Principle of the smart metasurface based on speech recognition

The proposed smart metasurface platform based on speech recognition is composed of a digital metasurface, a digital-to-analog converter (DAC) circuit, a single-chip computer, and a speech-recognition module, as shown in Fig. 1. In order to realize a smart metasurface, varactor diodes are integrated to construct the active elements. By tuning the bias voltage applied to the varactor diodes, the reflection phase response of the element can be dynamically changed. All the elements in a super unit cell share the same reverse voltage, and all the super unit cells are controlled independently. The extended voice-control subsystem is made up of three parts: a speech-recognition module, a single-chip computer, and a DAC circuit. When a voice command is issued, the speech-recognition module recognizes the command, and the single-chip computer calculates the bias voltage sequences according to the corresponding serial port data, converts the data format, and sends the data to the DAC. The DAC then generates the voltage distributions required by the metasurface according to the data received from the single-chip computer. Through the above processes, the metasurface for programming EM manipulation is controlled by changing the voice commands.

Fig. 1. Schematic of the speech-recognition digital metasurface for programming EM manipulations.

To establish a two-bit digital metasurface, we first designed an active element whose reflection state could be controlled dynamically, as shown in Fig. 2(a). The designed element consists of two metal layers separated by a dielectric substrate. On the top layer, there are two interdigital copper patches with a varactor diode embedded in the gap. The middle layer is an F4B substrate (dielectric constant 2.65 and loss tangent 0.001), and the bottom layer is covered by a copper plate that acts as a metallic ground. Direct current (DC) bias lines are placed at two ends of each copper patch to connect the adjacent elements in a column. Through the bias lines, DC bias voltages can be loaded to feed the varactor diodes and then change their capacitances. As a result, the elements will exhibit different reflection states with different capacitances when the DC bias voltages are tuned. The period of the element is L = 12 mm (approximately λ/4.2), the height of the substrate is H = 3 mm (approximately λ/16.7), and the thickness of the copper is 0.018 mm. Based on an optimization using simulation software, the other geometric dimensions (as shown in Fig. 2(a)) were set as follows: a = 0.9 mm, b = 6.5 mm, c = 3.3 mm, d = 0.7 mm, e = 4.3 mm, f = 8.4 mm, g = 2.5 mm, i = 1 mm, and t = 0.2 mm.

In order to achieve wide phase coverage and a wide operation bandwidth, selecting a varactor diode with low capacitance, a high capacitance ratio, and low loss was critical for our design. Herein, we chose the varactor “MA46H120” from MACOM Technology Solutions [23]. Fig. 2(b) presents the equivalent resistor–inductance–capacitor (RLC) series circuit of this varactor diode, in which the equivalent parameters were chosen as follows: The series resistance Rs is 0.88 Ω, the parasitic inductance Ls is 0.4 nH, and the variable capacitance CT varies from 1.1 to 0.14 pF as the reverse bias voltage changes from 0 to 10 V. In the design process, the effect of a bonding pad with the dimensions 0.2 mm × 0.2 mm between the copper patches was also considered.
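
The series RLC model above can be evaluated directly to see how strongly the diode reactance changes across the tuning range. The following is a minimal Python sketch, assuming an ideal lumped series branch with the quoted Rs and Ls values; the function name `varactor_impedance` is introduced here for illustration only and does not account for the bonding pad or the surrounding element geometry.

```python
import math

# Equivalent series parameters quoted in the text for the MA46H120 model.
R_S = 0.88      # series resistance, ohms
L_S = 0.4e-9    # parasitic inductance, henries


def varactor_impedance(freq_hz: float, c_t_farads: float) -> complex:
    """Impedance of an ideal series R-L-C branch: Z = R + j*w*L + 1/(j*w*C)."""
    omega = 2.0 * math.pi * freq_hz
    return R_S + 1j * omega * L_S + 1.0 / (1j * omega * c_t_farads)


if __name__ == "__main__":
    # Capacitance values (pF) of the four states discussed at 6 GHz.
    for c_pf in (1.10, 0.45, 0.32, 0.14):
        z = varactor_impedance(6e9, c_pf * 1e-12)
        print(f"C_T = {c_pf:.2f} pF -> Z = {z.real:.2f} {z.imag:+.1f}j ohm")
```

In this idealized model the branch stays capacitive at 6 GHz for all four capacitance values, and the reactance magnitude grows as the capacitance decreases.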

Fig. 2. The designed smart metasurface and its reflection characteristics. (a) Structure of the active element; (b) equivalent resistor–inductance–capacitor (RLC) series circuit of the varactor diode with model “MA46H120”; (c) simulated reflection magnitudes of the element with different CT; (d) simulated reflection phases of the element with different CT; (e) photograph of the fabricated speech-recognition digital metasurface; (f) framework of the voice-control system and its operating process.

To investigate the reflection characteristics of the designed element, we carried out numerical simulations in CST Microwave Studio software. The simulated reflection magnitudes and phases of the proposed element with different CT are shown in Figs. 2(c) and (d). From 4.74 to 8.48 GHz, all the reflection magnitudes are greater than –1.5 dB and the reflection phase tuning range is 250.0° to 306.4°, as shown by the yellow region. Within this operation bandwidth, the element works well; that is, the designed element offers a broad band, high reflectivity, and a large phase-shift range. In particular, at a frequency of 6 GHz, the phase differences between adjacent cases with 1.10, 0.45, 0.32, and 0.14 pF are almost 90°, and the corresponding reverse bias voltages are 0, 3.0, 4.8, and 10.0 V, respectively. Therefore, a two-bit element was obtained, and we encoded these four states as “00,” “01,” “10,” and “11.” By altering the reverse voltage, the proposed element can realize the phase shifts required by two-bit coding within the operation bandwidth.
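
The four coding states and their bias conditions at 6 GHz can be collected in a small lookup table. The sketch below is illustrative Python; the names `STATE_TABLE`, `bias_voltage`, and `nominal_phase_deg` are assumptions introduced here, and the nominal phase assumes ideal 90° steps that increase with the code value, whereas the text only states that adjacent states differ by almost 90°.

```python
# Two-bit states at 6 GHz as listed in the text:
# code -> (reverse bias voltage in V, varactor capacitance C_T in pF).
STATE_TABLE = {
    "00": (0.0, 1.10),
    "01": (3.0, 0.45),
    "10": (4.8, 0.32),
    "11": (10.0, 0.14),
}


def bias_voltage(code: str) -> float:
    """Reverse bias voltage (V) that selects the given two-bit state."""
    return STATE_TABLE[code][0]


def nominal_phase_deg(code: str) -> int:
    """Idealized relative reflection phase, ASSUMING exact 90-degree steps."""
    return 90 * int(code, 2)


if __name__ == "__main__":
    for code in STATE_TABLE:
        print(code, bias_voltage(code), "V,", nominal_phase_deg(code), "deg")
```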

Then, we constructed a super unit cell by utilizing 4 × 4 elements to alleviate the influence of the phase jump between the adjacent elements. The left and right bias lines of the 16 elements are respectively linked together. Hence, all the elements in a super unit cell share the same bias voltage. The positive and negative electrodes of all the super unit cells are diverted to the backside of the metasurface. As experimental validation, we fabricated a digital metasurface that contains 6 × 6 super unit cells, as shown in Fig. 2(e). Each super unit cell is controlled independently.
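
Because every element inside a super unit cell carries the same state, the full element-level coding map follows from the 6 × 6 super-unit-cell matrix by block repetition. A minimal Python/NumPy sketch is given below; the function name `expand_to_elements` and the integer encoding 0 to 3 for the states “00” to “11” are assumptions made for illustration.

```python
import numpy as np

ELEMENTS_PER_SIDE = 4      # each super unit cell contains 4 x 4 elements
SUPER_CELLS_PER_SIDE = 6   # the fabricated sample has 6 x 6 super unit cells
PERIOD_MM = 12.0           # element period L from the text


def expand_to_elements(super_codes: np.ndarray) -> np.ndarray:
    """Repeat each super-unit-cell code over its 4 x 4 block of elements."""
    block = np.ones((ELEMENTS_PER_SIDE, ELEMENTS_PER_SIDE), dtype=super_codes.dtype)
    return np.kron(super_codes, block)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codes = rng.integers(0, 4, size=(SUPER_CELLS_PER_SIDE, SUPER_CELLS_PER_SIDE))
    full = expand_to_elements(codes)
    print(full.shape)                    # element grid size
    print(full.shape[0] * PERIOD_MM)     # patterned aperture width in mm
```

With these numbers the patterned area is 24 × 24 elements, that is, 288 mm on a side.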

The key to realizing voice control of a metasurface is to generate different voltage sequences according to different voice commands, so that the digital metasurface presents various coding states and realizes various functions. The framework of the voice-control system is shown in Fig. 2(f). Voice commands are recognized by a REC-V2 speech-recognition module, which is a high-performance voice–user interface that can realize human–machine voice interaction. This speech-recognition module can pre-store up to 50 different voice commands and can recognize voice commands from within 6 m when the signal-to-noise ratio is higher than 30 dB. The recognition rate is as high as 97%, and the response time is less than 0.5 s. The module recognizes a voice command and then sends the corresponding serial port data to the single-chip computer. It should be noted that different words produce different waveforms in the time domain, so voice recognition can be realized by sampling the waveform. When different people pronounce the same word, their waveforms are almost the same in the time domain. Therefore, this voice-control system is suitable for all users.

The single-chip computer calculates the bias voltage distributions according to the received serial port data, converts the data format, and sends the converted data and the channel number information to the DAC module. The data are transmitted by serial communication at 9600 baud. The DAC module generates the bias voltage sequences required by the metasurface according to the received data, and the voice-controlled metasurface is thereby realized. The DAC module is based on a TLC5628 DAC board, and each DAC module has eight independent voltage output channels. This module is able to generate DC voltages varying from 0 to 10 V with a step of 0.01 V. The metasurface is composed of 6 × 6 super unit cells; therefore, five DAC modules are needed. Fig. 2(e) shows a photograph and other details of the voice-control system. To clearly demonstrate the operation process of the voice-control system, we made a short movie, which is attached in Appendix A; for more details, interested readers can see Appendix A, Section S1. The bias voltages generated by the 36 channels are clearly presented on the screens of the DAC. It can be seen that the voltage sequence that is stored in advance changes correspondingly with the voice commands. It is notable that this DAC can only display a maximum value of 9.99 V. To verify the accuracy of the control voltage, as shown in the movie, we also measured the voltage value with a multimeter. Good agreement indicates the effectiveness of this DAC.
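
The channel bookkeeping described above (36 independent super unit cells, eight outputs per DAC module, 0 to 10 V in 0.01 V steps) can be sketched as follows. This is an illustrative Python example, not the firmware of the single-chip computer; the row-major assignment of super unit cells to modules and channels in `channel_for` is a hypothetical choice, since the text does not specify the wiring order.

```python
import math
from typing import Tuple

CHANNELS_PER_MODULE = 8    # each DAC module has eight output channels
NUM_SUPER_CELLS = 36       # 6 x 6 super unit cells
V_MIN, V_MAX, V_STEP = 0.0, 10.0, 0.01


def modules_needed(num_channels: int = NUM_SUPER_CELLS) -> int:
    """Smallest number of eight-channel modules that covers all channels."""
    return math.ceil(num_channels / CHANNELS_PER_MODULE)


def channel_for(super_cell_index: int) -> Tuple[int, int]:
    """HYPOTHETICAL row-major wiring: return (module, channel), both 0-based."""
    if not 0 <= super_cell_index < NUM_SUPER_CELLS:
        raise ValueError("super cell index out of range")
    return divmod(super_cell_index, CHANNELS_PER_MODULE)


def quantize_voltage(volts: float) -> float:
    """Clamp to the 0-10 V range and round to the nearest 0.01 V step."""
    clamped = min(max(volts, V_MIN), V_MAX)
    return round(round(clamped / V_STEP) * V_STEP, 2)


if __name__ == "__main__":
    print(modules_needed())        # number of DAC modules required
    print(channel_for(35))         # last super unit cell
    print(quantize_voltage(4.8))   # bias for state "10"
```

Under this assignment the fifth module uses only four of its eight channels, which is consistent with five modules serving 36 outputs.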

3. Function verification and simulation results

We present three functions to demonstrate the abovementioned smart metasurface. Reasonable utilization of optimization algorithms in metasurface design can greatly reduce the design time and improve the design efficiency. In this work, for ease of metasurface design, we introduced a GA to assist in looking for optimal coding sequences according to the specified functionalities. The GA is a random global search algorithm that simulates the Darwinian biological evolution of natural selection and genetic mechanisms. It can search for the optimal solution effectively when solving complex combinatorial optimization problems [19–21]. The fitness function in our optimization is defined as follows:

where θ and φ are the azimuth angle and elevation angle, respectively; is the far-field pattern of the metasurface; and is the far-field pattern of the optimization goal. The far-field pattern scattered by the metasurface can be expressed as follows:

where p and q refer to the number of elements in the x and y direction, respectively; is the far-field pattern of the element in line q, column p; j is the imaginary unit; k is the wavenumber; and dx and dy are the element dimensions along the x and y directions, respectively. Both the amplitude and the phase information are taken into consideration. The termination condition is attained when the number of iterations reaches 500 or the optimization goal is achieved. Hence, the GA will continue to look for the optimized parameter distribution until the termination condition is satisfied.
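
A numerical version of this kind of far-field summation can be sketched as below. The sketch is written in Python with NumPy and makes several assumptions that are not taken from the text: isotropic, unit-amplitude elements; normal incidence; θ measured from the surface normal and φ in the plane of the metasurface; and element positions centered on the aperture. The function name `far_field` and its sign convention are illustrative and may differ from the authors' expression.

```python
import numpy as np


def far_field(phases_rad, dx, dy, wavelength, theta, phi):
    """Array-factor estimate of the scattered field toward (theta, phi).

    phases_rad[q, p] is the reflection phase of the element in row q, column p.
    ASSUMES isotropic unit-amplitude elements and normal incidence.
    """
    k = 2.0 * np.pi / wavelength
    n_rows, n_cols = phases_rad.shape
    # Element centre coordinates, centred on the aperture.
    x = (np.arange(n_cols) - (n_cols - 1) / 2.0) * dx
    y = (np.arange(n_rows) - (n_rows - 1) / 2.0) * dy
    xx, yy = np.meshgrid(x, y)
    # Path-length phase of each element toward the observation direction.
    path = k * np.sin(theta) * (xx * np.cos(phi) + yy * np.sin(phi))
    return np.sum(np.exp(1j * (phases_rad + path)))


if __name__ == "__main__":
    n = 24  # 6 x 6 super unit cells of 4 x 4 elements
    uniform = np.zeros((n, n))
    # A uniform surface reflects specularly: |F| equals the element count.
    print(abs(far_field(uniform, 12.0, 12.0, 50.0, 0.0, 0.0)))
```

Evaluating `far_field` on a grid of directions gives a pattern that an optimizer can compare against a target pattern.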

To validate the feasibility of the GA in the metasurface design, we present the function of RCS reduction for the GA to optimize. A coding metasurface that is carefully designed makes it possible to randomly disperse the incident waves in multiple directions [24–27]. In this case, our aim is to make the incident waves reflect evenly in all directions. The final optimized coding sequence after the fitness value reached stability is shown in Fig. 3(a), and the simulated 3D far-field radiation pattern at 6 GHz is shown in Fig. 3(b). It can be observed that the reflected main beam disappears, and the divergent energies in all directions are relatively small. The effect of the RCS reduction is maintained in a wide frequency band from 5.3 to 7.7 GHz and the capacitances remain unchanged by using two-bit unit cells; that is, just four coding states were used instead of continuous phase distributions.
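
To illustrate how a GA can search for a diffusing coding sequence, a compact sketch follows. It is not the authors' implementation: the fitness used here (the largest normalized array-factor magnitude over a coarse grid of directions, to be minimized) is a stand-in chosen for this example, the elements are treated as isotropic with ideal 90° phase steps, and the population size, mutation rate, and generation count are arbitrary assumptions. All function names are hypothetical.

```python
import numpy as np

N_SUPER, N_ELEM = 6, 4                  # 6 x 6 super cells of 4 x 4 elements
PERIOD_MM, WAVELENGTH_MM = 12.0, 50.0   # element period and wavelength at 6 GHz


def _steering_matrix(n_theta=19, n_phi=12):
    """Precompute exp(j*k*r.u) for every element and sampled direction."""
    n = N_SUPER * N_ELEM
    pos = (np.arange(n) - (n - 1) / 2.0) * PERIOD_MM
    xx, yy = np.meshgrid(pos, pos)
    k = 2.0 * np.pi / WAVELENGTH_MM
    thetas = np.linspace(0.0, np.pi / 2.0, n_theta)
    phis = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    t, p = np.meshgrid(thetas, phis, indexing="ij")
    ux = (np.sin(t) * np.cos(p)).ravel()
    uy = (np.sin(t) * np.sin(p)).ravel()
    return np.exp(1j * k * (np.outer(ux, xx.ravel()) + np.outer(uy, yy.ravel())))


STEER = _steering_matrix()


def peak_scattering(codes):
    """Stand-in fitness: largest normalized |F| over the grid (lower is better)."""
    block = np.ones((N_ELEM, N_ELEM))
    phases = np.kron(codes.reshape(N_SUPER, N_SUPER), block) * (np.pi / 2.0)
    field = STEER @ np.exp(1j * phases.ravel())
    return float(np.max(np.abs(field)) / phases.size)


def run_ga(pop_size=40, generations=60, mutation_rate=0.05, seed=0):
    """Elitist GA with single-point crossover and random-reset mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 4, size=(pop_size, N_SUPER * N_SUPER))
    history = []
    for _ in range(generations):
        fitness = np.array([peak_scattering(ind) for ind in pop])
        order = np.argsort(fitness)
        pop = pop[order]
        history.append(float(fitness[order[0]]))
        elite = pop[: pop_size // 2]            # keep the better half unchanged
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(0, len(elite), size=2)]
            cut = rng.integers(1, a.size)
            child = np.concatenate([a[:cut], b[cut:]])
            mask = rng.random(child.size) < mutation_rate
            child[mask] = rng.integers(0, 4, size=int(mask.sum()))
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fitness = np.array([peak_scattering(ind) for ind in pop])
    history.append(float(fitness.min()))
    return pop[np.argmin(fitness)].reshape(N_SUPER, N_SUPER), history


if __name__ == "__main__":
    best, history = run_ga()
    print(best)
    print(f"peak |F| / N: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because the better half of each generation is carried over unchanged, the best fitness value never gets worse from one generation to the next. A fixed generation count plays the role of the iteration limit mentioned in the text.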

Fig. 3. Coding patterns and simulated results for the programmable metasurface. (a) Coding pattern and (b) simulated 3D far-field radiation pattern of the RCS reduction at 6 GHz; coding patterns of (c) the positive first-order vortex beam and (d) the positive second-order vortex beam; simulated 3D far-field radiation patterns of (e) the positive first-order vortex beam and (f) the positive second-order vortex beam at 6 GHz; simulated phase distribution of the Ex electric field component of (g) the positive first-order vortex beam and (h) the positive second-order vortex beam on the xoy plane at 6 GHz; (i) coding pattern and (j) simulated 3D far-field radiation pattern of the beam splitting at 6 GHz.

To further verify the performance of the programmable metasurface, we present the second function: vortex beam generation. Vortex beams carry orbital angular momentum (OAM) with spiral phase distributions. Theoretically, different modes of OAM beams possess orthogonality, which keeps them from interfering with each other, and the phase distribution of each mode remains stable during the process of signal transmission. In principle, using OAM beams to carry microwave signals can therefore provide an unlimited number of communication channels and greatly increase the channel capacity, allowing the transmission rate to be significantly improved [28–31].

The quantized phase distributions of the positive first-order and positive second-order vortex beams are illustrated in Figs. 3(c) and (d), respectively. The simulated 3D far-field radiation patterns of the positive first-order and positive second-order vortex beams at 6 GHz are shown in Figs. 3(e) and (f). It can clearly be observed that there is a hollow ring in the middle of each pattern and that the ring-shaped pattern of the positive second-order vortex beam is a little larger than that of the positive first-order vortex beam, which confirms the properties of the OAM beams. The near-field phase distribution is also an important characteristic to depict the performance of the vortex beams. Figs. 3(g) and (h) illustrate the simulated phase distributions of the Ex electric field component of the positive first-order and positive second-order vortex beams on the xoy plane. Helical phase distributions with one and two round gradual phase shifts can be clearly observed on the monitoring plane.
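
A quantized spiral phase of this kind can be generated by sampling l·arctan(y/x) at the super-unit-cell centers and rounding down to the nearest two-bit level. The Python/NumPy sketch below shows one such construction; the function name `vortex_codes`, the floor-based quantization, and the choice of origin at the aperture center are assumptions made here, so the resulting maps need not coincide cell by cell with Figs. 3(c) and (d).

```python
import numpy as np


def vortex_codes(order: int, n_side: int = 6, bits: int = 2) -> np.ndarray:
    """Quantized spiral phase order*atan2(y, x) on an n_side x n_side grid."""
    levels = 2 ** bits
    # Super-unit-cell centres, measured from the aperture centre.
    centres = np.arange(n_side) - (n_side - 1) / 2.0
    xx, yy = np.meshgrid(centres, centres)
    phase = np.mod(order * np.arctan2(yy, xx), 2.0 * np.pi)
    step = 2.0 * np.pi / levels
    return np.floor(phase / step).astype(int) % levels


if __name__ == "__main__":
    print(vortex_codes(1))   # one 2*pi phase turn around the centre
    print(vortex_codes(2))   # two 2*pi phase turns around the centre
```

For the first-order case each quadrant of the aperture receives a single code, which corresponds to the four 90° phase steps of the two-bit element.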

The third function we will discuss is beam splitting. For simplicity, we design a coding sequence for two symmetric beams. The deflection angle θ should satisfy the following equation:

$$\theta = \arcsin\left(\frac{\lambda\,\Delta\varphi}{2\pi\,\Gamma}\right)$$

where λ is the operating wavelength, Δφ is the phase difference of the adjacent super unit cell, and Γ is the periodic length of the coding sequence along the x direction. In this case, the operating frequency is 6 GHz, λ is 50 mm, Δφ is ±π, and Γ is 48 mm; hence, the calculated value of the deflection angle θ is ±31.4°. The simulated 3D far-field radiation pattern is shown in Fig. 3(j), where there are two conspicuous reflected beams on the xoz plane. We can observe that the deflection angles are 30° and –32°, which agree well with the theoretical values.
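
The quoted deflection angle can be checked numerically. The short Python sketch below assumes the generalized-Snell form sin θ = λΔφ/(2πΓ) for normal incidence, which reproduces ±31.4° from the stated values of 50 mm, ±π, and 48 mm; the function name `deflection_angle_deg` is introduced here for illustration.

```python
import math


def deflection_angle_deg(wavelength_mm: float, phase_step_rad: float,
                         length_mm: float) -> float:
    """Deflection angle from sin(theta) = wavelength * dphi / (2 * pi * length)."""
    s = wavelength_mm * phase_step_rad / (2.0 * math.pi * length_mm)
    if abs(s) > 1.0:
        raise ValueError("no propagating beam for these parameters")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    for dphi in (math.pi, -math.pi):
        print(f"{deflection_angle_deg(50.0, dphi, 48.0):+.1f} deg")
```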

Aside from the three functions mentioned above, many other interesting functions can be realized by adjusting the coding sequences, such as the generation of a wide beam, a highly directional beam, multiple beams, beam scanning, and so on. In general, the number of the different functionalities is limited by the number of the coding patterns, which correspond to the scattering patterns. The numerical verifications of the wide-beam and multibeam functions are detailed in Appendix A, Section S2. In addition, the presented smart metasurface can be extended further to other frequency bands.

4. Experimental results and discussion

The entire fabricated platform is composed of a voice-control module, a DAC circuit, and the digital metasurface, whose dimensions are 300 mm × 300 mm. The measurement was carried out in a microwave anechoic chamber; the experimental setup is depicted in Fig. 4(a). There is a long board on the mechanical turntable. The fabricated sample is fixed on one end, and a standard horn antenna serves as the feeding antenna on the other end. Another standard horn antenna serving as the receiving antenna is placed about 5 m away, facing the fabricated metasurface, and the altitudes of the two horn antennas and the metasurface are the same. As the mechanical turntable rotates on the horizontal plane, the feeding antenna maintains a vertical incidence to the metasurface, and the EM waves reflected by the metasurface in all directions are received by the receiving antenna. After that, the received data are transmitted to the computer for post-processing.

During the experiments, all three functions were measured by changing the voice commands for the RCS reduction, generation of the positive first-order and positive second-order vortex beams, and beam splitting. The measured results for the smart metasurface based on speech recognition are presented in Fig. 4. First, we measured the monostatic RCS reduction from 5 to 8 GHz. To accurately represent the value of the energy dissipation, we also tested the radiation pattern of a reference perfect electric conductor (PEC) plane with the same dimensions as the metasurface. As shown in Fig. 4(b), an RCS reduction of –7 dB could be realized from 5.1 to 8.0 GHz, with the maximum RCS reduction reaching –15.88 dB at 6.5 GHz. For the vortex beams, Figs. 4(c)–(f) show the measured results of the positive first-order and positive second-order vortex beams in the E-plane and H-plane at 6 GHz. It can be clearly observed that a hollow ring appears in the center and there are two wave peaks around the hollow. The divergence angle of the positive second-order vortex beam is a little larger than that of the positive first-order vortex beam. All the measured scattering patterns are in good agreement with the simulated ones. It is notable that the asymmetric field distributions of the vortex beams in both the simulation and experimental results are mainly caused by the low resolution of the super cells and the inhomogeneous amplitudes of the unit cells. The performance of the smart metasurface can be improved by choosing higher bit coding elements and increasing the number of super unit cells. Finally, we measured a 2D far-field pattern for beam splitting at 6 GHz. It can be observed in Fig. 4(g) that the two reflected beams point to 31° and –32°, which correspond with the simulated results. It is noted that the designed unit cell works in a narrow band. For the vortex beams, the device can work well from 5.9 to 6.3 GHz; for beam splitting, the device works well from 5.8 to 6.4 GHz.

Fig. 4. Simulated and measured results for the smart metasurface based on speech recognition. (a) Experimental setup in the microwave chamber; (b) simulated and measured results for RCS reduction; (c) simulated and measured 2D normalized scattering patterns of the positive first-order vortex beam in the E-plane at 6 GHz; (d) simulated and measured 2D normalized scattering patterns of positive first-order vortex beam in the H-plane at 6 GHz; (e) simulated and measured 2D normalized scattering patterns of positive second-order vortex beam in the E-plane at 6 GHz; (f) simulated and measured 2D normalized scattering patterns of positive second-order vortex beam in the H-plane at 6 GHz; (g) simulated and measured 2D normalized scattering patterns of beam splitting at 6 GHz.

5. Conclusions

In this work, we proposed, designed, and verified a smart metasurface platform for programming EM manipulations based on speech recognition. The metasurface is connected externally to a voice-control system, in which the speech-recognition module recognizes voice commands, the single-chip computer calculates the voltage distributions required by the metasurface, and the DAC generates the corresponding voltage sequences. By altering the voice commands, different kinds of functions can be implemented in a noncontact manner. The proposed programmable metasurface consists of 6 × 6 super unit cells, and all 16 elements in a super unit cell are excited simultaneously. By controlling the bias voltage loaded on the varactor diodes, the super unit cells can be made to embody four different reflective states. Therefore, each super unit cell can be controlled independently. A two-bit smart metasurface is constructed, and multifarious scattering patterns can be flexibly reconfigured in the coding sequence design. In order to simplify the metasurface design, GA was applied to optimize the phase distribution for specific functions. As a proof of principle, we demonstrated three different functions: RCS reduction, vortex beam generation, and beam splitting. A prototype was also fabricated to validate the versatility of our design. Both numerical simulations and experimental measurements verify the smart performance of the voice-controlled metasurface platform. Compared with previous works, the current work provides a new strategy to control programmable metasurfaces in a noncontact and real-time manner. Potential applications of such a smart metasurface system are expected in future intelligent communications.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (2017YFA0700201, 2017YFA0700203, and 2016YFC0800401), the National Natural Science Foundation of China (61890544), the Fundamental Research Funds for the Central Universities (2242021k30040), and the 111 Project (111-2-05).

Compliance with ethics guidelines

Lin Bai, Yuan Ke Liu, Liang Xu, Zheng Zhang, Qiang Wang, Wei Xiang Jiang, Cheng-Wei Qiu, and Tie Jun Cui declare that they have no conflict of interest or financial conflicts to disclose.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.eng.2022.06.026.