《1. Introduction》

1. Introduction

The China Geodetic Coordinate System 2000 (CGCS2000) was first officially declared as the national standard coordinate system on 1 July 2008 [1]. The new frame was aligned to International Terrestrial Reference Frame (ITRF)97 at the epoch of 2000.0 (i.e., the reference epoch) and included 2600 Global Positioning System (GPS) geodetic control points for the determination of the reference frame [2] and the velocities of the GPS control points were estimated [3]. In order to obtain accurate positioning of GPS stations at the reference epoch of CGCS2000 from GPS observations, which are from different observing times, two major steps normally need to be carried out: ① observations from all global navigation satellite system (GNSS) stations are processed in the ITRF to obtain high-precision coordinates; and ② the coordinates of these stations at the observing times are corrected to the coordinates at the reference epoch 2000.0 using a plate motion model—that is, a high-accuracy plate motion model based on a linear or nonlinear velocity model for the Chinese mainland.

Before the ITRF2014, several ITRFs had been established based on a linear model that fits the velocity of geodetic reference sites [4,5]. The linear assumption is significant for regional tectonic interpretations. However, the residuals at stations with nonlinear motions can reach the magnitude of a few centimeters, especially when loading effects are neglected. Global studies mainly for sea level rise require a terrestrial reference frame (TRF) at an accuracy level of 1 mm [6]. In addition, previous studies [7–9] indicated that the amplitudes and phases of seasonal signals detected from the time series of coordinates at many GPS stations varied with time. It is not expected that these periodic signals that are presented within the station position time series will affect the parameters defined by the ITRF, especially the origin and the scale [10], although the velocities of stations with less than 2.5 years of observations might be impacted by these effects [11]. It is, however, expected that the estimation of the periodic signals will improve the accuracy of the velocity even at stations that present a linear movement trend—especially for stations with large seasonal variations in position. Another main advantage of estimating seasonal signals is that it is helpful in the detection of discontinuities in the position time series—if any—and consequently improves the offset determination.

For the first time in ITRF history, the ITRF2014, as the latest ITRF release, was generated with enhanced modeling of nonlinear station motions, including seasonal (annual and semiannual) signals of station positions and post-seismic deformation for sites that are subject to major earthquakes. The ITRF2014 has been demonstrated to be superior to past ITRF releases. However, the ITRF2014 was constructed based on the assumption that the long-term variation trend is linear and the amplitudes of seasonal signals are constant. This assumption has significant ramifications for many studies that require a TRF with a 1 mm level of accuracy. To address this issue and to further improve the accuracy of the TRF, this study investigates a nonlinear site movement model constructed with the singular spectrum analysis (SSA) method.

The outline of this paper is as follows. Section 2 introduces data processing strategies for the coordinate estimation of GNSS stations, including the determination of the criteria for the selection of GNSS reference stations, the station partition scheme, and so forth. Section 3 presents the performance of different models, including the plate motion model based on a linear velocity assumption, and the correction of station coordinates from a specific epoch to the 2000.0 epoch. More details on the plate motion model are introduced and analyzed in Section 3. Section 4 presents the procedure of the SSA method for the modeling of nonlinear site motion proposed in this study. A comparison of the SSA method with the method implemented in the ITRF2014 is also discussed in this section. Section 5 gives summary and conclusions.

《2. Data processing strategy》

2. Data processing strategy

《2.1. Selection of global reference stations》

2.1. Selection of global reference stations

For the realization of an optimal Chinese reference frame, a set of appropriate GNSS reference control stations covering China and its surrounding areas must be selected. Reference station candidates were selected from the global international GNSS service (IGS) stations located in the Eurasian plate; next, stations that met all of the following three criteria were identified and selected as the initial candidates: ① the sigma of the station’s velocity component was less than 3 mm∙a^{–1} ; ② the station was located away from plate tectonic deformation zones; and ③ the station was continuously observed over at least five years. In this research, a continuous observing span of more than ten years was adopted; more specifically, stations that were observed throughout the ten-year span from 1999 to 2009 were selected. Based on these criteria, a total of 126 stations were selected. These 126 candidates then needed to be further checked according to the following criteria: ① the velocity components of all the initial candidate stations in the same subplate were approximately in the same motion direction with the moving trend of the subplate and in the same order of magnitude; and ② all the candidate stations were geographically well distributed. The three-step selection procedure is described below.

**Step 1: **The least-squares method and two sets of velocity data were used to estimate the seven parameters for the transformation between the two sets of velocities. One set of velocities was from the NNR-NUVEL-1A model and the other set comprised measured velocities—that is, velocities estimated from GNSS data processing for all the initial candidate stations within the ITRF2005 frame [12]. Stations with velocity residuals greater than the two-sigma value were regarded as containing gross errors, and were thus deselected.

**Step 2:** Stations whose measured velocities and azimuths were much greater than their average values were identified and deselected. This step was based on the assumption that, on a rigid plate (i.e., a rigid block), the magnitudes of the velocities and the motion directions of all the candidate stations had no significant difference—that is, they slowly changed in space [13]. The procedure for this identification is as follows.

First, both the measured velocity (*V _{x}*,

*V*,

_{y}*V*) and the modelderived (e.g., NNR-NUVEL-1A model) velocity (

_{z}*V*,

_{x}*V*,

_{y}*V*)

_{z}_{model}of each candidate station in the geocentric reference system were transformed into the horizontal components in the topocentric coordinate system.

*V*,

_{x}*V*, and

_{y}*V*are velocities in

_{z}*x*,

*y*, and

*z*directions, respectively. The horizontal component vector

**= (**

*V**V*

_{e},

*V*

_{n}) (

*V*

_{e}and

*V*

_{n}are the velocities in east and north directions, respectively) can be obtained by the following equation:

where and are the latitude and longitude of the station, respectively.

Second, based on the horizontal vectors from all the candidate stations, the Euler vector (*Ω _{x}*,

*Ω*,

_{y}*Ω*) of the plate can be obtained from the least-squares estimation. The observation equation used for the Euler vector estimation is

_{z}where* r* is the distance between the station and the center of the earth.

The velocity obtained from Eq. (2) is the model-derived velocity, and the azimuth calculated from Eq. (3) is the model-derived azimuth; these are denoted by *V*_{plate_model} and , respectively. The model-derived velocity and the azimuth of the velocity were then compared with the measured velocity and azimuth of the same station. The azimuth of a velocity was defined as follows:

If the difference in either velocity or azimuth was greater than two-sigma (Eq. (4)), then the station was excluded.

where is the standard deviation (STD) value of the differences between the model-derived and measured velocities of all the candidate reference stations within the plate, and is the STD of the azimuth results.

**Step 3: **Stations that were well distributed across the plate were selected. In practice, however, the candidate stations are usually not evenly distributed as desired. Thus, only stations located in densely istributed areas with low precision would be normally excluded. In this study, the so-called hierarchical rasterization method was adopted to identify this type of station. The two principles of this method were as follows: ① The area covered by the selected stations was as large as possible; and ② the selected stations were distributed as evenly as possible.

The detailed station selection procedure included three steps: ① Each block or subplate was divided into multiple grids, according to their latitude and longitude; ② the number of the stations in each grid was counted and the mean of the numbers of the stations of all the grids was calculated; and ③ stations that either had lower precision or were located in a grid in which the number of stations was greater than the mean value obtained in step ② were identified and excluded.

After all the above criteria were checked, 92 IGS stations were finally selected from the initial 126 candidate stations, as the GNSS reference control stations.

《2.2. Partition spacing method》

2.2. Partition spacing method

Current software packages, such as GAMIT or Bernese, have a limited ability to process large amounts of GNSS data; if they are used to process data from more than 50 GNSS stations, their processing efficiency is likely to drop rapidly. Thus, partitioning the whole area into several blocks or groups is a frequently adopted solution. In addition to considering coordinate accuracy in the selection of reference control stations, as previously discussed, it is necessary to consider the length of the baselines between two reference stations. Unlike the commonly used partitioning scheme, which is simply based on the station’s geographical location, a new strategy named the partition spacing method (PSM) is proposed in this study. The procedure for the PSM includes three steps: ① dividing the Chinese continental area into grids with a resolution of 2.5°×2.5°, and counting the number of all station in each grid; ② determining the number of blocks according to the number of all stations processed and the capacity of the software; and ③ separating the densely distributed stations in each grid into different blocks. Fig. 1 provides an example in which the upper left panel indicates the distribution of all the processed stations, and the other three panels show the three blocks portioned using the PSM. It should be noted that the PSM can be used to partition any number of blocks, as desired.

《Fig. 1》

**Fig. 1. Schematic diagram for the PSM. All the processed GNSS stations shown in the upper left panel were separated into three blocks, as shown in the other three panels. Note: the use of different colors is merely to aid understanding.**

The PSM has been successfully used in daily data processing for the national GNSS Continuously Operating Reference Stations (CORS) network. 210 nationally distributed CORS were grouped into five blocks. Each block included about 43 stations that were distributed across the whole expanse of China, to resemble parallel partitioning.

In order to assess the improvement of the PSM over general partitioning methods such as the geographical partitioning method (GPM), testing results from both the PSM and the GPM were compared with the result from a non-partition scheme, which was taken as the standard. In the non-partition scheme, instead of using different blocks, all the stations participated in the solution without partitioning. Figs. 2 and 3 demonstrate the differences in the baseline relative accuracies and station coordinates resulting from the PSM and GPM. It can be seen that the baseline relative accuracy resulting from the PSM is significantly better than that of the GPM. Furthermore, the positions resulting from the PSM are more uniform and stable than those from the GPM, with coordinate accuracies of 1 and 2 mm in the horizontal and vertical directions, respectively, from PSM, and 2 and 4 mm in the horizontal and vertical directions, respectively, from GPM.

《Fig. 2》

**Fig. 2. Comparison of the baseline relative accuracies resulting from the PSM (red line) and GPM (blue line).**

《Fig. 3》

**Fig. 3. Comparison of the differences in coordinates resulting from the PSM (red line) and GPM (blue line). B, L, and H denote latitude, longitude, and geodetic height, respectively.**

《3. Plate motion model development and linear TRF maintenance》

3. Plate motion model development and linear TRF maintenance

The two ways to maintain a reference frame are: ① using an improved plate motion model to construct a linear velocity model [14,15]; and ② constructing a nonlinear site movement model. The advantage of the first way is that a plate motion model can be developed using the linear velocities at all the reference stations on the plate in order to transform the coordinates of any station into those of any specific epoch. However, because the nonlinear movement is neglected, this type of model can only achieve an accuracy of centimeters. The nonlinear model can achieve a millimeter-level accuracy; however, as a site-specific model, it cannot be expanded and used for other stations.

Various plate motion models such as NNR-NUVEL-1A, APKIM2005 [16–18], and PB2002 [19] have been developed by many scholars. However, these global models do not perform well when they are used to correct GNSS station coordinates to CGCS2000, due to a lack of sufficient China-based stations being used in the development of these models.

《3.1. Velocity field estimation and plate motion model development》

3.1. Velocity field estimation and plate motion model development

In this study, observations spanning ten years from the Crustal Movement Observation Network of China (CMONOC) were used to estimate the velocity field for China Plate Model (CPM) for CGCS2000 (CPM-CGCS2000). A total of 1720 GNSS stations were processed, including 34 CORS for national reference control, 56 GPS base stations, and 1630 regional tracking stations from local networks. The whole of the Chinese mainland was divided into four blocks in order to use the GAMIT software (version 10.35).

Abrupt motions occurred at some of the reference stations during the ten-year observation period, caused by severe earthquakes, tsunamis, and so forth. Therefore, the approach of segmenting the coordinate time series and estimating the piecewise velocities ensured the derivation of more stable and accurate plate motion models for rigid-plate bodies. This finding implies that the sudden displacements of these stations during the events were not taken into account in the velocity field calculation.

The positions of all the 1720 stations and the velocities at 1072 of these stations were estimated in the ITRF2005, constrained by the 92 reference control stations that were selected, as described in Section 2.1. The 92 reference stations served as ties with the ITRF by means of GLOBK software. Based on the coordinates and velocities of 12 surrounding IGS stations and 23 domestic reference stations in the ITRF2005, the coordinates and velocities of the Chinese national base network and regional networks in the ITRF2005 were obtained.

Earthquakes frequently occur in China. During the selected tenyear CORS observation span, at least two large earthquakes occurred: one in Kunlun Mountain in 2001, and the other in Wenchuan in 2008. In addition, a tsunami that occurred in Indonesia in 2004 had an impact on some parts of China. Based on the coordinate time series of the CORS in China, it was found that some stations were affected by these three events; for example, some sites moved and the trend of the movement velocities changed after the events. In order to obtain a reliable velocity field for the affected regions, the time series observations from these reference stations were segmented according to the dates of the events, so that data in different segments were processed separately. During the time period of these events, none of the affected CORS were used as control stations.

The results in Table 1 show that most of the differences between the velocity components measured during the pair of segments before and after an earthquake or a tsunami were in the range of 3–5 mm∙a^{–1 }(e.g., QION, XIAG, WUHN, and HAIK), and the directions of the velocity vectors resulting from the velocity components also changed. The stations in the earthquake/ tsunami-affected regions were the most likely to move during the events; thus, dividing the time series velocity (or GPS observations) into several segments based on the date/time of these events ensured that the resultant velocities would not be affected by sudden changes in the coordinates of the reference stations. Excluding the observations that fall within the periods of those events in the segments was a reasonable approach to obtain an accurate and reliable velocity field, and was used for the development of a new plate motion model, CPM-CGCS2000.

《Table 1》

**Table 1 CORS with time series segmentation.**

Tables 2 and 3 show the statistical precision of the velocity field of the 1072 reference stations (including 25 CORS, 56 base stations, and 991 regional stations in CGCS2000) in the horizontal and vertical directions, respectively. It can be seen that 99% of the precisions of the velocity fields in the horizontal directions are smaller than 0.5 mm∙a^{–1 } , and 92% of the precisions of the velocities in the vertical direction are smaller than 1.0 mm∙a^{–1 } .

《Table 2》

**Table 2 Statistical precision of the velocity fields in the horizontal directions.**

The statistics was based on total number of 1072 stations, from which 1025 stations were later used as reference samples for velocity field interpolation (deselected 47 stations with big velocity root mean square).

《Table 3》

**Table 3 Statistical precision of velocities in the vertical direction.**

《3.2. Plate motion model CPM-CGCS2000》

3.2. Plate motion model CPM-CGCS2000

The accuracy of a plate motion model depends on the following factors: ① the quantity and quality of observations and the distribution of reference stations; ② the reasonability of plate division and the accuracy of the determined plate boundaries; and ③ the reasonability of the criteria used to select the reference stations— in particular, the deselection of stations located in deforming regions in order to comply with the rigid-plate hypothesis. Ref. [20] showed that the resulting differences in the Euler pole positions when using or not using stations in deforming regions could reach tens of degrees, and the differences in velocities reached 20%–30%. The purpose of the time-series segmentation approach is to exclude stations that are in deforming regions or that have been affected by severe natural disasters, and to get the velocity which truly reflected the subplate motion.

3.2.1. Plate division in China

As one of the most complex regions in terms of geological formation history, plates in China have obtained a great deal of attention from many scientists around the world in the study of intraplate tectonics and continental dynamics [21–27]. The Chinese mainland has been divided into different tectonic units by different researchers over different periods of time, as discussed in Refs. [28–38], for example. All these divisions were reasonable within the given periods of time. However, at present, one of the most commonly used and acceptable division is the one from the subproject of the National Key Basic Research Program of China (973 Program)—the “Mechanism and prediction of strong earthquakes in the Chinese mainland”—in which active blocks in Chinese mainland and its adjacent areas are classified into two levels, Class I and Class II. Here, we only focus on 22 Class II blocks: Lhasa, Qiangtang, Bayan Har, Qaidam, Qilian, Chuan-Dian (Sichuan–Yunnan), Dianxi (western Yunnan), Diannan (southern Yunnan), Tarim, Tianshan, Junggar, Sayan, Altai, Alxa, Mongolia– China, Korea–China, Ordos, Yanshan, North China Plain, Eastern Shandong–Yellow Sea, South China, and the South China Sea. It should be noted that, among these blocks, the Sayan block was not investigated because there was no observation station in it, and the Dianxi and Diannan blocks were merged into the South west Yunnan block. Thus, the actual number of Class II blocks or subplates under study was 20.

The velocity post-fit residuals on each subplate are about 2 mm∙a^{–1} for the subplates of North China Plain, Eastern Shandong–Yellow Sea, Yanshan, Ordos, Tianshan, Korea–China, Mongolia–China, South China, Alxa, Altai, and the South China Sea; and 3–4 mm∙a^{–1} for the subplates of Qilian, Chuan-Dian, Tarim, Qaidam, Qiangtang, Bayan Har, Southwest Yunnan, and Jungger. The Lhasa subplate shows more perturbation than the others in West China; thus, it reflects less accordance motion.

The movement of each crustal plate follows Euler’s law according to the theory of plate tectonics; that is, the horizontal movement of point* j* on plate *P _{i} *can be expressed as follows:

where is the Euler vector of plate *P _{i}*

_{ }and is the position vector of the

*j*th point.

Based on Eq. (5), several plate motion models have been developed by many researchers in the last decade.

In the current study, the boundaries between two neighboring subplates were initially determined based on the division from the subproject of the National Key Basic Research Program of China, and all stations used to develop the plate motion model were projected to their associated subplates according to their coordinates. For every subplate, an initial least-squares adjustment was performed to obtain the estimated velocities of all the stations in the subplate. Based on the magnitudes and directions of all the stations’ velocities, especially those from the stations close to the initial boundaries, the edge lines of the initial boundaries could be further adjusted for the final determination of the boundaries, if reasonable. In addition, for a reliable plate motion model to be developed, the stations located in apparent deforming regions, or near plate boundaries, or in very active seismic areas were regarded as unstable stations, and were therefore removed. Moreover, the distribution of the stations affects the strength of the normal matrix for the optimal parameter estimates of the plate motion model. Thus, we selected stations that were distributed as evenly as possible and that were in stable regions. The criterion used to further identify other unstable stations was: if the O – C (observed minus calculated) velocity value of a station was larger than three-sigma of the least-squares fitting results, then the velocity of this station was considered an outlier and was thus removed. After all the abovementioned filtering procedures were performed, 17 stations were excluded from the 1025 candidate stations. The remaining 1008 stations, which were regarded as stable and which had no significant outliers, were the stations that were finally selected to be used to develop the final plate motion model with the optimal fitting process. The resulting Euler vectors of the 20 subplates and their fitting precision are listed in Table 4, from which it can be seen that the precision of the developed CPM-CGCS2000 for each of the subplates was high.

《Table 4》

**Table 4 Euler vectors and precision of the 20 subplates.**

In Table 4, and are the longitude and latitude, respectively, of the pole depicting the subplate rotation (in decimal degrees); and is the rotation rate of the subplate in degrees per million years (Ma). Furthermore, in order to assess the precision of the optimal CPM-CGCS2000, the statistical value of root mean square (RMS) for the O – C values of all the selected stations was also calculated, and is shown in column 3 of Table 4. The results show that the best, worst, and mean RMSs were 0.31, 5.02, and 1.62 mm∙a^{–1} , respectively. These results suggest that the developed CPM-CGCS2000 best fits the velocity sample data.

3.2.2. CPM-CGCS2000 model validation

The developed CPM-CGCS2000 in this research was validated by comparing the positions/coordinates of some of China’s CORS derived from the new CPM-CGCS2000 model with the known coordinates of the same stations in CGCS2000. CPM-CGCS2000 model-derived position was obtained by the following steps: ① To obtain the position/coordinates of a station in reference epoch in CGCS2000, the correction derived from the new CPM for the movement correction of the station for that time period from the observing epoch was added in the observing epoch in the ITRF2005, yielding the position of the station in the reference epoch of CGCS2000; ② the result from step ①, which was in the ITRF2005, was then transformed to the CGCS2000 frame using the commonly used seven-parameter transformation method. The formulas for these two steps are expressed by Eqs. (6) and (7), respectively. Assuming that the coordinates of station *i *in the ITRF2005 at epoch *t*_{s} is and the position of the station in the reference epoch of CGCS2000—that is, *t*_{c0}—is (still in the ITRF2005), then can be obtained by the following:

where is the velocity derived from CPM-CGCS2000 in the ITRF2005.

can then be transformed from the ITRF2005 to CGCS2000 by the following:

where *t*_{k} denotes the reference epoch of the ITRF2005; *T*_{k} , *D*_{k} , and *R*_{k} are the transformation parameters of the translation vector, scale factor, and rotation matrix, respectively, from the ITRF2005 to CGCS2000; are their rates; and is the transformed coordinates in CGCS2000.

The accuracy of CPM-CGCS2000 was assessed or validated by the differences between the coordinates of selected stations obtained from Eq. (7) and the known coordinates of these stations in CGCS2000. In this study, 28 CORS had known coordinates in CGCS2000; thus, they were used for the validation. Fig. 4 shows the position accuracies of the 28 stations in the horizontal directions (where B and L denote latitude and longitude, respectively), because CPM-CGCS2000 only contained velocity components in these two directions.

《Fig. 4》

**Fig. 4. Accuracy of the plane coordinates of the 28 CORS.**

Fig. 4 shows that most of the accuracy values were in the range of ±2 cm, which was reasonable compared with the coordinate precision of ±3 cm for CGCS2000. The worst accuracies were from stations LHAS, XIAG, QION, KMIN, and XIAA; this was because these stations are located in the subplates where either an earthquake or a tsunami occurred. The sudden movements or changes of these positions caused by the events were not taken into account in CPM-CGCS2000; that is, the displacements of these stations during the period of the events were not compensated for or accumulated in the results obtained from Eq. (6). In fact, only seamless accumulation of the movements of these stations could be used to obtain the precise position of these stations. However, the segmentation processing adopted in this study left a gap in the time series due to the events. Therefore, the worse accuracies of the four stations could not be used to indicate the real accuracy of CPM-CGCS2000. In addition, LHAS is located in the subplate that is squeezed by the Indian and Bangladesh plates. It can be considered to be located in a local deforming area. Thus, there must have been a significant discrepancy between the real velocity of this station and the fitted velocity that modeled the rigid body of the subplate. The accuracies of the remaining stations were better than ±3 cm. Based on these results, we can draw the conclusion that the newly developed CPM-CGCS2000 can achieve a centimeter-level accuracy.

《4. Nonlinear movement modeling》

4. Nonlinear movement modeling

It is important to construct a non-secular model to depict the real geophysical movement of GPS sites from their time series if the data contain offsets, non-secular trends, and periodic components with time-varying amplitudes. Previous research has focused on improving the accuracy of linear velocity estimation, rather than using non-secular models to characterize the real movement of GPS sites. Even though the latest ITRF2014 [39]— which was generated with an enhanced modeling of nonlinear station motions—fitted the site time series by including constant trends and periodic components with constant amplitudes, it still cannot reflect the real movement of GPS sites accurately. In this study, the SSA method for nonlinear site motion modeling was introduced and compared with the method implemented in the ITRF2014.

SSA is a data-driven technique that uses time domain data to extract information from noisy time series without the use of prior knowledge of physical phenomena in the time series [40–44]. In recent years, SSA has also been used in geodetic study; one of its latest applications is the modeling of seasonal signals from GPS time series [9]. The advantage of SSA over sinusoid function for fitting nonlinear movement is discussed below.

Figs. 5–7 show the fitting results of coordinate time series from a sinusoidal function and SSA. In Fig. 5, the red curve indicates the original coordinate time series in the up (U) direction, and the blue line is the fitting curve with a linear trend and sinusoidal function. From the fitting residuals, it can be seen that the residuals still contain some oscillating signals. Fig. 6 shows the result from SSA, which indicates that SSA accurately captured the seasonal variations in the coordinate time series; thus, the residuals did not contain noticeable unmodeled oscillating signals.

《Fig. 5》

**Fig. 5. (a) Original coordinate time series (red) and fitting curve (blue) with sinusoidal function and (b) its residuals (green line) in up (U) direction.**

《Fig. 6》

**Fig. 6. (a) Original time series and its trend; (b) detrend time series and its fitting curve (red line) with SSA method; (c) residuals.**

A special use of SSA on the Tsukuba (Japan) site can be seen in Fig. 7(b). Along the trajectory (raw data in blue), there is a big offset of about 60 cm in the year 2012. Thus, Fig. 7(a) shows that, although the ITRF2014 improves the accuracy of the frame, it does not solve all the problems of the nonlinear movement modeling. Fig. 7(b) is the result from our analysis with SSA; after moving the offset, fitting the time series, and then adding the offset back, we obtained this result. Considering the earthquake, two pieces in the east (E) direction were divided and modeled separately. In the U direction, however, even though a big jump occurred during the earthquake, the SSA method still matched well with the original time series, indicating its powerful self-adapting fitting ability.

《Fig. 7》

**Fig. 7. Trajectory of Tsukuba (Japan) site with fitting result from (a) the ITRF2014 [39] (blue: raw data; green: piecewise linear trajectories given by the ITRF2014 coordinates; red: adding the parametric post-seismic deformation (PSD) model) and (b) SSA method (blue: raw data; red: SSA modelling result).**

The SSA procedure mainly consists of two steps: decomposition and reconstruction. More details can be found in Ref. [45], Appendix A. In addition, an enhanced method known as SSA with pseudo data (SSA-PD) to address the phase shift issue of SSA is detailed in Ref. [45], Appendix B1, and another method named SSA for prediction (SSA-P) for predicting the coordinates of GPS sites at any specific time is detailed in Ref. [45], Appendix B2.

《4.1. SSA for interpolation and error detection》

4.1. SSA for interpolation and error detection

SSA is applicable for uniformly sampled time series. However, in practice, gaps and gross errors or jumps often exist in an original coordinate time series of a GPS station. Thus, interpolation for missing data and gross data detection are needed before nonlinear movement modeling is performed.

In this study, the SSA for missing data (SSAM) interpolation method was adopted and a new method named SSAM-IQR was used for gross error detection by combining SSAM and the interquartile range (IQR) criterion [43]. Due to the restriction on the length of this paper, these two methods are not discussed here, as the theory can be found in two studies [9,44]. The main advantage of using SSAM combined with SSAM-IQR is that no prior knowledge of the time series is needed. The SSAM model was used to obtain the values in a gap; for a detailed description of SSAM, refer to Refs. [9,46]. The performance of SSAM was assessed using weekly GPS station time series in the period of 2000–2009 from 31 China national GPS stations. For the convenience of later analysis, as shown in Table 5, the site position time series listed here are grouped according to the division of the Chinese mainland subplates. The interpolation and gross-error detection results are shown in Fig. 8 and summarized in Table 6. The units of the horizontal axis in all of the following figures are the time year.

《Table 5》

**Table 5 Stations in each subplate.**

《Table 6》

**Table 6 Statistics of missing rate, gross error rate, and mean and STD of residuals.**

The statistics of the missing rate, gross error rate, mean of the residuals, and STD of the residuals of these stations resulting from SSAM are summarized in Table 6.

Fig. 8 shows that all the gross errors were correctly detected by the SSAM-IQR. The statistic results of the residuals fitted with all of the China national GPS station time series with the SSAM model, listed in Table 6, show that the interpolation accuracies in the horizontal and vertical directions were mostly better than 5 mm and 1 cm, respectively, which is very high. Even the biggest missing rate of 30%, which corresponds to missing data for more than 200 days, and which occurred at the YONG station, was correctly interpolated. Here, we only show the interpolation effects at stations with more than 5% missing data.

《Fig. 8》

**Fig. 8. Interpolation and gross error detection results for China national sites coordinate time series in the east (E), north (N), and up (U) directions in (a) the South China Sea subplate, (b) the Korea–China subplate, (c) the South China subplate, (d) the Lhasa subplate, (e) the Eastern Shandong–Yellow Sea subplate, and (f) the Tarim subplate.**

《4.2. Nonlinear movement modeling with SSA》

4.2. Nonlinear movement modeling with SSA

After interpolation for missing data and gross error detection, all position time series were modeled with SSA-PD; the modeling results are shown in Fig. 9, and the residuals are summarized in Table 6. As our main focus was on the expansion ability of nonlinear movement modeling with SSA, hereafter, the subplates in which only one station is located, such as Junggar and the Yanshan subplates, were not taken into consideration.

As shown in Fig. 9, the original signals and modeled signals at most stations agreed well, and their differences—which were in fact the accuracies of the model results—were better than ±3, ±2, and ±5 mm in the E, N, and U directions, respectively. It is worth mentioning that the LHAS station, which apparently had a big jump of about 6 cm in the U direction, had a ±5.8 mm accuracy, which again indicates the powerful self-adapting ability of SSA.

《Fig. 9》

**Fig. 9. Modeling results from the SSA-PD for China national sites. Modeling results for station nonlinear movements in the E, N, and U directions in (a) the North China Plain subplate, (b) the Qaidam subplate, (c) the South China subplate, (d) the Mongolia–China subplate, (e) the Chuan-Dian subplate, (f) the Lhasa subplate, (g) the Eastern Shandong–Yellow Sea subplate, (h) the Tarim subplate, (i) the Ordos subplate, (j) the South China Sea subplate, and (k) the Korea–China subplate.**

Fig. 9 also shows that similar trends and oscillation terms exist at the sites located in the same subplate, especially in the horizontal directions. Thus, it is possible for us to model the nonlinear movement at sites with a long period of time series and expand the nonlinear movement to the sites with a short period of observations, and then transform its position from the current epoch to any epoch needed.

In order to study the geophysical accordance relation among the stations in the same subplate, the South China subplate, which included five stations, was taken as an example. Distances between two stations are shown in Table 7. As the vertical movement was far more complicated than the horizontal movement due to geophysical factors, and as vertical accuracy improvement is more significant, our experiments mainly focus on the U component hereafter.

《Table 7》

**Table 7 Distances between two stations in the South China subplate.**

Firstly, the detrended oscillatory components of the time series in the U direction at all sites in the South China subplate were modeled using SSA; they are demonstrated in Fig. 10 in different colors. Secondly, the segment of the time series that met the requirements was chosen. Fig. 10 shows that the amplitudes and phases are approximately in accordance with each other, except for a big peak near the epoch 2001.882 at the WUHN station, and near the epoch 2007.921 at the GUAN station. Fig. 9(c) shows that a big jump occurred in the U direction at the WUHN station near the year 2002, and a little abrupt variation occurred with 1 cm in the U direction in comparison with its forward and backward peaks at epoch 2007.921 at the GUAN station. Thus, the time series spanning from 2002.875 to 2007.384 was used as the test data; then, the value of the nonlinear movement at one station was used to replace by that from another station, and the accuracy of the replaced one was assessed. The experimental results from some baselines with various lengths ranging from several hundred kilometers to thousands of kilometers in the same subplate are shown in Fig. 11.

《Fig. 10》

**Fig. 10. Oscillatory components in the U direction at the sites in the South China subplate.**

《Fig. 11》

**Fig. 11. The matching degrees of the oscillatory components in the U direction at several pairs of sites with various distances. (a) XIAM–GUAN and HAIK–GUAN; (b) WUHN– HAIK and XIAM–LUZH; (c) LUZH–GUAN and LUZH–HAIK; (d) WUHN–GUAN and WUHN–LUZH.**

The time series of two stations with a distance of under 500 km had approximately the same amplitude and phase. If the distance between two stations was greater, both the amplitude and the phase drift at the two stations were not synchronized. It seems that only the nonlinear movement at two stations within a 500 km distance can be replaced in this way.

《4.3. SSA for the prediction of GPS station coordinates》

4.3. SSA for the prediction of GPS station coordinates

This study proposes another expanded method called SSA-P for predicting the coordinates of GPS sites at any specific time. Its performance was assessed using ten-year GPS time series of weekly coordinates of 32 national GPS sites. The first 468 weekly samples were used to reconstruct the signals; the reconstructed model was then used to predict the next 52 weekly samples to validate the model’s prediction results.

Apparently, the lower the residuals of the nonlinear movement modeling, the higher the accuracy of the nonlinear movement prediction, if the site movements are stable. This characteristic can be seen in the horizontal direction, as vertical movements are much more complicated due to co-seismic and post-seismic deformations, global geophysical fluid dynamics, and so forth. It was found to be typical that the characteristic of the modeling based on earlier years could not be expanded forward to following years or had poor accuracy; thus, the accuracy of the modeling for the vertical component was usually poorer than that of the horizontal components, even though the accuracy of the modeling in the vertical direction was better than ±5 mm.

In general, the results shown in Fig. 12 indicate insignificant differences between the predicted signals and the real signals; the accuracy of the predicted results is quite stable, with an accuracy of about ±3 mm at most stations in N and E directions (Table 6).

《Fig. 12》

**Fig. 12. Prediction resulting from SSA-P for China national sites (blue: raw data; green: prediction result). Prediction results for station nonlinear movements in the E, N, and U directions in (a) the North China Plain subplate, (b) the Qaidam subplate, (c) the South China subplate, (d) the Mongolia–China subplate, (e) the Yanshan subplate, (f) the Chuan-Dian subplate, (g) the Lhasa subplate, (h) the Eastern Shandong–Yellow Sea subplate, (i) the South China Sea subplate, (j) the Junggar subplate, (k) the Tarim subplate, and (l) the Ordos subplate.**

The results for the performance assessment of SSA-P indicate that it is possible to obtain a high accuracy even when the noise level is larger than the oscillatory components.

The accuracy of the SSA-P used to predict the coordinates of the GPS sites in the horizontal and vertical directions was better than ±5 mm and ±1 cm, respectively, for most GPS sites, even though annual and semiannual signals in amplitude exist in the time series. However, the prediction results for the stations in the Lhasa and Chuan-Dian plates were poor, compared with those for the stations in other plates, resulting in the same conclusion as the fitted error from the plates.

《5. Conclusions》

5. Conclusions

To realize an optimal dynamic geodetic reference frame in China, several new strategies or methods were proposed and applied in the data processing of China national GNSS stations and China reference frame maintenance. These new approaches were tested using over ten years of observations from 1720 stations. The new approaches and their results are summarized below:

(1) As a supplement of the ITRF reference station selection rules, the supervised clustering method was applied to the selection of national reference control stations for the construction of the CGCS2000 frame, and the resultant precision of the velocities at 1072 national reference stations in CGCS2000 was significantly improved. Furthermore, the accuracies of these velocities were significantly reduced in comparison with those obtained without applying this rule: The RMSs of the velocities in the *x*, *y*, and *z* directions from 0.92, 0.72, and 0.97 mm∙a^{–1} [3] were reduced to 0.19, 0.45, and 0.32 mm∙a^{–1} , respectively (transformed from 0.124, 0.127, and 0.563 mm∙a^{–1} in N, E, and U directions respectively).

(2) To overcome the problem of current software, which has a limited capacity for processing large amounts of GNSS data, a new method called PSM was proposed. The baseline relative accuracy resulting from the PSM significantly outperformed that of the commonly used GPM, which is simply based on the geographic location of GNSS stations. In addition, the positions resulting from the PSM were more uniform and stable than those resulting from the GPM, with 1 and 2 mm position accuracies in the horizontal and vertical directions, respectively.

(3) Regarding CGCS2000 maintenance, the best-fit domestic plate motion model CPM-CGCS2000 was obtained from the leastsquares estimation with the mean precisions of ±0.127, ±0.124, and ±0.563 mm∙a^{–1} in the E, N, and U directions, respectively. The precision results from CPM-CGCS2000 were better than 1 mm∙a^{–1} in most parts of China, with best, worst, and mean values of ±0.31, ±5.02, and ±1.62 mm∙a^{–1} , respectively.

(4) For CGCS2000 maintenance at a millimeter-level accuracy, data-driven and self-adapted SSA was applied. In addition, several extended methods based on SSA were applied, including SSAM for interpolation, SSAM-IQR for gross error detection, SSA-PD for GPS station nonlinear movement modeling, and SSA-P for the coordinate prediction of GPS station nonlinear movements. The accuracies of SSA-PD and SSA-P for modeling and predicting position nonlinear movements in both the horizontal and vertical directions were very high, with accuracies better than ±3, ±2, and ±5 mm in the E, N, and U directions, respectively, from SSA-PD, and better than 5 mm and 1 cm for the horizontal and vertical directions, respectively, from SSA-P.

It is notable that, in the release of the ITRF2014 to date, nonlinear movement has been considered in the frame construction, although the previous frame that is based on linear velocity is still maintained with a centimeter-level accuracy, thus cannot meet the millimeter-level accuracy that is required by some frame maintenance, such as the global geodetic observing system (GGOS). The enhanced SSA has proven to be a very powerful tool for analyzing and extracting the non-secular trend and periodic components from a noisy time series, and has greatly improved the accuracy of the CGCS2000 frame coordinate estimation and prediction. However, a shortcoming of the SSA-P for coordinate prediction is that it does not provide parameters based on which the coordinate of a station can be easily obtained. Fortunately, the coordinate of a reference station in any epoch can be easily obtained from the coordinate time series and the SSA method, and the accuracy of SSA method is much higher since it takes into account the nonsecular and periodic variation of the site movement.

《Acknowledgements》

Acknowledgements

This study is supported by the National Key Research and Development Program of China (2016YFB0501405) and Natural Resources Innovation Platform Construction and Capacity Improvement (A19090) and The Fundamental Research Funds for Chinese Academy of Surveying and Mapping (AR1903 and AR2005).

《Compliance with ethics guidelines》

Compliance with ethics guidelines

Pengfei Cheng, Yingyan Cheng, Xiaoming Wang, Suqin Wu, and Yantian Xu declare that they have no conflict of interest or financial conflicts to disclose.