1. Introduction

Amid China's rapid industrialization and urbanization, the growth of population, manufacturing, and traffic in its cities is becoming increasingly intense and complex, leading to a variety of urban diseases such as rapid population growth, traffic jams, environmental deterioration, housing shortages, employment problems, and public safety challenges. This is only a short list of the side effects of urbanization; Chinese policymakers face a host of other, less prominent policy problems. All of these factors have become serious constraints upon the healthy and sustainable development of China’s urban ecosystems [1]. On the one hand, the urban ecosystem, which is composed of urban infrastructure and the diverse social environment of urban residents, is becoming more intricate and larger in scale by the day; on the other hand, decision makers and administrators are not fully aware of this complexity and consequently struggle to manage the ecosystem efficiently. Modern cities have been upgraded from dual spaces to ternary spaces. The first-dimensional space is a physical space made up solely of a physical environment with all its resources in a natural state. The second-dimensional space is a human society space shaped and sustained by the culture, norms, and social interactions of urban residents. The third-dimensional space, unlike the previous two, is a cyber space: an informationized domain comprising computers, internet access, and the data flowing through these systems [2]. This new structural concept of urban life calls for new philosophies, theories, and practices for analyzing the structures, economic development, and governance problems of cities in a meaningful and systematic way, in order to better understand the direction of the new urban landscape in the context of this technological revolution.

The advent of the “intelligent city” and “big data” provides new potential platforms for resolving various “urban diseases.” Intelligent cities provide an artificial nervous system built upon the blueprints of traditional urban ecosystems, while urban big data describes the physical domain of real objects (buildings, cars, roads, and so on) and the social domain of urban residents in real space, all of which have a virtual form that reflects their real forms in the first-dimensional and second-dimensional spaces. These virtual forms constitute the third-dimensional space: urban big data. Guojie Li, a member of the Chinese Academy of Engineering (CAE), believes that the role of big data is similar to the “Honeybee Model” of the value of honey bees: The bees improve the agricultural output of farms through the pollination of crop plants rather than through the honey they produce [3]. Just like the human and non-human resources of cities, urban big data has become an important strategic resource for the development and strategic direction of intelligent cities. As a city evolves toward informatization and intelligence, numerous information bases and data centers have emerged, which should be properly interconnected to form urban big data. Urban big data can be converged, analyzed, and mined in depth via the Internet of Things, cloud computing, and artificial intelligence technology to help people understand the forces guiding the development of each layer of the urban system, and to assist the government and society with decision-making and urban planning in order to achieve the goal of intelligent administration of the city. Meanwhile, urban big data will bring about profound changes to the operating mode of various urban sectors, and speed the transformation and upgrading of traditional industries as well as the development of emerging industries. Thus, urban big data is speeding up the development of city intelligence.
Specifically, city intelligence is implemented through the combined development of the Internet, Internet of Things, telecommunication networks, radio and television networks, and wireless broadband networks; it is characterized mainly by thorough integration of information technologies and comprehensive applications of big data; it focuses on intelligent technologies, intelligent industries, intelligent services, intelligent administration, and intelligent life; lastly, this transformation is committed to building a new form of urban development that is capable of self-correction and solving critical social, economic, and ecological problems in a more automated and timely fashion [2].

2. Overview of urban big data

2.1. Definition and features of urban big data

Urban big data is the massive amount of dynamic and static data generated by the subjects and objects of a city, including its facilities, organizations, and individuals, and collected and collated by city governments, public institutions, enterprises, and individuals using a new generation of information technologies. Urban big data can be shared, integrated, analyzed, and mined to give people a deeper understanding of the status of urban operations and to help them make better-informed, more scientific decisions on urban administration, thereby optimizing the allocation of urban resources, reducing the operating costs of the urban system, and promoting the safe, efficient, green, harmonious, and intelligent development of cities as a whole. In addition to the general features of big data (e.g., volume, velocity, variety, veracity, and value), urban big data has the following additional features:

(1) Hierarchy: For example, electronic medical records are categorized by hospital or region, while medical images are categorized in terms of individual medical devices and hospitals. Meanwhile, health data can be categorized in terms of individuals, hospitalized patients, communities, or health and anti-epidemic authorities. The hierarchy of urban big data deeply reflects the organizational hierarchy of a city's physical and social systems.

(2) Integrity: As an urban system evolves, the data coverage of each subsystem becomes increasingly broad. In recent years, for example, the inclusion of environmental protection data in China's cities has improved rapidly. Due to the rapid improvement of data integrity, urban big data has acquired the capacity to uncover the overall dynamics of urban development increasingly accurately.

(3) Correlation: Different types of urban data are highly correlated with each other. For example, information about urban logistics is included not only in the data of logistics enterprises but also in the data of the manufacturing, commercial, and transport industries, and even in that of the financial industry. Such correlations can be used not only for mutual corroboration, but also for cooperative reasoning and for mining the rules of city operations.
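As a hedged illustration of mutual corroboration, the sketch below computes the Pearson correlation between two hypothetical monthly series: freight volume reported by logistics enterprises and cargo traffic recorded by the transport sector. The series names, the values, and the 0.9 threshold are assumptions made for illustration; they do not come from any real dataset or from the cited literature.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly figures (illustrative values only).
logistics_freight = [120, 135, 150, 160, 172, 190]
transport_cargo = [118, 140, 149, 158, 175, 188]

r = pearson(logistics_freight, transport_cargo)
# Assumed rule of thumb: treat the two sources as corroborating above 0.9.
corroborated = r > 0.9
```

A high coefficient suggests that the two independently collected sources describe the same underlying activity; a low one would flag the pair for a data quality review.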

Due to these general and unique features, urban big data must be processed using a new data processing technology: targeted data extraction based on a target-driven method [4]. The entire data process, from data acquisition to data processing to data modeling, is automated [5,6]. The process is as follows. The first step is acquiring and storing the original data: patterns are extracted and the data obtained from the desired data source are filtered according to the target requirements; the acquired data are then cleaned and preprocessed (i.e., data filling, data optimization, data merging, data normalization, data consistency checks, and preliminary organization of diverse data attributes) to establish the dataset to be processed. The second step is processing and analyzing the dataset (including linear analysis, nonlinear analysis, factor analysis, sequential analysis, linear regression, variable curve analysis, and bivariate statistics), and then categorizing the data and analyzing the inter-data and inter-category relationships via support vector machines (SVMs), Naïve Bayes, random forests, and logistic regression. The third step is identifying the inherent relationships among the categorized data and uncovering further patterns, rules, and knowledge via artificial neural networks, genetic algorithms, and cross-media algorithms. Finally, the relationships among the variables are presented in an interactive and visual way to convey a deeper understanding of the results.
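The first two steps can be sketched in a few lines, assuming mean imputation for "data filling," min-max scaling for "data normalization," and Gaussian Naïve Bayes as the classifier. These specific choices, the function names, and the road-segment records are illustrative assumptions; the cited works do not prescribe a particular implementation.

```python
import math
from collections import defaultdict

def fill_missing(rows):
    """Data filling: replace None with the column mean of observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def normalize(rows):
    """Data normalization: min-max scale each column to [0, 1]."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [[(r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
             for j in range(n_cols)] for r in rows]

def fit_naive_bayes(rows, labels):
    """Estimate per-class priors, feature means, and feature variances."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n_cols = len(members[0])
        means = [sum(m[j] for m in members) / len(members) for j in range(n_cols)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((m[j] - means[j]) ** 2 for m in members) / len(members) + 1e-6
                     for j in range(n_cols)]
        model[label] = (len(members) / len(rows), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest Gaussian log-posterior."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for x, mu, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(*model[label]))

# Hypothetical road-segment records: [vehicles per hour, average speed in km/h].
raw = [[1800, 15], [1700, None], [1900, 20], [400, 55], [350, 60], [None, 58]]
labels = ["congested", "congested", "congested", "free", "free", "free"]

clean = normalize(fill_missing(raw))
model = fit_naive_bayes(clean, labels)
prediction = predict(model, clean[0])
```

In practice a library implementation would replace the hand-written classifier; the point of the sketch is only the order of operations: fill, normalize, then categorize.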

2.2. Categorization of urban big data

Urban big data describes the real-time status of various urban elements, including buildings, streets, pipelines, environments, enterprises, finance, commerce, products, markets, logistics, medicine, culture, education, traffic, public order, and population. As proposed in Ref. [7], urban big data can be categorized into five types: sensor data on urban infrastructure and moving objects, user data on society and humans, governmental administration data, customer and transaction record data, and arts and humanities data. Table 1 lists examples and user groups for the five types.

Urban big data can be categorized in more than one way. In essence, categorization works well only for data with a tree structure, whereas urban data is organized in a mesh structure. Therefore, urban big data should be categorized according to the data processing method used and the application objective. The big data on China’s cities is typically categorized using the three methods below.

Table 1

Examples and user groups of urban big data for five types.

(1) Supply side of urban functions: Urban big data is categorized in terms of the urban administration systems—that is, the clustering systems of existing urban hierarchy data. This categorization method promotes organizational development.

(2) Demand side of municipal services: Urban big data is categorized in terms of the stakeholders (e.g., residents, enterprises, non-profit institutions, and governmental organs). Urban big data can be further categorized, thus deriving various urban application service systems. This categorization method serves to promote applications.

(3) According to the reason for urban data generation: For example, urban big data may be categorized into sensor data based on the urban physical system, data from the economic activities of urban actors, data on the social activities of urban individuals and organizations, data on the scientific and educational activities of urban populations and actors, and data on urban life.
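Because a single dataset can belong to several categories at once, one way to represent the three categorization methods is to tag each dataset along three independent facets rather than placing it in a single tree. The following is a minimal sketch under that assumption; the class, the field names, and the catalog entries are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Dataset:
    name: str
    supply_side: str    # method (1): owning urban administration system
    demand_side: tuple  # method (2): stakeholder groups served
    origin: str         # method (3): reason the data was generated

# Hypothetical catalog entries, for illustration only.
CATALOG = [
    Dataset("bus_gps_traces", "transport", ("residents", "government"), "sensor"),
    Dataset("business_licenses", "market_supervision", ("enterprises",), "economic"),
    Dataset("school_enrollment", "education", ("residents", "government"),
            "scientific_educational"),
]

def build_index(catalog):
    """Index every dataset under each facet value it carries."""
    index = defaultdict(set)
    for ds in catalog:
        index[("supply_side", ds.supply_side)].add(ds.name)
        index[("origin", ds.origin)].add(ds.name)
        for stakeholder in ds.demand_side:
            index[("demand_side", stakeholder)].add(ds.name)
    return index

index = build_index(CATALOG)
for_residents = sorted(index[("demand_side", "residents")])
```

The same dataset is reachable from any of the three views, which is what a mesh-structured catalog needs and a single tree cannot provide.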

2.3. Applications of urban big data in urban development

The advent of urban big data provides not only a new approach to the in-depth study of urban operations and development [8-11], but also a new opportunity to renew the competitive advantage of cities [12]. In the context of rapid global informatization, big data has become a vital strategic resource for every city. Strengthening urban competitiveness requires that every city make full use of its advantages in scale, quality, and applications to tap into and unleash the potential value of data resources and to improve the socioeconomic benefits of big data. Meanwhile, big data has also become a new driving force of urban economic transformation [13]. Specifically, big data plays an important role in the following: ① promoting web-based sharing, intensive integration, and collaborative utilization of production factors; ② facilitating innovation in the business and circulation modes for production materials, technologies, human resources, and funds; and ③ improving enterprises’ core value and strengths. In addition, the use of big data constantly gives birth to new business patterns and new economic growth points. Last, but not least, big data provides a new way to improve the administrative capacity of governments [14]. Big data can reveal latent relationships among seemingly unrelated pieces of knowledge that lie beyond the reach of traditional methods, transform such information into new knowledge, and support the diagnosis and evaluation of urban development through qualitative and quantitative analysis. Accordingly, big data helps governments improve their data-driven decision-making ability and provides a new means of solving complex social problems.

In essence, using urban big data to explore the urban mode and the urbanization process means analyzing, visualizing, and understanding urban big data and interpreting both structured and unstructured data. The goals are to enhance the dynamic management of urban resources, knowledge creation, in-depth analysis of the urban mode and urbanization process, the effective participation of urban residents, reasonable urban planning, and the scientific analysis of urban policies. Fig. 1 shows the relationships among the objectives, methods, and applications of urban big data.

3. Urban big data promoting city intelligence

Originally proposed by IBM in 2008, the “smart city” concept focuses on measurement, interconnection, and intelligence [15], aiming to apply specific information technology (IT) systems to the urban administration toolbox. This concept is suitable for the developed countries of Europe and North America, where urbanization, industrialization, and agricultural modernization have already been accomplished. Smart cities in the developed countries are developing mainly in the fields of government administration and intelligent services [16,17]. However, the mayors of China’s cities must perform far more administrative functions and are responsible for more matters than the mayors of cities in developed countries. In addition, China is at the height of its industrialization, informatization, and urbanization, and faces a range of puzzles and problems unique in both nature and number. As a result, China is following a city intelligence development path different from that of developed countries, as shown in Fig. 2. The development roadmap for smart cities in developed countries is therefore not an appropriate solution for the diverse problems being encountered in China’s urban development process.

Fig. 1 Relationships among the objectives, methods, and applications of urban big data.

Promoting city intelligence is the process of developing intelligent cities: developing the urban ternary space (comprising urban physical facilities, human society, and urban data) with a scientific approach based on the intelligence consolidated from citizens, enterprises, and governments. The key is to allocate comprehensive urban resources skillfully and reasonably; optimize urban economic development, construction, and administration; continually improve urban development and citizens’ lives; and satisfy citizens’ current and future needs more effectively [2]. The development of city intelligence proceeds from decentralization to centralization and from the surface to the depths. An appropriate starting point for building an intelligent city is implementing an intelligent application system based on existing urban data. The subsequent tasks are enhancing the automated datafication of the physical urban infrastructure and gradually integrating and sharing data, together with innovation in the applications of urban big data. The aim is to promote the deep development of macro decision-making and micro services and to promote industrial upgrading. Therefore, the applications of urban big data serve to advance city intelligence from a local level to a systematic and global level and to produce city intelligence suited to users’ economic, social, and ecological needs. As shown in Fig. 3, the development model for city intelligence and urban big data comprises five parts: an infrastructural support system, an application system, an industrial system, an index system, and an operation assurance system.

Fig. 2 Comparison between China and developed countries of Europe and North America in the development of informatization, urbanization, and industrialization.

The development process occurs as follows: ① Massive amounts of structured and unstructured data are generated from urban sources, and consolidated into a unified urban data platform, thus generating an urban foundational and comprehensive database; ② by correlating, integrating, cleaning, processing, analyzing, mining, and visualizing the massive amounts of data, valuable information is obtained that can reflect the more objective course of events in cities to satisfy the needs of governmental affairs, commerce, and urban administration and improve the capacities for decision-making, knowledge discovery, and process optimization; ③ the transformation and upgrading of other industries are promoted alongside the development of the big data industry (e.g., data acquisition, data analysis, and data exchange), and the development of urban informatization and intelligence is quickened; and ④ indexes suitable for measuring the development effect and level of urban big data are created, and a big data operation assurance mechanism is developed to ensure the stable and reliable operation of the urban big data service architecture.

Fig. 3 Development model for city intelligence and urban big data.

3.1. Urban big data is the cornerstone of and core element in city intelligence

Urban big data is derived from interchanging and integrating the data generated during the operations of numerous physical facilities and human activities in a city. Using appropriate processing and analysis technologies, this data can be used to extrapolate various complex relationships in the operating status of physical facilities, trends in industrial and economic developments, and the status, relationships, and rules of citizen health, education, science and technology, and culture. Therefore, urban big data not only lays down basic information for understanding a whole city, but also plays a core role in promoting city intelligence. Fig. 4 shows the infrastructural support model for urban big data. The sensing layer detects and acquires urban data via the Internet of Things. The network layer focuses on unified network construction and information convergence. The data layer collates the large amount of data generated by the Internet of Things and information systems, thus generating an urban public database. The platform layer mainly comprises various cloud computing facilities, a public information platform, and a big data analysis and processing platform.

Fig. 4 Infrastructural support model for urban big data.

The key to advancing the many urban interactions and the sustainable development of the ternary world that comprises cyber space, physical space, and human space (CPH) is to use big data approaches and technologies to facilitate the informatization of the urban physical world, with the following effects: breaking through information barriers; fostering open access to flows of data, capital, and human resources; scientifically automating and regulating urban physical facilities; promoting a reasonable and efficient circulation of people, material, capital, and technology; helping to resolve the contradictions between urban development and environments, resources, and space; and building new-style future cities that meet the needs of citizens, enterprises, and governments.

3.2. Urban big data as an inexhaustible resource and intelligence for the sustainable development of city intelligence

The use of big data is speeding the integration between information technologies and diverse industrial sectors, giving birth to new forms of business and further opening up the development space of the IT industry. In its development process, city intelligence needs to be continuously improved by reactivating existing data resources and making full use of newly generated big data. Urban big data plays an important role in revealing the essential nature of typical urban diseases, analyzing complex systems, conducting empirical urban research, and sensing cities collaboratively. Using urban big data is also an important starting point for solving various urban problems. In urban planning, urban big data provides powerful decision-making support by mining information on the natural status of cities (including geography, meteorology, and environments) and on their societal status (including economy, society, culture, and population), making urban administration more scientific and more forward-looking. The applications in urban planning focus on infrastructure construction, traffic administration, public facilities, and public security [18,19]. City intelligence enables data sharing; the databases of different governmental departments can be interconnected and made interoperable efficiently. Thus, inter-departmental collaboration and governmental efficiency are significantly improved, while the administrative costs of governance are reduced [20]. Big data lays the foundation for the prospective “smart” life and allows people to create personalized archives and manage their daily affairs, personal health, routines, shopping, and travel intelligently. Furthermore, using big data strengthens the link between public services and personal lives and provides a variety of applications (e.g., information inquiry, content delivery, and mobile payment) for medicine and health, education and training, traffic, and safety.
In this sense, big data will turn people’s simple and planar lives into multi-dimensional ones [21,22] and make city intelligence serve people’s livelihoods. Urban big data is used extensively to research and solve diverse urban problems, rapidly becoming a vital bridge and means of promoting the development of city intelligence in China.

3.3. Existing problems for developing urban big data in China

Through the initial stages of “digital city” and “smart city” projects, most Chinese cities have constructed an information network infrastructure, governmental information integration and sharing platforms, and intelligent application systems. They have also developed an information economy and accumulated a large quantity of urban data resources, thus laying a firm foundation for the further development of city intelligence. However, the rapid development of urban big data also faces various challenges, including: ① the existing operating mechanisms present certain difficulties for data integration and sharing; ② no standard laws or regulatory systems governing information security and sharing have been promulgated; ③ technological innovation lags behind the development of big data; ④ the ambiguous business model affects the sustainable development of big data; and ⑤ the shortage of qualified human resources constrains the development of big data.

4. Key points of urban big data development

China actively encourages the development of big data and the construction of intelligent cities. Local governments attach great importance to the development of an information economy. In this context, it is necessary to correctly handle the relationship between decentralization and centralization during the development of urban big data and to combine several priorities closely: overall planning, cooperative construction of the systems, and a balance between technology push and demand pull. Accordingly, the key to the development of city intelligence is enhancing the top-level design for urban big data and defining the key points of urban big data development. In light of China’s development model for city intelligence, this study proposes key points for the development of city intelligence from four perspectives: infrastructural support, urban governance, urban services, and economic development (Fig. 5).

4.1. Unifying the infrastructural support system of big data

A city should build an Internet of Things platform for urban public facilities. The technological architecture of this platform is characterized by an integrated service platform and interoperable application modules, while the management system is characterized by professional operations and open services. The platform is intended to dynamically detect the fluctuating changes in the conditions of urban infrastructure (e.g., road facilities, water, electricity and fuel gas, and underground pipelines), be well-informed about the status and features of urban operation, accurately forecast the development trends of urban elements, and provide powerful decision-making support for urban development. To this end, the city should survey and determine the status quo of the citywide governmental data resources (e.g., types, scale, growth rate, ease of sharing, source, and ownership), identify the data directories and data fields that can be prioritized for sharing or opening access to, and list the data that are confidential or involve public security and personal privacy. The city should also build an urban governmental data architecture, and set up an urban data sharing and exchange service hub committed to the cleaning, processing, integration, mining, and sharing of urban governmental data. Backed by a tool platform, the service hub would focus on hierarchical data services under an appropriate mechanism to promote the integration and sharing of governmental and other data resources across the whole city. 
Finally, the city should build a unified open platform for governmental data, perfect the management specifications for governmental data resources, and develop a directory and service center that gives the public access to data useful in everyday life, such as data on personal credit, traffic, medicine, health, employment, social insurance, geography, culture, education, science and technology, resources, agriculture, environments, safety supervision, finance, quality assurance, statistics, meteorology, and commercial paperwork.
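The survey of governmental data resources described above can be pictured as a catalog whose entries record the surveyed attributes (source, ownership, scale, growth rate) and are then sorted into data that can be opened, data that can only be shared between departments, and data that must stay restricted. The sketch below is one hypothetical way to model this; the class, the field names, the three-tier rule, and the sample entries are all assumptions, not an actual governmental standard.

```python
from dataclasses import dataclass

@dataclass
class DataResource:
    name: str
    source: str
    owner: str
    size_gb: float
    monthly_growth_pct: float
    confidential: bool      # confidential or public-security data
    personal_privacy: bool  # contains personal information

def access_tier(resource):
    """Assumed three-tier policy: restricted, shared internally, or open."""
    if resource.confidential:
        return "restricted"
    if resource.personal_privacy:
        return "share"  # assumed: exchanged between departments, not published
    return "open"

def partition(catalog):
    """Group resource names by access tier."""
    tiers = {"open": [], "share": [], "restricted": []}
    for resource in catalog:
        tiers[access_tier(resource)].append(resource.name)
    return tiers

# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    DataResource("air_quality_hourly", "monitoring stations", "environment bureau",
                 40.0, 2.5, False, False),
    DataResource("medical_records", "hospitals", "health commission",
                 900.0, 4.0, False, True),
    DataResource("surveillance_video_index", "cameras", "public security bureau",
                 5000.0, 8.0, True, True),
]

tiers = partition(INVENTORY)
```

Keeping the tiering rule in one small function makes it easy to replace with whatever classification the city's actual regulations require.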

Fig. 5 Key points of urban big data development.

4.2. Promoting big data applications in urban governance

Urban governance involves a variety of fields, including public security administration, urban planning and construction, market supervision, and environmental protection. Therefore, it is important to apply big data to urban governance. The big data application platform for grassroots social governance carries data in diverse fields, including civil air security, energy, electricity, environmental protection, health, traffic, municipal administration, water conservancy, safety supervision, meteorology, and earthquakes. Data resources should be mined, analyzed, and applied more deeply so as to identify social administration problems quickly, provide early warning, and solve the problems satisfactorily, thus enhancing the grassroots social governance capacity. To attain multi-planning integration and inter-departmental collaboration, a city should perfect the One-Map database of land resources, build a management platform for space-time big data and real-estate registration information, conduct big data analysis for land resources, develop a unified big data service system for land resources, and build a big data application platform for urban planning in support of inter-departmental collaboration management, so as to provide a powerful support for the whole process—from formulating and reviewing through to the implementation of urban planning. The city should also intensify the credit monitoring of main market players, create public and personal credit databases, and acquire big data on credit covering all credit matters (mainly enterprises, along with individuals and governments) and all categories of credit information, to make urban citizens more credit-conscious and build a favorable credit environment. 
Finally, the city should improve the layout of the urban environmental quality-monitoring network; develop a real-time merging and analysis platform for the comprehensive monitoring of big data concerning water, soil, garbage, and air pollution; and acquire information on urban environmental quality and pollution sources comprehensively and accurately in real time in order to provide decision-making support for environmental protection. Moreover, the city should build a big data energy management system to improve the efficiency of energy utilization, perfect the resource pricing mechanism, and innovate in energy consumption patterns.

4.3. Speeding the application of big data to urban services

Urban services comprise a variety of fields, such as traffic, medicine, education, social welfare, culture, and tourism. Concerning urban traffic, a city should build a big data traffic application platform that allows urban citizens to access various kinds of traffic information (including public transport, road conditions, taxis, public transport facilities, and road maintenance) in real time. The platform would provide various value-added services (e.g., traffic information services and traffic guidance), aid the optimization of traffic planning and design, boost transportation efficiency, and improve the traffic experience. Concerning health care, the city should set up a unified big data health application platform using collected and integrated population data, electronic health archives, and electronic medical records, and develop a big data application system for health care management and services covering public health, medical services, medical insurance, and drug supply and management. The city should also encourage enterprises and institutions to develop innovative big data applications for health care, and enhance comprehensive health care services. Concerning social benefits, the city should build an integrated urban-rural big data platform for social relief, social welfare, and social security; encourage data exchange and information sharing between related departments; and support the application of urban big data to the management of employment status and social security funds, the monitoring and control of medical services by insurance authorities, labor oversight, the auditing of internal controls, and the formulation of human resources and social security policies, as well as the evaluation of their effects, in order to provide highly personalized and targeted services to the public.
Finally, the city should integrate the digital cultural resources available from digital libraries, archives, museums, art galleries, mass culture sites, and technological museums, thus building a comprehensive big data service platform for cultural propagation. To promote tourism, the city should encourage the sharing of tourist information, integrate tourism-related data resources (e.g., public security, traffic, environmental protection, commerce, aviation, postal service, telecommunication, and meteorology), and develop a big data tourism resource pool in cooperation with the main network search engines and online tourist service providers.

4.4. Promoting the application of big data to economic and industrial development

A city should encourage the application of big data to industrial development, cross-border e-commerce, logistics, and technological and knowledge services. To support the development of different industrial sectors, the city should take the following measures: ① build cloud service platforms for key sectors and promote the application of big data resources to research and development (R&D) design, data management, and systematic marketing; ② implement the Internet of Things cloud project “Cloud + Web + End,” introduce cloud service enterprises, and develop a cloud engineering industry that provides integrated services such as whole-process overall design, equipment manufacturing, software development, system integration, engineering installation, and network operation and maintenance; ③ optimize the statistical standards for various industrial sectors, improve the efficiency of the acquisition and use of the big data resources of the related industrial sectors, tap the industrial value and potential of data resources, and provide a powerful support for the innovation of R&D systems, the reformation of production management modes, and the conceptualization of industrial value chain systems. Concerning e-commerce, the city should develop manufacturer-to-consumer (M2C) and manufacturer-to-business (M2B) services, enhance the development of cross-border e-commerce platforms, and build an integrated cross-border e-commerce service system that supports the acquisition, cleaning, integration, analysis, and presentation of data in order to provide a powerful support for cross-border e-commerce platforms as well as basic logistics and warehousing, credit rating, and comprehensive information services for such platforms. The city should also develop an e-marketing and price-comparison system oriented to the countries along the “Online Silk Road” to provide a convenient and fast shopping experience to global customers. 
The city should support the development of the logistics industry by building a big data logistics platform, integrating logistics data (e.g., commodities, traffic networks, freight, and goods turnover) under unified standards, and providing optimal transport routes for logistics enterprises through a comprehensive analysis of vehicles, routes, and commodities, thus improving logistics efficiency. Using logistics big data would help the city obtain up-to-date inventory information and dynamic demand information for many types of commodities in a timely manner, enabling the optimization of inventories and warehousing and the dynamic reallocation of logistics and warehousing resources.
The city should promote mass entrepreneurship and innovation by encouraging people to tap into open data resources and explore new technologies and patterns of data mining, analysis, and application. These efforts should closely combine big data development with scientific innovation, technological development, and government and market demand, thus establishing big data-driven innovation and facilitating open and coordinated innovation.
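As an illustration of the route optimization that a big data logistics platform could offer, the following minimal sketch computes a lowest-cost route over a small road network using Dijkstra's algorithm. It is not a method prescribed in this paper: the function name `shortest_route`, the node names, and the travel times are all hypothetical, and a real platform would derive edge costs from integrated traffic and freight data.

```python
import heapq


def shortest_route(graph, source, target):
    """Return (path, cost) for the cheapest route, using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: non-negative edge cost}.
    Returns (None, inf) when the target cannot be reached.
    """
    dist = {source: 0.0}      # best known cost to each node
    prev = {}                 # predecessor on the best known route
    heap = [(0.0, source)]    # priority queue of (cost so far, node)
    visited = set()

    while heap:
        cost_so_far, node = heapq.heappop(heap)
        if node in visited:
            continue          # stale queue entry
        visited.add(node)
        if node == target:
            break
        for neighbor, edge_cost in graph.get(node, {}).items():
            new_cost = cost_so_far + edge_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))

    if target not in dist:
        return None, float("inf")

    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical travel times in minutes between logistics nodes (assumed data).
road_network = {
    "warehouse": {"hub_a": 30, "hub_b": 50},
    "hub_a": {"hub_b": 15, "customer": 60},
    "hub_b": {"customer": 20},
    "customer": {},
}

route, minutes = shortest_route(road_network, "warehouse", "customer")
print(route, minutes)
```

In this toy network the cheapest route passes through both hubs rather than taking either direct link, which is the kind of non-obvious result that integrated, up-to-date route data makes possible. Edge costs could equally represent fuel, tolls, or a weighted combination, provided they remain non-negative.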

5. Conclusion and outlook

Urban big data plays a core role in the development of city intelligence, which is the ideal entry point for China’s urban development. Because China’s population and economic output are concentrated in cities, successful urban development would signal that the country as a whole is developing well. The promotion of city intelligence therefore has bright prospects in China. China’s macro environment of industrialization and urbanization, together with its governmental structure, is favorable to the development of urban big data. Urban big data that is well administered and opened for use will promote the development of an urban knowledge-based service industry, create new markets and business opportunities, and further promote the development of city intelligence. It is thus imperative that China make full use of its unique advantages to promote the development of city intelligence through urban big data.

Acknowledgements

This work was partly supported by the Major Strategic Consulting Projects of Chinese Academy of Engineering (2012-ZD-6 and 2014-ZD-01) and the Key Consulting Project of Chinese Academy of Engineering (2015-XZ-14). The authors would like to thank all experts from the above projects for their contributions.

Yunhe Pan, Yun Tian, Xiaolong Liu, Dedao Gu, and Gang Hua declare that they have no conflict of interest or financial conflicts to disclose.