1. Introduction
Conventionally, urban planners prepare for the future using the “project-and-plan” approach. In this method, a single preferred future state or trend, such as estimates of future population size, is used as the basis to forecast future urban changes. Infrastructure and land-use changes are then planned to accommodate the envisioned future [
1]. This approach is effective in stable and predictable environments [
1]. However, the high uncertainty and complexity of city development can make singular prediction approaches less effective [
2]. Cities are complex systems comprising vast and heterogeneous components with dynamic and nonlinear interactions. These interactions result in complex behaviors such as self-organization and feedback, which make it difficult to produce reliable projections for future scenarios. Moreover, contemporary cities face multiple challenges such as extreme climatic events, technological disruptions, and increasing population dynamics, which further complicate urban environments and make future outcomes even less predictable. Scenario planning offers a powerful approach for addressing these challenges by enabling stakeholders to analyze how present-day decisions can lead to different future outcomes across various assumed scenarios [
3], [
4]. These analyses contribute to urban resilience by providing information on urban planning and management practices to avoid undesirable outcomes. Consequently, researchers and practitioners have increasingly recognized the significance of scenario planning in recent decades [
3], [
5], [
6].
Existing scenario-planning practices typically follow this procedure: A team of experts and stakeholders collaborates to identify the key drivers of change, such as environmental change or economic development. They then develop alternative scenarios based on these drivers and evaluate the performance of existing or candidate plans within each scenario using various criteria such as sustainability, equity, and economic viability. Throughout this process, stakeholder engagement and collaboration are emphasized to ensure that the proposed scenarios and plans are grounded in the local context and address the needs and values of the community. Although some computer-aided tools are employed for data analysis or visualization purposes, the entire process is dominated by professionals, who may be limited in certain respects. For instance, the proposal of alternative scenarios or plans may be biased toward the domain knowledge or perspective of planning professionals, leading to blind spots and oversights. Stakeholders may insufficiently account for potential uncertainties, whether they manifest as “black swans” or “elephants in the room” [
7]. Additionally, planning professionals may struggle to analyze complex scenarios in which multiple drivers are entangled to influence the future. This limitation is why current scenario planning considers only a limited number of scenarios or plans.
We consider artificial intelligence (AI) a potential solution for addressing the limitations of existing scenario planning practices, thereby contributing to the development of smart and resilient cities. The integration of AI into urban planning has recently received increasing attention owing to its potential to revolutionize the field [
8], [
9], [
10], [
11], [
12]. For instance, planners expect AI to replace low-level planners in performing tedious and repetitive tasks such as traffic counting or rendering sketches [
12], [
13], [
14]. Some researchers have also explored the use of large language models, such as ChatGPT, for conducting strengths, weaknesses, opportunities, and threats (SWOT) analyses in planning projects or leveraging social media analysis to survey public opinions [
15], [
16]. AI has the potential to aid scenario planning in several ways.
• First, AI can think creatively and unconventionally. An example is “Move 37,” played by AlphaGo in its historic match with Lee Sedol, an unusual move that stunned Go experts worldwide and led to AlphaGo’s ultimate victory in that game. This case demonstrates that AI can think beyond conventional strategies employed by human experts in complex settings. When applied to scenario planning, AI may generate innovative plans similar to the “divine move” in the Go game, creating scenarios that are beyond the imagination of human experts.
• Second, AI algorithms can significantly improve the efficiency of the scenario planning process by automatically analyzing vast amounts of relevant data. This allows for the consideration of a broader range of driving factors and scenarios, thereby expanding the scope and depth of the scenario planning process.
• Third, AI models, such as deep learning, excel in capturing implicit and dynamic interactions within complex systems and supporting expert decision-making. These models have proven effective in various domains, such as climate science, ecology, and geography [
17], [
18], [
19]. When applied to scenario planning, AI models can capture complex patterns and trends in cities that are difficult for humans to discern and articulate, thereby enabling more accurate predictions when composing predictive scenarios and evaluating proposed plans [
20], [
21].
Despite the great potential of AI, existing scenario planning practices have not incorporated AI techniques, as many human planners are unfamiliar with AI and are unclear about its potential in this field. This study aims to examine the potential integration of AI into scenario planning practices. It is worth noting that a few other excellent review articles focusing on AI applications in urban planning have been published recently [
11], [
16], [
22]. Our work differs from theirs by focusing on a specific method used in urban planning—scenario planning—and exploring the intricacies of AI techniques. Additionally, it is essential to note that AI encompasses a diverse range of approaches and this paper focuses exclusively on deep learning and reinforcement learning (RL) techniques.
The remainder of this paper is organized as follows. Section 2 outlines the three key steps in scenario planning to which AI can contribute. Section 3 introduces various AI techniques used to supplement these three steps. Section 4 discusses the roles played by human stakeholders in this process. Section 5 examines the challenges and potential solutions of AI-empowered urban scenario planning. Section 6 lists the expectations for future research that could contribute to this field, and Section 7 concludes the paper.
2. Key elements of scenario planning and limitations of existing practices
Chakraborty and McMillan [
3] developed a scenario planning typology. However, this typology is more geared toward practitioners and is less focused on methods. Here, we refer to their work [
3] and present a framework (
Fig. 1) that highlights the key steps in scenario planning to which AI may contribute. The framework underscores four key components—plan generation, scenario construction, plan evaluation, and stakeholder engagement—and demonstrates their interconnectedness in the scenario planning process.
2.1. Plan generation
The outcome of plan generation is a collection of candidate plans for stakeholder evaluation. Here, we define plans as items that local governments control and can modify to address different future scenarios. Examples of the generated plans include transportation network designs, land-use zoning regulations, building codes, policies, and investment strategies. Although plans are adjustable, plan generation must often account for constraints such as available resources, funding, geographic barriers, and preexisting designs. Existing scenario planning practices generally involve a single plan or a limited set of plans owing to the extensive effort required to craft detailed plans [
3], [
23]. AI can contribute to plan generation tasks by creating diverse, plausible, and creative candidate plans through automated algorithms, which can significantly reduce the workload of human planners.
2.2. Scenario construction
The outcome of scenario construction is a series of plausible futures, or scenarios, for consideration. It is challenging to define exactly what a “scenario” is. According to Ref. [
24], scenarios are “hypothetical illustrations of the future that describe a cross-section in an established context.” In Ref. [
25], the authors listed six key scenario attributes: future-oriented, externally contextualized, narrative description, plausibly possible, a systematized set, and comparatively different. The characteristics of being future-oriented and externally contextualized imply that scenarios are meaningful alternatives to the future, influenced by external drivers beyond the control of local stakeholders. In this research, we specifically define “scenarios” as possible futures over which local stakeholders have limited or no control but should actively prepare for or adapt to. Examples of scenarios frequently discussed in the literature include extreme events (e.g., sea level rise), technological advancements (e.g., market penetration of autonomous vehicles), and demographic shifts (e.g., an aging population).
Despite the complexity of the definition, there is no set of rules for constructing scenarios [
25], [
26]. Intuitive Logic (IL) is perhaps the most widely used method for scenario construction, but it lacks a specific form and involves a group of individuals brainstorming about the future. Advances in computer modeling and simulations have led to the utilization of computer-aided tools to construct scenarios. These scenarios are not limited to narrative descriptions or stories but can also be numeric or visionary results obtained from the models and simulations.
The developed scenarios can be broadly classified into three categories: normative (preferred), predictive, and exploratory [
3], [
4], [
27], [
28], [
29].
Normative scenarios represent preferred and aspirational targets, inspiring planners to strategize and attain such targets. The construction of normative scenarios often involves workshops or Delphi methods in which a panel of experts is consulted to reach a consensus on the desired future state. Backcasting is then applied to work backwards and identify the necessary steps to achieve this outcome.
Predictive scenarios depict the most likely future, offering insights into “what will happen.” The construction of predictive scenarios typically involves time-series predictions or data-driven modeling approaches that extrapolate the most probable future. Conversely, exploratory scenarios explore a wider range of uncertain futures, broadly representing “what could happen” [
3], [
27]. Recent research has emphasized the importance of “exploratory scenarios” in preparing for uncertain futures [
28], [
29], [
30].
However, despite these methods, existing scenario-construction practices still consider only one or two key drivers. For instance, climate scientists define scenarios based on varying levels of carbon emissions and socio-economic developments [
31], [
32]. Planners commonly utilize 2 × 2 matrices, focusing on two drivers, to delineate different scenarios for plan evaluation. However, real-world urban development is influenced by a multitude of drivers entangled with various functions. Moreover, existing scenario planning predominantly focuses on normative scenarios that envision an individual’s preferred future and attempts to actualize it [
33], such as the Sustainable Communities Initiative administered by the US Department of Housing and Urban Development [
3], which insufficiently accounts for the complexity and uncertainty of future cities [
34].
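To illustrate why a 2 × 2 matrix becomes restrictive once more drivers are considered, the following minimal Python sketch enumerates scenarios as combinations of driver levels. The driver names and their two-level discretization are hypothetical placeholders chosen for illustration; they are not taken from the cited studies.

```python
from itertools import product

def enumerate_scenarios(drivers):
    """Return every combination of driver levels as a list of dicts."""
    names = list(drivers)
    levels = [drivers[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*levels)]

# Hypothetical drivers, each discretized into two levels (illustration only).
two_drivers = {
    "sea_level_rise": ["low", "high"],
    "av_penetration": ["low", "high"],
}
five_drivers = dict(
    two_drivers,
    population_aging=["low", "high"],
    remote_work=["low", "high"],
    e_commerce=["low", "high"],
)

matrix_2x2 = enumerate_scenarios(two_drivers)
larger_set = enumerate_scenarios(five_drivers)

# Two drivers give 2**2 = 4 scenarios; five drivers already give 2**5 = 32.
print(len(matrix_2x2), len(larger_set))
```

The number of scenarios grows exponentially with the number of drivers, which is one reason manual practice tends to stop at two.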
AI can supplement scenario construction by accounting for a wide array of drivers and their interactions, thereby capturing the complexity of real-world urban development in a more comprehensive manner. Moreover, AI can streamline the generation of diverse and exploratory scenarios, helping planners move beyond normative and predictive scenarios and prepare better for uncertainties.
2.3. Plan evaluation
In plan evaluation, analyses are conducted to assess the performance of the proposed plans within the assumed scenarios by referring to certain criteria. These criteria typically reflect public interests, such as equity, system efficiency, sustainability, economic well-being, and quality of life. Such analyses can be carried out using either qualitative methods, such as opinion surveys, or sophisticated computer-aided approaches, such as data-driven simulations and planning support systems [
23], [
35], [
36]. The evaluation results then provide feedback for plan generation and help iteratively improve the proposed plans while maximizing public interest (
Fig. 1) [
23].
AI can facilitate qualitative plan evaluation by providing a vivid representation of the performance of candidate plans across scenarios. For quantitative plan evaluation, AI can incorporate a broader range of quantitative criteria into plan evaluation and automate the data analysis process, thereby improving the effectiveness of plan evaluation.
3. Empowering scenario planning with AI
The current body of literature on AI applications in scenario planning is scant and fragmented primarily because this field is new. Therefore, in this section, we present approaches that can potentially contribute to the key steps of urban scenario planning, as shown in
Fig. 1, including plan generation, scenario construction, and plan evaluation.
3.1. Plan generation
When applied to plan generation, AI can contribute to smart and resilient cities in at least two ways. First, AI has the potential to generate long-term plans that are not only logically rigorous and grounded in data but also extend beyond the imagination of human professionals, which may include innovative solutions to address urban challenges. Second, algorithms such as RL can optimize ad hoc decision-making related to the operation of urban systems in anticipation of upcoming adversities such as tropical storms, thereby reducing the negative impacts of such events.
3.1.1. Generating candidate plans with generative deep learning models
Generative deep learning models, such as the Generative Adversarial Network (GAN) and the Variational AutoEncoder (VAE), can generate realistic examples after learning the hidden distribution of training samples. The GAN trains a generator that produces fake samples from random inputs, and a discriminator that distinguishes between real and fake samples. These two networks, the generator and the discriminator, compete with each other through the learning process until the generator can produce fake samples that are indistinguishable from the real ones. Meanwhile, the VAE uses an encoder-decoder architecture. The encoder maps the inputs to a low-dimensional latent space, which is regularized to follow a normal distribution. The decoder then reconstructs the inputs from the latent space. The model is optimized to minimize differences between the input and reconstructed samples. The generator, discriminator, encoder, and decoder in the GAN and VAE are deep learning networks, such as convolutional neural networks (CNNs), which extract features from samples and reconstruct samples from low-dimensional vectors. After model development, the GAN generator or VAE decoder can create unseen samples resembling real samples in the training data with random inputs.
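As a rough illustration of the VAE objective described above, the sketch below computes the two terms commonly combined in a VAE loss: a reconstruction error and a penalty that pulls the latent distribution toward a standard normal. It assumes a diagonal Gaussian encoder output (a mean and a log-variance per latent dimension) and a squared-error reconstruction term; these choices, the function names, and the weight `beta` are assumptions made for illustration, and no neural network is trained here.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence from N(mu, sigma^2) to N(0, 1),
    summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def reconstruction_error(x, x_hat):
    """Squared error between an input and its reconstruction (an assumed choice)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus the weighted latent regularizer."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)

# Toy usage with made-up numbers standing in for encoder/decoder outputs.
rng = random.Random(0)
mu, log_var = [0.3, -0.1], [0.0, -0.5]
z = reparameterize(mu, log_var, rng)       # latent sample passed to a decoder
loss = vae_loss([1.0, 0.0], [0.8, 0.1], mu, log_var)
print(len(z), round(loss, 4))
```

The regularizer vanishes only when the encoder outputs a zero mean and unit variance, which is what allows new samples to be drawn later by feeding standard normal noise to the decoder.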
Both models have been widely used in computer vision tasks to generate two-dimensional (2D) images. In the context of plan generation, these models can be calibrated with real-world spatial plans that have been processed into gridded formats to generate 2D spatial layouts of roadway networks [
37], [
38], [
39], building footprints [
40], [
41], [
42], land use [
43], and facility distributions [
44], [
45], which can then be considered in the scenario planning process.
An issue with classic generative models is that their outputs are random, leaving users with no control. However, real-world planning problems are often contextualized by considering factors such as surrounding geographical or built environments. This issue can be addressed using conditional generative models, such as conditional GAN and conditional VAE, which allow users to input additional guidance for the model to generate the intended plans [
46], [
47] (
Fig. 2). These models use the conditions (or labels) together with the training samples as model inputs. These conditions can be as simple as a one-hot vector, suggesting the type of terrain (e.g., flat or mountainous), or as complex as multi-sourced data such as images and texts. When used to create unseen samples, these conditions are also input as additional information obtained by the generator and decoder. For example, a conditional GAN was used in Ref. [
48] to generate road networks aware of the roadway topology, which can specify whether the generated roadways are for flat or hilly terrain. In Ref. [
45], the authors used a conditional VAE to consider human instructions (e.g., greenspace rate) in the generated plans. This line of research also emphasizes the need for more efforts to factorize urban development plans or policies, ranging from simple quantitative ratings to categorical typologies that can be seamlessly integrated into algorithms.
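The conditioning mechanism can be sketched in a few lines: a one-hot condition vector is appended to the random latent input before it is passed to the generator or decoder. The terrain categories, the latent dimension, and the function names below are hypothetical, and the network that would consume this input is omitted.

```python
import random

TERRAIN_TYPES = ["flat", "mountainous"]  # hypothetical condition labels

def one_hot(label, categories):
    """Encode a categorical condition as a one-hot vector."""
    if label not in categories:
        raise ValueError(f"unknown condition: {label!r}")
    return [1.0 if label == c else 0.0 for c in categories]

def conditional_generator_input(latent_dim, terrain, rng):
    """Concatenate random noise with the encoded condition."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return noise + one_hot(terrain, TERRAIN_TYPES)

rng = random.Random(42)
g_in = conditional_generator_input(8, "mountainous", rng)

# The last two entries carry the condition; the first eight are noise.
print(len(g_in), g_in[-2:])
```

Richer conditions, such as images or text, would typically be embedded by a separate encoder before concatenation; that step is outside this sketch.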
Moreover, Image2Image, or Pixel2Pixel, is a special type of conditional GAN that performs pixel-level transitions from source images (e.g., color-coded masks) to target images [
49]. For instance, some researchers have adopted this method to generate potential building footprints within one or a few urban blocks, with inputs delineating their boundaries [
40], [
41], [
42]. Such a method may be more suitable for mesoscale planning projects such as the proposal of a few urban blocks. In Ref. [
43], the authors employed the Image2Image approach to predict future land use by integrating historical land-use data and city planning data, such as planned metro station locations.
In addition to 2D plans, recent studies have explored the application of GAN for the generation of three-dimensional (3D) models. For example, some researchers have proposed Urban-GAN, which can generate 2.5D building footprints within urban blocks [
50] by referring to exemplary urban form in the database. In Ref. [
51], the authors proposed a pipeline that generated a 3D virtual city from a single street-view image. The pipeline employs a conditional GAN to create a terrain map and style that represent the appearance of the virtual city. Whereas GAN and VAE are commonly used for gridded data, such as images and rasters, leveraging deep CNNs [
52], recent research has also integrated generative models with graph convolutional networks (GCNs) to generate graph-structured data, such as roadway networks and points of interest (POI) distributions [
39], [
44].
3.1.2. Generating candidate plans: Coupling cellular automata and deep neural networks
Planning scholars have utilized cellular automata (CA) to predict urban development. CA operates on gridded data comprising spatially arranged grids (i.e., cells). The algorithm initially learns transition rules from historical land-use data and driving factors such as the presence of critical facilities, road proximity, water proximity, terrain, and population density (
Fig. 3). These rules are then used to update the states of the cells (e.g., land-use type) based on their previous states and the states of neighboring cells. Recent studies have extended this method to operate on vector data with arbitrary shapes (e.g., points and polygons), leading to a vector-based or graph-CA. After model development, scholars can manually adjust transition rules or set restrictions to generate plans corresponding to different urban development policies [
53], [
54].
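The update logic of a raster-based CA can be sketched as follows. In the studies cited above, transition rules are learned from historical data; here a single hand-written rule (a cell urbanizes when enough of its eight neighbors are urban) stands in for the learned rule, and a Boolean mask of protected cells stands in for a policy restriction. The rule, the threshold, and the grid are all assumptions made for illustration.

```python
def count_urban_neighbors(grid, r, c):
    """Count urban cells (value 1) in the Moore neighborhood of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                total += grid[rr][cc]
    return total

def ca_step(grid, protected, threshold=2):
    """Apply one synchronous update: a non-urban, unprotected cell becomes
    urban if at least `threshold` neighbors are urban."""
    new_grid = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == 0 and not protected[r][c]:
                if count_urban_neighbors(grid, r, c) >= threshold:
                    new_grid[r][c] = 1
    return new_grid

# 1 = urban, 0 = non-urban (toy 3 x 4 grid).
grid = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
no_restriction = [[False] * 4 for _ in range(3)]
greenbelt = [[False] * 4, [True] * 4, [False] * 4]  # middle row is protected

unrestricted_plan = ca_step(grid, no_restriction)
greenbelt_plan = ca_step(grid, greenbelt)
print(unrestricted_plan)
print(greenbelt_plan)
```

Running the same rule under different restriction masks yields different candidate layouts, which mirrors how adjusted rules or restrictions correspond to different development policies.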
Previous studies have typically formulated transition rules using regression models or Markov chains [
53], [
55]. Some studies have also considered deep learning models such as artificial neural networks (ANNs) [
54], [
55], [
56], CNNs [
57], [
58], and GCNs [
59], [
60] to learn the transition rules. These deep learning models can incorporate more variables to depict the diverse driving forces behind urban development. CNNs and GCNs can capture interactions between cells and account for the spatial heterogeneity of cellular interactions [
58], [
60]. CNNs were originally designed for gridded data (i.e., image pixels), making them a natural fit for raster-based CA [
57], [
58]. Graph-based neural networks can be coupled with a vector-based CA using an adjacency matrix to store the relations (i.e., contiguity or proximity) among vectors [
59], [
60].
3.1.3. Generating ad-hoc plans with reinforcement learning
While scenario planning mostly focuses on long-term plans to address future uncertainties, studies on resilient planning also suggest flexible urban designs or management strategies to address near-future disruptions with less uncertainty, such as an upcoming storm [
61]. Various approaches have been proposed to achieve “flexible” planning, such as the creation of multifunctional spaces and dynamic control of urban systems [
61], [
62], [
63]. However, both the switching of space functions and the dynamic control of city systems/facilities involve optimization tasks and require prompt decision-making to respond to impending disruptions. This provides opportunities for AI models, particularly RL, to aid in this process.
RL is an AI approach that trains autonomous agents to make sequential decisions in response to environmental stimuli, making it particularly suitable for applications in which optimal decisions may change over time depending on the changing environment.
Fig. 4 shows the general architecture of RL. RL has found wide applications in the development of robotics such as autonomous vehicles (AVs) and humanoid robots. When applied to an urban context, researchers have explored the use of RL for the dynamic control of different urban systems, such as drainage systems [
64], energy systems [
65], and traffic control [
66], in response to changing environmental conditions and demands. For example, in a smart energy system, an RL agent can act as a central controller, receiving inputs depicting environmental conditions (e.g., the spatial distribution of the surface temperature) and outputting actions (e.g., opening or closing additional power plants) according to the policy. These actions can lead to consequences (e.g., satisfaction of citizens’ needs) coded as rewards or penalties. The RL algorithm continually updates policy parameters based on actions, observations, and rewards/penalties. Commonly used representations of the agent’s policy or value function include Q-tables (as in tabular Q-learning) and deep learning models, such as ANNs and CNNs.
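The observe-act-reward-update loop described above can be illustrated with tabular Q-learning on a deliberately tiny problem. The environment below is hypothetical: a drainage controller observes either a calm state or a storm warning and chooses to hold or pre-release water, with a made-up reward that favors releasing only before a storm. The weather is assumed to evolve independently of the action, which real systems would not satisfy.

```python
import random

STATES = ["calm", "storm_warning"]
ACTIONS = ["hold", "release"]

def reward(state, action):
    """Hypothetical reward: pre-releasing water helps only before a storm."""
    good_action = "release" if state == "storm_warning" else "hold"
    return 1.0 if action == good_action else -1.0

def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        r = reward(state, action)
        next_state = rng.choice(STATES)  # toy assumption: weather ignores actions
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Move the estimate toward the reward plus discounted future value.
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        state = next_state
    return q

def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In applications with large or continuous state spaces, the table would be replaced by a function approximator such as a neural network, but the update follows the same pattern.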
3.2. Scenario construction
AI can facilitate the construction of scenarios and contribute to the development of smart and resilient cities in multiple ways. AI can collaborate with humans to brainstorm potential adversities that humans may neglect. AI can generate challenging scenarios to “debug” existing urban planning and management practices. Additionally, AI can generate vivid photorealistic scenarios of adverse events for stakeholder evaluation.
3.2.1. Constructing descriptive scenarios with large language models
A recent paper [
67] has innovatively explored the use of the AI chatbot, that is, ChatGPT, in generating descriptive narratives for future scenarios. The authors highlighted the benefits of AI-assisted scenario generation, emphasizing its low cost and high efficiency. In the experiment, researchers tasked ChatGPT with “writing three scenarios for the future of transport,” and found the responses were highly imaginative. Following Ref. [
67], we also prompted ChatGPT to “list five future scenarios that may largely change urban residents’ uses of urban spaces and urban planners should consider in scenario planning.” The responses included the following five trends: ① increased adoption of remote working, ② growth in e-commerce, ③ aging population, ④ shift toward sustainable transportation, and ⑤ continued growth of smart cities [
68]. These trends are focal points in urban planning and require further attention from urban planners. It is understood that the AI bot may have synthesized information from a wide range of online articles, news, and blogs discussing these issues. According to Ref. [
67], these answers can provide “raw material” for urban scenario planning, with further manual choosing/rejecting, evaluations, and developments made by human planners.
We further asked ChatGPT the same question, but in the specific context of the City of Miami, Florida. In addition to trends such as autonomous vehicles, extreme weather events, and affordable housing, which are common to many modern cities, the responses also highlighted rising sea levels, flooding, and tourism. These reflect the city’s unique context as a coastal tourism destination vulnerable to sea level rise and fluvial flooding.
3.2.2. Constructing spatio-temporal event scenarios with generative deep learning models
In addition to generating spatial plans, as discussed in Section 3.1.1, generative deep learning models can also construct unforeseen adverse scenarios to test candidate plans, thereby providing resilient strategies for hazard-prone cities. For instance, scholars have used generative models to simulate spatio-temporal urban dynamics, such as energy consumption or traffic flow, and help “debug” existing urban planning and management practices [
69], [
70]. In Ref. [
69], the authors used infoGAN, a type of conditional GAN, for the controllable generation of temporal profiles of renewable energy production, whereas Ref. [
70] applied long short-term memory (LSTM)-GAN to generate the spatiotemporal distribution of traffic flows.
Researchers have proposed various approaches for generating extreme event scenarios. For instance, in Ref. [
71], the authors proposed a GAN-based method for generating typhoon tracks, including typhoon centers and cloud structures, by learning from historical satellite images capturing such tracks. To account for the rarity of extreme events, some researchers combined GANs and Extreme Value Theory (EVT) to develop generative models capable of creating spatiotemporal distributions of extreme weather or climatic parameters, such as extreme heat and rainstorms [
72], [
73]. In particular, the use of deep learning models, instead of conventional numerical models, can better capture the spatial correlations among different meteorological and aerodynamic variables, such as air pressure, temperature, and precipitation, thereby enabling the generation of more realistic extreme event scenarios used in urban scenario planning [
73], [
74].
3.2.3. Constructing photorealistic scenarios for qualitative assessments
Scenario generation also plays an important role in tasks other than urban planning. For instance, scholars in computer science have proposed approaches for generating photorealistic scenes for training and debugging AV algorithms [
75], [
76], [
77]. These approaches can generate realistic traffic conditions with structured arrangements of fixed and moveable objects, roadway topologies, and environmental conditions (e.g., lighting and weather conditions) using specific coding languages or narratives [
75], [
77]. These generated scenes may include some of the most challenging and rare conditions for AV operations, where data collection is challenging during real-world operations [
77]. When applied to urban scenario planning, these approaches facilitate the presentation of urban plans across both normal and adverse scenarios, enabling stakeholders to effectively review and assess their performance.
3.3. Plan evaluation
An essential component of plan evaluation is the logic that enables planners to assess the performance of the proposed plans across assumed scenarios. The methods used to derive this reasoning can vary, ranging from qualitative intuitive logic to quantitative approaches, such as statistical models and numerical simulations [
6], [
35]. The use of an AI model enables the consideration of a wide array of variables and criteria for plan evaluation. The integration of agent-based modeling (ABM) and AI allows for bottom-up evaluations, capturing the intricate and nonlinear interactions among different actors in a city, thereby enabling a more comprehensive plan evaluation for the development of smart and resilient cities.
3.3.1. Integrating AI and ABM for bottom-up plan evaluations
ABM offers a powerful bottom-up approach for modeling complex systems and has been widely applied in urban studies [
78]. In an ABM, agents make decisions by interacting with each other and the ambient environment. The collection of agents can jointly capture the complex behaviors of urban systems such as nonlinearities, feedback, tipping points, hierarchy, and self-organization [
78], [
79]. Thus, the ABM can be utilized to examine the performance of different urban plans across scenarios. For instance, some researchers used ABM to examine different curb space allocation strategies under various scenarios, such as varying market penetration rates of mobility-on-demand [
80], [
81], [
82]. In response to the rise of shared AVs, researchers used ABM to explore associated parking strategies, including the number of parking spaces and parking locations (e.g., idling at the curb or cruising to parking garages) [
83], [
84], [
85], [
86]. Other studies have also used ABM to examine the effectiveness of different climate adaptation strategies [
87] and urban forms [
88].
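A stripped-down, agent-level simulation conveys how such evaluations are organized: each plan (here, a number of curb spaces) is simulated under each scenario (here, a vehicle arrival rate), and an outcome measure is recorded. The model below is a hypothetical sketch, far simpler than the cited studies: at most one vehicle arrives per time step, every vehicle dwells for a fixed duration, and a vehicle that finds the curb full diverts to a garage.

```python
import random

def simulate_curb(n_curb_spaces, arrival_prob, dwell_steps=3, n_steps=200, seed=0):
    """Return the share of arriving vehicles that found a free curb space."""
    rng = random.Random(seed)
    occupied = []          # remaining dwell time of each vehicle at the curb
    served = turned_away = 0
    for _ in range(n_steps):
        # Vehicles whose dwell time has elapsed leave the curb.
        occupied = [t - 1 for t in occupied if t > 1]
        if rng.random() < arrival_prob:      # a new vehicle agent arrives
            if len(occupied) < n_curb_spaces:
                occupied.append(dwell_steps)
                served += 1
            else:
                turned_away += 1             # the agent diverts to a garage
    total = served + turned_away
    return served / total if total else 1.0

# Hypothetical plans (curb spaces) and scenarios (arrival probability per step).
plans = {"one_space": 1, "three_spaces": 3}
scenarios = {"low_demand": 0.2, "high_demand": 0.8}

results = {
    (plan, scenario): simulate_curb(spaces, prob)
    for plan, spaces in plans.items()
    for scenario, prob in scenarios.items()
}
for key, share in results.items():
    print(key, round(share, 2))
```

The resulting plan-by-scenario table is the kind of output that stakeholders would compare; richer models would add heterogeneous agents, spatial movement, and interactions among agents.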
Conventionally, the behavior of agents is explicitly coded using a set of predetermined rules or sampled from empirical distributions. Some researchers have integrated machine learning or deep learning models with ABM [
89], [
90]. In particular, RL trains autonomous agents for optimal decision-making when interacting with dynamic environments, enabling its seamless integration into ABM to optimize agent behaviors. This treatment is particularly suitable for evaluating plans involving AVs or ride-sharing drivers whose behaviors are expected to be optimized using advanced algorithms such as RL. Although extensive studies in computer science have explored the use of RL for AVs and ridesharing platforms, no specific study has been conducted to evaluate relevant planning, such as the spatial planning of curb spaces accommodating idling AVs or ride-sharing customers.
Some researchers have integrated ABM and RL to study human behavior [
91], [
92]. However, it is less realistic to assume that humans, like robots, make optimal decisions. A recent study that we found inspiring is Ref. [
93], where the authors developed “generative agents” by iteratively querying ChatGPT about the agents’ schedules, behaviors, decisions, and perceptions. Because large language models, such as ChatGPT, learn from a vast amount of online data encoding human behaviors, they can yield more realistic references to human responses in different situations. Despite its great potential, this thread of research, which integrates AI and ABM for the bottom-up evaluation of urban plans across scenarios, is underexplored.
3.3.2. Generalizable deep learning model capturing complex and dynamic urban processes
Planners often utilize various statistical approaches, such as regression models, to analyze the performance of different plans across scenarios. However, both conventional statistical and deep learning models assume that the model is applied to samples that follow the same distribution as the training data. This assumption limits their application in scenario planning because future scenarios can deviate significantly from the historical data on which the models are trained [
29]. Deep learning models are even less capable of this task because they use a larger number of parameters than conventional statistical models and are prone to overfitting the distribution of training data. Therefore, it is common to see deep learning models developed for predicting urban dynamics in the “near future,” such as traffic conditions in the next few hours or days, when social and built environment conditions in cities do not undergo drastic changes. However, deep learning has rarely been used for long-term predictions.
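The effect of applying a model outside its training distribution can be shown with a small numerical experiment. The sketch below fits an ordinary least-squares line to observations drawn from a narrow "historical" range and then evaluates it on a "future" range it has never seen. The quadratic response and both ranges are hypothetical, chosen only to make the contrast visible; a linear model is used instead of a deep network to keep the example self-contained.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observations."""
    intercept, slope = model
    return sum(abs(intercept + slope * x - y) for x, y in zip(xs, ys)) / len(xs)

def true_process(x):
    """Hypothetical nonlinear urban response (unknown to the model)."""
    return x ** 2

historical_x = [i / 10 for i in range(11)]   # conditions seen in training: 0.0 to 1.0
future_x = [2 + i / 10 for i in range(11)]   # scenario conditions: 2.0 to 3.0

model = fit_line(historical_x, [true_process(x) for x in historical_x])
in_dist_error = mean_abs_error(model, historical_x, [true_process(x) for x in historical_x])
out_dist_error = mean_abs_error(model, future_x, [true_process(x) for x in future_x])

print(round(in_dist_error, 3), round(out_dist_error, 3))
```

The fit looks adequate on the historical range and degrades sharply on the shifted range, which is the difficulty that long-term scenario evaluation faces regardless of model class.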
Computer science scholars have acknowledged the challenge of generalizability in deep learning models and have made efforts to address this issue. They use “inductive bias” to refer to the preferred assumptions that AI models make for relational reasoning, which make the developed models more generalizable for certain tasks but not for others [
94]. Many deep learning models exhibit varying degrees of inductive bias. For example, models such as CNN and RNN that reuse the same rule across localities or sequences are associated with a stronger inductive bias than linear layers [
94]. Similarly, graph-based deep learning models that reuse the same set of per-node or per-edge functions across the edges or nodes of a graph also exhibit a strong inductive bias. Building on these concepts, in Ref. [
95], the authors proposed strategies inspired by human conscious processes to incorporate additional inductive biases into deep learning models. Examples of these strategies include the use of attention to enable dynamic connections, and regularization techniques to ensure the sparsity of the learned models.
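The weight-sharing idea behind this kind of inductive bias can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited studies: the layer form, the ring-shaped toy graphs, and all names (`graph_layer`, `ring_graph`, `w_self`, `w_neigh`) are assumptions made for illustration. The point is only that one set of per-node weights is reused at every node, so the same parameters can be applied to graphs of different sizes.

```python
import numpy as np

def graph_layer(adj, feats, w_self, w_neigh):
    """One message-passing step that reuses the same two weight matrices
    at every node, regardless of how many nodes the graph has."""
    deg = adj.sum(axis=1, keepdims=True)
    # Mean of neighbor features; nodes without neighbors receive zeros.
    neigh_mean = (adj @ feats) / np.maximum(deg, 1.0)
    return np.maximum(feats @ w_self + neigh_mean @ w_neigh, 0.0)  # ReLU

def ring_graph(n):
    """Adjacency matrix of a hypothetical ring-shaped street network."""
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
    return adj

rng = np.random.default_rng(0)
n_feat, n_hidden = 3, 4
w_self = rng.normal(size=(n_feat, n_hidden))
w_neigh = rng.normal(size=(n_feat, n_hidden))

# The same weights process a 5-node and a 50-node graph.
small = graph_layer(ring_graph(5), rng.normal(size=(5, n_feat)), w_self, w_neigh)
large = graph_layer(ring_graph(50), rng.normal(size=(50, n_feat)), w_self, w_neigh)
n_params = w_self.size + w_neigh.size  # independent of graph size
```

Because the parameter count does not depend on the number of nodes, a layer of this kind can in principle be evaluated on a modified network, such as one with added links. Whether its predictions remain reliable there is an empirical question that still requires validation.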
In addition to theoretical explanations, a few empirical studies have explored the use of deep learning models to examine the efficacy of plans across scenarios. For example, researchers [
35] proposed the Geo-contextual Multitask Embedding Learner (GMEL) to forecast the changing human mobility patterns in hypothetical scenarios, such as the construction of a new bike lane or a new high-rise building. Other researchers have also developed a graph-based deep learning model to compare different curb uses amidst potential changes in curb management policies or the built environment [
96].
3.3.3. Integrating AI and extended reality for more immersive participatory planning
In addition to simulation- and modeling-based approaches that quantitatively compare the performances of different plans across scenarios, it is crucial to include expert feedback and community participation in the process [
97]. The use of virtual reality (VR) for immersive participatory planning has been explored in recent studies [
98], [
99], where participants were presented with different plans in a VR setting for evaluation. AI can contribute to this process by automatically generating 3D landscapes and built environments in VR, or by adding details to draft plans for better visualization [
13], [
100]. For instance, FrankenGAN, proposed by Kelly et al. [
100], can add multiscale details to coarse mass models while allowing users to control style, geometry, and texture through a cascade of GANs. The output can be exported to the VR setting for participant evaluation.
Other extended reality (XR) technologies have also been explored for plan evaluation. For instance, the Massachusetts Institute of Technology (MIT) City Science Group developed an interactive platform that couples generative models and a tangible user interface (TUI) for participants to visualize the impact of the proposed interventions in real time [
101]. Additionally, some scholars investigated the use of mixed reality (MR) to achieve real-time on-site visualization and interactions between participants and candidate plans [
102].
Based on the reviewed literature, we summarize the merits and demerits of the conventional and AI-empowered scenario planning approaches in Table 1.
4. The role of human stakeholders in AI-empowered planning practices
With the diverse potential for AI to be integrated into the urban scenario planning process, ongoing discussions revolve around the role of human stakeholders.
One thread of scholars considers human-AI collaborations in decision-making. For example, the smart design framework presented in Ref. [
103] specified how different stakeholders can collaborate with AI algorithms for the generation and evaluation of urban plans. This framework delineates the planning process into two levels: ① the general design process, in which human planners use computational tools to guide the coevolution of the design problem and solutions, and ② narrow design generation, in which computational tools and AI techniques are employed to generate and optimize plans with well-defined design problems. Human-defined constraints such as the distance between buildings can be incorporated to guide the generation of plans. Similarly, in Ref. [
14], a simple human-AI collaborative workflow was proposed, in which human planners take key responsibility for the conceptual work, such as setting the evaluation criteria, and leverage AI tools as intelligent assistants for tedious computing and optimization jobs. Similar opinions were also expressed by other scholars in the planning field [
12], [
13].
Inspired by the five autonomy levels of autonomous vehicles, scholars [
16] suggested that the role of humans in plan-making will evolve with the increasing adoption of AI. The proposed stages include “planner-in-the-loop,” where planners hold the primary responsibility in the plan-making process and AI is mainly leveraged as a tool; “planner-on-the-loop,” where AI agents assume important roles and planners assist AI in making key planning decisions; and “planner-out-of-the-loop,” where planners are removed from the plan-making process itself and only need to evaluate the resulting plans, adjust the initial goals and objectives input to the AI, and maintain and upgrade the AI algorithms.
Moreover, given that urban planning frequently involves high-stakes decision-making that may unequally benefit different communities and population groups, human stakeholders should play a pivotal role in establishing ethical rules to ensure the ethicality of AI algorithms. This can be achieved in multiple ways, such as involving human stakeholders in the process of setting objectives for AI algorithms or evaluating the resulting outputs.
Recently, some researchers have proposed the concept of “AI envelopment,” which implies creating boundaries or constraints for AI algorithms by specifying their operational contexts or environments, thereby ensuring their adherence to ethical principles, legal requirements, and societal norms [
104], [
105]. The design of such AI envelopes is a socio-technical process that should involve professionals from different fields, including computer science, urban planning, social science, and legislation, among others [
105].
5. Technical challenges and solutions for planners to use AI in scenario planning
5.1. Low transparency of AI algorithms
One of the primary obstacles limiting the adoption of AI algorithms in urban scenario planning is their “black-box” nature, where users do not know how individual decisions are made within the developed AI model. This lack of transparency hinders planners from trusting the model output, especially given that urban planning often involves high-stakes decisions that can significantly affect city residents’ welfare [
106].
One strategy to address this limitation is to enhance the interpretability of black-box AI algorithms. If users can obtain a sense of how AI algorithms work and which variables contribute to their decision-making process, they can identify decisions driven by unreasonable logic for further scrutiny. This set of methods, known as eXplainable AI (XAI) or interpretable AI [
107], [
108], [
109], has gradually been used in domains such as healthcare, finance, and criminal justice, where domain experts can review the model’s explanations to determine whether to accept the output decisions [
109]. Various XAI methods have been proposed for different tasks and deep learning structures, including SHAP, DeepLIFT, and LIME [
108], [
110]. However, it should also be noted that involving human experts in examining AI algorithm outputs may introduce confirmation bias [
111].
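To make the attribution idea concrete, the sketch below computes exact Shapley values, the game-theoretic quantity that SHAP approximates, for a tiny hypothetical model. It does not use the SHAP library and is not a method from the reviewed studies. The toy model `toy_demand_model`, its three features, and the all-zero baseline are assumptions for illustration. Exact enumeration is feasible only for a handful of features because the cost grows as 2 to the power of the feature count.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    A feature that is absent from a coalition is set to its baseline value.
    """
    n = len(x)
    phi = np.zeros(n)

    def value(coalition):
        z = baseline.copy()
        idx = np.array(coalition, dtype=int)
        z[idx] = x[idx]
        return predict(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

def toy_demand_model(z):
    # Hypothetical model: features 0 and 2 interact, feature 1 acts alone.
    return 2.0 * z[0] + 0.5 * z[1] + 1.5 * z[0] * z[2]

x = np.array([1.0, 4.0, 2.0])
baseline = np.zeros(3)
phi = shapley_values(toy_demand_model, x, baseline)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. A domain expert could use such a breakdown to judge whether the model relies on plausible drivers.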
In addition to XAI, researchers may consider using “grey-box” models to enhance the interpretability of AI algorithms. Grey-box models are those whose designs are analogous to established laws, theories, or principles in certain domains [
105], [
112]. By incorporating domain knowledge, grey-box models can achieve high performance similar to other black-box models while providing a certain level of explainability like white-box models. Several recent studies that have leveraged deep learning to model urban dynamics have demonstrated such grey-box designs. For instance, Hao and Wang [
113] constructed a GCN-LSTM model to predict urban mobility perturbations during tropical storms. The model structure is designed analogously to an established mathematical model [
114] that depicts how complex networked systems respond to the impact of external forces, considering that urban mobility is highly networked. In another study, a graph attention network (GAT)-based model, inspired by the gravity model in the field of econometric geography, was proposed to capture visitation flows between commercial centers and outlet areas [
115]. Moreover, many researchers have acknowledged that graph-based models, such as GCN and GAT, work better for modeling urban transportation systems that are graph-structured and spatially correlated [
70], [
96], [
112]. Models such as diffusion-GCN can more accurately capture the spatial diffusion phenomenon in urban events such as transportation jams or air pollution owing to their unique architecture [
70], [
112], [
116].
5.2. Unclear accountability of AI algorithms
Another obstacle that limits the use of AI in high-stakes decisions, such as urban planning, is the issue of unclear accountability. Consider AVs as an example: developers cannot anticipate how an AV will act when navigating ethical dilemmas such as the “trolley problem.” Regardless of the decision the AV makes, the question arises as to who should be held accountable for the resulting losses: the algorithm developer, AV seller, AV manufacturer, or the general public? The application of AI in urban planning presents many explicit or implicit conflicts, similar to the trolley problem, in which the final adopted plans may prioritize the rights or interests of certain groups at the expense of others.
The aforementioned AI envelopment offers one possible solution. For instance, allowing AVs to operate without restriction on urban streets may raise safety concerns because they may conflict with other road users. However, if we “envelop” the operation of AVs, for example, by designating an exclusive lane for AV use (i.e., enveloping the operation environment), installing specialized sensors along the lane that AVs can read more easily (i.e., enveloping the inputs), or imposing restrictions on the range of actions AVs can execute (i.e., enveloping the outputs), then AV operation can be greatly simplified and road safety can be better ensured.
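As a minimal sketch of what enveloping the environment and the outputs could look like in software, the snippet below wraps the action proposed by some AI controller in human-defined limits. Everything here is hypothetical: the `Envelope` fields, the lane names, and the speed cap are illustrative assumptions and do not correspond to any real AV interface or regulation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Envelope:
    """Hypothetical operating limits set by human stakeholders."""
    allowed_lanes: frozenset
    max_speed_kmh: float

def envelop_action(proposed, lane, envelope):
    """Return an action that stays inside the envelope.

    `proposed` is a dict holding the target speed (km/h) suggested by an
    AI controller; the controller itself is treated as a black box.
    """
    if lane not in envelope.allowed_lanes:
        # Outside the enveloped environment, the only permitted action is to stop.
        return {"speed_kmh": 0.0, "overridden": True}
    # Inside the environment, clip the output to the permitted range.
    speed = min(max(proposed["speed_kmh"], 0.0), envelope.max_speed_kmh)
    return {"speed_kmh": speed, "overridden": speed != proposed["speed_kmh"]}

env = Envelope(allowed_lanes=frozenset({"av_lane"}), max_speed_kmh=40.0)
capped = envelop_action({"speed_kmh": 65.0}, "av_lane", env)
stopped = envelop_action({"speed_kmh": 30.0}, "general_lane", env)
```

The `overridden` flag records every case in which the envelope changed the controller's proposal, which gives reviewers a simple audit trail when questions of accountability arise.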
5.3. Need for high-quality data
The development of AI models requires a vast amount of data [
117], especially to capture intricate spatiotemporal urban dynamics (e.g., economy, transportation, and energy) for scenario planning. This necessitates continuous collection of high-quality and fine-grained data across cities over time. Such massive archiving of urban dynamics data demands substantial efforts in data collection, cleaning, storage, and management, which was not feasible until recent years owing to advancements in urban sensing and computing technologies. However, cities have made uneven progress in integrating these advanced technologies, leading to geographic disparities in urban data collection, and subsequently, uneven urban innovation. These disparities not only amplify existing inequities among cities but also pose challenges for developing AI models that can function across cities of different scales. Furthermore, different cities may adopt different standards for data collection and management, further complicating the development and validation of AI models. To address these challenges, local planning professionals can collaborate with data scientists to standardize data collection, archiving, and management practices to support the development of AI applications. Low-cost methods should also be considered to bridge the urban innovation gaps among cities with varying capacities.
5.4. Insufficient validation
Conventionally, AI algorithms are developed using a train-test-validation process, in which a large dataset is randomly split into three subsets for model development and validation. However, such validation methods may be insufficient for AI algorithms developed for scenario planning because the social and environmental conditions represented by future scenarios may differ from those present in the data used for model development. In addition to the approaches proposed for developing the generalizable deep learning models discussed in Section 3.3.2, two additional validation approaches for AI algorithms intended for scenario planning could be considered.
The first involves collecting data from a wide spectrum of cities and events, so that the validation set includes data from cities or events that are entirely distinct from those in the training or testing sets. In this case, users can examine whether the developed model can be applied to different cities and choose the most generalizable model for scenario planning. The second applies when access to data from different cities is limited: a viable compromise is to use data from the same city covering a long period during which significant changes have occurred, such as policy changes or built-environment updates. The models are developed using data from the period before the change and then validated using data from the period after the change. In this manner, users can examine whether the developed model can be effectively generalized to potential future scenarios where changes may occur.
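The two validation strategies can be sketched as simple data splits. The snippet below is a minimal illustration and not a prescribed procedure: the function names, the city labels, and the change year are assumptions, and a real study would split actual observation records in place of these short lists.

```python
import numpy as np

def leave_cities_out(cities, holdout):
    """Boolean masks: develop on all cities except `holdout`, validate on `holdout`."""
    cities = np.asarray(cities)
    val = np.isin(cities, list(holdout))
    return ~val, val

def before_after_split(timestamps, change_time):
    """Boolean masks: develop on records before a known change, validate on or after it."""
    timestamps = np.asarray(timestamps)
    val = timestamps >= change_time
    return ~val, val

# Hypothetical records: the city and year of each observation.
cities = ["A", "A", "B", "C", "C", "B"]
years = [2015, 2019, 2016, 2018, 2021, 2022]

train_c, val_c = leave_cities_out(cities, holdout={"C"})
train_t, val_t = before_after_split(years, change_time=2019)
```

Unlike a random split, neither mask lets records from the held-out city or the post-change period leak into model development, so the validation score is a closer proxy for performance under conditions that the model has not seen.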
5.5. Low adoption in the planning profession
Despite the great potential of AI to support planning, the planning profession has been slow to adopt AI technologies [
9]. One reason for this is the fear of job displacement. Currently, AI is expected to serve as a planning assistance tool for strenuous and well-defined tasks, with human planners controlling the overall process [
103], [
118]. The integration of AI in planning may enable human planners to focus more on tasks requiring human innovation [
14], [
118]. This may reduce the demand for lower-level planning roles but create opportunities for new occupations in the planning field, such as database management and AI algorithm testing/validation. This transformation may mirror the adoption of geographic information system (GIS) tools, which reduced the demand for outdoor land surveyors but simultaneously created new occupational opportunities for GIS specialists. The next generation of planners should actively improve their skills and prepare for the shift.
6. Prospects for future AI-empowered scenario planning research
Based on the reviewed literature, we identified five research directions that could contribute to the adoption of AI in urban scenario planning and the construction of smart and resilient cities.
(1) Integrated and dedicated platform for urban scenario planning. Although various AI approaches have been proposed for different steps in scenario planning, there is a lack of dedicated and integrated platforms or software to streamline the different approaches for easy deployment. Although some professional software programs such as ArcGIS include certain machine learning algorithms, they have progressed slowly in incorporating more sophisticated AI algorithms. One pioneering company in this field is Xkool [
118], a technology company dedicated to integrating AI into urban planning and architectural design practices. The company has developed a collection of AI-enabled cloud-based products to assist in planning and design at the urban, block, and building levels. These tools can be further improved by integrating scenario analyses that account for future uncertainties and adversities, and by providing flexible connections to other off-the-shelf AI algorithms (e.g., plug-ins) to support stakeholders’ pre-investment decision-making.
(2) Standardization of ethical AI adoption in the planning field. Beyond the absence of professional software, there is a need to standardize the processes for addressing the concerns of transparency, accountability, and generalizability of AI algorithms applied in the planning domain. While some European councils have made progress in specifying ethical standards for AV operations [
119], less adoption has been observed in other regions and AI application fields. When developing a standard for the planning field, the concept of AI envelopment serves as a promising starting point for professionals to consider the application context, quality of inputs, and permitted outputs. Such standards should also delineate the involvement of human stakeholders in the various stages of the process.
(3) Multi-sensory generation/metaverse construction for immersive plan evaluation. An interesting research direction for qualitative plan evaluation is the integration of generative deep learning models with XR technologies to create more interactive and immersive “scenarios” for participants’ evaluation, for instance, through participatory planning. Certain progress has been made in this direction [
101], [
102], but there is still a lack of integrated software or platforms to archive the varied methods and automate the entire process, including AI for plan/scenario generation, XR for visualization, and participatory planning for evaluation. In addition to generating visual inputs for XR, AI has the potential to generate congruent multi-sensory stimuli, such as visual and acoustic stimuli, to provide participants with even more immersive experiences. However, this topic has yet to be explored in detail. Moving forward in this research direction, an intriguing prospect is the integration of the metaverse and participatory planning, where the metaverse is first constructed according to proposed plans, such as a community park, and nearby residents are then invited to visit the metaverse and “interact” with each other. Users can experience the park’s amenities and activities in a virtual environment and provide valuable feedback to refine the design before it is constructed in the real world. Analogous to beta testing in software development, this offers a more inclusive approach to urban planning.
(4) Smooth transition plans bridging existing and future cities. Numerous studies have demonstrated potential futures such as the widespread adoption of shared autonomous vehicles, which may significantly change city residents’ utilization of urban spaces. Corresponding plans have also been discussed, such as relocating intracity parking spaces to the outskirts [
83], [
86]. However, there is a notable lack of methods that bridge planned future city layouts with the current city layout by showing how existing built environments can be incrementally updated to accommodate future residents and technological needs in a smooth and sustainable manner that is accepted by the public.
(5) Cities as robots for ad-hoc decision-making in complex and challenging situations. Last but not least, Smart City initiatives have tended to treat cities as platforms onto which different “smart systems” can be loaded, similar to how smartphone users install applications from an app store. However, countering future challenges such as extreme events requires the collaboration of different urban systems, including smart ones. This leads us to consider whether the next generation of Smart City initiatives should view cities as giant robots, in which different urban systems interact and function in response to internal and external stimuli. This “robot view” of cities encourages future research and practices to incorporate deep RL (widely used in robotics) in ad-hoc urban decision-making to help cities maneuver through emerging adversities, much as an autonomous vehicle navigates around road barriers, and achieve greater urban resilience.
7. Conclusions
This review explores the intersection of AI and urban scenario planning, elucidating AI’s potential to transform the automation of plan-making, management, and the evaluation of smart and resilient objectives. Through a comprehensive review of the interdisciplinary literature, this study illustrates how AI can empower different stages of urban scenario planning while also identifying potential challenges and discussing possible solutions. The discussion further extends to recommendations for future research and practices, emphasizing the need for integrated AI-enabled scenario planning systems, the standardization of procedures, the utilization of the metaverse for participatory planning, sustainable transformation planning, and the adoption of the “robot view” of smart cities. Collaboration among professionals from diverse disciplines and stakeholders is essential for realizing these recommendations. With AI streamlining conventional planning processes, human stakeholders can devote more attention to addressing the ethical and inclusive aspects of such an integration, thereby building smarter and more resilient cities.
Acknowledgments
This work is supported by the University Development Fund (UDF01003238) provided by the Chinese University of Hong Kong (Shenzhen) and the graduate school fellowship program at the University of Florida. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Chinese University of Hong Kong (Shenzhen) or the University of Florida.
Compliance with ethics guidelines
Haiyan Hao, Yan Wang, and Jiayu Chen declare that they have no conflict of interest or financial conflicts to disclose.