A Survey on an Emerging Safety Challenge for Autonomous Vehicles: Safety of the Intended Functionality

Hong Wang , Wenbo Shao , Chen Sun , Kai Yang , Dongpu Cao , Jun Li

Engineering, 2024, Vol. 33, Issue (2): 20-40. DOI: 10.1016/j.eng.2023.10.011

Review

Abstract

As the complexity of autonomous vehicles (AVs) continues to increase and artificial intelligence algorithms are becoming increasingly ubiquitous, a novel safety concern known as the safety of the intended functionality (SOTIF) has emerged, presenting significant challenges to the widespread deployment of AVs. SOTIF focuses on issues arising from the functional insufficiencies of the AVs’ intended functionality or its implementation, apart from conventional safety considerations. From the systems engineering standpoint, this study offers a comprehensive exploration of the SOTIF landscape by reviewing academic research, practical activities, challenges, and perspectives across the development, verification, validation, and operation phases. Academic research encompasses system-level SOTIF studies and algorithm-related SOTIF issues and solutions. Moreover, it encapsulates practical SOTIF activities undertaken by corporations, government entities, and academic institutions spanning international and Chinese contexts, focusing on the overarching methodologies and practices in different phases. Finally, the paper presents future challenges and outlook pertaining to the development, verification, validation, and operation phases, motivating stakeholders to address the remaining obstacles and challenges.


Keywords

Safety of the intended functionality / Autonomous vehicles / Artificial intelligence / Uncertainty / Verification / Validation

Highlights

1. Novel challenge: Safety of the intended functionality for autonomous vehicles.

2. Comprehensive exploration: Covers academic research and practical aspects.

3. Future focus: Challenges and perspectives.

Cite this article

Hong Wang, Wenbo Shao, Chen Sun, Kai Yang, Dongpu Cao, Jun Li. A Survey on an Emerging Safety Challenge for Autonomous Vehicles: Safety of the Intended Functionality. Engineering, 2024, 33(2): 20-40. DOI: 10.1016/j.eng.2023.10.011


1. Introduction

1.1. Motivation

Extensive efforts have been directed toward the development of autonomous vehicles (AVs) to increase the safety and efficiency of future transportation. AVs are promising owing to their ability to minimize accidents and enable effortless navigation. However, AVs introduce a novel challenge, known as the safety of the intended functionality (SOTIF), which emerges owing to the high complexity and diversity of systems, high-dimensional and dynamic environments, and the uncertainty and opacity of artificial intelligence (AI) algorithms. SOTIF pertains to issues arising from functional insufficiencies of the AVs’ intended functionality or its implementation, as standardized by the International Organization for Standardization (ISO) 21448 [1]. A closely related research domain is functional safety (FUSA), which addresses risks originating from hardware failures, software failures, or systematic faults, as defined in ISO 26262 [2]. From the systems engineering perspective [3], both FUSA and SOTIF are associated with systems safety (which involves intricate systems and interdisciplinary research), thus sharing certain commonalities. However, unlike FUSA, which focuses on faults, SOTIF emphasizes the risks associated with limitations of a system or its functional modules. This becomes particularly significant as the AV automation level increases and the spectrum of AV deployment scenarios widens. The FUSA-SOTIF interplay is shown in Fig. 1. SOTIF concerns are contingent on two essential conditions: ① trigger conditions and ② performance limitations. Trigger conditions include factors such as extreme weather, road conditions, and unexpected behavior of traffic participants [4]. Performance limitations arise from insufficient model performance with respect to sensing, perception, decision-making, and execution, as well as deficient system specifications.

Recently, several illustrative cases of SOTIF have been reported. One instance involved a tragic collision between an AV and a truck at an intersection in Florida in May 2016, in which a perception error resulted in a white trailer being misinterpreted as a white cloud. Similarly, in June 2020, another AV failed to detect the white body of an overturned truck, resulting in a collision on the highway. A self-driving Uber struck and killed a pedestrian in Tempe, Arizona, because the decision-making algorithm ignored the jaywalking pedestrian. These real-world accidents reveal that, while AVs seek to improve transportation safety and efficiency, they also introduce unique safety challenges. An analysis of disengagement causes, based on reports published by the California Department of Motor Vehicles [5], found that over 90% of takeover instances were rooted in software SOTIF issues, whereas only 0.28% were attributed to hardware FUSA. SOTIF has emerged as a significant impediment to the widespread adoption of AVs. Existing SOTIF research focuses on ensuring the absence of undue risks owing to functional insufficiencies or foreseeable misuses. Hence, a systematic review and analysis of current SOTIF-related endeavors is imperative for guiding future research.

1.2. SOTIF standards

ISO 21448 acts as an adjunct to ISO 26262, concentrating on addressing unfamiliar and hazardous scenarios of autonomous driving (AD) while encompassing known unsafe situations. It extends the systems engineering activities of ISO 26262, focusing on deficiencies in the intended functionality or its implementation. It provides a standardized operating procedure for system specifications, safety analysis, functional modifications, verification and validation (V&V) [6], [7], and operation-phase activities. The performance limitations of AI components frequently introduce gaps in the AV lifecycle [8], posing a novel and critical challenge for SOTIF. As illustrated by the example of the perception module, data-driven AI algorithms are extensively employed to process sensory data acquired by cameras and light detection and ranging (LIDAR) devices. While improving performance, these algorithms also introduce challenges in terms of accuracy, robustness, reliability, and interpretability, which have been extensively explored [9], [10], albeit not entirely resolved. Furthermore, perception deficiencies can propagate downstream to the decision-making and control modules, exacerbating potential hazards for autonomous systems [10].

ISO 26262 and ISO 21448 serve as prescriptive methodologies outlining “what to do” to mitigate potential risks associated with AD systems (ADSs). In contrast, the Underwriters Laboratories (UL) 4600 standard [11], [12] adopts a goal-based strategy focused on “how to” assess the safety of fully autonomous vehicles. Introduced as a safety goal-oriented standard, UL 4600 employs use cases for safety analysis [13] and addresses infrastructure and lifecycle considerations for safety. UL 4600 suggests independent safety assessments across the spectrum of development tools, vehicle lifecycle, road user studies, and operational design domains (ODDs). The development scope encompasses black/white box testing of hardware/software and machine-learning components. As part of the ongoing engineering process, deployment, operation, incidents, and maintenance within the AV lifecycle contribute to the development of functional and safety metrics in the feedback loop. In addition, for quantifying nominal SOTIF issues, such as perception failures, UL 4600 proposes a prototype safety performance indicator (SPI), which explicitly furnishes metrics for safety validation, including behavioral metrics for interaction safety. Fig. 2 illustrates the evolution of ISO 26262, ISO 21448, and UL 4600. In addition, specific standards addressing certain SOTIF aspects have recently emerged, enhancing their practical applicability. For instance, ISO 34502 [14], which serves as a standard for scenario-based safety evaluations, offers detailed specifications and guidance in areas such as scenario generation.

Although the aforementioned standards attempt to tackle the SOTIF challenge, SOTIF is a complex system-safety issue that demands collaborative endeavors of both academic researchers and industrial developers, which are essential for building systematic and comprehensive solutions.

1.3. Contribution

SOTIF is a relatively new concept proposed in recent years; thus, there are no systematic surveys of this field. Therefore, the main objective of this study was to fully investigate and summarize the core studies on the SOTIF of AVs. As of June 2023, a state-of-the-art survey was conducted using three major databases: ① IEEE Xplore, ② ScienceDirect, and ③ SAE Mobilus. The survey was conducted using the keywords “SOTIF” and “safety of the intended functionality.” In addition, Google Trends was used for determining the frequency at which SOTIF-related terms were queried in Google from January 2016 to June 2023. The results are summarized in Fig. 3, revealing a general increase in the number of publications mentioning SOTIF during the past six years.

However, comprehensive and systematic SOTIF-focused surveys are still lacking, which served as the primary motivation for conducting the present study. This study sought to provide insights and references to SOTIF researchers and practitioners, by comprehensively reviewing and summarizing the existing key challenges, technologies, and activities associated with SOTIF. Considering the close association of SOTIF with FUSA as well as valuable information that can be borrowed from the FUSA domain, this study assessed relevant research in the FUSA domain, to provide a more comprehensive perspective to the relatively nascent field of SOTIF. This included exploring the typical safety analysis methodologies with valuable references for SOTIF research. Moreover, although human misuse constitutes a significant trigger condition, its in-depth exploration warrants a separate discussion. The present study primarily focused on factors beyond human misuse, seeking to deliver a comprehensive overview of SOTIF research on both the system and algorithm levels.

From the systems safety perspective, the research and practice of SOTIF consist of three main phases: ① development, ② V&V, and ③ operation. This approach facilitates the organization and synthesis of the existing literature, culminating in the establishment of the overarching framework in Fig. 4.

The present study provides a comprehensive synthesis of SOTIF-related academic research, including both system- and algorithm-level SOTIF studies. The present study focuses on the key SOTIF challenges, pertinent technologies, and corresponding solutions, categorizing them according to their development, V&V, and operation phases. Second, recognizing SOTIF’s intrinsic emphasis on the engineering practice, this study systematically outlines typical practical SOTIF activities. These activities span overarching methodological practices as well as specific practices tailored to the development, V&V, and operation phases. In addition, the latest SOTIF practices pertaining to China are presented and discussed. Finally, centering on the aforementioned three phases, an in-depth analysis of the SOTIF future challenges and prospects is conducted.

SOTIF-related academic research is discussed in Section 2. Section 3 describes the practical SOTIF activities undertaken by various stakeholders, including enterprises, academic institutions, and government bodies. Finally, Section 4 provides a concise overview of the SOTIF research perspectives.

2. Academic research

2.1. System-level SOTIF research

System-level SOTIF research is crucial for ensuring the safety and reliability of AVs. This section delves into three key aspects of SOTIF research: development, V&V, and operation phases. Each phase is important for comprehensively addressing the existing SOTIF issues and building self-aware automotive safety systems for monitoring and mitigating potential risks.

2.1.1. Development phase

During the development phase of a system-level SOTIF research project, a central emphasis is placed on hazard analysis and risk assessment (HARA), which are pivotal for safety-guided design [3]. Appropriate safety analysis tools are crucial for ensuring a comprehensive and logical analysis of SOTIF issues. ISO 21448 [1] suggests several promising solutions, including fault tree analysis (FTA), failure mode and effects analysis (FMEA), hazard and operability analysis (HAZOP), and system-theoretic process analysis (STPA), as listed in Table 1 [3], [9], [15], [16], [17], [18], [19], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31].

FTA is a top-down deduction method that has been widely used to analyze the reliability and safety of systems [15]. Schönemann et al. [16] proposed an approach that derives the functional safety requirements for AD based on FTA, considering both FUSA and SOTIF. In contrast, FMEA, a bottom-up induction method [9], focuses on determining all possible failure modes of systems, subsystems, or components. In Ref. [17], a functional HARA based on FMEA was used for identifying possible SOTIF-related hazardous events. HAZOP, a keyword-based brainstorming method, was used for investigating significant deviations in specific behavioral scenarios, thereby allowing analysts to identify possible hazards [18]. In Ref. [19], hazards owing to insufficient functionality of AVs with respect to perception and interpretation of environmental situations, as well as maneuvering and trajectory planning, were aligned with the SOTIF perspective.

Conventional methods rely on direct causal chains and lack a unified guide for hazard analysis [20]. They often focus on component defects and faults within the analyst’s mental model. However, for modern complex autonomous systems, the use of these tools in isolation may not adequately address safety challenges. Moreover, most of these methods are based on the deduction and layering of components rather than considering the entire system. While valuable for simpler systems, they may prove insufficient for highly complex systems, such as Level-5 AVs and systems based on novel mobility concepts.

Addressing safety challenges associated with novel technologies, such as AD, requires augmenting the SOTIF analysis with effective and systematic safety methods. The system-theoretic accident model and process (STAMP) framework, introduced in 2000 [21], is a system-theory-based accident model that emphasizes a comprehensive system analysis. STPA, a potent analytical tool derived from STAMP, uses a top-down risk analysis approach involving four steps: ① definition of analysis goals, ② modeling of control structures, ③ identification of unsafe actions, and ④ pinpointing of loss scenarios.
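As a minimal illustration (not drawn from the surveyed literature), step ③ can be mechanized by crossing each control action with the standard STPA guide words; the ACC control actions and names below are hypothetical and serve only to sketch the enumeration step that an analyst would subsequently review:

```python
from dataclasses import dataclass

# The four standard STPA guide words for identifying unsafe control actions.
GUIDE_WORDS = [
    "not provided",
    "provided unsafely",
    "provided too early / too late",
    "stopped too soon / applied too long",
]

@dataclass
class ControlAction:
    controller: str
    actuator: str
    action: str

def enumerate_unsafe_control_actions(actions):
    """Step 3 of STPA: cross each control action with the guide words to
    produce candidate unsafe control actions (UCAs) for expert review."""
    return [
        f"UCA: '{a.action}' ({a.controller} -> {a.actuator}) is {gw}"
        for a in actions
        for gw in GUIDE_WORDS
    ]

# Hypothetical control structure for an ACC function.
acc_actions = [
    ControlAction("ACC controller", "brake", "apply braking"),
    ControlAction("ACC controller", "powertrain", "request acceleration"),
]
ucas = enumerate_unsafe_control_actions(acc_actions)
print(len(ucas))  # 2 actions x 4 guide words = 8 candidate UCAs
```

Each candidate UCA then feeds step ④, where the analyst searches for concrete loss scenarios that could realize it.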

STPA has proven to be a valuable tool for addressing functional insufficiencies, human misuse, and other factors beyond hardware or software failures in the context of SOTIF analysis, particularly in scenarios involving unsafe interactions among components [3]. Its applications in the automotive domain include safety analyses of various systems such as the drive-by-wire shift system, electronic control system, and adaptive cruise control (ACC) [22], [23], [24], [25]. Moreover, STPA has been utilized in specific safety domains, such as FUSA, cybersecurity, and SOTIF [26], with studies associating it with driving behavior safety guarantee, SOTIF guarantee, and HARA of FUSA [27]. Although STPA is beneficial for SOTIF research, its application in ADSs has certain limitations. Researchers have examined the weaknesses of STPA in the context of ADSs, and have introduced the finite-state machine (FSM) as a complement, particularly for analyzing high-level AVs with multiple autonomous modes and functions [28]. Furthermore, improved STPA methodologies have been proposed, for addressing the challenges associated with multiagent environments [29], [30].

In conclusion, no single analysis tool outperforms the others in all aspects of SOTIF research. Combining the advantages of different analysis tools may be more effective for ensuring a comprehensive and robust safety analysis of AVs [31]. Integrating various methodologies will help develop safer and more reliable ADSs, ultimately leading to widespread adoption and acceptance of this transformative technology.

2.1.2. V&V phase

The V&V phase is critical for generating evidence that components meet their functional requirements, and that an AV’s residual risk is acceptable. Various V&V activities are conducted for assessing the environmental modeling capabilities of sensors and perception models, the ability of decision-making models to handle known and unknown scenarios and make rational decisions, and the stability of systems or functions. These activities involve strategies developed through requirement, internal and external interface, system architecture, trigger event, and functional dependency analyses. This section focuses on system-level V&V methods.

Three approaches stand out for assessing system safety: ① verification, ② falsification, and ③ testing [32], [33], as summarized in Table 2 [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53]. Formal verification [34], [35] has emerged as a noteworthy approach, offering safety proofs through mathematical models and ensuring the SOTIF of AVs across the ODD. It translates a system’s specifications and traffic rules into a machine-readable format, thereby ensuring AV safety. Theorem proving [36], reachability analysis [37], and correct-by-construction synthesis [38] are typical formal verification methods that aid in ensuring SOTIF. Signal temporal logic (STL) [39] supports formal techniques such as the synthesis and verification of trajectory planners, controllers, and runtime monitors [40]. Responsibility-sensitive safety (RSS) [41], [42] is another formal model for verifying AV modules, which uses worst-case assumptions and mathematical induction. Despite the effectiveness of formal verification, it is expensive and lacks scalability for complex, rapidly iterative systems in high-dimensional open scenarios. Researchers have often explored falsification and testing methods for practical V&V of AVs. Common approaches of relevance to AVs include function-based methods, real-world testing, and scenario-based methods.
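To make the worst-case flavor of RSS [41] concrete, the following sketch computes the published minimum safe longitudinal gap for a car-following situation; the parameter values (response time, acceleration and braking bounds) are illustrative assumptions, not normative values:

```python
def rss_min_safe_distance(v_rear, v_front, rho=0.5,
                          a_max_accel=3.0, b_min_brake=4.0, b_max_brake=8.0):
    """Worst-case minimum longitudinal gap (m) per the RSS model: the rear
    vehicle may accelerate at a_max_accel during its response time rho, then
    brakes at no less than b_min_brake, while the front vehicle may brake at
    up to b_max_brake. Velocities in m/s; a negative result is clamped to 0."""
    v_reacted = v_rear + rho * a_max_accel
    d = (v_rear * rho
         + 0.5 * a_max_accel * rho ** 2
         + v_reacted ** 2 / (2 * b_min_brake)
         - v_front ** 2 / (2 * b_max_brake))
    return max(0.0, d)

# Even at equal speeds, the worst-case assumptions demand a positive margin.
print(round(rss_min_safe_distance(v_rear=20.0, v_front=20.0), 2))  # 43.16
```

A runtime monitor would compare the actual gap against this bound and flag a violation whenever the measured distance falls below it.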

Function-based methods involve AV testing based on well-defined system functionalities, making these methods particularly suitable for verifying specific functions, such as those found in advanced driver assistance systems (ADASs) [32], [33]. However, this approach may be limited with respect to capturing system-level interactions and emergent behaviors that can arise in complex AV systems. Thus, it is more appropriate for addressing the functionality of ADASs than the comprehensive safety challenges of AVs.

Real-world testing is advantageous for validating the tested system’s functionality and performance in actual environments, where previously unknown scenarios can be encountered [43]. However, this approach has several limitations. Real-world testing tends to be time- and resource-consuming, and its scope may be limited. Consequently, it can be challenging to simulate the multitude of potential scenarios and edge cases that AVs may encounter, making it difficult to effectively reproduce and isolate relevant issues.

Scenario-based approaches consider scenarios as the basis for evaluating the analyzed system’s safety, and have been widely studied and practiced in both academic and industrial settings in recent years [44]. On the one hand, these approaches effectively use different platforms to reasonably allocate resources, save costs, and further reduce the workload when combined with technologies such as scenario coverage assessment, importance sampling, and risk behavior identification. On the other hand, scenario-based methods can be used not only for SOTIF verification in scenarios corresponding to the identified trigger conditions obtained by HARA, but also for SOTIF validation by extracting the parameter distributions associated with realistic traffic scenarios and conducting random or targeted tests. Thus, these methods facilitate transitions and iterations between V&V activities.

Earlier studies defined scenario-related concepts [45] and constructed general hierarchical scenario models [46] that supported the implementation of scenario-based approaches. The approach involves the following basic steps: ① scenario generation/extraction, ② scenario database establishment, ③ scenario selection, and ④ scenario execution. Knowledge-based scenario generation [47] and data-driven scenario extraction [48] can be used for building the SOTIF scenario database. Selecting specific scenarios from the database is an important step toward determining the representativeness, scenario coverage, and verification costs, where testing and falsification can be used as principles for scenario-selection methods. Considering the complexity and continuity of the scenario parameters, sampling and other approaches can be utilized for selecting scenarios, to achieve accelerated testing. According to the prior information about the scenario parameters, sampling can be performed based on the range [49] or distribution [50] of parameters. During the scenario execution step, different platforms can be selected based on specific requirements, including virtual simulations, hardware-in-the-loop, vehicle-in-the-loop, and physical test sites. The rarity of safety-critical events is the main bottleneck for progress in the AD field, leading to prohibitively high costs associated with V&V in natural driving environments. Consequently, in recent years the emphasis has been on constructing intelligent testing environments [51], [52]. AI-powered back-end agents have been leveraged for validating the safety of AVs. In summary, developing a technology that would encompass all of the existing advantages is exceedingly challenging. Hence, a synergistic approach that combines verification, falsification, and testing is promising for efficiently improving the SOTIF of AVs.
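The range-based sampling and selection steps above can be sketched as follows. This is an illustrative toy example (not from the surveyed literature): the cut-in scenario, its three parameters, and the criticality criterion are assumptions chosen to show how sampling concentrates scarce test resources on safety-critical parameter regions:

```python
import random

random.seed(0)  # deterministic for reproducibility of the sketch

def sample_cut_in_scenarios(n):
    """Range-based sampling of a hypothetical cut-in scenario's parameters."""
    return [
        {
            "v_ego": random.uniform(15.0, 35.0),     # ego speed, m/s
            "v_cut_in": random.uniform(10.0, 30.0),  # cut-in vehicle speed, m/s
            "gap": random.uniform(5.0, 60.0),        # initial gap, m
        }
        for _ in range(n)
    ]

def time_to_collision(s):
    closing = s["v_ego"] - s["v_cut_in"]
    return s["gap"] / closing if closing > 0 else float("inf")

def select_critical(scenarios, ttc_threshold=3.0):
    """Scenario selection: keep only parameter combinations whose initial
    time-to-collision falls below a criticality threshold."""
    return [s for s in scenarios if time_to_collision(s) < ttc_threshold]

pool = sample_cut_in_scenarios(1000)
critical = select_critical(pool)
print(f"{len(critical)} of {len(pool)} sampled scenarios are critical")
```

In practice, the selected subset would then be executed on a simulation or in-the-loop platform, and distribution-based (rather than uniform) sampling [50] would weight the parameters by their real-world frequency.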

Furthermore, current metrics for evaluating the safety of AD encompass various dimensions, including subjective/objective, micro/macro, short-term/long-term, and other categories. A comprehensive summary of commonly used metrics in the field of AV safety performance testing was presented before [53], describing proximal surrogate indicators, driving behavior, and rule-breaking. However, many existing methods focus on the safety of driving, and do not meet the SOTIF V&V requirements. Therefore, it is necessary to develop innovative methods and collect new data while building targeted evaluation metric systems. Finally, but of paramount importance, unified, quantifiable, and actionable SOTIF acceptance criteria are highly necessary, albeit challenging to establish.
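As a small illustration of the proximal surrogate indicators mentioned above, the sketch below computes time-to-collision (TTC) and time headway (THW) over a logged car-following trace; the trace values are synthetic and the 1 Hz format is an assumption for this example:

```python
def ttc(gap, v_follow, v_lead):
    """Time-to-collision (s): gap over closing speed; infinite if opening."""
    closing = v_follow - v_lead
    return gap / closing if closing > 0 else float("inf")

def thw(gap, v_follow):
    """Time headway (s): gap over follower speed."""
    return gap / v_follow if v_follow > 0 else float("inf")

trace = [  # synthetic (gap m, follower m/s, leader m/s) samples at 1 Hz
    (40.0, 25.0, 20.0),
    (35.0, 25.0, 20.0),
    (30.0, 24.0, 20.0),
]
# Worst (minimum) values over the trace serve as episode-level indicators.
min_ttc = min(ttc(g, vf, vl) for g, vf, vl in trace)
min_thw = min(thw(g, vf) for g, vf, vl in trace)
print(round(min_ttc, 2), round(min_thw, 2))  # 7.0 1.25
```

A SOTIF-oriented metric system would need to go beyond such driving-safety surrogates, for example by also scoring perception correctness and ODD compliance, which is precisely the gap noted above.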

2.1.3. Operation phase

The analyses, improvements, and V&V performed during the development phase cannot entirely eliminate residual risks. During the operation phase, ADSs may exhibit functional insufficiencies. Each component of an ADS has its own performance limitations; in addition, the incorporation of AI introduces uncertainties. To address these risks, it is essential to monitor and safeguard against SOTIF-related risks that may arise from a combination of environmental triggering conditions and the internal state of the ADS. This necessitates the development of self-aware automotive-safety systems.

The concept of self-awareness, previously studied in philosophy, psychology, logic, and computation, has been applied to various domains, including robotics and self-driving cars [54]. In the context of AVs, self-awareness refers to a system’s ability to introspect its internal functions and contextual information, with promising implications for ensuring system-level SOTIF in the operation phase. Achieving system-level self-awareness requires simultaneous consideration of the external environment, including the triggering conditions and internal state of the system [55]. When an AV operates in a dynamic and unpredictable environment, it should be capable of autonomously detecting and responding to unknown scenarios with respect to different system layers. This may involve restricting the ODD and degrading the autonomy level. For example, the concept of restriction of the operational design domain (ROD) has been proposed [56]; it is a modified ODD that accounts for monitoring of the current system capabilities.

In the pursuit of self-awareness for improving the safety of AVs, the “Stadtpilot” project stands out as a notable example. The project demonstrated AD on Braunschweig’s inner-city ring road by prioritizing safety, with a safety unit integrated into the tested vehicle’s real-time guidance and control system. The longitudinal control strategy proposed in 2012 [57] enabled dynamic ODD adaptation through sensor-based calculation of the “grip value”. The architecture included a surveillance and safety system embedded within a hierarchical framework that collected data from sensors and actuators to detect system issues and take urgent real-time actions [58]. The safety unit implemented functional degradation actions based on rule-based performance criteria, considering various factors such as the position accuracy, grip value, viewing area, system operation status, and reaction time. Subsequent updates to the functional system architecture for AVs introduced a self-monitoring system [59], [60] that provided information about the ego vehicle’s entities and attributes, including errors or health states. In 2020, the concepts of self-perception, self-representation, and self-awareness [61] were applied to the monitoring and safety decision making of complex AVs, enabling appropriate responses to performance limitations.
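A rule-based safety unit of the kind described above can be sketched as a simple mapping from monitored indicators to degradation actions. The indicator names and thresholds below are invented for illustration and do not reproduce the Stadtpilot rules:

```python
def degradation_level(position_error_m, grip_value, fov_fraction):
    """Hypothetical rule-based monitor: map current capability indicators
    (localization error, road grip estimate, available sensor field of view)
    to a functional degradation action, most severe rule first."""
    if position_error_m > 2.0 or grip_value < 0.2 or fov_fraction < 0.3:
        return "minimal_risk_maneuver"   # capability clearly below ODD needs
    if position_error_m > 0.5 or grip_value < 0.5 or fov_fraction < 0.7:
        return "reduced_speed"           # restrict the ODD, e.g., cap speed
    return "nominal"                     # full intended functionality

print(degradation_level(0.2, 0.9, 1.0))  # healthy system -> nominal
print(degradation_level(0.8, 0.9, 1.0))  # degraded localization -> reduced_speed
```

Real systems layer such rules within a hierarchical surveillance architecture and combine them with reaction-time constraints, but the core pattern of thresholded capability monitoring driving ODD restriction is the same.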

Despite significant progress in ADSs, academic research on SOTIF remains relatively limited, and a systematic approach to ensuring SOTIF in AVs is lacking. Recently, Shao et al. [62] proposed an onboard safety protection system that monitors AI models, ODD, and traffic compliance of AVs. The model demonstrated self-adaptive decision-making and planning to ensure SOTIF, as shown in Fig. 5. Further studies and development are necessary for exploring system-level SOTIF solutions, alongside advances in the AV technology. Establishing the safety and reliability of AVs with respect to realistic scenarios requires comprehensive and rigorous studies.

2.2. Algorithm-related SOTIF research

Algorithms play a crucial role in the various functional modules of AVs. With advances in AV technology, there has been a notable shift toward the integration of AI approaches, which significantly improves the intelligence level of these vehicles [63]. However, this transition to AI-based algorithms has introduced new challenges for ensuring SOTIF. The high complexity of the underlying models, as well as the uncertainty and lack of interpretability associated with AI algorithms, have attracted considerable attention from academic researchers, highlighting the need for robust SOTIF solutions. Therefore, this section primarily focuses on AI approaches, particularly learning-based algorithms, while providing insights into relevant conventional algorithms. The overall framework is illustrated in Fig. 6.

2.2.1. Development phase

2.2.1.1. General development process

The development phase consists of three main sub-phases: ① requirement analysis, ② data acquisition and processing, and ③ model design and training. Each sub-phase has unique challenges, potentially leading to unsatisfactory performance of AVs and SOTIF issues [64].

(1) Requirement analysis: Before implementing an algorithm for a specific model, it is essential to clearly define the model’s input, output, scope of application, functional requirements, and technical specifications [64], [65]. Ambiguous or incorrect requirements can significantly affect downstream sub-phases, and an insufficient definition of intended functions is a key factor underlying inadequate specifications. Research on requirement analysis for model development is currently limited, making the integration of systems engineering theory and AI an important research direction [66].

(2) Data acquisition and processing: Machine learning requires data. After identifying specific requirements, sufficient data must be collected to build an appropriate dataset [67]. The quality of the training data directly determines the performance of the model. Inadequate training data lead to poor robustness and insufficient generalization of the trained machine-learning model. AV scenarios are highly diverse, posing challenges for obtaining sufficient amounts of relevant data, compounded by the cost and technical limitations associated with the processes of data collection and labeling. Non-learning-based algorithms also require realistic data for parameter calibration or verification; the results of these processes can be affected by one-sidedness and noise in the collected data [68]. To address these data-related challenges, researchers have explored solutions such as reducing data acquisition costs by using large-scale, low-cost methods, and optimizing the training process using techniques like data augmentation, adversarial training, and transfer learning [69], [70], [71].

(3) Model design and training: Model design involves the selection of appropriate algorithms and modeling techniques to fulfill various functional requirements. However, different models have different assumptions, and real-world environments often do not fully conform to these assumptions, limiting the practical effectiveness of the models. Addressing these challenges requires exploring solutions that relax theoretical assumptions [72]. In addition, improper reward or loss functions during training can compromise the models’ performance. Adopting more direct rewards, avoiding complex subjective rewards, and learning the reward function directly from data can help mitigate subjectivity [73]. Ensuring convergence, avoiding overfitting and underfitting, and adapting the models to realistic operating environments are crucial for ensuring the models’ safety and generalization. Regularization techniques such as dropout and batch normalization are effective in addressing overfitting. Furthermore, to address convergence difficulties in imitation learning (IL)-based training, researchers have proposed alternative algorithms that gradually transform the learner’s policy from an expert policy to a learned policy over several interaction epochs [74]. These approaches mitigate problems such as error accumulation and regret bounds.
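Among the mitigation techniques named in sub-phase ②, data augmentation is the simplest to illustrate. The following toy sketch (not from the surveyed literature) treats each training sample as a list of 2-D obstacle points and expands the dataset by horizontal mirroring and small positional jitter, without collecting new data:

```python
import random

random.seed(2)  # deterministic for reproducibility of the sketch

def mirror(points):
    """Reflect a sample's points across the vertical axis (left/right swap)."""
    return [(-x, y) for x, y in points]

def jitter(points, sigma=0.05):
    """Perturb each point with small Gaussian noise to simulate sensor variation."""
    return [(x + random.gauss(0, sigma), y + random.gauss(0, sigma))
            for x, y in points]

# Two hypothetical perception samples (lists of 2-D obstacle positions, m).
dataset = [[(1.0, 2.0), (3.0, 4.0)], [(0.5, 1.5)]]
augmented = dataset + [mirror(s) for s in dataset] + [jitter(s) for s in dataset]
print(len(augmented))  # 3x the original sample count
```

Adversarial training and transfer learning follow the same spirit of enlarging the effective coverage of the training distribution, but operate on the loss function and on pre-trained models, respectively.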

In rule-based methods, the aforementioned data-collection and model-training processes are replaced by expert-guided rule formulation. Although this approach circumvents performance limitations stemming from data scarcity and inadequate training, its applicability to the intricacies of long-tail scenarios in AD remains limited, owing to the inherent constraints on the expert knowledge. The complexity, diversity, and dynamic nature of ADSs and possible scenarios can pose significant challenges to effectively expanding and maintaining rule libraries. Consequently, this method may be inadequate for addressing the complexities of high-level ADSs in a comprehensive and scalable manner, warranting more sophisticated and data-driven approaches for optimal performance and adaptability.

2.2.1.2. Key SOTIF challenges for AV algorithms: uncertainty and inexplicability

The uncertainty and inexplicability of AI present significant challenges hindering the development of safe and reliable AVs [75], [76]. These challenges directly affect the SOTIF of AV algorithms, leading to several critical issues.

AI uncertainties, specifically model and data uncertainties, pose a major challenge for the development of AV algorithms [77], [78], [79]. Model uncertainty stems from a lack of knowledge or data in the model, and can result from incomplete requirement analysis, insufficient or biased training data, improper model design, and inadequate training. This uncertainty can lead to ambiguous and ill-defined model functionalities, making it difficult to define the technical requirements and specifications of AVs precisely. As a result, the subsequent sub-phases of AV development, such as data acquisition and processing as well as model design and training, may suffer from performance limitations and inefficiencies, owing to unclear model objectives and functionality. Moreover, model uncertainty may hinder the ability of the algorithms to handle novel and unforeseen scenarios, thereby increasing the risk of SOTIF-related incidents during the operation of AVs.

Similarly, data uncertainty, which arises from errors and uncertainties in the data, can significantly affect the performance of AV algorithms [78], [80]. Inaccurate labeling, noisy sensor data, and variations in real-world conditions contribute to data uncertainty, leading to incorrect and unstable perceptions and decision-making. Such uncertainties can have severe consequences in critical driving situations because AVs rely heavily on accurate and reliable sensor inputs and perception. Consequently, mitigating data uncertainty is crucial for improving the robustness and reliability of AV algorithms as well as for reducing the risk of SOTIF-related incidents.

Inexplicability, particularly the black-box behavior exhibited by many machine-learning models, introduces additional challenges with respect to the development of AV algorithms [81], [82], [83]. Deep-learning models such as deep neural networks (DNNs) are known for their impressive performance on various tasks, including perception and decision-making. However, the lack of interpretability of these models makes it difficult to understand their internal workings and reasoning processes. This lack of interpretability can hinder requirement analysis and model validation, as it becomes challenging to comprehend the logic underlying decisions and guarantee that AV algorithms satisfy critical safety requirements. In addition, the inexplicability of certain machine-learning models increases the cost and complexity of evaluating the models’ reliability and safety, increasing the difficulty of establishing safety assurance measures.

To address these challenges, various studies have been conducted to improve uncertainty quantification and interpretability. As shown in Table 3 [84], [85], [86], [87], [88], [89], [90], [91], [92], [93], [94], [95], the existing methods of model uncertainty quantification primarily include the Bayesian neural network (BNN)-based [84], [85], [86], [87], ensemble-based [88], [89], [90], and single-pass methods [91], [92]. On the other hand, data-uncertainty quantification methods can be categorized into discriminative [80], [93] and generative methods [94], [95]. By modeling and understanding uncertainty, developers can optimize the design process and reduce residual risks, thereby improving the performance of AV algorithms in challenging and uncertain environments [91], [96], [97]. Moreover, better interpretability methods are essential for improving the transparency and interpretability of AV algorithms [98], [99]. Techniques such as prototype construction, feature identification, and explanation models help to understand the reasoning behind the AV model outputs and ensure that technical requirements are met effectively.
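As a minimal illustration of the ensemble-based family of methods, the sketch below trains several linear regressors on bootstrap resamples of the same data and uses the spread of their predictions as an epistemic-uncertainty estimate; the spread grows for inputs far from the training domain. The toy data-generating process and all parameters are illustrative assumptions, not any cited method.

```python
import random
from statistics import mean, pstdev

def fit_line(points):
    # Least-squares slope of y = kx through the origin.
    num = sum(x * y for x, y in points)
    den = sum(x * x for x, y in points) or 1.0
    return num / den

def make_ensemble(data, n_members=10, seed=0):
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        boot = [rng.choice(data) for _ in data]  # bootstrap resample
        members.append(fit_line(boot))
    return members

def predict(members, x):
    preds = [k * x for k in members]
    # Mean prediction and ensemble spread (a proxy for epistemic uncertainty).
    return mean(preds), pstdev(preds)

# Noisy observations of y = 2x, with x confined to [0.05, 1.0].
rng = random.Random(1)
data = [(x / 20, 2 * x / 20 + rng.gauss(0, 0.1)) for x in range(1, 21)]
members = make_ensemble(data)
```

Querying the ensemble inside the training range yields a small spread, whereas a query far outside it yields a much larger one, signaling that the prediction should not be trusted.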

In conclusion, the key SOTIF challenges for AV algorithms are the uncertainties and inexplicabilities inherent to AI systems. Addressing these challenges is crucial for ensuring the safe and reliable deployment of AVs [75], [76], [81]. By quantifying and managing uncertainty and by improving interpretability, developers can improve the functionality and safety of AV algorithms, reduce the risk of SOTIF-related incidents, and foster widespread adoption of AV technology.

2.2.1.3. Special considerations for different layers of modules

In addition to the general algorithm-development process and challenges, addressing SOTIF issues in AVs requires a detailed examination of algorithm-related challenges and solutions with respect to specific functional modules. This section focuses on the perception and decision-making layers of AVs and explores their respective SOTIF concerns and mitigation strategies, as shown in Table 4 [100], [101], [102], [103], [104], [105], [106], [107], [108], [109], [110], [111], [112], [113], [114], [115], [116], [117], [118], [119], [120], [121], [122], [123], [124], [125], [126], [127], [128].

(1) Perception layer: The perception layer is fundamental to AVs, encompassing crucial tasks such as recognition, scene reconstruction, motion estimation, and path tracking. Most state-of-the-art methods rely on machine learning, particularly deep-learning techniques. However, insufficient perception can lead to erroneous and unstable results, posing safety risks, because the planning and control modules rely heavily on accurate perception.

One of the prominent SOTIF risks in the development phase of the perception layer is related to conceptual and labeling uncertainties. Ambiguities and inaccuracies in requirement analysis and data labeling can lead to unsatisfactory performance and affect the safety of AVs. To address these issues, formal scenario representations, unified labeling standards, and improved data systems have been proposed [100]. In addition, research aims to improve the accuracy, robustness, generalization, and interpretability of perception models by improving the data, algorithms, and overall pipeline development [101], [102], [103], [104].

With respect to the perception functionality of AVs, challenges arise from scenario, sensor-input, and model uncertainties. These can lead to unsatisfactory perception under adverse conditions or in dynamic scenarios. It is necessary to improve the accuracy, robustness, generalization, and interpretability of existing perception models for proper operation in these complex traffic scenarios [105], [106].

To address the limitations of a single-perception module, sensor-fusion techniques have been employed to integrate information from multiple sensors [107]. This approach provides redundancy and enhances the overall perception reliability. Sensor-fusion methods include data-, feature-, and decision-level fusion approaches, each of which has strengths in certain scenarios [108], [109]. Furthermore, collaborative perception technologies that incorporate information from roadside units and city perceptions have been proposed for improving perception in complex urban traffic scenarios [110], [111].
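Decision-level fusion, the simplest of the three fusion levels mentioned above, can be sketched as a confidence-weighted vote over per-sensor classifications. The sensor names, labels, and confidence values below are purely illustrative assumptions.

```python
from collections import defaultdict

def fuse_decisions(detections):
    """Decision-level fusion: each sensor votes for a class label,
    weighted by its reported confidence; the highest-scoring label wins."""
    scores = defaultdict(float)
    for sensor, label, confidence in detections:
        scores[label] += confidence
    return max(scores, key=scores.get)

# Two sensors agree on "pedestrian"; their combined weight outvotes the radar.
fused = fuse_decisions([
    ("camera", "pedestrian", 0.9),
    ("lidar",  "pedestrian", 0.6),
    ("radar",  "vehicle",    0.7),
])
```

The redundancy benefit is visible here: a single misclassifying sensor does not override two agreeing ones.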

(2) Decision-making layer: The AV decision-making layer faces the complexities, uncertainties, and diverse nature of the possible scenarios. Inadequate decision-making directly affects the AV responsiveness with respect to dangerous scenarios, leading to safety risks. Decision-making methods can be broadly categorized into rule-based and learning-based approaches [112], [113], [114].

Rule-based methods, such as the FSM and model predictive control (MPC), provide interpretability and transparency but may suffer from insufficient specifications owing to the limitations of expert knowledge. Optimizing rule-design specifications and improving the underlying prediction models have been proposed to strengthen rule-based decision-making [115], [116], [117], [118], [119], [120]. In addition, AI-based prediction modules can be incorporated to improve the decision model’s understanding of scenarios.

Learning-based methods, such as deep reinforcement learning (RL) and IL, are promising for decision-making in complex and uncertain scenarios. However, the reliability and generalizability of learning-based methods depend on sufficient data and accurate modeling [112], [121]. Efforts have been made to improve the interpretability of AI-based decision-making algorithms using methods such as inverse reinforcement learning (IRL), conservative Q-improvement RL algorithms, and maximum entropy RL [122], [123], [124]. However, interpretability research remains nascent [125].

Hybrid decision-making combines rule-based and learning-based methods to leverage their complementary strengths and provide redundancy. This approach can address the functional insufficiency of individual decision-making methods [126], [127], [128]. For example, knowledge or rules can be used for tuning reward functions, exploration processes, output actions, or policy training iterations, to ensure that a conservative and safe policy is activated when needed.
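One common hybrid pattern, overriding a learned policy’s output action with a rule-based safety filter, can be sketched as follows. The stand-in learned policy, the time-headway rule, and all thresholds are hypothetical values chosen for illustration, not a cited design.

```python
def learned_policy(gap_m, speed_mps):
    # Stand-in for a learned longitudinal policy (hypothetical logic):
    # accelerate when the gap to the lead vehicle looks comfortable.
    return 1.0 if gap_m > 2.0 * speed_mps else 0.0

def safety_filter(gap_m, speed_mps, action):
    # Rule-based override: enforce a conservative minimum time headway.
    time_headway = gap_m / speed_mps if speed_mps > 0 else float("inf")
    if time_headway < 1.5:   # below safe headway: brake regardless of the policy
        return -3.0
    return action

def hybrid_policy(gap_m, speed_mps):
    # The learned action is only applied if the rule layer accepts it.
    return safety_filter(gap_m, speed_mps, learned_policy(gap_m, speed_mps))
```

The learned component retains authority in nominal conditions, while the rule layer guarantees that a conservative action is taken whenever its (interpretable) safety condition is violated.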

In conclusion, addressing algorithm-related SOTIF issues in AVs requires a comprehensive approach that considers both the common challenges of algorithm development and the specific requirements of different functional modules. By combining general improvements with module-specific strategies, AVs can achieve high safety and reliability in real-world scenarios.

2.2.2. V&V phase

Model evaluation is critical for ensuring that a trained AV algorithm meets the requirements of safe and reliable operation [129]. Incorrect or imprecise evaluation can yield functional insufficiencies during the operation phase, resulting in severe SOTIF issues. To address these challenges, various model-evaluation techniques have been developed, including formal verification and model testing.

Formal verification is a rigorous mathematical method for proving the correctness of a model [130]. It provides deterministic guarantees by verifying whether the analyzed model satisfies specified properties or requirements. Model checking, satisfiability modulo theories (SMT), and mixed-integer linear programming (MILP) are examples of formal verification techniques applied to AV algorithms [131], [132]. Model checking exhaustively checks all possible states of a model to ensure the absence of errors, whereas SMT and MILP use mathematical logic to verify model properties. Although these methods offer reliable and complete verification results, they may be difficult to scale to complex models. One-sided and convergence guarantees provide bounded estimates of sufficient conditions for specific properties and are more scalable for complex models [133], [134]. Statistical guarantees, on the other hand, quantify the probability that certain assumptions hold, making these guarantees useful for robustness assessment [135]. However, as the models’ complexity increases and interpretability decreases, the application of formal verification techniques to large-scale and complex AV models becomes increasingly challenging.
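A simple instance of such bounded (one-sided) guarantees is interval bound propagation, which soundly over-approximates the output range of a network over an entire input region rather than checking individual points. The two-neuron ReLU layer and its weights below are an illustrative toy, not a cited verification tool.

```python
def interval_affine(lo, hi, weights, bias):
    # Soundly propagate an input box through y = Wx + b: each output bound
    # picks the input endpoint that extremizes the term, per weight sign.
    out_lo, out_hi = [], []
    for w_row, b in zip(weights, bias):
        l = b + sum(w * (lo[i] if w >= 0 else hi[i]) for i, w in enumerate(w_row))
        h = b + sum(w * (hi[i] if w >= 0 else lo[i]) for i, w in enumerate(w_row))
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi

def interval_relu(lo, hi):
    # ReLU is monotone, so applying it to the bounds remains sound.
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]

# Bound a tiny 2-in/2-out ReLU layer for ALL x in [0, 1] x [0, 1] at once.
W, b = [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.2]
lo, hi = interval_affine([0.0, 0.0], [1.0, 1.0], W, b)
lo, hi = interval_relu(lo, hi)
```

If a safety property (e.g., "output 1 never exceeds 1.0") holds on the computed bounds, it provably holds for every input in the region; if not, the result is inconclusive, which is exactly the one-sided character described above.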

Another approach to model evaluation is test-based evaluation, which includes white- and black-box testing [136], [137]. White-box testing involves verifying the internal logic of a model based on a sufficient understanding of its structure. Techniques such as mutation testing, metamorphic testing, and adversarial testing have been used for white-box testing of AV algorithms [138], [139], [140]. Although white-box testing can detect and localize defects in the model, it may face challenges in large-scale applications, owing to the complexity of AV models and the vast amount of data involved. By contrast, black-box testing assesses the correctness of the analyzed model without detailed knowledge of its internal structure [141]. Validators evaluate the analyzed model’s output for given inputs and compare the results with the expected behavior. Black-box testing is reusable and easier to apply; however, it may not provide the same level of rigor as formal verification or white-box testing. Rigorous verification and accurate localization of model defects in black-box testing can be challenging.
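The core idea of metamorphic testing, checking a relation between outputs rather than an output against a known ground truth, can be sketched as follows. The system under test (a nearest-obstacle-distance function) and the translation relation are illustrative assumptions.

```python
def nearest_obstacle_distance(ego, obstacles):
    # System under test: Euclidean distance from the ego position
    # to the closest obstacle.
    return min(((ox - ego[0]) ** 2 + (oy - ego[1]) ** 2) ** 0.5
               for ox, oy in obstacles)

def metamorphic_translation_test(ego, obstacles, dx, dy, tol=1e-9):
    """Metamorphic relation: translating the whole scene by (dx, dy)
    must leave the output unchanged, even though the 'correct' distance
    for either scene need not be known."""
    original = nearest_obstacle_distance(ego, obstacles)
    shifted = nearest_obstacle_distance(
        (ego[0] + dx, ego[1] + dy),
        [(ox + dx, oy + dy) for ox, oy in obstacles])
    return abs(original - shifted) < tol
```

Because the relation holds for arbitrary shifts, one seed scenario generates unboundedly many follow-up test cases, which is what makes the technique attractive when labeled oracles are scarce.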

Conventional model-evaluation metrics may be inadequate for assessing SOTIF requirements [142], [143]. Metrics such as precision, recall, the receiver operating characteristic (ROC), and the root mean square error (RMSE) are commonly used in model testing; however, they do not directly capture SOTIF-related safety concerns. Researchers have started redefining the model-evaluation criteria from the safety perspective, for establishing a more direct connection between SOTIF requirements and algorithm evaluation.
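The gap between conventional and safety-oriented metrics can be made concrete with a hypothetical severity-weighted miss rate: two detectors with identical recall can differ sharply once each missed positive is weighted by the severity of its scenario. Both the metric and the severity scores below are illustrative constructions, not metrics from the cited works.

```python
def recall(predictions, labels):
    # Conventional metric: fraction of positives that were detected.
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    pos = sum(labels)
    return tp / pos if pos else 1.0

def severity_weighted_miss_rate(predictions, labels, severities):
    """Hypothetical safety-oriented metric: each missed positive is
    weighted by the severity of the scenario in which it occurred."""
    total = sum(s for y, s in zip(labels, severities) if y)
    missed = sum(s for p, y, s in zip(predictions, labels, severities)
                 if y and not p)
    return missed / total if total else 0.0

labels     = [1, 1, 1, 0]
severities = [10.0, 1.0, 1.0, 1.0]  # the first positive is safety-critical
det_a = [0, 1, 1, 0]  # misses the severe case
det_b = [1, 1, 0, 0]  # misses a mild case
```

Both detectors score a recall of 2/3, yet `det_a` is far worse from a SOTIF standpoint, a distinction only the weighted metric exposes.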

In the process of integrating models into AVs, several general techniques have been proposed to control computation and memory costs, including model compression, hardware acceleration, and efficient DNN processing [144], [145]. However, it is essential to explicitly identify and analyze the potential performance degradation caused by these techniques, for ensuring that AV algorithms maintain their safety and reliability.

In conclusion, effective V&V of AV algorithms is crucial for ensuring their safety and reliability. Formal verification techniques offer rigorous mathematical proofs of correctness but may face scalability challenges in the case of complex models. Test-based evaluation approaches, including white-box and black-box testing, are complementary to the models’ correctness assessment, but may require addressing issues related to complexity and rigor. Redefining the model evaluation metrics from the safety perspective is vital for establishing a direct connection between SOTIF requirements and algorithm evaluation. Furthermore, careful consideration of the impact of integration techniques on the model performance is necessary for maintaining the safety and effectiveness of AV algorithms.

2.2.3. Operation phase

During the operation phase, AD scenarios exhibit openness, high dimensionality, dynamism, and complexity. AV models must process diverse inputs encompassing static and dynamic distortions as well as edge cases. Long-tail scenarios make it challenging to fully eradicate residual risks during development, resulting in potential functional insufficiencies [146]. To counter this, it is essential to be able to monitor the developed model’s performance in the operation phase.

Runtime monitors are typically utilized in the operation phase for detecting functional insufficiencies by identifying abnormal data and model states. This topic has garnered significant attention from researchers [147], [148], [149], [150], with monitoring strategies falling into three categories: ① input monitoring, ② internal state monitoring, and ③ output monitoring.

2.2.3.1. Input monitoring

This approach leverages the original model’s inputs to create a runtime monitor aimed at predicting potential performance deterioration. Inputs are categorized as in-distribution (ID) or out-of-distribution (OOD), based on their alignment with the model’s training domain. Errors may arise when processing ID data owing to flawed model design or training, while OOD data can significantly undermine the performance of learning-based models owing to distribution shifts, with adversarial attacks posing a notable threat [151].

The training and evaluation of the model can guide the creation of monitors for effectively detecting ID errors. The training of an auxiliary safety model based on the outcomes of the main model during the development phase enables the prediction of potential main model failures. For instance, a student model trained on the input generated by a perception system and the corresponding steering accuracy can predict AV steering control errors [152]. The disparity between the prediction of the main model and human operations produces an error score, which enhances the performance monitoring of the main model [153]. Drawing from similar workspace experiences contributes to the probabilistic prediction of the model performance [154], accounting for input similarities in terms of geographic location and appearance.

Techniques such as anomaly, OOD, and outlier detection are promising. They assume that the model’s training data adhere to a certain distribution. Anomalies, OOD instances, and outliers exhibit low probabilities within the assumed data distribution or deviate from it significantly, thereby impacting model predictions and necessitating detection. Common methods encompass clustering-based, probability-based, classification-based, distance-based, and reconstruction-based approaches [155], [156], [157], [158], [159]. DNNs excel in anomaly detection [160], as they are capable of identifying new or unseen patterns in data. Long short-term memory (LSTM) networks are well-suited for detecting anomalies in time-series data [161], while autoencoders encode and decode data, capturing abnormal instances based on high reconstruction loss [162]. Variational autoencoders (VAEs) and generative adversarial networks (GANs) are also harnessed for anomaly detection [163], [164].
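The distance-based family of methods admits a very compact sketch: score a runtime sample by its mean distance to its k nearest training points, so that inputs far from the training distribution receive high anomaly scores. The one-dimensional toy features below are illustrative assumptions.

```python
def knn_outlier_score(sample, train_data, k=3):
    """Distance-based monitoring: a sample's anomaly score is its mean
    distance to the k nearest training samples (here 1-D features)."""
    dists = sorted(abs(sample - x) for x in train_data)
    return sum(dists[:k]) / k

# In-distribution training features cluster near zero.
train_features = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05]
```

An in-distribution query near the cluster scores low, whereas a far-away query scores high and can be flagged against a calibrated threshold.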

2.2.3.2. Internal state monitoring

This approach focuses on extracting the model features for evaluating the current state. For example, the features in the hidden layers of a DNN can serve as inputs to the monitor, and techniques such as anomaly detection can then be used to identify potential model performance degradation [165]. Supervised and unsupervised methods are the primary techniques.

Supervised methods [166] employ metrics such as accuracy during the main model training process as labels for performance detectors. The true class probability (TCP) acts as a confidence criterion learned by networks such as ConfidNet [165]. Cascaded neural networks monitor object-detection models by analyzing hidden features [167]. Furthermore, mean, maximum, and statistical pooling techniques have shown some promise for online monitoring using the internal integration method [168]. General supervised methods excel at detecting ID errors but lack OOD generalization.

Unsupervised methods model the internal state of the original model with respect to the training set and detect operation-phase inputs that differ from the training data. Abstraction-based frameworks extract values from specific neural network layers to create box abstractions, which enable anomaly detection [169]. Distance-based and density-based methods utilize multiple hidden layers for anomaly detection [170]. For example, the Mahalanobis distance measures the test samples’ density over the feature space, enabling the detection of OOD samples and adversarial attacks [171]. Moreover, gradient spatial information may be promising as well, with GradNorm proposed for detecting distribution shifts [172]. However, unsupervised methods may yield less intuitive trends, requiring postprocessing for the desired anomaly scores.
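In the spirit of the Mahalanobis-distance approach of Ref. [171], the following sketch fits a Gaussian to toy two-dimensional hidden features from the training set and scores runtime features by their squared Mahalanobis distance; the closed-form 2 × 2 covariance inverse and the feature values are illustrative simplifications.

```python
from statistics import mean

def fit_gaussian(features):
    # Fit the mean and 2x2 covariance of training-set hidden features.
    mx = mean(f[0] for f in features)
    my = mean(f[1] for f in features)
    n = len(features)
    sxx = sum((f[0] - mx) ** 2 for f in features) / n
    syy = sum((f[1] - my) ** 2 for f in features) / n
    sxy = sum((f[0] - mx) * (f[1] - my) for f in features) / n
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(f, mu, cov):
    # Squared Mahalanobis distance, with the 2x2 inverse written out.
    (mx, my), (sxx, sxy, syy) = mu, cov
    det = sxx * syy - sxy * sxy
    dx, dy = f[0] - mx, f[1] - my
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Toy in-distribution hidden features extracted during training.
id_features = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
mu, cov = fit_gaussian(id_features)
```

A runtime feature near the training cluster yields a small distance, while a shifted (OOD or adversarially perturbed) feature yields a large one that can be thresholded.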

2.2.3.3. Output monitoring

Output monitoring assesses the model performance by analyzing output information. Models with confidence or performance estimates allow direct runtime monitoring. However, most models lack estimates or have limited estimation accuracy. Therefore, it is necessary to endow models with introspection abilities using specific methods or rules.

During classification, neural networks typically predict class probabilities using the softmax function. Using the maximum softmax probability to detect misclassified and OOD examples was proposed, as a low maximum probability indicates likely model failure [173]. Temperature scaling and minor perturbations have been introduced for better distinction of ID and OOD data using softmax scores [174]. However, the efficacy of these monitors relies significantly on the training and validation set configurations, which limits their OOD detection capability. Thus, certain studies have introduced model uncertainty quantification methods [175], [176], allowing models to understand what they know or do not know. Kaur et al. [177] introduced an object-oriented ensemble method that detects samples with high epistemic or aleatoric uncertainty, which is considered an effective OOD detection technique. In addition, UNCERTAINTY-WIZARD, an open-source tool, aggregates typical uncertainty estimation methods, such as Monte Carlo dropout and deep ensembles, for uncertainty monitoring [178].
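The maximum-softmax-probability baseline of Ref. [173], with the temperature parameter of Ref. [174], reduces to a few lines. The confidence threshold and the example logits below are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax with optional temperature scaling.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def msp_monitor(logits, threshold=0.7, temperature=1.0):
    """Maximum-softmax-probability monitor: flag the prediction as suspect
    when the top class probability falls below a threshold."""
    confidence = max(softmax(logits, temperature))
    return confidence, confidence < threshold
```

Sharply peaked logits pass the check, while near-uniform logits, typical of misclassified or OOD inputs, are flagged for a fallback response.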

Multiple outputs can be compared for identifying abnormal states in specific model outputs. For example, three methods of consistency analysis have been used for detecting errors: ① temporal consistency analysis, obtaining identical model outputs at different times; ② spatial consistency analysis, acquiring differing model outputs simultaneously; and ③ spatiotemporal consistency analysis, which combines ① and ② [179], [180]. Moreover, formal verification [181] is also prevalent in online model status monitoring, yielding remarkable outcomes in AV safety verification.
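Temporal consistency analysis (method ① above) can be sketched as a check that a scalar model output does not jump implausibly between consecutive control cycles. The output quantity (estimated distance to a lead vehicle) and the jump threshold are illustrative assumptions.

```python
def temporal_consistency_monitor(outputs, max_jump=0.5):
    """Flag cycles where the model's output (e.g., estimated distance to a
    lead vehicle, in meters) changes more than is physically plausible
    between consecutive frames."""
    flags = []
    for prev, cur in zip(outputs, outputs[1:]):
        flags.append(abs(cur - prev) > max_jump)
    return flags

# A sudden drop to 4.0 m and the jump back are both flagged as inconsistent.
flags = temporal_consistency_monitor([10.0, 9.8, 9.7, 4.0, 9.5])
```

Spatial consistency analysis follows the same pattern, except that the compared outputs come from redundant models or sensors at the same time step rather than from the same model at different time steps.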

In conclusion, monitoring the model state during the operation phase is vital for mitigating the risks stemming from functional inadequacies of AV algorithms. Various runtime monitoring techniques, including monitoring of the input, internal state, and output of the model, have been explored for detecting potential performance degradation and deviations from the training data distribution. These monitoring strategies critically improve the safety and reliability of AVs.

3. Practical SOTIF activities

This study attempts to examine recent SOTIF-related activities among original equipment manufacturers (OEMs), TIER1 suppliers, new AV companies, and government agencies based on limited-release information. In recent years, numerous companies in the AD industry, including BMW [182], Baidu [183], GM [184], Ford [185], Nuro [186], NAVYA [187], and NVIDIA [188], have integrated the SOTIF standard into their development pipelines, emphasizing its significance for AD-related safety reports. SOTIF, along with the FUSA and cybersecurity standards, is critical for ensuring the systems’ safety and their ability to mitigate functional insufficiencies, failures, and cyberattacks during development. For example, BMW incorporated both the FUSA and SOTIF standards into its design and verification processes, implementing sensor, decision-making-channel, and actuator redundancies for improved safety. Similarly, Bosch [189] proposed an improved V-shaped development process for SOTIF, encompassing function description, risk identification, failure analysis, functional modifications, V&V tests, and formalized documentation for SOTIF release. In addition, a joint white paper [190] by 11 companies, including APTIV, underscored the significance of SOTIF, FUSA, and cybersecurity as crucial domains of AD dependability. The white paper introduced 12 principles, 13 AD capabilities, and 20 elements for realization, emphasizing safety by design and highlighting V&V strategies. This led to the release of a new technical report, ISO/TR 4804 [64], specific to the AD industry.

Following this introduction, the subsequent sections provide an overview of the relevant typical practices in the development, V&V, and operation phases. In addition, a dedicated section illustrates typical SOTIF practices in China.

3.1. Development phase

Several companies have extended their safety analyses to include SOTIF considerations. For instance, Continental AG [22] adopted a systems analysis method based on STPA for a fully automated driving vehicle (FAV) project covering FUSA, SOTIF, and cybersecurity. ANSYS [191] updated its ANSYS Medini Analyze tool to handle FUSA and SOTIF-related risks through FTA, while also exploring STPA for SOTIF analysis.

Institutions such as the Netherlands Organization for Applied Scientific Research (TNO) and NHTSA have undertaken SOTIF analysis projects. TNO and Volvo collaborated on the ENSEMBLE project, analyzing a multibrand fleet under various scenarios and proposing countermeasures for SOTIF-related risks. NHTSA integrated SOTIF activities into its safety analysis process to identify hazards, triggering events, and mitigation measures [192].

Various companies have proposed measures for addressing SOTIF-related risks. For example, HELLA [193] proposed using heating elements in front of a LIDAR to mitigate cold weather effects; it also proposed using AI for better environmental perception. Mobileye [194] focused on risk-reduction strategies, such as monitoring severe weather and increasing algorithm redundancy.

Research projects such as the ICADAC project seek to enhance camera-based object detection under adverse conditions. The European Union (EU) aDverse wEather eNvironmental Sensing systEm (DENSE) project, led by Mercedes-Benz, developed all-weather AD sensor suites through testing and research, including parameter tuning, hardware improvement, algorithm optimization, and data fusion.

3.2. V&V phase

Recently, V&V practices for AVs that consider SOTIF issues have garnered considerable attention. A prominent example is the PEGASUS joint project, initiated by the Federal Ministry for Economic Affairs and Energy (BMWi) in 2016, which targeted critical gaps in the testing of highly automated driving functions. Focusing on the highway chauffeur as a representative test subject, the project devised a comprehensive 21-step methodology encompassing activities such as requirement analysis, data processing, database creation, safety assessment, and argumentation [195], [196]. This project serves as a typical scenario-based V&V approach that incorporates SOTIF principles. Another noteworthy endeavor is ENABLE-S3, an EU-driven initiative aimed at validating highly automated, safe, and secure systems. The project concentrated on formulating and promoting corresponding test frameworks, technologies, platforms, environments, standards, and ecosystems [197], [198]. Some companies have dedicated themselves to driving the implementation of scenario-based V&V systems. Collaborative efforts between SIEMENS and IVEX led to advances in scenario-based testing of ADSs [199]. They introduced an integrated software toolchain and safety models using V&V metrics. Software tools such as Simcenter PreScan, Simcenter Amesim, and HEED have been instrumental in expediting the development of ADSs.

The SAKURA project, funded by the Ministry of Economy, Trade, and Industry (METI) in Japan, pursued the development of an engineering process and related technologies for the safety assurance of ADSs, as shown in Fig. 7. The project categorized scenarios into four groups based on their foreseeability and preventability and defined safety requirements to ensure that AVs do not cause any foreseeable and preventable traffic accidents resulting in injury or death within their ODD. SAKURA proposed the construction of a SOTIF safety structure based on these safety requirements and established a process for deriving test scenarios. Essential functional scenarios were identified, including instances of traffic, perception, and vehicle interferences, which encompass typical SOTIF hazard-triggering conditions. Subsequently, logical scenarios with foreseeable implications were identified by collecting and processing real-world data, as well as by extracting the distributions of the different scenario parameters [200], [201], [202]. Specific scenarios were then extracted using parameter sampling or search methods, and subsequently evaluated using track tests, simulations, and on-road tests.

To facilitate the internationalization and standardization of SAKURA’s outcomes, the project’s expert group led the development of ISO 34502 [14], emphasizing the importance of SOTIF considerations in V&V activities. In addition, as the world’s first Level 3 AD certification regulation, UN R157 [203] was formulated to establish uniform provisions for the vehicle approval of the Automated Lane Keeping System (ALKS). This regulation evaluates the ALKS functionality in relation to a skilled human driver’s ability to stay in a lane, aiming to address intricate issues related to system safety assessment. The Japan Automobile Manufacturers Association Inc. (JAMA) formulated an automated driving safety evaluation framework [204], which consolidated their practices in safety argumentation structuring, safety evaluation, and safety assessment methods. This project, aligned with the UN R157 regulation and ISO 3450X series standards, provided robust support for Level 3 AD certification for companies such as Mercedes-Benz in Germany and Honda in Japan, fostering a common understanding and consensus toward the development of international regulations and standards.

3.3. Operation phase

As previously mentioned, effective real-time monitoring is essential for SOTIF solutions. To address the challenges of integrating deep machine learning (DML) models into safety-critical vehicles, the Research Institutes of Sweden (RISE) initiated the Safety analysis and verification/validation of MachIne Learning (SMILE) project, consisting of three research phases. SMILE I investigated advanced V&V for DML systems, highlighting the concept of significant safety coverage [205], [206], [207]. Building on this foundation, SMILE II advocated ISO/PAS 21448 as a supplement to the ISO 26262:2018 standard, which is insufficient for ML-based systems. Subsequently, SMILE III expanded the safety coverage concept into a reference architecture and prototype aligned with evolving SOTIF standards. Safety Co-Pilot [208], developed by IVEX, is a set of embedded software components designed for examining and quantifying the risks associated with planning of AD trajectories. It assesses vehicle predictions and the alignment of planning with safety policies. The Co-Pilot includes modules for trajectory inspection, risk analysis, an emergency maneuver trajectory library, and decision-making. The British Standards Institution (BSI) introduced the concept of automated vehicle monitoring operation (AVMO) in PAS 1880 [209]. This facilitates monitoring of the operational state of vehicles and their compliance with the ODD, enabling controlled safety interventions when necessary.

With advances in big data and AI technologies, many companies have successfully achieved closed-loop data and iterative updates of functions by monitoring and error detection during the operation phase. Tesla’s closed-loop data framework encompasses several essential components, including confirmation of model errors, data labeling and cleaning, model training, and redeployment/delivery [210]. Similarly, Waymo [211] has developed a closed-loop data platform featuring data mining, active learning, automatic annotation, automatic model debugging, optimization, test verification, deployment, and release. NVIDIA’s AV ML platform MAGLEV [212] also implemented model iterations based on closed-loop data principles. In addition, Motional [213] has established a continuous learning framework by combining technologies, such as automatic annotation, while Cruise [214] has developed a continuous learning machine to address the long-tail problem in AD prediction. These closed-loop data and iterative update mechanisms improve safety and performance, while considering SOTIF concerns in real-world AV operations.

3.4. Chinese practical activities

In China, a SOTIF technical alliance was initiated by Tsinghua University, together with two policymakers, nine OEMs, nine universities, and thirty companies, to address the SOTIF problem. The primary objective was to reveal SOTIF-related requirements and technologies through collaborative research involving all members. In 2021, thousands of data points on traffic risk events were collected, and more than 3000 minutes of intersection videos were recorded by drones. These data were converted into a dataset with path-trajectory and map-information annotations. In addition, a seven-layer SOTIF scenario structure was proposed based on the analysis of the triggering conditions and performance limitations of various systems, as shown in Fig. 8. Using the proposed methodology and structure, a China-specific SOTIF scenario shared library containing over one thousand typical SOTIF scenarios and 300 test cases was established. This is the first shared SOTIF scenario library in China that enables all companies to train their models and evaluate their performance. Based on this library, SOTIF solutions were developed for various autonomous system functions and layers. In addition, a critical scenario-based dual-loop testing and verification system for SOTIF was established. With cycles of closed-loop verification and dynamic evaluation, the system performance can be accurately evaluated, and the residual risk can be theoretically quantified. The procedure begins with a definition of the function/system/algorithm, analogous to the flowchart of ISO 21448 activities. Several methods can be used to generate test cases based on the analysis of performance limitations and trigger conditions. Performance can be evaluated following the completion of the test procedures.

4. Future challenges and perspectives

Advances in academic SOTIF research and practical SOTIF activities have been discussed in detail. It follows that, at present, there is no general solution to the SOTIF challenges associated with AD. In this section, future challenges and perspectives are discussed to facilitate future SOTIF research on AVs. The discussion focuses on three phases: ① development, ② verification and validation, and ③ operation.

4.1. Challenges associated with the development phase

(1) Utilization of different approaches for SOTIF hazard analysis. The systematic identification and analysis of SOTIF hazards should be considered early in the development phase. As discussed previously, STPA has shown potential for hazard analysis and has seen preliminary applications in the AV SOTIF field [215], [216]. However, its effectiveness for high-level AVs remains to be demonstrated, as state-of-the-art research focuses on single systems or single-function, low-level AVs. Combining different analysis tools is necessary, and corresponding toolchains should be developed [19].

(2) SOTIF-related risk quantification. The systematic quantification of SOTIF-related risks is another key challenge. The FTA, FMEA, HAZOP, and STPA methods are widely used for hazard analysis; however, risk quantification has not yet been thoroughly investigated. According to ISO 26262, risk is defined as a function of the probability of occurrence, controllability, and potential severity, but whether this definition can serve as a standard for SOTIF risk assessment remains debatable. In addition, the uncertainty and limited interpretability of AI algorithms make it difficult to reliably assess the risks arising from their performance limitations. In summary, no generic or effective methods currently exist for quantifying SOTIF-related risks.
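To make the ISO 26262 risk definition concrete, the sketch below implements the widely used additive shorthand for the standard’s ASIL determination table, which combines severity (S1–S3), probability of exposure (E1–E4), and controllability (C1–C3); whether an analogous scheme can be defined for SOTIF triggering conditions is precisely the open question raised above. The integer encoding of the classes is illustrative, not normative.

```python
def asil(s: int, e: int, c: int) -> str:
    """Illustrative ASIL lookup using the additive shorthand for the
    ISO 26262 determination table: S in 1..3, E in 1..4, C in 1..3."""
    assert 1 <= s <= 3 and 1 <= e <= 4 and 1 <= c <= 3
    total = s + e + c  # each one-step increase in any parameter raises the class
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")

# Worst case (S3, E4, C3) yields ASIL D; benign combinations fall back to QM,
# i.e., quality management with no ASIL-specific requirements.
```

For SOTIF, no comparable consensus table exists, which is why the allocation of quantified risk targets remains an open research problem.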

(3) Addressing SOTIF-related risks through comprehensive system functionality modifications. Although several functional improvement methods exist for SOTIF, there is still a long way to go toward fully reliable SOTIF assurance. First, the limitations of AI algorithms in interpretability, generalization, and robustness restrict the reliability of AVs. Unpredictable driving behaviors pose particular challenges in preventing foreseeable misuse. In addition, how to synthesize functional improvement techniques for different components at the system level deserves further discussion. Furthermore, constantly updated algorithms introduce new problems; therefore, a dynamic development pipeline is necessary to continuously improve functions.

4.2. Challenges associated with the V&V phase

(1) Development of formal verification technology to strengthen its role in industrial applications. Owing to its rigorous logic, formal verification has received attention in the context of verifying AI and AV systems. However, with the increasing complexity of algorithms, systems, and ODDs, formal verification, which relies on rigorous mathematical modeling and derivation, has become increasingly difficult to apply. It remains a field worth exploring, but there is still a long way to go. Four specific challenges must be addressed: ① dealing with black-box models; ② dealing with constantly updated and highly diverse algorithms and systems; ③ reducing the difficulty and cost of formal verification; and ④ combining acceptance criteria and establishing a closed-loop V&V process.

(2) Development of SOTIF-oriented scenario-construction technology with high coverage. Constructing a SOTIF scenario library [217] is currently an effective validation method for exposing unknown unsafe scenarios. However, most enterprises focus on data collection and pay little attention to data quality. This may yield a scenario library containing large amounts of repetitive, low-quality data, leading to lower coverage and reduced credibility. The primary criteria for scenario design are diversity, rationality, and criticality. Specifically, ① diversity requires the scenario library to cover a sufficiently broad range of scenario types; ② rationality requires the virtual scenarios to follow real-world principles; and ③ criticality requires the scenario designer to produce or select more valuable instances within a limited number of scenarios for evaluation.
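A minimal sketch of criticality-driven scenario selection under a limited test budget, assuming a hypothetical discretized cut-in scenario space and a toy criticality score; a real library would derive both the parameter space and the score from the triggering-condition analysis described above.

```python
from itertools import product

# Hypothetical discretized parameters for a cut-in scenario (illustrative only).
PARAMS = {
    "ego_speed_kph": [30, 60, 90],
    "cut_in_gap_m":  [5, 15, 30],
    "weather":       ["clear", "rain", "fog"],
}

def generate_scenarios():
    """Enumerate the full combinatorial scenario space."""
    keys = list(PARAMS)
    return [dict(zip(keys, vals)) for vals in product(*PARAMS.values())]

def criticality(s):
    """Toy criticality proxy: fast ego, small gap, and poor weather are harder."""
    weather_penalty = {"clear": 0, "rain": 1, "fog": 2}[s["weather"]]
    return s["ego_speed_kph"] / s["cut_in_gap_m"] + weather_penalty

# Keep only the most critical cases under a limited test budget of five runs.
critical = sorted(generate_scenarios(), key=criticality, reverse=True)[:5]
```

Exhaustive enumeration covers diversity, while the ranking step addresses criticality; rationality would additionally require filtering out parameter combinations that violate real-world physics or traffic rules.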

(3) Development of high-fidelity testing technology. In addition to SOTIF-oriented scenario construction, fidelity is another key factor in testing technology. A test toolchain generally incorporates three model components: sensors, vehicles, and the virtual world. However, the accuracy of these models does not yet fully satisfy test requirements. In particular, the precision of the sensor model significantly affects test credibility, a challenge that must be addressed [218], [219]. To advance testing technologies, three key issues must be addressed: ① high-fidelity vehicle-in-the-loop physical modeling; ② high-fidelity field testing; and ③ dynamic real-time identification and quantitative evaluation of whole-vehicle hazard events.

(4) Safety certification involving SOTIF activities. The safety certification of AVs remains a hurdle to their commercialization. Currently, only Japan [220] and Germany [221], [222] have released safety certifications for level-3 AV products. At present, four particular challenges stand out for AV safety certification, as pointed out in Ref. [32]: ① determination of unavoidable collisions; ② determination of liability; ③ verification cost for reasonable scenario coverage; and ④ additional cost of reverification for AD software updates. Note that challenge ④ is important for ensuring safety because AVs rely heavily on software, especially those that use AI algorithms.

(5) Acceptance criteria for quantified SOTIF-related risks. Acceptance criteria are used to determine whether AVs have reached a reasonable level of safety [223], [224]. Although mileage-accumulation tests on public roads usually serve as a safety baseline, the industry currently lacks a unified method for defining the required total mileage. More importantly, the choice of test roads and scenarios is not theoretically grounded. Reasonably allocating system-level acceptance criteria to the various components is a challenging but significant problem. The following methods may be used to define acceptance criteria: ① comparison with existing traffic statistics [225]; ② comparison with experienced and cautious human drivers; and ③ other risk acceptance criteria, such as the GAMAB principle [226] and the ALARP principle [227].
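The mileage-based reasoning behind criterion ① can be made quantitative in the style of Kalra and Paddock [43]: if AVs accumulate N failure-free miles, a per-mile failure rate F is ruled out at confidence C once N ≥ −ln(1 − C)/F, assuming independent, Poisson-distributed failures. A sketch, using the US human-driver fatality rate from that analysis:

```python
import math

def miles_required(failure_rate_per_mile: float, confidence: float) -> float:
    """Failure-free miles needed to bound the failure rate at the given
    confidence, assuming independent, Poisson-distributed failures."""
    return -math.log(1.0 - confidence) / failure_rate_per_mile

# Human-driver fatality rate of ~1.09 per 100 million miles, 95% confidence:
n = miles_required(1.09e-8, 0.95)  # on the order of hundreds of millions of miles
```

The result, roughly 275 million fatality-free miles, illustrates why pure mileage accumulation is impractical as a sole acceptance criterion and why scenario-based and statistical arguments must be combined.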

4.3. Challenges associated with the operation phase

(1) On-board AI algorithm monitoring. Existing ADSs make substantial use of AI, and, as stated previously, uncertainty quantification techniques are commonly used to monitor the real-time status of AI algorithms in ADSs [228], [229]. However, how to evaluate and verify whether the estimated uncertainty is correct has not yet been resolved. Furthermore, certain approaches, such as Monte Carlo dropout, require major modifications of the original AI-based algorithms, potentially precluding their wide implementation [230], [231]. Therefore, efficient monitoring techniques are required for AI algorithms. More importantly, AI algorithms often reduce the interpretability of AV behavior, which further complicates monitoring.
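As an illustration of the Monte Carlo dropout technique mentioned above, the sketch below keeps dropout active at inference time and treats the spread of repeated stochastic forward passes as an epistemic-uncertainty proxy. The toy two-layer network and its random weights are placeholders, not an ADS model; the point is the monitoring pattern, and the need to modify the forward pass is visible in the code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer regression network with fixed ("pretrained") weights.
W1 = rng.normal(size=(16, 1)); b1 = np.zeros(16)
W2 = rng.normal(size=(1, 16)); b2 = np.zeros(1)

def forward(x, p_drop=0.2):
    """One stochastic forward pass: dropout stays ACTIVE at inference."""
    h = np.maximum(0.0, W1 @ x + b1)       # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop   # fresh Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)          # inverted-dropout scaling
    return float(W2 @ h + b2)

def mc_dropout_predict(x, n_samples=200):
    """Predictive mean and sample variance over stochastic passes; the
    variance serves as the runtime uncertainty signal for the monitor."""
    ys = np.array([forward(x) for _ in range(n_samples)])
    return ys.mean(), ys.var()

mean, var = mc_dropout_predict(np.array([0.5]))
```

A runtime monitor would compare `var` against a calibrated threshold and trigger a fallback when the model is operating outside its competence; how to validate that threshold is exactly the unresolved question noted above.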

(2) ODD monitoring. As mentioned previously, anomalies within the ODD trigger SOTIF issues for AVs; weather and road conditions are currently two important ODD aspects. Existing monitoring techniques can detect abnormalities, but resolving them depends entirely on the decision-making and control modules. However, the main function of these modules is to handle the most prevalent driving scenarios, and additional requirements may necessitate extensive algorithm rebuilding. Therefore, a modularized and unified approach, similar to the fault detection and diagnostic procedures in functional safety, should be devised to help autonomous systems address ODD abnormalities.
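The modularized approach argued for above might resemble the following sketch, in which the ODD is expressed as a declarative specification checked attribute by attribute, mirroring how fault detection flags individual failed components. The attribute names and thresholds are hypothetical.

```python
# Hypothetical ODD specification: each attribute with its permitted range/set.
ODD_SPEC = {
    "visibility_m":  lambda v: v >= 150.0,          # adequate sight distance
    "rain_mm_per_h": lambda v: v <= 8.0,            # at most moderate rain
    "road_type":     lambda v: v in {"highway", "urban"},
}

def odd_violations(measurements):
    """Return the ODD attributes whose measured value falls outside the
    specification; an empty list means the vehicle is inside its ODD."""
    return [k for k, ok in ODD_SPEC.items()
            if k in measurements and not ok(measurements[k])]
```

Because the specification is separated from the detection logic, tightening the ODD or adding an attribute does not require rebuilding the decision-making and control algorithms, which is the main benefit of the modular design.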

(3) Compliance with traffic regulations. Obeying traffic laws is a promising means of improving the safety of AVs [232]. However, not all crashes can be avoided, and in emergencies, AVs may be forced to make difficult judgments combining ethical and legal considerations [233], [234]. Researchers have attempted to digitalize traffic laws to monitor or normalize the driving behavior of AVs. Currently, most related approaches directly encode regulations into decision-making systems to guarantee that vehicles adhere to the rules [235], [236]. However, ensuring that AVs comply with traffic regulations and ethics remains an open problem.
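The digitalization of traffic laws described above can be sketched as a set of machine-checkable rule functions applied to the vehicle state. The two rules below (a speed limit and a headway-based following-gap rule) and all names and thresholds are hypothetical simplifications; real regulations require far richer state and legal interpretation.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    speed_mps: float       # ego speed
    gap_m: float           # distance to the lead vehicle
    lead_speed_mps: float  # lead vehicle speed

# Hypothetical digitalized rules: each maps a state to a violation or None.
def rule_speed_limit(s, limit_mps=16.7):  # ~60 km/h urban limit
    return None if s.speed_mps <= limit_mps else "speed limit exceeded"

def rule_safe_gap(s, headway_s=1.5):
    # Headway-style rule: keep at least headway_s seconds of following gap.
    return None if s.gap_m >= s.speed_mps * headway_s else "following gap too small"

def check(state, rules=(rule_speed_limit, rule_safe_gap)):
    """Evaluate every digitalized rule against the current vehicle state."""
    return [msg for r in rules if (msg := r(state)) is not None]
```

Such a checker can run as a monitor on the planner output or be embedded directly into the decision-making system, which corresponds to the two usages (monitoring versus normalizing behavior) distinguished in the text.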

5. Conclusions

This study offers a comprehensive examination of SOTIF in the context of AVs. Delving into academic research and real-world practice across the development, verification, validation, and operation phases makes it evident that ensuring safety beyond conventional functional considerations is an intricate task. The challenges span a wide spectrum, ranging from hazard analysis, risk quantification, scenario construction, and formal verification to AI algorithm monitoring, vigilance with respect to ODDs, and compliance with traffic regulations. Collaborative efforts by academic researchers, industry engineers, and regulators are crucial for addressing these challenges, establishing robust safety standards, and effectively integrating AVs into real-world traffic.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (52072215, U1964203, 52242213, and 52221005), the National Key Research and Development (R&D) Program of China (2022YFB2503003), and the State Key Laboratory of Intelligent Green Vehicle and Mobility.

Compliance with ethics guidelines

Hong Wang, Wenbo Shao, Chen Sun, Kai Yang, Dongpu Cao, and Jun Li declare that they have no conflict of interest or financial conflicts to disclose.

References

[1]

ISO 21448: Road vehicles—safety of the intended functionality. International standard. Switzerland: International Organization for Standardization; 2022.

[2]

ISO 26262: Road vehicles—functional safety. International standard. Switzerland: International Organization for Standardization; 2018.

[3]

N.G. Leveson. Engineering a safer world: systems thinking applied to safety. The MIT Press, Cambridge (2012)

[4]

P. Koopman, F. Fratrik. How many operational design domains, objects, and events? SafeAI@AAAI (2019)

[5]

dmv.ca.gov [Internet]. Sacramento: California Department of Motor Vehicles; 2023 [cited 2023 Oct 26]. Available from:

[6]

S.A. Seshia, D. Sadigh, S.S. Sastry. Toward verified artificial intelligence. Commun ACM, 65 (7) (2022), pp. 46-55

[7]

Li J, Shao W, Wang H. Key challenges and Chinese solutions for SOTIF in intelligent connected vehicles. Engineering, in press.

[8]

S. Burton, I. Habli, T. Lawton, J. McDermid, P. Morgan, Z. Porter. Mind the gaps: assuring the safety of autonomous systems from an engineering, ethical, and legal perspective. Artif Intell, 279 (2020), Article 103201

[9]

J. Koo, J. Kwac, W. Ju, M. Steinert, L. Leifer, C. Nass. Why did my car just do that? Explaining semi-autonomous driving actions to improve driver understanding, trust, and performance. Int J Interact Des Manuf, 9 (4) (2015), pp. 269-275

[10]

K. Czarnecki, R. Salay. Towards a framework to manage perceptual uncertainty for safe automated driving. B. Gallina, A. Skavhaug, E. Schoitsch, F. Bitsch (Eds.), Computer Safety, Reliability, and Security, Springer, Berlin (2018), pp. 439-445

[11]

P. Koopman, U. Ferrell, F. Fratrik, M. Wagner. A safety standard approach for fully autonomous vehicles. A. Romanovsky, E. Troubitsyna, I. Gashi, E. Schoitsch (Eds.), Computer Safety, Reliability, and Security, Springer, Berlin (2019), pp. 326-332

[12]

UL 4600: Evaluation of autonomous products. UL standard. Underwriters Laboratories; 2020.

[13]

F. Concas, J.K. Nurminen, T. Mikkonen, S. Tarkoma. Validation frameworks for self-driving vehicles: a survey. M.A. Khan, F. Algarni, M.T. Quasim (Eds.), Smart Cities: A Data Analytics Perspective, Springer, Berlin (2021), pp. 197-212

[14]

ISO 34502: Road vehicles test scenarios for automated driving systems: scenario based safety evaluation framework. International standard. Switzerland: International Organization for Standardization; 2022.

[15]

Vesely WE, Goldberg FF, Roberts NH, Haasl DF. Fault tree handbook, systems and reliability research, Office of Nuclear Regulatory Research, US; 1981.

[16]

Schönemann V, Winner H, Glock T, Sax E, Boeddeker B, vom Dorff S, et al. Fault tree-based derivation of safety requirements for automated driving on the example of cooperative valet parking. In: 26th International Technical Conference on the Enhanced Safety of Vehicles (ESV); 2019 Jun 10-13; Eindhoven, Netherlands; 2019.

[17]

A. Börger, R. Hosse, S. Von Der Decken. SOTIF—a new challenge for functional testing. ATZelectronics Worldwide, 15 (10) (2020), pp. 56-60

[18]

J. Dunjó, V. Fthenakis, J.A. Vílchez, J. Arnaldos. Hazard and operability (HAZOP) analysis. A literature review. J Hazard Mater, 173 (1-3) (2010), pp. 19-32

[19]

Kramer B, Neurohr C, Büker M, Fränzle M, Damm W. Identification and quantification of hazardous scenarios for automated driving. In: Proceeding of Model-Based Safety and Assessment: 7th International Symposium; 2020 Sep 14-16; Lisbon, Portugal. Berlin: Springer; 2020. p. 163-78.

[20]

Song Y. Applying system-theoretic accident model and processes (STAMP) to hazard analysis [dissertation]. Hamilton: McMaster University; 2012.

[21]

N. Leveson. A new accident model for engineering safer systems. Saf Sci, 42 (4) (2004), pp. 237-270

[22]

Sundaram P, Vernacchia M, Wagner MS, Thomas J, Placke S. Application of STPA to an automotive shift-by-wire system. In: Workshop: Cambridge, MA, USA; 2014.

[23]

Van Eikema Hommes Q. Safety analysis approaches for automotive electronic control systems. In: Society of Automotive Engineers’ Meeting; 2015.

[24]

Van Eikema Hommes Q. Assessment of safety standards for automotive electronic control systems. Report. Washington, DC: National Highway Traffic Safety; 2016 Jun. Report No.: DOT HS 812 285.

[25]

A. Abdulkhaleq, S. Wagner, N. Leveson. A comprehensive safety engineering approach for software-intensive systems based on STPA. Procedia Eng, 128 (2015), pp. 2-11

[26]

A. Abdulkhaleq, D. Lammering, S. Wagner, J. Röder, N. Balbierer, L. Ramsauer, et al.. A systematic approach based on STPA for developing a dependable architecture for fully automated driving vehicles. Procedia Eng, 179 (2017), pp. 41-51

[27]

K. Czarnecki. On-road safety of automated driving system (ADS)—taxonomy and safety analysis methods. University of Waterloo, Waterloo (2018)

[28]

Xing X, Zhou T, Chen J, Xiong L, Yu Z. A hazard analysis approach based on STPA and finite state machine for autonomous vehicles. In: Proceeding of 2021 IEEE Intelligent Vehicles Symposium (IV); 2021 Jul 11-17; Nagoya, Japan. Piscataway: IEEE; 2021. p. 150-6.

[29]

C. Bensaci, Y. Zennir, D. Pomorski, F. Innal, Y. Liu, C. Tolba. STPA and Bowtie risk analysis study for centralized and hierarchical control architectures comparison. Alex Eng J, 59 (5) (2020), pp. 3799-3816

[30]

C. Bensaci, Y. Zennir, D. Pomorski, F. Innal, Y. Liu. Distributed vs. hybrid control architecture using STPA and AHP—application to an autonomous mobile multi-robot system. International Journal of Safety and Security Engineering, 11 (1) (2021), pp. 1-12

[31]

Capito L, Redmill KA. Methodology for hazard identification and mitigation strategies applied to an overtaking assistant ADAS. In: Proceeding of 2021 IEEE International Intelligent Transportation Systems Conference (ITSC); 2021 Sep 19-22; Indianapolis, IN, USA. Piscataway: IEEE; 2021. p. 3972-7.

[32]

Zhao T, Yurtsever E, Paulson JA, Rizzoni G, Automated vehicle safety guarantee, verification and certification: a survey. 2022. arXiv:2202.02818v1.

[33]

J. Kapinski, J.V. Deshmukh, X. Jin, H. Ito, K. Butts. Simulation-based approaches for verification of embedded control systems: an overview of traditional and advanced modeling, testing, and verification techniques. IEEE Contr Syst Mag, 36 (6) (2016), pp. 45-64

[34]

Krook J, Svensson L, Li Y, Feng L, Fabian M. Design and formal verification of a safe stop supervisor for an automated vehicle. In: Proceeding of 2019 International Conference on Robotics and Automation (ICRA); 2019 May 20-24; Montreal, QC, Canada. Piscataway: IEEE; 2019. p. 5607-13.

[35]

C. Radojicic, C. Grimm, A. Jantsch, M. Rathmair. Towards verification of uncertain cyber-physical systems. Electron Proc Theor Comput Sci, 247 (2017), pp. 1-17

[36]

Arechiga N, Loos SM, Platzer A, Krogh BH. Using theorem provers to guarantee closed-loop system properties. In: Proceeding of 2012 American Control Conference (ACC); 2012 Jun 27-29; Montreal, QC, Canada. Piscataway: IEEE; 2012. p. 3573-80.

[37]

Gruber F, Althoff M. Anytime safety verification of autonomous vehicles. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC); 2018 Nov 4-7; Maui, HI, USA. Piscataway: IEEE; 2018. p. 1708-14.

[38]

B. Johnson, F. Havlak, H. Kress-Gazit, M. Campbell. Experimental evaluation and formal analysis of high-level tasks with dynamic obstacle anticipation on a full-sized autonomous vehicle. J Field Robot, 34 (5) (2017), pp. 897-911

[39]

Arechiga N. Specifying safety of autonomous vehicles in signal temporal logic. In: Proceeding of 2019 IEEE Intelligent Vehicles Symposium (IV); 2019 Jun 9-12; Paris, France. Piscataway: IEEE; 2019. p. 58-63.

[40]

E. Zapridou, E. Bartocci, P. Katsaros. Runtime verification of autonomous driving systems in CARLA. J. Deshmukh, D. Ničković (Eds.), Runtime Verification, Springer, Berlin (2020), pp. 172-183

[41]

Shalev-Shwartz S, Shammah S, Shashua A. On a formal model of safe and scalable self-driving cars. 2017. arXiv:1708.06374.

[42]

P. Nilsson, O. Hussien, A. Balkan, Y. Chen, A.D. Ames, J.W. Grizzle, et al.. Correct-by-construction adaptive cruise control: two approaches. IEEE Trans Control Syst Technol, 24 (4) (2016), pp. 1294-1307

[43]

N. Kalra, S.M. Paddock. Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability?. Transp Res Part A Policy Pract, 94 (2016), pp. 182-193

[44]

S. Riedmaier, T. Ponn, D. Ludwig, B. Schick, F. Diermeyer. Survey on scenario-based safety assessment of automated vehicles. IEEE Access, 8 (2020), pp. 87456-87477

[45]

Ulbrich S, Menzel T, Reschka A, Schuldt F, Maurer M. Defining and substantiating the terms scene, situation, and scenario for automated driving. In: Proceeding of 2015 IEEE 18th International Conference on Intelligent Transportation Systems; 2015 Sep 15-18; Gran Canaria. Piscataway: IEEE; 2015. p. 982-8.

[46]

Bagschik G, Menzel T, Maurer M. Ontology based scene creation for the development of automated vehicles. In: 2018 IEEE Intelligent Vehicles Symposium (IV); 2018 Jun 26-30; Changshu, China. Piscataway: IEEE; 2018. p. 1813-20.

[47]

Khatun M, Glaß M, Jung R. A systematic approach of reduced scenario-based safety analysis for highly automated driving function. In: Proceedings of the 7th International Conference on Vehicle Technology and Intelligent Transport Systems; 2021 Apr 28; New York City, USA; 2021. p. 301-8.

[48]

W. Wang, D. Zhao. Extracting traffic primitives directly from naturalistically logged data for self-driving applications. IEEE Robot Autom Lett, 3 (2) (2018), pp. 1223-1229

[49]

Gladisch C, Heinzemann C, Herrmann M, Woehrle M. Leveraging combinatorial testing for safety-critical computer vision datasets. In: Proceeding of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); 2020 Jun 14-19; Seattle, WA, USA. Piscataway: IEEE; 2020. p. 1314-21.

[50]

D. Zhao, X. Huang, H. Peng, H. Lam, D.J. LeBlanc. Accelerated evaluation of automated vehicles in car-following maneuvers. IEEE Trans Intell Transp Syst, 19 (3) (2018), pp. 733-744

[51]

S. Feng, X. Yan, H. Sun, Y. Feng, H.X. Liu. Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment. Nat Commun, 12 (1) (2021), p. 748

[52]

S. Feng, H. Sun, X. Yan, H. Zhu, Z. Zou, S. Shen, et al.. Dense reinforcement learning for safety validation of autonomous vehicles. Nature, 615 (7953) (2023), pp. 620-627

[53]

J. Wishart, S. Como, M. Elli, B. Russo, J. Weast, N. Altekar, et al.. Driving safety performance assessment metrics for ads-equipped vehicles. SAE Int J Adv Curr Prac Mobility, 2 (5) (2020), pp. 2881-2899

[54]

L.A. Dennis, M. Fisher. Verifiable self-aware agent-based autonomous systems. Proc IEEE, 108 (7) (2020), pp. 1011-1026

[55]

Gyllenhammar M, Johansson R, Warg F, Chen D, Heyn H M, Sanfridson M, et al. Towards an operational design domain that supports the safety argumentation of an automated driving system. In: Proceeding of 10th European Congress on Embedded Real Time Systems (ERTS 2020); 2020 Jan 29-31; Toulouse, France. 2020.

[56]

Colwell I, Phan B, Saleem S, Salay R, Czarnecki K. An automated vehicle safety concept based on runtime restriction of the operational design domain. In: Proceeding of 2018 IEEE Intelligent Vehicles Symposium (IV); 2018 Jun 26-30; Changshu, China; Piscataway. IEEE; 2018. p. 1910-7.

[57]

Reschka A, Bohmer JR, Saust F, Lichte B, Maurer M. Safe, dynamic and comfortable longitudinal control for an autonomous vehicle. In: Proceeding of 2012 IEEE Intelligent Vehicles Symposium; 2012 Jun 3-7; Madrid, Spain; Piscataway. IEEE; 2012. p. 346-51.

[58]

Reschka A, Böhmer JR, Nothdurft T, Hecker P, Lichte B, Maurer M. A surveillance and safety system based on performance criteria and functional degradation for an autonomous vehicle. In: Proceeding of 2012 15th International IEEE Conference on Intelligent Transportation Systems; 2012 Sep 16-19; Anchorage, AK, USA; Piscataway. IEEE; 2012. p. 237-42.

[59]

Schlatow J, Moostl M, Ernst R, Nolte M, Jatzkowski I, Maurer M, et al. Self-awareness in autonomous automotive systems. In: Proceeding of Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017; 2017 Mar 27-31; Lausanne, Switzerland; Piscataway. IEEE; 2017. p. 1050-5.

[60]

Ulbrich S, Reschka A, Rieken J, Ernst S, Bagschik G, Dierkes F, et al. Towards a functional system architecture for automated vehicles. 2017. arXiv:2107.08142.

[61]

Nolte M, Jatzkowski I, Ernst S, Maurer M. Supporting safe decision making through holistic system-level representations & monitoring—a summary and taxonomy of self-representation concepts for automated vehicles. 2020. arXiv:2007.13807.

[62]

W. Shao, J. Li, Y. Zhang, H. Wang. [Key technologies to ensure the safety of the intended functionality for intelligent vehicles]. Automot Eng, 44 (2022), pp. 1289-1304. Chinese.

[63]

Jain A, Del Pero L, Grimmett H, Ondruska P. Autonomy 2.0: why is self-driving always 5 years away? 2021. arXiv:2107.08142.

[64]

ISO/TR 4804: Road vehicles—safety and cybersecurity for automated driving systems: design, verification and validation. International standard. Switzerland: International Organization for Standardization, 2020.

[65]

Kuang X, Zhang Y, Li H. SOTIF requirement analysis based on STPA. In: Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence; 2021 Dec 22-24; Sanya, China; New York City: ACM Digital Library. 2021. p. 1-5.

[66]

H.H. Goode, R.E. Machol, T. Teichmann. System engineering. Phys Today, 10 (9) (1957), pp. 34-36

[67]

Xu Y, Shao W, Li J, Yang K, Wang W, Huang H, et al. SIND: A drone dataset at signalized intersection in China. In: Proceeding of 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC); 2022 Oct 8-12; Macao, China; Piscataway: IEEE; 2022. p. 2471-8.

[68]

J. Wang, J. Wu, X. Zheng, D. Ni, K. Li. Driving safety field theory modeling and its application in pre-collision warning system. Transp Res, Part C Emerg Technol, 72 (2016), pp. 306-324

[69]

Hendrycks D, Basart S, Mu N, Kadavath S, Wang F, Dorundo E, et al. The many faces of robustness: a critical analysis of out-of-distribution generalization. 2021. arXiv:2006.16241.

[70]

Shafahi A, Najibi M, Ghiasi A, Xu Z, Dickerson J, Studer C, et al. Adversarial training for free! In: Proceedings of the 33rd International Conference on Neural Information Processing Systems; 2019 Dec 8-14; Red Hook, NY, USA. New York City: ACM Digital Library; 2019. p. 3358-69.

[71]

F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu, H. Zhu, et al.. A comprehensive survey on transfer learning. Proc IEEE, 109 (1) (2021), pp. 43-76

[72]

R. Luo, S. Zhao, J. Kuck, B. Ivanovic, S. Savarese, E. Schmerling, et al. Sample-efficient safety assurances using conformal prediction. In: S.M. LaValle, J.M. O’Kane, M. Otte, D. Sadigh, P. Tokekar (Eds.), Algorithmic Foundations of Robotics XV, Springer International Publishing, Berlin (2023), pp. 149-169

[73]

Sadat A, Ren M, Pokrovsky A, Lin YC, Yumer E, Urtasun R. Jointly learnable behavior and trajectory planning for self-driving vehicles. In: Proceeding of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2019 Nov; Macao, China. New York City: ACM Digital Library; 2019. p. 3949-56.

[74]

Ross S, Bagnell D. Efficient Reductions for Imitation Learning. In: Teh, Y.W., Titterington, M., editors. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics; 2010 May 13-15; Sardinia, Italy. Pittsburgh: PMLR; 2010. p. 661-8.

[75]

Gansch R, Adee A. System theoretic view on uncertainties. In: Proceeding of 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE); 2020 Mar 9-13; Grenoble, France. Piscataway: IEEE; 2020. p. 1345-50.

[76]

Adee A, Munk P, Gansch R, Liggesmeyer P. Uncertainty representation with extended evidential networks for modeling safety of the intended functionality (SOTIF). In: Proceedings of the 30th European Safety and Reliability Conference and 15th Probabilistic Safety Assessment and Management Conference; 2020 Nov 1-5; Venice, Italy; 2020. p. 4148-55.

[77]

W. Shao, Y. Xu, J. Li, C. Lv, W. Wang, H. Wang. How does traffic environment quantitatively affect the autonomous driving prediction?. IEEE Trans Intell Transp Syst, 24 (10) (2023), pp. 11238-11253

[78]

E. Hüllermeier, W. Waegeman. Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Mach Learn, 110 (3) (2021), pp. 457-506

[79]

K. Yang, X. Tang, J. Li, H. Wang, G. Zhong, J. Chen, et al.. Uncertainties in onboard algorithms for autonomous vehicles: challenges, mitigation, and perspectives. IEEE Trans Intell Transp Syst, 24 (9) (2023), pp. 8963-8987

[80]

Kendall A, Gal Y. What uncertainties do we need in Bayesian deep learning for computer vision? In: Proceedings of the 31st International Conference on Neural Information Processing Systems; 2017 Dec 4-9; Red Hook, NY, USA; New York City: ACM Digital Library; 2017.

[81]

G. Montavon, W. Samek, K.R. Müller. Methods for interpreting and understanding deep neural networks. Digit Signal Process, 73 (2018), pp. 1-15

[82]

Y. Zhang, P. Tino, A. Leonardis, K. Tang. A survey on neural network interpretability. IEEE Trans Emerg Top Comput Intell, 5 (5) (2021), pp. 726-742

[83]

W.J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, B. Yu. Definitions, methods, and applications in interpretable machine learning. Proc Natl Acad Sci USA, 116 (44) (2019), pp. 22071-22080

[84]

Louizos C, Welling M. Multiplicative normalizing flows for variational Bayesian neural networks. In: Proceedings of the 34th International Conference on Machine Learning; 2017 Aug 6-11; Sydney, Australia; New York City: ACM Digital Library; 2017. p. 2218-27.

[85]

Kristiadi A, Hein M, Hennig P. Learnable uncertainty under laplace approximations. 2020. arXiv:2010.02720.

[86]

Salimans T, Kingma DP, Welling M. Markov Chain Monte Carlo and variational inference: bridging the gap. In: Proceedings of the 32nd International Conference on International Conference on Machine Learning—Volume 37; 2015 Jul 7-9; Lille, France; 2015. p. 1218-26.

[87]

Gal Y, Ghahramani Z. Dropout as a bayesian approximation:representing model uncertainty in deep learning. In: Proceedings of the 33rd International Conference on Machine Learning; 2016 Jun 19-24; New York, NY, USA. New York City: ACM Digital Library; 2016. p. 1050-9.

[88]

Lakshminarayanan B, Pritzel A, Blundell C. Simple and scalable predictive uncertainty estimation using deep ensembles. In: Proceedings of the 31st International Conference on Neural Information Processing Systems; 2017 Dec 4-9; Red Hook, NY, USA. New York City: ACM Digital Library; 2017.

[89]

Wenzel F, Snoek J, Tran D, Jenatton R. Hyperparameter ensembles for robustness and uncertainty quantification. In: Proceedings of the 34th International Conference on Neural Information Processing Systems; 2020 Dec 6-12; Red Hook, NY, USA. New York City: ACM Digital Library; 2020. p. 6514-27.

[90]

Wen Y, Tran D, Ba J. BatchEnsemble: an alternative approach to efficient ensemble and lifelong learning. 2020. arXiv:2002.06715.

[91]

Malinin A, Gales M. Predictive uncertainty estimation via prior networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems; 2018 Dec 3-8; Red Hook, NY, USA. New York City: ACM Digital Library; 2018.

[92]

Sensoy M, Kaplan L, Kandemir M. Evidential deep learning to quantify classification uncertainty. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems; 2018 Dec 3-8; Red Hook, NY, USA. New York City: ACM Digital Library; 2018.

[93]

Rosenfeld N, Mansour Y, Yom-Tov E. Discriminative learning of prediction intervals. In: Storkey A, Perez-Cruz F, editors. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics; 2018 Apr 9-11; Lanzarote, Spain. Pittsburgh: PMLR; 2018. p. 347-55.

[94]

Chang J, Lan Z, Cheng C, Wei Y. Data uncertainty learning in face recognition. In: Proceeding of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020 Jun 14-19; Seattle, WA, USA. Piscataway: IEEE. 2020. p. 5709-18.

[95]

Y. Gao, M.K. Ng. Wasserstein generative adversarial uncertainty quantification in physics-informed neural networks. J Comput Phys, 463 (2022), Article 111270

[96]

M. Abdar, F. Pourpanah, S. Hussain, D. Rezazadegan, L. Liu, M. Ghavamzadeh, et al.. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf Fusion, 76 (2021), pp. 243-297

[97]

D. Feng, A. Harakeh, S.L. Waslander, K. Dietmayer. A review and comparative study on probabilistic object detection in autonomous driving. IEEE Trans Intell Transp Syst, 23 (8) (2022), pp. 9961-9980

[98]

M. Du, N. Liu, X. Hu. Techniques for interpretable machine learning. Commun ACM, 63 (1) (2019), pp. 68-77

[99]

D.V. Carvalho, E.M. Pereira, J.S. Cardoso. Machine learning interpretability: a survey on methods and metrics. Electronics, 8 (8) (2019), p. 832

[100]

J. Janai, F. Güney, A. Behl, A. Geiger. Computer vision for autonomous vehicles: problems, datasets and state of the art. Found Trends Comput Graph Vis (2017) arXiv:1704.05519

[101]

O. Willers, S. Sudholt, S. Raafatnia, S. Abrecht. Safety concerns and mitigation approaches regarding the use of deep learning in safety-critical perception tasks. A. Casimiro, F. Ortmeier, E. Schoitsch, F. Bitsch, P. Ferreira (Eds.), Computer safety, reliability, and security, Springer, Berlin (2020), pp. 336-350

[102]

Alcorn MA, Li Q, Gong Z, Wang C, Mai L, Ku WS, et al. Strike (with) a pose: neural networks are easily fooled by strange poses of familiar objects. In: Proceeding of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019 Jun 15-20; Long Beach, CA, USA. Piscataway: IEEE. 2019. p. 4840-9.

[103]

Remeli V, Morapitiye S, Rovid A, Szalay Z. Towards verifiable specifications for neural networks in autonomous driving. In: Proceedings of 2019 IEEE 19th International Symposium on Computational Intelligence and Informatics and 7th IEEE International Conference on Recent Achievements in Mechatronics, Automation, Computer Sciences and Robotics (CINTI-MACRo); 2019 Nov 14-16; Szeged, Hungary. Piscataway: IEEE; 2019. p. 175-80.

[104]

Sämann T, Schlicht P, Hüger F. Strategy to increase the safety of a DNN-based perception for HAD systems. 2020. arXiv:2002.08935.

[105]

J. Hariyono, K.H. Jo. Detection of pedestrian crossing road: a study on pedestrian pose recognition. Neurocomputing, 234 (2017), pp. 144-153

[106]

Ajanovic Z, Lacevic B, Shyrokau B, Stolz M, Horn M. Search-based optimal motion planning for automated driving. In: Proceeding of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2018 Oct 1-5; Madrid, Spain. Piscataway: IEEE. 2018. p. 4523-30.

[107]

J. Hu, B. Zheng, C. Wang, C. Zhao, X. Hou, Q. Pan, et al. A survey on multi-sensor fusion based obstacle detection for intelligent ground vehicles in off-road environments. Front Inf Technol Electron Eng, 21 (5) (2020), pp. 675-692

[108]

Zhao X, Liu Z, Hu R, Huang K. 3D object detection using scale invariant and feature reweighting networks. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence; 2019 Jan 27-Feb 1; Honolulu, HI, USA. New York City: ACM Digital Library; 2019. p. 9267-74.

[109]

L. Guan, Y. Chen, G. Wang, X. Lei. Real-time vehicle detection framework based on the fusion of lidar and camera. Electronics, 9 (3) (2020), p. 451

[110]

Xu R, Xiang H, Xia X, Han X, Li J, Ma J. OPV2V: an open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication. In: Proceedings of 2022 International Conference on Robotics and Automation (ICRA); 2022 May 23-27; Philadelphia, PA, USA. Piscataway: IEEE; 2022. p. 2583-9.

[111]

S. Khan, F. Andert, N. Wojke, J. Schindler, A. Correa, A. Wijbenga. Towards collaborative perception for automated vehicles in heterogeneous traffic. J. Dubbert, B. Müller, G. Meyer (Eds.), Advanced microsystems for automotive applications 2018, Springer, Berlin (2019), pp. 31-42

[112]

S. Aradi. Survey of deep reinforcement learning for motion planning of autonomous vehicles. IEEE Trans Intell Transp Syst, 23 (2) (2022), pp. 740-759

[113]

Z. Zhu, H. Zhao. A survey of deep RL and IL for autonomous driving policy learning. IEEE Trans Intell Transp Syst, 23 (9) (2022), pp. 14043-14065

[114]

Xu J, Shao W, Xu Y, Wang W, Li J, Wang H. A risk probability predictor for effective downstream planning tasks. In: Proceeding of 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC); 2023 Sep 24-28; Bilbao, Spain. Piscataway: IEEE; 2023.

[115]

X. Wang, X. Qi, P. Wang, J. Yang. Decision making framework for autonomous vehicles driving behavior in complex scenarios via hierarchical state machine. Autonomous Intel Syst, 1 (1) (2021), p. 10

[116]

Orzechowski PF, Burger C, Lauer M. Decision-making for automated vehicles using a hierarchical behavior-based arbitration scheme. In: Proceeding of 2020 IEEE Intelligent Vehicles Symposium (IV); 2020 Oct 19-Nov 13; Las Vegas, NV, USA. Piscataway: IEEE; 2020. p. 767-74.

[117]

P. Hang, S. Huang, X. Chen, K.K. Tan. Path planning of collision avoidance for unmanned ground vehicles: a nonlinear model predictive control approach. Proc Inst Mech Eng, Part I, J Syst Control Eng, 235 (2) (2021), pp. 222-236

[118]

X. Zhang, W. Shao, M. Zhou, Q. Tan, J. Li. A scene comprehensive safety evaluation method based on binocular camera. Robot Auton Syst, 128 (2020), Article 103503

[119]

Zhao S, Hou Q, Zhai Y. Decision mechanism of vehicle autonomous lane change based on rough set theory. In: Proceedings of the 2021 1st International Conference on Control and Intelligent Robotics; 2021 Jun 18-20; New York, NY, USA. New York City: ACM Digital Library; 2021. p. 33-9.

[120]

Beheshtitabar E, Mohammad Alipour E. A rule based control algorithm for on-ramp merge with connected and automated vehicles. In: Proceeding of International Conference on Transportation and Development 2020; 2020 May 26-29; Washington, DC, USA. Washington, DC: American Society of Civil Engineers; 2020. p. 303-16.

[121]

B.R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A.A.A. Sallab, S. Yogamani, et al. Deep reinforcement learning for autonomous driving: a survey. IEEE Trans Intell Transp Syst, 23 (6) (2022), pp. 4909-4926

[122]

H. Gao, G. Shi, G. Xie, B. Cheng. Car-following method based on inverse reinforcement learning for autonomous vehicle decision-making. Int J Adv Robot Syst, 15 (6) (2018), pp. 1-11

[123]

Brown A, Petrik M. Interpretable reinforcement learning with ensemble methods. 2018. arXiv:1809.06995.

[124]

Moldovan TM, Abbeel P. Safe exploration in Markov decision processes. 2012. arXiv:1205.4810.

[125]

Nishimura H, Ivanovic B, Gaidon A, Pavone M, Schwager M. Risk-sensitive sequential action control with multi-modal human trajectory forecasting for safe crowd-robot interaction. In: Proceeding of 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2020 Oct 24; Las Vegas, NV, USA. New York City: ACM Digital Library. 2020. p. 11205-12.

[126]

Z. Cao, S. Xu, X. Jiao, H. Peng, D. Yang. Trustworthy safety improvement for autonomous driving using reinforcement learning. Transp Res, Part C Emerg Technol, 138 (2022), Article 103656

[127]

C. Pek, S. Manzinger, M. Koschi, M. Althoff. Using online verification to prevent autonomous vehicles from causing accidents. Nat Mach Intell, 2 (9) (2020), pp. 518-528

[128]

Z. Cao, K. Jiang, W. Zhou, S. Xu, H. Peng, D. Yang. Continuous improvement of self-driving cars using dynamic confidence-aware reinforcement learning. Nat Mach Intell, 5 (2) (2023), pp. 145-158

[129]

Urban C, Miné A. A review of formal methods applied to machine learning. 2021. arXiv:2104.02466.

[130]

E.M. Clarke, T.A. Henzinger, H. Veith, R. Bloem (Eds.), Handbook of model checking, Springer, Berlin (2018)

[131]

G. Katz, D.A. Huang, D. Ibeling, K. Julian, C. Lazarus, R. Lim, et al. The Marabou framework for verification and analysis of deep neural networks. I. Dillig, S. Tasiran (Eds.), Computer aided verification, Springer, Berlin (2019), pp. 443-452

[132]

Bunel RR, Turkaslan I, Torr PHS, Kohli P, Mudigonda PK. A unified view of piecewise linear neural network verification. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems; 2018 Dec 3-8; Red Hook, NY, USA. New York City: ACM Digital Library; 2018. p. 4795-804.

[133]

G. Singh, T. Gehr, M. Püschel, M. Vechev. An abstract domain for certifying neural networks. Proc ACM Program Lang, 3 (2019), p. 41

[134]

X. Huang, M. Kwiatkowska, S. Wang, M. Wu. Safety verification of deep neural networks. R. Majumdar, V. Kunčak (Eds.), Computer aided verification, Springer, Berlin (2017), pp. 3-29

[135]

Weng TW, Zhang H, Chen PY, Yi J, Su D, Gao Y, et al. Evaluating the robustness of neural networks: an extreme value theory approach. 2018. arXiv:1801.10578.

[136]

Lee S, Cha S, Lee D, Oh H. Effective white-box testing of deep neural networks with adaptive neuron-selection strategy. In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis; 2020 Jul 18-22; New York, USA. New York City: ACM Digital Library; 2020. p. 165-76.

[137]

Byun T, Rayadurgam S, Heimdahl MPE. Black-box testing of deep neural networks. In: Proceeding of 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE); 2021 Oct 25-28; Wuhan, China. Piscataway: IEEE; 2021. p. 309-20.

[138]

Ma L, Zhang F, Sun J, Xue M, Li B, Juefei-Xu F, et al. DeepMutation: mutation testing of deep learning systems. In: Proceeding of 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE); 2018 Oct 15-18; Memphis, TN, USA. Piscataway: IEEE; 2018. p. 100-11.

[139]

Z.Q. Zhou, L. Sun. Metamorphic testing of driverless cars. Commun ACM, 62 (3) (2019), pp. 61-67

[140]

Ma L, Juefei-Xu F, Zhang F, Sun J, Xue M, Li B, et al. DeepGauge: multi-granularity testing criteria for deep learning systems. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering; 2018 Sep 3-7; Montpellier, France. Piscataway: IEEE; 2018. p. 120-31.

[141]

Aggarwal A, Shaikh S, Hans S, Haldar S, Ananthanarayanan R, Saha D. Testing framework for black-box AI models. In: Proceedings of 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion); 2021 May 25; Madrid, Spain. Piscataway: IEEE; 2021. p. 81-4.

[142]

Volk G, Gamerdinger J, von Bernuth A, Bringmann O. A comprehensive safety metric to evaluate perception in autonomous systems. In: Proceeding of 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC); 2020 Sep 20-23; Rhodes, Greece. Piscataway: IEEE; 2020. p. 1-8.

[143]

Ivanovic B, Pavone M. Rethinking trajectory forecasting evaluation. 2021. arXiv:2106.12732.

[144]

B.L. Deng, G. Li, S. Han, L. Shi, Y. Xie. Model compression and hardware acceleration for neural networks: a comprehensive survey. Proc IEEE, 108 (4) (2020), pp. 485-532

[145]

V. Sze, Y.H. Chen, T.J. Yang, J.S. Emer. Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE, 105 (12) (2017), pp. 2295-2329

[146]

Wei T, Liu C. Online verification of deep neural networks under domain shift or network updates. 2023. arXiv:2106.12732.

[147]

F. Zhao, C. Zhang, N. Dong, Z. You, Z. Wu. A uniform framework for anomaly detection in deep neural networks. Neural Process Lett, 54 (4) (2022), pp. 3467-3488

[148]

Shao W, Li B, Yu W, Xu J, Wang H. When is it likely to fail? Performance monitor for black-box trajectory prediction model. 2023. techrxiv.24265672.v1.

[149]

Mougan C, Nielsen DS. Monitoring model deterioration with explainable uncertainty estimation via non-parametric bootstrap. In: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence; 2023 Feb 7-14; New York City: ACM Digital Library; 2023. p. 15037-45.

[150]

Q.M. Rahman, P. Corke, F. Dayoub. Run-time monitoring of machine learning for robotic perception: a survey of emerging trends. IEEE Access, 9 (2021), pp. 20067-20075

[151]

Mohseni S, Pitale M, Singh V, Wang Z. Practical solutions for machine learning safety in autonomous vehicles. 2019. arXiv:1912.09630.

[152]

Mohseni S, Jagadeesh A, Wang Z. Predicting model failure using saliency maps in autonomous driving systems. 2019. arXiv:1905.07679.

[153]

Hecker S, Dai D, Van Gool L. Failure prediction for autonomous driving. In: Proceeding of 2018 IEEE Intelligent Vehicles Symposium (IV); 2018 Jun 26-30; Changshu, China. Piscataway: IEEE; 2018. p. 1792-9.

[154]

C. Gurău, D. Rao, C.H. Tong, I. Posner. Learn from experience: probabilistic prediction of perception performance to avoid failure. Int J Robot Res, 37 (9) (2018), pp. 981-995

[155]

J. Yang, S. Rahardja, P. Fränti. Mean-shift outlier detection and filtering. Pattern Recognit, 115 (2021), Article 107874

[156]

Grathwohl W, Wang KC, Jacobsen JH, Duvenaud D, Norouzi M, Swersky K. Your classifier is secretly an energy based model and you should treat it like one. 2020. arXiv:1912.03263.

[157]

C. Gautam, R. Balaji, A. Tiwari, K. Ahuja. Localized multiple kernel learning for anomaly detection: one-class classification. Knowl Base Syst, 165 (2019), pp. 241-252

[158]

Gu X, Akoglu L, Rinaldo A. Statistical analysis of nearest neighbor methods for anomaly detection. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems; 2018 Dec 8; Red Hook, NY, USA. New York City: ACM Digital Library; 2019. p. 10923-33.

[159]

Sharan V, Gopalan P, Wieder U. Efficient anomaly detection via matrix sketching. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems; 2018 Dec 3-8; Red Hook, NY, USA. New York City: ACM Digital Library; 2018. p. 8080-91.

[160]

Chalapathy R, Chawla S. Deep learning for anomaly detection: a survey. 2019. arXiv:1901.03407.

[161]

B. Lindemann, B. Maschler, N. Sahlab, M. Weyrich. A survey on anomaly detection for technical systems using LSTM networks. Comput Ind, 131 (2021), Article 103498

[162]

Kim KH, Shim S, Lim Y, Jeon J, Choi J, Kim B, et al. RaPP: novelty detection with reconstruction along projection pathway. In: Proceedings of the International Conference on Learning Representations (ICLR) 2020; 2020; Virtual Conference.

[163]

Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. 2014. arXiv:1406.2661.

[164]

D.P. Kingma, M. Welling. An introduction to variational autoencoders. Found Trends Mach Learn, 12 (4) (2019), pp. 307-392

[165]

Corbière C, Thome N, Bar-Hen A, Cord M, Pérez P. Addressing failure prediction by learning model confidence. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems; 2018 Dec 8; Red Hook, NY, USA. New York City: ACM Digital Library; 2019. p. 2902-2913.

[166]

Shao W, Li J, Wang H. Self-aware trajectory prediction for safe autonomous driving. In: Proceeding of 2023 IEEE Intelligent Vehicles Symposium (IV); 2023 Jun 4-7; Anchorage, AK, USA. Piscataway: IEEE; 2023. p. 1-8.

[167]

Rahman QM, Sunderhauf N, Dayoub F. Online monitoring of object detection performance during deployment. In: Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2021 Sep 27-Oct 1; Prague, Czech Republic. Piscataway: IEEE; 2021. p. 4839-45.

[168]

Rahman QM, Sunderhauf N, Dayoub F. Per-frame mAP Prediction for continuous performance monitoring of object detection during deployment. In: 2021 IEEE Winter Conference on Applications of Computer Vision Workshops (WACVW); 2021 Jan 5-9; Waikola, HI, USA. Piscataway: IEEE; 2021. p. 152-60.

[169]

Henzinger TA, Lukina A, Schilling C. Outside the box: abstraction-based monitoring of neural networks. 2020. arXiv:1911.09032.

[170]

S. Luan, Z. Gu, L.B. Freidovich, L. Jiang, Q. Zhao. Out-of-distribution detection for deep neural networks with isolation forest and local outlier factor. IEEE Access, 9 (2021), pp. 132980-132989

[171]

Lee K, Lee K, Lee H, Shin J. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems; 2018 Dec 3-8; Red Hook, NY, USA. New York City: ACM Digital Library; 2018.

[172]

Huang R, Geng A, Li Y. On the importance of gradients for detecting distributional shifts in the wild. In: Proceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS 2021); 2021 Dec; Virtual Conference.

[173]

Hendrycks D, Gimpel K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: Proceedings of International Conference on Learning Representations; 2017.

[174]

Liang S, Li Y, Srikant R. Enhancing the reliability of out-of-distribution image detection in neural networks. 2018. arXiv:1706.02690.

[175]

Shao W, Xu Y, Peng L, Li J, Li J, Wang H. Failure detection for motion prediction of autonomous driving: an uncertainty perspective. In: Proceeding of 2023 IEEE International Conference on Robotics and Automation (ICRA); 2023 May 29-Jun 2; London, UK. Piscataway: IEEE; 2023. p. 12721-8.

[176]

K. Yang, B. Li, W. Shao, X. Tang, X. Liu, H. Wang. Prediction failure risk-aware decision-making for autonomous vehicles on signalized intersections. IEEE Trans Intell Transp Syst, 24 (11) (2023), pp. 12806-12820

[177]

Kaur R, Jha S, Roy A, Park S, Sokolsky O, Lee I. Detecting OODs as datapoints with high uncertainty. 2021. arXiv:2108.06380.

[178]

Weiss M, Tonella P. Fail-safe execution of deep learning based systems through uncertainty monitoring. In: Proceeding of 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST); 2021 Apr 12-16; Porto de Galinhas, Brazil. Piscataway: IEEE; 2021. p. 24-35.

[179]

M.S. Ramanagopal, C. Anderson, R. Vasudevan, M. Johnson-Roberson. Failing to learn: autonomously identifying perception failures for self-driving cars. IEEE Robot Autom Lett, 3 (4) (2018), pp. 3860-3867

[180]

Antonante P, Spivak DI, Carlone L. Monitoring and diagnosability of perception systems. In: Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2021 Sep 27-Oct 1; Prague, Czech Republic. Piscataway: IEEE; 2021. p. 168-75.

[181]

T. Stahl, F. Diermeyer. Online verification enabling approval of driving functions—implementation for a planner of an autonomous race vehicle. IEEE Open J Intell Transp Syst, 2 (2021), pp. 97-110

[182]

BMW Group. Safety assessment report: SAE Level 3 automated driving system. [Internet]. 2022 [cited 2023 Oct 26]. Available from:

[183]

Baidu. Apollo pilot safety report [Internet]. 2018 [cited 2023 Oct 26]. Available from:

[184]

GM. Self-driving safety report [Internet]. 2018 [cited 2023 Oct 26]. Available from:

[185]

Ford Motor Company. A matter of trust: ford releases safety assessment report for self-driving vehicle development. [Internet]. Dearborn, MI: Business Wire; 2018 Aug 16 [cited 2023 Oct 26]. Available from:

[186]

Nuro. Delivering safety: Nuro VSSA Dec 2021 [Internet]. 2021 Dec 16 [cited 2023 Oct 26]. Available from:

[187]

NAVYA. Safety report [Internet]. 2019 [cited 2023 Oct 26]. Available from:

[188]

NVIDIA. Self-driving safety report 2018 [Internet]. 2018 [cited 2023 Oct 26]. Available from:

[189]

Ebel S. Bosch case study: application of SOTIF for ADAS. Report. Robert Bosch GmbH; 2018.

[190]

APTIV, AUDI, BAIDU, BMW, Continental, FCA, et al. Safety first for automated driving [Internet]. 2019 [cited 2023 Oct 26]. Available from:

[191]

Kaiser B. An integrative solution towards SOTIF and AV safety. In: Proceedings of the IQPC SOTIF Conference; 2019 Oct 1-2; Austin, TX, USA; 2019.

[192]

Becker C, Brewer JC, Yount L, John A. Safety of the intended functionality of lane-centering and lane-changing maneuvers of a generic level 3 highway chauffeur system. Report. Washington, DC: National Highway Traffic Safety Administration; 2020. Report No.: DOT HS 812 879.

[193]

hella.com [Internet]. Lippstadt: HELLA GmbH & Co. KGaA; [cited 2023 Oct 26]. Available from:

[194]

Mobileye. Mobileye safety methodology. Report. Mobileye; 2023.

[195]

Junietz P, Wachenfeld W, Klonecki K, Winner H. Evaluation of different approaches to address safety validation of automated driving. In: Proceeding of 2018 21st International Conference on Intelligent Transportation Systems (ITSC); 2018 Nov 4-7; Maui, HI, USA. Piscataway: IEEE; 2018. p. 491-6.

[196]

S. Hallerbach, Y. Xia, U. Eberle, F. Koester. Simulation-based identification of critical scenarios for cooperative and automated vehicles. SAE Int J Connect Autom Veh, 1 (2) (2018), pp. 93-106

[197]

Holder M, Rosenberger P, Winner H, Dhondt T, Makkapati VP, Maier M, et al. Measurements revealing challenges in radar sensor modeling for virtual validation of autonomous driving. In: Proceeding of 2018 21st International Conference on Intelligent Transportation Systems (ITSC); 2018 Nov 4-7; Maui, HI, USA. Piscataway: IEEE; 2018. p. 2616-22.

[198]

Duy Son T, Bhave A, Van Der Auweraer H. Simulation-based testing framework for autonomous driving development. In: Proceeding of 2019 IEEE International Conference on Mechatronics (ICM); 2019 Mar 18-20; Ilmenau, Germany. Piscataway: IEEE; 2019. p. 576-83.

[199]

Siemens AG. Scenario-based validation and verification of automated driving systems [Internet]. Siemens AG; 2022 [cited 2023 Oct 26]. Available from:

[200]

Akagi Y, Kato R, Kitajima S, Antona-Makoshi J, Uchida N. A risk-index based sampling method to generate scenarios for the evaluation of automated driving vehicle safety. In: Proceeding of 2019 IEEE Intelligent Transportation Systems Conference (ITSC); 2019 Oct 27-30; Auckland, New Zealand. Piscataway: IEEE; 2019. p. 667-72.

[201]

H. Nakamura, H. Muslim, R. Kato, S. Préfontaine-Watanabe, H. Nakamura, H. Kaneko, et al. Defining reasonably foreseeable vehicle parameter ranges for scenario-based testing of automated vehicles in consideration of risk acceptance. IEEE Access, 10 (2021), pp. 37743-37760

[202]

Thal S, Znamiec H, Henze R, Nakamura H, Imanaga H, Antona-Makoshi J, et al. Incorporating safety relevance and realistic parameter combinations in test-case generation for automated driving safety assessment. In: Proceeding of 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC); 2020 Sep 20-23; Rhodes, Greece. Piscataway: IEEE; 2020. p. 1-6.

[203]

Economic Commission for Europe. UN Regulation No. 157 (Automated Lane Keeping Systems) [Internet]. Geneva: United Nations Economic Commission for Europe; 2022 Mar 5 [cited 2023 Oct 26]. Available from:

[204]

Japan Automobile Manufacturers Association, Inc. Automated Driving Safety Evaluation Framework Ver 3.0 [Internet]. Tokyo: Japan Automobile Manufacturers Association, Inc.; 2022 Dec. Available from:

[205]

Borg M, Englund C, Wnuk K, Duran B, Levandowski C, Gao S, et al. Safely entering the deep: a review of verification and validation for machine learning and a challenge elicitation in the automotive industry. 2018. arXiv:1812.05389.

[206]

Henriksson J, Borg M, Englund C. Automotive safety and machine learning: initial results from a study on how to adapt the ISO 26262 safety standard. In: Proceedings of the 1st International Workshop on Software Engineering for AI in Autonomous Systems; 2018 May 28; Gothenburg, Sweden. Piscataway: IEEE; 2018. p. 47-9.

[207]

Henriksson J, Berger C, Borg M, Tornberg L, Englund C, Sathyamoorthy SR, et al. Towards structured evaluation of deep neural network supervisors. In: Proceeding of 2019 IEEE International Conference On Artificial Intelligence Testing (AITest); 2019 Apr 4-9, Newark, CA, USA. Piscataway: IEEE; 2019. p. 27-34.

[208]

IVEX NV. Safety Co-pilot [Internet]. Heverlee: IVEX NV; 2023 [cited 2023 Oct 26]. Available from:

[209]

PAS 1880:2020: Guidelines for developing and assessing control systems for automated vehicles. British Standards Institution; 2020.

[210]

Karpathy A. Multi-Task Learning in the Wilderness [Internet]. Long Beach, CA: SlidesLive; 2019 Jun 15 [cited 2023 Oct 26]. Available from:

[211]

Gao P. You Should Try Active Learning! [Internet]. Medium; 2021 Jan 28 [cited 2023 Oct 26]. Available from:

[212]

Koumchatzky N. Maglev: software 2.0 platform for autonomous vehicles development. Report. Santa Clara: NVIDIA; 2020.

[213]

Motional. Technically speaking: learning with every mile driven. Report. Boston: Motional; 2021.

[214]

Harris S. Cruise’s continuous learning machine predicts the unpredictable on San Francisco roads [Internet]. Medium; 2020 Sep 11 [cited 2023 Oct 26]. Available from:

[215]

H.S. Mahajan, T. Bradley, S. Pasricha. Application of systems theoretic process analysis to a lane keeping assist system. Reliab Eng Syst Saf, 167 (2017), pp. 177-183

[216]

Stolte T, Bagschik G, Maurer M. Safety goals and functional safety requirements for actuation systems of automated vehicles. In: Proceeding of 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC); 2016 Nov 1-4; Rio de Janeiro, Brazil. Piscataway: IEEE; 2016. p. 2191-8.

[217]

Zhao S, Duan J, Wu S, Gu X, Li C, Yin K, et al. Genetic algorithm-based SOTIF scenario construction for complex traffic flow. Automot Innov 2023;6:531-46.

[218]

P. Cao, W. Wachenfeld, H. Winner. Perception sensor modeling for virtual validation of automated driving. It-Information Technology, 57 (2015), pp. 243-251

[219]

Linder A, Davidse RJ, Iraeus J, John J, Keller A, Klug C, et al. VIRTUAL—a European approach to foster the uptake of virtual testing in vehicle safety assessment. In: Proceedings of 8th Transport Research Arena TRA 2020; 2020 Apr 27-30; Helsinki, Finland. 2020.

[220]

Yahoo. Honda wins world-first approval for level 3 autonomous car. Report. Science X; 2020.

[221]

Capperella. Mercedes Drive Pilot level 3 autonomous system to launch in Germany. Report. Harlan: Car and Driver; 2023.

[222]

Proposal for a new UN Regulation on: uniform provisions concerning the approval of vehicles with regard to Automated Lane Keeping Systems. Report. 2021 Oct. Report No.: GRSG-122-16.

[223]

Madala K, Krishnamoorthy J, Gonzalez CA, Shivkumar A, Solmaz M. Contributing factors to consider while defining acceptance criteria and validation targets for assuring SOTIF in autonomous vehicles. SAE Technical Paper; 2022.

[224]

Favaro F, Fraade-Blanar L, Schnelle S, Victor T, Peña M, Engstrom J, et al. Building a credible case for safety: Waymo's approach for the determination of absence of unreasonable risk. 2023. arXiv:2306.01917.

[225]

Boddeker B, Von Wendorff W, Nguyen N, Diehl P, Meertens R, Johannson R. Automated driving safety—the art of conscious risk taking—minimum lateral distances to pedestrians. In: Proceedings of 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE); 2021 Feb 1-5; Grenoble, France. Piscataway: IEEE; 2021. p. 1466-71.

[226]

Rafrafi M, Bourdeaud'Huy T. Risk apportionment methodology based on functional analysis. In: Proceedings of the Multiconference on "Computational Engineering in Systems Applications"; 2006 Oct 4-6; Beijing, China. Piscataway: IEEE; 2006. p. 1103-9.

[227]

H. Langdalen, E.B. Abrahamsen, J.T. Selvik. On the importance of systems thinking when using the ALARP principle for risk management. Reliab Eng Syst Saf, 204 (2020), Article 107222

[228]

X. Tang, K. Yang, H. Wang, J. Wu, Y. Qin, W. Yu, et al. Prediction-uncertainty-aware decision-making for autonomous vehicles. IEEE Trans Intell Veh, 7 (4) (2022), pp. 849-862

[229]

K. Yang, X. Tang, S. Qiu, S. Jin, Z. Wei, H. Wang. Towards robust decision-making for autonomous driving on highway. IEEE Trans Veh Technol, 72 (9) (2023), pp. 11251-11263

[230]

Peng L, Li B, Yu W, Yang K, Shao W, Wang H. SOTIF entropy: online SOTIF risk quantification and mitigation for autonomous driving. IEEE Trans Intell Transp Syst. In press.

[231]

L. Peng, H. Wang, J. Li. Uncertainty evaluation of object detection algorithms for autonomous vehicles. Automot Innov, 4 (2021), pp. 241-252

[232]

J. Liu, H. Wang, Z. Cao, W. Yu, C. Zhao, D. Zhao, et al. Semantic traffic law adaptive decision-making for self-driving vehicles. IEEE Trans Intell Transp Syst, 24 (12) (2023), pp. 14858-14872

[233]

S. Li, J. Zhang, S. Wang, P. Li, Y. Liao. Ethical and legal dilemma of autonomous vehicles: study on driving decision-making model under the emergency situations of red light-running behaviors. Electronics, 7 (10) (2018), p. 264

[234]

H. Wang, A. Khajepour, D. Cao, T. Liu. Ethical decision making in autonomous vehicles: challenges and research progress. IEEE Intell Transp Syst Mag, 14 (2022), pp. 6-17

[235]

E. Medvet, A. Bartoli, J. Talamini. Road traffic rules synthesis using grammatical evolution. G. Squillero, K. Sim (Eds.), Applications of Evolutionary Computation, Springer, Berlin (2017), pp. 173-188

[236]

J. Talamini, A. Bartoli, A. De Lorenzo, E. Medvet. On the impact of the rules on autonomous drive learning. Appl Sci, 10 (7) (2020), p. 2394
