1. SW-FMEA: What Is It For, Why Do We Need It, and What’s the Challenge?
The ISO 26262 standard mandates a Safety-Oriented Software Analysis (SSA), but it doesn’t provide sufficient methodological detail for conducting one.
While ISO 26262 defines the “what” — the requirements, expected outcomes, and guiding principles — it leaves the “how” of performing an SSA open to interpretation.
Key Aspects of SSA
The SSA primarily focuses on three critical elements:
- Software Properties (ISO 26262 Part 6, Section 7.4.10)
Software properties refer to the “environment” or “context” in which the application software operates. Importantly, these properties are independent of specific software functions and address measures that prevent software errors in general. The following list pairs software properties with examples of safety measures that ensure them.
Examples:
- Robustness against erroneous inputs
  - E.g. checksums, counters, range checks, redundancies
- Freedom from interference
  - E.g. Memory Protection Unit (MPU), dual storage
- Scheduling properties
  - E.g. watchdog, program flow monitoring
- Functional behavior of software components
  - E.g. static code analysis, worst-case stack analysis
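As an illustration of robustness measures against erroneous inputs, the sketch below shows a range (plausibility) check and a simple additive checksum in C. The function names, limits, and safe default value are hypothetical, chosen only to make the idea concrete; a real project would derive them from its safety requirements.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical plausibility check for a temperature sensor reading.
 * Values outside the physically possible range are treated as erroneous
 * and replaced by a safe substitute value. */
#define TEMP_MIN_C          (-40)
#define TEMP_MAX_C          (150)
#define TEMP_SAFE_DEFAULT_C (25)

int16_t clamp_temperature(int16_t raw_c, bool *valid)
{
    if (raw_c < TEMP_MIN_C || raw_c > TEMP_MAX_C) {
        *valid = false;
        return TEMP_SAFE_DEFAULT_C;  /* fall back to a safe default */
    }
    *valid = true;
    return raw_c;
}

/* Simple additive checksum over a message buffer, as one example of
 * detecting corrupted input data. The complement is chosen so that
 * summing the payload plus the checksum yields zero. */
uint8_t checksum8(const uint8_t *data, uint32_t len)
{
    uint8_t sum = 0;
    for (uint32_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (uint8_t)(~sum + 1);  /* two's complement of the sum */
}
```

A receiver would recompute the sum over payload plus checksum and treat any nonzero result as a corrupted message.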
- Software Architecture (ISO 26262 Part 6, Section 7.4.10)
This aspect evaluates the safety-related functionality of the software architecture, abstracting software from hardware dependencies. This blog outlines the methodologies, guidewords, and contentious aspects related to software architecture analysis. Practical examples will be covered in the next installment.
Distinctions:
- Dependent failure analysis (DFA) addresses dependencies on hardware (e.g. resource sharing, independence of memory and processing for safety-critical functions) as well as dependencies between software elements of lower and higher Automotive Safety Integrity Levels (ASILs).
- System-level and hardware-level design consider diagnostic coverage and means to effectively mitigate system-level safety risks, resulting in measures such as signal redundancy or single-bit correction.
- Freedom From Interference (FFI) (ISO 26262 Part 6, Section 7.4.11)
FFI analysis within the DFA ensures that independent software elements do not adversely influence one another; hardware dependencies are also considered.
FFI focuses on:
- Timing and Execution: Examining task configurations, core assignments, and safeguards to prevent QM (Quality Management) code from starving ASIL code.
- Memory: Analyzing memory protection schemes, such as redundancies, single-bit error corrections, and memory protection units (MPU).
- Communication: Ensuring robust protection mechanisms and detecting potential communication failure modes.
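One common way to implement the dual-storage measure mentioned above is to keep a bit-inverted shadow copy of each safety-relevant variable and compare the copies on every read. The sketch below, with hypothetical type and function names, illustrates the idea in C: a mismatch signals memory corruption, e.g. caused by interfering software, and the caller would typically transition to a safe state.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical dual-storage scheme for a safety-relevant variable:
 * the value is stored twice, the second copy bit-inverted. */
typedef struct {
    uint32_t value;
    uint32_t value_inv;  /* bitwise complement of value */
} redundant_u32_t;

void redundant_write(redundant_u32_t *slot, uint32_t v)
{
    slot->value = v;
    slot->value_inv = ~v;
}

/* Returns true and stores the value in *out only if both copies agree;
 * a false return indicates corruption and must be handled by the caller. */
bool redundant_read(const redundant_u32_t *slot, uint32_t *out)
{
    if (slot->value != ~slot->value_inv) {
        return false;  /* corruption detected: enter safe state */
    }
    *out = slot->value;
    return true;
}
```

A single bit flip in either copy breaks the complement relation and is therefore detected on the next read; this complements, rather than replaces, hardware measures such as an MPU or ECC memory.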
The primary goal of software architecture analysis is to demonstrate that safety-related software functions and properties are suitable and sufficient for achieving overall system safety.

Purpose of This Blog
This blog aims to provide practical guidance for conducting a Software Failure Mode and Effects Analysis (SW-FMEA)—a vital component of SSA. Given the lack of a standardized industry methodology, this guidance is especially valuable for individuals trying to navigate SW-FMEA work and for organizations striving to develop their own approaches.
Conclusion
- What is the SW-FMEA for?
  SW-FMEA is a systematic method to evaluate a product’s safety-related software architecture and functions.
- Why do we need it?
  The SW-FMEA is an analysis method that can be used to satisfy the software architecture aspect of the SSA as required by ISO 26262 Part 6.
- What’s the problem?
  There is currently no standardized or widely accepted methodology for performing an SW-FMEA, leaving organizations to develop their own approaches.
2. Why the AIAG & VDA FMEA Handbook Falls Short for SW-FMEA
The FMEA Handbook, jointly published by the AIAG (Automotive Industry Action Group) and VDA (Verband der Automobilindustrie) in 2019, serves as a unified guide to industry best practices for developing FMEAs. While comprehensive for traditional FMEAs, it falls short when it comes to the needs of Software Failure Mode and Effects Analysis (SW-FMEA).
What the FMEA Handbook Covers
The FMEA Handbook provides guidance on:
- Design FMEA (DFMEA)
- Process FMEA (PFMEA)
- Supplemental FMEA for Monitoring and System Response (FMEA-MSR)
While these methodologies include some principles applicable to SW-FMEA, they are primarily focused on hardware and system-level behaviors; fundamental differences arise when analyzing software.
FMEA Purpose vs. SW-FMEA Purpose
The purpose of an FMEA, as outlined in the FMEA Handbook, is to address the technical aspects of risk reduction for both product design and processes.
While FMEAs are primarily used to improve safety and prevent failures, their application is broad. They encompass considerations such as reliability, manufacturability, serviceability, and reducing warranty and goodwill costs, among others.
In contrast, the purpose of a SW-FMEA is specifically focused on preventing the violation of safety goals. Its scope is narrower, with an exclusive emphasis on safety-related concerns and ensuring compliance with explicitly documented safety goals.
Attempts to Address Software
The FMEA Handbook touches on software in several sections:
- Chapter 4 (FMEA-MSR): Evaluates risks under real-world conditions, such as switching to a degraded mode, notifying the driver, or logging diagnostic trouble codes (DTCs) for service.
- Appendix A (View A): Mentions DFMEA for software.
- Appendix E (E1 to E3): Includes limited guidance on FMEA for software.
Despite these efforts, the Handbook fails to meet the requirements of ISO 26262, particularly in addressing software architecture.
The Gap: System-Level Behavior vs. Software Architecture
The Handbook views software primarily in terms of system-level behavior. ISO 26262, however, requires analysis at the software architecture level. This distinction is clearly stated in Appendix E1, which notes:
“[…] not the software, but the functions […] are to be examined in the system context.”
This focus on system functions rather than the software architecture limits the Handbook’s relevance for conducting a comprehensive SW-FMEA.
What Principles Can Be Adapted for SW-FMEA?
Despite these gaps, some fundamental FMEA principles remain relevant for SW-FMEA:
- Failure Modes: Identifying potential failures (“What can go wrong?”).
- Effects Analysis: Assessing the impact of those failures (“What happens if it fails?”).
- Mitigations: Implementing strategies to prevent, detect, or react to potential failures.
Conclusion
The AIAG & VDA FMEA Handbook focuses on the system level. It does not provide the detailed methodology required for analyzing software architecture as mandated by ISO 26262 Part 6.
3. SW-FMEA and Software Architecture: How Do They Interact?
What does the interaction between the software architecture and the SW-FMEA actually look like? So far, we’ve only established that focusing solely on system-level behavior is insufficient.
According to ISO 26262 Part 6, Annex E, the SSA examines the software architecture to ensure the suitability of its functionality.
- Analyzing the Architecture:
  The SW-FMEA evaluates the software architecture to identify potential failure modes and their effects.
- Addressing Gaps:
  If gaps or deficiencies are identified in the software architecture, updates are required. This could involve revising:
  - Software Architecture
  - Software Requirements
- Implementing Safety Measures:
  When additional safety needs are identified, the Safety Plan must be updated to reflect the implementation of broader improvement measures.

Conclusion
The SW-FMEA is an iterative process that not only evaluates the software architecture but also drives updates to the software safety requirements and identifies the need for additional safety measures. By doing so, it ensures that the software architecture aligns with the safety goals.
4. From DFMEA to SW-FMEA
Readers are likely familiar with DFMEA and PFMEA, methodologies that have been foundational to automotive product development since the 1970s (see also the FMEA Handbook). However, transitioning from these established methodologies to SW-FMEA requires addressing key differences unique to software.
Key Differences: From FMEA to SW-FMEA
Software introduces a distinct set of challenges due to its nature:
- Systematic Failures Only:
  Software doesn’t experience random failures like hardware does. Its sole failure type is the systematic design fault, commonly referred to as a “bug.” Bugs can originate at any stage, from system design and software architecture to detailed design and implementation. The implication is that there is no need to establish a root cause as part of the SW-FMEA, because there is only one: a bug.
- Safety-Critical Design:
  Even with the potential for bugs, safety-critical software must be designed to reduce the risk of hazards. This requires ensuring functionality aligns with design intent, incorporating fault tolerance, and enabling reliable operation under varying conditions, including random hardware failures or environmental changes.
This Leads to a Shift in Focus
Moving from a hardware-centric DFMEA to a software-focused SW-FMEA introduces several fundamental changes:
- Root Causes: There’s no need to analyze root causes in SW-FMEA; the singular cause is always “bugs.”
- Mental Models: A conceptual framework is needed to analyze the range of potential malfunctions introduced by design faults.
Risk Metrics in the SW-FMEA?
Risk Metrics: The traditional Risk Priority Number (RPN) — based on Severity (S), Occurrence (O), and Detection (D) — is often seen as less relevant in the context of SW-FMEA.
While software engineers must evaluate the adequacy of safety measures, there is broad consensus that the S-O-D framework, originally designed for hardware systems, does not align well with the nature of software risks. However, a case can still be made for adapting the S-O-D framework using a software-maturity perspective.
Misconception about RPN in Software
Before getting into that, we need to clarify a very common misconception about RPN, in particular the O and D values. They are risk-based judgments: less about how often something actually occurs, and more about the risk, the unknown. If something has been used in the field for a long time with no field issues, its occurrence risk value is low. If a potential failure would have been found in a design analysis or prototype test, its detection risk value is low.
For example, in a Design FMEA (DFMEA) for a mechanical seal:
- If the seal is being used for the first time before completion of the Design Verification Plan and Report (DVP&R), both Occurrence and Detection values would initially be high.
- After successful DVP&R testing, the Detection value would decrease because potential failures have been mitigated.
- With proven performance over time in the field, the Occurrence value would also drop.
Adapting Risk-Based Judgments for Software
For software, a risk-based approach can similarly adapt the Occurrence and Detection metrics:
- Occurrence
- High-Risk: “This is a brand-new architecture, so design faults are more likely.”
- Low-Risk: “This architecture has been in use for decades with a proven track record of reliability.”
- Detection
- High-Risk: “Issues will only become apparent after the software is deployed in the field.”
- Low-Risk: “Errors can be identified during architectural analysis or through thorough testing methods.”
In software, Occurrence and Detection values are less about absolute failure rates and more about confidence in design maturity and robustness of the verification process.
A related concept is software complexity and provenance, both of which are emphasized in ISO PAS 8926:2024 as criteria for qualifying safety-critical software from non-ISO 26262 sources.
- Complexity is defined as the “degree to which a software or a software architectural element exhibits a design, implementation, and/or functionalities that are difficult to understand and verify.”
- Provenance refers to “information regarding the origins, custody, and ownership of software and its associated data.” When software is developed according to another functional safety standard, the uncertainty related to its provenance is generally lower.
Applicability of DFMEA Steps to SW-FMEA
The following table outlines how the steps from the FMEA Handbook translate to SW-FMEA:
| DFMEA Step | SW-FMEA Applicability | SW-FMEA Details |
|---|---|---|
| Step 1: Planning and Preparation | Applicable in principle | Covered by ISO 26262 work products such as the safety plan, item definition, and software architecture. |
| Step 2: Structure Analysis | Applicable in principle | Structure analysis focuses on the allocation of functions and requirements to system elements, as well as defining boundaries and interfaces. |
| Step 3: Function Analysis | Applicable in principle | Breakdown of scope into systems, subsystems, and components, aligning with ISO 26262 work products and processes. |
| Step 4: Failure Analysis | Mostly applicable | Failure modes and effects are relevant, but analyzing failure causes is not necessary. |
| Step 5: Risk Analysis | Applicable | Risk assessment is essential, but quantifying it using severity, occurrence, and detection is not useful for SW-FMEA. |
| Step 6: Optimization | Applicable | Updates to the software architecture, software requirements, or planning of additional safety measures are crucial. |
| Step 7: Results Documentation | Applicable in principle | Summarizing and communicating results aligns with ISO 26262 work products and processes. |
Conclusion
The SW-FMEA is a systematic approach to ensure that safety-critical software minimizes the risk of hazards by meeting design intent, incorporating fault tolerance, and delivering reliable functionality.
Unlike DFMEA, SW-FMEA:
- Does not require root cause analysis, as all failures stem from design bugs.
- Demands a mental model for evaluating potential malfunctions.
There is some controversy around using risk values in the SW-FMEA. Most often the SW-FMEA does not rely on traditional RPN metrics for risk quantification.
5. The Controversy: Is the SW-FMEA just Busywork?
The SW-FMEA has garnered a reputation for being tedious and sometimes undervalued, with many software engineers dismissing it as “busywork.” This perception often stems from its tendency to devolve into a lengthy checklist of software architecture blocks and signal interactions, offering limited insight or actionable outcomes.
Functional safety processes must be designed to reduce hazards efficiently and effectively, but when the SW-FMEA becomes a check-the-box exercise, it fails to meet these objectives. To maintain relevance and utility, the SW-FMEA must focus on its core purpose: identifying and mitigating risks in software architecture in a meaningful way.
Refocusing the SW-FMEA
If the SW-FMEA has become more of a checklist in your organization, it may be time to revisit its methodology. While some issues—like “the function computes the wrong value”—are better handled through requirements, code reviews, and testing, the SW-FMEA should be used to systematically evaluate the architecture from a risk perspective of violating a safety goal.
Rather than duplicating efforts already covered by other processes, the SW-FMEA’s value lies in asking:
- “What can go wrong at the architectural level in regard to violating safety requirements?”
- “What potential risks do software modules have to violate safety requirements despite following the development process?”
A Novel Approach: Focus on Software Maturity
To realign the SW-FMEA with its core purpose, this novel approach shifts the focus from exhaustive checklists to evaluating software maturity and technical robustness.
Instead of cataloging every block and signal in the architecture, this method involves posing targeted, risk-based technical questions about specific functions.
Key questions for a maturity-focused SW-FMEA might include:
- Algorithm maturity:
- How was the algorithm selected and qualified?
- Were simulation and prototyping used for verification?
- How resilient is the algorithm to external factors like electrical noise, quantization errors, or timing jitter?
Let’s take the example of a function calculating rotational speed from position sensor data:
- Has the algorithm been proven in prior use?
- Was the impact of propagation delay on speed changes considered?
- How does timing variation affect speed averaging?
- Should position sensor data include timestamps, or is the time delta between processing calls sufficient?
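To make the timestamp question concrete, the following C sketch (all names, units, and limits hypothetical) computes rotational speed from timestamped position samples. Using the captured timestamp delta rather than an assumed fixed call period makes the result robust against scheduling jitter, which is exactly the kind of design decision a maturity-focused SW-FMEA would probe.

```c
#include <stdint.h>

/* Hypothetical timestamped position sample. */
typedef struct {
    int32_t  position_mdeg;  /* position in millidegrees */
    uint32_t timestamp_us;   /* capture time in microseconds */
} pos_sample_t;

/* Speed in millidegrees per second, computed from the sensor timestamps
 * instead of a nominal call period. Returns 0 for an implausible time
 * delta (zero, or more than one second apart), so the caller must treat
 * that case explicitly rather than trusting a garbage value. */
int32_t speed_mdeg_per_s(pos_sample_t prev, pos_sample_t curr)
{
    /* Unsigned subtraction handles timestamp counter wrap-around. */
    uint32_t dt_us = curr.timestamp_us - prev.timestamp_us;
    if (dt_us == 0u || dt_us > 1000000u) {
        return 0;  /* implausible delta: report standstill, flag elsewhere */
    }
    /* 64-bit intermediate avoids overflow of dpos * 1e6. */
    int64_t dpos = (int64_t)curr.position_mdeg - (int64_t)prev.position_mdeg;
    return (int32_t)((dpos * 1000000) / (int64_t)dt_us);
}
```

If instead the time delta between processing calls were assumed constant, any jitter in task activation would translate directly into speed error, which is the risk the questions above are meant to surface.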
This approach emphasizes the technical viability and resilience of the software, fostering a deeper understanding of potential risks and encouraging proactive mitigation strategies.
Conclusion
The SW-FMEA can occasionally drift into the realm of “busywork,” especially when approached as a rigid checklist. However, by refocusing the SW-FMEA on its core purpose—systematic risk identification and mitigation—organizations can restore its value as a cornerstone of functional safety.
The proposed maturity-focused approach highlights meaningful inquiry over exhaustive documentation, ensuring the SW-FMEA remains a practical and effective tool for evaluating software architecture and improving safety outcomes.
6. To be Continued …
This blog serves as both a guide to SW-FMEA and a platform for discussing controversial topics, such as the relevance of RPN and the need to challenge the SW-FMEA approach if it risks becoming a box-ticking exercise.
In the next installment, we’ll delve into practical examples of SW-FMEA in action, demonstrating how to apply these concepts effectively in real-world scenarios.
Looking to sharpen your skills in SW-FMEA and safety analysis? Explore our full list of training courses for expert guidance on ISO 26262, software architecture, and more.