Functional Safety Suite 7.0
User’s Guide

1 Overview

According to [Wikipedia],

Functional Safety is the part of the overall safety of a system or piece of equipment that depends on the system or equipment operating correctly in response to its inputs, including the safe management of likely operator errors, hardware failures and environmental changes.

Functional Safety Suite is intended for modeling and calculation in the field of functional safety. It supports the following methods:

Event tree analysis:

A flexible and therefore widely used method for quantitative risk analysis.

Functional architecture models:

Graphical representation of the hardware of the safety function, including the logical and physical interaction between the components, in a schematic using common symbols. When information about the possible faults of each component and their propagation is added, a fault tree can be derived automatically from the architecture model.

Fault tree analysis:

A universal method for qualitative and quantitative hazard analysis. It is particularly suitable for calculating failure rates and unavailabilities of systems that are characterized by complex homogeneous or inhomogeneous multi-channel architectures. This software also supports correct calculation of failure rates of elements that may fail multiple times during the system lifetime. Monte Carlo simulation allows new types of gates, suitable for modeling diagnosis, replacements, spares etc. in a convenient and precise manner. The unreliability can be calculated as well.

Reliability block diagrams:

An alternative visualization of multi-channel structures, based on the same algorithms as used for fault trees.

Markov models:

A flexible method for quantitative hazard analysis of some time-variant systems that cannot be described by fault trees. Each fault tree can be represented by a Markov model, but the Markov model is typically much more complicated, since its complexity increases exponentially with the number of basic events.

Complex component models:

A specific tool to calculate safety parameters of components with multiple time-variant failure modes, of which some are safe and some dangerous. For example, the typical “bathtub” curve of failure rates can be modeled.
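
To illustrate the “bathtub” curve mentioned above, the following minimal sketch superimposes three Weibull hazard rates (early failures, random failures, wear-out). It is not the model used internally by Functional Safety Suite; the function names and all parameter values are assumptions chosen for illustration only.

```python
def weibull_hazard(t, shape, scale):
    """Hazard rate of a Weibull distribution in 1/h: (k/s) * (t/s)**(k-1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_hazard(t):
    """Sum of three competing failure mechanisms (hypothetical parameters)."""
    early = weibull_hazard(t, shape=0.5, scale=2.0e7)    # decreasing: early failures
    random_ = weibull_hazard(t, shape=1.0, scale=1.0e6)  # constant: random failures
    wearout = weibull_hazard(t, shape=4.0, scale=2.0e5)  # increasing: wear-out
    return early + random_ + wearout

# Early life, useful life and end of life (times in hours).
for t in (10.0, 1.0e4, 3.0e5):
    print(f"t = {t:9.0f} h   h(t) = {bathtub_hazard(t):.3e} 1/h")
```

With these assumed parameters the rate is high at the beginning, drops to roughly the constant random-failure level during the useful life, and rises again once wear-out dominates.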

Functional Safety Suite offers:

  • A graphical user interface to create and edit models.

  • Steady state and transient (time-dependent) evaluation.

  • High-performance algorithms for exact but nevertheless fast evaluation of even huge fault trees.

  • Charts for unavailability, unreliability, occurrence rate and other values as a function of time.

  • Export of all graphics in bitmap (PNG) or vector graphic format (SVG).

  • Export of all evaluation output data in text format.

  • Calculation of Partial Derivatives (Birnbaum Importance), Criticality Importance, Risk Reduction, Fussell-Vesely Importance and Risk Achievement for both basic events and generic basic events.

  • Readable file formats for all data (mostly XML files).

  • Linking of complex components, fault trees, reliability block diagrams, Markov models and event trees.

  • Support of modularization in fault trees and reliability block diagrams.

  • Creation of reports in Microsoft Word format (OOXML, docx).

  • Update of reports, even after they have been modified manually in Microsoft Word.
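
The importance measures named in the list above can be illustrated with a small sketch. It uses common textbook definitions, which are an assumption here and may differ in detail from the definitions implemented in Functional Safety Suite (in particular, Risk Achievement and Risk Reduction are shown in their ratio form). The fault tree TOP = A AND (B OR C), the probabilities and all function names are hypothetical.

```python
from itertools import product

# Minimal cut sets of the hypothetical fault tree TOP = A AND (B OR C).
CUT_SETS = [{"A", "B"}, {"A", "C"}]

def prob_any_cut_set(q, cut_sets):
    """Exact probability that at least one cut set is completely failed,
    obtained by enumerating all states of the independent basic events."""
    names = sorted(q)
    total = 0.0
    for state in product((False, True), repeat=len(names)):
        failed = {n for n, s in zip(names, state) if s}
        if any(cs <= failed for cs in cut_sets):
            p = 1.0
            for n, s in zip(names, state):
                p *= q[n] if s else 1.0 - q[n]
            total += p
    return total

def importance(q, name):
    """Textbook importance measures of the basic event `name`."""
    q_top = prob_any_cut_set(q, CUT_SETS)
    q_failed = prob_any_cut_set({**q, name: 1.0}, CUT_SETS)   # event certain
    q_perfect = prob_any_cut_set({**q, name: 0.0}, CUT_SETS)  # event impossible
    with_event = [cs for cs in CUT_SETS if name in cs]
    birnbaum = q_failed - q_perfect  # equals the partial derivative dQ/dq_i
    return {
        "birnbaum": birnbaum,
        "criticality": birnbaum * q[name] / q_top,
        "fussell_vesely": prob_any_cut_set(q, with_event) / q_top,
        "risk_achievement": q_failed / q_top,
        "risk_reduction": q_top / q_perfect if q_perfect > 0.0 else float("inf"),
    }

q = {"A": 1.0e-3, "B": 2.0e-2, "C": 5.0e-2}  # hypothetical unavailabilities
for name in q:
    print(name, importance(q, name))
```

The brute-force enumeration is only practical for a handful of basic events; it is used here because it keeps the definitions easy to verify by hand.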

Functional Safety Suite aims to provide the maximum possible symmetry between fault trees and Markov models. Thus most fault trees can be converted automatically to a Markov model, correctly considering common cause failures and condition events.
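
The price of such a conversion is the size of the resulting Markov model. The sketch below assumes a naive conversion in which every combination of independent, repairable basic events becomes one state; the actual conversion performed by Functional Safety Suite is not described here and may produce smaller models. The function name and event names are hypothetical.

```python
from itertools import product

def markov_state_space(events):
    """Enumerate states and edges of a naive Markov model for a fault tree
    with independent, repairable basic events (one flag per basic event)."""
    states = list(product((False, True), repeat=len(events)))
    edges = []
    for state in states:
        for i, name in enumerate(events):
            # Each edge toggles exactly one basic event: failure or repair.
            target = state[:i] + (not state[i],) + state[i + 1:]
            kind = "repair" if state[i] else "failure"
            edges.append((state, target, f"{kind} of {name}"))
    return states, edges

for n in (2, 4, 8):
    states, edges = markov_state_space([f"E{i}" for i in range(n)])
    print(f"{n} basic events: {len(states)} states, {len(edges)} edges")
```

Under this assumption, 2 basic events give 4 states and 8 edges, 4 give 16 states and 64 edges, and 8 already give 256 states and 2048 edges, which illustrates the exponential growth.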

Functional Safety Suite further aims to support the user in correct modeling. This is achieved by

  • automatic creation of fault trees based on architecture schematics,

  • completely internal handling of common cause failures in fault trees (beta-model),

  • simplified handling of common cause failures in Markov models,

  • conversion of fault trees to Markov models, including common causes and conditions,

  • many generic basic event models fitting to all typical events,

  • extensions to standard Markov models, so that also condition events can be used,

  • reasonable restrictions regarding modeling and configuration,

  • reasonable modifiers for basic events,

  • notes or warnings if suspicious data is encountered in evaluations,

  • cancellation of calculations that don’t make sense.
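
As an illustration of the beta-model mentioned in the list above, the following sketch evaluates two identical redundant channels by hand. It is a simplified textbook calculation under stated assumptions (β applied to the unavailability, independent events), not the algorithm used by Functional Safety Suite; the function name and the numbers are hypothetical.

```python
def redundant_pair_unavailability(q, beta):
    """Unavailability of two identical redundant channels (both must fail),
    where a fraction `beta` of each channel's unavailability is attributed
    to a common cause that disables both channels at once."""
    q_independent = (1.0 - beta) * q  # independent part, per channel
    q_common = beta * q               # shared common cause event
    both_independent = q_independent ** 2
    # TOP = (channel 1 AND channel 2) OR common cause
    return 1.0 - (1.0 - both_independent) * (1.0 - q_common)

q = 1.0e-3  # hypothetical channel unavailability
for beta in (0.0, 0.02, 0.1):
    print(f"beta = {beta:4.2f}   Q = {redundant_pair_unavailability(q, beta):.3e}")
```

Even a small β dominates the result: with these numbers, β = 0.02 raises the unavailability from about 1e-6 to about 2.1e-5, which is why the common cause contribution must not be omitted from redundant architectures.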

1.1 Terms and Abbreviations

Table 1: Terms and abbreviations

Term

Meaning

Basic event

An event related to an →element. The basic events of a Markov model form the →edges, the basic events of a reliability block diagram are the →blocks, the basic events of an event tree are the →cases (of a →condition).

β

The common cause factor of an occurrence rate or probability.

Branch

In a fault tree: The part of the tree that is below the event including the event itself (including the special case that the event is a basic event and therefore the branch is just the basic event).

Block

The representation of a basic event or a reference in a reliability block diagram.

Case

The representation of a basic event in an event tree. A case is one of the values that a →condition can take. Each case can be true or false, as defined by a probability (typically an →unavailability).

Component

A technical unit that can have several failure modes.

Condition

In an event tree: A constraint that can take at least two →cases.

Condition event

A basic event that is characterized by a probability (typically an unavailability), but no occurrence rate.

D

Duty cycle

Damage

In an event tree: The final state that can be reached in case of a hazard, characterized by its severity.

Edge

The representation of a basic event in a Markov model.

Element

Any →component, human behavior or environmental condition that influences the behavior of the system with respect to the →top event.

EUC

Equipment under Control, see definition in [EN 61508].

Event

A situation or a state that can occur related to an element, system or sub-system.

F(t)

The →unreliability.

FT

Fault tree

FTA

Fault tree analysis

Generic basic event

The probabilistic model that describes the occurrence or existence of a basic event. It is stored in a library and thus can be used in multiple models.

Generic damage

A possible damage, defined within one event tree, saved in the .etf file.

h

The occurrence rate (also: occurrence frequency) with unit 1/h. If h belongs to an event describing a failure, it is called (conditional) ‘failure frequency’ or (conditional) ‘failure intensity’ (CFI).

MRT

Mean repair time. If the overall system (i. e. the “EUC” in terms of [EN 61508]) is taken out of service immediately after detection of a failure, the MRT is zero. If the overall system continues to be operated for a certain time (e. g. with only one channel instead of two) or is not shut down at all, the MRT must be taken into account.

MTTD

Mean time to detect. Necessary for all kinds of dormant failures.

MTTF

Mean time to failure, and also the mean operation time between two failures.

MTTR

Mean time to restoration. Includes the →MTTD and the →MRT.

PFD

Probability of Failure on Demand, see definition in [EN 61508]. It is identical to the (mean) →unavailability \( \overline {Q} \).

PFH

Probability of Failure per Hour, see definition in [EN 61508]. It is identical to the (mean) failure frequency, and thus the (mean) occurrence rate →\( \overline {h} \).

PFTT

Short for ‘Process Fault Tolerance Time’, i. e. the time period for which the process (→EUC) can be operated with wrong control actions before it gets out of control (“point of no return”).

PI

Short for ‘Prime Implicant’, the equivalent of a minimal cut-set for incoherent fault trees. For coherent fault trees, the prime implicants are identical to the minimal cut-sets. Please see relevant literature for more information.

Q

The →unavailability.

RBD

Reliability block diagram

State

In a Markov model: A state that a system can take.

Sub-tree

A fault tree or a branch of a fault tree that is referenced by a TRANSFER-IN gate in a (higher-level) fault tree. A sub-tree may contain references to lower-level sub-trees. Dividing a fault tree into several fault trees is useful when a fault tree is too large to be displayed on one page, or if a branch of a tree is needed more than once.

System lifetime

The (mean) lifetime of the system in scope. Needed to calculate some values of complex components (see 10), for some basic event models (see 4), and as stop time for transient evaluation.

TFFR

Tolerable Function Failure Rate, the result of a THR apportionment and thus the safety requirement for a high demand or continuous mode safety (sub-)function of a (sub-)system. Also called “Tolerable Probability of Failure per Hour” (TPFH).

THR

Tolerable Hazard Rate, the result of a risk analysis for each identified hazard. If given in 1/h, it is mathematically the same as the →TPFH, but not every failure is a hazard.

Top event

The topmost gate of a fault tree, describing the (undesired) state that the system can enter due to the occurrence of one or several basic events.

TPFD

Tolerable Probability of Failure on Demand, the result of a THR apportionment and thus the safety requirement for a low demand mode safety (sub-)function of a (sub-)system.

TPFH

Tolerable Probability of Failure per Hour, the result of a THR apportionment and thus the safety requirement for a high demand or continuous mode safety (sub-)function of a (sub-)system. Also called “Tolerable Function Failure Rate” (TFFR).

Unavailability

The probability Q(t) that an →element or system does not perform as intended when it is needed at time t (“on demand”). For non-repairable systems, the unavailability Q(t) is identical to the unreliability F(t), and both are called “failure probability”. For repairable systems (modeled e. g. by basic events of type repairable or cyclic), unavailability Q(t) and unreliability F(t) are completely different values, since the unavailability becomes zero with each (complete) test or repair, whereas the unreliability increases monotonically.

Unreliability

The probability F(t1,t2) that an →element or system doesn’t perform as intended over a certain time interval [t1, t2]. Usually t1 = 0 is assumed, so F(t1,t2) is shortened to F(t) with t = t2 − t1.

w

The (unconditional) occurrence density considering restoration. In contrast to h, it is unconditional with respect to whether the component is still available at time t. However, it is not a ‘probability density’ (such as f(t)), since its integral over infinite time is greater than 1 in general. Its unit is 1/h.
Note: Up to version 3.3 of this program, the symbol w was used for the conditional occurrence frequency, which is now named h in accordance with most literature.
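
The difference between unavailability Q(t) and unreliability F(t) described in Table 1 can be made concrete with a small sketch. It assumes a single component with a constant failure rate whose dormant failure is revealed and removed by a perfect periodic test with negligible repair time; this simple model, the function names and the numbers are assumptions for illustration and are not taken from Functional Safety Suite.

```python
import math

def unreliability(t, rate):
    """F(t): probability of at least one failure in [0, t]."""
    return 1.0 - math.exp(-rate * t)

def unavailability(t, rate, test_interval):
    """Q(t): probability of being failed at time t; reset to zero by each test."""
    return 1.0 - math.exp(-rate * (t % test_interval))

def mean_unavailability(rate, test_interval):
    """Average of Q(t) over one test interval (about rate * test_interval / 2)."""
    x = rate * test_interval
    return 1.0 - (1.0 - math.exp(-x)) / x

rate = 1.0e-6           # 1/h, hypothetical failure rate
test_interval = 8760.0  # h, hypothetical yearly test
for t in (4380.0, 8759.0, 8761.0, 87600.0):
    print(f"t = {t:7.0f} h   F = {unreliability(t, rate):.3e}   "
          f"Q = {unavailability(t, rate, test_interval):.3e}")
print(f"mean Q = {mean_unavailability(rate, test_interval):.3e}")
```

F(t) keeps growing over the whole period, while Q(t) falls back to zero after every test. Under these assumptions the mean of Q(t) is the value that corresponds to the PFD.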

1.2 Conventions
  • A term in slanted letters indicates a term with a certain meaning within Functional Safety Suite, e. g. a type of data object.

  • A term in bold letters indicates a menu, command or button name.

  • A term in ‘single quotation marks’ indicates a fixed term not directly related to Functional Safety Suite.

  • A term in “double quotation marks” indicates a name or a quote. It is also used to indicate that a term or statement is not literally correct (e. g. a simplification or common but imprecise wording).