System Safety Engineering

Software Safety

Software is built into more and more products and systems, and it has taken on a growing share of their functionality. The question is: how dependable is the functionality that software provides? The traditional approach to verifying functionality (try it out and see if it works) is of limited value for software, which can be much more complex than hardware.

Software safety has evolved into an effort that runs in parallel with the development of the software itself. The system safety engineer is involved in each step of the software development process, identifying which functions are critical to the safe functioning of the greater system and tracing those functions down into the software modules that support them.

NASA has a Software Safety Guidebook online. It describes the software safety effort as a part of a larger system safety program.

The DoD also has a handbook, along with other useful links.

However, the main problem with applying the traditional system safety method to software is that the probability of software failure is not measurable or even easily estimated. Traditional system safety uses a combination of probability and severity to rate the risk of each hazard. Software does not "fail" after it is completed. Instead, latent defects in the original product assert themselves later in the life of the product, potentially causing safety problems. The methods described in the handbooks above replace probability with "software control authority": how much control the software has over the system, and how much time a human operator has to intervene if the software does something unexpected.
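The idea of rating software by severity and control authority, rather than by severity and probability, can be sketched in a few lines of Python. This is only an illustration: the category names, the four-level scale, and the "take the less demanding of the two ranks" rule are all invented for this example and are not taken from the NASA or DoD handbooks, which define their own categories and tables.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale: 1 is the worst outcome.
    CATASTROPHIC = 1
    CRITICAL = 2
    MARGINAL = 3
    NEGLIGIBLE = 4


class ControlAuthority(IntEnum):
    # Hypothetical control-authority scale: 1 is the most authority.
    AUTONOMOUS = 1              # software acts alone; no time for a human to intervene
    OPERATOR_CAN_INTERVENE = 2  # software acts, but an operator has time to override
    ADVISORY = 3                # software only informs a human, who decides
    NO_SAFETY_ROLE = 4          # software has no influence on the hazard


def software_criticality(severity: Severity, authority: ControlAuthority) -> int:
    """Return a criticality level from 1 (most rigor needed) to 4 (least).

    Assumed rule for illustration only: the level is set by whichever
    factor is less demanding, so a function needs the highest rigor only
    when the hazard is severe AND the software has high authority over it.
    """
    return max(int(severity), int(authority))


# A severe hazard under fully autonomous software control ranks highest.
top = software_criticality(Severity.CATASTROPHIC, ControlAuthority.AUTONOMOUS)

# The same hazard ranks lower when the software is only advisory.
advisory = software_criticality(Severity.CATASTROPHIC, ControlAuthority.ADVISORY)
```

Note that no failure probability appears anywhere in the calculation, which is the point of the substitution.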

An alternative approach is to use the techniques of Software Reliability Engineering to estimate the reliability of a piece of software as it goes through the development process. This is done by tracking the bug reports and fitting the rate of bug removal to an exponential curve. The curve is then extrapolated to estimate future reliability. An entire textbook is available online.
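The curve-fitting step can be sketched as follows. This is a minimal example, not the procedure from any particular textbook: it assumes the bug rate is sampled as a count per week, that it decays as a * exp(-b * week), and that a least-squares fit on the logarithm of the counts is good enough. The weekly counts are made up for illustration.

```python
import math


def fit_exponential_rate(counts):
    """Fit count ~= a * exp(-b * week) by least squares on ln(count).

    Weeks are numbered 0, 1, 2, ... Every count must be positive,
    because the fit takes logarithms. Returns (a, b).
    """
    xs = list(range(len(counts)))
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predicted_rate(a, b, week):
    """Extrapolated bug rate (bugs per week) at a given week."""
    return a * math.exp(-b * week)


def expected_remaining(a, b, week):
    """Bugs still expected after `week`: the integral of the fitted
    curve from `week` to infinity. Only meaningful when b > 0."""
    return a * math.exp(-b * week) / b


# Made-up weekly bug-report counts, for illustration only.
weekly_bug_reports = [40, 31, 26, 19, 16, 12, 10, 7]
a, b = fit_exponential_rate(weekly_bug_reports)

# Extrapolate four weeks past the last observation.
future_week = len(weekly_bug_reports) + 3
future_rate = predicted_rate(a, b, future_week)
remaining = expected_remaining(a, b, future_week)
```

If the fitted b comes out zero or negative, the bug rate is not declining and the extrapolation says nothing useful about future reliability.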