Understanding Safety / SMS Basic

Risk & HazID

Hazard Identification & Risk Reduction

The second pillar of the ICAO SMS framework aims to use the proper tools and techniques to identify hazards and to eliminate or reduce risk in the aviation industry. Risk is a subjective concept: each individual has a different risk tolerance and acceptance, so the aim here is to control risk rather than to eliminate it. Total risk elimination can only be achieved if we ground the aeroplanes and stop flying, but that would defeat the purpose. The concept promoted by ICAO and the Authorities is to reduce the risk to ALARP or to ALS. ALARP stands for As Low As Reasonably Practicable, which takes into account the operational and financial impact of risk control. ALS is the Acceptable Level of Safety set out by the Authorities, which means that organisations must find the means to achieve this level regardless of the impact.

Under the ALARP principle, the sum of all risks should lie in the acceptable area. The level of acceptability is initially set by the Regulator, but the operator can tighten it further (if the safety culture is strong). If a risk is not in the acceptable area, it must at least NOT be in the intolerable area. If it is in the intolerable area, the operation should cease immediately in order to protect the system. Demonstrating ALARP is a matter of practice and requires a skilful team. In many cases the transition from the intolerable area to ALARP can be achieved with good standard industry practices or with a simple change of procedures. There are cases, though, that require considerable effort and a serious financial cost to achieve ALARP. This is where management steps in to decide whether the cost-benefit analysis is valid or whether the operation should be ceased. ALARP changes over time: what was intolerable a few years ago can become an easy transition with the advancement of technology or the maturity of the organisation. It is therefore necessary to revisit the identified hazards periodically to re-examine, re-assess, re-evaluate and update the level of risk.
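The three areas described above (acceptable, ALARP, intolerable) can be illustrated with a minimal Python sketch. The numeric risk score and both thresholds are hypothetical and are not taken from ICAO material. In practice the Regulator sets the initial boundaries and the operator may tighten them.

```python
from enum import Enum


class RiskArea(Enum):
    ACCEPTABLE = "acceptable"
    ALARP = "tolerable only if shown to be ALARP"
    INTOLERABLE = "intolerable"


def classify_risk(score: float, acceptable_max: float, intolerable_min: float) -> RiskArea:
    """Place a risk score in one of the three areas.

    Both thresholds are assumed inputs: the Regulator sets the initial
    values and the operator may choose stricter ones.
    """
    if acceptable_max >= intolerable_min:
        raise ValueError("acceptable_max must be below intolerable_min")
    if score <= acceptable_max:
        return RiskArea.ACCEPTABLE
    if score >= intolerable_min:
        return RiskArea.INTOLERABLE
    return RiskArea.ALARP


def may_continue_operation(area: RiskArea) -> bool:
    # An operation in the intolerable area should cease immediately.
    return area is not RiskArea.INTOLERABLE


# Illustrative values only: the same score can change area when the
# operator tightens the thresholds or when the periodic review updates them.
regulator_view = classify_risk(12, acceptable_max=5, intolerable_min=15)
tightened_view = classify_risk(12, acceptable_max=4, intolerable_min=10)
```

With these illustrative numbers the same score falls in the ALARP area under the initial thresholds and in the intolerable area once the thresholds are tightened, which is why the periodic re-assessment matters.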

A hazard is a condition, an event, or a circumstance, existing or latent, which has the potential to cause harm, disrupt operations, cause damage, or reduce the ability to operate. A hazard can cause adverse consequences when its potential energy is released. A hazard is a prerequisite for an accident or incident to happen. Hazards are usually identified in energy sources (e.g. electrical system, fuel system) or in complex operations with safety-critical factors (e.g. aircraft turnaround, an aeroplane moving at high velocity).

A threat is the action or condition that causes the hazard to be released.

Note that many definitions exist for the same words, and often the definitions are not clear enough to provide a sharp distinction (e.g. the ICAO definitions of threat and hazard).

Similarly, one should not confuse hazard and consequence. Consequences are the end result of a hazard being released. An example given by ICAO is the case of a 15-knot wind. If it is a headwind along the runway, it is rather beneficial and cannot be considered a hazard. If, however, it is a 15-knot crosswind, which could be near the aircraft’s limits or the pilot’s limits, it is a hazard. If this hazard is then released by a threat, e.g. an inexperienced pilot, the consequence could be a loss of directional control during landing and a possible runway excursion.
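The wind example can be made concrete by resolving the wind into headwind and crosswind components with basic trigonometry. The following is a minimal sketch. The function names, the 17-knot limit and the 3-knot margin are hypothetical values chosen for illustration and do not come from any aircraft manual or ICAO text.

```python
import math


def wind_components(wind_speed_kt: float, wind_from_deg: float, runway_heading_deg: float):
    """Return (headwind, crosswind) in knots.

    Headwind is positive when the wind blows from ahead of the aircraft;
    crosswind is returned as a magnitude, ignoring left/right.
    """
    angle = math.radians(wind_from_deg - runway_heading_deg)
    headwind = wind_speed_kt * math.cos(angle)
    crosswind = abs(wind_speed_kt * math.sin(angle))
    return headwind, crosswind


def crosswind_is_hazard(crosswind_kt: float, limit_kt: float, margin_kt: float = 0.0) -> bool:
    # Treat the crosswind as a hazard once it is within `margin_kt` of the
    # applicable limit (aircraft or pilot). Both values are assumptions.
    return crosswind_kt >= limit_kt - margin_kt


# The same 15-knot wind on a runway with heading 090:
along_runway = wind_components(15, wind_from_deg=90, runway_heading_deg=90)
across_runway = wind_components(15, wind_from_deg=180, runway_heading_deg=90)

headwind_case = crosswind_is_hazard(along_runway[1], limit_kt=17, margin_kt=3)
crosswind_case = crosswind_is_hazard(across_runway[1], limit_kt=17, margin_kt=3)
```

Under these assumed limits the wind along the runway is not flagged, while the same wind blowing across the runway is. The wind speed is identical in both cases. Only the condition that makes it a hazard differs.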

Hazard Identification is the process of looking ahead and trying to understand and control hazards. HazID must be a lateral-thinking process, i.e. one should try to look in every direction, unencumbered by past ideas and experiences. Hazards can sometimes be very obvious to the researcher, but they can also be totally hidden and only be uncovered as a result of a release or because a similar hazard has been identified. Hazards exist at all levels of the organisation, and thus it is vital that every department provides input during this process.

The methodology for collecting data can be any of the three typical ways in SMS: reactive, proactive or predictive. Reactive methodologies require examination of past events and reports, whilst proactive methods examine data from monitoring and surveys to uncover latent conditions. Predictive methodologies go a step further: one tries to be creative and identify hazards never heard of before, or to examine cases that would be considered extreme. As SMS is an industry-wide approach, the practitioner should not limit the inputs to those from within the organisation but should make every effort to communicate with other providers or the Authorities in order to enrich the list of potential hazards. Some methodologies include formal reviews, surveys, monitoring of data, assessments of operations, reports, brainstorming etc.

Hazards can be categorised in broad groups in order to prioritise them and to control them more easily. The most common categories are:

  • Environmental (ENV), which refers mainly to weather phenomena (e.g. thunderstorms), natural geographical events (e.g. floods) and terrain restrictions
  • Technical (TECH), which refers to the engineering aspect of the machinery and equipment used in aviation operations, as well as the facilities where this equipment is stored or used
  • Organisational (ORG), which includes factors such as the culture in the company, expansion or recession of operations, the economic situation of the organisation as well as the global economy, and operational philosophies
  • Human (HUM), which includes all human performance issues, like physical wellbeing, illness, fatigue, cognitive and physical limitations etc

The whole process should follow the PDCA cycle (Plan-Do-Check-Act). The process can be divided into six distinct steps in order to be clear, procedural and controlled. The steps are:

  1. Identify Hazards
  2. Explore the likelihood and severity of hazards
  3. Identify the current defences and/or controls
  4. Evaluate effectiveness of current defences and/or controls
  5. Identify extra defence and/or controls required
  6. Record and monitor
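The six steps above can be sketched as a small data model. This is only an illustration under loud assumptions: the 1 to 5 scales, the 0 to 1 effectiveness figure, and the rule that each control scales the likelihood down independently are all hypothetical simplifications, not an ICAO method. The hazard and control names are invented examples.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    name: str
    effectiveness: float  # assumed scale: 0.0 (no effect) to 1.0 (fully effective)


@dataclass
class HazardAssessment:
    hazard: str          # step 1: the identified hazard
    likelihood: int      # step 2: assumed scale 1 (rare) to 5 (frequent)
    severity: int        # step 2: assumed scale 1 (negligible) to 5 (catastrophic)
    controls: list[Control] = field(default_factory=list)  # step 3: current defences

    def residual_risk(self) -> float:
        # Step 4: evaluate the current defences. Assumption: each control
        # independently reduces the likelihood by its effectiveness.
        likelihood = float(self.likelihood)
        for control in self.controls:
            likelihood *= 1.0 - control.effectiveness
        return likelihood * self.severity

    def needs_extra_controls(self, acceptable_max: float) -> bool:
        # Step 5: extra defences are required while the residual risk
        # stays above the (assumed) acceptable threshold.
        return self.residual_risk() > acceptable_max


assessment = HazardAssessment(
    hazard="crosswind landing",
    likelihood=4,
    severity=4,
    controls=[Control("crosswind limits in procedures", 0.5)],
)
before = assessment.needs_extra_controls(acceptable_max=6)

# Add a further defence and re-evaluate (the Check and Act of PDCA).
assessment.controls.append(Control("recurrent crosswind training", 0.5))
after = assessment.needs_extra_controls(acceptable_max=6)
```

Step 6 (record and monitor) is deliberately left out of the sketch, since it belongs to whatever record the organisation keeps of its hazards.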

Hazard identification itself can be further broken down into three steps:

  1. State the generic hazard
  2. Identify specific components of the hazard
  3. Identify specific risk associated with the hazard

HazID is a process that should be continuous and constantly looking outwards. However, some events may trigger an immediate HazID process. These could be an event (incident or accident), an unexplained increase in minor events, or an increase in non-compliances. It can also be an expansion of operations or major organisational changes.

The final form of hazard identification must be a well-documented and controlled process. As part of HazID, SMS requires the Hazard Log, where the organisation lists all the identified hazards and the controls or measures taken. The Hazard Log is an important live document that requires continuous monitoring and updating. As a minimum, the Hazard Log should include the following items:

  • Project name
  • Name of system being examined
  • ID of Log
  • ID of Hazards
  • Who identified the hazard (identifier)
  • Date Created
  • Last updated
  • Description (narrative or bullet point)
  • Category of hazard
  • Consequences (potential)
  • Severity
  • Proposed actions – Mitigation measures
  • Proposed by (name of individual)
  • Person responsible for the actions (name)
  • Date of Actions taken
  • Date of Review
  • Status
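As one possible way to hold these items, the list above can be mapped onto a simple record type. This is a sketch only: the field names, the `HazardCategory` codes (ENV, TECH, ORG, HUM), the free-text `severity` and `status` values, and the `is_review_due` helper are assumptions made for illustration. All example values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class HazardCategory(Enum):
    ENV = "Environmental"
    TECH = "Technical"
    ORG = "Organisational"
    HUM = "Human"


@dataclass
class HazardLogEntry:
    project_name: str
    system_name: str
    log_id: str
    hazard_id: str
    identified_by: str
    date_created: date
    description: str
    category: HazardCategory
    potential_consequences: list[str]
    severity: str
    proposed_actions: list[str] = field(default_factory=list)
    proposed_by: Optional[str] = None
    responsible_person: Optional[str] = None
    date_actions_taken: Optional[date] = None
    date_of_review: Optional[date] = None
    status: str = "open"  # hypothetical status values, e.g. "open" / "closed"
    last_updated: Optional[date] = None

    def __post_init__(self) -> None:
        # A new entry was last updated on the day it was created.
        if self.last_updated is None:
            self.last_updated = self.date_created

    def is_review_due(self, today: date) -> bool:
        # The log is a live document: flag entries whose review date has passed.
        return self.date_of_review is not None and self.date_of_review <= today


# Invented example entry, for illustration only.
entry = HazardLogEntry(
    project_name="Example project",
    system_name="Ground operations",
    log_id="LOG-001",
    hazard_id="HAZ-001",
    identified_by="Example identifier",
    date_created=date(2024, 1, 10),
    description="Crosswind near limits during landing",
    category=HazardCategory.ENV,
    potential_consequences=["Loss of directional control", "Runway excursion"],
    severity="Major",
    date_of_review=date(2024, 6, 1),
)
review_due = entry.is_review_due(date(2024, 7, 1))
```

Keeping the category as a fixed set of codes makes entries comparable across the log, while the free-text description preserves the narrative for hazards that do not fit an existing grouping.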

There are two approaches when classifying hazards. Both can be useful depending on the case. The first is to follow the standard industry taxonomies and groupings of hazards, so as to be able to compare results and keep track of industry standards. The other is raw and involves a complete narrative description of the hazard, which is helpful for novel hazards or for hazards with no prior experience. However, with this approach comparison and data analysis can be more difficult.

The following are some of the considerations one must have in mind during this process:

  • Design of system
  • Procedures and Practices of the department
  • Communication (lateral, bottom-up and top-down)
  • Organisation Factors affecting the system
  • Working environment conditions
  • Regulatory issues
  • Defences in place and cross departmental interactions of defences
  • Human Factors