Alarm Reduction in the Control Room to Enhance Operator Response

Introduction
The number of telemetry points monitored and controlled from a Control Center has increased markedly. This is mainly due to the many expansion programs under way in a company, which may include large capital projects or simple facility upgrades. The question that needs to be asked is whether such a high number of alarms improves or hampers the operator’s response. When it comes to monitoring plants and power systems, more alarms are not necessarily better. A proliferation of alarms can overwhelm an operator during an emergency, impairing the ability to respond quickly and correctly. The resulting alarm flood can cause a critical alarm to go unnoticed, masked by less important alarms.

What is the purpose of an alarm system?
The purpose of any alarm system is to direct the operator’s attention towards conditions requiring timely assessment and action, to prevent the situation from escalating.

Alarm management is an important function in any SCADA system. One of the most common causes of unplanned downtime is a failure to respond effectively during plant disturbances. About 40% of the losses in the industry are attributed to human error and are deemed preventable. Engineers must always look to improve the presentation of alarms to the operators so that they can respond correctly and in a timely manner to prevent a potential disaster. Both SCADA and operation engineers need to work together to improve the current alarm system in the Control Center environment.

Outlined below are some definitions and the steps that can be taken to improve the alarm system in a Control Center.

What is an alarm?
A warning to the operator that immediate action is required to correct a prevailing condition in the plant.

What is Alarm Management?
A process by which alarms are engineered, monitored, and managed to ensure safe and reliable company operations. Alarm management is like a safety program: it is an ongoing process. Continuous improvement and performance monitoring are essential.

Why so many alarms?
In the old analog days, alarms were hard-wired. There was a cost associated with each alarm, so alarms were carefully designed and installed. In the days of the DCS (distributed control system), alarms are generally free, and therefore every possible alarm tends to be enabled.

How does improving an Alarm System help an Operator?
The operator can focus on important alarms, suppress meaningless alarms as needed, and quickly assess the operating condition based on clear, concise and consistent message descriptions. The operator can gather information on the cause of the incident and apply the recommended corrective action. Evaluating alarm system and operator performance is an essential part of a good alarm system.

Four Phases to improve an Alarm System
Improving an existing alarm system consists of four phases, as outlined below. A good analysis tool that identifies which devices are creating the most alarms is essential for any measurable improvement. It can help determine the most frequent alarms, the chattering alarms and the nuisance alarms. It also shows where and when alarms occur, which helps to deduce the cause of alarm floods and, where appropriate, to re-prioritize the alarms.
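As a rough illustration of what such an analysis tool does, the sketch below counts the most frequent alarms and flags chattering tags from an alarm log. The log format, the tag names and the chattering threshold (three activations within one minute) are assumptions made for the example, not values taken from any standard or from a particular SCADA product.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical alarm log: (timestamp, tag) pairs exported from the SCADA historian.
SAMPLE_LOG = [
    (datetime(2024, 1, 1, 8, 0, 0), "TX1_OIL_TEMP_HI"),
    (datetime(2024, 1, 1, 8, 0, 20), "TX1_OIL_TEMP_HI"),
    (datetime(2024, 1, 1, 8, 0, 40), "TX1_OIL_TEMP_HI"),
    (datetime(2024, 1, 1, 8, 0, 55), "TX1_OIL_TEMP_HI"),
    (datetime(2024, 1, 1, 8, 5, 0), "BUS2_DC_LOSS"),
    (datetime(2024, 1, 1, 9, 30, 0), "BUS2_DC_LOSS"),
    (datetime(2024, 1, 1, 10, 0, 0), "LINE7_OVERLOAD"),
]


def most_frequent(log, top_n=10):
    """Return the top_n (tag, count) pairs, most frequent first."""
    return Counter(tag for _, tag in log).most_common(top_n)


def chattering_tags(log, min_count=3, window=timedelta(minutes=1)):
    """Return tags that activated at least min_count times within `window`.

    The default of 3 activations in 1 minute is an assumed example threshold.
    """
    times_by_tag = defaultdict(list)
    for ts, tag in log:
        times_by_tag[tag].append(ts)

    chattering = set()
    for tag, times in times_by_tag.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Advance the left edge until the span fits inside the window.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= min_count:
                chattering.add(tag)
                break
    return chattering


print(most_frequent(SAMPLE_LOG, 2))  # [('TX1_OIL_TEMP_HI', 4), ('BUS2_DC_LOSS', 2)]
print(chattering_tags(SAMPLE_LOG))   # {'TX1_OIL_TEMP_HI'}
```

Ranking by count points to the devices that deserve attention first, while the sliding-window check separates an alarm that is merely frequent from one that is repeating in rapid bursts.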

PHASE 1 – Alarms Analysis
To improve an existing alarm system, an analysis is required to identify the main areas where effort should be focused.

1. Alarm and Event Historizing – captures all alarm and event information from SCADA. Analysis can be done to identify frequent alarms, chattering alarms, and operator response to an alarm.
2. Alarm Priorities (80/15/5) – To maximize operator effectiveness, EEMUA (Appendix 1) recommends only three alarm priority levels. Define prioritization rules and apply them consistently to each alarm. Base the priority on the potential consequences if the operator fails to respond. Aim for 5% high priority, 15% medium priority and 80% low priority.
3. Alarm Messages – Make sure the alarm messages are clear and meaningful, and not just the Tag ID. An engineer may configure a device using the Tag ID, but the planner needs to know whether the point is a pressure, a temperature or a level on a device. The message should be such that the operator understands it without any ambiguity.
4. Alarm limits – Check that all limits are correctly set as per the standard.
5. Alarm dead bands – Check that all dead bands are correctly set as per the standard.
6. Reduce the percentage of disabled, inhibited and standing alarms.
7. Check for alarms that have a priority set but do not originate from a remote.
8. Have a defined response to each alarm.
9. The engineer must play an active role in new projects. Where relevant, group alarms together to raise a single alarm, e.g. loss of DC to all 86 L/O (lockout) relays.
10. All solar power RTUs must have their voltages monitored and alarmed.
11. Reduce the total number of alarms by using RTU software to parallel alarms of a similar type (e.g. loss of DC) in the RTU itself and send a single alarm to the Control Center.
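The 80/15/5 priority split in item 2 is easy to check against the configured alarm database. The following is a minimal sketch, assuming the priorities can be exported as a simple list of labels; the label names and the sample database of 200 alarms are hypothetical.

```python
from collections import Counter

# Target split from item 2: 80% low, 15% medium, 5% high.
TARGET_PERCENT = {"low": 80.0, "medium": 15.0, "high": 5.0}


def priority_distribution(priorities):
    """Return the percentage of configured alarms at each priority level."""
    counts = Counter(priorities)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alarms configured")
    return {level: 100.0 * counts.get(level, 0) / total for level in TARGET_PERCENT}


def deviation_from_target(priorities):
    """Return actual minus target, in percentage points, per priority level."""
    actual = priority_distribution(priorities)
    return {level: actual[level] - TARGET_PERCENT[level] for level in TARGET_PERCENT}


# Hypothetical database of 200 configured alarms, heavily over-prioritized.
configured = ["low"] * 120 + ["medium"] * 50 + ["high"] * 30

print(priority_distribution(configured))  # {'low': 60.0, 'medium': 25.0, 'high': 15.0}
print(deviation_from_target(configured))  # {'low': -20.0, 'medium': 10.0, 'high': 10.0}
```

A positive deviation in the high band, as in this made-up example, suggests that too many alarms have been marked critical and that the prioritization rules need to be reapplied.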

PHASE 2 – Alarm Rationalization
Rationalization is the process of determining the reason for an alarm and its correct settings. One must ask: is this alarm required, and if so, how should it be implemented?

1. Review Alarm Configuration
2. Resources Required
a. Operator
b. Operation engineer
c. Software engineer
3. Clean Up database
4. Alarm Rationalization Process
a. What is the purpose of the alarm
b. Action required by operator
c. Consequences of failure to respond
d. Can the operator prevent or control the situation
e. Time required to respond
5. Operator Training on the Alarm System
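The questions in step 4 can be captured as one record per alarm, so that the review leaves a documented answer for every tag. The sketch below is one possible layout; the field names, the sample tags and the rule used in is_justified are illustrative assumptions, not part of any guideline.

```python
from dataclasses import dataclass


@dataclass
class RationalizationRecord:
    """One row of a rationalization worksheet (field names are illustrative)."""
    tag: str
    purpose: str                # 4a. What is the purpose of the alarm
    operator_action: str        # 4b. Action required by operator
    consequence: str            # 4c. Consequences of failure to respond
    operator_can_control: bool  # 4d. Can the operator prevent or control it
    minutes_to_respond: float   # 4e. Time required to respond


def is_justified(record):
    """Assumed rule: keep an alarm only if it has a defined operator action
    and the operator can actually prevent or control the situation."""
    return bool(record.operator_action.strip()) and record.operator_can_control


records = [
    RationalizationRecord(
        "TX1_OIL_TEMP_HI", "Warn of transformer overheating",
        "Reduce loading on transformer 1", "Transformer trip or damage", True, 15,
    ),
    RationalizationRecord(
        "RTU5_DOOR_OPEN", "Indicate cabinet door state",
        "", "None identified", False, 0,
    ),
]

candidates_for_removal = [r.tag for r in records if not is_justified(r)]
print(candidates_for_removal)  # ['RTU5_DOOR_OPEN']
```

A point that fails the check is not necessarily deleted. It may simply be demoted to an event that is logged but not annunciated to the operator.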

PHASE 3 – Performance of Alarm System

1. Average number of alarms per hour
2. Maximum number of alarms per hour
3. Number of hours in which the alarm rate is outside the acceptable target
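These three indicators can be computed directly from alarm timestamps. The sketch below assumes the timestamps are available as Python datetime values and uses a target of 6 alarms per hour, which is simply the EEMUA average of 1 alarm per 10 minutes from Appendix 1 scaled to an hour; a site may well choose a different target.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_metrics(timestamps, period_start, period_hours, target_per_hour):
    """Compute the three Phase 3 indicators over a reporting period.

    Hours with no alarms count as zero, so they pull the average down.
    """
    counts = Counter()
    for ts in timestamps:
        hour_index = (ts - period_start) // timedelta(hours=1)
        if 0 <= hour_index < period_hours:
            counts[hour_index] += 1

    per_hour = [counts.get(h, 0) for h in range(period_hours)]
    return {
        "average_per_hour": sum(per_hour) / period_hours,
        "max_per_hour": max(per_hour),
        "hours_over_target": sum(1 for n in per_hour if n > target_per_hour),
    }


# Hypothetical four-hour period: alarms at these minute offsets from the start.
start = datetime(2024, 1, 1)
offsets = (5, 50, 61, 65, 70, 75, 80, 90, 100, 119, 185, 230)
sample = [start + timedelta(minutes=m) for m in offsets]

metrics = hourly_metrics(sample, start, period_hours=4, target_per_hour=6)
print(metrics)  # {'average_per_hour': 3.0, 'max_per_hour': 8, 'hours_over_target': 1}
```

In this made-up sample the average looks comfortable, yet one hour is over target, which is why the maximum and the count of out-of-target hours are reported alongside the average.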

PHASE 4 – Advanced Solutions

1. Sorting & Filtering – The operator should be able to sort and filter alarms in a number of ways.
2. Alarm Grouping – Combining alarms into urgent and non-urgent categories.
3. Alarm Shelving – Unnecessary alarms can be temporarily moved to a shelf. Chattering alarms can be shelved for a time.
4. Alarm Eclipsing – Alarm messages that are repeatedly activated by the same tag (alarms from a single device) can be collapsed and displayed as a single line.
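Shelving and eclipsing can both be pictured as simple transformations of the alarm list before it is displayed. The following is a minimal sketch under assumed data shapes: each alarm is a (timestamp, tag, message) tuple, and the shelf is a dictionary mapping a tag to the time at which its shelving expires. Real SCADA products implement these features in their own ways.

```python
from datetime import datetime, timedelta


def eclipse(alarms):
    """Collapse repeated activations of the same tag into one display line.

    Each output row keeps the latest timestamp and message plus a repeat count.
    """
    rows = {}
    for ts, tag, message in sorted(alarms):
        count = rows[tag]["count"] + 1 if tag in rows else 1
        rows[tag] = {"tag": tag, "last_seen": ts, "message": message, "count": count}
    # Most recently active tags first.
    return sorted(rows.values(), key=lambda r: r["last_seen"], reverse=True)


def visible(alarms, shelved_until, now):
    """Drop alarms whose tag is shelved and whose shelf time has not expired."""
    return [a for a in alarms if shelved_until.get(a[1], now) <= now]


# Hypothetical alarm list.
t0 = datetime(2024, 1, 1, 8, 0)
alarms = [
    (t0, "TX1_OIL_TEMP_HI", "Transformer 1 oil temperature high"),
    (t0 + timedelta(minutes=1), "BUS2_DC_LOSS", "Bus 2 loss of DC supply"),
    (t0 + timedelta(minutes=2), "TX1_OIL_TEMP_HI", "Transformer 1 oil temperature high"),
    (t0 + timedelta(minutes=3), "TX1_OIL_TEMP_HI", "Transformer 1 oil temperature high"),
]

for row in eclipse(alarms):
    print(row["tag"], row["count"])
# TX1_OIL_TEMP_HI 3
# BUS2_DC_LOSS 1

shelf = {"TX1_OIL_TEMP_HI": t0 + timedelta(minutes=30)}
print(len(visible(alarms, shelf, now=t0 + timedelta(minutes=5))))  # 1
```

Giving every shelved tag an expiry time means a shelved alarm returns to the display on its own, rather than staying hidden until someone remembers to restore it.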

Conclusion
Proper analysis of alarms and events in a SCADA system can reveal significant areas where improvement may be necessary. Phase 1 can be implemented immediately in present SCADA systems. Minimizing the number of alarms reduces operator fatigue and stress, and allows the operator to focus on the critical alarms and take corrective action. Alarms must also be compared against best practices and industry guidelines from EEMUA and ISA. A good alarm system is important in any operation. Poorly functioning alarm systems have been identified again and again as contributing factors in industrial accidents. Improving an alarm system can avoid costly plant upsets, bringing significant economic benefits by reducing and avoiding unplanned outages.

References:
1. EEMUA 191: Alarm Systems. A Guide to Design, Management and Procurement, 1999.
2. Norwegian Petroleum Directorate YA-711: Principles for alarm system design, 2001.
3. ISA RP18.2: Management of Alarm Systems for the Process Industries.
4. Namur NA102: Alarm Management, 2005.

Appendix 1 – EEMUA 191
Following a number of high-profile disasters such as Texaco Milford Haven (1994) and the Channel Tunnel (1996), significant interest developed in Alarm Management Systems. The inadequacy of the Alarm Management System was cited as a contributory factor in these two disasters. Time and time again, alarm floods and poor alarm prioritization are the culprits. At Texaco Milford Haven, operators received 275 alarms in the 11 minutes before the disaster.
An Alarm Management guideline (EEMUA 191) was published by the Engineering Equipment and Materials Users’ Association in 1999 and is now considered to be the definitive Alarm Management reference document.
The Engineering Equipment and Materials Users’ Association (EEMUA) is a European-based industry association run for the benefit of companies that own or operate industrial facilities. EEMUA aims to improve the safety, environmental and operating performance of industrial facilities in the most cost-effective way. EEMUA members pursue these aims by sharing engineering experience and expertise, and by promoting their distinct interests as the users of engineering products.

Alarm recommendations by EEMUA

Average alarms per day: 144
Average standing alarms: 9
Peak alarms per 10 minutes: 10
Average alarms per 10 minutes: 1
Priority distribution (low/medium/high): 80% / 15% / 5%
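The daily and 10-minute averages listed above are the same rate expressed two ways: a day contains 24 × 60 / 10 = 144 ten-minute intervals, so 144 alarms per day is 1 alarm per 10 minutes. The sketch below encodes the listed figures and compares a set of measured values against them. The dictionary keys and the measured values are hypothetical.

```python
# Benchmark figures as listed above.
BENCHMARK = {
    "avg_alarms_per_day": 144,
    "avg_standing_alarms": 9,
    "peak_alarms_per_10_min": 10,
    "avg_alarms_per_10_min": 1,
}

TEN_MINUTE_INTERVALS_PER_DAY = 24 * 60 // 10  # 144


def per_10_min(alarms_per_day):
    """Convert a daily alarm count into an average per 10-minute interval."""
    return alarms_per_day / TEN_MINUTE_INTERVALS_PER_DAY


def exceeded(measured):
    """Return the names of the metrics where the measured value is above the benchmark."""
    return sorted(
        name for name, limit in BENCHMARK.items() if measured.get(name, 0) > limit
    )


print(per_10_min(144))  # 1.0

# Hypothetical measurements from a control center.
measured = {
    "avg_alarms_per_day": 720,
    "avg_standing_alarms": 4,
    "peak_alarms_per_10_min": 35,
    "avg_alarms_per_10_min": per_10_min(720),
}
print(exceeded(measured))
# ['avg_alarms_per_10_min', 'avg_alarms_per_day', 'peak_alarms_per_10_min']
```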

Power System Control Center

A Power Control Center is the nerve center of a power system. It monitors frequency, demand and loadings on the transmission lines. It also monitors the protective devices that protect power system components such as transmission lines, transformers and generators. SCADA (Supervisory Control and Data Acquisition) and EMS (Energy Management System) systems are used in the Control Center to manage the flow of power, meet the daily demand, protect the power system components and monitor any alarms being generated.

One of the key measurements to monitor is the system frequency. This reflects the speed at which the generators are spinning (50 cycles per second). To keep the frequency at 50 cps, generation supply must always equal customer power demand; holding the frequency at 50 cps at all times is how this balance is maintained. The operator responsible for generation must know the daily demand profile. Demand is generally low at night, and during the day there may be two peaks, one in the morning and one in the evening. As demand increases, the operator can bring new generators onto the system or require online generators to produce more if they have spare capacity. The operator sometimes has direct control over some generators; for others, the operator calls the power plants. Generators can automatically correct for minor changes in customer demand. The power company is required to meet the daily power demand while maintaining the power system frequency at 50 cps (also known as 50 hertz or 50 Hz).

Real-time data is collected from the power plants through remote devices connected to the control center via communication lines. This data includes power flows, alarms, frequency and voltage levels. Monitoring alarms is the key to maintaining continuity of power to customers. When alarms are received from transformers, generators or transmission lines, corrective action is required to prevent a power blackout. The control room operator must take corrective action within a given time to arrest a deteriorating condition, such as a fault on a component or excessive loading on a transmission line. The operators therefore perform a key function in the entire power system, ensuring that customers continually receive electricity. They must have the correct tools to do their jobs, e.g. 99.99% availability of the EMS, SCADA and Alarm Management Application, and they must be burdened with only a limited number of alarms in a given time. They must never be flooded with alarms in an emergency situation. If this happens, they are bound to lose ‘operational situational awareness’ and take the wrong action, which can result in a major disaster on the power system.

Saudi Aramco Control Center