Tuning The Forgotten Loop
We can tune PID controllers, but what about tuning the operator?
The purpose of tuning loops is to reduce errors and thus provide more efficient operation that returns quickly to steady-state efficiency after upsets, errors, or changes in load. State-of-the-art manufacturers in process and discrete industries have invested in advanced control software, manufacturing execution software, and modeling software to “tune” everything from control loops to supply chains, thus driving higher quality and productivity.
The “forgotten loop” has been the operator, who is typically trained to “average” parameters to run adequately under most steady-state conditions. “Advanced tuning” of the operator could yield even better outputs, with higher quality, fewer errors, and a wider response to fluctuating operating conditions. This article explores the issue of improving operator actions, and a method for doing so.
Over the past decade we’ve spent, as an industry, billions of dollars and millions of man-hours automating our factories and plants. The solutions have included adding sensors, networks, and software that can measure, analyze, and either act or recommend action to help production get to “Six Sigma” efficiency. However, few, if any, plants are totally automated. Despite a continuing effort to remove personnel costs and drive repeatability through automation, all plants and factories have human operators. These important human assets are responsible for monitoring the control systems, either to act on system recommendations, or override automated actions if circumstances warrant.
Most of the time, operators let the system do what it was designed and programmed to do. Sometimes, operators make errors of commission, with causes ranging from misinterpretation of data to poor training, or errors of omission, attributed to inattention or a slow response. What the operator does during a potential panic situation often depends on how well he or she has been trained, or “tuned.”
The focus on reducing human error isn’t trivial: multiple studies by the Department of Energy (DOE) and Electric Power Research Institute (EPRI) have identified the probability of errors in power plants and nuclear facilities. System availability, quality of output, and operator safety will become even more important as industry restructuring (such as the Smart Grid Initiative) takes hold. We can expect similar issues across the broad spectrum of other manufacturing and process industries.
We know from life experience (and management training) that “a chain is only as strong as its weakest link.” The irony in our factory automation strategy is that while we’ve invested heavily to improve our data sensing and automation systems, we haven’t made similar investments to strengthen and improve the human element (the operators) in our systems.
There are instances where time is wasted just figuring out “what happened” before remedial action is taken. Often, the plant is down while troubleshooting occurs, or the problem can’t be determined until the engineers are allowed in to do the troubleshooting. In the meantime, the plant must operate at reduced output or quality.
While we can view the conditions that occurred during an incident with process historians, we don’t really know what the operator response was. Operator console monitoring software, which records what the operator saw on the HMI and what his or her response was, makes it possible to reconstruct exactly what happened for diagnostics, operator training, or process improvement.
For example, years ago I was doing a plant start-up involving a new distributed control system at a cement plant. We received a call saying some “ductwork had collapsed,” and the control system was suspected as the culprit. (The “ductwork” was a 15-foot diameter, 30-foot long steel duct connecting the preheat tower with the kiln.) After arriving at the plant, a colleague and I spent a full day poring over alarm printouts and talking with operators, even decoding register stacks in the software, looking for the problem. Ultimately, we discovered that the induced draft fan damper’s hydraulics were connected backward: when the operator commanded “open,” the damper closed, and the duct collapsed like a plugged drinking straw. If we’d had a video recording of the operator’s console display, we could have seen the command, the damper position indicator, and the vacuum measurement. It would have saved man-hours of engineering time, and would have had the plant up and running many hours sooner!
A console recorder can also clear an operator of possible errors. In many cases, it can show that the operator did exactly what was necessary given the information available at the time. If regulations require an operator to perform certain functions at certain times or after certain activities, the console recorder can document that the procedures were done properly. It would give operators confidence to know that they are being “backed up” by the console recorder.
The Approach And The Obstacles
The simplest way to provide operators with a means for improving their performance is to give them visual feedback of their activities. Professional sports players, coaches, and teams watch “game films” and “scouting films” on a regular basis. Films of their own game performance provide an efficient means of reviewing game conditions, plays, actions, errors, and “what we want to do differently next time.” Scouting films help teams develop strategies for winning, and feed information into practice sessions so that “muscle memory” can be called upon readily and confidently. (This same technique is used for initial and recurrent training of airline pilots, as well.)
The ideal solution is a system that can coordinate what the operator actually saw, video from the plant area where a situation occurred, and data from a historian. Such a system can recreate the exact conditions that occurred during an incident.
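To make the idea of coordinating the three sources concrete, here is a minimal sketch in Python that merges time-stamped entries from a console recording, a historian, and a plant camera into one chronological timeline. Everything in it is an assumption for illustration: the `TimelineEntry` type, the `merge_timelines` function, and the sample entries (loosely modeled on the damper incident described earlier) are invented, and no particular vendor's data format is implied. It also assumes all three sources share a synchronized clock.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimelineEntry:
    """One time-stamped observation from any source (hypothetical type)."""
    timestamp: datetime
    source: str   # e.g. "console", "historian", "plant video"
    detail: str


def merge_timelines(*streams):
    """Merge streams that are each already sorted by time into one list."""
    return list(heapq.merge(*streams, key=lambda entry: entry.timestamp))


def at(text):
    """Small helper to build a datetime from an ISO string."""
    return datetime.fromisoformat(text)


# Invented sample data, for illustration only.
console = [
    TimelineEntry(at("2024-01-01T08:00:05"), "console", "operator commands damper OPEN"),
]
historian = [
    TimelineEntry(at("2024-01-01T08:00:00"), "historian", "duct vacuum normal"),
    TimelineEntry(at("2024-01-01T08:00:20"), "historian", "duct vacuum rising"),
]
camera = [
    TimelineEntry(at("2024-01-01T08:00:12"), "plant video", "damper seen closing"),
]

timeline = merge_timelines(console, historian, camera)
for entry in timeline:
    print(entry.timestamp.time(), entry.source, "-", entry.detail)
```

Reading the merged list top to bottom puts the operator's command, the physical response, and the measured effect side by side, which is the kind of reconstruction that otherwise takes a day of poring over printouts.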
For example, “alarm floods” are an ongoing problem. Studies show that 90 percent of alarms are due to incorrect system configuration and poor alarm strategies. Alarm floods usually occur at the worst possible time for a control room operator, such as start-up, shutdown, and trips. Eliminating alarm floods requires a complete analysis of alarm priorities. This would be much easier if the analysts could see what the operator was presented with during an alarm flood.
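Before analysts can review what the operator saw during a flood, they need to know when each flood began. The sketch below, in Python, scans a list of alarm timestamps with a sliding window and reports the moment each flood starts. The threshold of more than 10 alarms in 10 minutes is an assumption based on a commonly cited rule of thumb; both values are parameters and should be set to the plant's own alarm philosophy. The function name `find_flood_starts` is invented for this example.

```python
from collections import deque
from datetime import datetime, timedelta


def find_flood_starts(alarm_times, threshold=10, window=timedelta(minutes=10)):
    """Return the timestamps at which the number of alarms inside the
    trailing window first exceeds the threshold (assumed rule of thumb)."""
    recent = deque()      # alarms that fall inside the trailing window
    in_flood = False
    starts = []
    for ts in sorted(alarm_times):
        recent.append(ts)
        # Drop alarms that have aged out of the window.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > threshold:
            if not in_flood:
                starts.append(ts)
                in_flood = True
        else:
            in_flood = False
    return starts


# Invented sample data: 12 alarms ten seconds apart, then one stray alarm.
base = datetime(2024, 1, 1, 8, 0, 0)
alarms = [base + timedelta(seconds=10 * i) for i in range(12)]
alarms.append(base + timedelta(hours=2))

flood_starts = find_flood_starts(alarms)
print(flood_starts)
```

Each reported start time is a natural place to begin console playback, so the analysts can see which alarms were on screen and how the display looked as the flood developed.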
A 21st Century Solution
With the advent of inexpensive computers, digital video recording, and specialized software, there are new ways to improve operator training to reduce errors and improve uptime and quality. The Longwatch Console Recorder software is a small module that is loaded into the computer running the HMI or DCS/SCADA console display software. This “software camera” takes the image generated by all programs using the computer’s display (including HMI software) and presents it to the Longwatch Video Engine, where it is recorded as a digital video stream. The result is a video file identical to that which would be recorded by a real camera but, in this case, contains the image of the HMI display, as well as the mouse movements as the operator moves and clicks around on the screen.
Because the software is recording exactly what is being shown on the operator’s display, there is no ambiguity about what the operator is seeing. Thus, ex-post-facto analysis of the display might offer insight as to whether the sensed data (measurements and statuses) were displayed properly, and whether the operator took appropriate action. The Console Recorder’s video will also show if the operator was looking at the appropriate display or was otherwise distracted.
The Video Historian expands the analysis capability further. The unique “message mapping” capability enables the user to see exactly what was being displayed on a variety of recorded consoles when particular events occurred. Many plants use multiple HMI displays—typically the “three CRT” configuration pioneered by DCSes years ago. Multiple HMIs can display events and data ranging from plant alarms to messages generated by other applications (such as workflow tracking or quality/lab testing subsystems). Being able to see what was on all the displays at the time of an event can be very useful for reinforcing good manufacturing and safety practices, and for training the operators which screens to watch under various conditions.
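The lookup from an event to the matching point in several console recordings can be pictured with a short Python sketch. This is not the Longwatch implementation or API, which the article does not describe; the `Recording` type, the `playback_offsets` function, the 30-second lead-in, and the console names are all hypothetical. Given an event time, the sketch finds every recording that covers it and computes where playback should begin.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Recording:
    """A console recording described by its start and length (hypothetical)."""
    console: str
    start: datetime
    duration: timedelta


def playback_offsets(event_time, recordings, lead_in=timedelta(seconds=30)):
    """For each recording that covers event_time, return how far into the
    recording playback should start: lead_in before the event, never
    earlier than the beginning of the recording."""
    offsets = {}
    for rec in recordings:
        if rec.start <= event_time <= rec.start + rec.duration:
            offset = event_time - rec.start - lead_in
            offsets[rec.console] = max(offset, timedelta(0))
    return offsets


# Invented sample data: three consoles, one of which started recording later.
hour = timedelta(hours=1)
recordings = [
    Recording("console A", datetime(2024, 1, 1, 8, 0, 0), hour),
    Recording("console B", datetime(2024, 1, 1, 8, 29, 50), hour),
    Recording("console C", datetime(2024, 1, 1, 9, 0, 0), hour),
]
event = datetime(2024, 1, 1, 8, 30, 0)

offsets = playback_offsets(event, recordings)
for console, offset in offsets.items():
    print(console, "start playback at", offset)
```

Starting a little before the event, rather than exactly at it, lets the reviewer see what each screen showed in the moments leading up to the alarm or message.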
Being able to play back what the operator was actually seeing – along with a video recording of the operator’s actions and even combined with actual video from the plant – can be valuable for other purposes. Consider the classic experience of an engineer getting the phone call in the middle of the night saying that something isn’t working right. After arriving in the control room, the first series of questions is, “What happened, what did you do, and did you change anything?” The answers to the last two questions are often “nothing” and “no.” A video playback of the display might corroborate the answers, or it might shed light on things forgotten.
In our global economy, competition is broad, consumer demands are high, and margins (for error and for profit) are razor-thin. As part of cost reduction and increased automation, fewer humans are involved with operating larger and larger plants. Thus, those few operators who are left have huge responsibilities, and their actions can have a major effect on plant output and, ultimately, business performance. Just as the new global economy has demanded new ways of operating business, it demands new ways of making sure operations deliver efficiency, quality, flexibility, and availability. A simple, yet effective, technology is available today: video feedback that “closes the forgotten loop” of human activity in our automation system. The video recording and playback of the operator’s display and actions can help manufacturers become more competitive, while providing a means to support operations quality and repeatability through analysis and training.