Over the past few years, we have seen an increase in the use of redundancy within industrial control systems. In this blog, I would like to share my insights on true redundancy and identify the major considerations, advantages, and possible alternatives.
Redundant (Merriam-Webster 2023): serving as a duplicate for preventing failure of an entire system (such as a spacecraft) upon failure of a single component.
Redundancy has long been part of electrical, mechanical, hardware/virtualized computing, and network designs. Examples include electrical power feeds from different utility sources, multiple pumps connected to the same or multiple source(s), mirrored servers, or dual fiber runs to a network switch. This allows the system to maintain operation when unexpected failures occur.
When determining the requirements for industrial control redundancy, start with the obvious and work your way from the top down. I like to start with a site or building infrastructure review to identify single points of failure and how they would affect the control system within the scope of a project.
To determine the redundancy requirement for a system or process, several factors should be considered.
Assess the impact of system failure on overall operations, safety, or financial consequences. The higher the criticality, the greater the need for redundancy.
Evaluate the reliability and failure rates of individual components within the system. If certain components are known to have a higher probability of failure, redundancy may be necessary to mitigate this risk.
Determine the acceptable downtime for the system. If rapid restoration or continuous operation is required, redundancy becomes more important to ensure minimal interruptions.
Evaluate the financial implications of system downtime versus the cost of implementing redundancy measures. Redundancy should be balanced with the potential losses incurred during system failures.
Consider the system's scalability and potential growth. If there are plans to expand or increase system capacity in the future, incorporating redundancy early on can help accommodate future requirements.
Assess the environmental conditions in which the system operates. Harsh or unstable environments may increase the likelihood of component failure, necessitating redundancy for increased robustness.
Determine if there are any specific regulatory or compliance standards that dictate redundancy requirements for the system. Ensure compliance with relevant guidelines or industry-specific regulations.
Analyze historical data on system failures or incidents to identify patterns or trends. This information can help inform the level of redundancy required to mitigate similar risks in the future.
Remember that redundancy should be carefully designed and implemented based on a comprehensive evaluation of these factors to strike the right balance between reliability, cost, and system performance.
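The reliability and cost factors above can be put into rough numbers. The following is a minimal sketch, not a sizing method: it assumes identical units that fail independently of each other, and every name and figure in it (`parallel_availability`, `cost_per_hour`, `annual_redundancy_cost`) is hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8760


def parallel_availability(single_availability: float, units: int) -> float:
    """Availability of `units` identical units in parallel.

    Assumes failures are independent, which does NOT hold when the
    units share a power feed, raceway, or other common dependency.
    """
    return 1.0 - (1.0 - single_availability) ** units


def expected_annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly loss from downtime at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR * cost_per_hour


def redundancy_pays_off(single_availability: float,
                        cost_per_hour: float,
                        annual_redundancy_cost: float) -> bool:
    """True when the downtime cost avoided by a second unit exceeds its yearly cost."""
    single = expected_annual_downtime_cost(single_availability, cost_per_hour)
    dual = expected_annual_downtime_cost(
        parallel_availability(single_availability, 2), cost_per_hour
    )
    return (single - dual) > annual_redundancy_cost
```

For example, `redundancy_pays_off(0.99, 1000.0, 50000.0)` compares the loss avoided by adding a second 99%-available unit against a hypothetical 50,000 per year cost of owning it. The independence assumption is the weak point: redundant units that share a common dependency will not reach the calculated availability.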
Industrial control systems have redundant hardware and software solutions. Each manufacturer offers design guidelines that need to be followed to correctly implement redundancy.
PLC redundancy will have two of everything within the main controller. This includes chassis, power supplies, controllers, sync modules, I/O modules, and network connections.
Software redundancy will have a Primary and Secondary server for each process such as visualization, alarms, and IO data.
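To illustrate the Primary/Secondary idea, here is a simplified sketch of heartbeat-based role selection. Commercial platforms handle failover internally and according to the manufacturer's design guidelines, so this is not how any particular product works. The `Server` class, the `select_active` function, and the 5-second timeout are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def select_active(primary: Server, secondary: Server,
                  now: float, timeout: float = 5.0) -> Optional[str]:
    """Return the name of the server that should be active.

    The primary is preferred whenever its heartbeat is fresh. The
    secondary takes over only when the primary has gone quiet.
    Returns None when neither server has reported within `timeout`.
    """
    for server in (primary, secondary):
        if now - server.last_heartbeat <= timeout:
            return server.name
    return None
```

With `now=100.0`, a primary last heard at 99.0 stays active, while a primary last heard at 90.0 hands over to a secondary last heard at 99.0.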
In some applications, after a thorough system review, it may be determined that full redundancy is not required.
Here are a few alternatives to consider for control system redundancy:
Implementing fail-safe mechanisms can help prevent system failures and minimize their impact. Fail-safe mechanisms include safety checks, emergency shutdown procedures, and protective measures that are activated in case of a failure. For example, a duplicate communication module that handles Modbus TCP/IP communications to critical devices can be activated automatically or manually.
Employing advanced fault detection and diagnosis techniques can help identify potential failures or deviations from normal system behavior. By continuously monitoring system parameters and comparing them to expected values, faults can be detected early, allowing for prompt corrective action.
Adding redundant sensors can provide additional measurements for critical system variables. If one sensor fails or provides inaccurate readings, the redundant sensor can serve as a backup, ensuring that reliable information is still available for control purposes.
Utilizing diverse control algorithms can enhance system resilience. By employing multiple control algorithms with different approaches and assumptions, the system can switch to an alternative algorithm in case the primary one fails or behaves abnormally.
Implementing robust control strategies can help compensate for uncertainties and disturbances in the system. Robust control techniques account for variations and uncertainties in the system parameters, ensuring stable performance even in the presence of disturbances or component failures.
Regular system monitoring and preventive maintenance can help identify potential issues before they lead to failures. By implementing a comprehensive monitoring and maintenance program, system reliability can be improved, reducing the need for redundancy in the first place.
Remember that the specific choice of alternatives depends on the requirements and constraints of the control system.
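As an illustration of the redundant sensor and fault detection alternatives, the sketch below takes the median of several readings of the same variable and flags any sensor that is missing or deviates too far from it. The function name `vote`, the use of `None` for a failed sensor, and the tolerance value are assumptions for the example, not a prescribed method.

```python
from statistics import median
from typing import List, Optional, Tuple


def vote(readings: List[Optional[float]],
         tolerance: float) -> Tuple[float, List[int]]:
    """Pick a value from redundant sensors and list the suspect ones.

    `readings` holds one value per sensor, or None for a sensor that
    is not reporting. Returns the median of the available readings
    and the indexes of sensors that are missing or differ from the
    median by more than `tolerance`.
    """
    available = [r for r in readings if r is not None]
    if not available:
        raise ValueError("no sensor is reporting")
    selected = median(available)
    suspects = [
        i for i, r in enumerate(readings)
        if r is None or abs(r - selected) > tolerance
    ]
    return selected, suspects
```

Calling `vote([50.1, 49.9, 75.0], tolerance=1.0)` selects 50.1 and flags the third sensor. With only two readings available, the median is their average, so a large disagreement can be detected but not attributed to one sensor. That is one reason three sensors are often preferred for critical variables.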
I have deployed many of the systems described here. Many of the designs I have seen, especially during the construction phase, have fallen short of the intended function of redundancy. It is important to carry redundancy design beyond the control system purview and take a holistic approach that includes all project disciplines. Engineers and trades need more education and awareness of the correct methods for deploying true control system redundancy.
Here is a short list of shortcomings to be mindful of:
Redundant power feeds and network connections originating from the same distribution panel or being run within the same raceway.
Control panel ventilation within a high temperature area did not have a backup cooling fan. During failure of one of the dual power feeds, the exhaust fan turned off and the panels overheated.
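The shared-panel and shared-raceway problem can be caught during design review by listing everything each redundant path depends on and looking for overlap. The sketch below is a minimal example of that check. The function name and the panel and raceway tags are hypothetical.

```python
from typing import Iterable, List


def shared_dependencies(path_a: Iterable[str], path_b: Iterable[str]) -> List[str]:
    """Return the items that two supposedly redundant paths have in common.

    Any item returned is a single point of failure for both paths.
    """
    return sorted(set(path_a) & set(path_b))


# Hypothetical tags for two "redundant" power feeds to one control panel.
feed_a = ["Utility A", "Panel DP-1", "Raceway R-12"]
feed_b = ["Utility B", "Panel DP-1", "Raceway R-14"]

common = shared_dependencies(feed_a, feed_b)
```

Here `common` contains `"Panel DP-1"`, showing that both feeds would be lost if that panel fails. The same listing can include loads such as cooling fans, which would expose the overheating case described above.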
About the author
Sam Lacasse is a Senior Process Controls Engineer for Hallam-ICS with 22 years of experience. He graduated from New England Institute of Technology in 1993 with an A.S. degree. He has extensive experience in Toxic Gas Monitoring, Food & Beverage, Robotics/Vision/Motion control, and large-scale Water/Wastewater applications.
About Hallam-ICS
Hallam-ICS is an engineering and automation company that designs MEP systems for facilities and plants, engineers control and automation solutions, and ensures safety and regulatory compliance through arc flash studies, commissioning, and validation. Our offices are located in Massachusetts, Connecticut, New York, Vermont, North Carolina, Texas, and Florida, and our projects take us worldwide.