How Can My Plant Improve Equipment Reliability?

Answered August 27 2020

In a world where some plant equipment is starting to approach middle age or end of life, the science of reliability has stepped up and played a huge role in the drive to create safer and better maintained industrial environments. In its most effective capacity, reliability-centered maintenance incorporates equipment design, equipment history, and the recognized and generally accepted good engineering practices (RAGAGEP) of policies and codes enforced by regulatory bodies and other knowledgeable communities. 

More specifically, these practices are captured in codes and standards created by organizations such as API and ASME, which are discussed below.

These regulations encourage facilities to use preventive and predictive technologies where appropriate to reduce unplanned failures, while also incorporating good asset management techniques. Identifying, minimizing, and where possible eliminating the root causes of failure is essential to establishing more reliable equipment.

These codes and policies also cover more sophisticated approaches to reliability and overall equipment care. Applied well, they can dramatically improve on a broken run-to-failure approach, making reliability achievable where it was not a clear option before.

History Influencing Optimization

In the 1960s, the Federal Aviation Administration conducted a series of engineering studies on airplanes. The studies proved that not every part of a piece of equipment needed to be replaced or overhauled in order to prevent failures. Some failures cannot be prevented, but they can be planned for, and this is the most effective way of managing their maintenance. This concept revolutionized the way assets across several industries were being managed, and thus, condition-based monitoring was born. 

Today, certain equipment is still run to failure, but only when it does not make economic or engineering sense to identify failure mode symptoms early and overhaul is not an effective solution. A good example is the bolts on a conveyor. The conveyor itself can have a predictive vibration monitoring route applied to it to identify troublesome vibrations and prompt minor adjustments, such as tightening bearings or performing realignment. However, if the vibration is caused by a failed bolt, the solution is simply to replace the bolt. No accelerated replacement frequency or additional greasing technique will prevent bolts from failing.
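The screening logic in the conveyor example can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a formal reliability-centered maintenance procedure: the `Component` fields, the `select_strategy` function, and the example values are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical screening record for one piece of equipment."""
    name: str
    symptoms_detectable_early: bool  # e.g., vibration that shows up before failure
    monitoring_is_economic: bool     # monitoring costs less than the failures it avoids
    overhaul_is_effective: bool      # scheduled overhaul/replacement actually reduces failures


def select_strategy(c: Component) -> str:
    """Pick a maintenance strategy using the reasoning described in the text."""
    # Early symptoms that are worth watching call for a predictive route.
    if c.symptoms_detectable_early and c.monitoring_is_economic:
        return "predictive"
    # If scheduled overhaul genuinely helps, use time-based preventive work.
    if c.overhaul_is_effective:
        return "preventive"
    # Otherwise neither approach pays off: replace on failure.
    return "run-to-failure"


conveyor = Component("conveyor drive", True, True, False)
bolt = Component("conveyor bolt", False, False, False)

for item in (conveyor, bolt):
    print(f"{item.name}: {select_strategy(item)}")
```

In practice each of the three yes/no inputs would come from failure history, cost data, and engineering judgment rather than a hard-coded flag.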

The effort put into evaluating what equipment needs preventive and predictive technologies and routines, versus what can run to failure, will be rewarded with optimized runtimes, plannable outages, and reductions in surprise failures and upsets. The best methods to obtain such wonderful rewards are:

  • Smart use of asset management, such as knowing the manufacturing and design details of the equipment
  • Predictive technologies (e.g., lubrication and vibration routes)
  • Following the applicable RAGAGEP codes and processes to maintain good equipment health

The obstacles industry participants face in most facilities are not wholly unique despite proprietary differences. And once the mechanisms of equipment failure are understood well, those root causes can be minimized and even eliminated given proper attention.

Regulation Leading the Way

In the last thirty years, OSHA has developed and enforced many process safety management policies. Of those policies, mechanical integrity is key to helping facilities achieve best practices for both fixed and rotating equipment. These policies inform industries about what RAGAGEP codes can be implemented for the best maintenance and reliability strategy tasks over the lifespan of fixed and rotating equipment. The policies also instruct how to manage the programs that will implement these best practices in a timely manner. 

The mechanical integrity element of process safety management achieves this by leading industry to follow API codes for testing, inspection methods, and frequencies for equipment, as well as ASME codes and standards for use in equipment design. The management of these programs becomes straightforward after these codes are effectively applied. If changes are needed or wanted, the proper channels must be followed, and the proper people must approve these changes. 

Additionally, the proper evaluations must be conducted to determine whether the change in the risk of failure (from an inspection frequency change, for example) is within an acceptable range. Company management will determine the criteria applicable here, but frequency is not the only thing that can be updated to provide more effective inspection plans and lower risk. If a change in inspection technology or the identification of a damage mechanism is needed to determine the deterioration of a piece of equipment, it is best identified through the use of codes like API 571.

Benefits of Using Regulation-Specified Effective Inspection Practices 

An example of these codes in action is the examination of a pressure vessel. Without the API codes, inspection may amount to simple visual checks by an operator who looks at the equipment every day. This same person may generate work orders based on the findings from these inspections to help maintain a polished-looking vessel. For a water tank with no insulation, running at atmospheric pressure, where the impacts of failure are minimal, this may be acceptable.

However, this water tank does not reflect the level of complexity found across entire manufacturing units and facilities. To account for the nuances of various process streams, an understanding of damage mechanisms, their causes, and how rapidly they deteriorate a vessel needs to be incorporated into the analysis that dictates the optimum maintenance strategy.

For many damage mechanisms, such as corrosion, cracking, and hydrogen attack, there is an appropriate nondestructive test that can be conducted to discover failures and accelerated deterioration before bigger problems arise. Some of these damage mechanisms are simply not visible to the naked eye, so these extra measures, put into place and performed by qualified individuals, are a facility’s best chance at continuing to run and make products without incident. The effectiveness of these methods is why governing regulatory bodies like OSHA often call for codes and standards to be followed.

Properly Deviating from Prescribed Methods

OSHA specifically identifies the need to use codes like API 510 and 570 within the mechanical integrity domain. These codes guide facilities in determining the frequency of testing and inspections in a prescribed manner for vessels and piping. The codes also state that a facility can deviate from these prescriptive intervals by conducting a risk-based inspection analysis in accordance with API 580 and 581.
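To make the idea of a prescriptive interval concrete, here is a minimal sketch of the rule as it is commonly summarized for pressure vessel internal inspections: remaining life is the corrosion allowance left divided by the corrosion rate, and the interval is the lesser of half the remaining life or a fixed cap. The function names, the 10-year default cap, and the thickness values are assumptions for illustration only; the governing edition of the code should always be consulted for the actual requirements and exceptions.

```python
def remaining_life_years(t_actual_mm: float,
                         t_required_mm: float,
                         corrosion_rate_mm_per_year: float) -> float:
    """Years until measured wall thickness reaches the required minimum."""
    if corrosion_rate_mm_per_year <= 0:
        raise ValueError("corrosion rate must be positive")
    # Wall already at or below the minimum means no remaining life.
    margin_mm = max(t_actual_mm - t_required_mm, 0.0)
    return margin_mm / corrosion_rate_mm_per_year


def prescriptive_interval_years(remaining_life: float,
                                cap_years: float = 10.0) -> float:
    """Lesser of half the remaining life or the cap (assumed 10 years here)."""
    return min(remaining_life / 2.0, cap_years)


# Hypothetical vessel: 12.0 mm measured, 9.0 mm required, 0.25 mm/year loss.
life = remaining_life_years(12.0, 9.0, 0.25)
interval = prescriptive_interval_years(life)
print(f"remaining life: {life:.1f} years, next inspection within {interval:.1f} years")
```

A risk-based inspection analysis would replace this fixed rule with an interval justified by the calculated risk for the specific equipment.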

This assessment guides facilities toward a more effective inspection plan, while ensuring risk remains within an acceptable threshold until the newly defined interval has passed. It would be unsafe to put test and inspection routines into place without regard to risk and to how effectively they identify failure modes.

It is well known that risk-based inspection and reliability are complementary programs. After good design, maintenance and reliability programs are the only line of defense in ensuring that equipment holding potentially fatal materials is taken care of properly. Design cannot do enough on its own, and nothing lasts forever. So properly using required and suggested test methods for early deterioration detection is always better than putting the facility and community at risk for what a failure could bring.

The responsibility of choosing and scheduling the appropriate tasks falls on site management and engineering teams and should never be taken lightly. 

Risk-based inspection analysis is the best path to mitigate, reduce, and even eliminate the probability and consequences of failure for pressurized equipment. It also makes sure the teams involved know the level of work needed to achieve risk levels as low as reasonably practicable, as well as how to manage residual risk once programs are put into place. This analysis work is typically done following the API 581 methodology of risk-based inspection analysis, where a facility must take into consideration the effects of mitigation measures and hazards to people, property, and the environment.
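At its core, this kind of analysis treats risk as the product of the probability of failure and the consequence of failure, then compares the result with a target the company has accepted. The sketch below shows only that basic arithmetic. It is not the API 581 calculation itself, and the equipment tags, probabilities, consequence values, and risk target are all made-up numbers for illustration.

```python
def risk(pof_per_year: float, cof_dollars: float) -> float:
    """Risk as probability of failure times consequence of failure."""
    return pof_per_year * cof_dollars


def needs_mitigation(pof_per_year: float, cof_dollars: float,
                     risk_target: float) -> bool:
    """True when calculated risk exceeds the company's accepted target."""
    return risk(pof_per_year, cof_dollars) > risk_target


# Hypothetical equipment: tag -> (failures per year, consequence in dollars)
equipment = {
    "V-101": (1e-3, 5_000_000),  # unlikely failure, severe consequence
    "T-201": (1e-2, 20_000),     # more likely failure, minor consequence
}
RISK_TARGET = 1_000.0  # assumed acceptance criterion, dollars per year

# Rank from highest to lowest risk so inspection effort goes where it matters.
ranked = sorted(equipment, key=lambda tag: risk(*equipment[tag]), reverse=True)
for tag in ranked:
    pof, cof = equipment[tag]
    flag = "mitigate" if needs_mitigation(pof, cof, RISK_TARGET) else "acceptable"
    print(f"{tag}: risk = {risk(pof, cof):,.0f} per year ({flag})")
```

Note that the vessel with the lower failure probability ranks higher here because its consequence is so much larger, which is why frequency alone is not a sufficient basis for an inspection plan.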

This level of assessment and implementation is meant to cover all hazardous failures, including those affecting not only the plant operators but also the land, water, and air in the surrounding community through acute effects. The risks of chemicals being released in any amount need to be quantified and analyzed properly to determine how far-reaching these impacts can be and how long cleanup will take.


Over the last forty years, the best testing and inspection practices have been identified and formalized in regulations to help industrial facilities ensure great equipment health. The reliability and mechanical integrity of equipment have improved across the board as regulatory bodies have communicated and enforced the use of RAGAGEPs where needed.

Regulations, codes, and standards provide the justification that keeps good practices from being ignored. Codes and standards do not cover every scenario’s needs; however, a good case can be made to adjust testing type or frequency through proper risk-based inspection evaluation. A solid foundation for change to improve maintenance and reliability plans can be established if risk is understood and the impact of proposed changes is properly assessed and accepted. Equipment health and reliability improvements found through processes like these not only keep workers and equipment safe at a facility but also keep the community and world beyond the fenceline safer.
