A Step-by-Step Guide to Choosing the Right Maintenance Strategy for Your Equipment

Ryan Chan

Key Takeaways:

  1. Choosing the right maintenance strategy is a function of the cost of equipment failure and ease of monitoring.
  2. Build a team with members from multiple departments. Make sure all stakeholders are represented.
  3. Start with criticality analysis on the asset to evaluate the potential costs of failure.
  4. Analyze how much monitoring the equipment would cost relative to its overall cost.
  5. Choose a maintenance strategy, implement it, and monitor your progress. Adjust your strategy as needed.

When choosing a maintenance strategy for a given piece of equipment, it may be tempting to go with either the most effective or least expensive option.

While predictive maintenance is seen as one of the best ways to improve productivity and safety and to reduce equipment downtime, it is not always the best option for your situation. It comes with its own costs, so it may not be cost-effective, and other maintenance strategies are often more appropriate.

Choosing the right maintenance strategy is not always a straightforward process. Assets function in different ways, so it makes sense that each one might need to be handled differently. The value of the equipment and the cost of any failure event also come into play.

Here, we’ll discuss how to choose the right maintenance strategy for your equipment.

Two Factors to Consider

When determining the right maintenance strategy for a given piece of equipment, you’ll have two primary factors to consider:

  1. The cost of equipment failure
  2. The ease of monitoring the equipment

Both of these will form the basis of your maintenance strategy for each piece of equipment.

High cost of failure and ease of monitoring mean a more involved maintenance strategy.

1. Cost of equipment failure

The first factor is the cost of equipment failure. Essentially, if the asset breaks down, what impact will that have?

One of the costs associated with equipment failure is equipment downtime. The longer a piece of equipment is down, the more it’s going to cost. Given that the U.S. loses over $647 billion every year due to machine downtime, certain breakdowns are definitely worth preventing, especially for equipment that’s central to keeping your processes running.

Downtime isn’t the only factor to consider in overall costs, however. Repair costs, safety, and environmental impacts are also concerns well worth considering. If a given failure mode would result in injury to your employees or others, that would be worth preventing as well.

Why does equipment fail?

Before trying to avoid equipment failures, it helps to understand why they occur in the first place. Here are some of the most common reasons a machine fails:

Overworking equipment

The materials that make up a piece of equipment have predetermined capacities. If you overwork the equipment and push it beyond those limits, its components can break. Compromising the integrity of individual components can cause even more damage to the larger subsystem.

Physical damage

Apart from the stress of normal operation, external factors can also cause physical damage to equipment. Examples include rust, corrosion, erosion, and thermal fluctuations caused by environmental conditions.

Mishandling and errors

Most machines are designed to perform specific tasks. If a machine is used for the wrong type of application, or simply used incorrectly, its parts can break more easily. These issues can be resolved or prevented with detailed procedures and proper training.

Insufficient maintenance

Manufacturers typically provide maintenance instructions for the equipment they produce. Equipment designs rely on these measures to prolong a machine’s life cycle. Examples include regular oil changes and cleaning tasks.

Improper installation or design

Poor installation or design decisions can lead to one of the other causes of equipment failure. In a worst-case scenario, these decisions might even pose a safety risk. For example, opting for undersized equipment can lead to overworked components and ultimately equipment failure.

2. Ease of monitoring

The second factor to consider when determining your maintenance strategy is the ease of monitoring. Watching over each piece of equipment incurs a cost. If the cost of monitoring a given asset would be more than that of the failure mode it prevents, it may not be worth implementing.

Where monitoring is more expensive, maintenance strategies that require less vigilance may be more appropriate. For instance, if it would cost too much to install sensors on a piece of equipment, you might be better served with a schedule-based preventive maintenance plan.
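To make that comparison concrete, here is a minimal sketch of weighing the yearly cost of monitoring against the yearly cost of the failures it is meant to prevent. The function names and all figures are hypothetical, and the sketch assumes, as a simplification, that monitoring would prevent every failure it targets.

```python
def expected_annual_failure_cost(failures_per_year, downtime_hours_per_failure,
                                 downtime_cost_per_hour, repair_cost_per_failure):
    """Expected yearly cost of a failure mode: downtime plus repairs."""
    cost_per_failure = (downtime_hours_per_failure * downtime_cost_per_hour
                        + repair_cost_per_failure)
    return failures_per_year * cost_per_failure


def monitoring_is_worthwhile(annual_monitoring_cost, annual_failure_cost):
    """Monitoring pays off only if it costs less than the failures it prevents."""
    return annual_monitoring_cost < annual_failure_cost


# Hypothetical figures, for illustration only
failure_cost = expected_annual_failure_cost(
    failures_per_year=3,
    downtime_hours_per_failure=4,
    downtime_cost_per_hour=1_000,
    repair_cost_per_failure=500,
)
print(failure_cost)                                    # 13500
print(monitoring_is_worthwhile(20_000, failure_cost))  # False
```

In this made-up case, monitoring would cost more than the failures themselves, which points toward a less monitoring-intensive strategy.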

Step 1: Laying the Groundwork

Before you actually get started with choosing a maintenance strategy, you need to make sure you have the necessary foundation to handle that process. Specifically, you’ll need a team and sufficient maintenance data.

The team you put together for this planning process needs to consist of personnel from different departments. Making sure maintenance, operations, engineering, and so forth are involved grants you the advantage of having multiple skillsets and areas of expertise at the table, and it ultimately ensures everyone affected is represented.

This isn’t just a one-off committee either. This team will meet on a semi-regular basis to assess your current maintenance strategies and make adjustments as needed.

In addition to a team, you’ll need some maintenance data to work with as well. Particularly, metrics such as mean time between failures (MTBF) and overall equipment effectiveness (OEE) will be especially useful during the next step.

Tip: Log work orders, asset downtime, and other maintenance data using a CMMS.
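If your data lives in a spreadsheet or CMMS export, the two metrics mentioned above can be computed with a few lines of code. The sketch below uses the commonly cited definitions (MTBF as operating time divided by the number of failures, OEE as availability times performance times quality). The function names and sample numbers are assumptions for illustration.

```python
def mtbf(total_operating_hours, failure_count):
    """Mean time between failures: operating time divided by number of failures."""
    if failure_count == 0:
        raise ValueError("MTBF is undefined when no failures were recorded")
    return total_operating_hours / failure_count


def oee(availability, performance, quality):
    """Overall equipment effectiveness as the product of three ratios in [0, 1]."""
    for ratio in (availability, performance, quality):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("each OEE factor must be between 0 and 1")
    return availability * performance * quality


# Hypothetical sample values
print(mtbf(total_operating_hours=1_200, failure_count=4))  # 300.0
print(round(oee(0.90, 0.95, 0.98), 3))                     # 0.838
```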

Step 2: Criticality Analysis

Criticality analysis is the way your team will determine how much an asset will cost you if it fails. The higher an asset’s criticality, the higher its potential costs. Some facilities evaluate criticality strictly by the impact an equipment failure would have on their process. Others use a more holistic approach, evaluating the effects failure would have on safety, maintenance costs, production, and the environment, as in the graphic below.

Under each category, rank the cost of a given failure on a scale from one to five, with one being the least severe and five representing disastrous consequences.

Tip: Some organizations multiply these scores together for a composite score when performing criticality analysis. However, when planning maintenance strategies for your equipment, simply taking the highest score is also sufficient.

The higher the ranking, the more mature the maintenance strategy should be for that asset.

For example, suppose the staff of a toy manufacturing plant is trying to determine the best maintenance strategy for their thousands of feet of conveyor belts. Looking at their data on past equipment failures, they find that it typically takes four hours to pin down the location and cause of the breakdown and to make the needed repairs. No major safety or environmental risks occur (aside from some spilled pieces posing a minor hazard), and the repairs typically aren’t very expensive. As such, their ratings would look like this:

Overall, they’d rate their conveyor system’s cost of failure as 2.

Let’s look at another example. An oil refinery is developing a maintenance strategy for their pipelines. Given that failure could result in significant losses, including disastrous consequences for their employees’ safety and the environment, they rate it a 5.
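The scoring approach in this step can be sketched in code. The example below supports both ways of combining category scores mentioned in the tip: taking the highest score or multiplying the scores together. The category keys and the per-category conveyor scores are illustrative assumptions, chosen only so that the highest score matches the overall rating of 2 in the conveyor example.

```python
from math import prod

CATEGORIES = ("safety", "maintenance_costs", "production", "environment")


def criticality(scores, method="max"):
    """Combine per-category scores (1 to 5) into a single criticality score."""
    missing = set(CATEGORIES) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    values = [scores[c] for c in CATEGORIES]
    if any(not 1 <= v <= 5 for v in values):
        raise ValueError("scores must be between 1 and 5")
    if method == "max":
        return max(values)
    if method == "product":
        return prod(values)
    raise ValueError(f"unknown method: {method}")


# Illustrative per-category scores (assumed, not taken from the article)
conveyor = {"safety": 2, "maintenance_costs": 1, "production": 2, "environment": 1}
print(criticality(conveyor))             # 2
print(criticality(conveyor, "product"))  # 4
```

Note that the two methods produce scores on different scales, so pick one and use it consistently across all assets.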

Step 3: Assess Costs of Different Strategies

Once you have a clear ranking of what each asset might cost upon failure, you’ll need to assess the possible strategies to use on each one. Typically, the more monitoring intensive a strategy is, the more it will cost. As such, the possible strategies you’d use might be ranked like so from high to low cost:

In our conveyor belt example, the cost of monitoring thousands of feet of belt would likely be fairly expensive, especially if some parts aren’t readily accessible. Perhaps the most efficient methods would be ultrasound or infrared, neither of which would require physical contact, but which both would involve consistent effort. On a 1 to 5 scale, the team might rate this one a 3, right in the middle.

The oil refinery’s pipelines might offer a similar challenge, especially since many pipes might be buried or hidden behind other equipment. Again, ultrasound would likely be the tool to use, and the time it would take to monitor each pipeline means the team would rate the cost at around 2, given the total cost of the pipelines.

Tip: The cost of monitoring is relative. Compare the cost of monitoring with the price of the asset and make a rating from there. As a frame of reference, the best-in-class target for the cost of maintenance as a percentage of replacement asset value is below 3%.
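One way to turn that relative comparison into a 1 to 5 rating is to compute the ratio of monitoring cost to asset value and bucket it. This is only a sketch: the cut-off values below are hypothetical placeholders rather than industry standards, and your team would need to set its own.

```python
def monitoring_cost_ratio(annual_monitoring_cost, asset_value):
    """Monitoring cost as a fraction of the asset's value."""
    if asset_value <= 0:
        raise ValueError("asset_value must be positive")
    return annual_monitoring_cost / asset_value


def monitoring_cost_rating(ratio, upper_bounds=(0.01, 0.03, 0.06, 0.10)):
    """Map a cost ratio to a 1-5 rating; the bounds are hypothetical cut-offs."""
    for rating, bound in enumerate(upper_bounds, start=1):
        if ratio <= bound:
            return rating
    return 5


# Hypothetical figures, for illustration only
ratio = monitoring_cost_ratio(annual_monitoring_cost=10_000, asset_value=400_000)
print(ratio)                          # 0.025
print(monitoring_cost_rating(ratio))  # 2
```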


Step 4: Choose a Maintenance Strategy

With the cost of failure and the cost of monitoring in hand, it’s time to determine the best maintenance strategy for your equipment. Using our chart from earlier, each of our example companies plots its assets:

The oil refinery and toy manufacturer plot their assets in purple on this matrix.

Since the cost of monitoring the toy company’s conveyor system puts it right on the line, they could potentially opt for either. They decide to try condition-based maintenance to see if it might reduce maintenance costs over time.

Meanwhile, the oil refinery plans to implement a predictive maintenance strategy using ultrasound and predictive analytics.

Tip: Keep in mind that many industries have legal requirements in place regarding the way they maintain their assets. Where any laws exist, compliance with those laws is a top priority.
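The selection logic in this step can be approximated in code. The sketch below is not the article’s matrix. It is a hypothetical scoring rule built on the idea that a high cost of failure and easy monitoring call for a more involved strategy, with cut-offs chosen so that the two worked examples come out as described. Treat the cut-offs as assumptions to tune.

```python
def suggest_strategy(cost_of_failure, cost_of_monitoring):
    """Suggest a strategy from two 1-5 ratings using hypothetical cut-offs."""
    for rating in (cost_of_failure, cost_of_monitoring):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    # Cheap monitoring (low cost rating) means high ease of monitoring.
    ease_of_monitoring = 6 - cost_of_monitoring
    # Involvement ranges from 2 (cheap failure, costly monitoring) to 10.
    involvement = cost_of_failure + ease_of_monitoring
    if involvement >= 8:
        return "predictive"
    if involvement >= 5:
        return "condition-based"
    if involvement >= 4:
        return "preventive"
    return "reactive"


print(suggest_strategy(2, 3))  # toy manufacturer's conveyor: condition-based
print(suggest_strategy(5, 2))  # oil refinery's pipelines: predictive
```

The conveyor lands exactly on the lowest score that still yields condition-based maintenance, which mirrors its borderline position in the example. Legal requirements are not modeled here and would override any suggestion.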

What are the main types of maintenance strategies?

Equipment failure is always a real possibility. While a machine is not expected to run forever, we want to at least prolong and maximize its life cycle. A maintenance strategy drives the plans on how an asset can keep running.

Reactive maintenance

Reactive maintenance is the simplest and most intuitive form of maintenance. Put simply, if something breaks, the maintenance team works to get it back into service.

Reactive maintenance covers a wide range of scenarios. It could be a case of a minor faulty component that requires some corrective action. On the other hand, it could involve a catastrophic failure of a major part, causing production stoppages.

While it doesn’t sound like a foolproof strategy, and it usually isn’t, there are valid reasons to use reactive maintenance. A simple example is changing out light bulbs. A burned-out light has no grave consequences and can easily be fixed. Clearly, the same cannot be said for more critical equipment.

Proactive maintenance

Proactive maintenance is a maintenance philosophy that aims to avoid failure by addressing conditions that can lead to a breakdown. In other words, proactive maintenance tries to stop failures before they even happen.

Proactive maintenance is a broad term that covers many different approaches. One of the key factors that subdivides it is the way maintenance tasks are triggered: each type of proactive maintenance looks at a different set of criteria to prompt particular tasks. Here are a few of the most common types of proactive maintenance strategies:

  1. Preventive maintenance (PM) is the most common form of proactive maintenance. PM tasks are predominantly performed according to calendar-based or usage-based schedules. A practical example of PM is changing your car’s engine oil after a set number of months or miles.
  2. Predictive maintenance (PdM) is perhaps the most advanced form of proactive maintenance available in practice. PdM uses sensors to track an asset’s condition in real time. With this additional performance data, maintenance activities are recommended precisely when needed. In contrast, a calendar- or usage-based schedule risks performing maintenance tasks too often or not often enough.
  3. Condition-based maintenance (CBM) works on the same principle as predictive maintenance, but without the same level of technology. CBM also uses sensors to assess the condition of an asset. However, the analysis and the decision to perform maintenance work rest mostly with a trained operator.
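To illustrate how preventive maintenance tasks are triggered, here is a minimal sketch of a check that fires on either a calendar interval or a usage limit, whichever comes first. The function name, the 180-day interval, and the 5,000-mile limit are assumptions for the oil change example.

```python
from datetime import date, timedelta


def pm_is_due(last_service, today, interval_days, usage_since_service, usage_limit):
    """A PM task is due when either the calendar or the usage trigger is reached."""
    calendar_due = today - last_service >= timedelta(days=interval_days)
    usage_due = usage_since_service >= usage_limit
    return calendar_due or usage_due


# Oil change every 180 days or 5,000 miles (assumed values)
print(pm_is_due(date(2020, 1, 1), date(2020, 3, 1), 180, 5_200, 5_000))  # True
print(pm_is_due(date(2020, 1, 1), date(2020, 3, 1), 180, 2_000, 5_000))  # False
```

In the first call the mileage limit is exceeded even though only 60 days have passed, so the task is due.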

Reliability-centered maintenance (RCM)

In the traditional sense, the main types of maintenance would have stopped at being either reactive or proactive. In reality, whether purposely or not, you would employ more than one maintenance strategy to cover the whole range of equipment and appliances. What’s important is to ensure that these maintenance choices contribute to the overall objectives of the company.

RCM gives you the freedom to assign maintenance strategies to maximize reliability while optimizing resources. Assigning assets to different types of maintenance becomes a data-driven solution rather than a limitation. Moreover, RCM does not stop at whether an asset should be maintained reactively or proactively. RCM tries to find the most suitable type of proactive maintenance while planning for the potential for reactive corrections.

Step 5: Implement Your Strategy

Once you determine the strategy to use for each asset, it’s time to implement it. In some cases, doing so may represent a major shift in your maintenance team’s culture, so this will take some planning.

For instance, the toy manufacturer has been using a run-to-failure approach with their conveyor system. As such, switching over to a condition-based model will be a bit of a jump, though they do have some recurring PMs in place on other assets. They’ll need to train their maintenance technicians or equipment operators to periodically check their system with ultrasound and log the data. As that data builds up, they’ll need to be aware of when the readings look abnormal.
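As a sketch of what spotting abnormal readings could look like once enough data has been logged, the example below flags any new reading that falls more than a chosen number of standard deviations from the baseline average. This is one simple rule among many, not a method prescribed here. The readings and the threshold of three standard deviations are hypothetical.

```python
from statistics import mean, stdev


def abnormal_readings(baseline, new_readings, k=3.0):
    """Return new readings more than k standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    center = mean(baseline)
    spread = stdev(baseline)
    return [r for r in new_readings if abs(r - center) > k * spread]


# Hypothetical ultrasound readings logged during normal operation
baseline = [40.1, 39.8, 40.3, 40.0, 39.9, 40.2]
print(abnormal_readings(baseline, [40.1, 39.7, 46.5]))  # [46.5]
```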

For the oil refinery, they’ve been using some condition-based monitoring on other equipment. Implementing their new strategy for monitoring their pipelines is simply a matter of adding predictive analytics to their current strategy, training their personnel to use it, and adapting their work order planning practices accordingly.

How to implement RCM

RCM is a thoughtful process that allows you to objectively choose an appropriate maintenance strategy for each asset. Its implementation requires reliable data and analysis that allow for sensible decisions. A typical RCM process involves the following steps:

Discuss the plan

As with most projects, the whole process starts with developing and discussing the plan. The scope and limitations of the work, as well as the accountable groups, are defined in this stage. It is important that relevant groups are clear about their roles before moving on to the next steps.

Select the equipment

The team then needs to list all equipment and assets that are subject to RCM implementation. The decision process typically involves several factors including safety, legal, and economic considerations.

Identify functions

Identifying a piece of equipment’s functions includes both qualitative and quantitative descriptions of the expected task. RCM focuses on achieving consistent and dependable operations according to these defined functions.

This step recognizes that each asset is a component of a bigger subsystem. Understanding the functions of each piece of equipment leads to an overall view of how the system works.

Identify functional failures

Failure refers to any situation where a piece of equipment does not perform as expected. Functional failures, therefore, can manifest in different ways with varying levels of seriousness.

Functional failures can resemble one or more of the following examples:
– Complete functional failure
– Poor performance of a function
– Intermittent performance of a function
– Overperformance of a function

Identify failure modes and effects

As we have mentioned previously, components of a system are in many ways related to each other. This step gathers what we already know about our assets and identifies the causes and effects of failure.

This process analyzes conditions that can possibly cause failures and connects them with corresponding consequences. This step also accounts for other factors such as the likelihood of occurrence, detectability of failure, and risk controls.
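One common way to combine these factors is the risk priority number (RPN) used in failure mode and effects analysis, which multiplies severity, occurrence, and detection ratings. The article does not prescribe this method, so treat the sketch below as one possible approach. The failure modes and their ratings are hypothetical, and each factor is assumed to be rated from 1 to 10 with higher meaning worse (for detection, harder to detect).

```python
def risk_priority_number(severity, occurrence, detection):
    """FMEA-style RPN: each factor is rated 1-10, higher meaning worse."""
    for rating in (severity, occurrence, detection):
        if not 1 <= rating <= 10:
            raise ValueError("ratings must be between 1 and 10")
    return severity * occurrence * detection


# Hypothetical failure modes: (name, severity, occurrence, detection)
failure_modes = [
    ("belt misalignment", 4, 6, 3),
    ("motor bearing wear", 7, 3, 5),
    ("roller seizure", 5, 2, 2),
]

# Rank failure modes from highest to lowest risk
ranked = sorted(failure_modes,
                key=lambda m: risk_priority_number(*m[1:]),
                reverse=True)
for name, *ratings in ranked:
    print(name, risk_priority_number(*ratings))
```

Failure modes at the top of the ranking are the first candidates for more involved maintenance tasks in the next step.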

Select maintenance tasks

Based on the previous steps, you now have a better idea of which assets need more attention. Minimize or even eliminate the risk of production stoppages by assigning maintenance tasks that keep those assets performing their required functions.

Equipment that is critical to your operations needs to have the highest reliability. This suggests moving towards proactive maintenance for high-cost machinery and other essential assets. The availability and ease of getting performance data can further lead you to more specific proactive maintenance strategies such as PdM or CBM.

At the other end of the spectrum, minor assets with minimal business impact can be handled reactively. Going back to the light bulb example, it is more economical to change out the bulb only when it fails. In the same example, you can imagine the proactive alternative to be rather wasteful – replacing a perfectly working bulb.

Evaluate and review

Evaluating and reviewing your RCM process helps to bring out opportunities for improvement. This step can be performed through detailed discussions with subject matter experts. Experience and personal accounts from the team can also be gathered to assess the performance of current processes.

Alternatively, simulations can be performed to fast-track the evaluation process. Hypothetical conditions can be assumed for the current set-up to gather information on how the system reacts. Data from simulations hold valuable information for potential improvements.

Step 6: Monitor Your Progress and Make Adjustments

Implementing your maintenance strategy isn’t the end of the process. You need to monitor its progress and make adjustments as you go. That includes logging work orders and costs in your CMMS and checking whether downtime actually decreases.
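A simple way to check whether downtime is moving in the right direction is to compare totals from before and after the change. The sketch below assumes you can export monthly downtime hours from your work order logs. The figures are hypothetical.

```python
def percent_change(before, after):
    """Percentage change from before to after; negative means a reduction."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100


# Hypothetical monthly downtime hours pulled from work order logs
downtime_before = [16, 12, 20]
downtime_after = [10, 8, 6]
change = percent_change(sum(downtime_before), sum(downtime_after))
print(change)  # -50.0
```

Compare periods of equal length and similar production load, since a quiet month can look like an improvement when it is not.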

Going back to our examples:

As the toy manufacturer implements a condition-based maintenance strategy, they find that production does in fact increase, which lines up with current research on CBM. They decide to continue on with this strategy, but they do make a few tweaks to improve efficiency. That way, they’ll pay off the investment they made in their ultrasound equipment a little faster.

The oil refinery’s attempts to implement predictive analytics don’t quite pan out as hoped. They made the mistake of focusing on software selection without training their team in the skills they’d need to use it. The strategy itself wasn’t necessarily wrong, and the software they used was top-notch; they just didn’t implement the strategy well. As a result, they resolve to train their personnel in using data to plan maintenance work.

This article was updated with additional information in July 2020.
