Why You Should Design Your CMMS for Reliability Engineering
Answered October 30, 2020
I can’t think of a better objective than reliability and availability. Every computerized maintenance management system (CMMS) has powerful features, but the magic is in the core team vision for operational excellence. Unfortunately, the CMMS community in general does not understand how to maximize value by focusing on things that matter. In this piece, I’ll show how the core team and reliability team can work together to leverage failure data in the CMMS to make better decisions.
RCM Analysis Helps to Identify Accurate Maintenance Strategies
Reliability-centered maintenance (RCM) analysis provides a structured framework for analyzing the functions and potential failures of a physical asset, with a focus on preserving functions. The RCM method documents the asset’s functions, the ways each function can fail (functional failures), failure modes, failure effects, consequences, and proactive (or default) actions. This analysis provides the best way to determine the proper maintenance strategies for critical assets and systems. The RCM failure mode is necessary to discover the correct mitigating task. Each failure mode identifies the component, the component problem, and the cause of the failure, e.g., fuel pump motor bearing seized due to lack of lubrication.
Parsing the Failure Mode
Figure 2 shows multiple functions for a given asset, where each function can fail in multiple ways. Each functional failure can in turn have multiple failure modes, each shown as a phrase containing the failed component and component problem. Note that if the cause were known, it would also be part of the failure mode phrase. Parsing this failure mode phrase into three separate, validated fields enhances failure analysis within the CMMS.
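As a minimal sketch of this parsing, assume the phrase follows the pattern "component, problem, due to cause" and that problem words come from a small validated list. The names PROBLEM_WORDS, FailureMode, and parse_failure_mode are hypothetical and do not belong to any CMMS product.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical validated list of component problems (illustrative only).
PROBLEM_WORDS = {"seized", "leaking", "cracked", "corroded", "misaligned"}


@dataclass(frozen=True)
class FailureMode:
    component: str
    problem: str
    cause: Optional[str]  # None when the cause is not known


def parse_failure_mode(phrase: str) -> FailureMode:
    """Split '<component> <problem> [due to <cause>]' into three fields."""
    head, sep, cause = phrase.strip().partition(" due to ")
    words = head.split()
    # The first word found in the validated problem list marks the boundary
    # between the component and the component problem.
    for i, word in enumerate(words):
        if word.lower() in PROBLEM_WORDS:
            component = " ".join(words[:i])
            problem = " ".join(words[i:])
            return FailureMode(component, problem, cause.strip() if sep else None)
    raise ValueError(f"no known component problem in: {phrase!r}")


fm = parse_failure_mode("fuel pump motor bearing seized due to lack of lubrication")
no_cause = parse_failure_mode("valve stem leaking")
```

In practice the three fields would be picked from validated lists at data entry rather than parsed from free text; the sketch only shows how an existing phrase maps onto the three fields.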
Store Results of RCM Analysis Directly Inside the CMMS
Assuming the CMMS can be configured, it makes perfect sense to add this application. To set up and maintain a living program, the reliability team needs this data readily available so they can refine it as they go. The RCM facilitator often stores the output in an Excel spreadsheet, which is a stand-alone document and not easily updated. The CMMS offers data security, a familiar screen design, and the ability to join to other tables such as asset, labor, and the work order failure mode. As new systems are analyzed, this data can be uploaded in bulk. As new failure modes are discovered, they can be electronically routed for review and approval, and then inserted into the new application.
Create a Defendable PM Library
Imagine having a defendable preventive maintenance (PM) record, which is linked back to the RCM failure mode. By linking one (or more) failure modes to one PM record, you can then verify the validity of the entire PM library. And during this initial review, you might discover PM records that are no longer valid.
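The review described above can be sketched as a simple check for PM records with no link to a current RCM failure mode. All identifiers and data below (PM-100, FM-001, undefended_pms, and so on) are hypothetical and stand in for whatever tables the CMMS actually uses.

```python
# Hypothetical RCM failure mode library: id -> (component, problem, cause).
rcm_failure_modes = {
    "FM-001": ("bearing", "seized", "lack of lubrication"),
    "FM-002": ("seal", "leaking", "wear-and-tear"),
}

# Hypothetical PM library: each PM record lists the failure modes it mitigates.
pm_links = {
    "PM-100": ["FM-001"],
    "PM-200": ["FM-002", "FM-001"],
    "PM-300": [],          # no failure mode behind it
    "PM-400": ["FM-999"],  # points at a failure mode not in the library
}


def undefended_pms(pm_links, rcm_failure_modes):
    """Return PM ids that have no link to a current RCM failure mode."""
    return sorted(
        pm_id
        for pm_id, fm_ids in pm_links.items()
        if not any(fm_id in rcm_failure_modes for fm_id in fm_ids)
    )


needs_review = undefended_pms(pm_links, rcm_failure_modes)
```

Here PM-300 and PM-400 would be flagged as candidates for review, since neither can be traced back to a failure mode in the library.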
Understanding the RCM Standard SAE-JA-1011/12
This standard identifies failure mode as the language of RCM. More importantly, it describes the failure mode as three separate elements. If asset management professionals are to understand RCM and the definition of a failure mode, this is where they should look. This standard states that a failure mode should consist of a noun and a verb. Further, if you want to apply the optimum maintenance strategy, you also need to know the cause. Every reliability team should understand this standard. More importantly, they should insist that the CMMS failure data have this level of granularity, so that they can run a bad actor report.
On RCM Blitz
This is a book written by Douglas Plucknette, which describes RCM analysis. It clearly defines the failure mode as three pieces – however, grouped together as a phrase. Douglas may be the best in the business when performing RCM analysis and discovering system defects, but he never tried to store this data inside a CMMS. Upon reading his book, it became obvious to me that this was an opportunity worth investigating.
CMMS Product Design
Unfortunately, most CMMS products do not capture a failure mode as three separate fields. Rather, they concentrate on asset problem codes. To make matters worse, they introduce a failure code hierarchy which overcomplicates the design by segregating components to each failure class. On the surface, this sounds pretty clever. But if there were 50 total failure classes, 25 components, and 25 component problems, then there could be a total of 31,250 boxes in the hierarchy – and that’s not including the cause codes. This is the main reason why so many organizations have never succeeded at establishing a failure data library. And without the failure data, they cannot run failure analytics using aggregate commands. The most damaging outcome of this design failure is the inability of the reliability team to leverage (failure) data using a failure analytic in the CMMS to manage by exception.
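The arithmetic behind that figure is shown below, next to a rough comparison with flat lists. The flat count is my own assumption for illustration: one shared component list plus one shared problem list, with no failure class level.

```python
failure_classes = 50
components = 25
component_problems = 25

# Hierarchical design: every failure class carries its own
# component-by-problem tree, so the counts multiply.
hierarchy_boxes = failure_classes * components * component_problems

# Flat design (assumed for comparison): one shared component list
# plus one shared problem list, so the counts add.
flat_values = components + component_problems

print(hierarchy_boxes, flat_values)
```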
Configuring the Product
All that is required is to add fields to the work order tracking screen, build a failure analytic to extract bad actors, and apply a sort metric. The first step is to identify when the failure mode is required. The answer to that is to add a Yes/No field called Functional Failure. Thus, if this field is flagged Yes by the operator, then the maintenance staff must enter the failure mode at job completion.
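A minimal sketch of that completion rule follows. The WorkOrder class and its field names are hypothetical, and the sketch assumes only the failed component and component problem are mandatory, since the cause is sometimes not known.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkOrder:
    wonum: str
    functional_failure: bool  # the Yes/No field set by the operator
    failed_component: Optional[str] = None
    component_problem: Optional[str] = None


def completion_errors(wo: WorkOrder) -> List[str]:
    """List the missing fields that block completion of this work order."""
    if not wo.functional_failure:
        return []  # failure mode is not required without a functional failure
    errors = []
    if not wo.failed_component:
        errors.append("failed_component is required")
    if not wo.component_problem:
        errors.append("component_problem is required")
    return errors


routine = WorkOrder("WO-1", functional_failure=False)
incomplete = WorkOrder("WO-2", functional_failure=True, failed_component="BEARING")
complete = WorkOrder("WO-3", True, "BEARING", "SEIZED")
```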
The failure mode consists of three pieces: failed component, component problem, and cause code. The failed component can be found on the ITEM master table, commodity field. By running a query against the (historical) material issues table, you can use the commodity field linked with each item/part number. Then, by running an SQL distinct command, you have an instant list of failed components. At this point, you can make a static domain or a dynamic table-based domain. Note that this is for ALL assets.
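The query described above might look like the following sketch, run here against an in-memory SQLite database. The table and column names (item, material_issue, itemnum, commodity) are assumptions for illustration and will differ by CMMS product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (itemnum TEXT PRIMARY KEY, commodity TEXT);
    CREATE TABLE material_issue (wonum TEXT, itemnum TEXT);
    INSERT INTO item VALUES ('1001', 'BEARING'), ('1002', 'SEAL'),
                            ('1003', 'BEARING'), ('1004', 'GASKET');
    INSERT INTO material_issue VALUES ('WO1', '1001'), ('WO2', '1003'),
                                      ('WO3', '1002'), ('WO4', '1001');
""")

# Distinct commodities of every part that was actually issued to a work order.
rows = conn.execute("""
    SELECT DISTINCT i.commodity
    FROM material_issue AS m
    JOIN item AS i ON i.itemnum = m.itemnum
    WHERE i.commodity IS NOT NULL
    ORDER BY i.commodity
""").fetchall()

failed_components = [commodity for (commodity,) in rows]
conn.close()
```

In this sample data GASKET is never issued, so it does not appear in the resulting list.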
Although this failed component domain could be quite large, by using internet search technology called TYPEAHEAD BUFFER, you can easily find the component you want in two to three clicks. And since the maintenance technician did the component replacement (or repair), they obviously know what the component is. In some instances, the component they replaced may not be in the list, in which case there needs to be another field called MISSING COMPONENT. If populated, then this information is electronically routed to the reliability team for review. Upon their approval, the software would add the new value to the domain and also backfit the work order component field.
The component problem could be a simple, static list of fewer than 25 values. An example list is shown in figure 5. Thus, the primary key might be CALIB, and the description for CALIB would be MISS-CAL-STICK, OUT-OF-CAL, OUT-OF-SPEC, whereby the description would be searchable. Usually these problem codes would be entered by the technician, but entry may sometimes require a supervisor or reliability professional.
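A searchable description could work along these lines. CALIB and its description come from the text above; the LEAK entry and the search_problem_codes function are made up for illustration.

```python
# Problem-code domain: primary key -> searchable description.
# CALIB is taken from the article; LEAK is a hypothetical second entry.
PROBLEM_CODES = {
    "CALIB": "MISS-CAL-STICK, OUT-OF-CAL, OUT-OF-SPEC",
    "LEAK": "LEAKING, SEEPING, DRIPPING",
}


def search_problem_codes(typed, domain=PROBLEM_CODES):
    """Return keys whose key or description contains the typed text."""
    needle = typed.strip().upper()
    if not needle:
        return []
    return sorted(
        key
        for key, description in domain.items()
        if needle in key or needle in description.upper()
    )


hits = search_problem_codes("out-of")
```

A technician typing "out-of" would be offered CALIB without needing to know the key itself.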
Cause codes can be more challenging. And the list of cause codes could be quite extensive. Further, the question needs to be asked, “How far do you go?” The danger in not capturing any cause code is that the expensive bearing you just replaced could fail again in six to nine months because you never really eliminated the cause. These values could be entered by the technician but might require the maintenance supervisor.
Cause Code Hierarchy
This is one time where I do recommend the use of a hierarchy. The benefit of a hierarchy is that you start at a higher level and then work into human factors. Some asset management professionals state that the majority of equipment failures are related to human factors. But this does NOT mean maintenance. Defects can be inserted into the asset anywhere in the life cycle by humans. Figure 6 shows all of the possibilities.
Using a Cause Code Hierarchy
As stated, the cause code may be the hardest piece to capture. But without this information, the person creating the PM job plan will be guessing as to how best to prevent this from happening again. Admittedly, sometimes the cause is not known. On that note, be sure the maintenance staff is cautioned about (accidentally) destroying evidence. In figure 7, there are three cause fields added to the work order failure reporting screen. Cause code 1 has a short list of values: NO-DEFECT FOUND, AGING, WEAR-AND-TEAR, POWER-SURGE, HOUSEKEEPING, ENVIRONMENTAL REASONS, FORCE-MAJEURE, and HUMAN-FACTOR. The technician would read through the list and choose HUMAN-FACTOR only if none of the other values apply. If HUMAN-FACTOR is chosen, then cause code 2 is required. Similarly, if WORKMANSHIP is chosen for cause code 2, then cause code 3 is required. There could be additional fields to capture lack of skill, lack of standards, and lack of leadership.
Keep in mind that cause code 3 could be dissected a lot further in the case of a formal root cause failure analysis. The purpose of this type of failure coding is to get somewhat close by capturing an RCM-style failure mode and preventing recurrence.
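The conditional rules for the three cause fields can be sketched as a small validation function. The cause code 1 values are those listed above; the function name is hypothetical, and no values for cause codes 2 and 3 are assumed beyond WORKMANSHIP.

```python
CAUSE_1_VALUES = {
    "NO-DEFECT FOUND", "AGING", "WEAR-AND-TEAR", "POWER-SURGE",
    "HOUSEKEEPING", "ENVIRONMENTAL REASONS", "FORCE-MAJEURE", "HUMAN-FACTOR",
}


def cause_code_errors(cause1, cause2=None, cause3=None):
    """Return the validation errors for the three cause code fields."""
    errors = []
    if cause1 not in CAUSE_1_VALUES:
        errors.append("cause code 1 must be a value from the list")
    # Each lower level becomes mandatory only when its trigger value is chosen.
    if cause1 == "HUMAN-FACTOR" and not cause2:
        errors.append("cause code 2 is required when cause code 1 is HUMAN-FACTOR")
    if cause2 == "WORKMANSHIP" and not cause3:
        errors.append("cause code 3 is required when cause code 2 is WORKMANSHIP")
    return errors


aging = cause_code_errors("AGING")
human_only = cause_code_errors("HUMAN-FACTOR")
workmanship_only = cause_code_errors("HUMAN-FACTOR", "WORKMANSHIP")
```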
Automatic Comparison of Work Order Failure Mode to RCM Failure Mode
Because all of this data is now inside the CMMS, there are many new possibilities which help us continually improve strategies and failure modes. Since the RCM failure mode is stored as three pieces, and the work order failure mode is stored as three pieces, it is now possible to implement an automatic comparison. Examples are shown below:
- If the work order failure mode does not exist in the RCM application for that same asset, then route to reliability professionals for review.
- If the work order failure mode does exist, then also route to reliability professionals to “ask why this failure happened.” Perhaps the CMMS PM job plan is missing or incorrectly set up. Or, the maintenance technician failed to follow procedure. Or, the PM work order was not performed per schedule.
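Because both sides are stored as the same three fields, the comparison reduces to a set lookup per asset. The sketch below assumes a hypothetical rcm_library structure and made-up routing messages; a real CMMS would use its own workflow engine for the routing step.

```python
# Hypothetical RCM library: asset -> set of (component, problem, cause).
rcm_library = {
    "PUMP-01": {("BEARING", "SEIZED", "LACK-OF-LUBRICATION")},
}


def route_work_order(asset, failure_mode, rcm_library):
    """Return the review question to send to the reliability team."""
    known = rcm_library.get(asset, set())
    if failure_mode in known:
        # The strategy already covers this failure mode, yet it occurred.
        return "KNOWN: ask why this failure happened"
    return "NEW: review as a candidate RCM failure mode"


known_case = route_work_order(
    "PUMP-01", ("BEARING", "SEIZED", "LACK-OF-LUBRICATION"), rcm_library
)
new_case = route_work_order(
    "PUMP-01", ("SEAL", "LEAKING", "WEAR-AND-TEAR"), rcm_library
)
```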
Chronic Failure Analysis
To me, this may be the #1 benefit of having proper failure coding. By designing a failure analytic (see figure 8), we can now extract bad actors. And by choosing the worst offender in the list, the reliability team can dynamically drill down on the failure mode. Some say the largest portion of operations & maintenance costs is due to these recurring failures. If so, shouldn’t we have a way to focus on them?
Explaining the Sort Metric
You could sort the bad actors several ways, but choosing the asset with the most breakdowns is not necessarily the best way. There could be value in operational downtime, mean time between failures, and asset condition. Based on my experience, I believe the most powerful metric is the average annual maintenance cost divided by replacement cost, referred to as AA$ / RPL$. By this measure, any asset with a ratio over 7% is an asset in trouble, meaning you are spending quite a bit on it compared to the replacement cost. Run across thousands of assets, this metric floats the bad actors to the top, showing the reliability team where to focus. Once an asset is chosen, they might perform a more detailed root cause analysis. This is called managing by exception.
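A minimal sketch of the AA$ / RPL$ sort follows, using the 7% threshold from the text. The asset names, cost figures, and the bad_actors function are made up for illustration.

```python
# Hypothetical assets with yearly maintenance costs and replacement cost.
assets = [
    {"asset": "PUMP-01", "annual_costs": [4000, 6000, 5000], "replacement_cost": 50000},
    {"asset": "FAN-02", "annual_costs": [500, 700, 600], "replacement_cost": 20000},
    {"asset": "COMP-03", "annual_costs": [30000, 20000, 25000], "replacement_cost": 100000},
]

THRESHOLD = 0.07  # the 7% cut-off described in the text


def bad_actors(assets, threshold=THRESHOLD):
    """Return (asset, AA$/RPL$) pairs above the threshold, worst first."""
    scored = []
    for a in assets:
        average_annual = sum(a["annual_costs"]) / len(a["annual_costs"])
        ratio = average_annual / a["replacement_cost"]
        if ratio > threshold:
            scored.append((a["asset"], ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


report = bad_actors(assets)
```

With this sample data, COMP-03 (25%) and PUMP-01 (10%) appear in the report, while FAN-02 (3%) stays below the threshold and is left out.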
Drilling Down on Failure Mode
The software experts say we can pretty much do anything nowadays. Well, my challenge is this: develop a bad actor report for the reliability team to run, display on the overhead, allow them to choose a particular asset, and then see the failure mode in pie chart format. The first pie chart would be failed components. The team lead would click on the largest wedge, and then see the component problems, and so forth. Imagine the wealth of information in the hands of the reliability team to make data-based decisions.
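The data behind such a drill-down could be produced along these lines, leaving the charting itself to whatever reporting tool is available. The work order history and the drill_down function are hypothetical.

```python
from collections import Counter

# Hypothetical failure history for one asset: (component, problem, cause).
history = [
    ("BEARING", "SEIZED", "LACK-OF-LUBRICATION"),
    ("BEARING", "SEIZED", "LACK-OF-LUBRICATION"),
    ("BEARING", "NOISY", "MISALIGNMENT"),
    ("SEAL", "LEAKING", "WEAR-AND-TEAR"),
]


def drill_down(history, component=None, problem=None):
    """Count the next level of the failure mode below the chosen wedges."""
    if component is None:
        return Counter(c for c, _, _ in history)
    if problem is None:
        return Counter(p for c, p, _ in history if c == component)
    return Counter(
        cause for c, p, cause in history if c == component and p == problem
    )


components = drill_down(history)                     # first pie chart
top_component = components.most_common(1)[0][0]      # largest wedge
problems = drill_down(history, top_component)        # second pie chart
causes = drill_down(history, top_component, "SEIZED")  # third pie chart
```

Each Counter holds the wedge sizes for one pie chart, so clicking a wedge simply means calling the function again with one more field filled in.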
Make the CMMS Work for You
This is how the core team and reliability team can work together to leverage failure data in the CMMS to make better decisions. More importantly, this is how a best-in-class organization can optimize return on asset and improve profitability. All that is required is a vision for excellence.
John Reeve is a Senior Consultant. With 20,000 followers on LinkedIn, he regularly shares knowledge on many topics in support of asset management. Being the 2nd consultant hired by the company that invented Maximo, he spent the first 10 years consulting in project scheduling and cost management, followed by 15 years on Maximo software. But it was the last part of his career (another 15 years) where he focused on advanced concepts resulting in a U.S. Patent for maintenance scheduling called the “order of fire.” John is also a CRL, CMM, and book author.