A few of the RCM books on our Bookshelf
These are my 'go to' references and guides I still use when sharing the RCM philosophy with new clients.
It's March 2024 as I post this, and I'm busy working on a new module for our free-to-access learning platform, UPTIME Consultant Academy®. It has a working title of "RCM 101" and aims to introduce newbies to RCM and review all of the above books.
Reliability-Centered Maintenance (1978)
Original Paper by Nowlan & Heap

File: reliability_centered_maintenance_by_nowlan_and_heap.pdf (32223 kb)
Mathematical Aspects of Reliability Centered Maintenance - Appendix to RCM

File: ada066580.pdf (4488 kb)
Reliability-centred Maintenance
I recently scrolled through my published articles and realised that I hadn't covered Reliability-centred Maintenance (RCM), which kind of shocked me!
Here, belatedly, is an outline of what RCM is, where RCM2 comes from and how relevant it still is in industry today.
Hands up everyone who has heard of MSG3 and, importantly, what it has delivered? OK, those who say 'yes' have a head start on the rest, so bear with me whilst I cover this very important document. First, a little history.
Let's put things into context with a timeline of maintenance practices. We hear a lot of noise about Industry 4.0 today and what's going to happen in the future. I always like to start my RCM, PdM and CBM training with a history lesson, because we need to take a look back at the preceding eras before moving on.
Industry 1.0 would have been the advent of the Industrial Revolution. Steam was the big technological advance of the day, and Engineering was really BIG back in the steam age. The scale of some of those engines, pumps and locomotives still impresses me today. Everything was over-engineered to last, and systems were simple, with levers, ratchets, pawls, cylinders and slides that just ran until they broke; then relatively cheap labour just fixed it. Time wasn't tight and the consequences of failure, in both financial and safety terms, were low.
Then, around the turn of the 20th century, machines became faster and lighter in build. World War One came along and drove this innovation, along with Henry Ford introducing new manufacturing techniques, and production quality became important as mass manufacture took hold. But labour was still relatively cheap, so the break and fix mentality still held true, as systems were still quite simple to service and repair.
Industry 2.0 took off around the outbreak of WW2 as demands increased. Labour became short as the war expanded, which increased the use of mechanisation with more complex systems. This period also saw the next big leap, as Alan Turing developed theoretical computer science and worked on the code-breaking machines at Bletchley Park which, by breaking the Germans' Enigma encryption code, almost certainly influenced the outcome of the Second World War. He is also considered by many to be the father of Artificial Intelligence (AI), something that has only recently come to the fore.
After WW2 was over is when the real work started. Because of shortages, mechanisation and more complex systems, industry started a push towards increased automation, and fixed interval maintenance inspections and equipment refurbishment were introduced.
The thinking then was that every component had a 'lifespan': everything started out in pristine 'good' condition and then wore out over a period of time, so if you could intervene before the end of this wear-out cycle you could extend the overall lifespan of the equipment. Increasing amounts of Preventive Maintenance (PM) were put in place, requiring more man hours to service equipment. At the same time the developed economies saw 100% employment, as planned and fixed time maintenance became the norm.
Setting the gun sights on a B52 bomber, photo from 'The Red Barn' Museum of Flight at Boeing Airfield, Seattle, birthplace of Boeing
As the events of WW2 faded into history, the fledgling (excuse the pun!) airline industry was established, as freight and transport planes from demobilisation were pressed into service taking paying civilian passengers. We were at the cusp of Industry 2.0 and would soon have to think about the next big industrial and engineering step forward.
Some of these aircraft were designed for the primary purpose of transporting service personnel and military equipment to a theatre of war, and part of that deal was that they were expendable: catastrophic loss (being shot down!) was expected. Mechanisation and automation improved because of attrition: build more than get destroyed and you can win the war. Nobody expected these planes to last that long in war service. If the systems failed in flight during wartime then service personnel lost their lives; it was hard but accepted. If this happened with civilians in peacetime, though, it might become a little unpalatable.
It did happen. Month after month, civilian aircraft crashed, killing passengers and crew. It got to be such an issue in the USA that the then government decided action needed to be taken to support this new industry, or it might not take off (sorry).
Above: United Airlines fatal crash in Brooklyn killed 82 passengers
Crashes like the one above were not uncommon during the 1950s. Accidents like this, together with the unacceptable costs of maintenance, prompted the United States Department of Defense, in partnership with United Airlines, to start the first recorded Reliability Study, known as MSG1 (Maintenance Steering Group).
This looked critically at how maintenance was carried out on aircraft and investigated its effect on reliability, safety and the value added. The third iteration, MSG3 in 1978, was the game changer, and it coined the title Reliability-centred Maintenance (RCM) for the design and implementation of maintenance practices for aircraft.
The main finding was that almost all planned maintenance routines were worthless; in a lot of cases PMs risked introducing failure modes that were not present in the first place. Reliability was being negatively impacted by wasteful fixed time preventive maintenance plans and rebuilds.
Previously, MSG1 was developed as a program to produce the maintenance plans for the new B747, or Jumbo Jet. Following that, MSG2 developed the maintenance for the DC-10 and the Lockheed TriStar. MSG3 expanded upon those to provide the first framework for use on an aircraft's in-service maintenance requirements.
To demonstrate the impact that MSG2 had: the Douglas DC-8 had 339 assets that required overhaul, compared to the seven on the DC-10 serviced using the MSG2 document. The Boeing B-747 needed 66,000 labour hours of major structural inspections before the big one at 20,000 hours of service; the DC-8, a much smaller and less sophisticated aircraft, required 4 million standard maintenance labour hours to reach the same 20,000 hour service point!
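To put those figures side by side, here is a minimal sketch of the arithmetic in Python. It only uses the numbers quoted in the paragraph above; the helper name `reduction_factor` is my own, purely for illustration, and the result is a rough ratio rather than a like-for-like engineering comparison.

```python
def reduction_factor(before: float, after: float) -> float:
    """Return how many times larger `before` is than `after`."""
    if after <= 0:
        raise ValueError("after must be a positive number")
    return before / after


# Items requiring scheduled overhaul: DC-8 (339) versus DC-10 (7).
overhaul_factor = reduction_factor(339, 7)

# Labour hours to reach the 20,000 hour service point:
# DC-8 (4 million) versus B-747 (66,000).
labour_factor = reduction_factor(4_000_000, 66_000)

print(f"Overhaul items: roughly {overhaul_factor:.1f} times fewer on the DC-10")
print(f"Inspection labour: roughly {labour_factor:.1f} times fewer hours on the B-747")
```

On those numbers the DC-10 had roughly 48 times fewer overhaul items, and the B-747 needed roughly 60 times fewer labour hours, despite being the larger aircraft.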
Shown above is one of the front pages of MSG3, published in December 1978.
E. Stanley Nowlan and Howard Heap are rightly commended for their work on this very important document that took years to pull together.
Tom Matteson was also heavily involved but was never credited with authorship as he had just left United Airlines to set up his own consultancy only a few months before the document was published.

So if we are not in the business of maintaining or operating aircraft, what about us?
Well, John Moubray must have thought the same, as he decided this methodology could apply to things that were sat on the ground.
He set about working on RCM 2, shown here. This is my first edition copy, which is well thumbed and used as a prop when I train RCM, PdM, CBM and the other Reliability practices.
I still refer to it as 'The Reliability Maintenance bible'.
First published in 1991 and then a second edition (see title photo) in 2000, it has not been reprinted since.
It sets out the Reliability-centred Maintenance 2 methodology for all industrial equipment other than aerospace.
One of my spheres of interest is Predictive Maintenance (PdM), and I always teach in my training sessions that you have to take RCM 2 into account to get the true value from any PdM programme. It sets out a framework to target the predictive technologies and puts them into context.
Without it you can waste a lot of time and money on an uncontrolled PdM strategy.
If you would like to find out more about RCM 2 and how it can focus your maintenance operations please get in touch.
I can offer initial one day introductions and ongoing relationships with the right clients.
"RCM II" by John Moubray
If you are interested in learning about RCM, then this is one of the books you can read to understand the subject.
Reliability-centred maintenance is a process that determines what needs to be done to ensure that physical assets continue to deliver what their users want them to do.
RCM is now widely recognised as the most cost-effective path to developing a world class maintenance strategy.
I can vouch for this, as I have seen the impact from the front line: the transformation of a production facility from a chaotic and totally reactive loop, through a journey, to a proactive and predictive plant with world class figures.
It's a journey that takes buy-in and some hard work across all functions, but the payback is well worth it.
Getting the culture right is a key element of a sustainable program, so initially you may need to do work on that aspect.
In today's economic uncertainty it is becoming more important to protect finite resources. The top three in my book are Time, Money and Energy. If you get it right you can align all three, saving money by spending less time expending wasted energy.
An example of this is on-condition tasks, where you can displace fixed time maintenance (saving time), thereby reducing costs (money) whilst saving precious energy in human resources.
This labour resource can be used to further improve asset performance through engineering and process improvements.