What's your downtime monitoring strategy?

3rd April 2025

Recording downtime is one of the first steps a company takes to improve its OEE. After all, you need to be able to measure what you're working to improve. Too often we see companies that struggle to crunch their data or understand their plant's performance. Data that is too granular, unstructured, or vague can make it difficult to see where effort should best be focused. This article covers the selection of downtime intervals, cause codes, categories and data collection methods. Keep in mind that there is no one-size-fits-all answer, as most of these decisions are dependent on the organisation or production line.

Minimum downtime interval

The minimum downtime interval is the shortest duration where downtime will be recorded. At first glance, it's tempting to record all stoppages of a line as downtime. However, consider a line that stops for 5 seconds, runs for a minute and stops for another 5 seconds. What happens when the operator is asked to record that downtime? If it's being recorded in a spreadsheet or on paper it might take a minute to record all the details. It won't be long before these short downtime events disappear - but only because operators stop recording these events.

For manually recorded downtime, a minimum downtime interval between 5 and 10 minutes works best. For automated systems where the operator needs to enter details for each event, this can be reduced to around 2 minutes. Automated systems that can hide short events from operators, or assign causes automatically, can be set as low as 5 seconds.
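As a rough illustration of how the minimum downtime interval changes what gets recorded, the sketch below filters a list of stoppages against three thresholds taken from the guidance above. The `Stoppage` class, its field names and the sample data are assumptions made for the example, not part of any particular system.

```python
from dataclasses import dataclass


@dataclass
class Stoppage:
    start_s: float     # seconds since shift start (hypothetical field)
    duration_s: float  # length of the stoppage in seconds


def recordable(stoppages, min_interval_s):
    """Keep only stoppages at least as long as the minimum downtime interval."""
    return [s for s in stoppages if s.duration_s >= min_interval_s]


# Made-up shift: two 5-second stops, one 3-minute stop, one 12-minute stop.
stoppages = [
    Stoppage(0, 5),
    Stoppage(65, 5),
    Stoppage(600, 180),
    Stoppage(2400, 720),
]

thresholds = [
    ("manual entry", 5 * 60),
    ("automated, operator enters details", 2 * 60),
    ("automated, causes assigned automatically", 5),
]

for label, threshold_s in thresholds:
    kept = recordable(stoppages, threshold_s)
    total_min = sum(s.duration_s for s in kept) / 60
    print(f"{label}: {len(kept)} events, {total_min:.1f} min recorded")
```

With this sample data the manual threshold keeps one event, the 2-minute threshold keeps two, and the 5-second threshold keeps all four. The two short stops add very little recorded time but double the number of entries.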

If your line suffers from a high frequency of downtime events, consider increasing the minimum downtime interval. Conversely, if the line has only a few downtime events per hour, the interval could be reduced.

Essentially the goal is to achieve a balance of data quality and operability of the downtime recording system. No downtime recording system works without the support of operators, so choose a minimum downtime interval that gets the important data, but doesn't overwhelm them with data entry.

Regardless of what minimum downtime interval you choose, you should review it after 3-6 months and adjust it as required.

Downtime categories and causes

Downtime categories are the high-level groupings of downtime events. These are usually fairly simple to define: you want a category for each work group that will resolve an issue. A good starting point is Mechanical, Electrical and Operations. You can also add categories such as Admin, which can be useful if you want to record things like mandatory meetings and training separately from the Operations category. Operations can then focus on more immediate impacts to the line, such as waiting for, or running out of, material.

It's typically better to have a short list of categories, and it works best if a category can be assigned to an individual or a team for improvement. For example, the electrical team leader can just look at the top three causes of downtime in the electrical category. But if there are categories like 'control signals', 'sensors', 'motive power', 'PLC', etc., it can be unclear what needs to be prioritised.
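The "top three causes in a category" view can be sketched in a few lines. The event layout (dictionaries with `category`, `cause` and `minutes` keys) and the sample figures are assumptions for illustration only.

```python
from collections import defaultdict


def top_causes(events, category, n=3):
    """Return the n causes with the most total downtime in one category."""
    totals = defaultdict(float)
    for event in events:
        if event["category"] == category:
            totals[event["cause"]] += event["minutes"]
    # Sort by total minutes, largest first, and keep the first n.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up events for the example.
events = [
    {"category": "Electrical", "cause": "sensor dirty", "minutes": 12},
    {"category": "Electrical", "cause": "sensor failed", "minutes": 45},
    {"category": "Electrical", "cause": "sensor dirty", "minutes": 8},
    {"category": "Electrical", "cause": "sensor alignment", "minutes": 6},
    {"category": "Electrical", "cause": "sensor blocked", "minutes": 3},
    {"category": "Mechanical", "cause": "chain off", "minutes": 30},
]

for cause, minutes in top_causes(events, "Electrical"):
    print(f"{cause}: {minutes:.0f} min")
```

Because the list is built per category, each team leader gets a short, ranked list that belongs to their own work group.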

Causes are the next level of granularity, where there starts to be enough detail to really focus improvement initiatives. The list of causes should be highly informed by your production line. For example, most production lines use photoelectric sensors to manage product flow, so codes like 'sensor failed', 'sensor dirty', 'sensor blocked' and 'sensor alignment' are common across industries. On the other hand, in a sawmill, product is conveyed primarily by chains and rollers, so codes like 'chain tension', 'chain off' and 'roller bearing' would be common, but these may not be relevant for a bottling line in a food factory.

Around 15 causes per category is a good starting point, along with reviewing them every few months until they are well established. It's also important to have an 'other' option so that you can review it and add new causes when something particular starts to become common. However, be careful when adding or removing causes, because you want to build a long-term trend of downtime. Changing causes can make this long-term data less accurate.
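One way to support that periodic review is to track how much of a category's downtime is being recorded against 'other'. The sketch below is only an illustration: the event layout and the sample data are assumptions, and what counts as "too much" is a judgment for each site.

```python
def other_share(events, category):
    """Fraction of a category's downtime minutes recorded against 'other'."""
    total = sum(e["minutes"] for e in events if e["category"] == category)
    other = sum(
        e["minutes"]
        for e in events
        if e["category"] == category and e["cause"] == "other"
    )
    # Avoid dividing by zero when the category has no downtime at all.
    return other / total if total else 0.0


# Made-up events for the example.
events = [
    {"category": "Mechanical", "cause": "chain off", "minutes": 30},
    {"category": "Mechanical", "cause": "other", "minutes": 10},
    {"category": "Mechanical", "cause": "other", "minutes": 10},
    {"category": "Mechanical", "cause": "roller bearing", "minutes": 50},
]

share = other_share(events, "Mechanical")
print(f"Mechanical: {share:.0%} of downtime is recorded as 'other'")
```

A rising share suggests that a recurring problem has no cause code yet, which is the signal to read the comments on those events before adding a new cause.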

Downtime locations / equipment reference

It's important to identify the location of the downtime. The list of locations can be taken from your ERP or CMMS systems if available. Otherwise you'll need to develop one. This list should be hierarchical, in the form:

Line > Machine > Component > Part

For example:

Packing Line > Palletiser > Robot > Grabber Closed Prox

This would be for a proximity sensor detecting the palletiser robot end effector has closed (grabbed a carton). It is preferable to use the real name of the sensor in this situation (something like PX290) to keep data consistent across plant displays, drawings, PLC code and labels.

However, the big question is how deep to make this hierarchy. Again, the answer depends. Line Interpreter allows you to configure and select data in a hierarchical way, expanding and collapsing locations, so defining locations all the way to the part level is possible while keeping the system user-friendly. Our location select widget also has search to make it even easier to find locations when entering downtime.

When recording downtime events by hand or using Excel, it's best to stop at the machine or component level. The last thing you want is an Excel dropdown bigger than the screen, requiring scrolling, or multiple printed pages of downtime locations at the operator station.
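The Line > Machine > Component > Part hierarchy, and the idea of stopping at a shallower level for manual entry, can be sketched as follows. The function names are hypothetical and chosen for this example; only the level names and the sample path come from the text above.

```python
LEVELS = ("Line", "Machine", "Component", "Part")


def parse_location(path):
    """Split 'Line > Machine > Component > Part' into a tuple of names."""
    return tuple(part.strip() for part in path.split(">"))


def truncate(location, level):
    """Cut a location tuple at the given level name, inclusive."""
    depth = LEVELS.index(level) + 1
    return location[:depth]


def format_location(location):
    """Join a location tuple back into the 'A > B > C' display form."""
    return " > ".join(location)


full = parse_location("Packing Line > Palletiser > Robot > Grabber Closed Prox")

# Full depth, as an automated system with a searchable tree might store it.
print(format_location(full))

# Shallower options that keep a paper sheet or Excel dropdown short.
print(format_location(truncate(full, "Component")))
print(format_location(truncate(full, "Machine")))
```

Storing the full path and truncating it for display means the same list can serve both a short dropdown and a more detailed report.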

Operator comments

Always give the ability for commentary to be added to downtime events. For short events, this isn't too important, but if you have an event longer than 15 minutes, it should have some sort of comment. These comments are especially important on events with a cause of 'other'.
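The comment guidance above can be expressed as a simple check. This is a minimal sketch: the function name and parameters are assumptions, and the 15-minute default comes straight from the rule of thumb in the text.

```python
def needs_comment(duration_min, cause, comment, long_event_min=15):
    """True when an event should have a comment but does not have one."""
    has_comment = bool(comment and comment.strip())
    # A comment is expected for long events and for anything coded 'other'.
    required = duration_min > long_event_min or cause.lower() == "other"
    return required and not has_comment


print(needs_comment(20, "chain off", ""))             # long event, no comment
print(needs_comment(3, "other", None))                # 'other' cause, no comment
print(needs_comment(3, "sensor dirty", ""))           # short, known cause
print(needs_comment(40, "other", "gearbox seized"))   # comment already given
```

The first two calls return `True` and the last two return `False`. A check like this could drive a prompt to the operator or a list of events for the supervisor to follow up.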

Collection method

Last up, collection method. Are you going to collect data with hand-written notes, on a computer using Microsoft Excel or Access, or using a purpose-built downtime recording system, such as Line Interpreter? Most companies start downtime recording with hand-written sheets before quickly moving to Microsoft Excel, so here we'll compare manual downtime recording in Excel with a purpose-built downtime recording system.

Cost - Manual downtime entry is cheaper. It's likely there's already a PC at the operator station that can be used to enter downtime into Excel. A purpose-built downtime recording system will have a cost, either for software or hardware, or both. In the case of Line Interpreter, we deploy and manage a group of servers in the cloud for each of our customers. We maintain backups, deploy security patches, provide support, and add new features. This costs more than a local Excel spreadsheet, but it means our customers can focus on improving their performance, rather than the downtime monitoring system.

Data Accuracy - While it's possible to get good results out of a manual downtime recording system, without close management, data accuracy can fall significantly. It's not uncommon for up to 50% of downtime events to go unrecorded when relying on manual data entry. With automated downtime recording systems, downtime events won't be missed, and operators are prompted to enter details for each event.

Reporting - Reporting is another area where automated systems shine. With manual systems, generating reports can be time-consuming and error-prone, often requiring significant effort to collate and analyse data. Automated systems, on the other hand, can generate reports with minimal effort. Line Interpreter allows users to create custom reports that can be emailed to stakeholders automatically.

Conclusion

The advice on each aspect of a downtime monitoring strategy changes depending on the data collection method being used, and is summarised in the table below. Essentially, with a manual downtime recording system, compromises are required to keep the system usable. With an automated downtime recording system, there is no need to compromise. Automated systems collect the most accurate data and can analyse it in real time, displaying it on large screens around the production area so operators just need to look up to see if the shift is running to target. These systems provide significant advantages to operations and engineering teams. For more information, just look at our features. With a purpose-built downtime recording system, you'll spend less time wrangling spreadsheets and more time fixing problems.

| Decision | Advice |
| --- | --- |
| Minimum downtime interval | Balance operability and data quality. |
| Categories | One per work group resolving issues (but at a minimum have mechanical, electrical, operations). |
| Causes | Informed by the line structure. Start with 15 per category, periodically review, but don't change unless there is a real need, as it will affect long-term trends. |
| Locations | Machine or component level for manual entry, as deep as you want with an automated system. |
| Operator comments | Always leave space for them. |
| Collection method | Automated solutions are better, but more expensive. |

© Copyright 2025 Line Interpreter