1 Simple Strategy to Update Your Data Center Device Maintenance Processes

By: Mike Kalkas


Stop forgetting regular preventative device maintenance.

As an integrator, I want my customers to recognize where they’re losing money on energy management. Often, the losses I see aren’t due to wasted energy, but to poor processes.

“My energy monitoring needs are simple enough,” is a common statement by many data center operators. However, simple is not always so simple, especially when many decisions are made ‘because it’s always been done that way.’

Here is just one easy change you can make this year to increase your overall energy efficiency and decrease operational risk: automating device maintenance.


Understand your energy

Energy is more than just electricity. Energy can be transformed into many different types of potential, and used or wasted in various ways. These potentials can take the form of pressurized gas (e.g., steam or air stored in tanks), thermal energy (e.g., hot rooms or cold rooms), elevated weight (e.g., an elevator), live and stored electricity (e.g., power from an outlet or battery), and light (e.g., a light fixture).

All are derived from electricity, and all cost you money in some form. Understanding where that energy is used vs. where it is wasted will give you the opportunity to minimize costs.

Many institutions suffer huge losses due to broken and inefficient equipment.

As equipment is used and gets older:

  • Insulation breaks down, allowing small current losses
  • Connections become loose, causing higher resistance and wasting power in the form of heat
  • Bearings become dry, increasing motor resistance and causing the motor to draw more power for the same work
  • Leaks in air systems allow energy in the form of air to escape
  • Light fixture lamps burn out, leaving the fixture to futilely attempt to start the lamp


All these things add up to wasted energy that could have been prevented with proper maintenance.

Many managers resort to spreadsheets and paper documents for tracking and maintaining their maintenance schedules. They usually miss equipment because it dropped off the list or was forgotten during installation or changeover.


Automate yearly preventative device maintenance reminders

Preventative maintenance is sometimes not part of your plan or processes, and slips through the gaps. In some cases, there’s not a set maintenance plan for specific devices. Often, it’s because the device only needs maintenance if it’s broken, or has a long maintenance cycle (e.g., ATSs, RPPs, PDUs).

It’s easy to forget these devices, and because they’re only being looked at yearly (if at all), they automatically become a long-term critical risk.

Here’s a common scenario.

A CT goes bad on an RPP. Instead of notifying maintenance, a customer might move the breaker to another position and move on. A year later, the CT bar still hasn’t been replaced because it hasn’t caused enough of a problem to attract a high level of attention. A few years (and dozens of missed maintenance opportunities) later, several CTs in the bar have failed because the entire bar has problems.

Because the customer “juggled” redundant power connections on the servers, a fault caused several servers to switch over at once, and more than the local breaker tripped. This meant the equipment was no longer evenly used, mixing almost new and very used components and circuitry. For the duration of this situation, components ran at a higher temperature than normal, and the area used more cooling.

This maneuver might cost you more than extra cooling. It could cost you customers due to higher SLA price points.

Device maintenance: remembering is a matter of process
Usually, random and yearly bits of maintenance are left up to the electrical manager. Some managers are extremely organized, and add multiple reminder events to calendars, or even have a dedicated to-do list on a spreadsheet organized by month. But most are overworked, and once-a-year items that aren’t constantly on their mind get forgotten. Plenty of daily fires keep their minds occupied in the present.

There simply isn’t a backup or accountability aspect to their process that works faultlessly. Reminders can be haphazard, ignored, overlooked, or never set in the first place. In some cases, there isn’t even an indication of whether the preventative maintenance has been completed, other than the passing of the day when it should have been done.

So how do you ensure important but rare maintenance items aren’t forgotten?

Solve long-term preventative maintenance with automation
Many managers purchase a device maintenance tracking tool, expecting this to keep their processes on track. Unfortunately, this is just one more tool that technicians have to learn and keep track of, and in many cases it is not supported by IT.

So, why not build it into your EPMS or BMS instead?

Device maintenance alerting is one of the best benefits you could possibly build in for your technicians to help them maintain every piece of critical equipment in your data center. Your EPMS should already send out alerts to your technicians, and some devices (e.g., CAHUs) already have a built-in maintenance notification system, with points intended to be exposed so that they can be alerted on.

Most data centers don’t build maintenance alerting into their devices because of:

  • Complexity – they just want a simple summary alarm (because they tend to have many alarm storms)
  • Cost – the extra effort involved is perceived as a huge extra expense compared to using existing spreadsheets
  • Schedule – all additional features take extra time


But those reasons aren’t good enough. Missed maintenance can cause situations where you can’t keep your end of the SLA. If you are lucky enough to not run into that situation, you are still losing money from equipment not functioning at peak performance, wearing out parts that will need to be replaced sooner, and wasting energy that wouldn’t have been used if proper maintenance had been performed.

Send alerts for upcoming maintenance
In an object-oriented application, it’s simple to add a “LastMaintenanceDate” attribute to each device, along with a script that runs once a day, checks whether that date is more than a year old, and turns an alarm point on or off accordingly. Technicians can set a new date on the device through the HMI once maintenance has been completed, which turns the alarm off for that device.
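As a rough illustration of that daily check, here is a minimal Python sketch. It is not tied to any particular EPMS or BMS product: the `Device` class, the `maintenance_alarm` flag, and the two function names are hypothetical stand-ins for whatever objects, alarm points, and HMI actions your own platform provides, and the 365-day interval is an assumption you would tune per device type.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed interval: "older than a year" is treated as more than 365 days.
MAINTENANCE_INTERVAL = timedelta(days=365)


@dataclass
class Device:
    """Hypothetical device object with a LastMaintenanceDate-style attribute."""
    name: str
    last_maintenance_date: date
    maintenance_alarm: bool = False  # stand-in for an alarm point


def update_maintenance_alarm(device: Device, today: date) -> bool:
    """Daily check: turn the alarm on if maintenance is overdue, off otherwise."""
    overdue = (today - device.last_maintenance_date) > MAINTENANCE_INTERVAL
    device.maintenance_alarm = overdue
    return overdue


def record_maintenance(device: Device, completed_on: date) -> None:
    """What an HMI action might do: set the new date and clear the alarm."""
    device.last_maintenance_date = completed_on
    device.maintenance_alarm = False


# Example: run the check once a day from your scheduler of choice.
rpp = Device("RPP-1", last_maintenance_date=date(2023, 1, 1))
update_maintenance_alarm(rpp, today=date(2024, 6, 1))   # alarm turns on
record_maintenance(rpp, completed_on=date(2024, 6, 1))  # alarm turns off
```

In a real system the daily trigger and the alarm point would come from the platform’s own scheduler and alarm engine rather than from plain Python objects.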

If you have many of the same type of device, use a single alarm instance per area in your system rather than one alarm on each device. This keeps your alarms under control while still helping you keep up with your maintenance plan. If alarming is an issue, set up a visual reminder instead.
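One way to picture the per-area approach is a rollup that raises a single alarm per area and carries the list of overdue devices as detail. The sketch below is only an assumption about how that might look: the function name, the `(area, name, last_maintenance_date)` tuple layout, and the 365-day default are all hypothetical, not features of any specific EPMS or BMS.

```python
from datetime import date, timedelta


def area_maintenance_alarms(devices, today, interval=timedelta(days=365)):
    """Group overdue devices by area.

    `devices` is an iterable of (area, name, last_maintenance_date) tuples.
    Returns {area: [overdue device names]}; an area's single alarm would be
    active whenever its list is non-empty.
    """
    overdue_by_area = {}
    for area, name, last_maintenance_date in devices:
        overdue_by_area.setdefault(area, [])
        if today - last_maintenance_date > interval:
            overdue_by_area[area].append(name)
    return {area: sorted(names) for area, names in overdue_by_area.items()}


# Example with made-up devices in two areas.
devices = [
    ("Hall A", "PDU-1", date(2023, 1, 1)),
    ("Hall A", "PDU-2", date(2024, 5, 1)),
    ("Hall B", "ATS-1", date(2024, 4, 1)),
]
alarms = area_maintenance_alarms(devices, today=date(2024, 6, 1))
for area, names in alarms.items():
    state = "ALARM" if names else "ok"
    print(area, state, names)
```

With this shape, an operator sees one maintenance alarm per area instead of one per device, and can drill into the list to find out which devices need attention.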

Please note, while the maintenance manager is no longer charged with remembering maintenance dates, maintenance alarming only works if it’s strictly enforced by the maintenance manager and his technicians.

A word of caution: don’t overdo your alarms
It’s important not to create too much maintenance noise in your alarming system, or you may start a new problem: alarm apathy.

Because hundreds of alarms can occur each day, too much noise in the system will make alarms meaningless. Eventually, one of your critical yearly maintenance alarms will get buried. Or worse, an operator will tire of all the alarm notifications and hit the “acknowledge” button without reading them.


Good processes lead to better data center health

Without change, critical maintenance will continue to be missed. Eventually, a problem will occur that could cause downtime, broken SLAs…and could also cause heads to roll. By fixing the problem of forgotten maintenance through automation, you’re one step closer to sanity, and one step further from unexpected device errors and hidden operational costs.


Michael Kalkas was an Application Engineer at Affinity Energy from 2012 to 2017, with responsibility for developing, deploying, and maintaining integrated SCADA (Supervisory Control and Data Acquisition) systems. Some of his daily responsibilities were automated process software design, hardware footprint bolstering, system functionality testing, and system monitoring.

With over 25 years of electrical engineering experience, Michael previously worked for GE Appliance and Lighting, Manpower Inc., and the U.S. Army. Prior to joining Affinity Energy, Michael spent 11 years at Cooper Lighting as a Senior Lab Technician, where he performed photometric testing/UL testing against various fixture designs and lamp sources, maintained a UL Lab Testing SCADA system, and developed custom tools and databases for data manipulation and filtering.

Michael received a B.S. in Computer Science from South University and an AAS in Electrical Engineering Technology from Blue Ridge Community College, and is licensed as a Microsoft Certified Professional and Microsoft Technology Associate.