I've been a proponent of putting engineers on call for their software. You learn so much being on call that you can't learn any other way. But you can't just have people on call without considering the entire system they are in. Without the ability to address issues found, even the most stalwart engineer will burn out.
Right now, I'm on call 24/7 for my parents, both of whom are cognitively and physically impaired. They can't use email, can barely use computers and thus use the phone. Almost every phone call I get is related to them in some way. Sometimes the calls are mis-dials. Sometimes it's serious and one of them is on the way to the emergency room. But usually, it's that the TV or computer isn't working how they expect.
But, no matter what, not only can I not stop getting these calls (they are my parents, after all), I also can't address the underlying issues that spawn them. Advanced age can be extremely difficult.
To state the obvious, this is extremely stressful. But it's not just because it's my parents and the issues are serious. It's mostly what I stated above…there's no way to make it stop or make it better. It just keeps coming and coming. This is how burnout happens.
I'll talk about coping mechanisms in a bit, but let's bring this back to software.
Putting engineers on call isn't just connecting their phone, Slack, and email to PagerDuty. They are part of a complex system that can be made significantly worse by not being careful. Often, engineers are notified of errors, expected to fix them, and then get right back to adding features.
This means the team must cope with never-ending and unpredictable interruptions whose underlying cause they may not even be allowed to investigate, much less fix. I've been in situations where you couldn't even ask to address this stuff, as there was no one who could say "yes". This is hugely demoralizing and, among other things, leads to turnover. Engineers already have a better chance of making more money by quitting their job, and if they can escape an unhealthy on-call, that's even more reason.
It can be difficult to address (or avoid creating) this situation:
Here are some expectations to set with the team, managers, and anyone outside engineering that, when used together, can form the basis of a healthy on-call culture:
The engineers will usually get this, and they will trust their leadership if a) the training is made real and done well, and b) solutions to underlying problems are given space to be worked on.
Everyone else will benefit from some context about why this is important at all
It's important to couch all this as information, not a request for permission. You don't ask the Chief Product Officer if you can write tests. You don't ask the CEO if you should use good variable names. You don't ask the head of HR if you can deploy from your continuous integration system. These are just part of how software is made. On-call—healthy on-call—is no different.
Here is what I do. I turn my phone off during hours that I need to focus. I refuse to take calls from my parents during that time. It's the only way I can find uninterrupted time to work. And it's risky, because something serious could happen during the times my phone is off. I'm just betting that that won't happen.
This is what engineers will do, especially those who don't have the luxury of finding a new job. It's hard to blame them, as they must protect themselves before helping others. I'm no good to my parents if I'm completely burned out and psychologically unprepared to help them when it's truly needed.
Fortunately, you don't need to be a neurologist or eldercare specialist to address on-call problems on an engineering team. You just need to account for the entire system, set the right expectations, and hold everyone to them.
Unless otherwise noted, my emails were written entirely by me without any assistance from a generative AI.