We're all fighting the silent killer of limited IT budgets and tight application development timelines: technical debt. And it’s not just technical debt itself, but also a general attitude in enterprise IT today that maintenance is a second-class citizen.
For many enterprise IT professionals, technical debt starts not with the building of a new application or infrastructure system. It starts with an attitude of “uncoolness” for all things maintenance.
Yet humans have known that maintenance is an essential aspect of life for millennia. Consider that Hindu cosmology has three main gods: Brahma the Creator, Vishnu the Preserver, and Shiva the Destroyer.
Think of it like this: You can either pay now for preventative maintenance on your Ferrari (metaphorically speaking, of course, unless you're really living the dream), or you can wait until the engine explodes on the Autobahn, leaving you stranded and explaining to the CEO why Q4 results are delayed. Trust me, the latter is significantly more expensive.
Technical debt, that creeping collection of quick fixes, rushed implementations, poor architectural choices, and "we'll-fix-it-later" promises, is a monster. It always starts innocently enough: a little hack here, a slightly clunky workaround there, a quick decision from the top to use some new, untested technology because it’ll make them look “cutting edge” and “in charge.” But like a financial debt with high compound interest, it grows exponentially, sucking the lifeblood out of your team, draining your budget, and stifling true innovation that adds value to the business.
Consider simple human biology. We spend literally a third of our lives sleeping so our brains can function properly. The entire sleep cycle is maintenance!
Becoming a Technical Debt Slayer
In my opinion, our hyperactive form of capitalism excels at innovation and creation but fails at maintenance and paying down technical debt. But for most lives, maintenance matters more. Not only because there’s a need for it but also because there’s a certain nobility in taking care of what we’ve already created. And maybe we shouldn’t look at maintenance as the enemy of innovation, nor should we look at technical debt as something that is best solved by upgrading to the newest version of a coding framework or switching to a nimbler and newer SaaS software because it has “GenAI Included!” in the marketing verbiage.
The good news? You can fight back as the Master of Maintenance. You can be a Technical Debt Slayer!
Here are five strategies to strengthen your maintenance routines and reduce technical debt, whether in applications or IT infrastructure:
1. Code Reviews: Your First Line of Defense
Think of code reviews as peer pressure for excellence. Alternatively, we can think about code reviews as a formalized and planned form of mentoring. Before any code hits production, make sure it's been thoroughly vetted by another set of eyes and, if possible, make sure that the more experienced staff review the work of the less experienced staff. This isn't about catching typos (although that's a bonus); it's about ensuring systems thinking, code quality, and adherence to standards, and about identifying potential future headaches.
2. Continuous Integration/Continuous Deployment (CI/CD): Automate Your Way to Stability
CI/CD isn't just a buzzword; it's a powerful weapon against technical debt and a bulwark for system-driven thinking. By automating the build, testing, and deployment (and rollback) process, you can catch errors early and often, before they morph into monstrous bugs that require herculean efforts to squash. Furthermore, frequent, small deployments are infinitely less risky than infrequent, monolithic releases that are riddled with potential failure points.
- Tool Tip: Invest in a robust CI/CD pipeline tool like Jenkins, GitLab CI, or Azure DevOps. Consider it an insurance policy against catastrophic deployments and late-night emergency fixes.
- Best Practice - Automated Testing: Use automated testing tools and frameworks to run tests on your codebase continuously. This ensures that new changes do not introduce bugs and helps maintain the stability of your applications.
- Tip for DBAs: Data management pros, like DBAs and Data Engineers, should make code reviews a regular part of their processes and CI/CD a regular part of their methodology. In addition, Quality Gates are extremely important in CI/CD pipelines to ensure that no poorly performing SQL code goes into production.
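To make the Quality Gate idea from the DBA tip a little more concrete, here is a minimal sketch in Python. It assumes an earlier pipeline step has already timed each query in a test environment. The `QueryTiming` shape, the function names, the query names, and the 200 ms budget are all hypothetical and chosen only for illustration; they are not taken from any particular CI/CD tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTiming:
    """One timed query from a pre-production test run (hypothetical shape)."""
    name: str
    duration_ms: float


def failing_queries(timings, budget_ms=200.0):
    """Return the names of queries slower than the budget, in input order."""
    return [t.name for t in timings if t.duration_ms > budget_ms]


def gate_passes(timings, budget_ms=200.0):
    """The gate passes only when no query exceeds the budget."""
    return not failing_queries(timings, budget_ms)


# Illustrative measurements; a real pipeline would collect these itself.
sample = [
    QueryTiming("get_customer_by_id", 12.5),
    QueryTiming("monthly_sales_rollup", 950.0),
]

slow = failing_queries(sample)
if slow:
    # A real pipeline step would typically exit with a non-zero status here
    # so that the CI/CD tool marks the stage as failed.
    print("Quality gate failed:", ", ".join(slow))
else:
    print("Quality gate passed")
```

The design choice worth noting is that the gate reports every offending query rather than stopping at the first one, so a developer can fix them all in a single pass.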
3. Embrace Systems Thinking: Risk Versus Refactoring
On the one hand, it’s very important to remember that we inject risk into our enterprise IT systems every time we make a significant change to the underpinnings of enterprise IT. Back in the 1980s and ’90s, there was an old joke that went, “Never install a Microsoft product until version 3 has been released.” Even today, there’s a kernel of truth to the idea that new and unproven technologies are risky.
I always feel a cold, creeping fear when I hear a leader of Dev or Data teams say, “We can’t wait to work with the newest, shiniest thingamabob!” The “ooh, shiny” attitude has doomed more projects than I care to count. Also remember that the vendor landscape is littered with v1 products that never saw a v2. This risk is even more obvious when the new product offers an entirely novel paradigm, such as the raft of new products and services offering GenAI capabilities. Take my advice: let someone else find where all the bugs and unfulfilled promises lurk.
Compare the risks of introducing new and unproven technologies into important enterprise IT ecosystems to the fact that ancient COBOL- and FORTRAN-based applications of enormous societal importance are still running today. I’m thinking about the Social Security safety net, the IRS income tax processing systems, and even systems I helped build in my NASA and DoD days. Those systems are very old, but because maintenance is considered a tier-one priority, they continue to run and provide services to hundreds of millions of users today.
On the other hand, refactoring can be thought of as preventative surgery. It's the process of improving the internal structure of your code without changing its external behavior. An example of refactoring would be to switch an old app from original ASP code to the latest ASP.NET. Refactoring is all about cleaning up messy code, simplifying complex logic, and improving maintainability. (See what I did there?) Don't wait until the codebase is a tangled mess of spaghetti code before you start refactoring. Schedule regular refactoring sprints as part of your development cycle.
- Key takeaway: Small, incremental refactoring is far less painful (and less expensive) than a massive rewrite project. But you must balance the inherent risks of introducing new code, applications, and architectures against the value provided by the newly refactored code.
- The wisdom of the elders: Moving forward with a novel technology is inherently risky. Even the vocabulary suggests violence, with terms like “cutting-edge” and “bleeding-edge.” Let other, more reckless enterprises roll the dice. I prefer to let novel technologies mature until I feel confident that they are neither a flash in the pan nor too problematic to last.
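Here is a small, hedged illustration of what "improving the internal structure without changing external behavior" can look like in practice. The shipping-cost function, its rates, and its thresholds are invented for this sketch. The point is only that the "before" and "after" versions return identical results, which a quick comparison can confirm before the refactored code ships.

```python
def shipping_cost_before(weight_kg, express):
    # Original style: nested branches and repeated magic numbers.
    if express == True:
        if weight_kg > 10:
            return weight_kg * 2.5 + 15
        else:
            return weight_kg * 2.5 + 5
    else:
        if weight_kg > 10:
            return weight_kg * 1.5 + 15
        else:
            return weight_kg * 1.5 + 5


# Refactored style: the same rules, with the numbers named once.
RATE_PER_KG = {"standard": 1.5, "express": 2.5}
HEAVY_THRESHOLD_KG = 10
HEAVY_SURCHARGE = 15
BASE_SURCHARGE = 5


def shipping_cost_after(weight_kg, express):
    rate = RATE_PER_KG["express" if express else "standard"]
    surcharge = HEAVY_SURCHARGE if weight_kg > HEAVY_THRESHOLD_KG else BASE_SURCHARGE
    return weight_kg * rate + surcharge


def behavior_matches(cases):
    """Check that both versions agree on every (weight_kg, express) case."""
    return all(
        shipping_cost_before(w, e) == shipping_cost_after(w, e) for w, e in cases
    )


print(behavior_matches([(1, False), (10, True), (10.5, True), (25, False)]))
```

A check like `behavior_matches` is one inexpensive way to manage the risk side of the trade-off: the cases pin down the existing behavior, so the cleanup can proceed in small steps.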
4. Documentation: The Gift That Keeps On Giving
Remember that brilliant piece of code you wrote at 3 a.m., fueled by caffeine and the desperation of a tight deadline? Yeah, neither do I. I have plenty of examples where I was troubleshooting some code and said, “Who wrote this crap!?” only to find out that I wrote it. Embarrassing! Proper documentation is essential for understanding and maintaining your systems.
It's not just about commenting on code; it's about creating architectural diagrams, documenting API endpoints, and outlining the overall system design. In many cases, it also helps to document the rationale for important decisions made during the development process. If your in-house apps always use JSON, but one outlier application uses XML to solve the same problems, then I want to know why. It’ll help quite a lot in the future, especially if any personnel turnover occurs during the application's lifetime.
- Tool Tip: Incorporate documentation into your development workflow. Make it a mandatory part of your Dev sprints. Consider using tools like Sphinx, popular in the Python community, or MkDocs to create beautiful and easily maintainable documentation. Future-you will thank you.
- No Excuses: If nothing else, standardize on using a good transcription feature in your team meeting products, like Microsoft Teams, so that you can capture team conversations about your Dev projects. Schedule an extra meeting to just talk about maintenance and documentation. Future-you will thank you, again.
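As one way to capture the rationale for an outlier decision right next to the code, here is a minimal sketch of a Python docstring written with the reStructuredText field lists that Sphinx can render. The function, the feed format, and the stated reason for using XML are all hypothetical; they simply mirror the JSON-versus-XML example above.

```python
import xml.etree.ElementTree as ET


def parse_order_feed(xml_text):
    """Parse the (hypothetical) partner order feed into a list of dicts.

    Design rationale: our other services exchange JSON, but this feed
    arrives as XML because the upstream partner system cannot emit JSON.
    Revisit this decision if the partner ever offers a JSON endpoint.

    :param xml_text: XML document containing ``<order id="..." total="..."/>`` elements.
    :returns: A list of ``{"id": str, "total": float}`` dictionaries.
    """
    root = ET.fromstring(xml_text)
    return [
        {"id": order.get("id"), "total": float(order.get("total"))}
        for order in root.iter("order")
    ]


print(parse_order_feed('<orders><order id="A1" total="9.50"/></orders>'))
```

Keeping the "why" in the docstring means it travels with the code through reviews and turnover, and it shows up automatically in generated documentation.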
5. IT Infrastructure and Application Monitoring: Know Your Vital Signs
You wouldn't drive your car with the check engine light on, would you? (Okay, maybe some of you would...but you shouldn't.) The same principle applies to your IT systems, like networks, storage, compute, and databases, which all exist to support your end-user applications. Implement comprehensive monitoring solutions that provide real-time visibility into performance, resource consumption, and potential bottlenecks. Early warning signs are your best friends in the fight against maintenance issues and technical debt.
- Practical advice: Observability tools like DPA, SQL Sentry®, and SolarWinds® Observability can help you monitor everything from server performance to database performance to application latency. Setting up automated alerts and responses allows you to proactively address issues before they escalate into major problems.
- Vendor management: Many enterprise IT teams tend to think about vendor-provided IT tools and products as black boxes that we can’t change. This might be implied in your licensing agreements, but don’t think that you’re powerless. When you have a good monitoring system in place, you can empirically prove when an ISV’s application isn’t delivering good performance or is experiencing an anomaly. Use the proof provided by your monitoring systems to insist on added support or even to demand a fix.
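To illustrate the kind of rule an automated alert might apply, here is a minimal sketch in Python. It assumes you already have a series of metric readings, such as CPU percentages, and it flags only a sustained breach so that a single spike does not page anyone. The function name, the 90 percent threshold, and the three-reading window are hypothetical, and this is not how any specific monitoring product works.

```python
def sustained_breach(samples, threshold, min_consecutive=3):
    """Return True if at least `min_consecutive` readings in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a breach; reset it on a healthy reading.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


cpu_percent = [50, 95, 96, 97, 60]  # illustrative readings only
if sustained_breach(cpu_percent, threshold=90):
    print("ALERT: CPU above 90% for 3 consecutive readings")
```

Requiring consecutive breaches is a common way to trade a little detection delay for far fewer false alarms. The same recorded readings also serve as the empirical evidence to bring to a vendor.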
None of the strategies that I’ve suggested rules out other, more narrowly focused practices that your enterprise IT teams should follow. Here are some additional techniques and processes to strengthen your maintenance routines and reduce technical debt, whether that be in applications or IT infrastructure:
- Regular System Audits: Conduct regular audits of your enterprise IT systems to identify vulnerabilities, outdated software, and hardware issues. This proactive approach helps in maintaining system integrity and performance.
- Patch Management: Crucially important, especially security patching! Implement a robust patch management process to ensure all software and systems are up to date with the latest security patches and updates. This helps reduce the risk of security breaches and improves system stability.
- Capacity Planning: Regularly assess your infrastructure's capacity to handle current and future workloads. Do you know when the busiest times of the day, the month, the quarter, and the year are for your most business-critical applications and databases? If not, how can you expect to effectively expand capacity without wasting time and money? Capacity planning involves monitoring resource consumption and planning for upgrades or scaling to prevent performance bottlenecks.
- Business Continuity and Disaster Recovery Planning: Develop and maintain a comprehensive disaster recovery plan. Do this not only for simple server-down scenarios but also for long-term outages. This includes regular backups and preventative maintenance routines on your database, recovery drills for both full restores and partial restores, and learning the time it takes your team to achieve full remediation. Only then can you ensure that critical data can be restored and used quickly in case of an emergency.
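For the capacity planning question above about knowing your busiest times, here is a minimal sketch of how the busiest hour of the day could be derived from monitoring data. It assumes the readings are available as pairs of ISO 8601 timestamp and load value. The function names and the sample numbers are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def average_load_by_hour(samples):
    """Average load per hour of day from (ISO timestamp, load) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of loads, count]
    for stamp, load in samples:
        hour = datetime.fromisoformat(stamp).hour
        totals[hour][0] += load
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in totals.items()}


def busiest_hour(samples):
    """Return the hour of day (0-23) with the highest average load."""
    averages = average_load_by_hour(samples)
    return max(averages, key=averages.get)


readings = [  # illustrative data only
    ("2024-03-01T09:15:00", 40.0),
    ("2024-03-01T14:05:00", 85.0),
    ("2024-03-02T09:45:00", 50.0),
    ("2024-03-02T14:30:00", 95.0),
]
print(busiest_hour(readings))
```

The same grouping idea extends to day of month or quarter by swapping the key taken from the timestamp, which is one simple way to answer the "when are we busiest?" question with data rather than intuition.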
These and other techniques can help ensure that your enterprise IT infrastructure and applications function well long into the future.
As we wrap up, I encourage you to pay more attention to maintenance and maintainers. Believe it or not, humans have been thinking about maintainability and sustainability for many millennia. Hindu cosmology has three main gods: Brahma the creator, Vishnu the preserver, and Shiva the destroyer. Many Biblical passages refer to the Hebrew God as a shepherd, a preserver, and protector.
Our society places value first and foremost on glittery new things, conspicuous consumer culture, disposable products, rent over ownership (remember when we used to own our music? Now I rent my music from Spotify), and an emphasis on always cutting labor costs. Let’s see if we can nudge our IT teams towards greater respect for the people whose jobs are to support, stabilize, and maintain. They’re not superstars. They’re just grinding it out from day to day, fighting the good fight.
The Bottom Line
Technical debt is a real and present danger. Maintenance is a crucial but often overlooked aspect of enterprise IT. Ignoring it is like ignoring the overdraft warnings on your bank statement. Eventually, the interest payments will cripple your budget and prevent you from investing in new and innovative initiatives. By adopting these proactive strategies, you can build a sustainable and maintainable IT infrastructure that will support your business goals for years to come. Don't let technical debt be your legacy. Be a master of maintenance! Take up the sword and become a Technical Debt Slayer! Your sanity (and your budget) will thank you for it.