Recent industry findings underscore this gap between ambition and readiness. In a large global survey of database professionals, 65% report using AI-assisted tuning, yet 75% still experience alert fatigue, and 38% have considered leaving their roles because of it. On average, DBAs spend roughly 27 hours per week on reactive or routine tasks, while only about 40% say their monitoring environment is fully unified.
Put simply: AI is arriving, but the foundations that ensure you get the most out of AI, from unified full-stack visibility to clean telemetry, are far behind.
This article lays out a pragmatic path to close that gap without turning your week into a transformation project.
AI Reveals the Quality of Your Foundations
AI thrives on patterns and context, but database estates today are rarely simple. The vast majority of DBA teams manage multiple engines across on-premises data centers, the cloud, or a hybrid of the two. Common impediments for data management teams include overlapping toolchains, inconsistent telemetry, and limited protected time for engineering. Three constraints recur:
- Fragmented visibility: When metrics and events are split across tools and teams, cause-and-effect relationships stay hidden. Analysts burn time stitching evidence together by hand, correlating symptoms of poor performance with internal signals just to identify the root cause.
- Noisy signals: Threshold-only alerts fire constantly, so teams get paged for the expected rather than the significant. Many alerting systems become the proverbial “boy who cried wolf”: the volume of low- and no-value alerts trains teams to ignore the important ones, and firefighting displaces optimization.
- Reactive workflows: Without shared processes, methods, and guardrails, people fall back on intuition, which prolongs mean time to resolution (MTTR), produces spurious action plans, and invites repeat incidents.
In that context, AI can surface “interesting” patterns, but trust and actionability suffer. By contrast, teams operating in unified environments report 72% faster diagnosis and significantly more time (≈60%) for strategic work, because the inputs are clean and the context is complete. The lesson is straightforward: get the foundations right, then let AI multiply the gains.
Foundations That Unlock AI’s Potential
1) Monitor: single pane of glass view
Start by standardizing how you see performance across engines and environments. A consistent first-look lens from a single pane of glass reduces cognitive load and context switching for the DBA team. For relational databases, a proven approach is a wait-statistics-centric view: wait statistics answer the same question on every major relational platform. What is this workload waiting on? Is it CPU, I/O, networking, bad SQL, locks, or something else?
With that as the starting point, DBA teams then overlay OS host and storage metrics, virtualization indicators, and (where available) application traces and logs to build a complete and unified picture of cause and effect.
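As a rough illustration of the wait-statistics lens, the sketch below ranks wait categories by how much time they accrued between two samples of cumulative counters. Most engines expose counters that only ever increase, so the delta between samples reflects recent load; the category names and numbers here are hypothetical.

```python
def wait_deltas(sample_t0, sample_t1):
    """Return wait categories ranked by time accrued between two samples."""
    deltas = {
        category: sample_t1.get(category, 0) - sample_t0.get(category, 0)
        for category in sample_t1
    }
    # Sort descending so the dominant wait category comes first.
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative cumulative wait times in milliseconds (made-up numbers).
t0 = {"CPU": 120_000, "IO": 340_000, "LOCK": 15_000, "NETWORK": 8_000}
t1 = {"CPU": 125_000, "IO": 402_000, "LOCK": 15_500, "NETWORK": 8_200}

for category, ms in wait_deltas(t0, t1):
    print(f"{category}: {ms} ms")  # I/O dominates this interval
```

The same shape of question ("what grew the most since the last sample?") applies whether the counters come from SQL Server, PostgreSQL, Oracle, or MySQL, which is what makes the lens portable across engines.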
Why it matters: Unifying the first look turns diagnosis into a repeatable, evidence-based process that accelerates cross-team conversations and collaboration.
2) Diagnose: clean the signal
Before you tune alerts, tune your baselines. Baselines establish “normal” for critical workloads, so deviations and anomalies stand out and noise recedes. Pair baselines with anomaly detection to highlight what changed (e.g., a SQL execution-plan regression, a spike in I/O waits, a surge in connection counts) rather than what is simply high. Add the right diagnostic breadcrumbs: top SQL queries by wait statistic, execution-plan regressions, blocking/deadlock visibility, and the broader operational context (jobs, backups, maintenance windows).
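One minimal way to sketch baseline-plus-anomaly-detection is a z-score test against recent history. The three-standard-deviation threshold and the sample values below are illustrative assumptions, not a recommendation; real monitoring tools typically use richer, seasonality-aware models.

```python
import statistics

def is_anomalous(history, value, z_threshold=3.0):
    """Flag a sample that deviates from its baseline by more than
    z_threshold standard deviations."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # A flat baseline: any change at all is a deviation.
        return value != mean
    return abs(value - mean) / stdev > z_threshold

# Baseline: recent hourly I/O wait samples (illustrative, in ms).
baseline = [100, 98, 103, 97, 101, 99, 102, 100]
print(is_anomalous(baseline, 101))  # False: within normal variation
print(is_anomalous(baseline, 180))  # True: clear deviation from baseline
```

The point of the sketch is the framing: the alert condition is “far from this workload's normal,” not “above a fixed number,” which is what lets noise recede without raising thresholds until alerts become meaningless.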
Why it matters: Clean signals shorten mean time to truth: the time it takes to explain an incident, not just resolve it. This is where the survey’s alert-fatigue finding (75%) can be bent back in your favor.
3) Optimize: make improvements durable
Optimization is a strategic discipline, not a sequence of ad hoc fixes. Focus your efforts on the workloads responsible for the most wait time, and treat AI-assisted proposals (such as query rewrites and index ideas) as starting points; engineering judgment still decides which ships. Protect gains with simple guardrails: regression checks after releases, index hygiene, and capacity reviews, so improvements survive seasonality and change. This also directly affects your error budget: the amount of time a service can be down or perform poorly before violating its service-level objective (SLO) or agreed-upon Service Level Agreement (SLA). By spending optimization effort on reducing performance-related errors, you preserve that budget for unexpected issues, making your operations strategically resilient.
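The error-budget arithmetic is simple enough to sketch. Assuming a 30-day month (an assumption for illustration), the availability target alone determines how many minutes of downtime or degraded service the budget allows:

```python
def error_budget_minutes(target_percent, period_minutes=30 * 24 * 60):
    """Minutes of unavailability allowed per period under a given
    availability target (e.g., 99.9 means 99.9% available)."""
    return period_minutes * (1 - target_percent / 100)

# Each extra "nine" shrinks the monthly budget roughly tenfold.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {error_budget_minutes(target):.1f} min/month")
```

Seen this way, every performance regression you prevent is budget left over for the incidents you could not have predicted.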
Why it matters: Intentional optimization converts reclaimed time into resilience rather than one-off wins, and acts as an investment in your organization's operational stability.
4) Everywhere: flexible deployment without lock-in
Foundations have to hold where your data actually runs. Favor approaches that work consistently across self-hosted, cloud, and hybrid environments without forcing a single operating model. Assume a multivendor reality and design low-friction workflows. Remember: if the “right way” is hard, teams will find a way to work around it.
Why it matters: Consistency and portability ensure practices travel with your estate as it evolves, rather than staying trapped in a proof of concept.
Conclusion of Part 1
The Monitor → Diagnose → Optimize → Everywhere methodology is the critical first step to modernizing database operations and preparing for true AI adoption. But how exactly does this framework translate into organizational maturity, and what is the responsible, practical role of AI in daily DBA work?
In Part 2 of this series, we will look at new survey data to define the database operations maturity model, clarify what AI can—and cannot—do for performance tuning, and provide a set of durable, actionable habits your team can implement today to reclaim time and focus on strategic work.
Further reading
- State of Database Report: The New Reality for DBAs: data on alert fatigue, time spent firefighting, unification gaps, and AI adoption/benefits.
- Foundational metrics for database observability: query execution time, resource usage, storage I/O, connection counts, and error rates underpin clean signals.