In the first post in this series, I walked through how SQL Server uses memory and which configuration habits make life easier for DBAs. In this follow-up, I want to focus on the incidents you actually feel: plan cache bloat, bad query patterns, and the troubleshooting paths that get you from symptom to root cause faster.

Why Plan Cache Bloat Wastes SQL Server Memory

One of the less obvious paths to memory pressure is plan cache bloat.

Every time SQL Server compiles a query or stored procedure, the resulting execution plan can be stored in cache. That matters because query compilation has a cost.

If the same query runs again, SQL Server can reuse the existing plan instead of expending additional CPU resources to generate a new execution plan. However, not every plan deserves a long life.

In environments with a lot of ad hoc SQL, ORM-generated statements, inconsistent connection settings, or embedded literals, SQL Server may cache thousands of plans that are rarely reused. Small differences in SET options, ANSI_NULLS, parameterization, or collation can lead SQL Server to store multiple plans for what looks like the same workload pattern. Over time, that becomes waste.
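To see how much of this waste you're carrying, you can count single-use ad hoc plans directly. This is a generic diagnostic sketch against the standard DMVs, not tied to any one environment; treat the output as a rough gauge:

```sql
-- Rough measure of plan cache bloat: single-use ad hoc plans
-- and the cache memory they occupy.
SELECT COUNT(*) AS single_use_plans,
       SUM(CAST(size_in_bytes AS bigint)) / 1024 / 1024 AS cache_mb
FROM sys.dm_exec_cached_plans
WHERE objtype = 'Adhoc'
  AND usecounts = 1;
```

If single-use plans hold a large share of total plan cache memory, that's the waste pattern worth addressing.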

The OPTIMIZE FOR AD HOC WORKLOADS setting can help. When enabled, SQL Server stores a lightweight version of the execution plan, called a plan stub, the first time an ad hoc query runs. Only if that query runs again is the full plan cached.

I’ve seen this help in many environments, and I’ve rarely seen it hurt. It won’t fix poor application design or bad SQL code by itself. Still, it can reduce plan cache waste and compile pressure when one-off ad hoc SQL is part of the workload.
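For reference, it's an advanced sp_configure option and takes effect without a restart; a minimal sketch:

```sql
-- Enable 'optimize for ad hoc workloads' (advanced server option).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
```

Note this only affects plans compiled after the change; existing cached plans are left alone.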

How Bad Query Patterns Create Memory Pressure

Memory problems aren’t always caused by memory settings. Sometimes they come from the way SQL queries are written.

The classic example is SELECT *.

If an application needs three columns but the query asks SQL Server to return all sixteen columns, the engine has to manage more data than necessary. Those extra columns consume memory, increase I/O, and add work the engine didn’t need to do. One query may not matter much. A repeated pattern across a busy workload does.

Memory grants are another pressure point.

SQL Server uses index statistics, cardinality estimates, and other index metadata to estimate how much memory a given query needs for operations such as sorting, hashing, and joining. If those statistics are stale, or a plan built around a given index has been invalidated, the memory grant can be too large or too small.

  • A grant that’s too large can reserve memory other queries need
  • A grant that’s too small can spill temporary objects to TempDB and push more pressure back onto the storage subsystem
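You can watch grants in flight with the standard DMV sys.dm_exec_query_memory_grants; the column list below is the part I find most useful when grants are suspect:

```sql
-- Requested vs. granted vs. used memory for executing queries.
-- wait_time_ms > 0 means the query had to wait for its grant.
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb,
       wait_time_ms
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```

A big gap between granted and used memory is the oversized-grant case; rows stuck waiting point to the undersized or exhausted case.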

Either way, performance suffers. That’s why memory optimization isn’t only a DBA configuration exercise. It’s also a SQL query quality exercise. For example:

  • Keep statistics current
  • Review expensive queries
  • Avoid retrieving unnecessary data through your SQL queries
  • Watch for large scans, spills, and inefficient access patterns in the execution plan
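The statistics item on that list is easy to check. Assuming SQL Server 2008 R2 SP2 or later, sys.dm_db_stats_properties shows when each statistic was last updated and how many row modifications have accumulated since; the threshold below is arbitrary:

```sql
-- Statistics with many modifications since their last update.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       sp.last_updated,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE sp.modification_counter > 1000  -- arbitrary threshold; tune per table size
ORDER BY sp.modification_counter DESC;
```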

The fix isn’t flashy. More often, it’s a set of stable habits that reduce waste.

Why Page Compression Is Worth Testing

Page compression is one of those SQL Server features many teams still avoid because they’re worried about CPU cost.

That concern’s understandable.

Compression sounds expensive, and nobody wants to add overhead to a server that’s already running hot on CPU. However, page compression can often reduce the amount of data stored on disk and the amount of data held in memory. That means fewer pages to read, fewer pages to cache, and less I/O pressure.

In testing against the SQL Sentry® database, selected large tables saw roughly 60% space savings with modest CPU overhead once the initial compression work was complete. When I say modest, I mean it. In a standard OLTP workload using the TPC-C benchmark scaled to 2,000 concurrent users, our benchmark systems reported an increase of only 0.6% added CPU when page compression was enabled. A roughly 60% reduction in memory and I/O footprint for 0.6% of added CPU is close to a free performance win.

Your workload may behave differently, so test before you roll it out broadly. For example, our benchmark tests revealed that workloads with a high proportion of UPDATE statements experienced less performance improvement than workloads with a more balanced mix of UPDATE statements and other DML statements. Even so, page compression is worth a serious look, especially for read-heavy and mixed workloads.
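A sensible test path is to estimate first and rebuild second. The table name below is hypothetical; substitute one of your own large tables:

```sql
-- Step 1: estimate savings without changing anything.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'OrderHistory',   -- hypothetical table
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Step 2: if the estimate looks good, rebuild with page compression.
ALTER TABLE dbo.OrderHistory
REBUILD WITH (DATA_COMPRESSION = PAGE);
```

The rebuild itself is the expensive part, so schedule it like any other maintenance operation.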

When Partitioning and Resource Governor Help

Two other levers are worth knowing about, even if they’re not the first things I’d reach for.

Partitioning can help when a large share of the database is historical and rarely accessed. If older data can sit in read-only partitions, SQL Server can reduce some of the work associated with that data. For example, SQL Server does not cache plans for SELECT statements accessing data solely from a read-only partition.
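Once the cold partitions live on their own filegroup, marking it read-only is a single statement. Database and filegroup names below are hypothetical:

```sql
-- Typically requires exclusive access to the database; hypothetical names.
ALTER DATABASE SalesDB MODIFY FILEGROUP ArchiveFG READONLY;
```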

Resource Governor can help when specific workloads, users, or applications are known to consume too much memory. It gives you a way to set an upper boundary for a given workload or user, instead of relying entirely on SQL Server’s internal balancing.
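A minimal Resource Governor sketch, with hypothetical names, caps a reporting workload's memory. You'd also need a classifier function to route sessions into the group, which is omitted here:

```sql
-- Cap memory for a reporting workload (names are hypothetical).
CREATE RESOURCE POOL ReportingPool
    WITH (MAX_MEMORY_PERCENT = 30);

CREATE WORKLOAD GROUP ReportingGroup
    USING ReportingPool;

ALTER RESOURCE GOVERNOR RECONFIGURE;
```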

Neither feature replaces good schema design, good queries, or sensible memory settings. However, both can help mature teams create more predictable operating conditions.

Start SQL Server Memory Troubleshooting With What Changed

When memory problems surface, don’t start by randomly clearing caches or restarting services.

Start with what changed.

(If you have enabled Query Store, this is the perfect built-in tool for determining a major shift in SQL query patterns.)
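As a starting point, a Query Store query like the following surfaces the heaviest queries over the last day, which makes a sudden shift easy to spot. It's a generic sketch against the standard Query Store catalog views:

```sql
-- Top 10 queries by total duration over the last day (Query Store).
SELECT TOP (10)
       q.query_id,
       qt.query_sql_text,
       SUM(rs.count_executions)                   AS executions,
       SUM(rs.avg_duration * rs.count_executions) AS total_duration_us
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS i
    ON i.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE i.start_time >= DATEADD(DAY, -1, SYSUTCDATETIME())
GROUP BY q.query_id, qt.query_sql_text
ORDER BY total_duration_us DESC;
```

Run it before and after a deployment window and compare the two result sets.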

Ask questions like these:

  • Was there a new deployment?
  • Was there a schema change?
  • Was a batch job added?
  • Did a reporting workload move?
  • Did something change in the VM, storage layer, or connection settings?

Memory symptoms often appear after something else changes.

Next, check for errors.

Look outside SQL Server first. DNS issues, domain controller problems, operating system errors, or storage problems can all look like database performance issues to the end-user. If the error’s inside SQL Server, the message itself often gives you a useful path to investigate.

After that, look at wait statistics.

If memory’s part of the problem, waits such as RESOURCE_SEMAPHORE, PAGEIOLATCH_*, PAGELATCH_*, and LATCH_* are memory-related waits that can point you in the right direction.

  • RESOURCE_SEMAPHORE waits matter in particular because they usually mean queries are waiting for memory grants
  • PAGEIOLATCH_* indicates queries are waiting for data pages to be read from the storage subsystem into the buffer pool, often pointing to pressure on the data cache
  • PAGELATCH_* points to contention on latches protecting pages already in the data cache
  • LATCH_* commonly arises from contention on resources other than the data cache, including waits caused by heaps or large LOB/MAX data types
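A quick snapshot of those wait types comes straight from sys.dm_os_wait_stats. Keep in mind the counters are cumulative since the last restart (or since they were cleared), so compare two snapshots rather than reading one in isolation:

```sql
-- Cumulative totals for the memory-related waits discussed above.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'RESOURCE_SEMAPHORE'
   OR wait_type LIKE 'PAGEIOLATCH%'
   OR wait_type LIKE 'PAGELATCH%'
   OR wait_type LIKE 'LATCH%'
ORDER BY wait_time_ms DESC;
```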

One PerfMon counter I always want to check is Memory Grants Pending. In a healthy system, you want that value to be zero. When it’s above zero, this counter indicates the number of queries that are queuing for memory they can’t get.
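You don't need PerfMon open to read it; the same counter is exposed through a DMV:

```sql
-- The Memory Grants Pending counter, from inside SQL Server.
SELECT cntr_value AS memory_grants_pending
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Memory Grants Pending';
```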

Then correlate the waits with other telemetry:

  • Memory clerks
  • Plan cache usage
  • Buffer pool usage by database
  • Expensive queries
  • PerfMon counters
  • Extended Events
  • Recent deployment history
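The first few items on that list are one query each. Here's a sketch for memory clerks and buffer pool usage by database, using the standard DMVs (the pages_kb column assumes SQL Server 2012 or later):

```sql
-- Top memory clerks by allocated memory.
SELECT TOP (10)
       type,
       SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY memory_mb DESC;

-- Buffer pool usage by database (8 KB pages).
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS buffer_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY buffer_mb DESC;
```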

That correlation matters because a single metric rarely tells the whole story.

Use Cache-Clearing Commands With Care

SQL Server gives DBAs commands to clear different caches. They exist for a reason, and they can be useful in testing or in specific troubleshooting scenarios.

They can also cause harm when they’re used casually.

  • DBCC FREEPROCCACHE clears the plan cache
  • DBCC DROPCLEANBUFFERS clears clean buffers from the data cache
  • DBCC FREESYSTEMCACHE ('cache_name') clears a specific system cache by name, or the cache entries associated with a named Resource Governor pool

These commands may be more targeted than a full restart, but they still create a performance trough. Alternatively, the undocumented DBCC FLUSHPROCINDB (db_id) clears the plan cache for a single database.
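When you do need to clear something, prefer the narrowest scope that tests your hypothesis. Two documented options, sketched with an illustrative handle lookup:

```sql
-- Clear one cached plan by handle instead of the whole cache.
DECLARE @handle varbinary(64) =
    (SELECT TOP (1) plan_handle
     FROM sys.dm_exec_cached_plans
     WHERE objtype = 'Adhoc' AND usecounts = 1);  -- illustrative selection
DBCC FREEPROCCACHE (@handle);

-- SQL Server 2016 and later: clear the plan cache for the current database only.
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
```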

Remember that restarting SQL Server or clearing its caches causes the server to run much more slowly until the caches are properly warmed. SQL Server has to compile plans again, and it must read data from disk again. End-users pay the cost.

That doesn’t mean you should never use these statements. It means you should know why you’re using them, what you expect to happen, and how you’ll measure the result.

A high-performance DBA doesn’t create more uncertainty during an incident. They reduce it.

Better Monitoring Reduces Guesswork

You can troubleshoot SQL Server memory with native tools. Every DBA should understand the underlying DMVs, wait statistics, counters, and error logs well enough to interpret what the monitoring platform’s showing. After all, as a data management professional, you may be asked to troubleshoot database servers that have only the native tools installed with the database server itself.

But doing all of that manually takes time and experience.

Teams can easily miss trends that stretch across several days. Under pressure, it is just as easy to chase the loudest signal instead of the right one. This becomes especially hard when stakeholders are waiting for answers.

That’s where purpose-built monitoring helps.

SolarWinds® Database Performance Analyzer (DPA) software uses wait-time analysis to show where database time’s being spent and which queries are driving the pain.

SolarWinds® SQL Sentry® software takes a deeper SQL Server-focused approach. It correlates Windows and SQL Server telemetry across the same time period, which makes it easier to connect memory, CPU, I/O, TempDB, waits, and top SQL activity.

SolarWinds® Database Observability extends that thinking into broader hybrid and SaaS observability use cases, where database performance needs to be understood alongside application and infrastructure behavior.

The point is not that tools make prettier charts.

The point’s that they shorten the distance from “something’s wrong” to “this query, this plan, this grant, or this workload changed.” In an environment where DBAs already spend too much time on reactive work, that difference matters.

What High-Performance DBAs Actually Do

The DBAs I trust with business-critical SQL Server environments tend to have a few habits in common.

They know:

  • How much memory SQL Server’s allowed to use
  • What else runs on the server
  • How the buffer pool, data cache, plan cache, memory grants, TempDB, and I/O layer interact
  • Whether statistics are current
  • Which queries consume too many resources, whether that be too much memory, I/O, or any other system resource
  • Whether a memory issue started after a deployment, data change, or configuration change

Most importantly, they use memory decisions to buy back time.

A query that stops spilling to TempDB is time returned. When a noisy ad hoc plan pattern gets cleaned up, that is time returned as well. Avoidable restarts that never happen are also time returned.

That shift is what makes the environment more predictable, more stable, and the DBA role less reactive.

SQL Server Memory Troubleshooting FAQ

What is plan cache bloat?
Plan cache bloat happens when SQL Server stores too many execution plans that are duplicates or are rarely, if ever, reused. It’s common in environments with heavy ad hoc SQL, ORM-generated queries, no standards for SQL code formatting, inconsistent SET options, or lots of embedded literals.

Is OPTIMIZE FOR AD HOC WORKLOADS worth enabling?
In many mixed workloads, yes. It stores a small plan stub on first execution and caches the full plan only if the query runs again. That helps reduce the weight of one-off plans and reduces plan cache bloat.

How do query patterns turn into memory problems?
Bad habits like SELECT *, stale statistics, and large scans force SQL Server to move and hold more data than necessary. These factors increase memory grant size, cause spills to TempDB, and amplify I/O.

Where should I look first when I suspect a memory issue?
Start with what has changed. Then check wait statistics. Once wait statistics identify one or more categories of memory pressure, check other telemetry such as PerfMon Memory Grants Pending, plan cache composition, and data cache usage by database.

When is it okay to clear caches?
Use cache-clearing commands with caution. For example, only in controlled testing, or in specific troubleshooting scenarios where you know why you’re doing it and how you’ll measure the outcome. They shouldn’t be a routine fix.

How do SolarWinds tools help with memory tuning?
SolarWinds database tools reduce the time it takes to go from “we see a memory symptom” to “we know which workload change caused it and how to fix it.”
