All Posts

4 Statistical Process Control Rules That Detect Anomalies in Systems
June 26, 2013
SolarWinds
Statistical Process Control (SPC), or using numbers or data to study the characteristics of our process to make it behave the way we want it to behave, has been around…
Quantifying Abnormal Behavior
June 25, 2013
Baron Schwartz
At Velocity last week, I spoke about how we quantify abnormality in a system’s time-series metrics cheaply, in realtime, at high frequency. Note that this is not the same thing as…
Replacing Clever Code with Unremarkable Code in Go
June 4, 2013
Baron Schwartz
Not too long ago, my primary programming language was Perl. I’ve written a lot of Perl, including some things that I think are quite clever. And therein lies the problem.…
How Does Adaptive Fault Detection Work? Does It Eliminate Thresholds?
April 17, 2013
Baron Schwartz
In previous posts, I claimed that thresholds are a root of much evil in monitoring systems (not the root of all evil, but a root of much evil), and that…
Using Socat to Simulate Networking Traffic to Test and Debug
April 15, 2013
SolarWinds
If you don’t know socat, you probably should. From its man page: Socat is a command line based utility that establishes two bidirectional byte streams and transfers data between them. Because the…
Two Reasons Why Threshold-Based Monitoring Is Hopelessly Broken
April 10, 2013
Baron Schwartz
Why is a threshold-based alert such a disaster? There are two big reasons. Thresholds are always wrong. They’re worse than a broken clock, which is at least right twice a…
A Sure-Fire Recipe For Monitoring Disaster
April 9, 2013
Baron Schwartz
In this post I’ll tell a story that will feel familiar to anyone who’s ever monitored MySQL. Here’s a recipe for a threshold-based alert that will go horribly wrong, beyond…
Why You Should Almost Never Alert On Thresholds
April 8, 2013
Baron Schwartz
This post is part of an ongoing series on the best practices for effective and insightful database monitoring. Much of what’s covered in these posts is unintuitive, yet vital to understand. Previous…
SQL Server Consolidation, Part 3
December 20, 2012
Thomas LaRock
I wrote a couple of posts previously on SQL Server consolidation. The first post tried to give insight on some of the problems and associated motivating factors that most companies have…
SQL Server Consolidation, Part 2
December 20, 2012
Thomas LaRock
Consider for a moment that you have a deck attached to your house. It is one story above ground level and is growing weaker with each passing year. The former…