Stay up to date on the latest things happening at Stanza by following the Stanza Blog!
We explore top-down and bottom-up overload, and the importance of protecting your most important user journeys against both. We discuss practical strategies for managing overload situations, and the challenges of ensuring those journeys stay uninterrupted and optimal regardless of traffic fluctuations.
An announcement of Stanza's evaluation beta, and a discussion of why traffic management is so important to software operations.
We all want a reliable service, and we've all heard that things like "circuit breaking", "throttling", and "traffic management" will help. But what do these things really mean, and how do they keep our systems reliable as they scale?
Unlike poetry, learning from incidents is rarely done by rote, but in this case we'll make an exception.
Are you a software engineering director in charge of some Site Reliability Engineers (SREs) and wondering what they’re doing, or what they should be doing? Then read on!
What could be simpler than a health check endpoint that just returns an HTTP 200 when called? But health checks aren’t simple at all. Health checks are a critical signal in orchestration systems, and when things go wrong, they can cause havoc.
What do you do when the graph doesn't go up and to the right? Welcome to the doldrums. SREing in challenging times, from Tiarnán.