Should applications be released when we don’t know how stable the user experience is? The obvious answer is the right one: of course not.
Application stability is the most important metric that software companies can use. Yet many organizations don’t measure stability at all. Or, if they track errors, this information isn’t used by engineering and product organizations to make informed development decisions.
Today, many DevOps teams track “error rates,” the inverse of stability. And that’s great! It’s helpful to know what your error rates are and how often your apps are failing.
However, this data is often not visible to other parts of the organization, which is why other teams may view it as something outside their purview. Nothing could be further from the truth.
Stability is a metric that everyone—and I do mean everyone—should care about. From top to bottom, stability should be discussed on a regular basis and used as a common language between teams to answer the all-important question: do we build new features or fix bugs?
In short, stability bridges the gap between product and engineering teams, but it also consolidates business decisions across all teams in the organization. Win-win.
If your goal is to release high-quality apps that customers will use and enjoy (obviously), then it’s time to add stability scores and targets to your arsenal. Here’s how.
To measure stability, you must collect and analyze data. If you can’t use your data, there’s no point in having it. That means your first step is to ensure measurement tools are in place. By adopting a stability management and error monitoring tool like Bugsnag, you can quantify and analyze your stability with ease.
In case you’re wondering, there is precedent for measuring stability. You already talk about uptime and availability of applications, right? Stability follows the same logic. Think about the metric stack in the context of building a new car:
Stability answers the second question and demonstrates where the rubber hits the road. Quite frankly, none of these questions represents something that only DevOps should care about. Developers care. Product owners care. And we all know customers will care if their tire blows out during their first test run.
With a stability monitoring tool in place, you can easily see your application’s stability scores, which are calculated using real-time error rates and sessions data. These scores give you the percentage of successful app interactions in each release.
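As a rough illustration of how such a score can be derived from sessions and error data, here is a minimal Python sketch. It assumes an "interaction" is a session and that a session counts as unsuccessful if it hit at least one error. The function name and inputs are hypothetical; this is not the formula any particular tool is confirmed to use.

```python
def stability_score(total_sessions: int, errored_sessions: int) -> float:
    """Return the percentage of sessions that finished without an error.

    Assumption: one session equals one app interaction, and a session is
    unsuccessful if it recorded at least one error.
    """
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    if not 0 <= errored_sessions <= total_sessions:
        raise ValueError("errored_sessions must be between 0 and total_sessions")
    return 100.0 * (total_sessions - errored_sessions) / total_sessions


# A release with 1,000 sessions, 10 of which hit an error.
print(stability_score(1000, 10))  # 99.0
```

Computing the score per release, rather than across all releases, is what makes it possible to compare one version of the app against the next.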
Then the fun begins. You’re in the driver’s seat with tools like Bugsnag that allow you to set your own goals and targets for stability.
The best way to consider stability targets is to think about the behavior you want to encourage and how to best enable your teams to agree on what action needs to be taken next. That means setting two numbers: critical stability and target stability.
Let’s go back to our car analogy. If you produce one thousand vehicles, and one percent of your customers immediately get flat tires, you’ll have ten angry customers on your hands. Do you think that’s too many customers to potentially lose?
The same principles apply in software. We know there will always be bugs (and that’s okay, by the way), so the real question is: how many bugs is too many?
The answer to that question will help you determine your critical stability: your team’s SLA. For example, if you set 99 percent as your critical stability, that means a full one percent of your customer base is experiencing problems when you hit this lower threshold. Since there’s a one-to-one mapping between stability and customer happiness, you can assume these customers are having a bad experience and will likely drop your app.
This critical stability target should be an easy number to rally around.
When stability falls below it, every engineer stops what they are doing, halts their work on building new features, and fixes bugs.
If critical stability is your “sh*t is on fire” moment, then target stability is your aspirational goal. It’s your SLO – a metric that many organizations communicate externally to set appropriate user expectations.
Realistically, we all know it’s impossible to achieve 100 percent stability, especially if you want to move quickly. Instead, companies should strive to accomplish a delicate balance between developing new features to stay competitive and maintaining a stable and crash-free app.
That sweet spot can only be reached through a constant trade-off between speed of innovation and stability. If you aim for perfection, you harm innovation because bugs are always an inevitable outcome of new development. If you move too fast, you risk your stability and may hurt your customer base.
Ask yourself: what’s our real goal? Do we want to innovate faster? Can we spend less time fixing bugs? By agreeing upon a target stability where everyone believes you’re doing great, you can balance the need to fix bugs with the drive to build new features.
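To make the two thresholds concrete, the decision they encode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Focus` and `recommend_focus` are hypothetical, the 99.9 percent target in the usage lines is an invented example value, and treating "below critical" as a strict comparison is a choice made here rather than something the thresholds themselves dictate.

```python
from enum import Enum


class Focus(Enum):
    FIX_BUGS = "stop feature work and fix bugs"
    BALANCE = "split time between bug fixes and features"
    BUILD_FEATURES = "keep building features"


def recommend_focus(stability: float, critical: float, target: float) -> Focus:
    """Map a stability score (in percent) to a suggested team focus.

    Assumption: critical is the lower threshold (the SLA) and target is
    the aspirational goal (the SLO), so critical must be below target.
    """
    if not 0 <= critical < target <= 100:
        raise ValueError("expected 0 <= critical < target <= 100")
    if stability < critical:
        return Focus.FIX_BUGS
    if stability < target:
        return Focus.BALANCE
    return Focus.BUILD_FEATURES


# Hypothetical thresholds: 99 percent critical, 99.9 percent target.
print(recommend_focus(98.5, critical=99.0, target=99.9).value)
print(recommend_focus(99.95, critical=99.0, target=99.9).value)
```

The middle band is where the negotiation between product and engineering happens; the two outer bands are meant to be uncontroversial.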
Right now, you’re probably thinking to yourself: how does one come up with these target stability and critical stability numbers? Great question.
First of all, not to worry: stability targets aren’t something any one person is going to know how to set precisely from the get-go, and everyone in your organization will likely think about it in a slightly different way. Here are some tips:
Stability targets give engineering leaders and product owners common ground to have a conversation around goals. These discussions will almost always be a negotiation, and stability targets will evolve. The point of adjusting your targets is to understand where your stability stands right now while keeping an eye on how and when to move faster.
You might also be wondering who should own your stability targets. There is no right answer.
Some argue that product leads should take ownership since they understand the revenue impact of bugs and crashes and the importance of building new features. However, product teams often need buy-in from development. And, typically, products like Bugsnag are brought in by engineering teams that understand the business and have high-functioning relationships with product teams.
In an ideal world, you want to achieve some kind of middle ground because stability is important and touches on pain points for multiple teams. With stability as a shared metric, product teams will better understand technical debt, and development teams will have a stronger recognition of how stability impacts the product roadmap. Any engineer who is tempted to think, “I’m just working on the next feature,” will instead start to consider why these features are being built and the impact they’ll have on the business.
Therefore, while one person may lead the charge, the goal is to have both teams contribute to the conversation and commit to the goals. When everyone is aligned with metrics and targets, the outcome is bound to be stronger stability and better customer experiences.
When we talk about stability, what’s really being discussed is code quality and product quality. Everyone knows quality is a good thing, but historically, it’s been challenging to measure it with any degree of accuracy.
Stability scores change all that. You now have a tactical method to measure and talk about quality. Coupled with stability targets, your product and engineering teams can decide when and how to move that slider between developing features and fixing bugs, and both teams do so with a complete understanding of the impact these decisions have on stability.
And that’s what we call a stable relationship.