Howdy, friends. This post outlines my current framework for prioritizing software defects, a.k.a. “bugs.” I’m writing this not because I believe how to prioritize bugs is exciting but because I want to link to it from other stuff I’m considering writing. Apologies for the dry tone!
I’m presenting a framework I’ve used in development teams of between 1 and 100 people where there isn’t a sustainment or dedicated bug-fixing group.
Why does this matter? Why is it useful?
Like most frameworks, this one aims to help you and your team quickly make good decisions in, like, 98 out of 100 cases. Particularly at, um, “less good” shops, I’ve observed folks struggle with bug triage and prioritization. I’ve witnessed teams struggling to handle bugs in a low-drama, work-like fashion.
Having an agreed-upon framework also keeps you honest with yourself and your team. It helps you communicate what’s happening to folks outside of the engineering and product teams.
Being disciplined about bug prioritization gives you insight into your quality. Understanding that you’re making certain kinds of bugs faster than you can or want to fix them is good information for planning and root-cause analysis.
We all feel bad about bugs, and we want to fix them all and fix them all right now. (Or at least I do.) But you’re often unable to do that, or it’s not the right thing for the business. Having some crisp definitions helps everyone hold the line and make choices that are good for the business (and the team).
Fixing bugs costs, especially when they interrupt previously committed work, and the costs can be painful. It’s useful for everyone to understand all the consequences when something’s deemed a Priority One bug.
Note: If you’ve worked with me in the past, some of these names are the same, but many of the meanings have changed.
Four priorities
This framework has four priorities. Priority One or “P1” is the highest, and Priority Four or “P4” is the lowest.
Priority 4 (P4)
These are the bugs with the most negligible business impact. They’re likely minor annoyances, small fit and finish issues, or issues that affect only a few users or rarely occur.
You put these issues on the shelf and fix them opportunistically. They’re typically unscheduled unless your team has fixed all the other bugs and is looking for something to do–so, not all that often. I like to work on P4s during hack days or, say, those days before a holiday weekend when the team is on its own to find something to do. Maybe you pull a couple into a sprint backlog with a little uncommitted capacity. These are also suitable tasks for onboarding new hires or interns.
I would not go through the effort of stack ranking the bugs in this bucket.
If P4s aren’t fixed within a set time frame (my default is sixty or ninety days), they’re automatically closed. If they’re truly essential, they’ll get reopened. Otherwise, there’s no reason to carry the mental overhead of the stale inventory.
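Here is a minimal sketch of that auto-close rule. It assumes a hypothetical `Bug` record with a priority, an opened date, and a status; the field names, the `closed-stale` status, and the 90-day cutoff are illustrative choices, not taken from any particular bug tracker.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical cutoff; the post's default is sixty or ninety days.
MAX_AGE = timedelta(days=90)


@dataclass
class Bug:
    """Illustrative bug record; a real tracker will have its own schema."""
    title: str
    priority: int  # 1 (highest) through 4 (lowest)
    opened: date
    status: str = "open"


def close_stale(bugs: list[Bug], today: date, priority: int = 4) -> list[Bug]:
    """Close open bugs of the given priority that are older than MAX_AGE.

    Returns the bugs that were closed so someone can skim the list.
    """
    closed = []
    for bug in bugs:
        is_stale = today - bug.opened > MAX_AGE
        if bug.status == "open" and bug.priority == priority and is_stale:
            bug.status = "closed-stale"
            closed.append(bug)
    return closed


bugs = [
    Bug("Tooltip typo", 4, date(2024, 1, 2)),
    Bug("Misaligned icon", 4, date(2024, 4, 20)),
    Bug("Export drops a column", 3, date(2024, 1, 2)),
]
stale = close_stale(bugs, today=date(2024, 5, 1))
print([b.title for b in stale])  # ['Tooltip typo']
```

Returning the closed bugs rather than closing them silently leaves room for a quick human look, in case one of them deserves to be reopened.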
Of course, the context surrounding this kind of defect matters. A minor UI glitch might be a higher-priority issue if it keeps your largest customer from renewing. A misspelling in some UI copy is no big deal unless it’s the first thing every one of your new customers sees in your onboarding flow.
I don’t want to give the impression that you should ignore these bugs completely. I mean, at the end of the day they’re problems that someone noticed and logged.
I would especially take care to monitor the number of bugs falling into this bucket. If P4s materially outnumber P3s, then I’d assess whether we need to be more mindful of fit and finish or whether there’s something else we could do in the design, implementation, and testing to reduce the rate at which P4s are arriving.
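If you want that check to be mechanical, something like the following could work. It is only a sketch: the 2x ratio is my stand-in for "materially outnumber," and the function name and input shape (a plain list of priority numbers for open bugs) are hypothetical.

```python
from collections import Counter

# Hypothetical threshold for "materially outnumber"; tune it for your team.
P4_TO_P3_WARN_RATIO = 2.0


def p4_overload(open_priorities: list[int], ratio: float = P4_TO_P3_WARN_RATIO) -> bool:
    """Return True when open P4s exceed `ratio` times the open P3s."""
    counts = Counter(open_priorities)
    return counts[4] > ratio * counts[3]


# Five open P4s against two open P3s: 5 > 2.0 * 2, so this trips the check.
print(p4_overload([4, 4, 3, 4, 4, 3, 4, 2]))  # True
```

A `True` here isn't a verdict, just a nudge to go look at where the P4s are coming from.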
Finally, it is useful for someone, a tech lead, engineering manager, or product owner, to periodically review these and see if they can group them or surface any trends. A handful of P4s may be a single P3 in hiding! While each P4 may be an annoyance, in aggregate, they could actually turn out to be a more significant issue that needs addressing.
Priority 3 (P3)
These are bugs with your usual amount of business impact. They are your vanilla, no-drama, run-of-the-mill software defects. Most of your bugs will be in this or the P4 bucket.
Bugs in this bucket get scheduled into your regular development cadence. If you’re using some agile methodology, you’d pull some number of these into each of your development iterations or possibly do an all-bug-stomping iteration. If you’re using waterfall with alphas, betas, and such, you could work on them then.
Especially if I’m doing some flavor of agile development, I’d stack rank the bugs in this bucket and ensure that ranking is visible to anyone who cares about the bug’s status. If I’m doing a more waterfall-style development, I’d consider slotting them into specific upcoming releases.
Like the P4s, if P3s don’t get fixed within some set time frame (again, I’d default to 60 or 90 days), then I’d argue, just to be provocative, that they should simply be closed.
My thinking is this: if they were truly P3s but haven’t found their way into an iteration, they’ve been mis-prioritized. In truth, they are P4s, but we don’t want to make the hard decision of disappointing a customer or other stakeholder. I find that punting on these hard, uncomfortable decisions contributes to the feeling that bugs are out of hand.
Okay, but what if it really for true is a P3 and we just have P3s aging out? Well, then you have another problem. You’re making bugs faster than you’re able or willing to fix them. You will have to look hard at your situation and figure out the root cause: why are you creating so many bugs? Or, why aren’t you devoting enough people to fixing them? Or both?
(I’m going to wait to dive deep into this, but if you’d like me to expand on it, let me know!)
Priority 2 (P2)
Now, things are getting spicy, and the dramatic music is swelling.
P2 bugs are bugs with sufficient business impact that you are willing to jeopardize other committed development work to address them. When a P2 appears, the assigned developers drop whatever they’re doing and start resolving the issue.
So, a defect has to be a big enough deal that you’re willing to put other priorities at risk. (As I stated above, this will be less of an issue if you’ve got a sustainment team of some sort.)
Also, you will ship this bug fix as soon as it is complete. That’s less of an issue for people working on cloud-based SaaS products with a CI/CD pipeline and a bigger deal for folks who ship on-prem software, where, say, a patch version needs to be shipped.
So, the defining trait of bugs of this priority is that they will (very likely) mess up your regularly scheduled programming, and everyone understands that. There are costs, context switches, and delayed or potentially abandoned work-in-progress.
For example, a Scrum team will cancel their sprint and replan after they fix the bug; a squishy agile team will drop work out of their committed backlog; a waterfall project team will slip its date; and a team practicing Kanban will violate their WIP constraints as this bug transits the pipeline.
P2s don’t get stack ranked because there just aren’t that many of them—or at least there shouldn’t be! There should be something like 0 to 2 of them at any moment, but mostly zero.
If you constantly have P2s in flight, you’ve got a problem. As with the P3s, you’ve either got a quality issue, a prioritization issue (people trying to jump the P3 line), or some combination of both.
Priority 1 (P1)
It’s the big one!
P1 bugs are like P2 bugs with the defining quality that they’re sufficiently important that you will work around the clock to get a fix or acceptable workaround.
When one of these beasts rises from the depths, there’s even more disruption and collateral damage. That means, at minimum, giving people time off afterward.
Even more so than with P2s, there should only ever be one in flight at most, and honestly, hopefully, there will never be one in flight.
If you continually deal with P1s, you either have major quality problems, a broken work environment, or both. While some people dig the thrill of P1s, a steady stream of them will eventually burn people out and cause them to leave.
Act accordingly.
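Pulling the four priorities together, here is one way the handling rules could be written down as a lookup table. Treat it as a sketch under my own naming: `Priority`, `Handling`, and the field names are hypothetical, the 90-day figure is just one of the two defaults mentioned earlier, and marking P1 as not stack ranked is my inference from there only ever being one in flight.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Priority(IntEnum):
    P1 = 1  # highest
    P2 = 2
    P3 = 3
    P4 = 4  # lowest


@dataclass(frozen=True)
class Handling:
    interrupts_committed_work: bool
    around_the_clock: bool
    stack_ranked: bool
    auto_close_days: Optional[int]  # None means the bug never ages out


HANDLING = {
    # P1: drop everything and work around the clock.
    Priority.P1: Handling(True, True, False, None),
    # P2: drop everything, ship the fix as soon as it is complete.
    Priority.P2: Handling(True, False, False, None),
    # P3: scheduled into the regular cadence, stack ranked, ages out.
    Priority.P3: Handling(False, False, True, 90),
    # P4: fixed opportunistically, not ranked, ages out.
    Priority.P4: Handling(False, False, False, 90),
}

for priority, handling in HANDLING.items():
    print(priority.name, handling)
```

The value of a table like this is less the code and more that the consequences of calling something a P1 or P2 are spelled out in one place.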
Some other thoughts
I have a couple more points I want to make, but they don’t have an elegant place in the rest of the post, so I’ll just toss them here at the end.
More concrete definitions of the priorities
I’ve had more concrete mechanisms for determining priority in the past, but I’ve mostly abandoned them for a couple of reasons.
First, they were hard to explain to other people, and at some point I decided to give up.
Second, it really is impact on the business that ultimately drives a bug’s priority and that idea is pretty straightforward.
So, instead of some kind of formula or lookup matrix, I’d come up with a list of examples that you can share. That exercise helps tailor the framework to your own business. It has the further benefit of helping train future bug triagers, and it helps keep you honest.
The impact of workarounds
I mentioned this above, but to ensure it doesn’t get lost, I’ll mention it again here. You’re not always trying to fix a bug; you might just be looking for an acceptable workaround. And once you get that workaround, it’s my experience that a bug’s priority often drops by one level. Or, in the case of a P3, maybe it doesn’t become a P4, but instead remains a P3. It just moves lower in the stack ranking.
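That rule of thumb is simple enough to sketch in a few lines. The `Priority` enum, the `after_workaround` function, and its `keep_p3` flag are all hypothetical names of mine, and the result is meant as a suggestion for a human triager, not an automatic change.

```python
from enum import IntEnum


class Priority(IntEnum):
    P1 = 1  # highest
    P2 = 2
    P3 = 3
    P4 = 4  # lowest


def after_workaround(priority: Priority, keep_p3: bool = True) -> Priority:
    """Suggest a priority once an acceptable workaround exists.

    Drops one level (P1 -> P2, P2 -> P3). A P3 stays a P3 when `keep_p3`
    is set, on the theory that it just moves down the stack ranking.
    A P4 has nowhere lower to go.
    """
    if priority is Priority.P4:
        return priority
    if priority is Priority.P3 and keep_p3:
        return priority
    return Priority(priority + 1)


print(after_workaround(Priority.P1).name)  # P2
print(after_workaround(Priority.P3).name)  # P3
```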
What makes a workaround “acceptable” is going to depend on your situation. Maybe it’s going to be something that that one big customer’s IT team will accept. Maybe it’s something that your customer success team can communicate or perform without too much inconvenience to them or your customers.