A Bug Prioritization Framework

Howdy, friends. This post outlines my current framework for prioritizing software defects, a.k.a. “bugs.” I’m writing this not because I believe how to prioritize bugs is exciting but because I want to link to it from other stuff I’m considering writing. Apologies for the dry tone!

I’m presenting a framework I’ve used in development teams of between 1 and 100 people where there isn’t a sustainment or dedicated bug-fixing group.

Why does this matter? Why is it useful?

Like most frameworks, this one aims to help you and your team quickly make good decisions in, like, 98 out of 100 cases. Particularly at, um, “less good” shops, I’ve observed folks struggle with bug triage and prioritization, and I’ve witnessed teams struggle to handle bugs in a low-drama, workmanlike fashion.

Having an agreed-upon framework also keeps you honest with yourself and your team. It helps you communicate what’s happening to folks outside of the engineering and product teams.

Being disciplined about bug prioritization gives you insight into your quality. Understanding that you’re making certain kinds of bugs faster than you can or want to fix them is good information for planning and root-cause analysis.

We all feel bad about bugs, and we want to fix them all and fix them all right now. (Or at least I do.) But you’re often unable to do that, or it’s not the right thing for the business. Having some crisp definitions helps everyone hold the line and make choices that are good for the business (and the team.)

Fixing bugs costs, especially when they interrupt previously committed work, and the costs can be painful. It’s useful for everyone to understand all the consequences when something’s deemed a Priority One bug.

Note: If you’ve worked with me in the past, some of these names are the same, but many of the meanings have changed.

Four priorities

This framework has four priorities. Priority One or “P1” is the highest, and Priority Four or “P4” is the lowest. 

Priority 4 (P4)

These are the bugs with the most negligible business impact. They’re likely minor annoyances, small fit and finish issues, or issues that affect only a few users or rarely occur.

You put these issues on the shelf and fix them opportunistically. They’re typically unscheduled unless your team has fixed all the other bugs and is looking for something to do–so, not all that often. I like to work on P4s during hack days or, say, those days before a holiday weekend when the team is on its own to find something to do. Maybe you pull a couple into a sprint backlog with a little uncommitted capacity. These are also suitable tasks for onboarding new hires or interns.

I would not go through the effort of stack ranking the bugs in this bucket.

If P4s aren’t fixed within a set time frame (my default is sixty or ninety days), they’re automatically closed. If they’re truly important, they’ll get reopened. Otherwise, there’s no reason to carry the mental overhead of the stale inventory.
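That aging-out rule is mechanical enough to automate. Here’s a minimal sketch, assuming each bug is a plain hash with `:priority` and `:opened_on` fields; the field names and data are invented for illustration, and a real version would query your issue tracker’s API instead.

```ruby
require "date"

STALE_AFTER_DAYS = 90 # my default; 60 is also reasonable

# Returns the P4 bugs that have aged past the auto-close window.
def stale_p4s(bugs, today:)
  bugs.select do |bug|
    bug[:priority] == "P4" && (today - bug[:opened_on]) > STALE_AFTER_DAYS
  end
end

bugs = [
  { id: 1, priority: "P4", opened_on: Date.new(2023, 1, 5) },  # 147 days old
  { id: 2, priority: "P4", opened_on: Date.new(2023, 5, 20) }, # only 12 days old
  { id: 3, priority: "P3", opened_on: Date.new(2023, 1, 5) },  # P3s are handled separately
]

stale_p4s(bugs, today: Date.new(2023, 6, 1)).map { |b| b[:id] } # => [1]
```

Run on a schedule, something like this keeps the P4 shelf from silently growing forever.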

Of course, the context surrounding this kind of defect matters. A minor UI glitch might be a higher-priority issue if it keeps your largest customer from renewing. A misspelling in some UI copy is no big deal unless it’s the first thing every one of your new customers sees in your onboarding flow.

I don’t want to give the impression that you should ignore these bugs completely. I mean, at the end of the day they’re problems that someone noticed and logged.

I would, however, take care to monitor the number of bugs falling into this bucket. If P4s materially outnumber P3s, I’d assess whether we need to be more mindful of fit and finish, or whether there’s something else we could do in design, implementation, and testing to reduce the rate at which P4s are arriving.

Finally, it is useful for someone (a tech lead, engineering manager, or product owner) to periodically review these and see if they can group them or surface any trends. A handful of P4s may be a single P3 in hiding! While each P4 may be an annoyance, in aggregate they could turn out to be a more significant issue that needs addressing.

Priority 3 (P3)

These are bugs with your usual amount of business impact. They are your vanilla, no-drama, run-of-the-mill software defects. Most of your bugs will be in this or the P4 bucket.

Bugs in this bucket get scheduled into your regular development cadence. If you’re using some agile methodology, you’d pull some number of these into each of your development iterations or possibly do an all-bug-stomping iteration. If you’re using waterfall with alphas, betas, and such, you could work on them then.

Especially if I’m doing some flavor of agile development, I’d stack rank the bugs in this bucket and ensure that ranking is visible to anyone who cares about the bug’s status. If I’m doing a more waterfall-style development, I’d consider slotting them into specific upcoming releases.

Like the P4s, if P3s don’t get fixed within some set time frame (again, I’d default to 60 or 90 days here), then I’d argue, just to be provocative, that they should simply be closed.

My thinking is this: if they were truly P3s but haven’t found their way into an iteration, they’ve been mis-prioritized. In truth, they are P4s, but we don’t want to make the hard decision of disappointing a customer or other stakeholder. I find that punting on these hard, uncomfortable decisions contributes to feeling that bugs are out of hand.

Okay, but what if it really for true is a P3 and we just have P3s aging out? Well, then you have another problem. You’re making bugs faster than you’re able or willing to fix them. You will have to look hard at your situation and figure out the root cause: why are you creating so many bugs? Or, why aren’t you devoting enough people to fixing them? Or both?

(I’m going to wait to dive deep into this, but if you’d like me to expand on it, let me know!)

Priority 2 (P2)

Now, things are getting spicy, and the dramatic music is swelling.

P2 bugs have sufficient business impact that you are willing to jeopardize other committed development work to address them. When a P2 appears, the assigned developers drop whatever they’re doing and start resolving the issue.

So, a defect has to be a big enough deal that you’re willing to put other priorities at risk. (As I stated above, this will be less of an issue if you’ve got a sustainment team of some sort.)

Also, you will ship this bug fix as soon as it is complete. That’s less of an issue for people working on cloud-based SaaS products with a CI/CD pipeline and a bigger deal for folks who ship on-prem software, where, say, a patch version needs to be shipped.

So, the defining trait of bugs of this priority is that they will (very likely) mess up your regularly scheduled programming, and everyone understands that. There are costs, context switches, and delayed or potentially abandoned work-in-progress.

For example, a Scrum team will cancel their sprint and replan after they fix the bug; a squishy agile team will drop work out of their committed backlog; a waterfall project team will slip its date; a team practicing Kanban will violate their WIP constraints as this bug transits the pipeline.

P2s don’t get stack ranked because there just aren’t that many of them—or at least there shouldn’t be! There should be something like 0 to 2 of them at any moment, but mostly zero.

If you constantly have P2s in flight, you’ve got a problem. As with the P3s, you’ve either got a quality issue, a prioritization issue (people trying to jump the P3 line), or some combination of both.

Priority 1 (P1)

It’s the big one!

P1 bugs are like P2 bugs with the defining quality that they’re sufficiently important that you will work around the clock to get a fix or acceptable workaround.

When one of these beasts rises from the depths, there’s even more disruption and collateral damage. That means, at minimum, giving people time off afterward.

As with P2s, there should only ever be one in flight, and honestly, hopefully, there will never be one in flight.

If you continually deal with P1s, you either have major quality problems, a broken work environment, or both. While some people dig the thrill of P1s, they’ll eventually burn people out and cause them to leave. 

Act accordingly.

Some other thoughts

I have a couple more points I want to make, but they don’t have an elegant place in the rest of the post, so I’ll just toss them here at the end.

More concrete definitions of the priorities

I’ve had more concrete mechanisms for determining priority in the past, but I’ve mostly abandoned them for a couple of reasons.

First, they were hard to explain to other people, and at some point I decided to give up.

Second, it really is impact on the business that ultimately drives a bug’s priority, and that idea is pretty straightforward.

So, instead of some kind of formula or lookup matrix, I’d come up with a list of examples that you can share. That exercise helps tailor the framework to your own business. It has the further benefit of helping train future bug triagers, and it further helps keep you honest.

The impact of workarounds

I mentioned this above, but to ensure it doesn’t get lost, I’ll mention it again here. You’re not always trying to fix a bug; you might just be looking for an acceptable workaround. And once you get that workaround, it’s my experience that a bug’s priority often drops by one level. Or, in the case of a P3, maybe it doesn’t become a P4, but instead remains a P3. It just moves lower in the stack ranking.
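As a toy sketch of that rule, assuming the four priorities above (the function name is mine, not part of any framework):

```ruby
PRIORITIES = %w[P1 P2 P3 P4].freeze

# Demote a bug one priority level once an acceptable workaround exists.
# The one wrinkle from the text: a P3 keeps its priority and only drops
# in the stack ranking.
def priority_after_workaround(priority)
  return "P3" if priority == "P3" # stays a P3, just moves down the ranking
  index = PRIORITIES.index(priority)
  PRIORITIES[[index + 1, PRIORITIES.size - 1].min] # P4 stays P4
end

priority_after_workaround("P1") # => "P2"
priority_after_workaround("P3") # => "P3"
```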

What makes a workaround “acceptable” is going to depend on your situation. Maybe it’s something that that one big customer’s IT team will accept. Maybe it’s something that your customer success team can communicate or perform without too much inconvenience to them or your customers.

Pairing for a Few Days With ChatGPT and GitHub Copilot

👋🏼Howdy, friends. I recently decided to build a Ruby on Rails project using ChatGPT and GitHub Copilot. For those who don’t know, these are AI tools. Here’s how it went.

In summary, this is a thing, it’s useful, and we’re not going back. Are the tools overhyped? Yes, I think so. Is this where things are going? Oh, dear reader, I think so.

Plumbing

Here is my setup. I used Copilot via integration with JetBrains RubyMine. In RubyMine, autocompleted code appeared as I typed, which I could accept or ignore.

I used ChatGPT via the web interface. I would describe some code I wanted ChatGPT to write, or I would ask it a specific question. Halfway through this experiment, I upgraded from the free ChatGPT 3.5 to the paid ChatGPT 4.0.

Some good things

Using these tools reminded me a lot of pair programming, as advertised, except my partner was not a sinister ghost in the machine but a happy cyber-friend! Both systems routinely generated plausible code. Quickly establishing this starting point is especially great for me, as I’m much better at refining code than developing it from scratch. Reducing my activation energy to almost zero was an enormous productivity enhancement, even if I didn’t accept all of the code as proposed.

I enjoyed and profited (in some cases) from the tools showing me different and sometimes more straightforward ways to approach problems. I hadn’t anticipated it teaching me like that.

A place where Copilot supercharged my productivity was generating test cases in RSpec, both at the system level and for model “unit” tests. Again, it helped that it mechanically generated the code, but it also figured out what the next test should be before I did (and wrote it). In one case, where I had written three non-trivial system tests, it predicted the fourth from whole cloth: the test, the comments in it, the assertions, just everything.

I used ChatGPT to do things like answer programming questions. I went out of my way to ask it questions first instead of Google. Sometimes those answers were great, and sometimes they could have been better (read: wrong.) I spent a fair amount of time validating its responses on Google.

I asked ChatGPT to do things like generate Rake tasks by first describing them. That worked well!
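To make that concrete, here’s the flavor of task I’d describe; this particular one (the task name and the stubbed data are invented for illustration) reports open bugs by priority bucket:

```ruby
require "rake"
extend Rake::DSL # makes `namespace`/`task` available outside a Rakefile

namespace :bugs do
  desc "Report how many open bugs sit in each priority bucket"
  task :priority_report do
    # A real app would query its bug tracker here; this is stand-in data.
    open_bugs = [
      { id: 101, priority: "P3" },
      { id: 102, priority: "P4" },
      { id: 103, priority: "P4" },
    ]
    open_bugs.group_by { |bug| bug[:priority] }.sort.each do |priority, bugs|
      puts "#{priority}: #{bugs.size} open"
    end
  end
end
```

A plain-English description of roughly that shape was all ChatGPT needed.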

I also asked it to create a home page for the product using the Bootstrap 5 CSS framework, including some copy, which it did better than I could, but its output still falls short of a professional’s, for sure. I also asked it to design a “zero state” for the app’s main screen, which it did well enough! (I’m guessing I could yield better results here as I become a more experienced prompt author.)

Some less good things

Alas, it wasn’t all smooth sailing. The tools are imperfect and can make mistakes that a human coder might make. There were a couple of situations where the generated code introduced subtle bugs that only revealed themselves later. Also, ChatGPT will cheerfully generate incredibly plausible but fake code–I feel this was less of an issue for Copilot, but I didn’t take careful notes.

There’s a learning curve. Part of it is just muscle memory. For example, I needed to learn to pause for a moment for Copilot to do its thing, or to realize that Copilot had nothing to offer. I had to reprogram myself to go to ChatGPT before Google. With ChatGPT, there’s also the learning curve of crafting decent prompts.

Something irksome is that these tools don’t have a consistent coding style, either lexically or idiomatically. So, you might end up with one file using single quotes and another double. Or one part of your RSpec test suite might use subject while another does not.
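One mitigation I’d reach for here (my suggestion, not something the tools do for you): pin the contested styles in a linter so generated code gets normalized. For Ruby, a minimal `.rubocop.yml` fragment:

```yaml
# Settle the single- vs. double-quote disagreement once, in configuration.
Style/StringLiterals:
  EnforcedStyle: single_quotes
```

Run `rubocop -a` and the lexical inconsistencies mostly disappear; the idiomatic ones still take a human eye.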

In Closing

Did these tools 10x my productivity? No. Did they 2x my productivity? At least.

I am skeptical that, today, any random person can describe some complicated system to these tools and have them generate the whole thing. They still need skilled operators for all kinds of reasons.

Copilot and ChatGPT are a permanent part of my toolchain now. They should be part of yours, too.

If you’ve got any questions, I’d love to answer them if I can. You can find me on email, LinkedIn, Bluesky, and Twitter.

Live or Take-home Coding Assessments?

Last week in the RailsLink community’s #work-offers Slack channel, a member ran a poll with this question:

“Do you prefer a live code assessment or a take-home assessment when interviewing for a developer position?”

I answered that I’m pretty firmly in the take-home camp. Here’s why.

Just to clarify, “take-home” is something the candidate completes alone, probably where they live. “Live” is done physically or virtually in an office, and here’s the important bit, with an interviewer watching the candidate.

I think take-home tests are at least an order of magnitude less stressful than a live coding exercise. And there’s recent research, “Does Stress Impact Technical Interview Performance?”, that supports this notion. (If you click through, you’ll see the paper thinks “live” tests are perfectly constructed to stress people out.)

A brief aside, to talk about my hiring process philosophy in general. I feel strongly about setting candidates up to succeed. That is, an interview process should be less about weeding out the people you don’t want to hire and more about surfacing the people you do want to hire.

You want to help people succeed for both practical (engineers are scarce) and noble (being a decent human) purposes.

Interviews are stressful! And coding interviews? Yikes! It’s one of the most stressful things I can remember ever doing. Asking people to perform well under these conditions sure feels like “weeding out.”

Generally, being a software engineer doesn’t require that one operate well under extreme stress. So unnecessary stress’s presence in the process is unwelcome! I don’t need to know if folks can code under duress–I just want to see if they can code!

So, I believe that folks answering a reasonable, relevant, and practical coding exercise at home is a lot less stressful. At home, you’ve got your own computer and dev environment. You’re probably in a workspace you enjoy and set up to your preference. You get to Google guilt-free. You can grab a snack or hit the bathroom if the need arises.

The more I’ve interviewed people (and been interviewed), the more I want to let candidates take the interview on their terms in the time and place of their choosing, if possible. (And with an open book, because all software developers work with an open book.)

Am I asserting that a take-home assessment is stress-free? Oh, Dear Reader, absolutely not! Still pretty stressful, I’d say!

Does this mean that there will be no further technical assessment in the course of a face-to-face or onsite interview? No. But it probably means there’s nothing that involves writing curly braces on a whiteboard.

We have all blanked on coding questions in interviews because it’s kind of intense. As a hiring manager, the idea that someone didn’t make it all the way through an interview process because it was too stressful drives me crazy, because finding good people is often my biggest challenge!

And, as a human, I feel bad that it makes people feel bad.

So, if you’re a hiring manager, minimize or drop your live coding tasks. Figure out how to turn them into take-home tests.

If you’re a candidate, ask if you can do parts of the interview at home.

We’re at the end of this post, but I’m sure you’ve got a lot of questions. What is a fair assessment? How do you create one? Where in the process should it occur? Should you even have a coding assessment? All great questions that I look forward to addressing in the future!

Thanks for reading! You can find me on Twitter (@mjbellantoni) and LinkedIn. I’d love to hear your thoughts!

React is now effectively part of the Rails stack

In the most recent Stack Overflow Annual Developer Survey (see my writeup here), React.js is the top-ranked web framework, with over 41% of respondents saying they use it. This fact got me thinking about how often I see React mentioned in Ruby on Rails job postings on RailsGigs–because I can tell you it’s a lot!

So I decided to take a look at the data. Let’s go!

In the past five or so months, RailsGigs has published 1,662 job posts. Of those, 841 of them mention “react.” That’s 50.6% of all job postings! I knew it was a lot, but I was pretty shocked that it was just more than half.

(If we only consider the job postings that are currently live, 50.0% of them mention React.)
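The measurement itself is simple. Here’s a sketch of the counting, assuming the postings are available as an array of description strings (the helper name and sample data are invented for illustration; RailsGigs’ actual storage is beside the point):

```ruby
# Percentage of postings whose description mentions a term, case-insensitively.
def mention_rate(postings, term)
  mentioning = postings.count { |text| text.downcase.include?(term.downcase) }
  (100.0 * mentioning / postings.size).round(1)
end

postings = [
  "Senior Rails engineer; our frontend is React with some Stimulus.",
  "Full-stack Ruby on Rails developer. Hotwire experience a plus.",
]

mention_rate(postings, "react") # => 50.0
```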

Here’s some pie for you:

More than half of all job posting descriptions on RailsGigs mention React

I audited 25 of these job postings to understand the context in which the term “react” is used. 60% of the postings state straight up that the employer is using React as part of their Rails stack. 52% of them ask for “experience,” while another 12% use it in the context of a “nice to have.” One posting even uses it in the context of “aptitude.”

So, while employers sometimes mention React in the context of “you should know some Javascript like React or Vue or something,” the anecdotal evidence is that React is actually in use at a whole lot of Ruby on Rails shops.

60% of employers state that React is part of their stack. Over 50% ask for some amount of experience while just over 10% list it as a “nice to have.”

To give you a sense of how often “react” is appearing alongside other Javascript or frontend-related words, here’s the frequency of some others:

What took me aback in this data is that React appears more often than Javascript!

The other big surprise is that Ember has such a strong showing. In some sense, not a surprise as, at one point, Ember was kind of going to be “the Javascript Rails.” (And maybe it is?) I didn’t dig into the data to see if it was a requirement in job postings or not. That’s a project for another day.

Hotwire and Stimulus have a pretty weak showing. Is that a surprise or not? I do know that I’m expecting those numbers to grow steadily in the next 12 months.

When one looks at the data, it’s pretty dang hard to argue that React isn’t a de facto part of the “Rails” stack. It feels like this is a new topic for, uh, “healthy conversation” right alongside “Minitest vs. RSpec,” “fixtures vs. factories,” and so on. 

I would not be surprised to find that React adoption has peaked. And like I said: I would further predict that Stimulus and Hotwire gain a lot more adoption in the coming year. I look forward to a future of more HTML and CSS with more judicious use of Javascript, generally.

If you think this is interesting or have a question you think this data can answer, hit me up in the comments or reach out at @mjbellantoni. And if you’d like to post your job on RailsGigs, lemme know!