
Code Review Best Practices

As developers, we all know that code reviews are a good thing in theory. They should help us:

  • Find bugs and security issues early
  • Improve the readability of our code
  • Provide a safety net to ensure all tasks are fully completed

The reality is that code reviews can frequently be an uncomfortable experience for everyone involved, leading to reviews that are combative, ineffective, or even worse, simply not happening.

Here is a quick guide to help you to create an effective code review process.

WHY do we do code reviews?

The first question to answer when reviewing your code review process is: what’s the purpose of our code reviews? When you ask this question, you soon realise that there are many reasons to perform code reviews. You might even find that everyone in the team has a different idea of why they’re reviewing code. They might think they’re reviewing:

  1. To find bugs
  2. To check for potential performance or security problems
  3. To ensure readable code
  4. To verify the functionality does what was required
  5. To make sure the design is sound
  6. To share knowledge of the features that have been implemented and the updated design
  7. To check the code meets standards
  8. …or one of hundreds of other reasons

If everyone in the team has a different “why”, they’re looking for different things in the code. This can lead to a number of anti-patterns:

  • Code reviews take a long time as every reviewer will find a different set of problems that need to be addressed
  • Reviewers become demotivated as every review throws up different types of problems depending upon who reviews it
  • Reviews can ping-pong between the reviewer and the code author as each new iteration exposes a different set of problems

Having a single purpose for your code reviews ensures that everyone taking part in the review, whether they’re a code author or a reviewer, knows the reason for the review and can focus their efforts on making sure their suggestions fit that reason.

WHAT are we looking for?

Only when we understand why we’re doing the review can we figure out the things we want to look for during the review. As we’ve already started to see, there are a huge number of different things we could be looking for during our review, so we need to narrow down the specific things we really care about.

For example, if we’ve decided the main purpose of our reviews is to ensure the code is readable and understandable, we’ll spend less time worrying about a design that has already been implemented, and more time focusing on whether we understand the methods and whether the functionality is in a place that makes sense. The nice side effect of this particular choice is that with more readable code it’s easier to spot bugs or incorrect logic. Simpler code often performs better, too.

We should always automate as much as possible, so human code reviewers should never have to worry about the following (a sketch of wiring these checks into automation follows the list):

  • Formatting & style checks
  • Test coverage
  • Whether performance meets specific requirements
  • Common security problems
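As an illustration only, here is a minimal Rakefile sketch that bundles these checks into one command a CI job can run before a human ever opens the review. The tool choices (RuboCop for style, RSpec with SimpleCov for tests and coverage, Brakeman for common security problems) are assumptions, stand-ins for whatever your stack already uses:

    # Rakefile: a minimal sketch of pre-review automation.
    # RuboCop, RSpec/SimpleCov and Brakeman are illustrative choices, not requirements.

    desc "Formatting & style checks"
    task :style do
      sh "bundle exec rubocop"
    end

    desc "Run the test suite (SimpleCov can enforce a minimum coverage threshold)"
    task :test do
      sh "bundle exec rspec"
    end

    desc "Scan for common security problems"
    task :security do
      sh "bundle exec brakeman -q"
    end

    # A benchmark task could be added the same way if performance must meet
    # specific requirements.
    desc "Everything a human reviewer should not have to check by hand"
    task pre_review: [:style, :test, :security]

Run on every pull request, a task like this lets the review conversation start from "is this usable?" rather than "is this formatted?".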

In fact, what a human reviewer should be focusing on might be fairly simple after all – is the code “usable”? Is it:

  • Readable
  • Maintainable
  • Extensible

These are checks that cannot be automated. And these are the features of code that matter most to developers in the long run.

Developers are not the only people who matter, though; ultimately, the code has a job to do. Our business cares about: does the code do what it’s supposed to do? And is there an automated test or set of tests to prove it?

Finally, does it meet so-called non-functional requirements? It is important to consider things like regulatory requirements (e.g. auditing) or user needs (like documentation) in these checks.

WHO is involved in code reviews?

With a clear purpose and a set of things to be looking for in a review, it’s much simpler to decide who should be involved in the review.  We need to decide:

Who reviews the code? It’s tempting to assume that it should be one or more senior or experienced developers.  But if the focus is something like making sure the code is easy to understand, juniors might be the correct people to review – if an inexperienced developer can understand what’s happening in the code, it’s probably easy for everyone to understand.  If the focus of the review is sharing knowledge, then you probably want everyone to review the code. For reviews with other purposes, you may have a pool of reviewers and a couple of them are randomly picked for each review.

Who signs off the review? If we have more than one reviewer, it’s important to understand who ultimately is responsible for saying the review is complete. This could be a single person, a specific set of people, a quorum of reviewers, specified experts for particular areas of the code, or the review could even be stopped by a single veto. In teams with high levels of trust, the code author might be the one to decide when enough feedback is enough and the code has been updated to adequately reflect concerns that were raised.

Who resolves differences of opinion? Reviews may have more than one reviewer.  If different reviewers have conflicting advice, how does the author resolve this? Is it down to the author to decide? Or is there a lead or expert who can arbitrate and decide the best course? It’s important to understand how conflicts are resolved during a code review.

WHEN?

When has two important components:

When do we review? Traditional code reviews happen when all the code is complete and ready to go to production. Often code will not be merged to trunk/master until a review is complete, for example the pull request model. This is not the only approach. If a code review is for knowledge sharing, the review could happen after the code has been merged (or the code could be committed straight to master). If the code review is an incremental review that is supposed to help evolve the design of the code, reviews will be happening during implementation.  Once we know: why we do reviews; what we’re looking for; and who takes part, we can more easily decide when is the best time to perform the review.

When is the review complete? Not understanding when a review is complete is a major factor that can lead to reviews dragging on indefinitely. There’s nothing more de-motivating than a review that never ends; a developer feels like they’ve been working on the same thing forever and it’s still not in production. Guidelines for deciding when the review is complete will depend upon who is taking part in the review, and when the review is taking place:

  • With knowledge sharing reviews, it could be signed off once everyone has had a chance to look at the code
  • With gateway reviews, usually a single nominated senior (the gatekeeper) says it’s complete when all their points are addressed
  • Other types of reviews may have a set of criteria that need to be passed before the review is complete.  For example:
    • All comments have been addressed by fixes in the code
    • All comments have either led to code changes, or to tickets in the issue tracker (e.g. creating tickets for new features or design changes; adding additional information to upcoming feature tickets; or creating tech-debt tickets)
    • Comments flagged as showstoppers have all been addressed in some way, comments that were observations or lessons to learn from in the future do not need to have been “fixed”

WHERE do we review?

Code reviews don’t have to happen inside a code review tool. Pair programming is a form of code review. A review could be simply pulling aside a colleague and walking through your code with them. Reviews might be done by checking out a branch and making comments in a document, an email, or a chat channel. Or code reviews might happen via a GitHub pull request or a piece of code review software.

Summary

There are lots of things to consider when doing a code review, and if we worried about all of them for every code review, it would be nearly impossible for any code to pass the review process. The best way to implement a code review process that works for us is to consider:

  • Why are we doing reviews? Reviewers have an easier job with a clearly defined purpose, and code authors will have fewer nasty surprises from the review process
  • What are we looking for? When we have a purpose, we can create a more focused set of things to check when reviewing code
  • Who is involved? Who does the reviews, who is responsible for resolving conflicts of opinion, and who ultimately decides if the code is good to go?
  • When do we review, and when is the review complete? Reviews could happen iteratively while working on the code, or at the end of the process. A review could go on forever if we don’t have clear guidance on when the code is finally good to go.
  • Where do we review? Code reviews don’t need a specific tool; a review could be as simple as walking a colleague through our code at our desk.

Once these questions are answered we should be able to create a code review process which works well for the team. Remember, the goal of a review should be to get the code into production, not to prove how clever we are.

How to build a code review culture: An Interview with Derek Prior

We’ve interviewed Derek Prior, a developer at Thoughtbot and co-host of The Bike Shed podcast. We discuss how to build a code review culture, diving into the benefits of code reviews, the essential elements to make them effective, and how to handle conflict if it arises.

Content and Timings

  • Introduction (0:00)
  • Code Reviews Vs. Pair Programming (0:26)
  • Benefits of Code Reviews (1:10)
  • What to cover in a Code Review (4:48)
  • Code Review Author and Reviewer Best Practice (6:47)
  • Handling Conflict in Code Reviews (10:21)
  • Recommended Resources (12:45)

Transcript

Introduction

Derrick:

Derek Prior is a developer for Thoughtbot in Boston. He co-hosts a web development podcast called, “The Bike Shed”. He speaks about development practices at conferences, including the talk, “Cultivating a Code Review Culture”. Derek, thank you so much for taking time to join us today. Do you have a bit to share about yourself?

Derek:

I’ve been speaking at conferences and meet-ups lately about code reviews, like you said. It’s something that I’ve had a lot of experience with over the last 10 years or so.

Code Reviews Vs. Pair Programming

Derrick:

Some people mean different things when they use the term “code review”, like “pair programming”. What do you mean by code reviews?

Derek:

There’s this term, and it’s not my term, called "modern code review", and that’s what I’m talking about. What that means… The definition of that is basically asynchronous or tool-driven, generally lightweight reviews. For most of the people I work with, that means a GitHub pull request. That’s typically what I’m talking about. Pair programming is great, and has a lot of the same benefits. But when I’m doing pair programming, I still like to have another person also review the code.

The reason for that is in pair programming, you’re kind of building up a solution with your pair, as you go. I really like to see at the end, if that holds up for another developer who might have to work on this next week kind of thing. Like, does this make sense to somebody who wasn’t there all along the way.

“It’s much more helpful to focus on the cultural benefits of code reviews”

Benefits of Code Reviews

Derrick:

Why are code reviews important?

Derek:

When people hear about this, everybody’s natural reaction is to say that code reviews catch bugs. That is true: code that’s reviewed is going to have fewer bugs than code that isn’t reviewed. But I think that puts too much importance on the "finding bugs" part. Microsoft Research actually did a study of this in 2013, where they found that people consistently said the number one benefit of code reviews is finding defects in code.

But then when they looked at actual data collected in their code review tools, and talked to people after they did code reviews, what they actually found was that people got way more benefit, way more cultural benefits, out of sharing knowledge with each other, or keeping up with what everybody’s doing, and knowing what’s going on in that other part of the code base. Finding a really interesting alternative solution to a problem that they hadn’t thought of, just by getting somebody else’s viewpoint.

Those are the much more important things. I feel like they’re much more interesting than finding defects. Finding defects is actually, frankly, really hard. A lot of times I’ll talk to people about code reviews, and they’ll say, "Well we did code reviews, but we still had all these bugs. They weren’t helping us catch the bugs", and yeah, they’re not going to catch all the bugs. By saying, "The chief thing we get out of code reviews is defect finding", what we’re really doing is setting ourselves up for those situations where people say, "But you did a code review on this, and there’s still a bug." Right?

It’s hard to find all the bugs, because when you’re doing one of these lightweight code reviews, you’re really just looking at a slice of a change. You’re looking at the diff, and to know exactly how that’s really going to impact your system, you have to know the entire system. You can catch edge cases where, “Oh, you didn’t check to see if this was nil”, or whatever the case may be. Knowing exactly where this is going to screw up your data, is hard to know, without knowing the whole system. Code review is great for some defect tracking, or defect finding, but it’s not a panacea. Instead, I think it’s much more helpful to focus on the cultural benefits of code reviews.

Derrick:

How should you work code reviews into your workflow?

Derek:

What we typically do at Thoughtbot is, when I finish up a PR, I will paste it into Slack, or whatever. We’ll paste that in and say, “Can I get a review for this please?” If it’s not urgent, that’s basically all that happens, right? If it’s urgent, I will ping somebody directly. Ping a couple people directly and say, “Hey, this is a production fix, can you take a look at this?” Generally, that’s enough, with the way we work, to get your code reviewed within the next few hours. Which is, generally sufficient.

You can move on to something else. You can review somebody else’s code in that time. You can kind of trade, if somebody else is coming up and finishing something up. You can be, “Oh yeah, I’ll take a look at that. You take a look at this one for me.” That kind of thing. We take a really, lightweight approach to it.

When you think about it, there’s a lot of natural breaks throughout your day that you can work these in to right? For me, I come in in the morning, “are there any PRs outstanding I can look at?” Then I get started. Then, right before lunch, or in the afternoon, if I’m going to take a coffee walk around Boston Common or something, I’ll look then. Either before or after. There’s plenty of times I find, that I can just work these in naturally. There’s no need to schedule them.

I’ve worked with teams that try to schedule them, or try to say, "This person is going to be the one that’s going to be chiefly doing code reviews this week", and I’ve never seen that work particularly well. I just try to say, "Keep it lightweight, friendly", things like that. People are really surprised, or skeptical, to hear that this works. The biggest thing that makes this work is keeping small, discrete pull requests, which are going to be much easier to review, and providing some excellent context. That’s the "why" you’re making this change, not necessarily the "what". Ultimately, the big secret to this is that most of these code reviews only take me five minutes. It’s not a big commitment.

What to cover in a Code Review

Derrick:

What type of things should a code review cover?

Derek:

It’s really important that each person involved in the code review, doesn’t operate from a central checklist, but rather just looks at things that interest them about a change right? Or interests you overall. Maybe you just got finished reading this great book on design patterns. You’re looking for a way to apply this one design pattern. That’s what you’re going to look for. Great. Fine. That’s valuable. You’re going to teach somebody something.

Or maybe you’re really interested in web security, and you’re going to take a look at things from that perspective. Or accessibility, you’re going to take a look at things from that perspective. That’s where having a good blend, on your team, of people who are interested in different things, really pays off, in code review. We’re all going to look at code reviews, and try and see if there’s an obvious bug. I said, not stressing the bug finding, is a big key to this.

But yes, you’re going to look for them, and you’re going to comment on them when you find them. What I really like when people do, is just sort of take their slant on how they think software development should be. Just kind of look at it with their lenses and see, “Oh, did you consider doing this alternative thing over here?” I typically look at … I’m always harping on naming. Right? Part of what I think makes code review so great, is that it’s a great place to have a technical discussion about your actual software. Rather than in the abstract, like when we go to conferences and things like that.

I harp on naming a lot to say, "Do these names give us a good basis for a conversation about this? Are they descriptive?" That’s really important to me. I look for test coverage, to see … Like I said, I’m not hunting for bugs throughout the entire system … But I am looking to see, "Is this adequately covered by tests?" I’m not going to look at every single test, but I’m just going to kind of get an idea of "What kind of tests would I expect to see in this change?" I’d expect to see a feature spec. Or I’d expect to see a unit test. That kind of thing. Just make sure that those are there. Stuff like that.

Like I said earlier, the biggest thing is just everybody kind of, bringing whatever it is that they’re interested in … Whatever they’re an expert at … To the table, so that we can all learn from each other.

“Having conflict in your code reviews is actually really beneficial”

Code Review Author and Reviewer Best Practice

Derrick:

As a code author, what should you do to get the most from a code review?

Derek:

The number one thing is context. When you’re submitting a pull request, you’ve been working on this thing for four hours, or eight hours, or two days, or sometimes even a week, or whatever the case may be. You have a lot of context built up in your head. Some things seem really obvious to you at this point. But, to the person reviewing it, they weren’t there for you. They don’t have that context. What you really need to do is think about… It really helps if you’ve been making several small commits along the way, where you’re describing what was in your head.

But as you prepare the change… If you’re going to use a GitHub pull request, and you’re going to submit your pull request to GitHub, make sure you give a nice description of, a summary, of everything … What I really like to see is, not what … A lot of people will summarize what they did? That’s kind of important, the bigger the change gets, so I know what to expect when I’m looking through the change? But it’s not the only thing I want to see, right?

I can figure out what a change does by looking at the code. Unless it’s totally obfuscated. What’s really interesting is why. Why are we changing it? Why is this the best solution? What other solutions did you consider? What problems did you run into? Is there an area of the code you’re really unsure about, that you’d really like some extra eyes on? That kind of thing. Those are going to help you get a much better review, by setting up everybody else to be on the same page as you.

Derrick:

When you’re reviewing code, how do you do it without being overly critical or needlessly pedantic?

Derek:

This happens a lot. Code reviews that aren’t done particularly well, or are overly critical, can lead to resentment among the team. The big thing there is negativity bias, which you need to be aware of. If I’m talking to you, like I am in this conversation, and I give you some technical feedback, and I say, "Oh, why didn’t you use this pattern here", whatever, you’re going to perceive that in one way. But if I say that same exact thing written down, it comes off more harsh. You’re going to perceive that more negatively.

It’s much more subject to your particular mood. I can’t influence it with the way I say something. It’s just a fact that, the same feedback written, is going to be perceived more negatively. One excellent way to get around that, is to give feedback in a manner that’s more of a conversation. What I like to do is ask questions. At Thoughtbot, we call this “asking questions, rather than making demands”, right?

Instead of saying, “Extract the service object here, to reduce some of this duplication”, I would say, “Hey, what do you think about extracting a service object here, to eliminate this duplication?” Right? They’re very similar comments, but now I’m opening it up to a conversation, by asking you a question. Like, “Oh, what do you think …?” Sometimes I’ll even say something like, “What do you think about doing it this way?”, and I’ll provide a code sample. Then afterwards, if I don’t really feel particularly strongly about it, I’ll be like, “No, I’m not really sure though, what do you think?” Just to kind of say, “I can see how this would go either way.” “What do you think?”

Clarifying how strongly you feel about a piece of feedback is also something that can help, as is making sure that you’re addressing people’s feedback. That’s basically how to avoid being overly critical: just asking nice, friendly questions that open up a conversation. The pedantic part, I guess, would be when you start arguing in circles. When you’re not coming to a good conclusion, right? What I’d say there is just, make sure you’re focusing on high-value things. Don’t go edge case hunting everywhere.

There’s a lot of times you can find edge cases that just aren’t going to come up. Or maybe they are and they’re important. But really, what you want to be focusing on is the higher-value stuff that we talked about earlier. Try and focus on those, to stay away from these pedantic arguments that you can get into in code review.

“The big secret is that most of these code reviews only take me five minutes”

Handling Conflict in Code Reviews

Derrick:

Inevitably, with discussions between developers in code, disagreements are going to occur, from time to time. How should you handle this type of conflict?

Derek:

I think the first thing to recognize is that having conflict in your code reviews like this is actually really beneficial. Right? As long as they’re the right types of conflict and nobody feels bad about them afterwards, that kind of thing … Because if you’re agreeing with your teammates all the time, and nothing interesting is happening in your code reviews, you basically have a monoculture, and those are dangerous in their own right.

Like I alluded to before, you want everybody bringing their own experiences, and their own expertise, and sometimes those are going to clash with each other. My experience was, doing things this way led to these problems, and your experience was, doing things this way was actually really beneficial. We’re going to have conflict about that. The key is that handling those conflicts properly is how everybody on the team is going to learn.

What I suggest, when you find yourself going back and forth, a couple comments about something, and you don’t seem to be getting anywhere … Nobody’s yielding, right? What I try to do is take a step back and be like, “How strongly do I feel about this?” Do I feel like, “This is a quality issue, that if we go out with this code, like it is today, it’s going to be a serious problem immediately?” Or is this kind of like, “It’s not really the approach I would have taken, but, maybe it will work. Or it’s a reasonable solution.” That kind of thing right?

Often times, that’s actually what it is. Once we get beyond a certain level of quality, what we’re talking about is trade-offs, all the time. That’s basically all we’re talking about when we’re doing computer programming. Like I said, once we get past the initial quality, it’s the trade-offs involved. If you’re arguing about what is essentially a trade-off, maybe it’s time to be like, "Okay. You know what? I would prefer to do it this way, but I can see why you’re doing it that way, and why don’t we just revisit this, if we need to, when we have more information?", right?

Sometimes, you’ll argue about code, for a while, and then it goes into production. Then you don’t touch it for six months. Which means, it was fine. Even if you didn’t think it was the greatest looking code in the world, like it was written, and it doesn’t have bugs that are obvious. It doesn’t need to be continually improved so, whatever, it’s fine.

If you can table the arguments you’re having until you have more information, and then kind of revisit them. Be like, “Oh, okay. I see now we have this bug, or we have this additional feature we need. I can see how that thing you said earlier, would be a better approach here.” That kind of thing. Waiting until you have more information to, really dig in again, is probably a good thing to do.

Recommended Resources

Derrick:

Can you recommend any resources for those wanting to improve the effectiveness of their code reviews?

Derek:

We have guides. If you go to github.com/thoughtbot/guides, there’s a bunch of guides in there. Some of them are about style, and protocol, and things like that. But there’s also a code review guide in there. Whether or not you agree with everything in there, I think it’s a good example of something that I’d like to see more teams do: they provide, "This is what we write down, and agree to", and then have pull requests against it when they want to change what they think code reviews should be and how they think they should operate.

Every piece of feedback, should it be addressed? Yes, I think so. At least to explain, "Okay, I see what you said there, but I’m going to go in this direction instead." That kind of thing. I’d like to see more people take a look at that, and then adopt it for their team. Other things that can really help with code reviews are style guides. Which we also have in that repo, but there are also many community style guides. Just avoiding those arguments about, "Oh, I’d really like to see you call …" Like in Ruby, we have map and collect as the same method, right?

Let’s be consistent. Let’s pick map, or let’s pick collect. Just writing that down somewhere, making that decision once, rather than arguing about it on every pull request. That’s basically it. I’d like to see more people get on board with documenting their experiences and their expectations out of these things.
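(A concrete aside, not from the interview: in Ruby, map and collect really are two names for the same Enumerable method, which is exactly the kind of decision a style guide settles once.)

    # map and collect are aliases; both return the same result.
    [1, 2, 3].map     { |n| n * 2 }  # => [2, 4, 6]
    [1, 2, 3].collect { |n| n * 2 }  # => [2, 4, 6]

    # A style guide records the choice once (e.g. "prefer map"), so it never
    # has to be argued again in a pull request.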

Derrick:

Derek, thanks so much for taking time and joining us today.

Derek:

Yeah, thanks for having me. It was good fun.

——–

It’s extremely important to learn how to build a code review culture. Code reviews are great for code quality: they reduce errors, help deliver better code, and encourage teams to be more compassionate towards one another.

Refactoring to a happier development team

 

In this interview with Coraline Ada Ehmke, Lead Software Engineer at Instructure, we discuss data-driven refactoring and developer happiness teams. Coraline gives some great advice on the kinds of tests we should write for refactoring, tools to use and metrics to monitor, to make sure our refactoring is effective. We also learn about the role of refactoring in the Developer Happiness team at Instructure. You can read more from Coraline on her site.

Content and Timings

  • Introduction (0:00)
  • Useful Tests for Refactoring (0:57)
  • Data-driven Refactoring Metrics (4:13)
  • Winning Management Over to Refactoring (6:34)
  • The Developer Happiness Team (8:36)
  • Recommended Resources (13:58)

Transcript

Introduction

Derrick:
Coraline Ada Ehmke is Lead Engineer at Instructure. She is the creator of the Contributor Covenant, a code of conduct for open source projects, and Founder of OS4W, an organization encouraging greater diversity in open source projects. She speaks regularly at conferences about software development, including the talk ‘Data-Driven Refactoring’. Coraline, thank you so much for joining us today. Do you have any more to share about yourself?

Coraline:
I want to say I’ve been doing web development for about 20 years now, which is forever. I actually wrote my first website before there was a graphical browser to view it in. I started out in Perl, went into ASP for a little while, did some Java, and then discovered Ruby in 2007, and I haven’t looked back. All that experience has led me to be very opinionated about what makes good software. I have some very strong opinions that I’m always flexible about changing, but I have a sense of what good software looks like, and that sort of drives me in my daily work.

“A lot of problems come with code that’s just good enough”

Useful Tests for Refactoring

Derrick:
I want to dig into your experience with refactoring a bit. You say that without tests, you shouldn’t refactor. Why are tests so important to refactoring?

Coraline:
I think it was Michael Feathers who once said that ‘if you’re refactoring without testing, you’re just changing shit’. Basically, the importance of testing is you want to challenge and document what your assumptions are about the way that a piece of code works currently, which can be really difficult, because you might look at a method or a class name and assume its role based on its name, but that name could have drifted over time. You need to ensure that you’re documenting what you think it does and then testing to see that it does what you think it does. Testing can also help you stay within the guardrails through refactoring efforts. It’s hard to know where to stop sometimes, so tests can help you identify when you’re doing too much. You want to make sure you’re doing the minimum rework with the maximum results.

Derrick:
Yeah, that sounds great. It’s so easy with refactoring to just continue down that rabbit hole. What kind of tests are useful when refactoring?

Coraline:
It’s good to have good test coverage to start with, if it’s possible. If you don’t have good test coverage at the start, it’s good to build up some unit tests and integration tests just to make sure you’re staying on the path. There are a couple kinds of tests that I’m particularly interested in when it comes to refactoring. The first is boundary testing, which is basically using test cases to test the extremes of input. The maximum values, the minimum values, things that are just inside or outside of boundaries, typical values, and error values as well.

I like to do generative testing for this. If I have a method that takes an argument, I will create an enumerable with lots of different kinds of arguments in it, and see what passes and what fails. Those failures are really important because when you’re refactoring, you can’t know for sure that somewhere in the code base something is waiting for that method to fail or raise an exception. Your refactoring should not change the signature of that method such that a given piece of input stops causing a failure. You want to be really careful with that. Generative testing can help a lot with that, so there’s that with boundary testing.
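(A minimal, hypothetical sketch of that technique using Minitest; parse_quantity and its boundary values are invented purely to show the shape of driving one method with an enumerable of edge-case inputs.)

    require "minitest/autorun"

    # Hypothetical method under test: assumed to raise for anything that is
    # not a positive integer.
    def parse_quantity(value)
      quantity = Integer(value)
      raise ArgumentError, "quantity must be positive" unless quantity.positive?
      quantity
    end

    class ParseQuantityBoundaryTest < Minitest::Test
      # Inputs that should keep failing after the refactoring; callers may
      # depend on these exceptions being raised.
      FAILING_INPUTS = [nil, "", "abc", "1.5", 0, -1].freeze

      # Typical and extreme values that should keep succeeding.
      PASSING_INPUTS = [1, 7, 2**31].freeze

      def test_boundary_inputs_still_fail
        FAILING_INPUTS.each do |input|
          assert_raises(ArgumentError, TypeError) { parse_quantity(input) }
        end
      end

      def test_typical_and_extreme_inputs_still_pass
        PASSING_INPUTS.each do |input|
          assert_equal input, parse_quantity(input)
        end
      end
    end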

Also, I like attribute testing, which is an outside-in way of testing. You check that, after your code runs, the object that’s being tested possesses or does not possess certain qualities. This can be really useful for understanding when you’ve changed how an object is serialized, or the attributes of an object, without getting into the implementation details.
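(A similarly hypothetical sketch of that outside-in style: the test asserts only on which attributes the serialized object exposes, never on how the serialization is implemented.)

    require "minitest/autorun"
    require "json"

    # Hypothetical object whose serialized form we care about.
    class Invoice
      def initialize(id:, total_cents:)
        @id = id
        @total_cents = total_cents
      end

      def to_json(*)
        { id: @id, total_cents: @total_cents }.to_json
      end
    end

    class InvoiceAttributeTest < Minitest::Test
      def test_serialized_invoice_has_expected_attributes
        attributes = JSON.parse(Invoice.new(id: 42, total_cents: 1999).to_json)

        # Assert on the qualities of the result, not the implementation.
        assert_equal %w[id total_cents].sort, attributes.keys.sort
        refute_includes attributes.keys, "internal_notes"
      end
    end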

The most important tests, though, for refactoring are tests that you throw away. We’re hesitant. We love deleting code, but we hate deleting tests, because it gives us this feeling of insecurity. The sort of test that you do during refactoring doesn’t lend itself well to maintaining code over time. You’re going to be a lot more granular in your tests. You’re going to be testing things that are outside of the purview of normal integration or unit tests, and it’s important that you feel comfortable throwing those away when you’re done with a refactoring effort.

Derrick:
Yeah. The importance of being able to throw away stuff really gives you that perspective on what you’re trying to do.

Coraline:
And it lets you work incrementally, too. You can start by challenging your assumption about how a piece of code works. Once you feel confident, you can set up some boundary tests to see how it responds to strange conditions. Once you feel confident with that, then you’re probably confident enough to do some refactoring and make sure that you’re not introducing regressions.

Data-driven Refactoring Metrics

Derrick:
An important aspect of your approach to refactoring is data and measuring the impact of refactoring efforts over time. What are some useful metrics to track?

Coraline:
I want to emphasize that over time is really, really important, because if you’re not collecting data on your starting point and checkpoints along the way, you don’t know if you’re actually making your code base better or worse. It is possible through refactoring to make your code base worse. Some of the things I like to look at at the start are test execution times. If your test suite takes 45 minutes to run in CI, you’re not doing TDD. It’s impossible, because the feedback loop for your test has to be as short as possible. Refactoring tests is actually a great way to get started in trying to adjust that test execution time.

Another one that I’ll probably talk about later is the feature-to-bug ratio. If you look at your sprint planning, figure out how much time you’re spending on new features versus how much time you’re spending on bugs. You can actually use some tools to track down what areas of your code base are generating the most bugs, and combine that with churn to see what’s changing the most. Those are interesting.
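(To make the "over time" point concrete, a toy sketch with invented sprint names and ticket counts:)

    # Feature-to-bug ratio per sprint, from completed ticket counts.
    sprints = {
      "sprint-41" => { features: 14, bugs: 9 },
      "sprint-42" => { features: 11, bugs: 15 },
      "sprint-43" => { features: 8,  bugs: 21 },
    }

    sprints.each do |name, counts|
      ratio = counts[:features].to_f / counts[:bugs]
      puts format("%s  feature-to-bug ratio: %.2f", name, ratio)
    end
    # A ratio that keeps falling is the trend worth showing to stakeholders.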

I like to look at code quality metrics. A lot of people use Code Climate for tracking code quality, which I don’t really like that much, because it assigns letter grades to code. If a class has an ‘F’, you might say, "Oh, it needs to be rewritten," or if a class has an ‘A’, you might think, "Oh, it’s perfect. No changes are necessary." Code Climate doesn’t really give you the change over time. It’ll give you a notification if a given class changes its grade. That’s great, but you can’t track that over time. I like some other code quality tools. Looking at complexity, there are tools that do assignment branch conditional algorithms to see how complex a piece of code is. I also like to look at coupling, because coupling is often a symptom of a need for refactoring, especially in a legacy application.
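(For context: the assignment branch conditional, or ABC, metric she mentions is conventionally the magnitude of a vector of three counts. A toy sketch, not how any particular tool implements it:)

    # ABC score, as commonly defined: sqrt(A^2 + B^2 + C^2), where A counts
    # assignments, B counts branches (method calls) and C counts conditionals.
    # Tools like Fukuzatsu derive the counts from parsed source; here they are
    # simply passed in.
    def abc_score(assignments:, branches:, conditionals:)
      Math.sqrt(assignments**2 + branches**2 + conditionals**2).round(2)
    end

    puts abc_score(assignments: 4, branches: 7, conditionals: 3)  # => 8.6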

Derrick:
What mistakes do you see people making with refactoring?

Coraline:
Basically trying to do too much at once. It’s really easy to look at a piece of code and immediately make a value judgment, saying, “This is bad code,” which is really unfair to the programmers who came before, and saying, “I’m just going to rewrite the whole thing.” We have this instinct to burn it all down. Trying to change too much at one time is really a recipe for disaster. You want to keep the surface area of your changes as small as possible and work as incrementally as possible. Those things take discipline, and I think it’s hard for a lot of people to achieve that without actively policing themselves.

“The most important tests, though, for refactoring are tests that you throw away”

Winning Management Over to Refactoring

Derrick:
Often just getting the time to actually spend on refactoring can be a challenge. What are some benefits of refactoring that can win management over?

Coraline:
That’s a big, important question. I think in a healthy development organization, the development team, the engineers themselves, are stakeholders. They have some input into your backlog. If there’s technical debt that needs to be addressed, that can be prioritized. Not all of us live in a world where we have that kind of influence over the backlog. Some of the ways you can convince management that refactoring is a good idea… I talked before about the bug-to-feature ratio. Stakeholders want new features. New features keep products healthy and appealing to end users and to investors. Anything that stands in the way of a new feature getting out is necessarily a business problem.

If you look at your feature-to-bug-fix ratio and see that it’s poor, one way you can sell it is to say, “through this refactoring, we’re going to improve this ratio and be able to do more feature work.” You can also look at how long it takes to implement a new feature and talk about how refactoring a piece of code, especially if it’s code that changes very often, can make it easier to add new features to a particular area of the code base faster and easier.

In the end, if none of that works, just cheat. Just lie and pad your estimates, so you automatically build in some time for reducing technical debt with all of your estimates, which unfortunately I’ve had to use more often than I would have liked to. It is a valuable tactic to keep in your toolbox.

Derrick:
Yeah, definitely. I still really like… You mentioned this earlier, but the feature-to-bug-fix ratio is really important.

Coraline:
It really is a great health check for how the team overall is doing, what the code looks like and the health of the team, as well. If people are doing a lot of bug fixes, you’re probably going to have some unhappy engineers, because they want to work on new and shiny things. That can have an impact on how happy your developers are, which is really critical to how much work they’re getting done and how good your product is.

The Developer Happiness Team

Derrick:
Definitely. Let’s talk about happiness. You previously led what was known as the Developer Happiness or Refactoring team at Instructure. Tell us a little bit about the role and mission of that team.

Coraline:
It was one of my favorite jobs, honestly. We were charged with increasing the happiness and productivity of the engineering team. Within our charter was identifying processes or parts of the code base or even people who were standing in the way of developers being as productive as they could possibly be. Being very data driven in my approach, the first thing I wanted to do was get a sense of just how gnarly the code base was.

We ended up writing a bunch of code analysis tools. Again, we were using Code Climate, but I wasn’t really satisfied with its tracking ability over time. So we wrote a few tools. I wrote one called Fukuzatsu, which is an ABC complexity measurement tool. There was already a tool out there that was very similar called Flog that was written by Ryan Davis, but it’s an opinionated tool. It punishes you for things like metaprogramming, and it also favors frameworks like active record and punishes you for using alternative ORMs, so just in the nature of the bias that was built in. I wanted something that… I don’t want opinions with my data. I’m an engineer. I’m a professional. I’m capable of forming my own opinions. I just wanted raw data to work with, and that’s what Fukuzatsu gave me.

Another tool that we ended up building as part of this process at the very beginning was something called Society, which basically makes a social graph of the relationships between your classes. You get this nice circular diagram with afferent and efferent coupling displayed in different colors, and you see the links between different classes. That can help you identify service boundaries, for example, because if you have one class that’s a trunk of a lot of inbound or outbound connections, you might say, “That’s a good place for a service.”
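(Afferent and efferent coupling are standard terms rather than something defined in the interview: afferent coupling, Ca, counts the classes that depend on a given class; efferent coupling, Ce, counts the classes it depends on; and Ce / (Ca + Ce) is commonly reported as "instability". A toy sketch with invented class names:)

    # Each pair [from, to] means "from depends on to".
    DEPENDENCIES = [
      ["OrdersController",   "Order"],
      ["InvoicesController", "Order"],
      ["Order",              "Money"],
      ["Order",              "Customer"],
    ].freeze

    def coupling(klass)
      ca = DEPENDENCIES.count { |_from, to| to == klass }    # afferent: incoming
      ce = DEPENDENCIES.count { |from, _to| from == klass }  # efferent: outgoing
      instability = (ca + ce).zero? ? 0.0 : ce.to_f / (ca + ce)
      { afferent: ca, efferent: ce, instability: instability.round(2) }
    end

    p coupling("Order")  # afferent: 2, efferent: 2, instability: 0.5

A class that turns out to be a hub of incoming dependencies, like the invented "Order" here, is the kind of place a graph such as Society would flag as a candidate service boundary.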

A lot of the work we did early on was building tools. We built a mega-tool that collected data from all these other tools and presented them in a dashboard format with change shown over time. You could drill into complexity. You could drill into coupling. You could drill into code quality or code smells, or various other metrics, and see how they were changing at the commit level. That being done, we identified which areas were causing a lot of bugs, which areas of the code are really complex or really changing a lot.

The next step was creating a team of refactoring ambassadors. Rather than taking on the refactoring work ourselves and handing it over to the team that owned the code, our goal was to send in someone who’s really good at refactoring to work with that team to refactor the problematic code, which I think was pretty valuable in terms of ownership and continued success, and also training and teaching and helping those teams level up.

I think a lot of problems come with code that’s just good enough. I think we’ve all seen pull request bombs where someone had a bug to fix or had a feature to implement, and they went down a path that maybe was not optimal. By the time you actually saw it to do a code review, it was too late to suggest an alternative approach, because so much work had been put into it. You knew that you would just crush the spirit of whoever sent the PR. We end up with this lowest-common-denominator code. People also tend to copy and paste the approach that they’ve seen taken elsewhere in a class or elsewhere in a code base. Demonstrating to them that there’s a better way and there’s a better pattern you can follow is really important to maintaining code quality. That was the overall mission of the team.

Derrick:
Is that kind of focussed team something you’d recommend others try? What kind of impact did it have on factors like code quality and the developer happiness?

Coraline:
I think people appreciated that we were paying attention to engineering’s needs. We’d send out surveys to gauge how satisfied people were with the code that they were writing, with CI, with the test suite, with the overall process. We tracked it over time. I think engineers want to be listened to outside of the scope of just, “What are you building right now?”, or “What are you fixing right now?” They feel empowered when their needs are being addressed and when people are asking them questions about how they’re feeling and the work that they’re doing.

In terms of it being a full-time team, I think that really depends on the size of your engineering organization. Instructure has about 100 engineers, so it made sense to have a dedicated team doing that for a while. I think you could have an individual who’s charged with doing that as their primary job, and that would probably help, or have a team that’s a virtual team where you’re rotating people through on a regular basis.

There are different ways to approach it. I think the important thing is just gauging the health of your development organization and gauging the happiness of your engineers. That is going to have a significant impact on the quality of your code in the end.

Derrick:
You talked about all those tools for code quality, and they sound really great. How did you measure developer happiness? Was it talking to people, or…

Coraline:
Talking to people. We had a survey that we created that asked a bunch of questions about, “What do you think are the impediments to getting your job done? How good do you feel about the feature work that you’ve been doing? Are we spending too much time fixing bugs?” We published the anonymized results of those surveys to our internal wiki so people could go back and reference them, and we could use that to track how well we were doing as a team, as well. Again, just listening to people makes them feel better about things and gives them some hope that maybe change is coming.

“I think engineers want to be listened to outside of the scope of just, ‘What are you building right now?’”

Recommended Resources

Derrick:
Can you recommend any other resources for those wanting to learn more about effective refactoring?

Coraline:
I would suggest really getting to know what your testing tools are as a first step, and looking at some alternative ways of testing. Generative testing, for example, gets a bad rap, but I think it’s really a helpful technique to use when you’re doing refactoring. Looking at what people in other languages are doing in terms of their philosophical approach to testing is really important, and making sure that you’re really up to speed on what those different tools are so that you can be more effective in your work.

In terms of tooling, I would recommend a tool for Ruby called Reek, which is a code smell identifier that can help you isolate what areas you want to focus on. You can create an encyclopedia of code smells for your code base and refer back to that, and organize your work such that, “Oh, we’re going to address all the code smells that are of this kind,” and do that across the board. Really look at the tools that are available for your language in terms of code quality.

There are a few books that I recommend. The first is ‘Working Effectively with Legacy Code’ by Michael Feathers. The examples are written in Java, but the lessons that it teaches are applicable to any language. That’s a really good place to start. I work in Ruby, so I would also recommend ‘Refactoring: Ruby Edition’ by Jay Fields and Shane Harvie, and another book called ‘Rails Anti-Patterns’ by Chad Pytel and Tammer Saleh.

Derrick:
Those are fantastic resources. Coraline, thank you so much for joining us today.

Coraline:
Great, it was a lot of fun. Thank you for inviting me.

 

Originally published on www.fogcreek.com in October 2015