Modern code reviews are considered a key component in the development lifecycle of any IT organization.
Yearly case studies performed by various researchers report that it’s one of the main methods for improving code quality, sharing knowledge across the team, increased collaboration as well as greater adherence to coding standards.
Unlike Fagan’s code inspection design introduced in 1976, most of us employ a more lenient process, which is:
- informal
- asynchronous
- tool-based
- focused on reviewing code changes
Results from multiple analysis have shown that the latent defect discovery rate is about 30% in case of informal reviews, proving them to be very effective. They are also a great source of knowledge, because the process usually entails multiple developers collectively looking at a problem and the possible solution.
However, code reviews can become quite cumbersome for newcomers, especially if you find yourself in a company that has some tribal knowledge, which is usually the case.
Tribal knowledge is any unwritten form of how a process should work, or in our specific case, how to produce code. And there can be quite a lot of it, spanning from syntax preferences to unwritten rules and established conventions, which in turn results in code reviews containing dozens of comments that could have been avoided.
That kind of noise takes a lot of focus from the business logic and design. It further elongates the code review process as each back and forth takes additional time due to the asynchronous nature of the process.
A solution to this problem are linters and if you chose Ruby as your programming language, chances are high that you already used RuboCop.
RuboCop linter
Linters are tools used during development that warn the programmer about formatting and adhering to a set of rules.
The warnings are usually related to syntax, but the Ruby community went a step further and introduced a couple of behavioral ones like rubocop-rails that can even spot unwanted behaviour for you.
If you think about it, why wouldn’t you want automatic warnings when you introduce a bug…before the code is shipped. It sounds amazing!
A very cool behavioural cop is the Rails/UniqueValidationWithoutIndex, which detects model uniqueness validations and checks if the uniqueness constraints are also set on the database level. In the following example you can see how the warning looks like if you forget to add the unique index during database migrations.
This concept opens up a lot of ideas and opportunities for a company to create a code convention template that is shared throughout all projects. This increases technical cohesion on a team level and ensures that everyone is roughly on the same page when it comes to conventions and best practices.
Let’s look at a specific code example in Rails that declares the attribute confirmed_at on a User and it defaults to the current time. The Attributes API provides just what we need:
class User
attribute :confirmed_at, :datetime, default: Time.zone.now
end
pry> User.new
=> #<User id: nil, confirmed_at: Tue, 27 Oct 2020 11:39:22 UTC +00:00
At first glance, this looks totally fine, but if you don’t write thorough tests, the bug can be hard to catch, even during code reviews. Look at what happens for every new user:
pry> User.new
=> #<User id: nil, confirmed_at: Tue, 27 Oct 2020 11:39:22 UTC +00:00
pry> User.new
=> #<User id: nil, confirmed_at: Tue, 27 Oct 2020 11:39:22 UTC +00:00
pry> User.new
=> #<User id: nil, confirmed_at: Tue, 27 Oct 2020 11:39:22 UTC +00:00
Time.zone.now is being passed as a value to the default option, and since attribute is a class method, the value of Time.zone.now is set at the moment when the application is booted. This means that all values will be the same regardless of the return value of Time.zone.now at the time of code execution. The solution is defining confirmed_at dynamically:
class User
attribute :confirmed_at, :datetime, default: -> { Time.zone.now }
end
This is a perfect example of tribal knowledge. You either learn to avoid it by breaking production, writing very thorough unit tests or by having someone more experienced mention it during the code review.
Now imagine someone’s watching over your code as you type it, without the awkward presence of course, and giving you real time feedback to avoid such pitfalls. This is exactly what RuboCop’s cops provide us with. In order to be able to build such a layer of protection, you first need to understand how the source code gets converted into data structures.
Abstract syntax tree
AST is a tree representation of the written source code. Each node of the tree denotes a piece of the source code, which can be any literal, object, constant or any other construct used in the programming language.
Let’s take a look at some simple examples of Ruby constructs and how their AST looks like.
For this, the parser gem is used. After installing it, you can jump into your terminal and start playing around:
ruby-parse -e ’1’
(int 1)
ruby-parse -e ’”1”’
(str “1”)
ruby-parse -e ’:key’
(sym :key)
ruby-parse -e ’[]’
(array)
ruby-parse -e ’{}’
(hash)
The basic constructs are pretty straightforward. We get a readable description of the code, which tells us that 1 is an integer with value 1, while “1” is a string with value “1”. We can see that this continues to be the case in more complex expressions:
ruby-parse -e ’[1, 2, 3]’
(array
(int 1)
(int 2)
(int 3))
ruby-parse -e ’{username: “john”, password: “doe”}’
(hash
(pair
(sym :username)
(str "john"))
(pair
(sym :password)
(str "doe")))
If you haven’t grasped what’s going on, it really helps reading the Ruby code out loud. Take a look at the last example and try reading every bit of Ruby code in the most descriptive way possible. I said it the first time like this:
I have a hash that contains 2 elements, where the first element key is :username and value is “john” and the second element key is :password and value is “doe”.
This is not explicit enough, because I didn’t specify the primitive type of the key (symbol) and value (string) of the elements.
So, let’s investigate the AST of our example:
ruby-parse -e ’attribute :confirmed_at, :datetime, default: Time.zone.now’
(send nil :attribute
(sym :confirmed_at)
(sym :datetime)
(hash
(pair
(sym :default)
(send
(send
(const nil :Time) :zone) :now))))
Looks familiar, doesn’t it? It’s a combination of all the nodes we’ve parsed in the previous examples. Any class that inherits ApplicationRecord or includes the AttributesAPI module can receive the attribute message that accepts 3 arguments:
- symbol confirmed_at
- symbol datetime
- keyword argument :default with any value (in our case Time.zone.now)
The AST concept gives us the ability to convert source code into data structures. Now we just need another tool that will traverse the tree and find specific nodes in Ruby using pattern matching, or maybe regular expressions?
Node pattern
Fortunately for us, this problem has also been solved with the Node pattern. It is a DSL for finding specific nodes in the AST using simple strings.
This is the last step before diving into the implementation of a custom cop – finding the right string that will match against a specific AST. Here are some examples of the Ruby code, AST, and the pattern.
The only real difference between the AST and the pattern is nil?, which is a special case predicate that matches against nil objects. Bear in mind that the goal is to match arbitrary code to these patterns, not just our specific examples in the table. We need to take great care of warning of invalid cases while keeping the valid usages unaffected:
attribute :role, :string, default: :user
attribute :role, default: :user
attribute :active, :boolean, default: true
Sandi Metz gave an excellent talk on RailsConf in 2015, where she stated:
You can reveal how things are different, by making them more alike.
By putting this quote to use, we can spot the static nodes in an infinite range of code possibilities and reveal the pattern:
attribute :any_name, :optional_argument, default: any_construct
- attribute – the class method that never changes, so we keep this as “attribute” in the pattern
- any_name – doesn’t really concern us, as long as it is there because the API requires it as a method parameter. This fits the bill nicely for the _ matcher. It will fit any single node.
- optional_argument – also doesn’t matter what it is, but it doesn’t have to exist. The “?” matcher limits the match to 0 or 1 nodes, and when used in conjunction with “_” you effectively create a pattern for any optional construct “?_”
- default – the Attributes API also receives keyword arguments, but we are only interested in the hash when the first node is a symbol with value “default”. This is covered in the Abstract Syntax Tree chapter by parsing the hash.
- any_construct – this is the target we are interested in. If we can get the type of this node we’ll be able to warn the developer for offending cases like a method call, array or hash, because none of those should be used on the class level, since we’re creating an attribute that will be used on the instance level. The $ matcher captures the node, making it available for processing.
Let’s use the information and write the pattern that would fit the bill for all cases:
(send nil? :attribute
_
?_
(hash
(pair
(sym :default)
$_))
At this point, you should be confident enough to start writing some code.
Creating the cop
We love writing tests, especially when the tests are fast and easy to understand.
RuboCop provides a beautiful DSL for testing cops but unfortunately a lot of articles found online don’t have the information how to set up cop tests. Don’t worry, we’ve got you covered this time!
Here are some tips for setting up RSpec cop tests:
Make sure RuboCop is in your Gemfile and bundled up.
Open spec/rails_helper.rb and add:
require ’rubocop’
require ’rubocop/rspec/support’
Include it globally or specific category of tests (eg. type: :cop):
RSpec.configure do |config|
config.include RuboCop::RSpec::ExpectOffense
# rest of your rspec configuration
end
The ExpectOffense module provides us with a beautiful way of writing clear and descriptive cop tests. They resemble the actual Ruby code you’re trying to cover:
#spec/cop/rails/attribute_default_block_value_spec.rb
describe Cop::Rails::AttributeDefaultBlockValue do
subject(:cop) { described_class.new(config) }
let(:config) { RuboCop::Config.new }
let(:message) { ’Pass method in a block to `:default` option.’ }
it ’disallows method called from other objects’ do
expect_offense(<<~RUBY)
attribute :confirmed_at, :datetime, default: Time.zone.now
^^^^^^^^^^^^^ #{message}
RUBY
end
end
Now we can create the cop:
#lib/cop/rails/attribute_default_block_value.rb
module Cop
module Rails
class AttributeDefaultBlockValue < ::RuboCop::Cop::Cop
MSG = ’Pass method in a block to `:default` option.’
TYPE_OFFENDERS = %i[send array hash].freeze
def_node_matcher :default_attribute, <<~PATTERN
(send nil? :attribute _ ?_ (hash <$#attribute ...>))
PATTERN
def_node_matcher :attribute, ’(pair (sym :default) $_)’
def on_send(node)
default_attribute(node) do |attribute|
value = attribute.children.last
return unless TYPE_OFFENDERS.any? { |type| value.type == type }
add_offense(node, location: value)
end
end
end
end
end
RuboCop::Cop::Cop provides a bunch of stuff that’s used in all class instance cops.
The def_node_matcher method is used for matching nodes. It requires the name of the matcher and the matching pattern. In the example above I used matcher chaining, but you can just as easily use one matcher with the AST that was defined at the end of the previous chapter:
def_node_matcher :default_attribute, <<~PATTERN
(send nil? :attribute
_
?_
(hash
(pair
(sym :default)
$_))
PATTERN
Callback methods (on_send, on_block, on_class etc…) check whether a node is matched against a pattern. You know what callback to use from the first node AST, which in our case is send. We used def on_send(node).
Inside of the callback methods you usually start off with the matcher name defined in def_node_matcher in order to get the offending construct. When calling default_attribute(node), you get an instance of a type of RuboCop::AST::Node that knows its type and node children, alongside other things.
The add_offense method provides an easy way of adding a warning message, defining the error positioning (^^^) and the severity of the error:
- refactor
- convention
- warning
- error
- fatal
I advise you to look at the RuboCop source code, because it’s really nicely documented and there’s a lot of examples for learning. The rest is just…Ruby.
Lastly, include your new cop in .rubocop.yml.
require:
- ./lib/cop/rails/attribute_default_block_value
- rubocop-infinum
Upon reopening or changing a file, you should see the new cop in action.
Want cleaner pull requests?
Code reviews are important and the discussions should focus on strategies used to solve the domain-specific problem, finding edge cases, and pointing out possible design or test improvements. They should have a minimum number of comments related to project conventions, as well as company code conventions, typos, or syntax errors.
Having a quick feedback loop in your IDE enables all of that, while being much more effective time-wise than PR reviews.
Another benefit is the feeling of accomplishment when your custom cop gets recognized as useful and gets promoted from a project-specific cop to a company-specific one. Maybe it will even get merged with the other cops to serve and protect codebases all around the world.
We use RuboCop at Infinum and have only started down this road, but we see a lot of benefits. Having cleaner pull requests, automated knowledge sharing and being able to engage with open source software makes it a no-brainer.
The Ruby community is one of the nicest programming-language-related communities out there and we should all actively search for ways to improve it and help other fellow developers write good, clean code.