Refactoring, Part Two: In Defense of Magic
A quick program note: If you like this newsletter, you might like my recent books: "Modern Front-End Development for Rails" (Ebook) (Amazon) and "Modern CSS With Tailwind" (Ebook) (Amazon). If you've already read and enjoyed either book, I would greatly appreciate your help by giving a rating on Amazon. Thanks!
In my last post, I refactored the status field for cards in my project tracker to a more object-oriented representation that used value objects and classes to replace conditionals throughout the code.
I was apprehensive about posting it, because it seems like that kind of OO code is out of fashion. What feedback I got, though, was positive. So I'm going to go a little farther into metaprogramming with this example. The code here works for this project, but eventually goes beyond what I'd be likely to recommend for most projects.
When we left the code, we had one case statement remaining: a factory method that converted a string value to one of several subclasses of a parent Status
class:
class Status
def self.at(value)
return value if value.is_a?(Status)
case value
when "done" then DoneStatus.new
when "started" then StartedStatus.new
when "unstarted" then UnstartedStatus.new
when "inbox" then InboxStatus.new
when "attic" then AtticStatus.new
else
throw InvalidValueError.new(value)
end
end
class DoneStatus < Status
def order
1
end
def to_s
"done"
end
end
# and so on
end
Parenthetically, I'm re-reading 99 Bottles of OOP for a book club at work, and when Sandy and Katrina do a factory method like this, they explicitly don't keep the first return
if it's already converted line, but I like it because then I can use this factory method as a safe conversion method -- it's always safe for me to call Status.at
because if the string has already been converted, then it'll just return itself.
Anyway, that's where I left the code for the purposes of the last post because it's clear enough and made the point I wanted to make.
But I still don't like that long case statement. I considered a data-driven version where I had a hash with string keys and class values like {inbox: StatusInbox}
, but I decided to go with a data-driven version where the data is simply an array of strings:
class Status
STATUSES = %w[inbox attic unstarted started done]
def self.at(value)
return value if value.is_a?(Status)
raise InvalidValueError.new(value) unless valid?(value.to_s)
"Status::#{value.to_s.capitalize}Status".constantize.new
end
class DoneStatus < Status
def order = 0
def to_s = "done"
end
# and so on...
end
To get a minor part out of the way, the def order = 0
is Ruby 3.0 one-line method definition and I've decided I like it on the grounds that it saves me two vertical lines and loses no information. More to the point, I find the syntax is good shorthand for intent -- my intent is for any method I define with that syntax to be short and effectively a calculated attribute of the class.
Structurally, I've replaced the case
statement with an array and a line of metaprogramming, "Status::#{value.to_s.capitalize}Status".constantize.new
, which generates an instance of a class whose name is based on the value.
And now it's time to talk about metaprogramming and magic.
Metaprogramming has a few different definitions. For our purposes, it's any code that waits until runtime to do something dynamically that is normally done statically at compile or load time. In this case, the code is deriving the subclass class name from the value string at run time, where the previous version statically listed the classes at load time.
Magic, as applied to programming tools, has no agreed-on definition beyond the vague statement that "Rails has a lot of magic". I define "magic" as anything that happens in the code that can't be explained or defined by a simple text search. It's the supernatural action of code, you can't tell where it's coming from and sometimes it seems spooky.
In this code snippet, the Status#at
method does work that can't be searched for using normal definitions. If I wanted to see all the places in the code where DoneStatus
is instantiated by searching for DoneStatus.new
, I would no longer find that code because it's been replaced by the dynamic line.
Ruby makes metaprogramming relatively easy, which depending on who you talk to is either a great strength of Ruby or something we should only talk about in embarrassed hushed tones. It is worth mentioning that Ruby's two most influential frameworks -- Rails and RSpec -- are both heavily based on metaprogramming, and none of the other language copies have fully achieved the ease of use of those tools without Ruby metaprogramming. (And I do realize that people are rolling their eyes at me for putting "RSpec" and "ease of use" in the same sentence.)
As I write this, I'm looking back at my notes from the very first Rails-related conference like thing I ever attended. This was a touring event sponsored by Pragmatic called The Rails Edge, and I went to the Chicago incarnation in, I think, August 2007. The very first talk was from Dave Thomas about metaprogramming Ruby. I quote him as saying "Java is the blunt scissors of programming" and, speaking about the idea that metaprogramming was technically challenging made a comment that it was a "people problem" not a technology problem. Actually... I think I need to do a separate newsletter post on these notes.
Where was I?
My point being that there was once a lot of excitement about being able to do weird dynamic stuff in Ruby, and while I get why that cooled, it has always been part of what I like about Ruby and Rails.
Magic code can be slow. It can be hard to read, and it can be really hard to follow if you didn't write it. I completely and totally get why we've shied away from it, especially on larger projects. (Dave-Thomas-in-my-notes-from-2007 lists these same basic negatives to metaprogramming.)
I do wonder if we've over-corrected a little. So I'm pushing metaprogramming a little bit on my side project, which is still small and which I am the only coder for the foreseeable future.
As I've done for other parts of the project, I'm trying to be explicit about my metaprogramming goals. A metaprogramming solution is viable if:
It provides a clear aesthetic benefit in the code. There's no point in producing ugly metaprogrammed code. In this case, I'm collapsing an ugly case
statement and making it data driven.
It provides some other benefit over the static version. In this case, I think there are a couple benefits. A new status can now be added just by updating the STATUSES
list -- several dependencies (not all shown here) can use that list and no longer need to be updated separately. The metaprogrammed version also enforces consistency in the name of the subclass of a new status. Enforcing consistency is often a good reason for metaprogramming.
(The non metaprogrammed version of this where the STATUSES
is a key/value pair of statuses and classes: {done: DoneStatus, started: StartedStatus, ...}
has the data-driven benefit, but not the enforced consistency benefit.)
Some allowance needs to be made for discoverability. I haven't done much for this yet, because it doesn't really worry me in this code. If somebody wants to find out where Status
objects are created, the class methods of the Status
class are likely to be the first place they look.
With that in hand, I looked for other code that had duplicate structures based on the name of the classes and that could be metaprogrammed using the same list of status strings.
The original code had a series of methods in the parent Status class that looked like this:
def done? = to_s == "done"
def started? = to_s == "started"
def unstarted? = to_s == "unstarted"
def inbox? = to_s == "inbox"
def attic? = to_s == "attic"
This is clear, but clunky, and has duplicated structure. If I change the list of statuses, I also have to change this list of methods.
One way of cleaning this up would be to define all the boolean methods as just returning false
in the parent class and allow each subclass to override the one method it cares about to set it to return true
. That's a reasonable way to go about it, though it still requires changing the methods if we change the list of statuses.
Now that we have a single canonical list of status values, though, we can metaprogram our way out of this.
Here's one way:
# done? started? unstarted? inbox? attic?
# def done = to_s == "done"
STATUSES.each do |status|
define_method "#{status}?" do
to_s == status.to_s
end
end
This uses Ruby's define_method
to create all the boolean methods. I like define_method
as a metaprogramming tool because it's often a straightforward way to remove duplication and enforce consistency. The boolean methods are now data-driven, that's a genuine benefit. I think this is prettier than the list of methods, and with the included comments, it's discoverable. The second line of comments is a sample version of the method.
If I was on a team with other Ruby developers, I'd probably stop there.
But I'm not on a team of Ruby developers for this, I'm on a team of me and there's something I like better.
Fair warning, though, this one gets more magical, and this will probably be a bridge too far for some of you.
Rails has a thing called a StringInquirer
, which is what powers Rails.env.test?
. It's one of my favorite Weird Ruby things to show off, and (most of) the implementation is as follows:
class StringInquirer < String
private
def method_missing(method_name, *arguments)
if method_name.end_with?("?")
self == method_name[0..-2]
else
super
end
end
end
class String
def inquiry
ActiveSupport::StringInquirer.new(self)
end
end
The StringInquirer
is a method-missing hack. The method-missing
method is a Ruby feature commonly found in dynamic OO languages. Before Ruby raises a "method not found" error, it looks for a method_missing
method and calls it with the method name as an argument and gives the class an opportunity to respond to to the method name however it sees fit.
Inside method_missing
you can parse the original method name however you want, meaning you can implement a lot of dynamic behavior. ActiveRecord used to use this heavily, and you'll see toy examples that use it for, say Roman numerals to give you an API like RomanNumeral.new.mcmxiv
, where you parse the numeral in the method_missing
call.
In this case, StringInquirer
is a subclass of String
, so any method sent to a StringInquirer
that is not already defined as an existing method on String
goes to method_missing
. If the missing method name ends in ?
, then method_missing
compares the method name to the value of the string, returning true
if and only if the value of the string is the name of the method (without the ?
). If the method name doesn't end in ?
, we just carry on our merry way, presumably ending up with an error.
You use StringInquirer
like this:
x = "fred"
x_inq = x.inquiry
x_inq.fred? # true
x_inq.barney? # false
When we call fred?
, the method_missing
compares the method name, fred
, with the string value fred
and returns true. When we call barney?
it compares fred
with barney
and returns false.
I admit that even by Rails standards, this is extra. I even admit that I'd be reticent to include it in a codebase with not-me developers on it.
And yet, I love it. (I even like its close friend ArrayInquirer
.) It's weird and unnecessary (is x_inq.fred?
really any better than x == "fred"
?) and yet the implementation is elegant, and somehow it's a stand in for everything I like about Ruby and Rails.
How does StringInquirer
help us here?
Since each of our status classes has a to_s
method with the string we are checking against, we can put the following in the parent Status
class:
def value = to_s.inquiry
# inbox? attic? unstarted? started? done?
STATUSES.each { |value| delegate :"#{value}?", to: :value }
The first line defines value
as the StringInquiry
wrapper around the string value. Then we loop over all our statuses, and use the Rails delegate
feature to pass the #{value}?
method to the value.
If I call status.done?
, the done?
method is delegated to the value
StringInquirer
object, which uses method_missing
to return true if the value is done
, meaning that it's an instance of the DoneStatus
class. So we have the same functionality as before.
Against my list of reasons to metaprogram, I genuinely do find this version more aesthetically pleasing than the list of methods, and, for me, a little better than the define_method
version. It has a benefit beyond taste, in that it makes the boolean methods also automatically adjust to changes in the set of valid statues. With the associated comment, it's somewhat discoverable.
I freely grant that a lot of people aren't going to like this, and that it's readable to me mostly because I like StringInquiry
so much and also tend to be aggressive about using delegate
, so it's two patterns I'm very comfortable with. I'd expect that for other people, it'd at least need a more explicit comment.
There's actually still a little code smell here. The content of the STATUSES
constant is basically duplicated by the to_s
method of each subclass. I may clear that up someday, possibly by back-deriving the to_s
from the class name, but I haven't figured out a way to do it that isn't hard to follow or ugly -- apparently there is a line I won't cross.
There are a couple of lessons I might draw here. One is that "good code" is highly dependent on context. The final version of this code is good for me. The define_method
version is good for other purposes, and the "list all the methods explicitly" version works well in other contexts entirely.
That said, I get real joy from Ruby's flexibility here and it bothers me sometimes that the trend in the Ruby and Rails communities is to move a way from that kind of thing. I miss the old Rails find_all_by_email_and_first_name_and_state
method name build ups, while realizing that find_all(email:, first_name:, state:)
works perfectly well and is probably easier to explain.
A bunch of years ago, I was teaching Ruby at an internal workshop and I was explaining something weird, which could easily have been StringInquirer
and as I finished the explanation, somebody in the back of the from broke out laughing. I think about that a lot, partially because I've never been sure if the person was amused or condescending. I do think there's a place in our code for things that make us happy, and it's worth trying to figure out if we can reconcile the fun flexible bits with the needs of larger teams.
Also, I guess this is a trilogy now, because I just did another OO refactor in this codebase that I want to talk about...
If you made it this far, you might have something to say. Comment on this post at it's permanent home: http://noelrappin.com/blog/2021/08/refactoring-part-two-in-defense-of-magic. Or subscribe to this newsletter to get future versions directly in your inbox at http://buttondown.com/noelrap.
Dynamic Ruby is brought to you by Noel Rappin.
Comments and archive at noelrappin.com, or contact me at noelrap@ruby.social on Mastodon or @noelrappin.com on Bluesky.
To support this newsletter, subscribe by following one of these two links: