2.4 Day 3: Serious Change

The whole point of Mary Poppins is that she made the household better as a whole by making it fun and changing the hearts of the people in it with passion and imagination. You could back off a little and play it safe, using Ruby to do the same things you already know how to do in other languages. But when you change the way a language looks and works, you can capture magic that makes programming fun. In this book, each chapter will show you some nontrivial problem that the language solves well. In Ruby, that means metaprogramming.

Metaprogramming means writing programs that write programs. The ActiveRecord framework that’s the centerpiece of Rails uses metaprogramming to implement a friendly language for building classes that link to database tables. An ActiveRecord class for a Department might look like this:

  class Department < ActiveRecord::Base
    has_many :employees
    has_one :manager
  end

has_many and has_one are Ruby methods that add all the instance variables and methods needed to establish those relationships. This class specification reads like English, eliminating all the noise and baggage that you usually find with other database frameworks. Let’s look at some different tools you can use for metaprogramming.
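To make the idea concrete, here is a minimal sketch of how a class-level macro in the style of has_many could add a method to a class. This is not ActiveRecord’s implementation: ToyRecord, ToyDepartment, and the array-backed storage are hypothetical stand-ins for illustration, and no database is involved.

```ruby
# Hypothetical base class; the real ActiveRecord does far more than this.
class ToyRecord
  # A class method can be called directly inside a subclass body.
  def self.has_many(name)
    # Define an instance method named after the relationship. It lazily
    # creates and returns an array stored in an instance variable.
    define_method(name) do
      instance_variable_get("@#{name}") ||
        instance_variable_set("@#{name}", [])
    end
  end
end

class ToyDepartment < ToyRecord
  has_many :employees   # runs while the class body is being evaluated
end

department = ToyDepartment.new
department.employees << "Jane"
puts department.employees.inspect   # prints ["Jane"]
```

The point is only that has_many is an ordinary method call whose side effect is to write more methods.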

Open Classes

You’ve already had a brief introduction to open classes. You can change the definition of any class at any time, usually to add behavior. Here’s a great example from the Rails framework that adds a method to NilClass:

ruby/blank.rb
  class NilClass
    def blank?
      true
    end
  end

  class String
    def blank?
      self.size == 0
    end
  end

  ["", "person", nil].each do |element|
    puts element unless element.blank?
  end

The first invocation of class defines a class; once a class is already defined, subsequent invocations modify that class. This code adds a method called blank? to two existing classes: NilClass and String. When I check the status of a given string, I often want to see whether it is blank. A variable that should hold a string can have a value, be empty, or be nil. This little idiom lets me quickly check for the two empty cases at once, because blank? will return true for both. It doesn’t matter which class element belongs to. If it supports the blank? method, it will work. If it walks like a duck and quacks like a duck, it is a duck. I don’t need to draw blood to check the type.
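As a small sketch of the duck-typing point, the same idiom extends to any other class that learns to answer blank?. The Array version here is an assumption added for illustration, and the String version uses empty? rather than comparing size to zero.

```ruby
class NilClass
  def blank?
    true
  end
end

class String
  def blank?
    empty?
  end
end

# Reopening Array adds blank? without touching its existing methods.
class Array
  def blank?
    empty?
  end
end

values = ["", "person", nil, [], [1, 2]]
present = values.reject { |element| element.blank? }
puts present.inspect   # prints ["person", [1, 2]]
```

The loop never asks what class an element is; it only asks whether the element responds to blank?.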

Watch what’s going on here. You’re asking for a very sharp scalpel, and Ruby will gladly give it to you. Your open classes have modified both String and NilClass. It’s possible to completely disable Ruby by redefining, say, Class.new. The trade-off is freedom. With the kind of freedom that lets you redefine any class or object at any time, you can build some amazingly readable code. With freedom and power comes responsibility.
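A rough sketch of the risk, using a hypothetical Greeter class rather than a core class: reopening a class and defining a method that already exists replaces the old one silently. When behavior changes unexpectedly, source_location is one way to find the definition that is actually in effect.

```ruby
class Greeter            # hypothetical class, first definition
  def hello
    "hello"
  end
end

class Greeter            # reopened later, perhaps in another file
  def hello              # silently replaces the earlier method
    "goodbye"
  end
end

puts Greeter.new.hello   # prints goodbye

# source_location reports the file and line of the definition
# that is currently in effect.
file, line = Greeter.instance_method(:hello).source_location
puts "hello is defined at #{file}:#{line}"
```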

Open classes are useful when you’re building languages to encode your own domain. It’s often useful to express units in a language that works for your business domain. For example, consider an API that expresses all distances in inches:

ruby/units.rb
  class Numeric
    def inches
      self
    end

    def feet
      self * 12.inches
    end

    def yards
      self * 3.feet
    end

    def miles
      self * 5280.feet
    end

    def back
      self * -1
    end

    def forward
      self
    end
  end

  puts 10.miles.back
  puts 2.feet.forward

The open classes make this kind of support possible with minimal syntax. But other techniques can stretch Ruby even further.
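To see how the conversions chain together, here is a trimmed-down copy of the listing above with the arithmetic worked out in comments. Every unit reduces to inches, so each call is just a multiplication.

```ruby
class Numeric
  def inches
    self
  end

  def feet
    self * 12.inches
  end

  def miles
    self * 5280.feet
  end

  def back
    self * -1
  end
end

puts 2.feet          # 2 * 12           = 24
puts 1.miles         # 1 * (5280 * 12)  = 63360
puts 10.miles.back   # 10 * 63360 * -1  = -633600
puts 1.5.feet        # Float is Numeric too, so this prints 18.0
```

Because the methods live on Numeric, they are available to integers and floats alike.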

Via method_missing

Each time Ruby can’t find a method, it calls a special method named method_missing, which by default reports some diagnostic information. This behavior makes the language easier to debug. But sometimes, you can take advantage of this language feature to build some unexpectedly rich behavior. All you need to do is override method_missing. Consider an API to represent Roman numerals. You could do it easily enough with a method call, with an API something like Roman.number_for "XII". In truth, that’s not too bad. There are no mandatory parentheses or semicolons to get in the way. With Ruby, we can do better:

ruby/roman.rb
  class Roman
    def self.method_missing name, *args
      roman = name.to_s
      roman.gsub!("IV", "IIII")
      roman.gsub!("IX", "VIIII")
      roman.gsub!("XL", "XXXX")
      roman.gsub!("XC", "LXXXX")

      (roman.count("I") +
       roman.count("V") * 5 +
       roman.count("X") * 10 +
       roman.count("L") * 50 +
       roman.count("C") * 100)
    end
  end

  puts Roman.X
  puts Roman.XC
  puts Roman.XII
  puts Roman.X

This code is a beautiful example of method_missing in action. The code is clear and simple. We first override method_missing. We’ll get the name of the method and its parameters as input parameters. We’re interested only in the name. First, we convert that to String. Then, we replace the special cases, like IV and IX, with strings that are easier to count. Then, we just count Roman digits and multiply each count by the value of that digit. The API is so much easier: Roman.I versus Roman.number_for "I".

Consider the cost, though. We do have a class that will be much more difficult to debug, because Ruby can no longer tell you when a method is missing! We would definitely want strong error checking to make sure it was accepting valid Roman numerals. If you don’t know what you’re looking for, you could have a tough time finding the implementation of that II method on Roman. Still, it’s another scalpel for the tool bag. Use it wisely.
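One possible way to add that error checking is sketched below. StrictRoman and its constants are hypothetical names, and the check is deliberately simple: it only verifies that the name consists of the digits I, V, X, L, and C, not that the numeral is well formed. Anything else is passed to super, so Ruby reports a missing method as usual.

```ruby
class StrictRoman
  VALUES = { "I" => 1, "V" => 5, "X" => 10, "L" => 50, "C" => 100 }.freeze
  SUBTRACTIVE = {
    "IV" => "IIII", "IX" => "VIIII", "XL" => "XXXX", "XC" => "LXXXX"
  }.freeze
  ROMAN_DIGITS = /\A[IVXLC]+\z/

  def self.method_missing(name, *args)
    roman = name.to_s
    # Let Ruby raise NoMethodError for anything that is not a numeral.
    return super unless roman.match?(ROMAN_DIGITS)

    # Expand the subtractive pairs so that every digit can simply be counted.
    expanded = SUBTRACTIVE.reduce(roman) do |text, (pair, long_form)|
      text.gsub(pair, long_form)
    end
    VALUES.sum { |digit, value| expanded.count(digit) * value }
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def self.respond_to_missing?(name, include_private = false)
    name.to_s.match?(ROMAN_DIGITS) || super
  end
end

puts StrictRoman.XC    # prints 90
puts StrictRoman.XII   # prints 12

begin
  StrictRoman.hello
rescue NoMethodError
  puts "hello is still reported as missing"
end
```

This version uses the nonmutating gsub, so it never modifies the string returned by to_s.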

Modules

The most popular metaprogramming style in Ruby is the module. You can literally implement def or attr_accessor with a few lines of code in a module. You can also extend class definitions in surprising ways. A common technique lets you design your own domain-specific language (DSL) to define your class.[4] The DSL defines methods in a module that adds all the methods and constants needed to manage a class.
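As a minimal sketch of the claim about attr_accessor, the module below defines a simplified look-alike in a few lines. The names SimpleAccessors, simple_accessor, and Nanny are hypothetical, and the built-in attr_accessor is implemented differently inside Ruby itself.

```ruby
module SimpleAccessors
  # For each name, define a reader and a writer backed by an
  # instance variable of the same name.
  def simple_accessor(*names)
    names.each do |name|
      define_method(name) { instance_variable_get("@#{name}") }
      define_method("#{name}=") do |value|
        instance_variable_set("@#{name}", value)
      end
    end
  end
end

class Nanny
  extend SimpleAccessors            # makes simple_accessor a class method
  simple_accessor :name, :umbrella
end

nanny = Nanny.new
nanny.name = "Mary"
puts nanny.name               # prints Mary
puts nanny.umbrella.inspect   # prints nil
```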

I’m going to break an example down using a common superclass first. Here’s the type of class that we want to build through metaprogramming. It’s a simple program to open a CSV file based on the name of the class.

ruby/acts_as_csv_class.rb
  class ActsAsCsv
    def read
      file = File.new(self.class.to_s.downcase + '.txt')
      @headers = file.gets.chomp.split(', ')

      file.each do |row|
        @result << row.chomp.split(', ')
      end
    end

    def headers
      @headers
    end

    def csv_contents
      @result
    end

    def initialize
      @result = []
      read
    end
  end

  class RubyCsv < ActsAsCsv
  end

  m = RubyCsv.new
  puts m.headers.inspect
  puts m.csv_contents.inspect

This basic class defines four methods. headers and csv_contents are simple accessors that return the value of instance variables. initialize creates an empty result array and then calls read. Most of the work happens in read. The read method opens a file, reads the headings, and chops them into individual fields. Next, it loops over lines, placing the contents of each line in an array. This implementation of a CSV file is not complete because it does not handle edge cases like quotes, but you get the idea.
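The parsing steps in read can be tried without a file on disk. This sketch substitutes a StringIO for the file, which is an assumption made for illustration, and uses made-up sample data. The last lines show the quoting edge case that the simple split does not handle.

```ruby
require 'stringio'

# Stand-in for the opened file; the contents are sample data.
file = StringIO.new("one, two\nlions, tigers\nbears, wolves\n")

headers = file.gets.chomp.split(', ')   # the first line holds the headings
rows = []
file.each do |row|                      # the remaining lines are data rows
  rows << row.chomp.split(', ')
end

puts headers.inspect   # prints ["one", "two"]
puts rows.inspect      # prints [["lions", "tigers"], ["bears", "wolves"]]

# A quoted field that contains the separator is split incorrectly:
# this yields three fields where a real CSV parser would see two.
broken = '"Poppins, Mary", nanny'.split(', ')
puts broken.length     # prints 3
```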

The next step is to take the file and attach that behavior to a class with a module method often called a macro. Macros change the behavior of classes, often based on changes in the environment. In this case, our macro opens up the class and dumps in all the behavior related to a CSV file:

ruby/acts_as_csv.rb
  class ActsAsCsv
    def self.acts_as_csv

      define_method 'read' do
        file = File.new(self.class.to_s.downcase + '.txt')
        @headers = file.gets.chomp.split(', ')

        file.each do |row|
          @result << row.chomp.split(', ')
        end
      end

      define_method "headers" do
        @headers
      end

      define_method "csv_contents" do
        @result
      end

      define_method 'initialize' do
        @result = []
        read
      end
    end
  end

  class RubyCsv < ActsAsCsv
    acts_as_csv
  end

  m = RubyCsv.new
  puts m.headers.inspect
  puts m.csv_contents.inspect

The metaprogramming happens in the acts_as_csv macro. That code calls define_method for all the methods we want to add to the target class. Now, when the target class calls acts_as_csv, that code will define all four methods on the target class.

So, the acts_as_csv macro does nothing but add a few methods we could have easily added through inheritance. That design does not seem like much of an improvement, but it’s about to get more interesting. Let’s see how the same behavior would work in a module:

ruby/acts_as_csv_module.rb
  module ActsAsCsv
    def self.included(base)
      base.extend ClassMethods
    end

    module ClassMethods
      def acts_as_csv
        include InstanceMethods
      end
    end

    module InstanceMethods
      def read
        @csv_contents = []
        filename = self.class.to_s.downcase + '.txt'
        file = File.new(filename)
        @headers = file.gets.chomp.split(', ')

        file.each do |row|
          @csv_contents << row.chomp.split(', ')
        end
      end

      attr_accessor :headers, :csv_contents

      def initialize
        read
      end
    end
  end

  class RubyCsv # no inheritance! You can mix it in
    include ActsAsCsv
    acts_as_csv
  end

  m = RubyCsv.new
  puts m.headers.inspect
  puts m.csv_contents.inspect

Ruby will invoke the included method whenever this module gets included into another. Remember, a class is a module. In our included method, we extend the target class called base (which is the RubyCsv class), and that module adds class methods to RubyCsv. The only class method is acts_as_csv. That method in turn opens up the class and includes all the instance methods. And we’re writing a program that writes a program.
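Stripped of the CSV details, the included hook pattern can be sketched as follows. Trackable, tracked?, track, and Order are hypothetical names chosen for illustration.

```ruby
module Trackable
  # Ruby calls this hook with the including class as its argument.
  def self.included(base)
    base.extend(ClassMethods)   # the class gains the class-level methods
  end

  module ClassMethods
    def tracked?
      true
    end
  end

  # Methods defined directly in the module become instance methods.
  def track
    "tracking #{self.class}"
  end
end

class Order
  include Trackable   # triggers Trackable.included(Order)
end

puts Order.tracked?    # prints true
puts Order.new.track   # prints tracking Order
```

A single include therefore delivers both instance methods and class methods, which is what lets a module supply a macro such as acts_as_csv.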

The interesting thing about all these metaprogramming techniques is that your programs can change based on the state of your application. ActiveRecord uses metaprogramming to dynamically add accessors that are the same name as the columns of the database. Some XML frameworks like builder let users define custom tags with method_missing to provide a beautiful syntax. When your syntax is more beautiful, you can let the reader of your code get past the syntax and closer to the intentions. That’s the power of Ruby.
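To give a feel for tag-building with method_missing, here is a toy sketch. It is not the builder library’s API: TinyXml is a hypothetical class, it does no escaping, and any name that already exists as a method (puts or p, for example) will not become a tag. A fuller version would also define respond_to_missing?.

```ruby
class TinyXml
  def initialize
    @out = String.new   # mutable output buffer
  end

  # Every unknown method name becomes a tag. A block supplies nested
  # tags; otherwise the first argument is used as the text content.
  def method_missing(tag, content = nil, &block)
    @out << "<#{tag}>"
    if block
      instance_eval(&block)   # run the block with this builder as self
    else
      @out << content.to_s
    end
    @out << "</#{tag}>"
    self
  end

  def to_s
    @out
  end
end

xml = TinyXml.new
xml.person do
  name "Mary"
  role "nanny"
end
puts xml.to_s   # prints <person><name>Mary</name><role>nanny</role></person>
```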

What We Learned in Day 3

In this section, you learned to use Ruby to define your own syntax and change classes on the fly. These programming techniques fall in the category of metaprogramming. Every line of code that you write has two kinds of audiences: computers and people. Sometimes, it’s hard to strike a balance between building code that can pass through the interpreter or compiler and is also easy to understand. With metaprogramming, you can close the gap between valid Ruby syntax and sentences.

Some of the best Ruby frameworks, such as Builder and ActiveRecord, heavily depend on metaprogramming techniques for readability. You used open classes to build a duck-typed interface supporting the blank? method for String objects and nil, dramatically reducing the amount of clutter for a common scenario. You saw the same technique applied to Numeric to express distances in units such as feet and miles. You used method_missing to build beautiful Roman numerals. And finally, you used modules to define a domain-specific language that you used to specify CSV files.

Day 3 Self-Study

Do:

Modify the CSV application to support an each method to return a CsvRow object. Use method_missing on that CsvRow to return the value for the column for a given heading.

For example, for the file:

  one, two
  lions, tigers

allow an API that works like this:

  csv = RubyCsv.new
  csv.each {|row| puts row.one}

This should print "lions".