Archive for December, 2007

Rails style meta programming in Ruby – part 2

This is part two of this blog post. If you landed on this page, please check out part 1 before reading this part.

Last time, we learned that all methods are executed by sending a message to self, where self is set to the appropriate object automatically. We also learned that class definitions are code which is executed at run time, and that a class is nothing but an object of type Class.

What is a metaclass?

Here is the next important point: classes are a place to store instance methods, objects are a place to store instance variables. An object cannot store a method; only a class can.
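
A quick way to see this split for yourself is to ask an object for its instance variables and its class for its instance methods. This is only an illustrative sketch; the Point class is a made-up example, not something from the presentation.

```ruby
# Point is a hypothetical class used only for illustration.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def to_s
    "(#{@x}, #{@y})"
  end
end

p1 = Point.new(1, 2)

# The instance variables live in the object ...
p p1.instance_variables           # [:@x, :@y]

# ... while the methods live in the class.
# (initialize is private, so it is not listed here.)
p Point.instance_methods(false)   # [:to_s]
```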

But what about class methods? We know that a class is just an object, and when you say something like

class Foo
  def self.hello
    puts "hello"
  end
end

Foo.hello

You are invoking the hello method on the object stored in Foo, which happens to be an object of type Class (i.e. your class definition). So where are these class methods stored?

When you invoke a method on an object, Ruby looks at the object’s class pointer, which contains a reference to the ‘class object’ that defines the class. The class object contains the list of methods defined by the class definition, and a pointer to its superclass, which contains another list of methods and a pointer to its own superclass, and so on. Ruby searches along this chain until it finds the method it requires, or calls method_missing if it cannot.
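
The lookup described above can be watched in action with a small sketch. Animal and Dog are hypothetical classes, and the method_missing here is deliberately simplistic (it swallows every unknown message), so treat it as a demonstration rather than something to copy into real code.

```ruby
# Animal and Dog are made-up classes for illustration.
class Animal
  def breathe
    "breathing"
  end
end

class Dog < Animal
  def bark
    "woof"
  end

  # Called only when the search up the chain finds nothing.
  def method_missing(name, *args)
    "no method called #{name} anywhere in the chain"
  end
end

dog = Dog.new

# The first few links of the chain Ruby searches.
p Dog.ancestors.take(3)   # [Dog, Animal, Object]

puts dog.bark      # found in Dog
puts dog.breathe   # not in Dog, found in Animal
puts dog.fly       # found nowhere, so method_missing is called
```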

To be consistent, a ‘class object’ should behave in the same way. It has a class pointer that points to a class. Ruby magically creates this virtual class when you define a class method, and uses it to store all class methods for the class. This magically created virtual class is what is known as the class's metaclass.

Note that metaclasses follow the same inheritance chain as classes, inheriting from metaclasses up the chain, which is why this works:

class Foo
  def self.hello
    puts "parent says hello"
  end
end

class Bar < Foo
end

Bar.hello
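
If you want to poke at this chain yourself, newer versions of Ruby (1.9.2 and later) expose the metaclass through the singleton_class method, which is Ruby's own name for the same thing. A small sketch, assuming such a Ruby version:

```ruby
class Foo
  def self.hello
    "parent says hello"
  end
end

class Bar < Foo
end

# hello is stored in Foo's metaclass, not in Bar's.
p Foo.singleton_methods(false)   # [:hello]
p Bar.singleton_methods(false)   # []

# Bar's metaclass inherits from Foo's metaclass,
# which is why Bar.hello is found.
p Bar.singleton_class.superclass == Foo.singleton_class   # true
p Bar.hello                                               # "parent says hello"
```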

Classes can have instance variables

So if a class is just an object, and an object is a place to store instance variables, then can a class have instance variables too? Yes it can, but think carefully about how to access them.

Normally, we create a class, say Foo, and in that class definition use attr_reader to provide methods to access the instance variables. So where do we put our attr_reader definition to allow access to a class instance variable? We have to put it in the object's class definition, which in this case is the class's metaclass:

class Foo
  @class_instance_var = '1234'

  class<<self
    attr_reader :class_instance_var
  end

  def hello
    puts "hello"
  end
end

puts Foo.class_instance_var

To complete the picture, we just need to know what the following syntax means:


class<<self
 ...
end

'class<<obj' allows us to extend an object by "building a new class just for object obj". In the context above, it basically changes self to be the metaclass instead of the defining class.

Another way of thinking of it is simply that 'class<<obj' gives you access to obj's metaclass.

When we are in the metaclass, we can define the attr_readers required to read the class instance variables.
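
To convince yourself that 'class<<self' really does switch self to the metaclass, you can have it hand self back to you. The metaclass helper below is a well-known community idiom rather than anything built into Ruby, and the comparison with singleton_class assumes Ruby 1.9.2 or later.

```ruby
class Foo
  @greeting = "hi"   # a class instance variable

  # Hypothetical helper: open the metaclass and return it.
  # The value of 'class << self ... end' is its last expression.
  def self.metaclass
    class << self
      self
    end
  end

  class << self
    attr_reader :greeting
  end
end

p Foo.metaclass == Foo.singleton_class   # true
p Foo.greeting                           # "hi"

# The reader lives in the metaclass, so instances of Foo do not get it.
p Foo.new.respond_to?(:greeting)         # false
```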

Finally some Rails magic

Now we know everything we need to know to make this work:

class User < ActiveRecord::Base
  set_table_name :silly_named_users_table
end

We know that when 'set_table_name' is executed, it is executed against self, which happens to be the class object for User, so it's just like saying:

class User < ActiveRecord::Base
  User.set_table_name :silly_named_users_table
end

This means that the method 'set_table_name' must be in User's metaclass, or in the inheritance chain for it, which includes the metaclass of ActiveRecord::Base. We know it's not in User's metaclass, as we didn't put it there, so it must be in ActiveRecord:

module ActiveRecord
  class Base
  
    def self.set_table_name(table)
      puts "setting the table name to #{table}"
    end
  end
end

class User < ActiveRecord::Base
  set_table_name :silly_named_users_table
end

In the User class, we want to store this alternative table name. It needs to be stored in a class instance variable in the User class, which is simple (I will also add an attr_reader method to the metaclass so we can test that the code does the correct thing):

module ActiveRecord
  class Base
  
    def self.set_table_name(table)
      puts "setting the table name in AR to #{table}"
      @tablename = table
    end
  end
end

class User < ActiveRecord::Base
  set_table_name :silly_named_users_table
  
  class<<self
    attr_reader :tablename
  end
end

puts User.tablename

The above code works as expected. Why?

When set_table_name is encountered in the User class definition, self is the class object stored in the constant User. Ruby cannot find the set_table_name method in User's metaclass, so it looks in the parent's metaclass and finds it there. Crucially, self is not changed to the parent class; it is still the class object stored in the constant User. That is how inheritance works.

When we set the instance variable in the parent method, self is still User, so we are setting an instance variable on the User class object. Any time we call a method on an instance of User, that method can reach the tablename variable through its class and use it to generate the correct SQL. How to do that is left as an exercise for the reader 😉

December 18, 2007 at 11:27 pm 3 comments

Rails style meta programming in Ruby – part 1

For a long time now, I have pondered and wondered about just how Rails does what it does. Surely such wizardry is well outside my basic Ruby skills? Well, not any more, thanks to this presentation by Dave Thomas.

He explains all the concepts you need to know, so that you too can do things in your own code like:


class Users < ActiveRecord::Base
  set_table_name :unconventional_name
  has_many :friends
end

Dave, I am sure, explains it much better than me, but the video is an hour long and I had to watch it twice, so here, hopefully, is the 10 (or 20) minute summary.

Everything is a method

First things first: every method you call in Ruby is run by sending a message to an object, even at the top level when you wouldn’t have thought so:


puts "hello, world!"

In the above code, puts is a method which is executed by sending a message to an object. In fact, no matter which method you call on any object in your code, it is always executed by sending a message to the same object, ‘self’.


puts self.to_s
puts self.class

prints


main
Object

So, at the top level, all messages are sent to an object contained in the variable self that is called ‘main’ and is of class Object.

When you type something like


  a = Array.new
  puts a.class

It prints ‘Array’, as the variable ‘a’ contains an instance of Array. The trick to remember here is that the ‘a.’ says: set the value of self to the object a, and then execute the method on self. This means that every time you execute a method, it is always executed as a message passed to ‘self’, which will have been set to the correct object.

That is why this works:


class Foo
  def hello
    there          # a method call with no receiver?
    puts " there"
  end
  def there
    puts "hello"
  end
end

b = Foo.new
b.hello

When you are inside an instance method, executing another method without a class or object prefix invokes the method on the object self. Ruby arranges things so that the value of self inside an instance method is the correct thing, i.e. the object the original method was invoked on, in this case the instance of Foo stored in ‘b’.

The following code is equivalent to the above, but self is explicitly specified.


class Foo
  def hello
    self.there          # the same call, with self as the explicit receiver
    puts " there"
  end
  def there
    puts "hello"
  end
end

b = Foo.new
b.hello

I know this seems trivial, but it's very important, so I will say it again: a method is always executed by sending a message to self. What changes is the object that self refers to at any given moment.
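
If the "sending a message" wording sounds abstract, Ruby lets you do it literally with the send method. In this small sketch (Greeter is a made-up class), a method simply returns self so we can check which object received the message.

```ruby
# Greeter is a hypothetical class for illustration.
class Greeter
  # Returns whatever self is while the method runs.
  def whoami
    self
  end
end

g = Greeter.new

# Inside the method, self is the object the message was sent to.
p g.whoami.equal?(g)          # true

# The dot syntax and an explicit send do the same thing.
p g.send(:whoami).equal?(g)   # true
p "abc".send(:upcase)         # "ABC"
```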

Class definitions are executable code

Big deal. What has this got to do with these spiffy ‘has_many’ Rails-style directives? Well, the next thing you need to learn is that class definitions are executable code. Remember that any method that is executed must be executed on self …


class Foo
  puts "hello"

  def bar
  end
end

Classes are just objects

Running the above code prints “hello”. We know it does this because class definitions are executable code, and we also know that puts is invoked against self, so what is self?


class Foo
  puts self
  puts self.class

  def bar
  end
end

Which outputs


Foo
Class

So inside a class definition, the current object is set to an object with the same name as your class, that is of type Class. This leads to the second thing you need to remember: a class is nothing but an object. Yes, it's a little special, but it's an object nonetheless.

Ruby is even sneakier: when you define a class ‘Foo’, Ruby creates a constant called Foo and places in it an object of type Class, which contains the definition of your class. Don’t believe me? Run this code:


class Foo
end

puts Foo
puts Foo.class

Which produces


Foo
Class

i.e. the contents of the constant ‘Foo’ (in Ruby, any identifier starting with an uppercase letter is a constant) is an instance of Class, called ‘Foo’.
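
You can take this one step further and build a class without the class keyword at all, by creating the Class object yourself and storing it in a constant. A minimal sketch, with Baz as a made-up name:

```ruby
# Create an instance of Class by hand and store it in a constant.
Baz = Class.new do
  def bar
    "bar"
  end
end

p Baz.class     # Class
p Baz.name      # "Baz" (the class takes its name from the constant)
p Baz.new.bar   # "bar"

# Because it is just an object, it can be held in an ordinary variable too.
also_baz = Baz
p also_baz.new.class   # Baz
```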

Phew, that's enough for now. Next time, I will introduce metaclasses and the other bits and pieces that allow these Rails-like ‘has_many’ directives to actually work …

December 13, 2007 at 11:39 pm 1 comment

