A Quick Jaunt Through Merb's Framework Code

Posted by ezmobius Sun, 02 Dec 2007 01:07:00 GMT

This is a tutorial for people who want to familiarize themselves with the Merb framework code and how a request travels through the framework. It is not a complete walkthrough, but it will get you into the code base and peeking around at key areas.

To follow along at home, grab Merb trunk from svn:

 svn co http://svn.devjavu.com/merb/trunk merb

This tutorial was written with svn rev #1058 which is trunk as of today. It is probably subject to some change but mostly covers code that is on the hot path and doesn’t change very often.

Start with lib/merb/mongrel_handler.rb

Look through MerbHandler#process. Mongrel calls this method for each request and hands us a request object with all the HTTP headers already parsed, plus the raw post body for multipart requests. It also gives us a response object to write our final response to.

The first section of this method deals with normalizing the path prefix, if there is one. If you set the path prefix to "/foo" and a request is made for /foo/bar/baz/42, the handler strips /foo and Merb only sees a request for /bar/baz/42.

# Truncate the request URI if there's a path prefix so that an app can be
# hosted inside a subdirectory, for example.
if @@path_prefix
  if request.params[Mongrel::Const::PATH_INFO] =~ @@path_prefix
    MERB_LOGGER.info("Path prefix #{@@path_prefix.inspect} removed from PATH_INFO and REQUEST_URI.")
    request.params[Mongrel::Const::PATH_INFO].sub!(@@path_prefix, '')
    request.params[Mongrel::Const::REQUEST_URI].sub!(@@path_prefix, '')
    path_info = request.params[Mongrel::Const::PATH_INFO]
  else
    raise "Path prefix is set to '#{@@path_prefix.inspect}', but is not in the REQUEST_URI. " 
  end
else
  path_info = request.params[Mongrel::Const::PATH_INFO]
end
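
The same idea can be tried outside of mongrel. This is a minimal sketch, assuming a plain Hash in place of mongrel's request params and a Regexp prefix; PATH_PREFIX and strip_prefix are names made up for the example, not part of Merb.

```ruby
# Hypothetical stand-ins: a Regexp prefix and a Hash of request params.
PATH_PREFIX = %r{\A/foo}

def strip_prefix(params, prefix = PATH_PREFIX)
  # No prefix configured: the path is used as-is
  return params["PATH_INFO"] unless prefix

  unless params["PATH_INFO"] =~ prefix
    raise ArgumentError, "Path prefix #{prefix.inspect} is not in the request path"
  end

  # sub returns a new string, which we store back into the params hash
  params["PATH_INFO"]   = params["PATH_INFO"].sub(prefix, "")
  params["REQUEST_URI"] = params["REQUEST_URI"].sub(prefix, "")
  params["PATH_INFO"]
end

params = { "PATH_INFO" => "/foo/bar/baz/42", "REQUEST_URI" => "/foo/bar/baz/42?page=2" }
puts strip_prefix(params)   # prints /bar/baz/42
puts params["REQUEST_URI"]  # prints /bar/baz/42?page=2
```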

Next comes the built-in logic that checks for Rails-style page-cached HTML files or static assets, for when you are running Merb with no web server in front of it. For GET and HEAD requests, if a static file matches the request path exactly, mongrel serves it without ever calling into your app. If nothing matches exactly, we append .html to the path and check whether that file exists, serving it directly if so. If neither rule matches, it is a dynamic request and we need to feed it into Merb and your application code.

 # Rails style page caching. Check the public dir first for .html pages and
 # serve directly. Otherwise fall back to Merb routing and request
 # dispatching.
 page_cached = path_info + ".html" 
 get_or_head = @@file_only_methods.include? request.params[Mongrel::Const::REQUEST_METHOD]

 if get_or_head && @files.can_serve(path_info)
   # File exists as-is so serve it up
   MERB_LOGGER.info("Serving static file: #{path_info}")
   @files.process(request,response)
 elsif get_or_head && @files.can_serve(page_cached)
   # Possible cached page, serve it up
   MERB_LOGGER.info("Serving static file: #{page_cached}")
   request.params[Mongrel::Const::PATH_INFO] = page_cached
   @files.process(request,response)
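
The lookup order is easy to play with in isolation. Here is a rough sketch, assuming a Set of paths stands in for the public directory; resolve_static and FILE_ONLY_METHODS are invented for the example and are not Merb methods.

```ruby
require "set"

# Only GET and HEAD are eligible for static serving in this sketch
FILE_ONLY_METHODS = %w[GET HEAD].freeze

# Returns [:static, path] when a file should be served directly,
# or :dynamic when the request should go on to the framework.
def resolve_static(method, path_info, public_files)
  return :dynamic unless FILE_ONLY_METHODS.include?(method)

  page_cached = path_info + ".html"
  if public_files.include?(path_info)
    [:static, path_info]      # file exists as-is
  elsif public_files.include?(page_cached)
    [:static, page_cached]    # page-cached version exists
  else
    :dynamic
  end
end

public_files = Set["/images/logo.png", "/posts/1.html"]
p resolve_static("GET", "/images/logo.png", public_files)  # [:static, "/images/logo.png"]
p resolve_static("GET", "/posts/1", public_files)          # [:static, "/posts/1.html"]
p resolve_static("POST", "/posts/1", public_files)         # :dynamic
p resolve_static("GET", "/posts/2", public_files)          # :dynamic
```
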
Now here is the real entry point to the framework code, which in turn calls your app code:
controller, action = Merb::Dispatcher.handle(request, response)  
We just call handle on Merb::Dispatcher and pass in the request and response objects we got from mongrel.

Now we move to this file: lib/merb/dispatcher.rb

Let’s look through Dispatcher.handle. The first thing we do is instantiate a Merb::Request object, passing in the http_request object that mongrel gave us. This wraps the raw mongrel request in our shiny Merb::Request object, which handles matching routes, exposing HTTP headers as methods, and parsing query strings, multipart requests and cookies. For now we don’t need to delve into how the router or the multipart parsing works; just assume you get a fully baked request object that has the params, session and cookies in it and knows which controller and action to instantiate and call based on your router.rb definitions. (If you really want to follow the request routing code, have a look at the route_match, route_index, route_params and controller_name methods in the Request class.) Merb::Request#controller_name returns the name of the controller to instantiate as a string, and Merb::Request#controller_class returns the actual class object.

def handle(http_request, response)
  start   = Time.now
  request = Merb::Request.new(http_request)
  MERB_LOGGER.info("Params: #{request.params.inspect}")
  MERB_LOGGER.info("Cookies: #{request.cookies.inspect}")
  # user friendly error messages
  if request.route_params.empty?
    raise ControllerExceptions::BadRequest, "No routes match the request" 
  elsif request.controller_name.nil?
    raise ControllerExceptions::BadRequest, "Route matched, but route did not specify a controller" 
  end

So now back in the Dispatcher.handle method you can see that just before the rescue we call dispatch_action with the controller class to instantiate, request.action as the action to call, and the request and response objects. No status is passed here, so dispatch_action falls back to its default of 200.

  # set controller class and the action to call
  klass = request.controller_class
  dispatch_action(klass, request.action, request, response)
rescue => exception

In dispatch_action you can see that we call klass.build(request, response, status).

# setup the controller and call the chosen action 
def dispatch_action(klass, action, request, response, status=200)
  # build controller
  controller = klass.build(request, response, status)

Let’s take a peek at Merb::Controller.build in lib/merb/controller.rb. In this method you can see that we create an instance of our controller and call set_dispatch_variables on it.

def build(request, response = StringIO.new, status=200, headers={'Content-Type' => 'text/html; charset=utf-8'})
  cont = new
  cont.set_dispatch_variables(request, response, status, headers)
  cont
end

In set_dispatch_variables we do a little munging of the params and set up our request, response, status and response headers. If you peek around a bit in this class you can see how we delegate params, session and cookies to the request object, since it knows that info. We then return the controller instance we just created to the dispatch_action method back in the Dispatcher.
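
To get a feel for the build-then-delegate shape, here is a small sketch. It assumes Ruby's standard Forwardable module for the delegation, which may not be how Merb does it, and FakeRequest and SketchController are hypothetical stand-ins for Merb::Request and Merb::Controller.

```ruby
require "forwardable"
require "stringio"

# Hypothetical stand-in for Merb::Request: just enough state to delegate to
FakeRequest = Struct.new(:params, :session, :cookies)

class SketchController
  extend Forwardable
  # params, session and cookies are answered by the request object
  def_delegators :@request, :params, :session, :cookies

  attr_reader :response, :status, :headers

  def self.build(request, response = StringIO.new, status = 200,
                 headers = { "Content-Type" => "text/html; charset=utf-8" })
    cont = new
    cont.set_dispatch_variables(request, response, status, headers)
    cont
  end

  def set_dispatch_variables(request, response, status, headers)
    @request  = request
    @response = response
    @status   = status
    @headers  = headers
  end
end

request = FakeRequest.new({ "id" => "42" }, {}, { "theme" => "dark" })
controller = SketchController.build(request)
puts controller.params["id"]      # prints 42
puts controller.cookies["theme"]  # prints dark
puts controller.status            # prints 200
```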

So back in lib/merb/dispatcher.rb, in dispatch_action, right after our call to Merb::Controller.build you can see Merb’s configurable mutex lock.

...    
  if @@use_mutex
    @@mutex.synchronize { controller.dispatch(action) }
  else
    controller.dispatch(action)
  end
  [controller, action]
end
If @@use_mutex is true we lock around the call to our controller’s dispatch; if @@use_mutex is false we dispatch directly with no lock. It is important to note that we are only locking around the filters and the action: the request is already parsed and routed in a thread-safe way by this point. Having such a small lock really helps with contention (as opposed to Rails’ giant lock, which locks everything including routing and multipart parsing).
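
Here is a minimal sketch of the same lock placement, assuming made-up TinyDispatcher and CounterController classes. Merb builds a fresh controller per request; the example shares one controller between threads only so the effect of the lock is visible.

```ruby
# Hypothetical dispatcher with a switchable lock around action dispatch
class TinyDispatcher
  @use_mutex = true
  @mutex = Mutex.new

  class << self
    attr_accessor :use_mutex

    def dispatch_action(controller, action)
      if @use_mutex
        @mutex.synchronize { controller.dispatch(action) }
      else
        controller.dispatch(action)
      end
      [controller, action]
    end
  end
end

# Hypothetical controller whose dispatch is deliberately not thread safe
class CounterController
  attr_reader :hits

  def initialize
    @hits = 0
  end

  def dispatch(action)
    current = @hits
    sleep 0.001 # widen the race window; without a lock, updates can be lost
    @hits = current + 1
  end
end

shared = CounterController.new
threads = 10.times.map do
  Thread.new { TinyDispatcher.dispatch_action(shared, :index) }
end
threads.each(&:join)
puts shared.hits # prints 10 with the mutex enabled
```

Setting TinyDispatcher.use_mutex = false and rerunning will usually print a smaller number, which is the kind of lost update the lock is there to prevent in app code that is not thread safe.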

So let’s look at Merb::Controller#dispatch back in lib/merb/controller.rb:

def dispatch(action=:index)
  start = Time.now
  if self.class.callable_actions[action.to_s]
    params[:action] ||= action
    setup_session
    super(action)
    finalize_session
  else
    raise ActionNotFound, "Action '#{action}' was not found in #{self.class}" 
  end
  @_benchmarks[:action_time] = Time.now - start
  MERB_LOGGER.info("Time spent in #{self.class}##{action} action: #{@_benchmarks[:action_time]} seconds")
end
In the dispatch method you can see that we check that the action about to be called is in self.class.callable_actions. This is a hash built at server boot for every controller class that includes only that controller’s own public methods. Private, protected and inherited methods are not visible via the web, for safety’s sake. Once we know the action is callable we call setup_session, which, unsurprisingly, sets up the session by pulling it out of your cookie, the database or memcached, depending on which session container you are using.
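
One way to build such a whitelist is with public_instance_methods(false), which skips inherited methods. This is only a sketch under that assumption: BaseController and this callable_actions are hypothetical, and the hash here is built lazily on first use instead of at server boot.

```ruby
class BaseController
  def self.callable_actions
    # false => only methods defined directly on this class, not inherited ones
    @callable_actions ||= public_instance_methods(false).each_with_object({}) do |name, actions|
      actions[name.to_s] = true
    end
  end

  def render
    "rendered"
  end
end

class Posts < BaseController
  def index
    "all posts"
  end

  def show
    "one post"
  end

  private

  def secret_helper; end
end

p Posts.callable_actions.keys.sort         # ["index", "show"]
p Posts.callable_actions["render"]         # nil, because render is inherited
p Posts.callable_actions["secret_helper"]  # nil, because it is private
```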

You can see that we now call super(action), let’s take a look at what is going on there. Please open lib/merb/abstract_controller.rb and have a peek at Merb::AbstractController#dispatch.

def dispatch(action=:to_s)
  caught = catch(:halt) do
    start = Time.now
    result = call_filters(before_filters)
    @_benchmarks[:before_filters_time] = Time.now - start if before_filters
    result
  end
...

Here you can see how Merb’s before filters work when you throw :halt. We use Ruby’s catch/throw stack unwinding to wrap the before filter chain in a catch(:halt) block. This lets us call throw :halt from anywhere in the before filter chain to stop the chain and return a result. We set caught to the value returned from the filter chain. Have a peek at call_filters in the same class: we basically just iterate over all of our before filters and call them depending on whether they are a Proc, Symbol or String. If any filter throws :halt we return to the catch in the dispatch method. If all before filters complete without errors or calling throw :halt, we return the :filter_chain_completed symbol.
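
If catch/throw is new to you, this small sketch shows the pattern on its own. MiniController and its filters are invented for the example; only the catch(:halt) wrapping follows the description above.

```ruby
class MiniController
  def initialize(filters)
    @filters = filters
  end

  # Run each filter in order; reaching the end means nothing halted
  def call_filters
    @filters.each do |filter|
      case filter
      when Symbol, String then send(filter)
      when Proc           then filter.call(self)
      end
    end
    :filter_chain_completed
  end

  # catch returns the block's value, or whatever was passed to throw :halt
  def run
    catch(:halt) { call_filters }
  end

  def require_login
    throw :halt, "Please log in"
  end

  def noop; end
end

p MiniController.new([:noop, proc { |c| c.noop }]).run  # :filter_chain_completed
p MiniController.new([:noop, :require_login]).run       # "Please log in"
p MiniController.new([proc { throw :halt }]).run        # nil
```

A bare throw :halt makes catch return nil, which is why the dispatch method has a separate branch for nil.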

Now back up in Merb::AbstractController#dispatch:

  ...

  @_body = case caught
  when :filter_chain_completed
    call_action(action)
  when String
    caught
  when nil
    filters_halted
  when Symbol
    send(caught)
  when Proc
    caught.call(self)  
  else
    raise MerbControllerError, "The before filter chain is broken dude. wtf?" 
  end
  start = Time.now
  call_filters(after_filters) 
  @_benchmarks[:after_filters_time] = Time.now - start if after_filters
end

Notice that we set @_body to the value of the case statement on the caught variable. If caught is :filter_chain_completed, we call Merb::AbstractController#call_action (this is the normal path, taken when all your filters ran without halting).

This is where Merb’s action arguments come into play. Merb allows you to define your controller actions to take arguments based on the params, for example:

class Posts < Application
  def show(id)
    @post = Post.find(id)
  end
end

And here is Merb::AbstractController#call_action:

def call_action(action)
  # [[:id], [:foo, 7]]
  args = self.class.action_argument_list[action.to_sym].map do |arg, default|
    raise BadRequest unless params[arg.to_sym] || default
    params[arg.to_sym] || default
  end
  send(action, *args)
end

This is just sugar so you don’t have to use params[:id] all the time. Any arguments your action takes are matched against what is in params, and the action is called with the corresponding values. This includes the ability to set default values, like def foo(id, name="bob"). If id is not supplied, the action raises BadRequest, but name can be left out and will use the default value.
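
You can approximate the same sugar in plain Ruby. The sketch below uses Method#parameters to discover argument names, which is not how Merb builds action_argument_list; PostsController and this BadRequest class are stand-ins made up for the example.

```ruby
class BadRequest < StandardError; end

class PostsController
  attr_reader :params

  def initialize(params)
    @params = params
  end

  def show(id, format = "html")
    "post #{id} as #{format}"
  end

  def call_action(action)
    args = []
    # parameters yields pairs such as [:req, :id] and [:opt, :format]
    method(action).parameters.each do |kind, name|
      value = params[name]
      case kind
      when :req
        raise BadRequest, "missing required param: #{name}" if value.nil?
        args << value
      when :opt
        break if value.nil? # leave this and any later optionals to their defaults
        args << value
      end
    end
    send(action, *args)
  end
end

puts PostsController.new({ id: "42" }).call_action(:show)                 # prints post 42 as html
puts PostsController.new({ id: "42", format: "xml" }).call_action(:show)  # prints post 42 as xml
```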

Now if :filter_chain_completed is not the return value of your filter chain, there are a few different ways Merb handles it. If the value is a string, it becomes the response body. If it is a symbol other than :filter_chain_completed, the method that corresponds to that symbol is called. If it is a proc, it is called with the controller object as an argument and its return value is used. If the value is nil, we call the filters_halted method, which you can override in your controllers to return whatever you want.

At the end of Merb::AbstractController#dispatch you can see that we call the after filters before we are done with the whole dispatch. If everything went well and no exceptions were raised, we unwind the stack back to the Dispatcher, which returns a tuple [controller, action] to the mongrel handler. Back in the mongrel handler you can follow along and see how we massage the headers and write out our response to the response object that mongrel sends to the client. This is mostly busywork and probably not interesting enough to cover.

But what happens if our whole dispatch throws an error?
...
rescue => exception
  MERB_LOGGER.error(Merb.exception(exception))
  exception = controller_exception(exception)
  dispatch_exception(request, response, exception)
end

So if something went wrong during the dispatch to your code, we want to show a pretty stack trace page based on Merb’s controller exceptions framework. We call Merb::Dispatcher.dispatch_exception to build our Exceptions controller instance and call the right method on it:

# Re-route the current request to the Exception controller
# if it is available, and try to render the exception nicely
# if it is not available then just render a simple text error
def dispatch_exception(request, response, exception)
  klass = Exceptions rescue Controller
  request.params[:original_params] = request.params.dup rescue {}
  request.params[:original_session] = request.session.dup rescue {}
  request.params[:original_cookies] = request.cookies.dup rescue {}
  request.params[:exception] = exception
  request.params[:action] = exception.name
  dispatch_action(klass, exception.name, request, response, exception.class::STATUS)

Let’s say that you raise BadRequest in one of your actions where a find failed to find a record.

def foo
  @foo = Foo.find :first
  raise BadRequest unless @foo
  render
end    
Since we raised BadRequest, we will instantiate the Exceptions controller class and call the bad_request method on it. This allows us to show dynamic error pages with whatever we want on them.
class Exceptions < Application
  def bad_request
    render
  end
end
This will render the app/views/exceptions/bad_request.html.erb template, which gives us a ton of flexibility in handling exceptions. All of the HTTP status codes have their own Merb exception class with the proper status code, and you can raise any of these to get this behavior. Or you can make your own simply by subclassing Merb::ControllerExceptions::Base:
class My404Error < Merb::ControllerExceptions::Base
  STATUS=404
end
You just have to specify the STATUS constant to tell it what HTTP status code to return. The exceptions controller is just like any other controller, so you can have before and after filters and rendering; anything you can do in a normal controller you can also do in an exceptions controller. An exceptions controller can also use the provides API to return errors in different MIME types such as JSON, text or XML.
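
To see how an exception class can carry both a status code and the name of the action to call, here is a self-contained sketch. Everything in it is hypothetical: SketchExceptions, SketchExceptionsController and render_exception are invented names, and the underscoring in name is my assumption about how a class name could map to an action, not Merb's actual code.

```ruby
module SketchExceptions
  class Base < StandardError
    STATUS = 500

    # Turns a class name like "SketchExceptions::BadRequest" into "bad_request"
    def name
      self.class.name.split("::").last
          .gsub(/([a-z\d])([A-Z])/, '\1_\2')
          .downcase
    end
  end

  class BadRequest < Base
    STATUS = 400
  end

  class My404Error < Base
    STATUS = 404
  end
end

class SketchExceptionsController
  def bad_request
    "bad request page"
  end

  def my404_error
    "custom 404 page"
  end
end

# Look up the action from the exception's name and the status from its class
def render_exception(exception)
  body = SketchExceptionsController.new.send(exception.name)
  [exception.class::STATUS, body]
end

p render_exception(SketchExceptions::BadRequest.new)  # [400, "bad request page"]
p render_exception(SketchExceptions::My404Error.new)  # [404, "custom 404 page"]
```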

That’s it for this episode of the Merb code base tour. We didn’t touch on what happens in your controller actions such as the provides API and rendering API. These will be left for another post.

Comments

  1. Joe said about 2 hours later:
    Awesome, thanks! So, if I have a Rails app, and want to use merb handlers for a couple various actions, can I reuse the models from the Rails app?
  2. Shalev said about 6 hours later:
    Merb is ORM agnostic, which means you can use ActiveRecord just as you can in Rails. So yes, your AR models will work with Merb. Just remember to set use_orm :activerecord in dependencies.yml.
  3. Jason Seifer said about 22 hours later:
    Great article! Really well written and informative.
  4. Daniel Neighman said 1 day later:
    Great article Ezra. That really makes what's going very clear. Man I love Merb ;)
  5. Hampton said 1 day later:
    Awesome dude! I'm going to pour through this before my next hack-attack.
  6. Damien Tanner said 1 day later:
    Superb as always Ezra.
  7. Daya Sharma said 3 days later:
    Thread safe, ORM agnostic and customizable exceptions among others features will definitely give RoR a run for its money. Thanks Ezra.
  8. Shalev said 3 days later:
    I like to see things visually, so I made this diagram that follows your excellent post.
  9. Yaroslav Markin said 11 days later:
    Thanks for the article, Ezra!
