Positive Incline Mike Burrows (@asplake) moving on up, positively

December 21, 2009

DRY up your routes – a Pylons routing refactoring

Filed under: Programming,Web Integration — Tags: , , , , , — Mike @ 11:43 am

[See UPDATES]

This post is in several acts, each one a refactoring. Taking the stage will be some familiar-looking application code; behind the scenes lurks some enhanced framework code that makes the refactorings possible.

Our story starts with an old-school (pre-REST) routing configuration. It’s in Python, for Pylons and other Routes-based web frameworks, but let me reassure the Ruby audience that Routes borrows very heavily from Rails and their routing configurations look quite similar.

We will finish with collection(), an experimental (but I hope worthy) alternative to resource(), the routing helper that rose to prominence when Rails first made its big push to REST.

The starting point: textbook routing.py

from routes import Mapper

def make_map():

    ...

    # Releases - Collection
    mapper.connect('releases', '/releases',
              controller='release', action='index', conditions=dict(method='GET'))
    mapper.connect('create_release', '/releases',
              controller='release', action='create', conditions=dict(method='POST'))
    mapper.connect('new_release', '/releases/new',
              controller='release', action='new', conditions=dict(method='GET'))

    # Releases - Members
    mapper.connect('release', '/releases/{id}',
              controller='release', action='show',
              requirements=dict(id=r'\d+'), conditions=dict(method='GET'))
    mapper.connect('update_release', '/releases/{id}',
              controller='release', action='update',
              requirements=dict(id=r'\d+'), conditions=dict(method='PUT'))
    mapper.connect('delete_release', '/releases/{id}',
              controller='release', action='delete',
              requirements=dict(id=r'\d+'), conditions=dict(method='DELETE'))
    mapper.connect('edit_release', '/releases/{id}/edit',
              controller='release', action='edit',
              requirements=dict(id=r'\d+'), conditions=dict(method='GET'))
    mapper.connect('release_notes', '/releases/{id}/notes',
              controller='release', action='release_notes',
              requirements=dict(id=r'\d+'), conditions=dict(method='GET'))

Full of duplication, verbose, even ugly. Just because it sits in a config directory, does it have to look that bad?

Refactoring 1: Introducing submapper()

New in Routes 1.11 (so really quite new), submapper() provides the means to pull out parameters previously shared across multiple connect() calls.

One particularly nice feature is that the SubMapper objects returned by submapper() support the Python context manager protocol, so if you’re on Python 2.5 or above you can use them with the with statement like this (on Python 2.5 itself, add from __future__ import with_statement at the top of the module):

from release_tool.lib.mapper import Mapper

def make_map():

    ...

    # Releases - Collection
    with mapper.submapper(
                    controller='release',
                    path_prefix='/releases') as c:
        c.connect('releases', '', action='index',
                conditions=dict(method='GET'))
        c.connect('create_release', '', action='create',
                conditions=dict(method='POST'))
        c.connect('new_release', '/new', action='new',
                conditions=dict(method='GET'))

    # Releases - Members
    with mapper.submapper(
                    controller='release',
                    path_prefix='/releases/{id}',
                    requirements=dict(id=r'\d+')) as m:
        m.connect('release', '', action='show',
                conditions=dict(method='GET'))
        m.connect('update_release', '', action='update',
                conditions=dict(method='PUT'))
        m.connect('delete_release', '', action='delete',
                conditions=dict(method='DELETE'))
        m.connect('edit_release', '/edit', action='edit',
                conditions=dict(method='GET'))
        m.connect('release_notes', '/notes', action='release_notes',
                conditions=dict(method='GET'))

That’s a definite improvement (and credit where credit is due – submappers are a great innovation, though a small bug requires our local extensions to be imported here), but notice that there’s still some duplication between the two submapper blocks, the first one corresponding to the collection resource, the second to that same collection’s members. Wouldn’t it be cool if we could nest them?

Refactoring 2: Submapper nesting

On the surface, the Routes 1.11 API doesn’t appear to support submapper nesting, but the SubMapper objects do indeed nest. Adding a submapper() method to SubMapper is a trivial change, and it takes only minor internal tweaks for deeper nestings to function correctly.

    # Releases
    with mapper.submapper(
                    controller='release',
                    path_prefix='/releases') as c:
        # Collection
        c.connect('releases', '', action='index',
                conditions=dict(method='GET'))
        c.connect('create_release', '', action='create',
                conditions=dict(method='POST'))
        c.connect('new_release', '/new', action='new',
                conditions=dict(method='GET'))
        # Members
        with c.submapper(
                    path_prefix='/{id}',
                    requirements=dict(id=r'\d+')) as m:
            m.connect('release', '', action='show',
                    conditions=dict(method='GET'))
            m.connect('update_release', '', action='update',
                    conditions=dict(method='PUT'))
            m.connect('delete_release', '', action='delete',
                    conditions=dict(method='DELETE'))
            m.connect('edit_release', '/edit', action='edit',
                    conditions=dict(method='GET'))
            m.connect('release_notes', '/notes', action='release_notes',
                    conditions=dict(method='GET'))
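
To picture what the nesting does, here is a minimal, self-contained sketch. ToyMapper and ToySubMapper are hypothetical stand-ins written for this illustration, not the Routes classes: they only record routes, but they show shared keyword arguments being merged and each level's path_prefix being prepended on the way up.

```python
class ToyMapper:
    """Hypothetical stand-in for a route mapper; it only records routes."""

    def __init__(self):
        self.routes = []  # list of (name, path, options) tuples

    def connect(self, name, path, **options):
        self.routes.append((name, path, options))

    def submapper(self, **shared):
        return ToySubMapper(self, shared)


class ToySubMapper:
    """Passes shared keyword arguments on to every connect() call."""

    def __init__(self, parent, shared):
        self.parent = parent
        self.shared = shared

    def connect(self, name, path, **options):
        merged = dict(self.shared)
        prefix = merged.pop('path_prefix', '')
        merged.update(options)  # per-route options win over shared ones
        # The parent may itself be a submapper, which adds its own prefix
        self.parent.connect(name, prefix + path, **merged)

    def submapper(self, **shared):
        return ToySubMapper(self, shared)

    # Context manager protocol, so the object works in a with statement
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # never swallow exceptions


mapper = ToyMapper()
with mapper.submapper(controller='release', path_prefix='/releases') as c:
    c.connect('releases', '', action='index')
    with c.submapper(path_prefix='/{id}',
                     requirements=dict(id=r'\d+')) as m:
        m.connect('edit_release', '/edit', action='edit')

for name, path, options in mapper.routes:
    print(name, path, sorted(options))
# releases /releases ['action', 'controller']
# edit_release /releases/{id}/edit ['action', 'controller', 'requirements']
```

The inner submapper never needs to know the full path; it only contributes its own segment.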

We seem to be on to something here, but what about the repetition (in one guise or another) of the resource name “release”? And what’s with those strange empty string ('') parameters?

Refactoring 3: Links and actions

To fix the repetition issue it’s clear that we need helpers that will generate those connect() calls for us more intelligently. But before we do that, let’s recognise that there are really two kinds of routes being generated here:

  1. Links to singleton subresources that support only one HTTP method, the most ubiquitous examples being the new and edit representations for which GET is the applicable HTTP method, but special subresources that are the targets for POST actions are common too;
  2. Actions that correspond directly to HTTP methods: index and create for GET and POST[1] on collection resources, show, update, delete for GET, PUT, DELETE on member resources.

It’s that second type that gives rise to those strange empty strings as there is no further navigation to be done relative to the target resource.

So let’s add link() and action() helpers:

    # Releases
    with mapper.submapper(
                    controller='release',
                    path_prefix='/releases') as c:
        # Collection
        c.action(action='index', name='releases')
        c.action(action='create', method='POST')
        c.link('new')
        # Members
        with c.submapper(
                    path_prefix='/{id}',
                    requirements=dict(id=r'\d+')) as m:
            m.action(action='show', name='release')
            m.action(action='update', method='PUT')
            m.action(action='delete', method='DELETE')
            m.link(rel='edit')
            m.link(rel='notes', name='release_notes')
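
As a rough guide to what such helpers might expand to, here is a standalone sketch. The two functions are hypothetical and simply return the arguments a connect() call would receive; the default naming (verb, underscore, member name) and the choice to default the action to the rel are assumptions made for the illustration.

```python
def action(member_name, action, method='GET', name=None):
    """Route for an HTTP method applied directly to the target resource."""
    return dict(
        name=name or '%s_%s' % (action, member_name),
        path='',  # no further navigation, hence the empty string
        action=action,
        conditions=dict(method=method))


def link(member_name, rel, name=None, action=None, method='GET'):
    """Route for a singleton subresource such as new or edit."""
    return dict(
        name=name or '%s_%s' % (rel, member_name),
        path='/' + rel,
        action=action or rel,  # assumed default: action named after the rel
        conditions=dict(method=method))


routes = [
    action('release', 'show', name='release'),
    action('release', 'update', method='PUT'),
    link('release', 'edit'),
    link('release', 'notes', name='release_notes'),
]
for route in routes:
    print(route['name'], repr(route['path']), route['conditions']['method'])
# release '' GET
# update_release '' PUT
# edit_release '/edit' GET
# release_notes '/notes' GET
```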

Things are still moving in the right direction, but given that most of those links and actions follow a very well-worn convention, how about specific helpers for those?

Refactoring 4: More helpers

I’ll be honest – this is just an intermediate stage. You can see where we’re headed now though:

    # Releases
    with mapper.submapper(
                    collection_name='releases',
                    controller='release',
                    path_prefix='/releases') as c:
        # Collection
        c.index()
        c.create()
        c.new()
        # Members
        with c.submapper(
                    path_prefix='/{id}',
                    requirements=dict(id=r'\d+')) as m:
            m.show()
            m.update()
            m.delete()
            m.edit()
            m.link(rel='notes', name='release_notes')

Refactoring 5: Parameter-driven generation

We’ll be using those same names a lot across our resources, so let’s just identify the ones we want in a list:

    # Releases
    with mapper.submapper(
                    collection_name='releases',
                    controller='release',
                    path_prefix='/releases',
                    actions=['index', 'create', 'new']) as c:
        # Members
        with c.submapper(
                    path_prefix='/{id}',
                    requirements=dict(id=r'\d+'),
                    actions=['show', 'update', 'delete', 'edit']) as m:
            m.link(rel='notes', name='release_notes')

But even this pattern will be repeated across resources, so let’s define a collection() helper that does it all:

Refactoring 6: Using the collection() helper

    with mapper.collection(
                    'releases',
                    'release',
                    member_options={
                        'requirements': {'id': r'\d+'}}) as c:
        c.member.link(rel='notes', name='release_notes')

And we’re done.
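
For a sense of how much a helper like this generates, the sketch below rebuilds the conventional route table from just the collection and member names. collection_routes() and its two lookup tables are hypothetical names used only for this illustration; the output matches the names and paths of the hand-written routes at the start of the post (minus the custom notes link).

```python
COLLECTION_ACTIONS = {
    # action: (HTTP method, path suffix)
    'index': ('GET', ''),
    'create': ('POST', ''),
    'new': ('GET', '/new'),
}
MEMBER_ACTIONS = {
    'show': ('GET', ''),
    'update': ('PUT', ''),
    'delete': ('DELETE', ''),
    'edit': ('GET', '/edit'),
}


def collection_routes(collection_name, member_name,
                      collection_actions=('index', 'create', 'new'),
                      member_actions=('show', 'update', 'delete', 'edit')):
    """Return (name, method, path, action) tuples for a collection."""
    base = '/' + collection_name
    routes = []
    for act in collection_actions:
        method, suffix = COLLECTION_ACTIONS[act]
        # index is named after the collection, the rest after the member
        name = collection_name if act == 'index' else '%s_%s' % (act, member_name)
        routes.append((name, method, base + suffix, act))
    for act in member_actions:
        method, suffix = MEMBER_ACTIONS[act]
        name = member_name if act == 'show' else '%s_%s' % (act, member_name)
        routes.append((name, method, base + '/{id}' + suffix, act))
    return routes


release_routes = collection_routes('releases', 'release')
for route in release_routes:
    print('%-15s %-7s %-22s %s' % route)
```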

Let’s compare this to the equivalent resource() call:

    # Releases
    mapper.resource(
                'release',
                'releases',
                controller='release',
                # no requirements!
                member = {'notes':'GET'})

Superficially similar, but I’m unable to specify requirements on member paths and I don’t get to name my custom route either – the generated name “notes_release” just isn’t right! The bottom line is that resource() is approaching a dead end; its only way forward is to make its parameters ever more complex.

So… how about we put resource() on notice and bring in something more flexible and – dare I say – Pythonic? And wouldn’t it look just as nice in Ruby too? Answers on a postcard please…

August 4, 2009

described_routes is Rack middleware

Filed under: Web Integration — Tags: , , , , , , , — Mike @ 12:01 pm

Last week I received this:

#described_routes could make beautiful middleware

Why didn’t I think of that?

So I took a quick look at Rack, found there was almost nothing to learn, and over the weekend made the change. And it was well worth it: integrating described_routes into your Rails application is now much easier. There’s no need to modify your routes (the middleware recognizes and serves requests to /described_routes automatically) or your controllers (the discovery protocol’s link headers are added automatically to other requests). In fact the old integration method looks so ugly by comparison that I’ve deprecated it – it’s *that* embarrassing!

Now, run-time integration needs only this modification to your environment.rb’s Rails::Initializer.run block:

require 'described_routes/middleware/rails'
Rails::Initializer.run do |config|
  ...
  config.middleware.use DescribedRoutes::Middleware::Rails
  ...
end

Revised instructions (compare with the old):

  1. Install the described_routes gem for the server
  2. Add build-time integration to the server (a one-liner to add some useful Rake tasks)
  3. Add run-time integration to the server (just the environment.rb modification above)
  4. Install and run path-to (for an “instant” client API)
  5. Profit!

Yes, it’s for Rack for Rails

The sharp-eyed reader will have noticed that despite the move to Rack, we’re still discussing a Rails integration. What about described_routes for other Rack-aware or Rack-based frameworks?

Most of the new middleware’s functionality exists in an abstract class DescribedRoutes::Middleware::Base but it needs two methods implemented for each framework:

  1. get_resource_templates – get named routes from the application and convert to ResourceTemplates
  2. get_resource_routing – map a request to a ResourceTemplate and its parameter list

In DescribedRoutes::Middleware::Rails:

  1. get_resource_templates hooks into described_routes code originally based on ‘rake routes’, and
  2. get_resource_routing extracts the controller name, action name and path parameters from a Rack environment member ‘action_controller.request.path_parameters’ populated by Rails as it processes the request – our stuff must happen afterwards.  The route name is reverse-mapped from the controller and action name.[1]

It should be clear now that both methods are necessarily framework-specific; indeed Rack does not provide routing itself.
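
The same split between a generic base and two framework-specific methods could be sketched for WSGI roughly as follows. Everything here is an assumption made for illustration: the class names, the plain-text metadata response and the rel="describedby" link value are not taken from the described_routes source.

```python
from abc import ABC, abstractmethod


class MiddlewareBase(ABC):
    """Hypothetical WSGI analogue of DescribedRoutes::Middleware::Base."""

    def __init__(self, app, metadata_path='/described_routes'):
        self.app = app
        self.metadata_path = metadata_path

    @abstractmethod
    def get_resource_templates(self):
        """Return the application's named routes as a list of dicts."""

    @abstractmethod
    def get_resource_routing(self, environ):
        """Return (route_name, params) for the request, or None."""

    def __call__(self, environ, start_response):
        if environ.get('PATH_INFO') == self.metadata_path:
            # Serve the metadata itself (plain text here, for brevity)
            body = '\n'.join(t['name'] for t in self.get_resource_templates())
            data = body.encode('utf-8')
            start_response('200 OK', [('Content-Type', 'text/plain'),
                                      ('Content-Length', str(len(data)))])
            return [data]

        def start_with_link(status, headers, exc_info=None):
            # Runs once the wrapped app has started handling the request
            routing = self.get_resource_routing(environ)
            if routing is not None:
                name, _params = routing
                link = '<%s/%s>; rel="describedby"' % (self.metadata_path, name)
                headers = list(headers) + [('Link', link)]
            return start_response(status, headers, exc_info)

        return self.app(environ, start_with_link)


class ToyMiddleware(MiddlewareBase):
    """Framework-specific part, faked here with a hard-coded route table."""

    def get_resource_templates(self):
        return [{'name': 'users'}, {'name': 'user'}]

    def get_resource_routing(self, environ):
        if environ['PATH_INFO'].startswith('/users/'):
            return 'user', {'user_id': environ['PATH_INFO'].split('/')[2]}
        return None


def toy_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


seen = []  # records what the server-side start_response receives


def record(status, headers, exc_info=None):
    seen.append((status, headers))


body = ToyMiddleware(toy_app)({'PATH_INFO': '/users/dojo'}, record)
print(seen[0][1][-1])
# ('Link', '</described_routes/user>; rel="describedby"')
```

Because the link is added inside the wrapped start_response, the routing lookup happens after the application has begun processing the request, which mirrors the ordering constraint noted for Rails above.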

The test of described_routes’s underlying framework-neutrality will be its integration with a framework other than Ruby on Rails. This should be easier to achieve now than previously; perhaps someone with more knowledge and need than me will beat me to it (please be my guest!). I’m tempted meanwhile to double the challenge and attempt it in Python for WSGI-based frameworks (wish me luck!).

[1]Actually it’s a shame that Rails doesn’t make the request’s route name or route object available anywhere – is there anyone else who would use it?

July 27, 2009

Putting it all together – ResourceTemplates, described_routes and path-to

Filed under: Web Integration — Tags: , , , — Mike @ 10:18 pm

We’ve watched described_routes and path-to grow here over the past few weeks. And fun though it has been for me, it must be hard to get a good overview from blog posts triggered by design challenges! So here goes: an attempt at a one-post overview.

In a sentence?

Add described_routes to your Rails application to give it header-based site discovery with ResourceTemplate metadata that enables instant Ruby APIs on the client side with path-to.

And if I’m not using Rails or even Ruby?

There’s library support for ResourceTemplate metadata in Ruby (for the moment it’s included as part of described_routes) and Python (see described_routes-py). There’s a simpler version of path-to available in Python also, namely path-to-py. And there’s nothing Rails-specific or complicated about ResourceTemplates either – strip away the JSON, YAML or XML formatting and they’re not much more than named resources arranged hierarchically with their URI templates and supported HTTP methods as properties – and (so I’m told) can be useful even without path-to.
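
Stripped of formatting, that structure can be pictured as plain nested data. The sketch below is illustrative only: the field names (name, uri_template, params, options, resource_templates) are assumed here for the example and should be checked against the described_routes README before relying on them.

```python
# A hypothetical two-level hierarchy: a users collection and its members
resource_templates = [
    {
        'name': 'users',
        'uri_template': 'http://example.com/users',
        'options': ['GET', 'POST'],
        'resource_templates': [
            {
                'name': 'user',
                'uri_template': 'http://example.com/users/{user_id}',
                'params': ['user_id'],
                'options': ['GET', 'PUT', 'DELETE'],
                'resource_templates': [],
            },
        ],
    },
]


def describe(templates, depth=0):
    """Flatten the hierarchy into indented text lines."""
    lines = []
    for template in templates:
        lines.append('%s%s %s %s' % ('  ' * depth,
                                     template['name'],
                                     ','.join(template['options']),
                                     template['uri_template']))
        lines.extend(describe(template.get('resource_templates', []), depth + 1))
    return lines


print('\n'.join(describe(resource_templates)))
# users GET,POST http://example.com/users
#   user GET,PUT,DELETE http://example.com/users/{user_id}
```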

OK – I’m sold! What do I have to do?

Just follow these simple steps:

  1. Install described_routes for the server
  2. Add build-time integration to the server
  3. Add basic run-time integration to the server
  4. Add site discovery to the server
  5. Install and run path-to
  6. Profit!

The README files of the two gems tell you all you need to know in detail, but here in one place:

1. Install described_routes for the server

$ sudo gem install described_routes

2. Add build-time integration to the server

This is just a set of Rake tasks that lets you see immediately what the metadata looks like. In your Rakefile:

require 'tasks/described_routes'

Then try it out:

$ rake --tasks described_routes
rake described_routes:json        # Describe resource structure in JSON format
rake described_routes:xml         # Describe resource structure in XML format
rake described_routes:yaml        # Describe resource structure in YAML format
rake described_routes:text        # Describe resource structure in text (comparable to "rake routes")

Specify the base URI of your app with "BASE=http://..." to see full URIs in the output.

3. Add basic run-time integration to the server

Somewhere in your application include the controller, perhaps in an initializer:

require 'described_routes/rails_controller'

Add the following route in config/routes.rb:

map.resources :described_routes, :controller => "described_routes/rails"

You can now browse to /described_routes(.format) and /described_routes/{controller_name}(.format) and see the data generated at run time.

4. Add site discovery to the server

Site discovery (linking resources to their resource-specific and site-wide metadata) works via link headers (“Link:”) added to the responses served by one or more controllers. This has a double benefit:

i) Resources gain some type information derived from the Rails route name of the resource that may be of value to clients
ii) A path-to client (or any other client interested in the ResourceTemplate metadata) can be initialised from a regular resource URI; prior knowledge of metadata location is not needed.

According to your requirements, add the set_link_header filter to either the controller of your root resource (and/or other specific controllers) or to ApplicationController in order to benefit all controllers:

require 'described_routes/helpers/described_routes_helper'

class MyController < ApplicationController
  include DescribedRoutes::DescribedRoutesHelper
  after_filter :set_link_header
end

5. Install and run path-to

$ sudo gem install path-to

It is now a one-liner to bootstrap a client application, in this example a test blog with user and article resources:

require "path-to/described_routes"

# bootstrap a path-to client from the test_rails_app provided in described_routes
app = PathTo::DescribedRoutes::Application.discover("http://localhost:3000/")
#=> <PathTo::DescribedRoutes::Application>

# get user 'dojo'
puts app.users['dojo'].get
#=> '<html>...</html>'

# get a JSON representation of the recent articles of user 'dojo'
puts app.users['dojo'].articles.recent['format' => 'json'].get
#=> [...]

Profit!

This bit is up to you, but metadata-enhanced web apps and instant client APIs achieved for so little work have to be worth something!

July 23, 2009

New link_header gem

Filed under: Web Integration — Tags: , , , , , — Mike @ 9:26 pm

My latest project on github and Rubyforge is link_header, a small rubygem for parsing and generating HTTP link headers as per the latest spec draft-nottingham-http-link-header-06.txt.

Usage

The usual install:

sudo gem install link_header

The library’s LinkHeader and LinkHeader::Link classes follow a pattern established in the ResourceTemplate and ResourceTemplates classes in that they offer easy conversions both to & from Ruby primitives, i.e. Arrays, Strings etc. This in turn makes them easy to prettyprint, convert to & from JSON and YAML, create from test fixtures, and so on. [Aside: @kevinrutherford and I discussed this idea on Twitter a few days ago in response to his blog post “factory method in ruby”. It’s worth a read.]

Link attribute names can appear more than once, so I have chosen a list of attribute/value pairs rather than a hash to represent link attributes. Link objects do however have an #attrs method that will lazily generate a hash if that’s convenient (it’s left to you to decide whether it’s safe!). There’s an example of this below.

So:

require "link_header"
require "pp"

#
# Create a LinkHeader with Link objects
#
link_header = LinkHeader.new([
  LinkHeader::Link.new("http://example.com/foo", [["rel", "self"]]),
  LinkHeader::Link.new("http://example.com/",    [["rel", "up"]])])

puts link_header.to_s
#=> <http://example.com/foo>; rel="self", <http://example.com/>; rel="up"

link_header.links.each do |link|
  puts "href #{link.href.inspect}, attr_pairs #{link.attr_pairs.inspect}, attrs #{link.attrs.inspect}"
end
#=> href "http://example.com/foo", attr_pairs [["rel", "self"]], attrs {"rel"=>"self"}
#   href "http://example.com/", attr_pairs [["rel", "up"]], attrs {"rel"=>"up"}

#
# Create a LinkHeader from raw (JSON-friendly) data
#
puts LinkHeader.new([
  ["http://example.com/foo", [["rel", "self"]]],
  ["http://example.com/",    [["rel", "up"]]]]).to_s
#=> <http://example.com/foo>; rel="self", <http://example.com/>; rel="up"

#
# Parse a link header into a LinkHeader object then produce its raw data representation
#
pp LinkHeader.parse('<http://example.com/foo>; rel="self", <http://example.com/>; rel = "up"').to_a
#=> [["http://example.com/foo", [["rel", "self"]]],
#    ["http://example.com/", [["rel", "up"]]]]
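
With the promised Python version in mind, the pairs-versus-hash design can be sketched in a few lines. This is a simplified illustration, not a port of the gem: the function names are made up, and the regular expressions handle only double-quoted attribute values.

```python
import re

# <href> followed by any number of ; name="value" attributes
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
ATTR_RE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_link_header(value):
    """Return [[href, [[attr, value], ...]], ...], keeping repeated attributes."""
    return [[href, [list(pair) for pair in ATTR_RE.findall(attrs)]]
            for href, attrs in LINK_RE.findall(value)]


def attrs_as_dict(attr_pairs):
    """Convenience view; a repeated attribute keeps only its last value."""
    return dict(attr_pairs)


parsed = parse_link_header(
    '<http://example.com/foo>; rel="self", <http://example.com/>; rel = "up"')
print(parsed)
# [['http://example.com/foo', [['rel', 'self']]], ['http://example.com/', [['rel', 'up']]]]

repeated = parse_link_header('<http://example.com/>; rel="up"; rel="index"')[0][1]
print(repeated, attrs_as_dict(repeated))
# [['rel', 'up'], ['rel', 'index']] {'rel': 'index'}
```

The second example shows why the pair list is the primary representation: converting to a dictionary silently drops the first rel.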

Later…

My next programming task will be some minor refactoring on described_routes and path-to to take advantage of this new gem. The driver behind all this is an efficient discovery protocol and a significant reduction in the number of links reported by default – I realised that it was wasteful to produce multiple links on every request that are (let’s be honest) of no interest at all to clients, when just one of those links points to metadata that carries that same information and more! Then for those server apps that generate the correct headers, a short one-liner will initialize a path-to client given the address of any resource served by the application, i.e. without special knowledge of the location of the app’s metadata resources.

Refactoring aside, the described_routes part of this is done (so servers support the protocol already); I just need to finish the path-to part to take advantage of it on the client side.

I can’t make any promises about timelines at the moment (new job starts soon) but a Python version should be forthcoming soon(ish) also. Meanwhile, enjoy the Ruby version if you can!

June 30, 2009

Bootstrapping REST

Filed under: Web Integration — Tags: , , , , , , — Mike @ 6:55 pm

In this article I’m approaching REST from the perspective of client/server interaction, focusing most especially (and slightly unusually I think) on the needs of the client and you, the client’s developer. I hope you find it helpful.

A caveat (no apology): my interest is in structure and navigation. HTTP has mechanisms already for clients and servers to negotiate content types and I’ve made a conscious design decision to rely on their existence. In other words, I ignore it completely here. Orthogonality is a wonderful thing!

We start at the beginning:

The completely hand-crafted client

At the most extreme, you’re working against an undocumented, unsupported server API that lacks regular structure. Any complexity implies significant reverse-engineering effort, made worse if you’re tracking a moving target. If you’re lucky though, you’ll be working against a supported and stable API.

Relying on regularity

Servers built using frameworks such as Rails (and even many that aren’t) will tend to exhibit some predictable patterns. The details may be framework-specific, but a collection resource, say “users”, can be expected to support these (or similar) operations and subresources:

GET, POST         http://www.example.com/users
GET               http://www.example.com/users/new
GET, PUT, DELETE  http://www.example.com/users/{user_id}
GET               http://www.example.com/users/{user_id}/edit

and a nested collection resource (each user’s articles, say) can be expected to follow a similar structure, rooted on a specific user resource.

GET, POST         http://www.example.com/users/{user_id}/articles
GET               http://www.example.com/users/{user_id}/articles/new
GET, PUT, DELETE  http://www.example.com/users/{user_id}/articles/{article_id}
GET               http://www.example.com/users/{user_id}/articles/{article_id}/edit

Armed therefore with just the knowledge of the user and articles collections, your client’s internal object model might easily support expressions such as

app.users
app.users.new
app.users[user_id]
app.users[user_id].edit
app.users[user_id].articles
app.users[user_id].articles.new
app.users[user_id].articles[article_id]
app.users[user_id].articles[article_id].edit

You’ll pay a price when the application departs from its usual pattern, but most of the time you’ll be fine. In fact for many client applications the edit and new are irrelevant, so all you need is the basic hierarchical structure and knowledge of the representations you can GET, PUT, POST there, also how to DELETE things.

Model-driven development

One approach to speeding application development is to abstract out that object model and generate skeleton code for both client and server from some common, logical representation. Not only are you spared the task of building the object model, but (given the right toolset) you can expect to have the web client functionality baked in for you. Most importantly, it’s much cheaper to track any changes made on the server.

The downside? You are now tied to a toolset. That might be a price worth paying if you’re in control of both client and server (for an internal application, say), but for many this will be unpalatable.

The no-model option

[Stick with me – this apparent digression makes metadata-driven clients easier to introduce!]

Dynamic languages make it possible to support expressions like

app.users[user_id].articles.new

even if your app object doesn’t actually have a users member (method, attribute or property) and articles and new exist nowhere either. See [1] and [2] to read how (generally) this can be done in Ruby and Python, and [3] for a description of path-to [4], an actual client that can work this way.
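
In Python the core of the trick fits in a dozen lines. The Path class below is a hypothetical illustration written for this article (it is not taken from path-to): attribute and item lookups that would normally fail are turned into further path segments.

```python
class Path:
    """Builds a URI from whatever attribute and item lookups are applied."""

    def __init__(self, uri):
        self.uri = uri

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        if name.startswith('__'):
            raise AttributeError(name)  # leave special methods alone
        return Path('%s/%s' % (self.uri, name))

    def __getitem__(self, key):
        return Path('%s/%s' % (self.uri, key))


app = Path('http://www.example.com')
user_id = 'dojo'
print(app.users[user_id].articles.new.uri)
# http://www.example.com/users/dojo/articles/new
```

Nothing named users, articles or new is defined anywhere; every navigation step is accepted and simply extends the URI.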

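Even without these dynamic features, a similar effect can be achieved with operator overloading. Java itself does not allow user-defined operators, but in a language that does (Scala or C++, say) an expression like this can be made perfectly valid: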

app / users / member(user_id) / articles / _new

The objects returned by such expressions can easily have a URI representation, request methods and so on.
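
Python allows operator overloading too, so the idea can be shown there as a sketch. Node and member() are names invented for this example; the path segments are ordinary strings bound to variables so that the expression reads like the one above.

```python
class Node:
    """A resource reference whose / operator appends a path segment."""

    def __init__(self, uri):
        self.uri = uri

    def __truediv__(self, segment):
        return Node('%s/%s' % (self.uri, segment))

    def __str__(self):
        return self.uri


def member(key):
    """Turn a member key into a path segment."""
    return str(key)


app = Node('http://www.example.com')
users, articles, _new = 'users', 'articles', 'new'
user_id = 'dojo'

resource = app / users / member(user_id) / articles / _new
print(resource)
# http://www.example.com/users/dojo/articles/new
```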

But beware! Suppose a user resource defines an “up” link. Take for example

app.users["dojo"].up

To which of these URIs (below) does this expression refer?

http://www.example.com/users/dojo/up
http://www.example.com/users

We’ll come back to this issue later.

Metadata-driven clients

You can take the no-model approach and validate each navigation against metadata describing the application, so that for a non-existent path foo,

app.foo

raises an error just as the model-driven version would have done. To your client application code, this implied object model looks no different to the generated one, even though (in a way) it doesn’t really exist! I like to think of this as the model-driven approach turned inside-out – instead of running code generated from a model, we run code that interprets a model.
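
A minimal sketch of such an interpreter follows, assuming the metadata is nothing more than a nested dictionary of permitted child names (a deliberately simplified, hypothetical format).

```python
class Resource:
    """Navigates only along paths that the metadata says exist."""

    def __init__(self, uri, children):
        self.uri = uri
        self.children = children  # dict: child name -> that child's children

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        children = self.__dict__.get('children', {})
        if name not in children:
            raise AttributeError('no resource %r under %s'
                                 % (name, self.__dict__.get('uri')))
        return Resource('%s/%s' % (self.uri, name), children[name])


METADATA = {'users': {'new': {}}}
app = Resource('http://www.example.com', METADATA)

print(app.users.new.uri)
# http://www.example.com/users/new

try:
    app.foo
except AttributeError as error:
    print('rejected:', error)
# rejected: no resource 'foo' under http://www.example.com
```

The client code is the same as in the no-model case; only the lookup has become strict.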

If that metadata comes from the server (so much better than a build-time configuration step), you’re getting closer to that RESTful goal of application interaction driven by links, or

Hypertext as the engine of application state (HATEOAS) [5]

But before claiming this goal genuinely you will need to address a couple of issues that we’ve glossed over so far:

  1. how to generate URIs for collection members, for example turning users["dojo"] into http://example.com/users/dojo
  2. how to generate URIs for navigations like up and first

The cleanest answer to the first issue is the URI Template (a parameterised URI with a standardised syntax), sourced (naturally) from the server. With Fielding’s quote in mind it’s clear that the resolution to the second issue must lie also in server-provided information. In fact we’ll see both issues come together as we understand that the problem really isn’t one of describing URI structure but instead one of describing resource relationships and their manifestations in links.

URI Templates and Resource Templates

A URI Template that maps users["dojo"] into http://example.com/users/dojo might look like this:

http://example.com/users/{user_id}

The syntax isn’t that important (the draft standard looks set to change significantly), and yes, we’ve seen them here already without introduction. They’re pretty self-documenting and (with a good library implementation) easy to use. If you stick to a standard there’s nothing unRESTful about them – having your application create a URI from a template is really no different from a browser creating a URI as the result of a GET-based HTML form.
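
For the simple {name} form used in this article, expansion needs very little code. The expand() function below is a hand-rolled sketch for illustration, not a full implementation of any URI Template draft; it percent-encodes each value and raises KeyError if a parameter is missing.

```python
import re
from urllib.parse import quote


def expand(template, params):
    """Replace each {name} in the template with its percent-encoded value."""
    def substitute(match):
        return quote(str(params[match.group(1)]), safe='')
    return re.sub(r'\{(\w+)\}', substitute, template)


print(expand('http://example.com/users/{user_id}', {'user_id': 'dojo'}))
# http://example.com/users/dojo
print(expand('http://example.com/users/{user_id}', {'user_id': 'a b/c'}))
# http://example.com/users/a%20b%2Fc
```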

In more complex cases – in particular where there is a mixture of mandatory and optional parameters – it’s advantageous to wrap the URI Template in some basic metadata. WADL [6] is a well-supported example of this idea (well almost – see the comments [7]); another is my own Resource Template format [8] (actually more a logical schema than a format since it’s available in JSON, YAML and XML).

WADL and Resource Templates typically describe an entire application, each element describing a class of resource, its relationships, its URI template (so that relationships can be turned into links), parameters, supported HTTP methods and so on.

Integrated with Rails, Resource Templates are generated by the server at run-time from the application’s routes, information already available. Although (most typically) they can be generated and consumed for the whole application in one go, they can be generated for a single resource already pre-populated with that specific resource’s known parameters.
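
Pre-populating a template amounts to a partial expansion: known parameters are filled in and the rest are left as placeholders for the client. A minimal sketch, using a hypothetical partial_expand() helper and the simple {name} syntax only (values are inserted as-is, without encoding):

```python
import re


def partial_expand(template, known):
    """Fill in the parameters we know; leave the others as {placeholders}."""
    def substitute(match):
        name = match.group(1)
        return str(known[name]) if name in known else match.group(0)
    return re.sub(r'\{(\w+)\}', substitute, template)


template = 'http://example.com/users/{user_id}/articles/{article_id}'
print(partial_expand(template, {'user_id': 'dojo'}))
# http://example.com/users/dojo/articles/{article_id}
```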

At last: links!

From those pre-populated, resource-specific Resource Templates we can generate links – as HTML or XML elements or as HTTP headers – that our client can consume and navigate, resource by self-described resource. Every resource becomes a potential entry point to a virtual application that can easily span multiple servers. Finally, we got there: true hyperlink-based client/server interaction, and where the framework support exists, for very little effort too.

Discussion

In this article I have described the journey I travelled myself over a period of weeks following my decision to expose a complex existing application to scripting via HTTP. For me, technology-neutrality was a determining principle, and selecting HTTP and REST was an easy decision, first to make and then to sell. A proof of concept and a good demo based on a hand-crafted object model came quickly; metadata and templates were the result of some serious refactoring. Resource Templates came after I left the project, as did Rails integration (not a project requirement) and the path-to client in both Ruby and Python.

At the time of writing, link generation is supported on the server side but not yet on the client. The purist in me likes the fact that it is theoretically possible and will strive to keep it that way; the engineer in me is for the moment content with slurping up a whole application’s worth of metadata in one go and thereafter restricting interactions and payloads to the application level. Even as a purist, I see elegance in a simple metadata format, avoidance of hand-coded generation of URIs, respect for the power of HTTP and the orthogonality of formats; and if you believe as I do that RESTfulness lies in how clients and servers interact and not just in how server resources support the HTTP methods, I’ve probably succeeded in taking it further than most.

References

  1. Dynamically extending object behaviour in Ruby and Python [positiveincline.com]
  2. Programmatically adding methods to classes and objects: more Ruby/Python comparisons [positiveincline.com]
  3. path-to is born: nice client-side APIs to web applications in Ruby, also articles tagged path-to [positiveincline.com]
  4. path-to (Ruby) and path-to-py (Python) [github.com]
  5. Representational State Transfer (REST) (Chapter 5 of Architectural Styles and the Design of Network-based Software Architectures) [www.ics.uci.edu]
  6. Web Application Description Language [wikipedia.org]
  7. Partial template expansion in described_routes (comments) [positiveincline.com]
  8. described_routes: hierarchical, framework-neutral and machine-readable descriptions of Rails routes [positiveincline.com], described_routes (Ruby) and described_routes-py (Python) [github.com]

June 19, 2009

Link headers, link elements, and REST

Filed under: Web Integration — Tags: , , , , , — Mike @ 3:14 pm

The house move looms (still no date yet, frustratingly) so I’ve had less time for programming (or blogging for that matter!).  I do however have an experimental and unreleased enhancement to the Ruby version of described_routes that generates link elements (for the <head> section of your html) or the yet-to-be-standardized link headers completely automatically.  Integrating them into your Rails application requires just a one-liner per layout or controller (or you can do all controllers in one go).  Drop me a line if you would like to play with it.

I have one question outstanding on the new spec (see it here on the ietf-http-wg list archives) but I have to say that I’m pretty happy with it all.  It takes the ideas of Partial template expansion in described_routes and adds a standardised way to relate content and metadata, making it even easier for clients to navigate web applications without necessarily slurping up the entire site’s schema in one go.  It allows interaction to be 100% RESTful (HATEOAS and everything), and even if you prefer not to go that way as a client you can at least be confident that the application will be transparent, self-documenting and well-behaved.

June 4, 2009

Programmatically adding methods to classes and objects: more Ruby/Python comparisons

Filed under: Programming — Tags: , — Mike @ 6:53 pm

In Dynamically extending object behaviour in Ruby and Python we explored techniques for extending object behaviour dynamically by catching calls to undefined methods and having the target objects respond in some interesting way. In this sequel we achieve similar results by a very different (but perhaps more straightforward) approach, adding methods to existing classes and objects ahead of time programmatically.

I’ve avoided here approaches that simply involve generating program source as strings and then eval-ing it in some way. This  can be effective, but it’s uninteresting and perhaps a bit inelegant! Instead, we use closures, in the form of blocks, lambdas and nested functions.

Straight then to code. First comes Ruby, which I’ve done for 5 or so years on and off, then Python – for 3 weeks maybe? With that admission out there I’ll be as fair as I can!

Ruby

class Hello
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # define a method named x that says hi to x
  def self.def_hi(x)
    define_method(x) { "Hi #{x} from #{name}" }
  end

  # handy helper - see http://whytheluckystiff.net/articles/seeingMetaclassesClearly.html
  def metaclass; class << self; self; end; end

  # uniquely for the object, define a method named x that says hi to x
  def def_obj_yo(x)
    metaclass.send(:define_method, x) { "Hi #{x} from #{name}" }
  end

  def_hi "everyone"
end

h1 = Hello.new("h1")
h2 = Hello.new("h2")

Hello.def_hi("foo")
h1.def_obj_yo("baz")

puts h1.everyone      #=> "Hi everyone from h1"
puts h2.everyone      #=> "Hi everyone from h2"
puts h1.foo           #=> "Hi foo from h1"
puts h2.foo           #=> "Hi foo from h2"
puts h1.baz           #=> "Hi baz from h1"
puts h2.baz           #=> NoMethodError (undefined method ‘baz’)

Here, the everyone and foo methods are defined on the class as normal instance methods, and baz is defined as a method on h1’s “metaclass” (thanks to _why for the helper), making it specific to the h1 instance.

In each case, define_method does the job of creating and installing the method on a class; the subtlety is in the way the blocks (the {...} bits) capture the x parameters for later use.

Newbie Python

class Hello(object):
    def __init__(self, name):
        self.name = name

    @classmethod
    def def_hi(cls, x):
        'define a method named x that says hi to x'
        setattr(cls, x, lambda self: "Hi %s from %s" % (x, self.name))

    @classmethod
    def def_hello(cls, x):
        'define a method named x that says hello to x'
        def say_hello(self):
            return "Hello %s from %s" % (x, self.name)
        setattr(cls, x, say_hello)

    def def_obj_yo(self, x):
        'uniquely for the object, add a callable attribute named x that says yo to x'
        setattr(self, x, lambda: "Yo %s from %s" % (x, self.name))

h1 = Hello("h1")
h2 = Hello("h2")

Hello.def_hi("foo")
Hello.def_hello("bar")
h1.def_obj_yo("baz")

print h1.foo()           #=> "Hi foo from h1"
print h2.foo()           #=> "Hi foo from h2"
print h1.bar()           #=> "Hello bar from h1"
print h1.baz()           #=> "Yo baz from h1"
print h2.baz()           #=> AttributeError: 'Hello' object has no attribute 'baz'

To be honest, I’m not sure how idiomatic this Python code is, but it certainly works in Python 2.6.2. The reason I mention the version is that older documentation refers to a now-deprecated module “new”, that contains tools similar to Ruby’s define_method. My code is remarkably simple for what it achieves: it just adds lambda objects (in def_hi and def_obj_yo) or function objects (in def_hello) and sets them as attributes of the class or object. Again, we’re relying on those closures to capture the x parameters indefinitely.  The “hi” and “hello” versions aren’t very different (except in length);  sometimes a lambda is too restrictive and a nested function is more appropriate.

I’m satisfied now (I was rather less sure as I posted the first drafts of this article; I’ve since read the relevant parts of the Language Reference) that the foo objects are proper methods in the strictest sense, and they certainly behave like them. It’s probably fair to say, though, that baz isn’t really a method: self isn’t set as a result of calling it; it’s just another variable captured in the lambda. Sneaky! If we copied it onto another object, self would still refer to h1.
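
The difference between the class-level and object-level attributes can be checked directly. What follows is a minimal sketch rather than part of the original listing: it is written for Python 3 (the listing above is Python 2), uses a cut-down Hello class, and the qux attribute is a made-up name used only to show types.MethodType, the standard-library route to a method bound to a single instance.

```python
import types


class Hello:
    def __init__(self, name):
        self.name = name


h1 = Hello("h1")
h2 = Hello("h2")

# A function stored on the class is bound to whichever instance it is looked up on.
Hello.foo = lambda self: "Hi foo from %s" % self.name


def def_obj_yo(self, x):
    # A lambda stored on the instance is a plain closure: it captures this self.
    setattr(self, x, lambda: "Yo %s from %s" % (x, self.name))


def_obj_yo(h1, "baz")

# Copying the closure to another object does not rebind it; it still reports h1.
h2.baz = h1.baz

# types.MethodType builds a genuinely bound method for one instance (qux is hypothetical).
h2.qux = types.MethodType(lambda self: "Yo qux from %s" % self.name, h2)

foo_is_method = isinstance(h1.foo, types.MethodType)
baz_is_method = isinstance(h1.baz, types.MethodType)
```

Here foo_is_method comes out true and baz_is_method false, and h2.baz() still answers on behalf of h1, which is the sneakiness described above.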

Comparison

These two lines of Ruby:

  attr_reader :name
  def_hi "everyone"

are executed as the class definition is created. They are very simple examples of how we can define and use what look like new keywords but are in reality just calls to class methods. More complicated examples accept blocks, providing nice API-defined extension points for subclasses to add custom behaviour. Ruby excels in its ability to define APIs not just for their functionality but for their sheer prettiness from the user’s (especially the extender’s) perspective.

Even disregarding cosmetic details such as the optionality of parentheses in Ruby, I was unable to execute similar code within the scope of the Python class definition. Please do correct me if I’ve got this wrong – having commented this morning (perhaps prematurely) on Yehuda Katz’s interesting article The Importance of Executable Class Bodies it would be good to clear this up!

[UPDATE: yes you can execute arbitrary code within the scope of the class definition, but it’s not quite as simple as calling class methods – in fact I still haven’t found a way to access the class object as it’s being defined (perhaps it doesn’t exist yet!).  The gory details of something that looks pretty close are in 157768: Automating simple property creation (from 2002, perhaps superseded since)]
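
Two ways of getting close are sketched below, as an illustration only and not as the approach of the recipe mentioned above. The first is a plain assignment in the class body, which runs while the class is being defined. The second is a class decorator (available from Python 2.6), which runs immediately after the class object has been created and so can call setattr on it. The helper names make_hi and def_hi are assumptions chosen to echo the Ruby example; the code is written for Python 3.

```python
def make_hi(x):
    """Return a function that can serve as a method saying hi to x."""
    def say_hi(self):
        return "Hi %s from %s" % (x, self.name)
    return say_hi


def def_hi(*names):
    """Class decorator: once the class exists, add one method per name."""
    def decorate(cls):
        for x in names:
            setattr(cls, x, make_hi(x))
        return cls
    return decorate


@def_hi("foo", "bar")
class Hello:
    # An ordinary assignment, executed while the class body is being evaluated.
    everyone = make_hi("everyone")

    def __init__(self, name):
        self.name = name


h1 = Hello("h1")
greetings = [h1.everyone(), h1.foo(), h1.bar()]
```

Neither form is as pretty as Ruby’s def_hi "everyone", but both keep the method definitions next to the class they belong to.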

There is however still a certain elegance to the Python idea (explored more fully last time) of methods being just function-valued (or callable) attributes of objects. No special syntax needed (I don’t think I can ever learn to love Ruby’s class << self idiom) and I love the ease with which function objects can be obtained, installed and invoked, and how, nested within other functions, they can be closures.

On balance? Ruby does seem capable of supporting the nicest APIs (DSLs even), but there’s a simplicity to Python that perhaps makes for more understandable code internally. There’s an interesting balance there to be struck, though neither one wins on a knockout. We’re down now to the level of personal preference, though even without choosing a winner it has definitely been a beneficial experience to have taken on a new language in a serious way. It changes you!

June 1, 2009

Dynamically extending object behaviour in Ruby and Python – a quick warts-and-all comparison

Filed under: Programming,Web Integration — Tags: , , , , — Mike @ 1:25 pm

In path-to, the seemingly innocuous expressions

app.users['dojo'].articles['behind-the-scenes'].edit
app.user('dojo')

are relying on behaviour that is added dynamically. Here, users, articles, edit and user don’t actually exist as defined properties or methods – they are caught and simulated behind the scenes, driven by metadata (descriptions of web resources) loaded at runtime.

Because of fundamental differences in the way object member access works in Ruby and Python, the path-to implementations differ quite significantly under the covers. As promised in my previous post, I explore some of the issues here.

Object member access

Fundamentally different!

  • All Ruby object access is via methods, some of which may be parameterless accessors
  • All Python object access is via properties, some of which may be methods

So,

foo.bar

in Ruby invokes a method that must be either parameterless or whose parameters all have default values defined. In Python it retrieves a property, which may be a value or a function object. Importantly, that function object is not invoked – only foo.bar() would do that.
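
The Python half of that statement is easy to see in a few lines. This is a minimal sketch with made-up names (Foo, bar), written for Python 3:

```python
class Foo:
    def bar(self):
        return "bar was called"


foo = Foo()

method = foo.bar      # retrieves a bound method object; nothing is invoked yet
result = foo.bar()    # the parentheses perform the call
later = method()      # the retrieved object can be called at any later point
```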

Adding behaviour dynamically with method_missing and __getattr__

Recall:

  • All Ruby object access is via methods, some of which may be parameterless accessors

In Ruby, when you invoke a non-existent method (e.g. app.users), Ruby calls the target object’s method_missing, which you can override to do something interesting and return a value (all Ruby functions return values, even if only nil).

  • All Python object access is via properties, some of which may be methods

In Python, an attempt to retrieve a non-existent property results in a call to __getattr__. Again, you can override this to do something interesting and return a value. But what if the thing returned needs to behave like a method?

Chaining with Python’s callables

Callables in Python are objects that can be invoked as functions. These include the obvious things like functions and methods (yes they’re objects too), but also lambdas, classes (invoking a class creates an instance) and potentially even regular user-defined objects.

In path-to-py, I wanted to emulate a style that you get for free in Ruby, and (thankfully) it turns out to be not so hard in Python either. The challenge is this: when following a resource relationship that doesn’t need parameters, e.g. from the users collection to its new input form:

app.users.new

I didn’t want to force this kind of style, e.g.

app.users().new()

even though any of these navigations may take optional parameters, such as

app.users.new(format="json")

The trick is to make each of these objects callable by defining a __call__ method. Then app.users.new returns one object, and the (format="json") returns another, slightly more specialised one. It’s a particular example of chaining, which is – as you may have gathered already – at the heart of both versions of path-to.
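
The shape of that trick can be sketched as follows. Node is a hypothetical stand-in and not the actual path-to-py class; it only records the path followed and the parameters supplied, assuming Python 3.

```python
class Node:
    """Hypothetical stand-in for a path-to-py resource object."""

    def __init__(self, path=(), params=None):
        self.path = tuple(path)
        self.params = dict(params or {})

    def __getattr__(self, name):
        # Only reached when normal lookup fails; leave special names alone.
        if name.startswith("__"):
            raise AttributeError(name)
        # Following a relationship returns a new node one level deeper.
        return Node(self.path + (name,), self.params)

    def __call__(self, **params):
        # Calling a node returns a more specialised copy carrying the parameters.
        merged = dict(self.params)
        merged.update(params)
        return Node(self.path, merged)


app = Node()
form = app.users.new
json_form = app.users.new(format="json")
verbose = app.users().new()
```

All three expressions end up at the same path; only json_form carries any parameters, so the parentheses are needed only when there is something to pass.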

Lambdas

In the case that the callable returned by Python’s __getattr__ must take arguments, the answer is more straightforward: return a function or a function-like object. In path-to-py, we return a lambda that invokes the object’s child method, passing all arguments (positional and keyword) to it, thus:

return lambda *args, **kwargs: self.child(attr, *args, **kwargs)
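
To show that line in context, here is a minimal sketch in Python 3. The Resource class and the body of its child method are assumptions made for illustration; the real child method in path-to-py does the actual resource lookup.

```python
class Resource:
    """Hypothetical resource; child() stands in for the real lookup logic."""

    def __init__(self, name):
        self.name = name

    def child(self, attr, *args, **kwargs):
        # Record what was asked for so that the forwarding can be inspected.
        return (self.name, attr, args, kwargs)

    def __getattr__(self, attr):
        # Invoked only for attributes that do not otherwise exist.
        return lambda *args, **kwargs: self.child(attr, *args, **kwargs)


app = Resource("app")
call = app.user("dojo", format="json")
```

The lambda closes over both self and attr, so the call app.user('dojo', format='json') reaches child with the attribute name and every argument intact.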

Some have criticised Python’s lambda for being limited to single expressions. I’m fine with it myself; the simple case looks very neat, and the general case is more than adequately covered by nested functions. And Ruby isn’t without its warts either:

l = lambda {|foo| ... }
l(bar)      # doesn't invoke l
l.call(bar) # ugh
l[bar]      # what???

And nested functions in Ruby don’t work as closures either. Much as I love Ruby, Python deserves better press in this regard I think!

Indexing collections

This works similarly in both languages, though this time my niggles are with Python. If you want an object to behave like an array or dictionary/hash, simply define the [] operator in Ruby or the __getitem__ method in Python. In path-to, these always return new objects (yes, it’s that chaining pattern again).

Python’s warts: you can declare your method thus to allow arbitrary parameters:

def __getitem__(self, *args):

but its behaviour is inconsistent. Sending it one argument will give you the expected one-element args tuple; two or more arguments give you a one-element tuple containing a tuple! In path-to-py, there’s code to detect this and where necessary flatten args before passing it on. And don’t bother with keyword arguments (adding **kwargs to the declaration) – any attempt to use it, e.g.

app.users['dojo', format='json']

actually gives a syntax error. In path-to-py, you can choose between one of these forms (below) instead. The first two are entirely equivalent; the third looks uglier but removes a level of chaining:

app.users['dojo'](format='json')
app.users['dojo'].with_options(format='json')
app.users['dojo', {'format': 'json'}]
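
The tuple-in-a-tuple behaviour and the trailing-dictionary form can both be seen in a small sketch. Collection is a hypothetical class and the flattening shown is only one plausible way to do it, not necessarily what path-to-py does; the code assumes Python 3.

```python
class Collection:
    """Hypothetical collection that records how it was indexed."""

    def __getitem__(self, *args):
        raw = args
        # obj[a, b] arrives as ((a, b),), so unwrap the inner tuple when present.
        if len(args) == 1 and isinstance(args[0], tuple):
            args = args[0]
        # Treat a trailing dict as the keyword-style options.
        options = {}
        if args and isinstance(args[-1], dict):
            options = args[-1]
            args = args[:-1]
        return raw, tuple(args), options


users = Collection()
one = users['dojo']
two = users['dojo', 'draft']
with_opts = users['dojo', {'format': 'json'}]
```

The first element of each result is the raw args as Python delivered them: ('dojo',) for one key, but (('dojo', 'draft'),) for two.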

Conclusion

Niggles aside, adding behaviour dynamically in Ruby and Python isn’t all that scary really! Their underlying models are surprisingly different, yet there’s no lack of power in either one.

May 12, 2009

Positional params for path-to/described_routes

Filed under: Web Integration — Tags: , , , , — Mike @ 2:59 pm

I’ve checked off another roadmap item for path-to, namely positional parameters.  This makes its metadata-driven client APIs feel even more natural.

Now, when indexing collection resources such as those seen in the Rails example, expressions such as

app.users['user_id' => 'dojo']

can be written

app.users['dojo']

which is quite a lot neater!  This style can still be mixed with hash style parameters, such as

app.users['dojo', {'format' => 'json'}]

or

app.users['dojo']['format' => 'json']

In the Delicious API example,

delicious.posts['tag' => 'ruby']

can be written

delicious.posts('ruby')

The square brackets become round brackets in this case as the metadata doesn’t model posts as a collection – it’s just a resource that may take a number of optional parameters, the first of which is the tag name.  It took me a few moments to get used to this distinction, but I’m happy now that this gives a more obvious mapping to the underlying API.
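
The underlying idea, matching positional values to parameter names in order, can be sketched in a few lines. This is written in Python 3 for illustration and is not the path-to implementation: the function name name_positional is made up, and it assumes the ordered parameter names come from the resource metadata.

```python
def name_positional(param_names, *args):
    """Map positional values onto parameter names; a trailing dict supplies named ones."""
    args = list(args)
    named = args.pop() if args and isinstance(args[-1], dict) else {}
    if len(args) > len(param_names):
        raise TypeError("too many positional parameters")
    params = dict(zip(param_names, args))
    params.update(named)
    return params


user = name_positional(['user_id'], 'dojo')
user_json = name_positional(['user_id'], 'dojo', {'format': 'json'})
posts = name_positional(['tag', 'dt', 'url'], 'ruby')
```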

May 8, 2009

Delicious API with path-to/described_routes

Filed under: Web Integration — Tags: , , , , — Mike @ 11:08 am

The latest path-to contains an example (based on HTTParty’s) that queries Delicious via a Ruby API that’s 100% metadata driven.  Stripped of comments (the original commented version is here), the code is tiny:

require 'yaml'
require 'path-to/described_routes'
require 'pp'

config = YAML::load(File.read(File.join(ENV['HOME'], '.delicious')))

delicious = PathTo::DescribedRoutes::Application.new(
              :yaml => File.read(File.join(
                                     File.dirname(__FILE__),
                                     'delicious.yaml')),
              :http_options => {
                  :basic_auth => {
                      :username => config['username'],
                      :password => config['password']}})

pp delicious.posts['tag' => 'ruby'].get
pp delicious.posts['tag' => 'ruby'].recent['count' => '5'].get
delicious.recent_posts.get['posts']['post'].each do |post|
  puts post['href']
end

A couple of things to highlight are:

  1. The methods posts, recent, recent_posts are available thanks to the metadata – they’re not explicitly coded anywhere.  Similarly, URIs aren’t coded here either.
  2. The resource hierarchy.  These are all separately identifiable things:
    • posts
    • posts['tag' => 'ruby']
    • posts['tag' => 'ruby'].recent
    • posts['tag' => 'ruby'].recent['count' => '5']

The metadata takes the form of described_routes-style resource templates, and I did it by hand this time in YAML:

---
- name: posts
  options:
  - GET
  uri_template: https://api.del.icio.us/v1/posts/get?{-join|&|tag,dt,url}
  optional_parameters:
  - tag
  - dt
  - url
  resource_templates:
  - name: recent_posts
    options:
    - GET
    uri_template: https://api.del.icio.us/v1/posts/recent?{-join|&|tag,count}
    rel: recent
    optional_parameters:
    - tag
    - count
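
For readers unfamiliar with the {-join|&|...} syntax in these uri_template values: it comes from an early URI Template draft, where each listed variable that has a value contributes a name=value pair and the pairs are joined with the given separator. The sketch below, in Python 3, shows that expansion under that reading of the draft. The function expand_join is a made-up helper and handles only this one operator; it is not how path-to or described_routes expand templates.

```python
import re
from urllib.parse import quote


def expand_join(template, params):
    """Expand {-join|sep|a,b,c} expressions, skipping variables that have no value."""
    def replace(match):
        separator = match.group(1)
        names = match.group(2).split(",")
        pairs = ["%s=%s" % (n, quote(str(params[n]), safe=""))
                 for n in names if params.get(n) not in (None, "")]
        return separator.join(pairs)
    return re.sub(r"\{-join\|([^|}]*)\|([^}]*)\}", replace, template)


template = "https://api.del.icio.us/v1/posts/recent?{-join|&|tag,count}"
uri = expand_join(template, {"tag": "ruby", "count": 5})
bare = expand_join(template, {})
```

With both parameters supplied the query string becomes tag=ruby&count=5; with none supplied nothing is added after the question mark.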