Starting2rest

Table of Contents

Perl@Work#1 - REST, the path info, and code paths
Should REST be adopted or not?
The URL
The Path Info
Path info as grammar and pseudo-code
The Bad Old Days
The Good New Days
The Transformation Algorithm
Observations and Deductions
MVC - Model-View-Controller
Other modules in the suite
Exiting Cleanly
Author
Copyright
Date

Perl@Work#1 - REST, the path info, and code paths

This is the 1st in a series about how I use Perl.

REST is short for Representational state transfer.

Here I describe how the REST-style path info component (of an HTTP client's request to a server) is transformed - by an algorithm - in such a way as to select a specific module of code to run, and to select a specific method to call within that module.

I'm using the word 'method' here in its Object-Oriented Programming sense.

Hence the transformation takes a string as input, and produces a directive specifying which code path to execute within the application, to service the given url + path info.

There won't actually be much Perl in this article, so readers who are not familiar with Perl should still gain some benefit by following along.

What this article is, and is not

Firstly:

So, getting away from the negativities, what is it?

Well, I have 2 motives for this article. Firstly, I somehow wasn't satisfied by the various articles I read on REST, and secondly, I am currently re-working a large, soon-to-be-released, module - CGI::Office::Contacts - to use REST, and this article is helping me clarify my ideas.

That last bit is the key - this article is a record of my attempt to apply REST - for the first time - to a specific Perl module.

BTW: The abstract for CGI::Office::Contacts is 'A web-based, group and private, contacts manager'.

Should REST be adopted or not?

Well, I hope that by the time you've finished this article, you can answer that question for yourselves.

But for me, the answer is Yes! REST is not just another fad, but is definitely the way to go.

REST gives us an extremely neat way of organizing path infos, and that leads immediately and directly to a correspondingly neat way of organizing the set of modules (in any language) which make up a modern and typically complex CGI-based application.

The URL

The module CGI::Office::Contacts ships with 2 scripts which run the same code:

In what follows, I'll rarely refer to these scripts, but here's how they are used, where the 'N' just represents some person's unique id.

So, the part after the script's name, '/person/delete/N', is common, no matter which type of script we are running.

This common part is called 'path info', and we'll focus on it from here on.

The Path Info

Here are some path info samples, used by CGI::Office::Contacts:

	101 /person/add
	105 /person/update/:id
	102 /person/delete/:id
	103 /person_donation/add
	    /person_donation/delete/:ids
	104 /person_note/add
	    /person_note/delete/:ids
	106 /person_site/add
	    /person_site/delete/:ids

	201 /organization/add
	205 /organization/update/:id
	202 /organization/delete/:id
	203 /organization_donation/add
	    /organization_donation/delete/:ids
	204 /organization_note/add
	    /organization_note/delete/:ids
	206 /organization_site/add
	    /organization_site/delete/:ids

The numbers in the first column are values I used for the CGI form field 'action', before switching to REST. More on that big decision later.

The :id syntax is used by CGI::Application::Dispatch. This is the module which implements the algorithm mentioned above, and is used to transform those path infos into code paths.

The :ids indicate that the value is actually a set of ids, separated by e.g. '.' chars. The reason for having multiple ids is that the first one identifies the entity (organization, person), and all others identify donations or notes or sites belonging to that entity.

At run time, the :id would of course be the real id of an entity, or a string as just mentioned. When setting up CGI::Application::Dispatch though, the :id syntax means that the value supplied by the client is stored in such a way that a variable called id is used by your code to retrieve that value.
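
As a minimal sketch of the ':ids' idea, assuming '.' is the separator, the first id can be peeled off as the entity and the rest kept as the items belonging to it. The helper split_ids and its return shape are hypothetical, and are not taken from CGI::Office::Contacts. Under CGI::Application::Dispatch the raw string would normally be fetched with something like $self->param('ids'); here it is passed in directly so the fragment runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: split a ':ids' value such as '12.3.7' into the
# id of the entity and the ids of the items belonging to that entity.
# The '.' separator is an assumption based on the example in the text.
sub split_ids
{
	my($ids) = @_;

	# Reject anything other than dot-separated, unsigned integers.
	return if (! defined $ids || $ids !~ /^\d+(?:\.\d+)*\z/);

	my($entity_id, @item_ids) = split(/\./, $ids);

	return {entity_id => $entity_id, item_ids => [@item_ids]};
}

my $result = split_ids('12.3.7');

print "Entity: $result->{entity_id}. Items: @{$result->{item_ids}}\n";
```

A single id, as in /person/delete/:id, passes through the same helper and simply yields an empty list of items.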

Path info as grammar and pseudo-code

If we consider the path info in (English) grammatical terms, or in coding terms, we can rewrite all those separate path info samples into one generic grammatical form:

	/noun/verb/id

Or, in programming terms:

	/object/action/id

In either case, we first specify the thing to be processed, and then the action to be perpetrated on that thing.

If it's a brand new thing, then /person/add is sufficient (i.e. the id is not yet known), while actions on pre-existing things always require the thing's id to be specified, as in /person/delete/id.
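
The grammar above can be sketched as a small parser. This is only an illustration under stated assumptions: parse_path_info is a hypothetical name, the verbs are limited to the 3 used in the samples, and the rule that 'add' takes no id while 'update' and 'delete' require one is taken from the preceding paragraph.

```perl
use strict;
use warnings;

# Hypothetical parser for the generic form /object/action/id.
# Returns a hash ref on success, and nothing if the path info
# does not fit the grammar.
sub parse_path_info
{
	my($path_info) = @_;

	# object = lower-case word, action = one of 3 verbs, id = optional.
	my($object, $action, $id) = $path_info =~ m{^/([a-z_]+)/(add|update|delete)(?:/([\d.]+))?\z};

	return if (! defined $object);

	# A brand new thing has no id yet; an existing thing must have one.
	return if ($action eq 'add' && defined $id);
	return if ($action ne 'add' && ! defined $id);

	return {object => $object, action => $action, id => $id};
}

for my $path_info ('/person/add', '/person/delete/42', '/person/delete')
{
	my $parsed = parse_path_info($path_info);

	print "$path_info: ", ($parsed ? "$parsed->{object} + $parsed->{action}" : 'rejected'), "\n";
}
```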

The Bad Old Days

So far, so good. But I want to talk more about that CGI form field, 'action', which I mentioned above.

CGI form data as a selector, or switcher

The purpose of 'action', and its values, is to give the application the information it needs to select one code path amongst many. That is, the value of 'action' is what causes one specific execution path to be the one which is active during each instantiation of the application.

But who is the selector?

If we change our viewpoint from that data value to the code making the decision, we notice something interesting.

When using a CGI form field's value to switch, the code which does the switching is actually, and necessarily, inside our application.

For instance, we might use an 'if' statement (very crude), or we might assign that value to a variable within the application, and then use that to switch.

An example of the latter case is when using the Perl module CGI::Application, which uses what it calls the run mode, to do the switching.

In the case of an application whose parent class is CGI::Application, the switch is inside the parent. Our application would just supply a mapping function, which maps values to the names of subroutines.
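
To make this concrete, here is a minimal sketch of in-application switching on the 'action' form field. The action numbers come from the first column of the path info samples. The subroutine names and the select_code_path helper are hypothetical, and a real CGI::Application subclass would register such a mapping as run modes rather than hand-roll the lookup.

```perl
use strict;
use warnings;

# Hypothetical handlers. Each returns a string so the result is visible.
sub add_person    { return 'Adding a person' }
sub delete_person { my($id) = @_; return "Deleting person $id" }

# The mapping the application itself must supply: action value => code.
my(%action_map) =
(
	101 => \&add_person,
	102 => \&delete_person,
);

# The switch lives inside the application: look up 'action', then call.
sub select_code_path
{
	my($form) = @_;
	my($code) = $action_map{$form->{action} // ''};

	return $code ? $code->($form->{id}) : 'Unknown action';
}

print select_code_path({action => 102, id => 42}), "\n";
```

The point to notice is that both the mapping and the lookup are code we have to write and maintain ourselves.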

The Good New Days

REST allows us to answer the following question...

What if the selector were outside our application?

When using a stand-alone module such as CGI::Application::Dispatch, some code somewhere must still do the switching, but now the algorithm is implemented in code completely outside our application.

This turns out to be a fascinatingly different way of doing things.

Not only that, but CGI::Application::Dispatch also implements a generic path info transformation algorithm, and can be reused endlessly, by any number of unrelated projects.

A Digression

There are, of course, alternatives to CGI::Application::Dispatch.

In this list, the first 2 are presumably (I didn't try them) intended as stand-alone path transformers, while Catalyst is a major framework with the transformation logic built in.

There may be others, both in Perl and other languages.

In each case, they solve the same problem: Transforming the path info string into a code path.

There may be arguments in favour of, and against, all of these modules, but I won't go into those here, except to say that CGI::Application::Dispatch is elegant and succinct. By that I mean the set of rules specifying the transformation algorithm is short, and the rules sit side-by-side (there's an example just below), whereas the other modules scatter the rules throughout the code, as a side-effect of how they implement their logic.

But back to what I know best.

The Transformation Algorithm

We are starting with a generic path info such as /object/action/id, and wish to use that to specify code.

The following code fragment, used in both scripts mentioned above, is the starting point.

Note: The syntax ':x' means /x is extracted from the path info, and the value of x is made available to the code. Without the ':', the /x would literally mean x, and would not be interpreted as the name of a variable.

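	use CGI::Application::Dispatch;

	# $cgi is assumed to be the query object created earlier in the script.
	CGI::Application::Dispatch -> dispatch
	(
	 args_to_new => {QUERY => $cgi},
	 debug       => 0,
	 prefix      => 'CGI::Office::Contacts::Controller',
	 table       =>
	 [
	  ''              => {app => 'Initialize', rm => 'display'}, # No path info: default module and run mode.
	  ':app'          => {rm => 'display'},                      # Module only: default run mode.
	  ':app/:rm/:id?' => {},                                     # The '?' says the id is optional.
	 ],
	);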

Here's how it works. It is saying (as per the docs for CGI::Application::Dispatch):

Samples:

For completeness I should say that all CGI form field data is also passed to the code being called.
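
The effect of the dispatch table above can be imitated in a few lines of plain Perl. This is a deliberately simplified emulation, not the code inside CGI::Application::Dispatch. The transform function is hypothetical and handles only the 3 rules shown. The step which turns 'person_note' into 'Person::Note' mirrors the default name translation described in the CGI::Application::Dispatch docs (each underscore-separated word is capitalized and the words are joined with '::'); check the docs for the version you run.

```perl
use strict;
use warnings;

my($prefix) = 'CGI::Office::Contacts::Controller';

# Hypothetical emulation of the 3 rules in the dispatch table.
sub transform
{
	my($path_info) = @_;
	my($app, $rm, $id) = grep { length($_) } split(m|/|, $path_info);

	# Rule 1: An empty path info selects the default module.
	# Rules 1 and 2: A missing run mode defaults to 'display'.
	$app //= 'initialize';
	$rm  //= 'display';

	# person_note => Person::Note.
	my($module) = join('::', map { ucfirst($_) } split(/_/, $app) );

	return {module => "${prefix}::$module", rm => $rm, id => $id};
}

for my $path_info ('', '/person', '/person_note/delete/3.7')
{
	my $target = transform($path_info);

	print "'$path_info' => $target->{module}, run mode $target->{rm}\n";
}
```

Note that nothing in this fragment names a particular object or action: the path info alone determines which module and which method are selected.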

Observations and Deductions

So, what can we make of all this?

Well, to start with:

But wait, there's more!

MVC - Model-View-Controller

Up till now I haven't said anything about the rest of the application - that is, the modules which implement the Model and View components.

Model and View

Under the model component, we'll have:

Also, there is another, corresponding, family of modules under CGI::Office::Contacts::View::*.

Coding and Debugging

It should be clear that if there is work to be done on, say, the Person part of the code, then the work should be restricted to these modules, or their parents:

This is a result of the code structure adopted, and this structure is in turn a direct consequence of mapping the path info to the modules' namespace.

You might be thinking that we're ending up with too many modules, but this is never a problem.

In database design, for example, it's a classic beginner's mistake to try to minimize the number of tables used, but this is never a good design policy.

It's the same with code. Each module is small and neat, targeted to one part of the application, and the namespace orients you immediately.

In Perl, it should be mentioned for the non-Perl readers, the structure of module names (i.e. the language) does not force you to inherit modules from parents just because the parents' names are prefixes of the module in question.

However, I always name my modules such that the inheritance tree matches the structure of the modules' names.
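
Here is a minimal sketch of that naming convention, using a made-up Demo::Controller::* family rather than the real CGI::Office::Contacts modules. Each package explicitly declares as its parent the package whose name is the prefix of its own, since Perl will not do that for us.

```perl
use strict;
use warnings;

# Hypothetical mini-suite: each package's parent is the package whose
# name is the prefix of its own name.
package Demo::Controller;

sub new     { my($class) = @_; return bless({}, $class) }
sub display { my($self) = @_; return 'Generic display from ' . ref($self) }

package Demo::Controller::Person;

our @ISA = ('Demo::Controller');

sub add { return 'Adding a person' }

package Demo::Controller::Person::Note;

our @ISA = ('Demo::Controller::Person');

sub add { return 'Adding a note to a person' }

package main;

my $note = Demo::Controller::Person::Note->new;

print $note->add, "\n";     # Found in Person::Note itself.
print $note->display, "\n"; # Inherited from Demo::Controller.
```

In real code, with one module per file, 'use parent' would be the usual way to declare the parent; @ISA is set directly here only so that everything fits in one file.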

Other modules in the suite

Importing

To be released separately are modules such as CGI::Office::Contacts::Import::vCards.

Clearly, this means the controller will be called vCards.pm, and its parent will be Import.pm, and its parent will be Controller.pm.

So the controller component will be CGI::Office::Contacts::Controller::Import::vCards.

Also, the outcome so far indicates precisely what we must do to create handlers for new objects in the future.

For instance, if we allow our entities to have multiple sites (geographic addresses) just as we gave them donations and notes, we must write a controller called CGI::Office::Contacts::Controller::Organization::Sites, whose parent is obvious, and whose place in the scheme of things is also obvious.

Exporting

I haven't written any export code for the new database, but exporting vCards from my email client, and importing them into this package uses XML.

That's not actually a problem, but the above design means that the basic system has no XML-module requirements at all. This is another artifact of the compartmentalized design.

So, the user only needs to install XML-based modules if they choose to import via vCards.
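
One way to keep such a dependency optional is to load the XML module at run time, only inside the importing code. The sketch below is an assumption about how that could be done, not a description of the released modules. module_available is a hypothetical helper, and XML::Hypothetical::Parser is a made-up name standing in for whatever XML module the vCard importer needs.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module at run time, and report
# whether that worked, instead of insisting on it at compile time.
sub module_available
{
	my($module) = @_;

	# Only accept things which look like module names before the eval.
	return 0 if ($module !~ /^\w+(?:::\w+)*\z/);

	return eval("require $module; 1") ? 1 : 0;
}

# List::Util ships with Perl. The XML module name is made up.
for my $module ('List::Util', 'XML::Hypothetical::Parser')
{
	print "$module: ", (module_available($module) ? 'installed' : 'not installed'), "\n";
}
```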

Exiting Cleanly

In the same way we want our programs to exit without error, well-written articles have to exit with a nice conclusion.

To me, the analysis of path info and module structure, and the way these 2 tie so neatly together, means I have gained considerable benefit in feeling assured that the resultant code is as reliable as I can make it.

And that fits in with my personal mantra when operating within the computer industry: Reliability comes first, middle and last.
