In this article I'll discuss my experiences in converting a classic CGI script, written in Perl, to run under Plack.
This script uses CGI::Application::Dispatch to run a set of modules based on CGI::Application. These modules are available from http://search.cpan.org/.
If your code does not use CGI::Application::Dispatch, the transition will be even easier.
Plack (Perl Superglue for Web Frameworks and Web Servers, the PSGI toolkit) is also available from CPAN.
I'm writing this from the point of view of an absolute beginner in all matters pertaining to Plack.
You can install Plack via the 'cpan' command-line program, by specifying Task::Plack as the module to install.
Some modules, such as CGI::Application::Dispatch::PSGI, may not be included in Task::Plack's module list.
Real-time help with Plack is available via the irc.perl.org channel #plack.
To simplify the text, I'll (usually) refer to some specific programs, rather than use generic expressions such as 'web server':
Debian: This is my Operating System (OS) of choice.
FF (Firefox): This is a 'web client'.
Apache: Wherever I say Apache, think 'web server', unless the context makes clear I'm discussing Apache-specific features. Apache mediates (helps manage) between the end-points in the http-based transport of data between FF and your CGI script.
Plack: This is the Perl Superglue. In some situations, Plack can be a replacement for Apache.
plackup: This is a command-line program which runs your Plack-based CGI script.
Classic CGI script: By this, I mean a CGI script running under Apache, and not one converted to run under Plack. Call this local.cgi.
PSGI script: This is a classic CGI script which has been edited to be Plack-compatible. Call this local.psgi. Plack documentation talks about your 'app'. Here, local.psgi is my app.
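If the term 'app' is new to you: a PSGI app is just a Perl code reference which takes an environment hash ref and returns a 3-element array ref of status, headers and body. Here is a minimal sketch, not taken from local.psgi. The name $hello_app and the response text are made up for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# A hypothetical, minimal PSGI app: a code ref which receives the
# request environment and returns [status, headers, body].

my($hello_app) = sub
{
	my($env) = @_;

	return
	[
		200,
		['Content-Type' => 'text/plain'],
		["Hello from $$env{PATH_INFO}\n"],
	];
};

# Call it by hand, roughly the way a PSGI server would for each request.

my($response) = $hello_app -> ({REQUEST_METHOD => 'GET', PATH_INFO => '/demo'});

print 'Status: ', $$response[0], "\n";
print 'Body:   ', $$response[2][0];
```

In a real .psgi file you'd drop the hand-made call and make the code ref the last statement in the file, since that is what plackup picks up.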
Plack's Advent Calendar: http://advent.plackperl.org/2009/12/day-1-getting-plack.html
With Apache, you'd do this:
Under Plack, we have:
This includes configuration data for Plack, but not for your Perl modules.
This starts up your PSGI script.
The default Plack-based server (March, 2010) is HTTP::Server::PSGI.
You can see how Plack is mediating HTTP traffic in the same way Apache does.
    #!/usr/bin/perl

    use strict;
    use warnings;

    use CGI;
    use CGI::Application::Dispatch;

    # ---------------------

    CGI::Application::Dispatch -> dispatch
    (
        args_to_new => {QUERY => CGI -> new},
        prefix      => 'Local::Application::Controller',
        table       =>
        [
            ''              => {app => 'Initialize', rm => 'display'},
            ':app'          => {rm => 'display'},
            ':app/:rm/:id?' => {},
        ],
    );
    #!/usr/bin/perl

    use strict;
    use warnings;

    #use CGI; # Patch # 1.
    use CGI::Application::Dispatch::PSGI; # Patch # 2.

    use Plack::Builder; # Patch # 3.

    # ----------------

    my($app) = CGI::Application::Dispatch -> as_psgi # Patch # 4.
    (
        #args_to_new => {QUERY => CGI -> new}, # Patch # 5.
        prefix => 'Local::Application::Controller',
        table  =>
        [
            ''              => {app => 'Initialize', rm => 'display'},
            ':app'          => {rm => 'display'},
            ':app/:rm/:id?' => {},
        ],
    );

    builder # Patch # 6. Note the 3 semi-colons here.
    {
        enable "Plack::Middleware::Static",
            path => qr!^/(assets|favicon|yui)/!,
            root => '/var/www';
        $app;
    };
As you can see, only tiny patches to the code are required, to embrace Plack.
Patches # 1 and # 5 remove the explicit use of CGI: let the underlying infrastructure invisibly handle the CGI protocol stuff.
By switching from CGI::Application::Dispatch to CGI::Application::Dispatch::PSGI, we have access to a new method, as_psgi(), used in patch # 4.
as_psgi() has been injected into the CGI::Application::Dispatch namespace.
Hopefully, in the near future, CGI::Application and CGI::Application::Dispatch will have native support for this, and we won't need CGI::Application::Dispatch::PSGI.
We use Plack::Builder to get access to 'builder'. See patch # 6.
We call as_psgi() rather than the original dispatch().
We call builder, passing it a block of code. This is discussed next.
Well, down to the 1st semi-colon, we tell 'builder' to use:
enable "Plack::Middleware::Static", path => qr!^/(assets|favicon|yui)/!, root => '/var/www';
This tells Plack to serve URLs - starting with either /assets or /favicon or /yui - by using the middleware component Plack::Middleware::Static.
Plack comes with a whole set of middleware components, but I won't discuss those here.
I use this because my CSS and JS files are under URLs /assets/css and /assets/yui respectively.
YUI - http://developer.yahoo.com/yui/ - is the marvellous YAHOO User Interface Javascript library. I use V 2 of YUI.
You'd extend that regexp if, for example, your code had images to serve.
Also, favicon is there because without it CGI::Application::Dispatch will report: [Dispatch] ERROR: Invalid characters in module name.
And the last part, root => '/var/www', simply means these files live under the disk directories /var/www/assets/css/ and /var/www/assets/yui/.
For speed, you would put the CSS and JS files on a RAM disk. Debian automatically gives you a RAM disk at /dev/shm/. See the comment by 'stinger30au' at the bottom of this page http://ubuntuforums.org/showthread.php?t=182764.
Lastly, we pass $app to 'builder', and that's it!
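To give a feel for what a middleware component does, here is a hand-rolled sketch in plain Perl. It is not how Plack::Middleware::Static is implemented, and it leaves out most of what a real component must handle (proper content types, streaming, caching headers). The names wrap_static() and $inner_app, and the addition of 'images' to the pattern, are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a PSGI app so that paths matching $arg{path}
# are answered from files under $arg{root}, and everything else is
# passed through to the wrapped app.

sub wrap_static
{
	my($app, %arg) = @_;

	return sub
	{
		my($env)  = @_;
		my($path) = $$env{PATH_INFO};

		# Not a static URL? Let the wrapped app handle it.

		return $app -> ($env) if ($path !~ $arg{path});

		# Crude guard against escaping from the root directory.

		return [403, ['Content-Type' => 'text/plain'], ['Forbidden']] if ($path =~ /\.\./);

		my($file) = $arg{root} . $path;

		return [404, ['Content-Type' => 'text/plain'], ['Not found']] if (! -f $file);

		open(my $fh, '<:raw', $file) || die "Can't open $file: $!";
		my $content = do { local $/; <$fh> };
		close $fh;

		# A real component would work out the proper content type.

		return [200, ['Content-Type' => 'application/octet-stream'], [$content]];
	};
}

my($inner_app) = sub { return [200, ['Content-Type' => 'text/plain'], ['dynamic']] };
my($wrapped)   = wrap_static
(
	$inner_app,
	path => qr!^/(assets|favicon|images|yui)/!, # 'images' added to the regexp.
	root => '/var/www',
);

# This path does not match the regexp, so the inner app answers.

my($response) = $wrapped -> ({PATH_INFO => '/local/display'});

print $$response[0], ' ', $$response[2][0], "\n";
```

The point is only that the static check happens before the app is called, so CGI::Application::Dispatch never sees the URLs of static files.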
So, I start local.psgi with:
plackup -l 127.0.0.1:9090 local.psgi &
or, even better:
starman -l 127.0.0.1:9090 --workers 1 local.psgi &
This has a whole series of connotations, some of which I'll spell out next.
plackup runs my 'app', local.psgi.
Since my app is running now, all the Perl modules it 'use's have been loaded too.
You'll see this corresponds to using mod_perl on top of Apache, and then using mod_perl's support for a start-up script (often called startup.pl).
startup.pl's job is to pre-load modules during Apache's start-up phase. This saves them from having to be loaded each time the CGI script itself is run.
Since Plack keeps the app running continuously, the modules are only loaded once, resulting in a significant speed boost over the case of loading them each time the CGI script runs.
plackup uses the default server HTTP::Server::PSGI.
The host and port given to -l (127.0.0.1:9090) should be self-evident.
If you're new to the Unix environment, keep in mind that Unix reserves port numbers below 1024 for access to programs started by the superuser. These are called privileged or trusted ports.
That's why the port used here is nothing like the default HTTP port of 80.
Since I've run plackup from the command line, when logged in as 'ron', the PSGI script is running with my permissions.
This matters when I test by switching back and forth between a classic script and a PSGI script. The classic script is loaded and run by Apache, so its output files, say /tmp/x.dat, are written and owned by a user called (under Debian) 'www-data'.
So, running exactly the same code from the command line, logged in as 'ron', means I don't have permission to overwrite /tmp/x.dat.
This is a trap for beginners.
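One way to spot this trap early is to look at who owns a file before trying to write to it. The sketch below uses only core Perl on a Unix-like system. describe_file() is a name invented here, and /tmp/x.dat is just the example path from the text.

```perl
use strict;
use warnings;

# Hypothetical helper: report the owner of a file and whether the
# current (effective) user may write to it.

sub describe_file
{
	my($file_name) = @_;

	return "$file_name does not exist" if (! -e $file_name);

	my($uid)   = (stat $file_name)[4];
	my($owner) = getpwuid($uid) || $uid; # Fall back to the numeric id.
	my($me)    = getpwuid($>) || $>;
	my($can)   = (-w $file_name) ? 'can' : 'cannot';

	return "$file_name is owned by $owner; $me $can write to it";
}

print describe_file('/tmp/x.dat'), "\n";
```

Run once from the command line and once from within the classic script, this should make the difference between 'ron' and 'www-data' visible.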
In practice, you don't need to move back and forth between 2 such implementations, but while learning about Plack, it's understandable you'd want to try that.
Also, you might ship code to someone who does not run Plack, which is a strong motivation for integrating generic access to the environment (for example) into CGI::Application.
For more details about this topic (switching, not file permissions), see Plack::Handler::CGI.
For more details about accessing the environment, see "Accessing the Environment".
When requests arrive from the web client FF, Plack processes them by simply passing them on to my app.
Output from local.psgi undergoes more processing than that, but we don't actually see the details, so I'll happily skip them.
Consult the many and varied Plack docs for details.
Since we've circumvented Apache, we've also lost whatever security features within Apache we were relying on.
Also, the configuration data used by Apache is no longer available. Hence the use of Plack::Middleware::Static to map the (prefixes of) URLs of static files to disk directories.
Whereas you'd use sudo to restart Apache, you don't need it for plackup.
Here's sample code for how to access the environment in both Plack and non-Plack situations.
    sub script_name
    {
        my($self) = @_;

        my($env);
        my($script_name);

        # Are we running under Plack?

        if ($env = $self -> query -> can('env') )
        {
            # Yes.

            $env         = $self -> query -> env;
            $script_name = $$env{SCRIPT_NAME};
        }
        else
        {
            # No.

            $script_name = $ENV{SCRIPT_NAME};
        }

        return $script_name;

    } # End of script_name.
Naturally, this is not an ideal solution. Such access mechanisms should be built into the underlying code, in this case CGI::Application.
Firstly, for the classic case, I'd type
http://127.0.0.1/cgi-bin/local.cgi
into FF's address line.
For Plack, I now use:
http://127.0.0.1:9090/
Now, with YUI an Ajax call has this syntax:
var r = YAHOO.util.Connect.asyncRequest('POST', '<tmpl_var name=form_action>/backup', cb);
where '<tmpl_var name=form_action>/backup' is the URL to hit, and cb is the name of a Javascript function called automatically when the server responds.
We see '<tmpl_var name=form_action>' in there, which uses HTML::Template syntax.
This Javascript is triggered by clicking a CGI form submit button, and the JS code itself lives in a HTML::Template-style template file.
So, in the Perl code, I load the template, and then replace that tmpl_var with the form's action value provided by the server, that is $script_name, as returned by the sample method of the same name above.
The question arises, what to do when switching to Plack? Answer: Nothing.
Yep. That's it. The resultant Javascript changes from the classic case:
var r = YAHOO.util.Connect.asyncRequest('POST', '/cgi-bin/local.cgi/backup', cb);
to the PSGI case:
var r = YAHOO.util.Connect.asyncRequest('POST', '/backup', cb);
This is a very clear indicator that switching to Plack need not be difficult.
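The effect can be reproduced without a server. The sketch below stands in for HTML::Template with a simple regexp substitution, purely to keep the example free of dependencies. render() is a made-up helper, and the 2 values passed to it are the SCRIPT_NAME values implied by the URLs above (I'm assuming that, for an app served at the root URL under Plack, SCRIPT_NAME is the empty string).

```perl
use strict;
use warnings;

my($template) = "var r = YAHOO.util.Connect.asyncRequest('POST', '<tmpl_var name=form_action>/backup', cb);";

# Hypothetical stand-in for HTML::Template: replace one named tmpl_var.

sub render
{
	my($text, $name, $value) = @_;

	$text =~ s/<tmpl_var name=\Q$name\E>/$value/g;

	return $text;
}

my($classic) = render($template, 'form_action', '/cgi-bin/local.cgi');
my($psgi)    = render($template, 'form_action', '');

print "$classic\n$psgi\n";
```

The template is identical in both cases. Only the value supplied by the server differs.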
In order to manage (run) a suite of PSGI-enabled CGI scripts, we could use Plack::App::PSGIBin (but see the 2nd demo).
Sample # 1, called 'all':
    use Plack::App::PSGIBin;
    use Plack::Builder;

    my($app) = Plack::App::PSGIBin->new(root => "httpd/cgi-bin") -> to_app;

    builder
    {
        enable "Plack::Middleware::Static",
            path => qr!^/(assets|favicon|yui)/!,
            root => '/var/www';
        mount "/app" => $app;
    };
Points to note:
plackup --host 127.0.0.1 --port 9090 httpd/cgi-bin/all
You specify a directory. I happen to use one under my current directory.
All scripts whose names end in psgi are loaded.
You specify what sort of URL to use.
Thus, if './httpd/cgi-bin' contains contacts.psgi and local.psgi, then the corresponding URLs will be:
http://127.0.0.1:9090/app/contacts.psgi
http://127.0.0.1:9090/app/local.psgi
Sample # 2, called 'each'. Here, I've just copied the bodies of contacts.psgi and local.psgi into one script:
    #!/usr/bin/perl

    use strict;
    use warnings;

    use CGI::Application::Dispatch::PSGI;

    use Plack::Builder;

    # ---------------------

    my($contacts) = CGI::Application::Dispatch -> as_psgi
    (
        prefix => 'App::Office::Contacts::Controller',
        table  =>
        [
            ''              => {app => 'Initialize', rm => 'display'},
            ':app'          => {rm => 'display'},
            ':app/:rm/:id?' => {},
        ],
    );

    my($local) = CGI::Application::Dispatch -> as_psgi
    (
        prefix => 'Local::Application::Controller',
        table  =>
        [
            ''              => {app => 'Initialize', rm => 'display'},
            ':app'          => {rm => 'display'},
            ':app/:rm/:id?' => {},
        ],
    );

    builder
    {
        enable "Plack::Middleware::Static",
            path => qr!^/(assets|favicon|yui)/!,
            root => '/var/www';
        mount '/contacts' => $contacts;
        mount '/local'    => $local;
    };
This technique uses the method discussed on day 12 of the 2009 advent calendar - http://advent.plackperl.org/2009/12/day-12-maps-multiple-apps-with-mount-and-urlmap.html.
Points to note:
plackup --host 127.0.0.1 --port 9090 httpd/cgi-bin/each
As before, just changing 'all' to 'each'.
Compared to the 1st sample, these URLs are neat:
http://127.0.0.1:9090/contacts
http://127.0.0.1:9090/local
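Conceptually, mount picks an app by URL prefix and adjusts SCRIPT_NAME and PATH_INFO before calling it. The following is a simplified sketch of that idea in plain Perl, as I understand it, and not the code of Plack itself. mount_apps() and the toy app $show are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build one PSGI app which dispatches to others
# according to the leading part of the path.

sub mount_apps
{
	my(%map) = @_;

	# Try longer prefixes first so that the most specific one wins.

	my(@prefix) = sort { length($b) <=> length($a) } keys %map;

	return sub
	{
		my($env) = @_;

		for my $prefix (@prefix)
		{
			if ($$env{PATH_INFO} =~ m!^\Q$prefix\E(/.*)?$!)
			{
				my($rest) = defined($1) ? $1 : '';

				# Copy the environment so the caller's hash is untouched.

				my(%new_env) =
				(
					%$env,
					SCRIPT_NAME => ($$env{SCRIPT_NAME} || '') . $prefix,
					PATH_INFO   => $rest,
				);

				return $map{$prefix} -> (\%new_env);
			}
		}

		return [404, ['Content-Type' => 'text/plain'], ['Not found']];
	};
}

# A toy app which just reports what it was given.

my($show) = sub
{
	my($env) = @_;

	return [200, ['Content-Type' => 'text/plain'], ["SCRIPT_NAME=$$env{SCRIPT_NAME} PATH_INFO=$$env{PATH_INFO}"] ];
};

my($app)      = mount_apps('/contacts' => $show, '/local' => $show);
my($response) = $app -> ({SCRIPT_NAME => '', PATH_INFO => '/local/display'});

print $$response[2][0], "\n";
```

If the real mount behaves like this, a method such as script_name() would return '/local' for the second app, and an Ajax URL built from it would start with '/local/'.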
The docs for Starman say it all.
Of course, mastering Plack means understanding it in more depth, across a wide range of Plack-based features.
Nevertheless, you can see that using, or switching to, Plack, is relatively painless. It's just a matter of connecting the dots, and that is the aim of this article.
Plack: http://plackperl.org/.
The Next Article in this Series: http://savage.net.au/Perl/html/session.management.with.plack.html.
Ron Savage.
Home page: http://savage.net.au/index.html.
Australian Copyright © 2010 Ron Savage. All rights reserved.
All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html