A.staff.search.engine

Table of Contents

Perl@Work#2: A Staff and Phone Services Search Engine
Designing the new database schema
Programs to extract data from pre-existing databases
Programs to import that data into the new database
Designing the screen formats
A CGI script for the main Search Engine
Another Search Engine, for the Telephonists
Author
Copyright


Perl@Work#2: A Staff and Phone Services Search Engine

This is the 2nd in a series about how I use Perl.

In April 2004 I was contracted by Monash University (Victoria, Australia) to write a browser-based staff and phone services search engine, to replace a set of disparate programs then in use.

Monash has several campuses in the state of Victoria, and satellite campuses in Malaysia, Hong Kong, South Africa, London and Italy.

I wanted to write everything in Perl, and the university very kindly agreed to that.

I used CGI::Application as the basis of all web-oriented code, although I ended up with about 125 command line programs too.

The components of this new program are:

  • A new database schema
  • Programs to extract data from various pre-existing databases
  • Programs to import that data into the new database
  • Helping design the screen formats
  • A CGI script for the main Search Engine
  • Another Search Engine, for the telephonists
  • A CGI script to update the database
  • Programs to extract data for 14 reports

Designing the new database schema

Although I'm not a database design expert, I was able to do this myself.

I worked on a PC, under MS Windows, and used MySQL because Postgres had not yet released a production-ready installer for their Windows version.

I installed quite a few test versions of everything on a PC (running Windows) which belonged to the project manager, who was of course a staff member. I shared his office for a year.

In production, the system was installed under Red Hat, and used Postgres.

Of course, the Perl module DBI meant the transition to Postgres was effortless.
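With DBI, moving between database servers largely comes down to changing the connection string (the DSN). A minimal sketch of the idea follows; the database name, credentials and helper function are illustrative, not the production code:

```perl
use strict;
use warnings;

# Build a DBI connection string (DSN) for whichever server is in use.
# Under DBI, moving from MySQL (development) to Postgres (production)
# is largely a matter of swapping this one string; the SQL and the
# database handle's method calls stay the same.
sub dsn_for
{
	my($driver, $db_name) = @_;
	my(%template) =
	(
		mysql => "dbi:mysql:database=$db_name",
		Pg    => "dbi:Pg:dbname=$db_name",
	);

	return $template{$driver} || die "Unknown driver: $driver";
}

# In the real program (names here are invented):
# my($dbh) = DBI -> connect(dsn_for('Pg', 'staff'), $user, $password,
#            {AutoCommit => 1, RaiseError => 1});

print dsn_for('mysql', 'staff'), "\n";
print dsn_for('Pg', 'staff'), "\n";
```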

The data was transferred from machine to machine, and database server to server, in XML format, as exported by DBIx::Admin::BackupRestore.

Besides the basic information, we stored information about the campuses, the various phone numbering systems used in different countries, buildings per campus, rooms per building, and phones per room.

Phones are a bit tricky because several people can share a phone (e.g. graduate students), while some people have multiple phones.

Then, secretarial staff (usually women) have their phone numbers linked to their department head's phone, so the secretary's number, rather than the head's (usually a man's), is what gets displayed to the average searcher.

So, a search for a person should show all their phones, unless marked as hidden, but then clicking on a phone number will show all the people sharing that phone.
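The two lookups just described can be sketched with an in-memory version of the people-to-phones link. All the data, names and the decision to show hidden sharers on click-through are my guesses for the example, not the production schema:

```perl
use strict;
use warnings;

# In-memory sketch of the people <-> phones many-to-many link.
# The hidden flag lives on the link, since visibility belongs to a
# particular person/phone pairing. All data here is made up.
my(@link) =
(
	{person => 'Alice', phone => '9905 1111', hidden => 0},
	{person => 'Alice', phone => '9905 2222', hidden => 1},
	{person => 'Bob',   phone => '9905 1111', hidden => 0},
);

# All phones shown in a search for one person (hidden ones excluded).
sub visible_phones
{
	my($person) = @_;

	return map {$_->{phone} } grep {$_->{person} eq $person && ! $_->{hidden} } @link;
}

# Everyone sharing a given phone (the click-through case).
sub sharers
{
	my($phone) = @_;

	return map {$_->{person} } grep {$_->{phone} eq $phone} @link;
}

print join(', ', visible_phones('Alice') ), "\n";
print join(', ', sharers('9905 1111') ), "\n";
```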

Mobile phone numbers (cell phones, in some countries) are never shown in search results for the web-based search engine. However, the telephonists' search engine does show them.

To complicate the issue, some mobiles are allocated to specific individuals, while some just belong to the department generally, and some are deemed to be vaguely attached to specific departmental cars (when not fixed in the car).

The database also had to allow for the fact that people appointed to any non-academic position could invent their own job title, which hence joined an ever-growing list of similar but not identical titles. What fun!

Programs to extract data from pre-existing databases

One old database used to seed the new system was maintained by a program written in Turbo Pascal.

Luckily, many years earlier, I had been the president of the Turbo Pascal Users Group here in Melbourne, which helped when it came to analyzing the source code of that system.

Another old database was built with MS Access, so it was easy to use Perl to extract data from that. But, as we know, it's pretty easy to use Perl for pretty much anything, except perhaps for writing an operating system.

All such old data was exported as text files, to simplify importation to the new system.

In rare cases, data was supplied in MS Excel format, but of course Perl has no trouble reading such files, either.

Lastly, a few text files, exported from other systems, were used as sources of data.

Programs to import that data into the new database

The various text files mentioned, as well as a set of text files I used for tables of constants, were imported one-by-one, to bootstrap the new database.

This is not a particularly difficult or interesting phase, but when set up carefully it means the database can be neatly and easily bootstrapped from scratch at any time.
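A typical importer in this phase just turns each tab-separated line of an exported text file into a row. A minimal sketch, with invented column names, and the DBI insertion step shown only in comments:

```perl
use strict;
use warnings;

# Turn one tab-separated line from an exported text file into a row,
# ready for insertion. Column names are illustrative only.
my(@column) = qw(campus_name country phone_prefix);

sub parse_line
{
	my($line) = @_;

	chomp $line;

	my(@field) = split(/\t/, $line);
	my(%row);

	@row{@column} = @field;

	return \%row;
}

# In the real importer, each row feeds a prepared DBI statement:
# my($sth) = $dbh -> prepare('insert into campuses (campus_name, country, phone_prefix) values (?, ?, ?)');
# $sth -> execute(@$row{@column});

my($row) = parse_line("Clayton\tAustralia\t+61 3");

print "$row->{campus_name} ($row->{country})\n";
```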

Designing the screen formats

University staff did most of the design work on the new screens.

I did some of the design, and all the implementation, i.e. the CSS and JavaScript, and putting the HTML into templates for use with the module HTML::Template.
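The HTML::Template style of templating keeps the HTML free of Perl: the template carries TMPL_VAR and TMPL_LOOP tags, and the script supplies the values. A cut-down, invented search-results template, inlined here for the example (the real templates lived in files):

```perl
use strict;
use warnings;
use HTML::Template;

# A cut-down search-results template, inlined via scalarref for the
# example. Names and data are invented.
my($text) = <<'EOS';
<table>
<TMPL_LOOP NAME=staff>
<tr><td><TMPL_VAR NAME=name></td><td><TMPL_VAR NAME=phone></td></tr>
</TMPL_LOOP>
</table>
EOS

my($template) = HTML::Template -> new(scalarref => \$text);

$template -> param
(
	staff =>
	[
		{name => 'A. Person', phone => '9905 1111'},
		{name => 'B. Person', phone => '9905 2222'},
	]
);

print $template -> output;
```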

That's one of the things about many projects in Australia - the team is often just 1 person, and they have to do everything, in all (programming) languages.

It means great freedom, of course, but also great responsibility.

As for the main page, they wanted a tree structure in the left frame, to display the hierarchy of departments and companies, etc, which make up such a large university, together with the input fields and search results in the right frame.

To complicate the issue, they wanted the tree to default to being partially opened, but since they were paying I eventually stopped complaining.

For the tree, I used CGI::Explorer, with a little bit of extra code to pre-open a couple of nodes.

The search results are displayed underneath the input fields, on the right.

After demonstrating the output, I was asked to auto-scroll the results, so that the input fields scrolled up and out of sight, and the results were at the top of the frame.

This is just another fiddle, but it does mean users have to click the browser's Back button twice when you'd expect once to be enough.

In the case where a string is entered for a department name, and that string matches a set of departments, then a menu of matches is displayed, and when one is chosen, a tree is displayed, on the right, showing that department's parent, self, and all its children.

It's possible to navigate up and down this tree, independently of the full tree in the left frame, one level in the hierarchy at a time.
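With the hierarchy held as a child-to-parent map, the parent/self/children slice shown on the right is cheap to compute. A sketch, with invented department names:

```perl
use strict;
use warnings;

# The department hierarchy as an id => parent_id map.
# Department names are invented for the example.
my(%parent_of) =
(
	'Science'   => 'Monash',
	'Chemistry' => 'Science',
	'Physics'   => 'Science',
);

# Return the slice of the tree displayed on the right:
# the department's parent, itself, and its immediate children.
sub tree_slice
{
	my($dept)     = @_;
	my(@children) = sort grep {$parent_of{$_} eq $dept} keys %parent_of;

	return
	{
		parent   => $parent_of{$dept},
		self     => $dept,
		children => \@children,
	};
}

my($slice) = tree_slice('Science');

print "Parent: $slice->{parent}. Children: @{$slice->{children} }\n";
```

Navigating up or down one level is then just another call to the same function, with the parent's or a child's name.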

I should mention that when I say department, I always include all the elements in the structure of the university.

Departments are academic, and have under them sections, sub-sections, and staff.

Then there are the service-providing components, the IT (Information Technology) section of Monash itself (which employed me), the radio station, the print shop, student services of various types, and many others.

Also, a range of organizations are partly or wholly owned by Monash, and they too must be included. These are often bio-medical companies.

Lastly, there are a few special cases, e.g. banks, the chemist shop, etc, which are located on one of the campuses.

The number of departments grows at around 10% per year. There were about 900 of these when the program went live, which probably reflects commercialization of university activity more than anything else.

A CGI script for the main Search Engine

This, and the CGI script to allow updating the database, is of course the heart of the system.

Once it was up and running, people saw many fields on the output which they wanted turned into links. This led to, you guessed it, featuritis.

For instance, when a large department is displayed, with perhaps 300 staff members' details, each person could have 5 links, making for a complex web page.

Back in those days I did not know about CGI::Application::Dispatch (perhaps it wasn't released yet), and the popularity of RESTful URLs was yet to come. The end result is that processing URLs is the most complicated part of the main program, and (combined with the featuritis) perhaps the most disappointing. I'd love to re-write that part of it!

Near the end of the project, I put a lot of code in to cache search results, both because I figured there would be many similar searches, and because I had found out that the machine running the main search engine was doing a lot of other processing.
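The caching idea can be sketched in a few lines: key the cache on a normalised form of the query, so trivially different searches share one entry. The normalisation rules and storage here are my simplification, not the production code:

```perl
use strict;
use warnings;

# A minimal sketch of a search-result cache, keyed on a normalised
# form of the query so trivially different searches share one entry.
my(%cache);

sub normalise
{
	my($query) = @_;
	$query     = lc $query;
	$query     =~ s/^\s+|\s+$//g;
	$query     =~ s/\s+/ /g;

	return $query;
}

sub cached_search
{
	my($query, $search) = @_; # $search is the expensive lookup.
	my($key)            = normalise($query);
	$cache{$key}        = $search -> ($key) if (! exists $cache{$key});

	return $cache{$key};
}

# The 2nd call hits the cache, since '  Smith ' and 'smith' normalise
# to the same key.
my($count)  = 0;
my($search) = sub {$count++; return "results for $_[0]"};

cached_search('  Smith ', $search);
cached_search('smith', $search);

print "Expensive lookups: $count\n";
```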

Another Search Engine, for the Telephonists

The Turbo Pascal program the telephonists had been using for 17 years had had a good innings indeed.

One side-effect of this was the extreme familiarity of the telephonists with the key-strokes used.

So, a constraint on the new code was that I was forbidden to introduce any change in the key-strokes.

And all this while converting from MS-DOS V 6 to MS Windows.

Originally, I wanted to write the new code in S-Lang (http://www.s-lang.org/), but the Perl interface only worked under CygWin, and since I had never used CygWin up till then, I decided it was just a bit too risky to adopt it for this work.

So instead I used DOS windows directly, and the (Perl) modules Win32, Win32::Console and Win32::Sound. Not as fancy as Term::Slang, but they did the job.

It's very interesting what sorts of things influence us down a particular path.

Author

Perl@Work#2 was written by Ron Savage in 2009.

Home page: http://savage.net.au/index.html

Copyright

Australian copyright © 2009, Ron Savage. All rights reserved.

	All Programs of mine are 'OSI Certified Open Source Software';
	you can redistribute them and/or modify them under the terms of
	The Artistic License, a copy of which is available at:
	http://www.opensource.org/licenses/index.html
 