My Workflow for Building Distros

Table of contents

My Workflow for Building Distros
Parameters to the Script
Our Initialization Logic
Cleaning up the Workdir
Checking .gitignore
Checking MANIFEST.SKIP
Checking LICENSE
Checking Makefile.PL
Checking the Source Code
Checking for Hash::FieldHash
Checking for t/pod.t
Checking the contents of the Changes file
Checking for Mojolicious
Checking Changelog.ini
Checking the (local) repo
Checking the Version # in all the Source Files
Building the Distro
Tagging the repo
Putting the Distro to Bed
Updating the 'modules' Database
Updating the Modules' Documentation
Checking if we have Multiple Distros in the Website's Dir
Checking pre-reqs in Makefile.PL against the source code
Checking Out
Wrapping Up and Winding Down
FAQ
Why did you develop this in bash?
How did you choose the format of the git tags you use?
What is the schema of the 'modules' database?
Downloads
See Also
Author
Copyright

My Workflow for Building Distros

Date: 2016-04-29.

This article explains the steps I use to convert a git repo of a Perl module into a new distro.

Just a few days ago Kent Fredric sent 2 GitHub pull requests for Tree::Simple, which significantly improved the complex but still awkward workflow I've been using for years.

My original workflow code is a Perl module, Local::Modules, and the new code is currently a bash script.

Note for beginners: Modules under the namespace Local::* are personal, and not destined for CPAN.

Here, I will step thru this bash script, adding some commentary to each step.

See "Downloads" for links to files to download.

Clearly, if you use this script, or something similar, you'll need to make adjustments in cases such as which licence to check for, etc. The spelling of licence is discussed below.

Parameters to the Script

In this section, tokens in all caps are bash variables, e.g.: MODULE. So $MODULE is the value of that variable.

We are going to need the module's name in various versions:

o MODULE=Tree::Simple

The module's name, pure and simple.

o DIR_NAME=Tree-Simple

The module's workdir within my MODULE_DIR=/home/ron/perl.modules.

o DIR_PATH=lib/Tree/Simple.pm

The path to the module's main file.

From those we can construct any other required tokens. E.g.: TAR_NAME=$DIR_NAME-$VERSION so we have $TAR_NAME.tar.gz as output by 'make dist', and $TAR_NAME.tgz as I want it.

So, the script will be called with 2 parameters:

o Tree::Simple
o 1.28

Inside the script these are known as $1 and $2, and if defined they are stored in MODULE and VERSION, respectively.

Of course the code could be changed to accept Tree-Simple, and to convert that into Tree::Simple. But I feel it's clearer if we pass the module's name to the script, rather than the dir name, since other users could be using a different convention as to how they name their module's dir.

Our Initialization Logic

Code:

        # Check parameters.

        if [ "$1" == "" ]; then
                echo Parameters: Module::Name Version

                exit
        fi

        if [ "$2" == "" ]; then
                echo Parameters: Module::Name Version

                exit
        fi

        MODULE=$1
        VERSION=$2
        MODULE_DIR=/home/ron/perl.modules
        HOMEPAGE_DIR=/home/ron/savage.net.au/Perl-modules
        DIR_NAME=`echo $MODULE | sed 's!::!-!g'`
        DIR_PATH=`echo $DIR_NAME | sed 's!-!/!g'`
        DIR_PATH=lib/$DIR_PATH.pm
        TAR_NAME=$DIR_NAME-$VERSION

        echo MODULE=$MODULE
        echo VERSION=$VERSION
        echo MODULE_DIR=$MODULE_DIR
        echo HOMEPAGE_DIR=$HOMEPAGE_DIR
        echo DIR_NAME=$DIR_NAME
        echo DIR_PATH=$DIR_PATH
        echo TAR_NAME=$TAR_NAME

Pragmatics:

Note: You might be wondering why I used $DIR_NAME in DIR_PATH=`echo $DIR_NAME | sed 's!-!/!g'`, instead of DIR_PATH=`echo $MODULE | sed 's!::!/!g'`. It's because the /g option does not work in that case :-(.
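Whatever the cause of that sed problem, bash can also do both conversions itself, using parameter expansion. What follows is only a minimal sketch, assuming bash rather than plain sh, with the module name and version hard-coded where the real script uses $1 and $2:

```shell
# Sketch only: in the real script these come from $1 and $2.
MODULE=Tree::Simple
VERSION=1.28

# ${VAR//pattern/replacement} replaces every occurrence of the pattern.
DIR_NAME=${MODULE//::/-}          # Tree::Simple -> Tree-Simple
DIR_PATH=lib/${MODULE//:://}.pm   # Tree::Simple -> lib/Tree/Simple.pm
TAR_NAME=$DIR_NAME-$VERSION       # Tree-Simple-1.28

echo "DIR_NAME=$DIR_NAME"
echo "DIR_PATH=$DIR_PATH"
echo "TAR_NAME=$TAR_NAME"
```

This avoids starting 2 sed processes, at the cost of tying the script to bash.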

Let's say we're working on Tree::Simple, which I adopted from Stevan Little and Rob Kinyon. Its dir contents are:

        -rw-r--r-- 1 ron ron 10978 Apr 29 10:18 Changelog.ini
        -rw-r--r-- 1 ron ron 10154 Apr 28 17:54 Changes
        drwxr-xr-x 3 ron ron  4096 Nov 12  2007 lib/
        -rw-r--r-- 1 ron ron 20211 Apr 25 09:31 LICENSE
        -rw-r--r-- 1 ron ron  1265 Apr 26 14:54 Makefile.PL
        -rw-r--r-- 1 ron ron   608 Apr 26 16:04 MANIFEST.SKIP
        -rw-r--r-- 1 ron ron  1313 Apr 25 09:53 README
        drwxr-xr-x 2 ron ron  4096 Nov 29 10:28 t/
        drwxr-xr-x 3 ron ron  4096 Sep 26  2013 xt/

And that's it. No Build.PL any more, since Module::Build has been retired from core, and hence only ships via download from CPAN. I actually preferred it to ExtUtils::MakeMaker, but no matter.

No META.json nor META.yml. The former is just gone, and the latter will be generated when we build the distro.

And no MYMETA.json nor MYMETA.yml either.

And no MANIFEST. It too will be generated.

The Changelog.ini is output by ini.report.pl, which ships with my Module::Metadata::Changes. Yes, it too could be deleted but my policy is to keep it, at least for now.

Cleaning up the Workdir

Code:

        # Move into the module's workdir.

        cd $MODULE_DIR/$DIR_NAME

        X=`pwd`

        if [ "$X" != "$MODULE_DIR/$DIR_NAME" ]; then
                echo cd into module dir failed. Current dir is $X

                exit
        fi

        if [ -e MANIFEST ]; then
                echo 'Remove Build.PL MANIFEST *META.*. Move t/pod* to xt/author/'

                exit
        fi

        git clean -dfX

Pragmatics:

This section does nothing more than move into the module's dir and use git to clean that dir up.

Checking .gitignore

Code:

        # Check .gitignore.

        grep ^/$DIR_NAME- .gitignore > /dev/null

        if [ "$?" == "1" ]; then
                echo Put /$DIR_NAME- into .gitignore

                exit
        fi

        # Here, the trailing $ excludes MANIFEST.SKIP.

        grep ^/MANIFEST$ .gitignore > /dev/null

        if [ "$?" == "1" ]; then
                echo Put /MANIFEST into .gitignore

                exit
        fi

Pragmatics:

.gitignore contains patterns (shell-style globs rather than true regexps, plus some oddities like '!Build/'). So we first use grep to look for ^/Tree-Simple- since we want it to be present. $? == 1 means grep failed to find it.

Secondly we look for ^/MANIFEST$, but not MANIFEST.SKIP. Likewise, we want it to be present.

Lastly, output is discarded by being redirected to /dev/null, since we only care about the success or failure of the test, not the text used in the test.
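Since the same grep-then-test pattern recurs in many of the steps, it could be wrapped in a small function. Here's a minimal sketch; require_line is a hypothetical name, not part of the script in "Downloads". It uses grep -q instead of redirecting to /dev/null, and it treats any non-zero status as a failure, so a missing file (grep status 2) is caught as well as a missing line (status 1):

```shell
# Hypothetical helper: succeed only if the regexp is found in the file.
# Any non-zero status from grep (no match, or an unreadable file) counts as failure.
require_line() {
    local pattern=$1 file=$2

    if ! grep -q -- "$pattern" "$file" 2> /dev/null; then
        echo "Put $pattern into $file"

        return 1
    fi
}

# Demo against a throw-away .gitignore.
work=$(mktemp -d)
printf '%s\n' '/Tree-Simple-*' '/MANIFEST' > "$work/.gitignore"

require_line '^/Tree-Simple-' "$work/.gitignore" && echo 'Distro pattern is present'
require_line '^/MANIFEST$'    "$work/.gitignore" && echo 'MANIFEST is present'
require_line '^/blib/'        "$work/.gitignore" || echo 'Stop here: /blib/ is missing'
require_line '^/MANIFEST$'    "$work/no.such.file" || echo 'Stop here: file is missing'

rm -rf "$work"
```

In the real script, each failing call would be followed by exit rather than by an echo.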

Checking MANIFEST.SKIP

Code:

        # Check MANIFEST.SKIP.

        grep .gitignore MANIFEST.SKIP > /dev/null

        if [ "$?" == "1" ]; then
                echo Put .gitignore into MANIFEST.SKIP

                exit
        fi

        # Here, I omit the * to skip it being expanded by bash.

        grep ^^$DIR_NAME-. MANIFEST.SKIP > /dev/null

        if [ "$?" == "1" ]; then
                echo Put ^$DIR_NAME-.* into MANIFEST.SKIP

                exit
        fi

Pragmatics:

These are the same sorts of tests as for .gitignore, but note the comment about *.
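Another way to deal with the * is to quote the pattern, so bash never tries to expand it, and to ask grep for a literal match with -F, so that ^, . and * lose their regexp meaning. A minimal sketch, using a throw-away MANIFEST.SKIP whose contents are only illustrative:

```shell
DIR_NAME=Tree-Simple
work=$(mktemp -d)

# Illustrative file contents only.
printf '%s\n' '.gitignore' '^Tree-Simple-.*' > "$work/MANIFEST.SKIP"

# Double quotes stop bash treating * as a glob.
# -F makes grep match the text literally, not as a regexp.
if grep -qF -- "^$DIR_NAME-.*" "$work/MANIFEST.SKIP"; then
    RESULT=present
else
    RESULT=missing
fi

echo "Pattern is $RESULT"

rm -rf "$work"
```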

Checking LICENSE

Sigh. Here in Australia we spell it licence, but Perl ain't that flexible!

Anyway, that's why I spell it licence in my Changes files (and below).

Code:

        # Check LICENSE.

        head -1 LICENSE | grep 'Terms of Perl' > /dev/null

        if [ "$?" == "1" ]; then
                echo Put Perl into LICENSE

                exit
        fi

Pragmatics:

This just ensures the first line of the LICENSE file names a specific, dual, licence.

Checking Makefile.PL

Code:

        # Check Makefile.PL.

        grep TEST_REQUIRES Makefile.PL > /dev/null

        if [ "$?" == "1" ]; then
                echo Put TEST_REQUIRES into Makefile.PL

                exit
        fi

        if [ "${MODULE:0:5}" != "Local" ]; then
                grep -A 4 repository Makefile.PL | grep $DIR_NAME > /dev/null

                if [ "$?" == "1" ]; then
                        echo Put repository into Makefile.PL

                        exit
                fi
        fi

        grep LICENSE Makefile.PL | grep perl > /dev/null

        if [ "$?" == "1" ]; then
                echo Put LICENSE perl into Makefile.PL

                exit
        fi

        grep META_MERGE Makefile.PL > /dev/null

        if [ "$?" == "1" ]; then
                echo Put META_MERGE into Makefile.PL

                exit
        fi

        grep -A 2 meta-spec Makefile.PL | grep 2 > /dev/null

        if [ "$?" == "1" ]; then
                echo Put meta-spec V 2 into Makefile.PL

                exit
        fi

Pragmatics:

o TEST_REQUIRES

We want modules which are only used for testing to be listed like this:

        TEST_REQUIRES =>
        {
                'Test::Exception'       => '0.15',
                'Test::More'            => 1.001014,
        },

Rather than including them in this:

        PREREQ_PM =>
        {
                'constant'              => 0,
                'Scalar::Util'          => '1.18',
                'strict'                => 0,
                'warnings'              => 0,
        },

o ${MODULE:0:5}

Here, we check for modules in the Local::* namespace, and skip the next check if they are. The check itself ensures Makefile.PL contains some sort of repository declaration.

The 'grep -A 4' bit outputs the matching line plus the 4 lines after it, and we expect them to be part of these 6 lines:

        repository =>
        {
                type => 'git',
                url  => 'https://github.com/ronsavage/Tree-Simple.git',
                web  => 'https://github.com/ronsavage/Tree-Simple',
        },

So, somewhere in those lines there ought to be a repo for the 'current' module.

o LICENSE

Next, ensure the Makefile.PL actually contains a licence directive.

o META_MERGE

Next, check a META_MERGE is present, in case someone's using a version of ExtUtils::MakeMaker older than 6.46.

o The version of META.yml we wish to generate

Finally, ensure we force the meta-spec version to be 2.
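The simple presence checks in that list could also be driven by a loop, which reports every missing key in one pass instead of stopping at the first. This is only a sketch: check_makefile_keys is a hypothetical name, and it deliberately leaves out the repository check (with its Local::* exception) and the secondary greps for 'perl' and '2'.

```shell
# Hypothetical helper: report every key missing from the given Makefile.PL.
check_makefile_keys() {
    local file=$1 key missing=0

    for key in TEST_REQUIRES LICENSE META_MERGE meta-spec; do
        if ! grep -q -- "$key" "$file"; then
            echo "Put $key into $file"

            missing=1
        fi
    done

    return "$missing"
}

# Demo against a throw-away Makefile.PL holding just 2 of the 4 keys.
work=$(mktemp -d)
printf '%s\n' 'TEST_REQUIRES => {},' "LICENSE => 'perl'," > "$work/Makefile.PL"

check_makefile_keys "$work/Makefile.PL" || echo 'Fix Makefile.PL before building'

rm -rf "$work"
```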

Checking the Source Code

Code:

        # Check the source for Repository.

        if [ "${MODULE:0:5}" != "Local" ]; then
                grep -i '^=head1 Repository' $DIR_PATH > /dev/null

                if [ "$?" == "1" ]; then
                        echo Put repository into $DIR_PATH

                        exit
                fi
        fi

Pragmatics:

For non-Local::* modules, ensure the POD contains a link to the repo.

Actually, this just checks the main source file, not that every *.pm contains a link to the repo. But then, my policy is that only the main file must contain such a link, although having it in every module's POD would be nice.

And this could be extended to check if the licence is mentioned in the docs.
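One thing to watch with the ${MODULE:0:5} test: it compares only the first 5 characters, so a name such as Locale::Codes would be treated as a Local::* module too. Here's a minimal sketch of a stricter test, where is_local is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only for names in the Local::* namespace.
is_local() {
    case $1 in
        Local::*) return 0 ;;
        *)        return 1 ;;
    esac
}

for m in Local::Modules Tree::Simple Locale::Codes; do
    if is_local "$m"; then
        echo "$m: local, skip the repository checks"
    else
        echo "$m: not local, run the repository checks"
    fi
done
```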

Checking for Hash::FieldHash

Code:

        # Check source for Hash::FieldHash.

        ack Hash::FieldHash lib > /dev/null

        if [ "$?" == "0" ]; then
                echo Replace Hash::FieldHash with Moo

                exit
        fi

Pragmatics:

Over the years I've used a range of class-support modules, and for a while enjoyed using Hash::FieldHash.

But now that I've moved to Moo, each time I release a new version of a module, I make the effort to replace the former with the latter. I have used Moose of course, but the lighter weight Moo suits me fine.

I know I could leave Hash::FieldHash in situ, but my policy is the fewer competing modules I use, the better. So, for the same reason, I'm replacing Perl6::Slurp and File::Slurp with File::Slurper. And yes, there are other reasons to change, such as weaknesses in File::Slurp being dealt with by File::Slurper.

Checking for t/pod.t

Code:

        # Check t/pod.t

        if [ -e t/pod.t ]; then
                echo Move t/pod.t to xt/author/

                exit
        fi

Pragmatics:

Modern thinking is that POD tests are only the author's concern, and in particular failing POD tests should not stop a module installing. So, push t/pod.t sideways.

But why not automate the move, I hear you all scream? Because it involves patching MANIFEST too? No. MANIFEST is generated, remember? I just have not got round to it yet.
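For what it's worth, automating the move might look something like this. It's only a sketch: move_pod_test is a hypothetical name, and it uses plain mv so it can run anywhere, whereas inside a real repo 'git mv' followed by a commit would probably be the better choice.

```shell
# Hypothetical helper: move t/pod.t into xt/author/ under the given workdir.
move_pod_test() {
    local workdir=$1

    [ -e "$workdir/t/pod.t" ] || return 0   # Nothing to do.

    mkdir -p "$workdir/xt/author" &&
        mv "$workdir/t/pod.t" "$workdir/xt/author/pod.t"
}

# Demo in a throw-away workdir.
work=$(mktemp -d)
mkdir -p "$work/t"
echo '# POD test' > "$work/t/pod.t"

move_pod_test "$work"
ls "$work/xt/author"

rm -rf "$work"
```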

Checking the contents of the Changes file

Code:

        # Check Changes. This depends on the format I use for Changes.
        # Here are lines 1 .. 3 (ignore the '# ' prefix of course):
        # Revision history for Perl extension Tree::Simple.
        #
        # 1.28  2016-04-27T17:35:00

        VER=`head -3 Changes | tail -1 | gawk -- '{print $1}'`

        if [ "$VERSION" != "$VER" ]; then
                echo Version number mismatch

                exit
        fi

        # Here, test for my obsolete Changes format, where [A-Z] starts a day name:
        # Revision history for Perl extension WWW::Scraper::Typo3.
        #
        # 1.01  Thu Aug 26 17:45:00 2010

        grep "^$VERSION\s\{1,2\}[A-Z]" Changes > /dev/null

        if [ "$?" == "0" ]; then
                echo Fix date format in Changes with fix.dates.pl

                exit
        fi

Pragmatics:

CPAN::Changes::Spec is your overarching guideline for the format of the Changes file, and here I assume I've formatted Changes so that the most recent version is near the top of the file, with just 2 lines above it.

So, Changes is only up-to-date if it contains info on the version I'm trying to build.

The second test, as the comment says, just reminds me to reformat dates in my old format. A home-grown program, fix.dates.pl, does that. The download link for fix.dates.pl is in "Downloads".
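The head | tail | gawk pipeline can be collapsed into a single awk call which picks the first field of line 3. A minimal sketch, assuming the Changes layout described above; changes_version is a hypothetical name:

```shell
# Hypothetical helper: print the version found on line 3 of a Changes file.
changes_version() {
    awk 'NR == 3 { print $1; exit }' "$1"
}

# Demo against a throw-away Changes file in the format shown above.
work=$(mktemp -d)
printf '%s\n' \
    'Revision history for Perl extension Tree::Simple.' \
    '' \
    '1.28  2016-04-27T17:35:00' > "$work/Changes"

VER=$(changes_version "$work/Changes")

echo "Changes is at version $VER"

rm -rf "$work"
```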

Checking for Mojolicious

Code:

        # Check Mojo log.

        if [ -e log ]; then
                cp /dev/null log/development.log
        fi

Pragmatics:

This just stops me building a module which uses Mojo and includes junk in the dev log.

Checking Changelog.ini

Code:

        # Convert Changes into Changelog.ini.
        # ini.report.pl ships with Module::Metadata::Changes.

        ini.report.pl -c

        git commit -m"Update Changelog.ini from Changes" Changelog.ini > /dev/null

Pragmatics:

I don't assume anyone is actually using Module::Metadata::Changes, but if they are, then ensure Changelog.ini is up-to-date. Let me know if you are!

The advantage of using it is that that version of Changes is in the easy-to-read *.ini format.

My module uses Config::IniFiles, another fine module.

Checking the (local) repo

Code:

        # Check the repo.

        git status | grep Untracked > /dev/null

        if [ "$?" == "0" ]; then
                echo Remove excess files

                exit
        fi

        git status | grep 'Changes not staged' > /dev/null

        if [ "$?" == "0" ]; then
                echo Run git commit on patched files

                exit
        fi

Pragmatics:

Here, a couple of tests ensure the repo is up-to-date before trying to use its files to build anything.

You'll notice that most of the tests using $? so far checked for 1, meaning failure (target not found), but these 2 are different. If the target is found, we have a problem.
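The wording of 'git status' is meant for humans and can change between git versions, whereas 'git status --porcelain' is meant for scripts and prints nothing at all when the repo is clean. Here's a minimal sketch of a combined test; repo_is_clean is a hypothetical name, and note that it's stricter than the 2 greps above, since it also objects to staged but uncommitted changes:

```shell
# Hypothetical helper: succeed only if the repo has nothing untracked or modified.
repo_is_clean() {
    local out

    out=$(git -C "$1" status --porcelain) || return 2   # Not a repo, or git failed.

    [ -z "$out" ]
}

# Demo in a throw-away repo.
work=$(mktemp -d)
git init -q "$work"

repo_is_clean "$work" && echo 'Repo is clean'

touch "$work/stray.file"

repo_is_clean "$work" || echo 'Remove excess files or commit your changes'

rm -rf "$work"
```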

Checking the Version # in all the Source Files

Code:

        # Update the version #.

        update.version.pl -d . -v $VERSION

        git commit -am"Update version # to $VERSION" > /dev/null

Pragmatics:

No point in building a module with out-of-date version #s (not that any of us would do that, of course). update.version.pl is another home-grown script, written on the assumption that my version declarations look like "our $VERSION = '1.28';", all on 1 line.
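The source of update.version.pl isn't shown here, but a read-only companion check is easy to sketch in bash under the same assumption about the one-line declaration. files_with_stale_version is a hypothetical name; it lists each *.pm file which lacks the expected line:

```shell
# Hypothetical helper: list *.pm files lacking "our $VERSION = '<version>';".
files_with_stale_version() {
    local dir=$1 version=$2

    # grep -L prints the names of files which do NOT contain the text.
    find "$dir" -name '*.pm' -exec grep -L -F -- "our \$VERSION = '$version';" {} +
}

# Demo in a throw-away lib dir, with 1 file deliberately left at 1.27.
work=$(mktemp -d)
mkdir -p "$work/lib/Tree/Simple"
echo "our \$VERSION = '1.28';" > "$work/lib/Tree/Simple.pm"
echo "our \$VERSION = '1.27';" > "$work/lib/Tree/Simple/Visitor.pm"

STALE=$(files_with_stale_version "$work/lib" 1.28)

echo "Out-of-date: $STALE"

rm -rf "$work"
```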

Building the Distro

Finally!

Code:

        # Build the distro.

        perl Makefile.PL
        make manifest
        make test

        # make signals failure with a non-zero status (GNU make typically uses 2).
        if [ "$?" != "0" ]; then
                echo Test failure

                exit
        fi

        make dist

        if [ ! -e "$TAR_NAME.tar.gz" ]; then
                echo Dist not built

                exit
        fi

        make install

Pragmatics:

This part at least is pretty simple. And yes, I always install the new version. You may not want to do that.
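Each of these commands reports failure with a non-zero exit status, and GNU make in particular typically uses 2 for errors. Here's a minimal sketch of a wrapper which stops on any non-zero status; run_step is a hypothetical name, and the demo uses stand-in commands rather than a real build:

```shell
# Hypothetical helper: run a command, and complain if it exits with non-zero.
run_step() {
    "$@"

    local rc=$?

    if [ "$rc" -ne 0 ]; then
        echo "Step failed with status $rc: $*"

        return "$rc"
    fi
}

# Demo with stand-ins for 'perl Makefile.PL', 'make test', etc.
run_step true && echo 'First step passed'
run_step sh -c 'exit 2' || echo 'Stopping the build'
```

In the real script the calls would be along the lines of 'run_step make test || exit'.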

Tagging the repo

Code:

        # Tag the repo.

        # The -1 means 'now'. Bash 4.3+ assumes it, but 4.2 needs it spelt out.
        T=`printf '%(%a %b %e %T %Y)T' -1`

        git tag -d $VERSION
        git tag -afm "V $VERSION @ $T" $VERSION

Pragmatics:

Since we're soon going to push the repo to github and the distro to PAUSE, we want to add a version-specific tag to each release.

Putting the Distro to Bed

Code:

        # Move the dist to the homepage dir and clean up.

        mv $TAR_NAME.tar.gz $HOMEPAGE_DIR/$TAR_NAME.tgz
        git clean -dfX
        list.tags.sh

Pragmatics:

The build process leaves the distro in the workdir, so we move it to the dir holding the repo of my web site, /home/ron/savage.net.au. Eventually I'll run my build website program (which takes 110 seconds), and from that same dir I'll upload the updated module index page http://savage.net.au/Perl-modules.html, the distro itself, and the docs (which have been converted into HTML). Likewise, the site's homepage will have been rebuilt and uploaded, announcing the new version. I must confess the text referring to this new version is currently inserted into the homepage's XML manually before being converted into HTML.

The list.tags.sh script just contains 'git tag -n 1 -l [012]*'. I expect anyone forking a distro will use that too.

Updating the 'modules' Database

Code:

        # Update the 'modules' database.

        perl $MODULE_DIR/Local-Modules/scripts/update.database.pl -m $MODULE -v $VERSION

Pragmatics:

You do have a modules database, right?

That's what the web site's build script uses to generate http://savage.net.au/Perl-modules.html.

Updating the Modules' Documentation

Code:

        # Copy the docs to the homepage dir, where they get picked up
        # by the website builder and converted into HTML.

        cp -r lib/* $HOMEPAGE_DIR/pod/

Pragmatics:

And this is why I said above 'Eventually I'll run my build website program...'. By the time it runs, the 'modules' database needs to be up-to-date, and the most recent *.pm files need to be under the dir here called $HOMEPAGE_DIR, i.e. /home/ron/savage.net.au/Perl-modules, since the source code containing the docs will be uploaded, and so will the HTML versions of those docs.

Checking if we have Multiple Distros in the Website's Dir

Code:

        # Warn if there are multiple distros in the homepage dir. This is not an error.

        X=`ls -1 $HOMEPAGE_DIR/$DIR_NAME-*.tgz | wc -l`

        if [ "$X" != "1" ]; then
                echo There are multiple distros in the homepage dir
        fi

Pragmatics:

Ideally, there would of course be only one 'latest' version and corresponding distro, but just in case I've ended up with more, I want to know about it.
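The count can also be taken straight from the shell's own glob expansion, without parsing a directory listing at all. A minimal sketch, where count_distros is a hypothetical name; it reports 0 when nothing matches, a case the test above lumps in with 'multiple':

```shell
# Hypothetical helper: count the files matching <dir>/<name>-*.tgz.
count_distros() {
    local dir=$1 name=$2

    # Within a function, 'set --' only replaces the function's own parameters.
    set -- "$dir/$name"-*.tgz

    # An unmatched glob is left as literal text, so check the 1st entry exists.
    if [ -e "$1" ]; then
        echo "$#"
    else
        echo 0
    fi
}

# Demo in a throw-away homepage dir holding 2 versions.
work=$(mktemp -d)
touch "$work/Tree-Simple-1.27.tgz" "$work/Tree-Simple-1.28.tgz"

N=$(count_distros "$work" Tree-Simple)

if [ "$N" != "1" ]; then
    echo "There are $N distros in the homepage dir"
fi

rm -rf "$work"
```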

Checking pre-reqs in Makefile.PL against the source code

Code:

        # Check pre-reqs in Makefile.PL against code.

        echo Wait ... running validate.build.make.pl

        validate.build.make.pl $MODULE

Pragmatics:

validate.build.make.pl is another home-grown script. It takes a few seconds to run, so warn the user about the otherwise unexpected delay.

Although the dist is built by now, there's no point running this earlier, since it cannot definitively list the required modules. Thus even detecting suspicious cases could not be used to stop the build. So, we may as well build, since that might simultaneously tell us something.

As for validate.build.make.pl, I'm just about to do another search on CPAN for a better solution. This program runs another which uses Perl::PrereqScanner, so different ideas are welcome.

The reason I don't use Perl::PrereqScanner::Lite is that its output for Module::Metadata::Changes is:

        {
          "Config::IniFiles" => 0,
          "DateTime::Format::W3CDTF" => 0,
          "File::Slurper" => 0,
          "HTML::Entities::Interpolate" => 0,
          "HTML::Template" => 0,
          Moo => 0,
          "Try::Tiny" => 0,
          "Types::Standard" => 0,
          strict => 0,
          version => 0,
          warnings => 0
        }

Which isn't much use, whereas my output, using Perl::PrereqScanner, is:

        Module::Metadata::Changes
        -------------------------
        Required:
        Config::IniFiles => 2.88
        DateTime::Format::W3CDTF => 0.06
        ExtUtils::MakeMaker => 7.14
        File::Slurper => 0.008
        File::chdir => 0.1010
        Getopt::Long => 2.42
        HTML::Entities::Interpolate => 1.09
        HTML::Template => 2.95
        Moo => 2.001001
        Pod::Usage => 1.67
        Test::More => 1.001014
        Test::Stream => 1.302026
        Try::Tiny => 0.24
        Types::Standard => 1.000005
        feature => 1.3601
        lib => 0.63
        strict => 1.08
        version => 0.9912
        warnings => 1.23

        Required but not in Makefile.PL:
        ExtUtils::MakeMaker

        In Makefile.PL but not required:
        DateTime
        DateTime::Format::HTTP
        DateTime::Format::Strptime
        Test::Pod

        Warning: Beware cases like Genealogy::Gedcom requiring Genealogy::Gedcom::Date
        --------------------------------------------------
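The last 2 lists in that report are just set differences between the modules the scanner found and the modules declared in Makefile.PL. The source of validate.build.make.pl isn't shown here, so the following is not how it works internally, just a sketch of the same comparison using comm, with made-up lists held in 2 hypothetical files, one module name per line:

```shell
export LC_ALL=C   # So that sort and comm agree on the ordering.

work=$(mktemp -d)

# Illustrative data only, one module name per line.
printf '%s\n' Moo ExtUtils::MakeMaker Config::IniFiles | sort > "$work/required"
printf '%s\n' Moo DateTime Config::IniFiles            | sort > "$work/declared"

MISSING=$(comm -23 "$work/required" "$work/declared")   # Only in the 1st file.
EXTRA=$(comm -13 "$work/required" "$work/declared")     # Only in the 2nd file.

echo 'Required but not in Makefile.PL:'
echo "$MISSING"
echo 'In Makefile.PL but not required:'
echo "$EXTRA"

rm -rf "$work"
```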

Checking Out

Code:

        # What state did we finish in?

        git status
        dir $HOMEPAGE_DIR/$TAR_NAME.tgz

        echo Finished building $MODULE V $VERSION

        # Prompt to push to github.

        if [ "${MODULE:0:5}" != "Local" ]; then
                echo Run git push now
        fi

Pragmatics:

This is literally a status report and, for non-Local::* modules, a reminder to keep the github copy of the repo up-to-date.

Wrapping Up and Winding Down

I hope this will encourage all you module authors to clarify the steps in your own build processes, as you slave away on your many soon-to-be-famous modules.

FAQ

Why did you develop this in bash?

I expect to convert this to Perl someday, but the edit-test cycle for a bash script is a bit shorter than patching a pre-existing module, installing it, and then testing it. And I was curious to know if my limited bash skills would suffice.

And, of course, changing a pre-existing module step-by-step is painful if the unchanged steps keep interfering in the flow of logic.

How did you choose the format of the git tags you use?

I just made up something I was comfortable with.

What is the schema of the 'modules' database?

Nothing fancy: http://savage.net.au/misc/modules.schema.svg. Note that the 'name' column in the authors table does sometimes contain a list of names.

gitstatuses holds 'No' (id = 1) and 'Yes' (id = 2) for whether or not the repo is on github. ms_specific_statuses just holds 'No' (id = 1) these days. public_statuses also holds 'No' and 'Yes'. All Local::* modules are 'No' of course, but sometimes other modules can be too, to stop them appearing on the public modules index page.

Downloads

build.module.sh

fix.dates.pl

See Also

Config::IniFiles

Dist::Zilla

ExtUtils::MakeMaker

Module::Build (deprecated)

Module::Metadata::Changes

Author

Ron Savage <ron@savage.net.au>.

Copyright

Australian copyright © 2016, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License 2.0, a copy of which is available at:
        http://opensource.org/licenses/alphabetical.