Marpa Praxis # 1 - Briefly discuss MarpaX::Languages::SVG::Parser

Table of contents

Marpa Praxis # 1 - Briefly discuss MarpaX::Languages::SVG::Parser
SVG and XML
Marpa's Role
How Are The 5 Grammars Handled?
Is This Nested Parsing?
Parsing Short Strings
What Could I Do Differently Next Time?

Marpa Praxis # 1 - Briefly discuss MarpaX::Languages::SVG::Parser

Author: Ron Savage.

Purpose: To shed some light on the inner workings of MarpaX::Languages::SVG::Parser.

References:

o W3C's SVG specs
o Paths
o Shapes

SVG and XML

Since SVG is XML, we can use any XML parser we want. I used XML::SAX::ParserFactory because I am familiar with it. At the time I wrote MarpaX::Languages::SVG::Parser, Jean-Damien Durand's module MarpaX::Languages::XML::AST had not yet been written.

Marpa's Role

So where does Marpa::R2 come in? I use it to parse the attributes of various XML tags:

o The 'd' attribute of the 'path' tag
o The 'points' attribute of the 'polygon' and 'polyline' tags
o The 'preserveAspectRatio' attribute of various tags
o The 'transform' attribute of various tags
o The 'viewBox' attribute of various tags

So the code contains 5 grammars, one per attribute listed above.

Each of these grammars has been turned into an SVG image by MarpaX::Grammar::GraphViz2. See http://savage.net.au/Perl-modules/html/marpax.languages.svg.parser.
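To give a feel for what one of these attribute grammars involves, here is a minimal sketch using Marpa::R2's scanless interface. The grammar is a simplified stand-in that I have written for illustration: it accepts exactly four numbers separated by whitespace and/or commas, and it is not the module's own 'viewBox' grammar. The names view_box, number, separator and parse_view_box are assumptions.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical, simplified grammar for a 'viewBox' value such as
# "0 0 1500 1000". It is NOT the grammar shipped with the module.
my $bnf = <<'END_OF_BNF';
:default ::= action => [values]
lexeme default = latm => 1

view_box  ::= number number number number

number    ~ digits
          | digits '.' digits
          | sign digits
          | sign digits '.' digits
sign      ~ [+-]
digits    ~ [0-9]+

:discard  ~ separator
separator ~ [\s,]+
END_OF_BNF

# Build a fresh grammar and recognizer on every call, then return
# an arrayref holding the four numbers as strings.
sub parse_view_box {
    my ($string) = @_;
    my $grammar  = Marpa::R2::Scanless::G->new({ source => \$bnf });
    my $recce    = Marpa::R2::Scanless::R->new({ grammar => $grammar });

    $recce->read(\$string);    # Throws an exception on a syntax error.

    my $value_ref = $recce->value;
    die "No parse for '$string'\n" if !defined $value_ref;

    return ${$value_ref};
}

my $numbers = parse_view_box('0 0 1500 1000');
print join(', ', @{$numbers}), "\n";
```

The real grammars have to cope with everything the SVG specs allow, which is why the 'd' and 'transform' grammars in particular are much larger than this.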

How Are The 5 Grammars Handled?

Firstly, the grammars are stored in the source code of MarpaX::Languages::SVG::Parser::SAXHandler, after the POD (which I put after the code) and after the __DATA__ token. For instance, the first is introduced with the string '@@ d.bnf'.

Secondly, Data::Section::Simple is used to load them on demand. And, yes, that means every time the XML parser encounters a tag with an attribute I wish to parse, another Marpa parser is initialized.

Marpa is so fast I do not care about any delay incurred by this arrangement. As always, my policy is to be prepared to expend any number of machine cycles to make life easy for myself. In other words, my time is valuable but machine time is (effectively) worthless.
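The storage scheme can be sketched as follows. Data::Section::Simple splits whatever follows __DATA__ into named sections, each introduced by a line starting with '@@'. Only the name 'd.bnf' comes from the real module. The name 'points.bnf', the placeholder section bodies and the helper grammar_for are assumptions made for this illustration.

```perl
use strict;
use warnings;
use Data::Section::Simple qw(get_data_section);

# Return the text of one named grammar from this file's __DATA__ section.
# Called each time an interesting attribute is met, i.e. on demand.
sub grammar_for {
    my ($attribute) = @_;
    my $bnf = get_data_section("$attribute.bnf");

    die "No grammar stored for '$attribute'\n" if !defined $bnf;

    return $bnf;
}

print grammar_for('points');

__DATA__

@@ d.bnf
# (The BNF for the 'd' attribute would go here.)

@@ points.bnf
# (The BNF for the 'points' attribute would go here.)
```

The string returned by grammar_for would then be passed, by reference, as the 'source' of a new Marpa::R2::Scanless::G object.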

Is This Nested Parsing?

Yes, that's what I call it. The outer, XML, parser is always still running during each Marpa-based parse of an attribute's value.
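A rough sketch of the nesting, assuming a hypothetical handler class called My::SVGHandler (the real one is MarpaX::Languages::SVG::Parser::SAXHandler, and its code differs). The SAX parser calls start_element for every opening tag, and the handler does not return control until it has finished with that tag's attributes. Here the handler only records the attribute values it would pass to Marpa.

```perl
use strict;
use warnings;

package My::SVGHandler;    # Hypothetical name, for illustration only.
use parent 'XML::SAX::Base';

# The attributes whose values would be handed to a Marpa-based parser.
my %wanted = map { $_ => 1 } qw(d points preserveAspectRatio transform viewBox);

sub start_element {
    my ($self, $element) = @_;

    for my $attribute (values %{ $element->{Attributes} }) {
        my $name = $attribute->{LocalName};

        next if !$wanted{$name};

        # The outer XML parse is still in progress while this callback
        # runs, so a Marpa parse of $attribute->{Value} started here
        # would be nested inside it. This sketch just records the value.
        push @{ $self->{seen} }, [ $element->{LocalName}, $name, $attribute->{Value} ];
    }

    return;
}

package main;
use XML::SAX::ParserFactory;

my $handler = My::SVGHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

$parser->parse_string('<svg viewBox="0 0 100 50"><path d="M 10 10 L 90 40"/></svg>');

printf "%s/%s: %s\n", @{$_} for @{ $handler->{seen} };
```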

Parsing Short Strings

It should be obvious that many SVG attribute strings will be quite short. I don't see that as a problem. Anyone who parses SVG faces the same issue. It's simply one aspect of what needs to be done.

What Could I Do Differently Next Time?

I chose to use Set::Array to manage the output from the parser but, having reviewed the code for this article, I now think it might have been better to use a tree, perhaps Tree or Tree::Simple. The advantage would be that a tree more clearly represents the nested nature of SVG tags.
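As a sketch of what that might look like, here is a small Tree::Simple example. The node values are bare tag names chosen for illustration. The items the parser actually outputs carry more information than this, so treat the structure as an assumption, not a design.

```perl
use strict;
use warnings;
use Tree::Simple;

# Hypothetical node values: just tag names, to show the nesting.
my $root  = Tree::Simple->new('svg', Tree::Simple->ROOT);
my $group = Tree::Simple->new('g');

$root->addChild($group);
$group->addChild(Tree::Simple->new('path'));
$group->addChild(Tree::Simple->new('polygon'));

# traverse() visits the descendants of $root, but not $root itself,
# so the root's value is printed separately.
print $root->getNodeValue, "\n";

$root->traverse(sub {
    my ($node) = @_;

    # The root has depth -1, so its children have depth 0.
    print '  ' x ($node->getDepth + 1), $node->getNodeValue, "\n";
});
```

With a flat list, the fact that 'path' and 'polygon' sit inside 'g' has to be encoded some other way. With a tree, it is part of the data structure.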

MetaCPAN says that this module has no reverse dependencies (i.e. modules depending on it), so the potential change just mentioned could (hopefully) be made without affecting any downstream modules.