Table of Contents |
Introduction |
Interfaces to Marpa |
The Grammar in BNF |
Event-driven Processing of Lexemes |
Sample Input |
Sample Output |
Download |
TODO |
Last updated: 2014-09-18 |
I've been developing another Marpa-based parser for DOT files, in a re-write of my Perl module GraphViz2::Marpa. This time I'm using the SLIF interface (see below).
The new code will be in the next public release of that module. The current version of GraphViz2::Marpa (V 1.13) uses the obsolete NAIF.
My other graph-oriented modules.
Marpa is written in C, although I do all my development using its Perl SLIF interface. Marpa compiles into libmarpa.
Even though using Perl as a wrapper around these interfaces might make you think it's slow, Marpa is so fast that the slowdown is simply not detectable when I run programs from the command line.
And ultimately, everything I write in Perl could be transcribed into C in order to use the THIF directly, but discussion of that process has only just begun on Marpa's irc channel irc.freenode.net#marpa. See the backlog re a SLIF-to-C compiler starting here.
Note: All software discussed here is Open Source.
See also the related document Conditional Preservation of Whitespace.
Language you use | Name of Interface | Abbreviation of Interface | Status |
C | - | - | Supported |
Perl - Marpa::R2 | Thin interface | THIF | Supported. Throws Perl exceptions |
Perl - Marpa::R2 | Native/Naive interface | NAIF | Obsolete but supported, and discouraged. Throws Perl exceptions |
Perl - Marpa::R2 | Scanless interface | SLIF | Supported. Used in GraphViz2::Marpa. Throws Perl exceptions |
One point in the grammar I'd like to mention explicitly is the construct which attaches a 'before' clause and an 'event' clause to a lexeme declaration. There, 'before' could (in a different grammar) have been 'after', while 'event' declares the event's name, as used in the giant 'if' statement in the code below.
Since, astonishingly, Marpa's lexer/parser can be exited and re-entered at any time, a grammar can request a return to the calling code at the start (as here) or end of any given lexeme, giving the caller the opportunity to use various Marpa methods to get and stash the lexeme, and its length, etc. Then, the parser is re-entered with a call to resume($pos).
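The exit-and-re-enter cycle described above can be sketched roughly as follows. This is a minimal illustration, not the GraphViz2::Marpa code: the toy grammar, the event name open_bracket, the sample input and the @stash array are all assumptions made for the example, and the search for the closing bracket is deliberately naive (it ignores a ']' inside a quoted string).

```perl
use strict;
use warnings;

use Marpa::R2;

# Hypothetical toy grammar (an assumption for this sketch, not the real
# DOT grammar). The :lexeme line asks Marpa to pause before '[' and to
# report that pause as an event called open_bracket.
my $source = <<'END_OF_GRAMMAR';
statement_list ::= statement+
statement      ::= node_name
                 | node_name attribute_list
attribute_list ::= open_bracket close_bracket

:lexeme ~ open_bracket pause => before event => open_bracket

node_name     ~ [\w]+
open_bracket  ~ '['
close_bracket ~ ']'

:discard   ~ whitespace
whitespace ~ [\s]+
END_OF_GRAMMAR

my $grammar = Marpa::R2::Scanless::G->new({ source => \$source });
my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });
my $input   = 'a [color = "dark  red"] b';
my $length  = length $input;
my @stash;

# read() returns at the first pause (or at end of input). Each trip
# round the loop handles one pause, then resume($pos) re-enters Marpa.
for (my $pos = $recce->read(\$input); $pos < $length; $pos = $recce->resume($pos)) {
    for my $event (@{ $recce->events }) {
        my ($name) = @{$event};

        next if $name ne 'open_bracket';

        # Get the paused lexeme's span, and hand the '[' to Marpa ourselves.
        my ($start, $span) = $recce->pause_span;
        $pos               = $recce->lexeme_read('open_bracket', $start, $span, '[');

        # Naive search for the terminator. Marpa never sees the text in between.
        my $close = index $input, ']', $pos;

        die "No closing bracket\n" if $close < 0;

        push @stash, substr($input, $pos, $close - $pos);

        # Fiddle $pos so that Marpa resumes at the ']' itself.
        $pos = $close;
    }
}

print "Stashed: <$_>\n" for @stash;
```

Here the caller reads the '[' on Marpa's behalf with lexeme_read(), so the grammar still sees a well-formed attribute_list even though the attribute text itself was consumed outside the parser.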
This also means the $pos variable can be fiddled. And that's how I parse attribute names and values manually, in order to preserve the whitespace within strings, as mentioned in the grammar's comments. Having done that, I stash them, and advance $pos past the end of the last such value, thereby letting Marpa continue without ever actually seeing inside a [...] style attribute declaration.
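A manual scan of this kind might look something like the following pure-Perl sketch. The helper names find_close_bracket and parse_attributes are hypothetical, as is the sample input, and the code assumes that attribute values are either double-quoted strings (with backslash escapes) or bare words. It is not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: given the input and the offset just after '[',
# return the offset of the matching ']', ignoring any ']' which sits
# inside a double-quoted string. Returns -1 if there is none.
sub find_close_bracket {
    my ($input, $pos) = @_;
    my $in_quotes     = 0;

    for (my $i = $pos; $i < length $input; $i++) {
        my $char = substr($input, $i, 1);

        if ($in_quotes && $char eq '\\') {
            $i++;    # Skip the escaped character.
            next;
        }

        if ($char eq '"') {
            $in_quotes = !$in_quotes;
            next;
        }

        return $i if $char eq ']' && !$in_quotes;
    }

    return -1;
}

# Hypothetical helper: split the text between the brackets into
# name/value pairs. Whitespace inside quoted values is kept as-is.
sub parse_attributes {
    my ($text) = @_;
    my %attribute;

    while ($text =~ /\G [\s,;]* (\w+) \s* = \s* (?: "((?:[^"\\]|\\.)*)" | ([^\s,;\]]+) )/gcx) {
        $attribute{$1} = defined $2 ? $2 : $3;
    }

    return \%attribute;
}

my $input     = 'n [label = "a ] b", color=red] m';
my $open      = index $input, '[';
my $close     = find_close_bracket($input, $open + 1);
my $attribute = parse_attributes(substr($input, $open + 1, $close - $open - 1));

print "$_ => <$$attribute{$_}>\n" for sort keys %$attribute;
```

The offset returned by find_close_bracket() is the sort of value $pos could be advanced to before re-entering the parser.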
Further, each attribute value could then have its own standalone grammar, if you wanted to use Marpa to validate each type of attribute's values separately.
I should add that the multiple entries into, and exits from, the parser as used here are probably not how the majority of Marpa-using code is written. That is, the input string is generally processed in one go. Of course it's nice to have a choice, and it demonstrates one of the many ways in which Marpa distinguishes itself from traditional parsers.
Output (for my Perl code) is in the form of a tree. There is an example below. This tree is managed by the Perl module Tree::DAG_Node. I didn't write Tree::DAG_Node, but I do now maintain it.
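As a rough idea of how such a tree can be built and walked with Tree::DAG_Node, here is a minimal sketch. The node names and attribute keys (root, node_id, attribute, type, value) are assumptions chosen for illustration, and are not necessarily the ones GraphViz2::Marpa emits.

```perl
use strict;
use warnings;

use Tree::DAG_Node;

# Build a tiny 3-level tree. Each node carries a name and a hashref
# of attributes. All names and keys here are hypothetical.
my $root = Tree::DAG_Node->new({ name => 'root', attributes => { type => 'graph' } });
my $node = Tree::DAG_Node->new({ name => 'node_id', attributes => { value => 'a' } });
my $attr = Tree::DAG_Node->new({ name => 'attribute', attributes => { type => 'color', value => 'red' } });

$root->add_daughter($node);
$node->add_daughter($attr);

# Walk the tree top-down, indenting each name by its depth.
my @lines;

$root->walk_down({
    callback => sub {
        my ($current, $option) = @_;

        push @lines, ('  ' x $$option{_depth}) . $current->name;

        return 1;    # A true value means: keep walking into the daughters.
    },
    _depth => 0,
});

print "$_\n" for @lines;
```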
The current code can be downloaded from
To run:
shell> tar xvzf GraphViz2-Marpa-2.00.rc3.tgz |
shell> cd GraphViz2-Marpa |
shell> perl Makefile.PL; make (To install the pre-reqs) |
shell> scripts/test.sh |
shell> scripts/g2m.sh data/16.gv -max info (As per the example above) |
And likewise for any file data/*.gv. |
My homepage |
Debian V 7.6 |
Perl V 5.18.2 |
Marpa V 2.091001 |
GraphViz2::Marpa V 2.00 (unreleased) |