Name | Overview | XML::XPath | XML::Parser | XML::SAX | Resources | Author | Licence
Name

XML Ramble

Overview

Perl, as always, gives you various ways to do things. Processing XML is no exception. Here I offer a few comments on XML::XPath, XML::Parser (and so XML::DOM), and XML::SAX.

XML::XPath

Perl has a number of mini-languages in it. Take:

    $x = './x +';

From there we could have:

    $a = q|$x|;
    $b = qq|$x|;
    $c = qr|$x|;
    $d = qx|$x|;

Then there are XS, POD, the full monty of regular expressions, even here documents, etc. So it comes as no real surprise when the XML::XPath docs say:

"nodeset = find($path, [$context]) The find function takes an XPath expression (a string) and returns either an XML::XPath::NodeSet object containing the nodes it found (or empty if no nodes matched the path), or one of XML::XPath::Literal (a string), XML::XPath::Number, or XML::XPath::Boolean."

Note the phrase 'XPath expression'. This is the XPath mini-language I referred to in my original posting. The return value could be construed as belonging to yet another mini-language. As I see it, there are 2 implications:

a) You need to know the exact structure of the XML document being parsed in order to craft meaningful XPath expressions. Of course, you normally do know the structure, and most likely you need to know it no matter which module you use. Nevertheless, this affects the generality of the code you write.

b) Your code will be procedural in structure, and you'll basically process the XML document with 'for' statements. E.g. (again from the XML::XPath docs):

    my $nodeset = $xp->find('/html/body/p'); # find all paragraphs
    foreach my $node ($nodeset->get_nodelist) {
        print "FOUND\n\n",
            XML::XPath::XMLParser::as_string($node),
            "\n\n";
    }

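To make the different return types of find() concrete, here is a minimal sketch. The sample document and the variable names are my own assumptions for illustration, not taken from the XML::XPath docs; the point is only that a path expression comes back as a node set, while an expression such as count(...) comes back as a number object.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, invented for this sketch.
my $xml = '<html><body><p>First</p><p>Second</p></body></html>';
my $xp  = XML::XPath->new(xml => $xml);

# A location path returns an XML::XPath::NodeSet.
my $nodeset    = $xp->find('/html/body/p');
my @paragraphs = map { $_->string_value } $nodeset->get_nodelist;

# A function call such as count() returns an XML::XPath::Number instead.
my $count = $xp->find('count(/html/body/p)');

print ref($nodeset), " holding: @paragraphs\n";
print ref($count), " with value: ", $count->value, "\n";
```

Either way, the calling code has to know which kind of object to expect, which is the sense in which the return value behaves like a small language of its own.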
There's nothing wrong with this method of analyzing an XML document. If it does the job - fine. I think of this method as a bit like plodding, or as a line-by-line approach (what I called 'linear' in my original posting). For an XML document with any sort of structure, the 'for' loops are going to become more and more nested, and it's this aspect of the XPath approach which I feel uneasy about.

Also, it's the same old sort of coding I write every day. I'm learning XPath expressions, sure, but I'm learning nothing new about Perl, or about ways of thinking. This contrasts quite strongly with the call-back structure of code using XML::Parser and XML::SAX. Some people would say this is a good thing, in that the module (XML::XPath) is shielding me from certain complexities of the problem at hand. Hmmm.

Lastly, there is neither need nor expectation of using OO code here.

XML::Parser

This is the module most of us cut our teeth on, probably, either directly or via XML::DOM. It is call-back (handler) based, and hence you have to think quite differently about the processing logic as compared to using XML::XPath. Actually, I think being comfortable with call-backs is an important skill, and modules like this are a good, and well-documented, way of developing this skill. This particular module is stable and famous. And, if it does the job - fine.

I used XML::Parser when I decided to convert my web site from HTML files to a single XML file, and needed a way of reconstructing the HTML from the XML. In going through this I learned a lot. I became more comfortable with call-backs, and am very glad I went through this particular learning process. But I was never happy with the structure of the code, although very pleased with it as a first attempt (i.e. ambivalent). I started programming as a uni student in 1970 (gasp), and I could feel in my bones that this code just wasn't quite right.

Again, there is neither need nor expectation of using OO code here.

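For readers who have not met the call-back style, here is a minimal sketch of what XML::Parser code tends to look like. It is not the code from my web site; the sample document, the @stack and %text variables, and the idea of collecting text per element name are all assumptions made up for this illustration. The Start, Char and End handlers are the standard XML::Parser ones.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document, invented for this sketch.
my $xml = '<page><title>Home</title><para>Hello</para><para>World</para></page>';

my @stack;   # names of the elements currently open
my %text;    # text collected so far, keyed by element name

my $parser = XML::Parser->new(
    Handlers => {
        # Called at each opening tag: remember which element we are in.
        Start => sub { my ($expat, $element, %attr) = @_; push @stack, $element },
        # Called for character data: append it to the innermost open element.
        Char  => sub { my ($expat, $string) = @_; $text{ $stack[-1] } .= $string if @stack },
        # Called at each closing tag: leave the element.
        End   => sub { my ($expat, $element) = @_; pop @stack },
    },
);

$parser->parse($xml);

print "$_: $text{$_}\n" for sort keys %text;
```

Notice that there is no 'for' loop over the document at all. The parser drives the program, and state such as @stack has to live outside the handlers, which is part of why this sort of code can feel awkward once it grows.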
However, time marches on, and so does software technology. Enter SAX.

XML::SAX

With this module, the call-backs live in a module of their own, and that module is based on XML::SAX::Base. I've learned a lot in the last 3 years, of course, but even so, using SAX has been a revelation. It's a vastly more natural way of processing XML.

My web site XML file was 4,017 lines, almost all double spaced, and the original Perl code was 335 lines. Obviously no big deal. That program also contained the HTML, which had the CSS embedded in it. In the new design, the CSS is in its own file. I've now simplified the XML file slightly, so it's down to 3,678 lines. The SAX-based Perl is in 2 parts:
Not much of a code saving, if that was the intention. But then, it was not my intention.

Resources

Articles/OpenOffice/XML.com Adventures with OpenOffice and XML [Feb. 07, 2001] http://www.xml.com/pub/a/2001/02/07/openoffice.html
Articles/Perl/XML/How to http://209.52.133.234/x3d/howto.html
Articles/Perl/XML/Igor's Webhome http://www.fh-frankfurt.de/~igor/projects/libxml/
Articles/Perl/XML/Perl SAX 2.0 Binding http://kmacleod.static.iaxs.net/~ken/perl-xml/sax-2.0.html
Articles/Perl/XML/Simple XML Validation with Perl http://www.xml.com/pub/a/2000/11/08/perl/index.html
Articles/Perl/XML/TPC 2001 Conference Presentations http://axkit.org/docs/presentations/tpc2001/
Articles/Perl/XML/XML Encodings http://standards.ieee.org/resources/spasystem/twig/encoding/encoding.html
Articles/Perl/XML/XML Encodings - index http://standards.ieee.org/resources/spasystem/twig/encoding/
Articles/Perl/XML/XML Modules http://standards.ieee.org/resources/spasystem/twig/perl_xml/perl_xml.html
Articles/Perl/XML/XML Modules - index http://standards.ieee.org/resources/spasystem/twig/perl_xml/
Articles/Perl/XML/XML and Perl http://www.oasis-open.org/cover/xmlAndPerl.html
Articles/Perl/XML/XML-RPC/web.oreilly.com -- Binary Data to Go Using XML-RPC to Serve Up Charts on the Fly http://web.oreilly.com/news/xmlrpc_0701.html
Articles/Perl/XML/XML.com Creating Scalable Vector Graphics with Perl [Jul. 11, 2001] http://www.xml.com/pub/a/2001/07/11/creatingsvg.html
Articles/Perl/XML/XML.com Creating Web Utilities Using XMLXPath http://www.xml.com/pub/a/2000/01/10/perlwebtools.html
Articles/Perl/XML/XML.com High-Performance XML Parsing With SAX [Feb. 14, 2001] http://www.xml.com/pub/a/2001/02/14/perlsax.html
Articles/Perl/XML/XML.com Perl XML Quickstart Convenience Modules [Jun. 13, 2001] http://www.xml.com/pub/a/2001/06/13/perlxml.html
Articles/Perl/XML/XML.com Perl XML Quickstart The Perl XML Interfaces [Apr. 18, 2001] http://www.xml.com/pub/a/2001/04/18/perlxmlqstart1.html
Articles/Perl/XML/XML.com Perl XML Quickstart The Standard XML Interfaces [May. 16, 2001] http://www.xml.com/pub/a/2001/05/16/perlxml.html
Articles/Perl/XML/XML.com Pyxie [Mar. 15, 2000] http://www.xml.com/pub/a/2000/03/15/feature/index.html
Articles/Perl/XML/XML.com Transforming XML With SAX Filters [Oct. 10, 2001] http://www.xml.com/pub/a/2001/10/10/sax-filters.html
Articles/Perl/XML/XML.com Using XSL Formatting Objects http://www.xml.com/pub/a/2001/01/17/xsl-fo/index.html
Articles/Perl/XML/XML.com Writing SAX Drivers for Non-XML Data [Sep. 19, 2001] http://www.xml.com/pub/a/2001/09/19/sax-non-xml-data.html
Articles/Perl/XML/XML.com XMLLibXML - An XMLParser Alternative [Nov. 14, 2001] http://www.xml.com/pub/a/2001/11/14/xml-libxml.html
Articles/Perl/XML/XML.com XMLParser and Character Encodings http://www.xml.com/lpt/a/2000/04/26/encodings/xmlparser.html
Articles/Perl/XML/XML::Twig/Tutorial http://standards.ieee.org/resources/spasystem/twig/
Articles/Perl/XML/XML::Twig/Tutorial - examples http://standards.ieee.org/resources/spasystem/twig/tutorial/
Articles/Perl/XML/XML::Twig/Using XMLTwig [Mar. 21, 2001] http://www.xml.com/pub/a/2001/03/21/xmltwig.html
Articles/Perl/XML/XML::Twig/Ways to Rome http://standards.ieee.org/resources/spasystem/twig/perl_survey/perl_xml_survey.html
Articles/Perl/XML/XML::Twig/Ways to Rome - index http://standards.ieee.org/resources/spasystem/twig/perl_survey/
Articles/Perl/XML developer news from XMLhack by and for the XML community http://xmlhack.com/list.php?cat=10
Articles/Perl/XSP/XSP & Apache http://xml.apache.org/cocoon/xsp.html
Articles/XML/XML etc tutorial The XML Revolution http://www.brics.dk/~amoeller/XML/
Articles/XML/XMLperl X Marks (up) the Language http://xmlperl.com/articles/ebohlman/xmarkslang3.php
FAQs/XML/FAQ (W3C) http://www.ucc.ie/xml/
FAQs/XSL Frequently Asked Questions http://www.dpawson.freeserve.co.uk/xsl/xslfaq.html
Security/Windows 2000/New XML-Based Security Site; Hardening Windows 2000 http://www.shavlik.com/pr_xml_mssecure.asp
XML/DMJG.DE's SkatDoc XML http://www.dmjg.de/skatdoc/
XML/Enno's Home Page http://users.erols.com/enno/
XML/IBM - Visual XML Tools http://alphaworks.ibm.com/tech/visualxmltools
XML/LibXML http://xmlsoft.org/
XML/MSDN Online XML Developer Center http://www.msdn.microsoft.com/xml/default.asp
XML/Mike J. Brown's XML and XSL stuff http://www.skew.org/xml/
XML/Orchard Source and Documentation http://casbah.org/~kmacleod/orchard/
XML/Perl & XML & HTML - IBM http://www-4.ibm.com/software/developer/library/xml-perl/
XML/Perl modules http://www.xml.com/pub/2000/04/05/feature/index.html
XML/Pyxie Home Page http://www.digitome.com/pyxie.html
XML/RXP - Validating open source parser in C http://www.cogsci.ed.ac.uk/~richard/rxp.html
XML/Sablotron/Ginger Alliance http://www.gingerall.com/
XML/Take23 news and resources for the mod_perl world http://take23.org/
XML/The XML C library for Gnome http://www.xmlsoft.org/
XML/The XML Companion http://www.bradley.co.uk/xmlbook.htm
XML/The XML Cover Pages - Home Page http://www.oasis-open.org/cover/sgml-xml.html
XML/XER (XML Encoding Rules) http://asf.gils.net/xer/
XML/XML - Books and essays by Simon St.Laurent http://www.simonstl.com/
XML/XML Bible http://metalab.unc.edu/xml/books/bible/
XML/XML Cooktop http://xmleverywhere.com/cooktop/
XML/XML Data Binding Resources http://www.rpbourret.com/xml/XMLDataBinding.htm
XML/XML Database Products http://www.rpbourret.com/xml/XMLDatabaseProds.htm
XML/XML Protocol Comparisons http://www.w3.org/2000/03/29-XML-protocol-matrix
XML/XML Query Engine http://www.fatdog.com/#DOWNLOAD
XML/XML Recommendation http://www.w3.org/TR/REC-xml
XML/XML Schema tutorial http://www.xfront.com/
XML/XML School http://www.refsnesdata.no/xml/default.asp
XML/XML School (2) http://www.w3schools.com/xml/default.asp
XML/XML tools by category http://www.garshol.priv.no/download/xmltools/cat_ix.html
XML/XML-Edifact.org http://www.xml-edifact.org/pub/
XML/XML-RPC Home Page http://xmlrpc.org/
XML/XML-TiePYX in Perl modules http://www.omsdev.com/ebohlman/perlmodules/
XML/XML.com http://www.xml.com/pub
XML/XML.com - Getting started http://www.xml.com/pub/norm/part1/getstart1.html
XML/XML.com XML From the Inside Out -- XML development, XML resources, XML specifications http://www.xml.com/
XML/XML.org -- Industry News http://www.xml.org/xml/news_market.shtml
XML/XMLDB Initiative Enterprise Technologies for XML Databases http://www.xmldb.org/
XML/XMLSOFTWARE The XML Software Site http://www.xmlsoftware.com/
XML/XMLhack by and for the XML community http://www.xmlhack.com/
XML/XMLperl First stop on the XML-Perl highway http://xmlperl.com/
XML/XPathScript An Alternative To XSLT http://www.xml.com/pub/2000/07/05/xpathscript/index.html
XML/XSA XML Software Autoupdate http://www.garshol.priv.no/download/xsa/
XML/XSL Transformations XSLT Alleviates XML Schema Incompatibility Headaches -- MSDN Magazine, August 2000 http://msdn.microsoft.com/msdnmag/issues/0800/XSLT/XSLT.asp
XML/Xerces/Xerces PPM for Windows http://www.xmlproj.com/xerces/windows_install.html

Author

Ron Savage. Home page: http://savage.net.au/index.html

This POD was converted to HTML by /Perl.html#fancy-pom2.pl

Licence

Australian Copyright © 2002 Ron Savage. All rights reserved. All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html