XML::Invisible - transform "invisible XML" documents into XML using a grammar
use XML::Invisible qw(make_parser);
my $transformer = make_parser(from_file($ixmlspec));
my $ast = $transformer->(from_file($ixml_input));
use XML::Invisible qw(make_parser ast2xml);
my $transformer = make_parser(from_file($ixmlspec));
my $xmldoc = ast2xml($transformer->(from_file($ixml_input)));
to_file($outputfile, $xmldoc->toStringC14N(1));
# or, with conventional pre-compiled Pegex grammar
use Pegex::Parser;
use XML::Invisible::Receiver;
my $parser = Pegex::Parser->new(
  grammar => My::Thing::Grammar->new,
  receiver => XML::Invisible::Receiver->new,
);
my $got = $parser->parse($input);
# from command line
cpanm XML::Invisible XML::Twig
perl -MXML::Invisible=make_parser,ast2xml -e \
'print ast2xml(make_parser(join "", <>)->("(a+b)"))->toStringC14N(1)' \
examples/arith-grammar.ixml | xml_pp
# canonicalise a document
use XML::Invisible qw(make_parser make_canonicaliser);
my $ixml_grammar = from_file('examples/arith-grammar.ixml');
my $transformer = make_parser($ixml_grammar);
my $ast = $transformer->(from_file($ixml_input));
my $canonicaliser = make_canonicaliser($ixml_grammar);
my $canonical = $canonicaliser->($ast);
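The synopsis above calls from_file and to_file without defining them. A minimal sketch of what they might look like, assuming they are nothing more than slurp and spew helpers (the names come from the synopsis; the bodies are an assumption, not part of XML::Invisible):

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole content of a file as one string.
# No encoding layer is applied; add one if your data needs it.
sub from_file {
  my ($path) = @_;
  open my $fh, '<:raw', $path or die "Cannot read $path: $!";
  local $/;    # slurp mode: read to end of file in one go
  my $content = <$fh>;
  close $fh;
  return $content;
}

# Hypothetical helper: write a string to a file, replacing any old content.
sub to_file {
  my ($path, $content) = @_;
  open my $fh, '>:raw', $path or die "Cannot write $path: $!";
  print {$fh} $content;
  close $fh or die "Cannot close $path: $!";
  return;
}
```

With these in scope, the synopsis snippets only need $ixmlspec, $ixml_input and $outputfile set to real paths.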
An implementation of Steven Pemberton's Invisible XML concept, using Pegex. Supply it with your grammar in Pegex format (slightly different from Steven's specification, due to differences between his notation and Pegex's) and it returns a function, which you can call to transform "invisible XML" documents into actual XML.
This is largely a Pegex "receiver" class that exploits the + and - syntax in rules in slightly unintended ways, and a wrapper to make this operate.
See Pegex::Syntax. Generally, all rules will result in an XML element. All terminals will need to capture with () (see example below).
However, if you specify a dependent token with + it will instead become an attribute (equivalent of Steven's @). If you specify it with - the node will be "flattened" (equivalent of Steven's -): the children will be included without making an element of that node. Since Pegex lets you skip any element entirely with a . prefix, you can use that instead of - to omit terminals.
E.g.
expr: +open -arith +close
open: /( LPAREN )/
close: /( RPAREN )/
arith: left -op right
left: +name
right: -name
name: /(a)/ | /(b)/
op: +sign
sign: /( PLUS )/
When given (a+b), this yields:
<expr open="(" sign="+" close=")">
<left name="a"/>
<right>b</right>
</expr>
Exportable. Returns a function that, when called with an "invisible XML" document, returns an abstract syntax tree (AST) of the general form:
{
  nodename => 'expr',
  attributes => { open => '(', sign => '+', close => ')' },
  children => [
    { nodename => 'left', attributes => { name => 'a' } },
    { nodename => 'right', children => [ 'b' ] },
  ],
}
Arguments:
- an "invisible XML" Pegex grammar specification, OR a Pegex::Grammar object
See XML::Invisible::Receiver for more.
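To make the AST shape concrete, here is a small hand-rolled walker over the structure shown above. It is only a sketch, not the module's ast2xml: the function names xml_escape and ast_to_string are made up for illustration, and it assumes that attributes and children may each be absent and that a non-reference child is plain text.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
  my ($text) = @_;
  $text =~ s/&/&amp;/g;
  $text =~ s/</&lt;/g;
  $text =~ s/>/&gt;/g;
  $text =~ s/"/&quot;/g;
  return $text;
}

# Illustrative only: render an AST node, or a plain-text child, as XML text.
sub ast_to_string {
  my ($node) = @_;
  return xml_escape($node) if !ref $node;    # plain-text child
  my $attrs = $node->{attributes} || {};     # key may be missing
  my $kids  = $node->{children}   || [];     # key may be missing
  # Attributes are sorted by name so the output is stable.
  my $open = join ' ', $node->{nodename},
    map { sprintf '%s="%s"', $_, xml_escape($attrs->{$_}) } sort keys %$attrs;
  return "<$open/>" if !@$kids;
  return "<$open>"
    . join('', map { ast_to_string($_) } @$kids)
    . '</' . $node->{nodename} . '>';
}

my $ast = {
  nodename => 'expr',
  attributes => { open => '(', sign => '+', close => ')' },
  children => [
    { nodename => 'left', attributes => { name => 'a' } },
    { nodename => 'right', children => [ 'b' ] },
  ],
};
print ast_to_string($ast), "\n";
# prints: <expr close=")" open="(" sign="+"><left name="a"/><right>b</right></expr>
```

For real output, prefer ast2xml, which builds a proper XML::LibXML::Document rather than concatenating strings.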
Exportable. Turns an AST from XML::Invisible::Receiver, as output by the function that "make_parser" returns, into an object of class XML::LibXML::Document. Needs XML::LibXML installed, which as of version 0.05 of this module is only a suggested dependency, not required.
Arguments:
- an AST from XML::Invisible::Receiver
Exportable. Returns a function that, when called with an AST as produced from a document by a "make_parser" function, returns a canonical version of the original document, or undef if it failed.
Arguments:
- an XML::Invisible grammar
It uses a few heuristics:
- literals that are 0-1 (?) or any number (*) will be omitted
- literals that are at least one (+) will be inserted once
- if an "any" group is given, the first one that matches will be selected
This last one means that if you want a canonical representation that is not the bare minimum, provide that as a literal first choice (see the assign rule below: while it will accept any or no whitespace, the "canonical" version is given):
expr: target .assign source
target: +name
assign: ' = ' | (- EQUAL -)
source: -name
name: /( ALPHA (: ALPHA | DIGIT )* )/
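The heuristics can be pictured with a toy model. This is a sketch only and does not reflect the module's internals: the item representation and the canonical_text function are invented for illustration, and because the toy has no AST to match against, it simply takes the first alternative of an "any" group.

```perl
use strict;
use warnings;

# Toy model, not XML::Invisible's implementation. An item is either
#   { literal => STRING, quant => '' | '?' | '*' | '+' }
# or
#   { any => [ [ITEM, ...], [ITEM, ...], ... ] }   # ordered alternatives
sub canonical_text {
  my (@items) = @_;
  my $out = '';
  for my $item (@items) {
    if (exists $item->{any}) {
      # Assumption: with nothing to match against, take the first choice.
      $out .= canonical_text(@{ $item->{any}[0] });
      next;
    }
    my $quant = $item->{quant} // '';
    next if $quant eq '?' or $quant eq '*';    # optional literal: omit
    $out .= $item->{literal};                  # plain or '+': emit once
  }
  return $out;
}

# Rough model of: assign: ' = ' | (- EQUAL -)
my @ws = ({ literal => ' ', quant => '*' });   # Pegex's - (optional space)
my $assign = { any => [
  [ { literal => ' = ' } ],
  [ @ws, { literal => '=' }, @ws ],
] };
print '[', canonical_text($assign), "]\n";     # prints "[ = ]"

# With the alternatives swapped, the bare minimum wins instead.
my $swapped = { any => [ reverse @{ $assign->{any} } ] };
print '[', canonical_text($swapped), "]\n";    # prints "[=]"
```

The second call shows why the order matters: when the whitespace-tolerant alternative comes first, its optional whitespace is dropped and only the bare = survives.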
To debug, set environment variable XML_INVISIBLE_DEBUG to a true value.
https://homepages.cwi.nl/~steven/ixml/ - Steven Pemberton's Invisible XML page
Ed J, <etj at cpan.org>
Please report any bugs or feature requests on https://github.com/mohawk2/xml-invisible/issues.
Or, if you prefer email and/or RT: to bug-xml-invisible at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=XML-Invisible. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
Copyright 2018 Ed J.
This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at: