XPathScript

by Yanick Champoux

What is XPathScript?

XPathScript is an XML transformation language

A Perl-based alternative to XSLT

Why Would You Use XPathScript?

You like XML...

Why (cont'd)

But when you look at this

  <xsl:template match="/">
    <html>
      <head>
        <title>OOS Table Show</title>
      </head>
      <body>
        <xsl:apply-templates />
      </body>
    </html>
  </xsl:template>

  <xsl:template match="category">
    <div class="category">
      <xsl:apply-templates />
    </div>
  </xsl:template>

Why (cont'd)

you see that

History

Created by Matt Sergeant as part of AxKit

Taken over and made into its own module by the French Connection (Dominique Quatravaux & myself)

XML Engine

XPathScript uses

XML::LibXML (built over libxml, excellent performance)

or

XML::XPath (pure Perl implementation, thus portable)

Anatomy of XPathScript

XPathScript is made of two distinct components

* A transformation stylesheet
(funnily enough called a template)

* A templating system
(funnily enough called the stylesheet)

* Names are probably going to be switched over in v2.0.

Example

example-1.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<show name="OOS February 2008 Show">
  <category name="Dendrobium Alliance">
    <plant>
      <name>Den. Big Alex x Alexandrae</name>
      <owner>Angèle Biljan</owner>
      <picture photographer="Yanick Champoux">
        IMG_3549.JPG</picture>
    </plant>
  </category>
</show>

Example (cont'd)

example-1.xps

<%
    $t->set( show => { showtag => 0 } );
    $t->set( category => { 
        pre   => '<table>',
        post  => '</table>',
        intro => '<caption>{@name}</caption>' 
    } );
    $t->set( plant => { rename => 'tr' } );
    $t->set( [ qw/ name owner / ] => { rename => 'td' } );
    $t->set( name => { 
        intro => '<b>', extro => '</b>' 
    } );
    $t->set( picture => { action => $DO_NOT_PROCESS } );
%>

<h1>Show Results</h1>

<%~ / %>

Example (cont'd)

$ xpathscript example-1.xml example-1.xps

<h1>Show Results</h1>
<table><caption>Dendrobium Alliance</caption>
    <tr>
        <td><b>Den. Big Alex x Alexandrae</b></td>
        <td>Angèle Biljan</td>
    </tr>
</table>

Template - Verbatim Stuff

Anything outside of <% %> is printed verbatim

  0 .----------._0   
  |<| verbatim | |>
  | `----------' |
 / \            / \

Template - include

Include an external file (which can include external files as well, which can... well, you get the point)

<!--#include file="/path/to/stylesheet.xps" -->

<h1>OOS January 2008</h1>

<%~ category %>

Template - <% %>

Execute Perl code, don't output anything
(unless print is involved)

<%
    $t->set( tag => { pre => 'foo' } );
%>

<% print "\t$_\n" for @stuff; %>

<% for my $i ( @stuff ) { %>
    I have some <%= $i %>
<% } %>

Template - <%= %>

Execute and print out

The answer to life, the universe and everything is <%= 7 * 6 %>

Template - <%# %>

Comment. Stripped out never to be seen again.

<%# what happens in comments stays in comments %>

Template - <%~ %>

Apply stylesheet on the nodes matching the xpath expression

<h1>OOS January Show</h1>

<%#  process all the plants %>
<%~ /show/category/plant %>

Template - <%@ %>

Define a sub-template for a tag

Xpaths are relative to the processed tag

<%@ category
    <h1>{@name}</h1>
    <%~ plant %>
%>

<%@ plant 
    {name/text()}, owned by {owner/text()}
    <% if ( findnodes( 'picture' ) ) { %> 
        pictures available 
    <% } %> 
%>

Template - <%- -%>

Squish whitespace. We can have readability and brevity!

The '-' can be unilateral as well

<h1>
    <%-= get_title() -%>
</h1>

will turn into

<h1>Of Mice and Men</h1>

Template - utility functions

findnodes, findvalues

Query the XML DOM directly

apply_templates( $node | $xpath )

Return the output of processing the node(s), which can be given directly or via an xpath

<%= map apply_templates($_), 
        reverse findnodes( '//plant' ) %>

Stylesheet - $t->set( $tag )

Configure tags one at a time

$t->set( tag => { attributes... } )

$t->set( plant => {
    pre  => '<tr>',
    post => '</tr>',
} );

Stylesheet - $t->set( @tags )

... or a whole bunch at the same time

$t->set( [ @tags ] => { attributes...} )

$t->set( [ qw/ name owner / ] => {
    rename => 'td',
} );

Stylesheet - $t->set(), cont'd

Assignments are additive

$t->set( [ qw/ name owner / ] => {
    rename => 'td',
} );

$t->set( name => { 
    intro => '<b>',
    extro => '</b>',
} );
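
To make "additive" concrete, here is a toy model in plain Perl. It is only a sketch of the merging idea, not XPathScript's own code: `set_tag` and `%template` are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

# Toy model only: each call merges the new attributes into whatever the
# tag already has, instead of replacing the whole definition.
my %template;    # hypothetical stand-in for the template object

sub set_tag {
    my ( $tags, $attr ) = @_;

    # accept either a single tag name or an array ref of tag names
    for my $tag ( ref $tags ? @$tags : $tags ) {
        $template{$tag} = { %{ $template{$tag} || {} }, %$attr };
    }
}

set_tag( [qw/ name owner /] => { rename => 'td' } );
set_tag( name => { intro => '<b>', extro => '</b>' } );

# name now carries rename, intro and extro; owner only rename
for my $tag ( sort keys %template ) {
    print "$tag: ", join( ', ', sort keys %{ $template{$tag} } ), "\n";
}
```

The second call adds intro and extro to name without dropping the rename set by the first call.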

Stylesheet - layout attributes

pre
<tag>           if showtag
intro
prechildren     if has children
prechild        before each child
child node
postchild       after each child
postchildren    if has children
extro
</tag>          if showtag
post
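
The ordering above can be sketched as a toy renderer in plain Perl. This is an illustration under assumptions, not XPathScript's implementation: `render_tag` is a hypothetical helper, the children are passed in as already-rendered strings, and showtag has to be given explicitly (its defaulting rule is not modelled).

```perl
use strict;
use warnings;

# Toy model only: assembles the layout attributes of one tag in the order
# listed above.
sub render_tag {
    my ( $name, $attr, @children ) = @_;    # @children: already-rendered strings

    # value of a layout attribute, or '' when it is not set
    my $get = sub { defined $attr->{ $_[0] } ? $attr->{ $_[0] } : '' };

    my $out = $get->('pre');
    $out .= "<$name>" if $attr->{showtag};
    $out .= $get->('intro');

    if (@children) {
        $out .= $get->('prechildren');
        $out .= $get->('prechild') . $_ . $get->('postchild') for @children;
        $out .= $get->('postchildren');
    }

    $out .= $get->('extro');
    $out .= "</$name>" if $attr->{showtag};
    $out .= $get->('post');

    return $out;
}

print render_tag(
    'plant',
    { showtag => 1, pre => '[', post => ']', prechild => '(', postchild => ')' },
    'a', 'b',
), "\n";
# prints: [<plant>(a)(b)</plant>]
```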

Stylesheet - interpolation

{xpath/@values} in strings are interpolated

Interpolation can be disabled

Interpolation markup can be modified

$XML::XPathScript::current->interpolation( 1 );
$XML::XPathScript::current->interpolation_regex( qr/X<(.*?)>/ );

$t->set( category => { 
    pre => '<h1>X<@name></h1>',
} );
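
How an interpolation regex picks out the expression can be shown with a toy substitution in plain Perl. This is a sketch under assumptions, not the module's code: `interpolate` is a hypothetical helper, a plain hash stands in for the current node, and the curly-brace regex is only assumed to resemble the default markup.

```perl
use strict;
use warnings;

# Toy model only: replaces every match of the interpolation regex with the
# result of a lookup. In XPathScript the captured text is evaluated as an
# xpath against the current node; here a plain hash stands in for the node.
sub interpolate {
    my ( $string, $regex, $values ) = @_;
    $string =~ s/$regex/ exists $values->{$1} ? $values->{$1} : '' /ge;
    return $string;
}

my %node = ( '@name' => 'Dendrobium Alliance' );    # hypothetical stand-in for a node

# curly-brace markup (assumed to resemble the default)
print interpolate( '<caption>{@name}</caption>', qr/\{(.*?)\}/, \%node ), "\n";

# custom markup, as set by interpolation_regex() above
print interpolate( '<h1>X<@name></h1>', qr/X<(.*?)>/, \%node ), "\n";
```

In both cases the first capture group of the regex is what gets treated as the expression, which is why the custom regex keeps exactly one pair of parentheses.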

Stylesheet - behavior attributes

showtag

Should the tag be shown?

Defaults to true if no tag attribute is defined, false otherwise.

# typical way to turn <plant> into <tr>
$t->set( plant => { 
    pre  => '<tr>',
    post => '</tr>',
} );

Stylesheet - behavior attributes

rename

Rename the tag, keep the attributes if there are any

$t->set( plant => { rename => 'tr' } )

Stylesheet - behavior attributes

action

How to process the tag and its children?

$DO_SELF_AND_KIDS (default), $DO_SELF_ONLY, $DO_NOT_PROCESS, $DO_TEXT_AS_CHILD, or an xpath

$t->set( plant => {
    intro  => '{name/text()} - {owner/text()}',
    action => $DO_SELF_ONLY,
} );

Stylesheet - behavior attributes

testcode

Bring forth the big guns!

$t->set( foo => { testcode => \&do_foo }  );

sub do_foo {
    my ( $node, $tag ) = @_;

    $tag->set({ pre => 'blah' });  # only for this node

    return $DO_SELF_AND_KIDS;  # or any other action value
}

Stylesheet - behavior attributes

content

Long form of <%@ %>

Trumps all layout attributes

$t->set( category => { content => <<'END_CONTENT' } );
    <h1>{@name}</h1>
    <%~ plant %>
END_CONTENT

Stylesheet - special tags

text(), comment() and '*'

$t->set( 'text()' => { 
    pre  => '[',
    post => ']',
    action => $DO_TEXT_AS_CHILD,
} );

$t->set( '*' => { 
    testcode => sub { 
        warn $_[0]->name, " not in template\n" 
    },	
} );

namespaces

XPathScript does namespaces!

my $foo = $t->namespace( 'http://foo.com' ); 

  $t->set( bar => { content => 'main namespace' } );
$foo->set( bar => { content => 'foo namespace'  } );

Using Mason as the template system

<%perl>
    use XML::XPathScript::Processor;
    use XML::XPathScript::Template;
    use XML::LibXML;

    my $processor = XML::XPathScript::Processor->new;

    $processor->set_dom(
        XML::LibXML->new->parse_file( 'example-1.xml' )
    );

    # load the template
    my $t = XML::XPathScript::Template->new;
    $processor->set_template( $t );
    $t->set( plant => { rename => 'tr' } );
    # ...
</%perl>

<table>
    <% $processor->apply_templates( '//plant' )  %>
</table>

Beyond XML

Anything that has an XPath interface can potentially be processed by XPathScript.

use B::XPath;
use XML::XPathScript;

sub guinea_pig {
	my $x = shift;
	print "oink oink " x $x;
}

my $xps = XML::XPathScript->new;

$xps->set_dom( B::XPath->fetch_root( \&guinea_pig ) );
$xps->set_stylesheet( '<%~ //print %>' );

print $xps->transform;