This is Annotation XPath for SAX, AXS. It is made freely available under an MIT-type license, as described in the LICENSE file. All of the code generated by the attribute processor is freely licensed under the same terms as this package.
AXS (pronounced "axis") is an effort to make writing SAX DocumentHandlers easy. An AXS handler subclasses com.googlecode.axs.AbstractAnnotatedHandler and then instead of (or in addition to) the usual startElement(), endElement(), etc. SAX handlers, it defines annotated handlers which are called when the current element in the document being parsed matches an XPath expression. For example, if one had a document
<person>
<names>
<name>John Smith</name>
<name type="alias">Kyon</name>
<name type="alias">Hey, you!</name>
</names>
<age span="subjective">18.32</age>
<age span="years-since-birth">16.1</age>
<locations>
<location>
<country>Japan</country>
<era>mid-Haruhi</era>
</location>
<location>
<country>alternate-Japan@3c603ff:110bb8e</country>
<era>elided-Haruhi</era>
<subsidary-universe/>
</location>
</locations>
</person>
the handler function
@XPath("names/name[@type != 'alias']")
public void realName(String name) { ... }
would be called exactly once with the string "John Smith". Similarly, a function
@XPath("locations/location/country")
public void whereIsHeNow(String country) { ... }
would be called twice, once with "Japan" and once with "alternate-Japan@3c603ff:110bb8e".
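For comparison, here is a rough sketch of the bookkeeping that the whereIsHeNow handler replaces when the same thing is written against plain SAX. It is not AXS code and not what the attribute processor generates. The class name CountryCollector is hypothetical, and matching the path as a suffix of the open-element stack is an assumption about how the relative expression is anchored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-written counterpart of @XPath("locations/location/country").
class CountryCollector extends DefaultHandler {
    private static final List<String> TARGET =
            Arrays.asList("locations", "location", "country");

    final List<String> countries = new ArrayList<>();
    private final List<String> path = new ArrayList<>();     // open elements, root first
    private final StringBuilder text = new StringBuilder();  // text since the last start tag

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.add(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        int n = path.size();
        // Assumption: the expression matches when it is a suffix of the current path.
        if (n >= 3 && path.subList(n - 3, n).equals(TARGET)) {
            countries.add(text.toString());
        }
        path.remove(n - 1);
    }

    static List<String> parse(String xml) throws Exception {
        CountryCollector handler = new CountryCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.countries;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><locations>"
                + "<location><country>Japan</country><era>mid-Haruhi</era></location>"
                + "</locations></person>";
        System.out.println(parse(xml));
    }
}
```

With AXS, the path tracking and text buffering above collapse into the single annotated method.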
AXS provides two JAR files, one of which ("axs-runtime") must be included in your application.
The other JAR ("axs-compiler") must be added to your project as an attribute processor for
javac. For the Oracle (Sun) javac, this is done by using the -processorpath command line
argument. In an Ant <javac> task, this can be done with <compilerarg> elements.
<javac...>
<compilerarg value="-processorpath"/>
<compilerarg value="${axs-compiler-jar}"/>
<compilerarg value="-s"/>
<compilerarg value="${generated-code-dir}"/>
</javac>
Then, the code generated by the attribute processor must also be compiled and included in your application.
AXS provides four attributes, one of which applies to the handler class and three of which apply to specific handler methods. The attributes are as follows.
The first attribute is applied to the handler class, and defines the qualified name (QName) Prefix to Namespace URI mappings used for all the XPath expressions in this class. If this attribute is not present, a single mapping of the null Prefix ("") to the null Namespace URI ("") is used. The strings are of the form "prefix=URI", e.g. "html=http://www.w3.org/1999/xhtml" defines that the Prefix "html" refers to elements in the XHTML namespace. Any prefix, including the null prefix, can be mapped.
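To make the "prefix=URI" form concrete, the following sketch splits such strings into a prefix-to-URI map. It is only an illustration, not how AXS parses the attribute. The class name NamespaceMappings is hypothetical, and splitting at the first '=' is an assumption, made because a URI may itself contain '='.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the "prefix=URI" string form.
class NamespaceMappings {
    static Map<String, String> parse(String... mappings) {
        Map<String, String> byPrefix = new LinkedHashMap<>();
        for (String mapping : mappings) {
            // Assumption: only the first '=' separates the prefix from the URI.
            int eq = mapping.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("expected prefix=URI: " + mapping);
            }
            byPrefix.put(mapping.substring(0, eq), mapping.substring(eq + 1));
        }
        return byPrefix;
    }

    public static void main(String[] args) {
        // The second entry maps the null prefix ("") to a made-up example URI.
        System.out.println(parse("html=http://www.w3.org/1999/xhtml", "=urn:example:default"));
    }
}
```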
The @XPath attribute is applied to a handler method, and specifies that the method will be called with the text enclosed by the right-most Element of the XPath expression. If you don't care about the content of the element, only its existence or attributes, you probably want to use @XPathStart instead.
The @XPathStart attribute is applied to a handler method, and specifies that the method will be called with the SAX Attributes of the right-most Element of the XPath expression as soon as that Element is started.
To continue the example, if one wanted to know all the different ways that John Smith's age is tracked, the handler function
@XPathStart("/person/age")
public void foundAnAge(org.xml.sax.Attributes attrs) { ... }
would be called twice, once for each element.
The @XPathEnd attribute is applied to a handler method, and specifies that the method will be called when the right-most Element of the XPath expression is ended.
In the example, if one wanted to stop parsing as soon as two aliases were found, the handler function
@XPathEnd("/person/names/name[@type='alias'][2]")
public void gotTwoAliases() throws SAXException { throw new SAXException("got what we needed"); }
would do so by throwing a SAXException after the </name> tag of the "Hey, you!" entry.
(In practice, you'd want to throw a subclass of SAXException so you could tell it apart from an actual error!)
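The subclass idea can be sketched as follows. So that the example runs on its own, it uses a plain SAX DefaultHandler in place of an AXS handler. StopParsingException, AliasCounter and stoppedEarly are hypothetical names. The sketch relies on the parser rethrowing the handler's exception unchanged, as the JDK's bundled parser does.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical marker exception: a deliberate stop, not a parse failure.
class StopParsingException extends SAXException {
    StopParsingException(String message) { super(message); }
}

class EarlyStop {
    // Plain SAX stand-in for the @XPathEnd handler: stops once the second alias ends.
    static class AliasCounter extends DefaultHandler {
        int aliasesSeen = 0;
        private boolean inAlias = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            inAlias = "name".equals(qName) && "alias".equals(attrs.getValue("type"));
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (inAlias && "name".equals(qName)) {
                inAlias = false;
                if (++aliasesSeen == 2) {
                    throw new StopParsingException("got what we needed");
                }
            }
        }
    }

    /** Returns true if parsing was cut short on purpose, false if the document ended normally. */
    static boolean stoppedEarly(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new AliasCounter());
            return false;
        } catch (StopParsingException expected) {
            return true; // deliberate stop; any other SAXException still propagates
        }
    }
}
```

Catching only the subclass keeps genuine parse errors visible to the caller.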
Multiple XPath expressions may be combined in a single attribute by using the '|' character to separate alternatives.
AXS handles only a subset of the full XPath specification. Since SAX is a streaming parser,
AXS only accepts forward path steps, and specifically only the child:: and descendant:: axes.
Every path step must have an element (no attribute-only steps), and wildcard elements ('*')
are not permitted. Use the descendant:: (i.e. //) axis instead.
Only a few predicates are accepted:
- the numeric singleton predicate [N], which selects the element at Context Position N (e.g. names/name[2] selects the 2nd <name> that is a child of <names>)
- string value comparisons $A CMP $B where CMP is either '=' or '!=' and $A and $B are either string literals ('value' or "value"; for either form, a doubled delimiter is the escape sequence to write that delimiter, e.g. 'a''b' is the literal "a'b" and likewise "a""b" is 'a"b') or attribute names @NAME
- string match functions "contains(a, b)", "starts-with(a, b)", and "ends-with(a, b)" where a and b are either literals or attribute names
- the regular expression string match function "match(A, L)" where A is either a literal or an attribute name, and L is a string literal: this tests whether A matches the regular expression specified in L. Regular expression syntax is that of java.util.regex.Pattern, not that of XPath. The optional third (flags) argument in the XPath standard is not supported. Use the (?idmsux-idmsux) syntax inside the pattern to set pattern flags instead.
- numeric comparisons to the position() function (e.g. [position() < 4] selects the first three matches)
- the special function [captureattrs()], which ensures that the attributes of the Element to which it is applied will be available when the handler function is called (see section V). This predicate is always true.
- parenthesized expressions, the "and" and "or" boolean operators, and the function "not(EXPR)"
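Because match() has no flags argument, flags have to be embedded in the pattern itself. The sketch below shows the embedded-flag syntax using java.util.regex.Pattern directly. The class name RegexFlags is hypothetical. Requiring the whole string to match is an assumption, since the text above does not say whether AXS tests a full or a partial match.

```java
import java.util.regex.Pattern;

// Hypothetical demo of embedded flags in java.util.regex.Pattern.
class RegexFlags {
    // Assumption: the whole value must match the pattern.
    static boolean fullMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    public static void main(String[] args) {
        // Without a flag the comparison is case-sensitive.
        System.out.println(fullMatch("hey.*", "Hey, you!"));
        // (?i) at the start of the pattern switches on case-insensitive matching.
        System.out.println(fullMatch("(?i)hey.*", "Hey, you!"));
    }
}
```

In a predicate the flag would go inside the string literal L, along the lines of match(@type, '(?i)alias').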
AXS supports both the full and abbreviated naming forms of XPath. The "child::" axis prefix
can be freely omitted, the "descendant::" prefix can be abbreviated as "//", and "attribute::"
can be abbreviated as "@".
AbstractAnnotatedHandler provides several functions which may be called by a handler
function to request more information about the context of the handler call:
Returns how many elements deep the current path is to the root.
Returns the tag at a given depth in the current path. Depth 0 is the root element.
Returns the depth at which the tag can be found, or -1 if it was not found. Can optionally take the depth at which to start an incremental search.
Returns the attributes of the tag at a given depth, or null if they are not available.
Note that the attributes of a given tag will only be available if they were used in a
predicate, or if the special [captureattrs()] predicate was used. (Predicates which
do not need the attributes to be evaluated don't capture them. If you want them available,
write e.g. "...[position() > 2 and captureattrs()]".)
Handler subclasses are free to implement all the usual SAX DocumentHandler methods themselves if needed, but they must call the superclass implementation as well.