SaxCompiler

SaxCompiler is a tool for recording SAX ContentHandler events as Java code that can play back the events without parsing XML. Java method calls can be inserted among the SAX events. Optionally, document fragments can be played back by omitting the startDocument and endDocument calls and the startElement and endElement calls for the root element. SaxCompiler is licensed under the “expat”/“MIT” license, and the author does not attempt to claim copyright to the generated code.

The generated code is a final class with a single static re-entrant method called emit that takes a ContentHandler as its first argument.
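
The shape described above can be sketched by hand. This is only an illustration under stated assumptions: GreetingEmitter is a hypothetical, hand-written stand-in that mimics a generated class (final, private constructor, static emit), and EventLog is a hypothetical handler that records the events it receives. Neither name comes from SaxCompiler.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical driver: plays the recorded events into a handler.
final class EmitDemo {
    private EmitDemo() {}

    public static void main(String[] args) throws SAXException {
        EventLog log = new EventLog();
        GreetingEmitter.emit(log);
        System.out.println(log.events);
        // prints [startDocument, start:greeting, text:Hi, end:greeting, endDocument]
    }
}

// Hypothetical stand-in that mimics the shape of a generated class.
final class GreetingEmitter {
    private static final char[] CHARS = { 'H', 'i' };

    private GreetingEmitter() {}

    // Re-entrant: the only mutable state is held in local variables.
    public static void emit(ContentHandler contentHandler) throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        contentHandler.startDocument();
        contentHandler.startElement("", "greeting", "greeting", attrs);
        contentHandler.characters(CHARS, 0, 2);
        contentHandler.endElement("", "greeting", "greeting");
        contentHandler.endDocument();
    }
}

// Hypothetical handler that records each event as a short string.
final class EventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }
    @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }
    @Override public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }
    @Override public void characters(char[] ch, int start, int length) {
        events.add("text:" + new String(ch, start, length));
    }
}
```

No XML parser is involved at playback time: the handler sees the same kind of calls it would receive from a parser, but they come straight from compiled code.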

Processing Instructions

SaxCompiler recognizes five special processing instructions: SaxCompiler-package, SaxCompiler-class, SaxCompiler-args, SaxCompiler-omitRoot, and SaxCompiler-code. Except for SaxCompiler-code, these processing instructions must appear in the document prolog before any other processing instructions.

SaxCompiler-package
    Example: <?SaxCompiler-package com.example.package?>
    Purpose: Declares the package of the generated class.
    Optionality: Optional
    Placement: In prolog before SaxCompiler-class

SaxCompiler-class
    Example: <?SaxCompiler-class DocumentEmitter?>
    Purpose: Declares the name of the generated class.
    Optionality: Mandatory
    Placement: In prolog before SaxCompiler-args

SaxCompiler-args
    Example: <?SaxCompiler-args com.example.other.package.Foo f, com.example.other.package.Bar b?>
    Purpose: Declares additional arguments for the emit method. Types outside java.lang and the package for the generated code must use fully-qualified names. The identifier contentHandler and identifiers starting and ending with two underscores are reserved by SaxCompiler.
    Optionality: Optional
    Placement: In prolog before possible non-SaxCompiler PIs

SaxCompiler-omitRoot
    Example: <?SaxCompiler-omitRoot?>
    Purpose: Signals that the startDocument and endDocument calls and the startElement and endElement calls for the root element should be omitted.
    Optionality: Optional
    Placement: In prolog before SaxCompiler-args

SaxCompiler-code
    Example: <?SaxCompiler-code f.emitFoo(contentHandler, b);?>
    Purpose: Contains Java code for verbatim inclusion. The code may use the contentHandler variable and the variables given in SaxCompiler-args. The code is responsible for leaving contentHandler in an acceptable state by nesting any start/end calls properly. The code must not call startDocument or endDocument and is only allowed to throw RuntimeExceptions or SAXExceptions.
    Optionality: Optional
    Placement: Anywhere after SaxCompiler-args; may occur multiple times
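
The constraints on SaxCompiler-code can be illustrated with a sketch of a helper that such code might call. Everything below is an assumption made for illustration: the source only shows the call f.emitFoo(contentHandler, b), so the Foo and Bar classes, the label field, and the item element name are hypothetical.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical argument type; the source does not define Bar.
final class Bar {
    final String label;

    Bar(String label) {
        this.label = label;
    }
}

// Hypothetical helper; only the call f.emitFoo(contentHandler, b) appears in the source.
final class Foo {
    private static final String NS = "http://hsivonen.iki.fi/FooML";

    // Declares SAXException only, which SaxCompiler-code is allowed to throw.
    void emitFoo(ContentHandler contentHandler, Bar b) throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        // Every startElement is matched by an endElement before returning,
        // and startDocument/endDocument are never called here.
        contentHandler.startElement(NS, "item", "item", attrs);
        char[] text = b.label.toCharArray();
        contentHandler.characters(text, 0, text.length);
        contentHandler.endElement(NS, "item", "item");
    }
}
```

Because the helper closes every element it opens, the generated emit method can continue with its own endElement calls as if the inserted code had never run.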

Example

As the example below shows, the generated code is targeted at javac, not humans.

Sample Input

<?xml version="1.0"?>
<?SaxCompiler-omitRoot?>
<?SaxCompiler-package com.example.package?>
<?SaxCompiler-class DocumentEmitter?>
<?SaxCompiler-args com.example.other.package.Foo f, com.example.other.package.Bar b?>
<foo xmlns="http://hsivonen.iki.fi/FooML">
<bar attr='val'>Text</bar>
<baz><?SaxCompiler-code f.emitFoo(contentHandler, b);?></baz>
</foo>

Generated Output

/* This code was generated by fi.iki.hsivonen.xml.SaxCompiler. Please regenerate instead of editing. */
package com.example.package;
public final class DocumentEmitter {
private DocumentEmitter() {}
public static void emit(org.xml.sax.ContentHandler contentHandler, com.example.other.package.Foo f, com.example.other.package.Bar b) throws org.xml.sax.SAXException {
org.xml.sax.helpers.AttributesImpl __attrs__ = new org.xml.sax.helpers.AttributesImpl();
contentHandler.startPrefixMapping("", "\u0068\u0074\u0074\u0070\u003a\u002f\u002f\u0068\u0073\u0069\u0076\u006f\u006e\u0065\u006e\u002e\u0069\u006b\u0069\u002e\u0066\u0069\u002f\u0046\u006f\u006f\u004d\u004c");
contentHandler.characters(__chars__, 0, 1);
__attrs__.clear();
__attrs__.addAttribute("", "\u0061\u0074\u0074\u0072", "\u0061\u0074\u0074\u0072", "\u0043\u0044\u0041\u0054\u0041", "\u0076\u0061\u006c");
contentHandler.startElement("\u0068\u0074\u0074\u0070\u003a\u002f\u002f\u0068\u0073\u0069\u0076\u006f\u006e\u0065\u006e\u002e\u0069\u006b\u0069\u002e\u0066\u0069\u002f\u0046\u006f\u006f\u004d\u004c", "\u0062\u0061\u0072", "\u0062\u0061\u0072", __attrs__);
contentHandler.characters(__chars__, 1, 4);
contentHandler.endElement("\u0068\u0074\u0074\u0070\u003a\u002f\u002f\u0068\u0073\u0069\u0076\u006f\u006e\u0065\u006e\u002e\u0069\u006b\u0069\u002e\u0066\u0069\u002f\u0046\u006f\u006f\u004d\u004c", "\u0062\u0061\u0072", "\u0062\u0061\u0072");
contentHandler.characters(__chars__, 5, 1);
__attrs__.clear();
contentHandler.startElement("\u0068\u0074\u0074\u0070\u003a\u002f\u002f\u0068\u0073\u0069\u0076\u006f\u006e\u0065\u006e\u002e\u0069\u006b\u0069\u002e\u0066\u0069\u002f\u0046\u006f\u006f\u004d\u004c", "\u0062\u0061\u007a", "\u0062\u0061\u007a", __attrs__);
f.emitFoo(contentHandler, b);
contentHandler.endElement("\u0068\u0074\u0074\u0070\u003a\u002f\u002f\u0068\u0073\u0069\u0076\u006f\u006e\u0065\u006e\u002e\u0069\u006b\u0069\u002e\u0066\u0069\u002f\u0046\u006f\u006f\u004d\u004c", "\u0062\u0061\u007a", "\u0062\u0061\u007a");
contentHandler.characters(__chars__, 6, 1);
contentHandler.endPrefixMapping("");
}
private static final char[] __chars__ = { '\n', '\u0054', '\u0065', '\u0078', '\u0074', '\n', '\n' };
}
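
Because the sample input uses SaxCompiler-omitRoot, the emit method above plays back a fragment: the caller has to supply startDocument, an enclosing element, and endDocument. A minimal sketch of such a caller follows. FragmentEmitter is a hypothetical, simplified stand-in for a generated class (included so that the sketch compiles on its own), and FragmentHost, TagTrace, and the wrapper element name are likewise assumptions, not part of SaxCompiler.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical caller that wraps a fragment in a complete document.
final class FragmentHost {
    private FragmentHost() {}

    static void emitDocument(ContentHandler contentHandler) throws SAXException {
        AttributesImpl noAttrs = new AttributesImpl();
        contentHandler.startDocument();
        contentHandler.startElement("", "wrapper", "wrapper", noAttrs);
        // The fragment supplies only the children of its omitted root element.
        FragmentEmitter.emit(contentHandler);
        contentHandler.endElement("", "wrapper", "wrapper");
        contentHandler.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        TagTrace trace = new TagTrace();
        emitDocument(trace);
        System.out.println(trace.out); // prints <wrapper><bar>Text</bar></wrapper>
    }
}

// Simplified, hand-written stand-in for a class generated with SaxCompiler-omitRoot.
final class FragmentEmitter {
    private static final char[] CHARS = { 'T', 'e', 'x', 't' };

    private FragmentEmitter() {}

    static void emit(ContentHandler contentHandler) throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        contentHandler.startElement("", "bar", "bar", attrs);
        contentHandler.characters(CHARS, 0, 4);
        contentHandler.endElement("", "bar", "bar");
    }
}

// Hypothetical handler that writes a rough textual trace of element events.
final class TagTrace extends DefaultHandler {
    final StringBuilder out = new StringBuilder();

    @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.append('<').append(qName).append('>');
    }
    @Override public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }
    @Override public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }
}
```

The same fragment emitter could be called several times between one pair of startElement and endElement calls, which is the main use of fragment playback.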

Download

Source and binary in one jar

Invocation

java -jar SaxCompiler.jar < input.xml > Output.java