java.lang.Object fi.iki.hsivonen.gnu.xml.aelfred2.XmlParser
final class XmlParser
Parse XML documents and return parse events through call-backs.
Use the SAXDriver class as your entry point, as all internal parser interfaces are subject to change.
SAXDriver
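The call-back style described above can be sketched with the standard SAX interfaces. This is only an illustration: it uses the JDK's built-in SAX parser as a stand-in, because SAXDriver's public API is not shown on this page, and the names `CallbackSketch` and `countElements` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class CallbackSketch {
    /** Counts the start-element events reported through SAX call-backs. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            // Stand-in parser: the JDK default SAX implementation, not XmlParser or SAXDriver.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    count[0]++; // one call-back per start tag or empty-element tag
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Parsing reports SAXException or IOException; wrapped here to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```

An application that depends only on the SAX interfaces in this way is insulated from changes to the internal parser class.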
Nested Class Summary | |
---|---|
(package private) static class |
XmlParser.AttributeDecl
|
(package private) static class |
XmlParser.ElementDecl
|
(package private) static class |
XmlParser.EntityInfo
|
(package private) static class |
XmlParser.ExternalIdentifiers
|
(package private) static class |
XmlParser.Input
|
Field Summary | |
---|---|
private boolean |
alreadyWarnedAboutPrivateUseCharacters
|
static int |
ATTRIBUTE_DEFAULT_FIXED
Constant: the attribute was declared #FIXED. |
static int |
ATTRIBUTE_DEFAULT_IMPLIED
Constant: the attribute was declared #IMPLIED. |
static int |
ATTRIBUTE_DEFAULT_REQUIRED
Constant: the attribute was declared #REQUIRED. |
static int |
ATTRIBUTE_DEFAULT_SPECIFIED
Constant: the attribute has a literal default value specified. |
static int |
ATTRIBUTE_DEFAULT_UNDECLARED
Constant: the attribute is not declared. |
private String |
characterEncoding
|
private int |
column
|
static int |
CONTENT_ANY
Constant: the element has a content model of ANY. |
static int |
CONTENT_ELEMENTS
Constant: the element has element content. |
static int |
CONTENT_EMPTY
Constant: the element has declared content of EMPTY. |
static int |
CONTENT_MIXED
Constant: the element has mixed content. |
static int |
CONTENT_UNDECLARED
Constant: an element has not been declared. |
private static int |
CONTEXT_LITERAL
|
private static int |
CONTEXT_NORMAL
|
private int |
currentByteCount
|
private String |
currentElement
|
private int |
currentElementContent
|
private static int |
DATA_BUFFER_INITIAL
|
private char[] |
dataBuffer
|
private int |
dataBufferPos
|
private boolean |
docIsStandalone
|
private boolean |
doReport
|
private HashMap<String,XmlParser.ElementDecl> |
elementInfo
|
(package private) static char[] |
endDelimCDATA
|
(package private) static char[] |
endDelimComment
|
(package private) static char[] |
endDelimPI
|
static int |
ENTITY_INTERNAL
Constant: the entity is internal. |
static int |
ENTITY_NDATA
Constant: the entity is external, non-parsable data. |
static int |
ENTITY_TEXT
Constant: the entity is external XML data. |
static int |
ENTITY_UNDECLARED
Constant: the entity has not been declared. |
private HashMap<String,XmlParser.EntityInfo> |
entityInfo
|
private LinkedList<String> |
entityStack
|
private boolean |
expandPE
|
private SAXDriver |
handler
|
private boolean |
inCDATA
|
private boolean |
inLiteral
|
private static int |
INPUT_INTERNAL
|
private static int |
INPUT_NONE
|
private static int |
INPUT_READER
|
private LinkedList<XmlParser.Input> |
inputStack
|
private InputStream |
is
|
private boolean |
isDirtyCurrentElement
|
private int |
line
|
private static int |
LIT_ATTRIBUTE
|
private static int |
LIT_DISABLE_CREF
|
private static int |
LIT_DISABLE_EREF
|
private static int |
LIT_DISABLE_PE
|
private static int |
LIT_ENTITY_REF
|
private static int |
LIT_NORMALIZE
|
private static int |
LIT_PUBID
|
private static int |
NAME_BUFFER_INITIAL
|
private char[] |
nameBuffer
|
private int |
nameBufferPos
|
private NormalizationChecker |
normalizationChecker
|
private HashMap<String,String> |
notationInfo
|
private boolean |
peIsError
|
private char |
prev
|
private byte[] |
rawReadBuffer
|
private static int |
READ_BUFFER_MAX
|
private char[] |
readBuffer
|
private int |
readBufferLength
|
private int |
readBufferOverflow
|
private int |
readBufferPos
|
private Reader |
reader
|
private boolean |
sawCR
|
private InputSource |
scratch
|
private boolean |
skippedPE
|
private int |
sourceType
|
(package private) static char[] |
startDelimComment
|
(package private) static char[] |
startDelimPI
|
private static int |
SURROGATE_OFFSET
|
private static int |
SYMBOL_TABLE_LENGTH
|
private Object[][] |
symbolTable
|
private int |
tagAttributePos
|
private String[] |
tagAttributes
|
(package private) static boolean |
uriWarnings
|
private static boolean |
USE_CHEATS
|
private static int |
XML_10
|
private static int |
XML_11
|
private int |
xmlVersion
|
Constructor Summary | |
---|---|
XmlParser()
Construct a new parser with no associated handler. |
Method Summary | |
---|---|
private void |
checkEncodingLiteral(String encodingName)
|
private void |
checkEncodingMatch(String used,
String detected)
|
private void |
checkLegalVersion(String version)
|
private void |
dataBufferAppend(char c)
Add a character to the data buffer. |
private void |
dataBufferAppend(char[] ch,
int start,
int length)
Append (part of) a character array to the data buffer. |
private void |
dataBufferAppend(String s)
Add a string to the data buffer. |
private void |
dataBufferFlush()
Flush the contents of the data buffer to the handler, as appropriate, and reset the buffer for new input. |
private void |
dataBufferNormalize()
Normalise space characters in the data buffer. |
private String |
dataBufferToString()
Convert the data buffer to a string. |
Iterator<String> |
declaredAttributes(String elname)
Get the declared attributes for an element type. |
private Iterator<String> |
declaredAttributes(XmlParser.ElementDecl element)
Get the declared attributes for an element type. |
private void |
detectEncoding()
Attempt to detect the encoding of an entity. |
(package private) void |
doParse(String systemId,
String publicId,
Reader reader,
InputStream stream,
String encoding)
Parse an XML document from the character stream, byte stream, or URI that you provide (in that order of preference). |
private void |
draconianInputStreamReader(String encoding,
InputStream stream,
boolean requireAsciiSuperset)
|
private void |
draconianInputStreamReader(String encoding,
InputStream stream,
boolean requireAsciiSuperset,
String actualName)
|
private void |
err(String message)
Report non-fatal errors. |
private Object |
extendArray(Object array,
int currentSize,
int requiredSize)
Ensure the capacity of an array, allocating a new one if necessary. |
private void |
fatal(String message)
Report typical case fatal errors. |
private void |
fatal(String message,
char textFound,
String textExpected)
Report a serious error. |
private void |
fatal(String message,
String textFound,
String textExpected)
Report an error. |
private void |
filterCR(boolean moreData)
Filter carriage returns in the read buffer. |
private XmlParser.AttributeDecl |
getAttribute(String elName,
String name)
Retrieve the attribute declaration for the given element name and name. |
String |
getAttributeDefaultValue(String name,
String aname)
Retrieve the default value of a declared attribute. |
int |
getAttributeDefaultValueType(String name,
String aname)
Retrieve the default value mode of a declared attribute. |
String |
getAttributeEnumeration(String name,
String aname)
Retrieve the allowed values for an enumerated attribute type. |
String |
getAttributeType(String name,
String aname)
Retrieve the declared type of an attribute. |
int |
getColumnNumber()
Return the current column number. |
private int |
getContentType(XmlParser.ElementDecl element,
int defaultType)
|
private HashMap<String,XmlParser.AttributeDecl> |
getElementAttributes(String name)
Look up the attribute hash table for an element. |
int |
getElementContentType(String name)
Look up the content type of an element. |
XmlParser.ExternalIdentifiers |
getEntityIds(String ename)
Return an external entity's identifiers. |
int |
getEntityType(String ename)
Find the type of an entity. |
String |
getEntityValue(String ename)
Return an internal entity's replacement text. |
int |
getLineNumber()
Return the current line number. |
private void |
initializeVariables()
Re-initialize the variables for each parse. |
String |
intern(char[] ch,
int start,
int length)
Create an interned string from a character array. |
private boolean |
isAstralPrivateUse(int c)
|
private static boolean |
isExtender(char c)
|
private boolean |
isNonCharacter(int c)
|
private boolean |
isPrivateUse(char c)
|
private boolean |
isPrivateUse(int c)
|
(package private) boolean |
isStandalone()
|
private boolean |
isWhitespace(char c)
Test if a character is whitespace. |
private void |
parseAttDef(String elementName)
Parse a single attribute definition. |
private void |
parseAttlistDecl()
Parse an attribute list declaration. |
private void |
parseAttribute(String name)
Parse an attribute assignment. |
private void |
parseCDSect()
Parse a CDATA section. |
private void |
parseCharData()
Parse character data. |
private void |
parseCharRef()
|
private void |
parseCharRef(boolean doFlush)
Read and interpret a character reference. |
private void |
parseComment()
Skip a comment. |
private void |
parseConditionalSect(char[] saved)
Parse a conditional section. |
private void |
parseContent()
Parse the content of an element. |
private void |
parseContentspec(String name)
Content specification. |
private void |
parseCp()
Parse a content particle. |
private void |
parseDefault(String elementName,
String name,
String type,
String enumer)
Parse the default value for an attribute. |
private void |
parseDoctypedecl()
Parse a document type declaration. |
private void |
parseDocument()
Parse an XML document. |
private void |
parseElement(boolean maybeGetSubset)
Parse an element, with its tags. |
private void |
parseElementDecl()
Parse an element type declaration. |
private void |
parseElements(char[] saved)
Parse an element-content model. |
private void |
parseEntityDecl()
Parse an entity declaration. |
private void |
parseEntityRef(boolean externalAllowed)
Parse and expand an entity reference. |
private void |
parseEnumeration(boolean isNames)
Parse an enumeration. |
private void |
parseEq()
Parse an equals sign surrounded by optional whitespace. |
private void |
parseETag()
Parse an end tag. |
private void |
parseMarkupdecl()
Parse a markup declaration in the internal or external DTD subset. |
private void |
parseMisc()
Parse miscellaneous markup outside the document element and DOCTYPE declaration. |
private void |
parseMixed(char[] saved)
Parse mixed content. |
private void |
parseNotationDecl()
Parse a notation declaration. |
private void |
parseNotationType()
Parse a notation type for an attribute. |
private void |
parsePEReference()
Parse and expand a parameter entity reference. |
private void |
parsePI()
Parse a processing instruction and do a call-back. |
private boolean |
parseProlog()
Parse the prolog of an XML document. |
private String |
parseTextDecl(String encoding)
Parse a text declaration. |
private void |
parseUntil(char[] delim)
|
private String |
parseXMLDecl(String encoding)
Parse the XML declaration. |
private void |
popInput()
Restore a previous input source. |
private void |
prefetchASCIIEncodingDecl()
Prefetch US-ASCII XML/text decl from input stream into read buffer. |
private void |
pushCharArray(String ename,
char[] ch,
int start,
int length)
Push a new internal input source. |
private void |
pushInput(String ename)
Save the current input source onto the stack. |
private void |
pushString(String ename,
String s)
This method pushes a string back onto input. |
private void |
pushURL(boolean isPE,
String ename,
XmlParser.ExternalIdentifiers ids,
Reader aReader,
InputStream aStream,
String aEncoding,
boolean doResolve)
Push, or skip, a new external input source. |
private String |
readAttType()
Parse the attribute type. |
private char |
readCh()
Read a single character from the readBuffer. |
private void |
readDataChunk()
Read a chunk of data from an external input source. |
private XmlParser.ExternalIdentifiers |
readExternalIds(boolean inNotation,
boolean isSubset)
Try reading external identifiers. |
private String |
readLiteral(int flags)
Read a literal. |
private String |
readNmtoken(boolean isName)
Read a name or (when parsing an enumeration) name token. |
private void |
require(char delim)
Require a character to appear, or throw an exception. |
private void |
require(String delim)
Require a string to appear, or throw an exception. |
private void |
requireWhitespace()
Require whitespace characters. |
private void |
setAttribute(String elName,
String name,
String type,
String enumeration,
String value,
int valueType)
Register an attribute declaration for later retrieval. |
private void |
setElement(String name,
int contentType,
String contentModel,
HashMap<String,XmlParser.AttributeDecl> attributes)
Register an element. |
private void |
setExternalEntity(String eName,
int eClass,
XmlParser.ExternalIdentifiers ids,
String nName)
Register an external entity declaration for later retrieval. |
(package private) void |
setHandler(SAXDriver handler)
Set the handler that will receive parsing events. |
private void |
setInternalEntity(String eName,
String value)
Register an entity declaration for later retrieval. |
private void |
setNotation(String nname,
XmlParser.ExternalIdentifiers ids)
Report a notation declaration, checking for duplicates. |
private void |
skipWhitespace()
Skip whitespace characters. |
private static boolean |
tryEncoding(byte[] sig,
byte b1,
byte b2)
Check for a two-byte signature. |
private static boolean |
tryEncoding(byte[] sig,
byte b1,
byte b2,
byte b3,
byte b4)
Check for a four-byte signature. |
private String |
tryEncodingDecl(String encoding)
Check for an encoding declaration. |
private boolean |
tryRead(char delim)
Return true if we can read the expected character. |
private boolean |
tryRead(char[] ch)
|
private boolean |
tryRead(String delim)
Return true if we can read the expected string. |
private void |
tryReadCharRef()
Try to read a character reference without consuming data from buffer. |
private boolean |
tryWhitespace()
Return true if we can read some whitespace. |
private void |
unread(char c)
Push a single character back onto the current input stream. |
private void |
unread(char[] ch,
int length)
Push a char array back onto the current input stream. |
private void |
warnAboutLackOfEncodingDecl(String encoding)
|
private void |
warnAboutPrivateUseChar()
|
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private static final boolean USE_CHEATS
private static final int SURROGATE_OFFSET
public static final int CONTENT_UNDECLARED
getElementContentType(java.lang.String)
,
Constant Field Values
public static final int CONTENT_ANY
getElementContentType(java.lang.String)
,
Constant Field Values
public static final int CONTENT_EMPTY
getElementContentType(java.lang.String)
,
Constant Field Values
public static final int CONTENT_MIXED
getElementContentType(java.lang.String)
,
Constant Field Values
public static final int CONTENT_ELEMENTS
getElementContentType(java.lang.String)
,
Constant Field Values
public static final int ENTITY_UNDECLARED
getEntityType(java.lang.String)
,
Constant Field Values
public static final int ENTITY_INTERNAL
getEntityType(java.lang.String)
,
Constant Field Values
public static final int ENTITY_NDATA
getEntityType(java.lang.String)
,
Constant Field Values
public static final int ENTITY_TEXT
getEntityType(java.lang.String)
,
Constant Field Values
public static final int ATTRIBUTE_DEFAULT_UNDECLARED
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
Constant Field Values
public static final int ATTRIBUTE_DEFAULT_SPECIFIED
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
getAttributeDefaultValue(java.lang.String, java.lang.String)
,
Constant Field Values
public static final int ATTRIBUTE_DEFAULT_IMPLIED
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
Constant Field Values
public static final int ATTRIBUTE_DEFAULT_REQUIRED
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
Constant Field Values
public static final int ATTRIBUTE_DEFAULT_FIXED
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
getAttributeDefaultValue(java.lang.String, java.lang.String)
,
Constant Field Values
private static final int INPUT_NONE
private static final int INPUT_INTERNAL
private static final int INPUT_READER
private static final int LIT_ENTITY_REF
private static final int LIT_NORMALIZE
private static final int LIT_ATTRIBUTE
private static final int LIT_DISABLE_PE
private static final int LIT_DISABLE_CREF
private static final int LIT_DISABLE_EREF
private static final int LIT_PUBID
private static final int CONTEXT_NORMAL
private static final int CONTEXT_LITERAL
static boolean uriWarnings
private SAXDriver handler
private Reader reader
private InputStream is
private int line
private int column
private int sourceType
private LinkedList<XmlParser.Input> inputStack
private String characterEncoding
private int currentByteCount
private InputSource scratch
private char[] readBuffer
private int readBufferPos
private int readBufferLength
private int readBufferOverflow
private static final int READ_BUFFER_MAX
private byte[] rawReadBuffer
private static int DATA_BUFFER_INITIAL
private char[] dataBuffer
private int dataBufferPos
private static int NAME_BUFFER_INITIAL
private char[] nameBuffer
private int nameBufferPos
private boolean docIsStandalone
private HashMap<String,XmlParser.ElementDecl> elementInfo
private HashMap<String,XmlParser.EntityInfo> entityInfo
private HashMap<String,String> notationInfo
private boolean skippedPE
private String currentElement
private int currentElementContent
private LinkedList<String> entityStack
private boolean inLiteral
private boolean expandPE
private boolean peIsError
private boolean doReport
private static final int SYMBOL_TABLE_LENGTH
private Object[][] symbolTable
private String[] tagAttributes
private int tagAttributePos
private boolean sawCR
private boolean inCDATA
private static final int XML_10
private static final int XML_11
private int xmlVersion
private NormalizationChecker normalizationChecker
static final char[] startDelimComment
static final char[] endDelimComment
static final char[] startDelimPI
static final char[] endDelimPI
static final char[] endDelimCDATA
private boolean isDirtyCurrentElement
private boolean alreadyWarnedAboutPrivateUseCharacters
private char prev
Constructor Detail |
---|
XmlParser()
setHandler(fi.iki.hsivonen.gnu.xml.aelfred2.SAXDriver)
,
#parse
Method Detail |
---|
void setHandler(SAXDriver handler)
handler - The handler to receive callback events.
#parse
void doParse(String systemId, String publicId, Reader reader, InputStream stream, String encoding) throws Exception
Only one thread at a time may use this parser; since it is private to this package, post-parse cleanup is done by the caller, which MUST NOT REUSE the parser (just null it).
systemId - Absolute URI of the document; should never be null, but may be so iff a reader or a stream is provided.
publicId - The public identifier of the document, or null.
reader - A character stream; must be null if stream isn't.
stream - A byte input stream; must be null if reader isn't.
encoding - The suggested encoding, or null if unknown.
Exception - Basically SAXException or IOException
private void fatal(String message, String textFound, String textExpected) throws SAXException
message - The error message.
textFound - The text that caused the error (or null).
SAXException
SAXDriver#error
,
line
private void fatal(String message, char textFound, String textExpected) throws SAXException
message - The error message.
textFound - The text that caused the error (or null).
SAXException
private void fatal(String message) throws SAXException
SAXException
private void err(String message) throws SAXException
SAXException
private void parseDocument() throws Exception
[1] document ::= prolog element Misc*
This is the top-level parsing function for a single XML document. As a minimum, a well-formed document must have a document element, and a valid document must have a prolog (one with doctype) as well.
Exception
private void parseComment() throws Exception
[15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* "-->"
(The <!-- has already been read.)
Exception
private void parsePI() throws SAXException, IOException
[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - ( ('X'|'x') ('M'|'m') ('L'|'l') )
(The <? has already been read.)
SAXException
IOException
private void parseCDSect() throws Exception
[18] CDSect ::= CDStart CData CDEnd
[19] CDStart ::= '<![CDATA['
[20] CData ::= (Char* - (Char* ']]>' Char*))
[21] CDEnd ::= ']]>'
(The '<![CDATA[' has already been read.)
Exception
private boolean parseProlog() throws Exception
[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
We do not look for the XML declaration here, because it was handled by pushURL ().
Exception
pushURL
private void checkLegalVersion(String version) throws SAXException
SAXException
private String parseXMLDecl(String encoding) throws SAXException, IOException
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"' )
[26] VersionNum ::= ([a-zA-Z0-9_.:] | '-')*
[32] SDDecl ::= S 'standalone' Eq ( "'" ('yes' | 'no') "'" | '"' ('yes' | 'no') '"' )
[80] EncodingDecl ::= S 'encoding' Eq ( '"' EncName '"' | "'" EncName "'" )
[81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
(The <?xml and whitespace have already been read.)
SAXException
IOException
parseTextDecl(java.lang.String)
,
#setupDecoding
private void checkEncodingLiteral(String encodingName) throws SAXException
SAXException
private String parseTextDecl(String encoding) throws SAXException, IOException
[79] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
[80] EncodingDecl ::= S 'encoding' Eq ( '"' EncName '"' | "'" EncName "'" )
[81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
(The <?xml and whitespace have already been read.)
SAXException
IOException
parseXMLDecl(java.lang.String)
,
#setupDecoding
private void checkEncodingMatch(String used, String detected) throws SAXException
SAXException
private void draconianInputStreamReader(String encoding, InputStream stream, boolean requireAsciiSuperset) throws SAXException, IOException
SAXException
IOException
private void draconianInputStreamReader(String encoding, InputStream stream, boolean requireAsciiSuperset, String actualName) throws SAXException, IOException
SAXException
IOException
private void parseMisc() throws Exception
[27] Misc ::= Comment | PI | S
Exception
private void parseDoctypedecl() throws Exception
[28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' (markupdecl | PEReference | S)* ']' S?)? '>'
(The <!DOCTYPE has already been read.)
Exception
private void parseMarkupdecl() throws Exception
[29] markupdecl ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment
[30] extSubsetDecl ::= (markupdecl | conditionalSect | PEReference | S) *
Reading toplevel PE references is handled as a lexical issue by the caller, as is whitespace.
Exception
private void parseElement(boolean maybeGetSubset) throws Exception
[39] element ::= EmptyElementTag | STag content ETag
[40] STag ::= '<' Name (S Attribute)* S? '>'
[44] EmptyElementTag ::= '<' Name (S Attribute)* S? '/>'
(The '<' has already been read.)
NOTE: this method actually chains onto parseContent (), if necessary, and parseContent () will take care of calling parseETag ().
Exception
private void parseAttribute(String name) throws Exception
[41] Attribute ::= Name Eq AttValue
name - The name of the attribute's element.
Exception
SAXDriver.attribute(java.lang.String, java.lang.String, boolean)
private void parseEq() throws SAXException, IOException
[25] Eq ::= S? '=' S?
SAXException
IOException
private void parseETag() throws Exception
[42] ETag ::= '</' Name S? '>'
NOTE: parseContent () chains to here; the "</" has already been read.
Exception
private void parseContent() throws Exception
[43] content ::= (element | CharData | Reference | CDSect | PI | Comment)*
[67] Reference ::= EntityRef | CharRef
NOTE: consumes the ETag.
Exception
private void parseElementDecl() throws Exception
[45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'
NOTE: the '<!ELEMENT' has already been read.
Exception
private void parseContentspec(String name) throws Exception
[46] contentspec ::= 'EMPTY' | 'ANY' | Mixed | elements
Exception
private void parseElements(char[] saved) throws Exception
[47] elements ::= (choice | seq) ('?' | '*' | '+')?
[49] choice ::= '(' S? cp (S? '|' S? cp)+ S? ')'
[50] seq ::= '(' S? cp (S? ',' S? cp)* S? ')'
NOTE: the opening '(' and S have already been read.
saved - Buffer for entity that should have the terminal ')'
Exception
private void parseCp() throws Exception
[48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?
Exception
private void parseMixed(char[] saved) throws Exception
[51] Mixed ::= '(' S? ( '#PCDATA' (S? '|' S? Name)*) S? ')*' | '(' S? ('#PCDATA') S? ')'
saved - Buffer for entity that should have the terminal ')'
Exception
private void parseAttlistDecl() throws Exception
[52] AttlistDecl ::= '<!ATTLIST' S Name AttDef* S? '>'
NOTE: the '<!ATTLIST' has already been read.
Exception
private void parseAttDef(String elementName) throws Exception
[53] AttDef ::= S Name S AttType S DefaultDecl
Exception
private String readAttType() throws Exception
[54] AttType ::= StringType | TokenizedType | EnumeratedType
[55] StringType ::= 'CDATA'
[56] TokenizedType ::= 'ID' | 'IDREF' | 'IDREFS' | 'ENTITY' | 'ENTITIES' | 'NMTOKEN' | 'NMTOKENS'
[57] EnumeratedType ::= NotationType | Enumeration
Exception
private void parseEnumeration(boolean isNames) throws Exception
[59] Enumeration ::= '(' S? Nmtoken (S? '|' S? Nmtoken)* S? ')'
NOTE: the '(' has already been read.
Exception
private void parseNotationType() throws Exception
[58] NotationType ::= 'NOTATION' S '(' S? Name (S? '|' S? Name)* S? ')'
NOTE: the 'NOTATION' has already been read
Exception
private void parseDefault(String elementName, String name, String type, String enumer) throws Exception
[60] DefaultDecl ::= '#REQUIRED' | '#IMPLIED' | (('#FIXED' S)? AttValue)
Exception
private void parseConditionalSect(char[] saved) throws Exception
[61] conditionalSect ::= includeSect | ignoreSect
[62] includeSect ::= '<![' S? 'INCLUDE' S? '[' extSubsetDecl ']]>'
[63] ignoreSect ::= '<![' S? 'IGNORE' S? '[' ignoreSectContents* ']]>'
[64] ignoreSectContents ::= Ignore ('<![' ignoreSectContents* ']]>' Ignore )*
[65] Ignore ::= Char* - (Char* ( '<![' | ']]>') Char* )
NOTE: the '<![' has already been read.
Exception
private void parseCharRef() throws SAXException, IOException
SAXException
IOException
private void tryReadCharRef() throws SAXException, IOException
[66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
NOTE: the '&#' has already been read.
SAXException
IOException
private void parseCharRef(boolean doFlush) throws SAXException, IOException
[66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
NOTE: the '&#' has already been read.
SAXException
IOException
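Production [66] can be illustrated with a small decoder. This is a minimal sketch, not the parser's implementation: `CharRefSketch`, `decode` and `toText` are hypothetical names, the input is assumed to be the text that follows the already-consumed '&#', and the check that the result is a legal XML Char is left out.

```java
final class CharRefSketch {
    /** Returns the value of an ASCII digit in the given radix, or -1 if it is not one. */
    private static int digit(char c, int radix) {
        int d;
        if (c >= '0' && c <= '9') {
            d = c - '0';
        } else if (c >= 'a' && c <= 'f') {
            d = c - 'a' + 10;
        } else if (c >= 'A' && c <= 'F') {
            d = c - 'A' + 10;
        } else {
            return -1;
        }
        return d < radix ? d : -1;
    }

    /** Decodes the remainder of a character reference, e.g. "65;" or "x41;". */
    static int decode(String afterAmpHash) {
        if (!afterAmpHash.endsWith(";")) {
            throw new IllegalArgumentException("missing ';'");
        }
        String body = afterAmpHash.substring(0, afterAmpHash.length() - 1);
        int radix = 10;
        if (body.startsWith("x")) { // hexadecimal form
            radix = 16;
            body = body.substring(1);
        }
        if (body.isEmpty()) {
            throw new IllegalArgumentException("no digits");
        }
        int value = 0;
        for (int i = 0; i < body.length(); i++) {
            int d = digit(body.charAt(i), radix);
            if (d < 0) {
                throw new IllegalArgumentException("illegal digit: " + body.charAt(i));
            }
            value = value * radix + d;
            if (value > 0x10FFFF) { // beyond the Unicode range; also prevents int overflow
                throw new IllegalArgumentException("out of range");
            }
        }
        return value;
    }

    /** Converts a code point to text, using a surrogate pair above U+FFFF. */
    static String toText(int codePoint) {
        return new String(Character.toChars(codePoint));
    }
}
```

Code points above U+FFFF occupy two Java chars, which is presumably why the class keeps a SURROGATE_OFFSET constant.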
private void parseEntityRef(boolean externalAllowed) throws SAXException, IOException
[68] EntityRef ::= '&' Name ';'
NOTE: the '&' has already been read.
externalAllowed - External entities are allowed here.
SAXException
IOException
private void parsePEReference() throws SAXException, IOException
[69] PEReference ::= '%' Name ';'
NOTE: the '%' has already been read.
SAXException
IOException
private void parseEntityDecl() throws Exception
[70] EntityDecl ::= GEDecl | PEDecl
[71] GEDecl ::= '<!ENTITY' S Name S EntityDef S? '>'
[72] PEDecl ::= '<!ENTITY' S '%' S Name S PEDef S? '>'
[73] EntityDef ::= EntityValue | (ExternalID NDataDecl?)
[74] PEDef ::= EntityValue | ExternalID
[75] ExternalID ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
[76] NDataDecl ::= S 'NDATA' S Name
NOTE: the '<!ENTITY' has already been read.
Exception
private void parseNotationDecl() throws Exception
[82] NotationDecl ::= '<!NOTATION' S Name S (ExternalID | PublicID) S? '>'
[83] PublicID ::= 'PUBLIC' S PubidLiteral
NOTE: the '<!NOTATION' has already been read.
Exception
private void parseCharData() throws Exception
[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
Exception
private void requireWhitespace() throws SAXException, IOException
SAXException
IOException
private void skipWhitespace() throws SAXException, IOException
[3] S ::= (#x20 | #x9 | #xd | #xa)+
SAXException
IOException
private String readNmtoken(boolean isName) throws SAXException, IOException
[5] Name ::= (Letter | '_' | ':') (NameChar)*
[7] Nmtoken ::= (NameChar)+
SAXException
IOException
private static boolean isExtender(char c)
private String readLiteral(int flags) throws SAXException, IOException
[9] EntityValue ::= ... ([^%&] | PEReference | Reference)* ...
[10] AttValue ::= ... ([^<&] | Reference)* ...
[11] SystemLiteral ::= ... (URLchar - "'")* ...
[12] PubidLiteral ::= ... (PubidChar - "'")* ...
as well as the quoted strings in XML and text declarations (for version, encoding, and standalone) which have their own constraints.
SAXException
IOException
private XmlParser.ExternalIdentifiers readExternalIds(boolean inNotation, boolean isSubset) throws Exception
inNotation - Are we parsing a notation decl?
isSubset - Parsing external subset decl (may be omitted)?
Exception
private final boolean isWhitespace(char c)
[3] S ::= (#x20 | #x9 | #xd | #xa)+
c - The character to test.
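Production [3] translates almost directly into code. The following is a minimal sketch rather than the parser's own implementation; the class `WhitespaceSketch` and the index-based `skipWhitespace` helper are hypothetical.

```java
final class WhitespaceSketch {
    /** True for the four characters allowed by production [3]: space, tab, CR, LF. */
    static boolean isWhitespace(char c) {
        return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
    }

    /** Returns the index of the first non-whitespace character at or after pos. */
    static int skipWhitespace(CharSequence text, int pos) {
        while (pos < text.length() && isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```

Note that this is narrower than `Character.isWhitespace`, which accepts additional Unicode space characters that XML does not treat as S.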
private void dataBufferAppend(char c)
private void dataBufferAppend(String s)
private void dataBufferAppend(char[] ch, int start, int length)
private void dataBufferNormalize()
private String dataBufferToString()
private void dataBufferFlush() throws SAXException
SAXException
private void require(String delim) throws SAXException, IOException
Precondition: Entity expansion is not required.
Precondition: data buffer has no characters that will get sent to the application.
SAXException
IOException
private void require(char delim) throws SAXException, IOException
SAXException
IOException
public String intern(char[] ch, int start, int length)
Create an interned string from a character array, so that equality can be tested with == instead of String.equals ().
This is much more efficient than constructing a non-interned string first, and then interning it.
ch - an array of characters for building the string.
start - the starting position in the array.
length - the number of characters to place in the string.
(String)
,
String.intern()
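The idea of interning straight from a character range can be sketched as follows. This is an assumption-laden illustration, not the class's symbol table: `InternSketch` is a hypothetical name, the bucket layout is invented for the example, and the table is local to the instance rather than shared with `String.intern()`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InternSketch {
    // Hash of the character range -> strings already created for that hash.
    private final Map<Integer, List<String>> buckets = new HashMap<>();

    /** Returns the one shared String for this character range, creating it only if needed. */
    String intern(char[] ch, int start, int length) {
        int hash = 0;
        for (int i = start; i < start + length; i++) {
            hash = 31 * hash + ch[i];
        }
        List<String> bucket = buckets.computeIfAbsent(hash, k -> new ArrayList<>());
        for (String candidate : bucket) {
            if (matches(candidate, ch, start, length)) {
                return candidate; // hit: no new String is allocated
            }
        }
        String created = new String(ch, start, length);
        bucket.add(created);
        return created;
    }

    /** Compares a stored string with a character range without building a temporary String. */
    private static boolean matches(String s, char[] ch, int start, int length) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (s.charAt(i) != ch[start + i]) {
                return false;
            }
        }
        return true;
    }
}
```

Because every lookup for the same characters returns the same object, callers that obtain all their names from one table can compare them by reference.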
private Object extendArray(Object array, int currentSize, int requiredSize)
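The summary describes this method as ensuring the capacity of an array and allocating a new one if necessary. A minimal sketch of that pattern for a `char[]` buffer follows; the class name `ExtendArraySketch`, the method name `ensureCapacity` and the doubling growth policy are assumptions made for the example, not details taken from the parser.

```java
final class ExtendArraySketch {
    /**
     * Returns an array with room for requiredSize elements. The original array is
     * returned when it is already large enough; otherwise the first currentSize
     * elements are copied into a larger array.
     */
    static char[] ensureCapacity(char[] array, int currentSize, int requiredSize) {
        if (requiredSize <= array.length) {
            return array; // already big enough, nothing to allocate
        }
        // Assumed policy: at least double, so repeated appends stay cheap on average.
        int newLength = Math.max(array.length * 2, requiredSize);
        char[] bigger = new char[newLength];
        System.arraycopy(array, 0, bigger, 0, currentSize);
        return bigger;
    }
}
```

A caller would typically write `buffer = ensureCapacity(buffer, used, used + 1)` before appending one character.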
boolean isStandalone()
private int getContentType(XmlParser.ElementDecl element, int defaultType)
public int getElementContentType(String name)
name - The element type name.
CONTENT_UNDECLARED
,
CONTENT_ANY
,
CONTENT_EMPTY
,
CONTENT_MIXED
,
CONTENT_ELEMENTS
private void setElement(String name, int contentType, String contentModel, HashMap<String,XmlParser.AttributeDecl> attributes) throws SAXException
SAXException
private HashMap<String,XmlParser.AttributeDecl> getElementAttributes(String name)
private Iterator<String> declaredAttributes(XmlParser.ElementDecl element)
elname - The name of the element type.
getAttributeType(java.lang.String, java.lang.String)
,
getAttributeEnumeration(java.lang.String, java.lang.String)
,
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
getAttributeDefaultValue(java.lang.String, java.lang.String)
,
#getAttributeExpandedValue
public Iterator<String> declaredAttributes(String elname)
elname - The name of the element type.
getAttributeType(java.lang.String, java.lang.String)
,
getAttributeEnumeration(java.lang.String, java.lang.String)
,
getAttributeDefaultValueType(java.lang.String, java.lang.String)
,
getAttributeDefaultValue(java.lang.String, java.lang.String)
,
#getAttributeExpandedValue
public String getAttributeType(String name, String aname)
name - The name of the associated element.
aname - The name of the attribute.
public String getAttributeEnumeration(String name, String aname)
name - The name of the associated element.
aname - The name of the attribute.
public String getAttributeDefaultValue(String name, String aname)
name
- The name of the associated element.
aname
- The name of the attribute.
#getAttributeExpandedValue
public int getAttributeDefaultValueType(String name, String aname)
ATTRIBUTE_DEFAULT_SPECIFIED
,
ATTRIBUTE_DEFAULT_IMPLIED
,
ATTRIBUTE_DEFAULT_REQUIRED
,
ATTRIBUTE_DEFAULT_FIXED
private void setAttribute(String elName, String name, String type, String enumeration, String value, int valueType) throws Exception
Exception
private XmlParser.AttributeDecl getAttribute(String elName, String name)
public int getEntityType(String ename)
ENTITY_UNDECLARED
,
ENTITY_INTERNAL
,
ENTITY_NDATA
,
ENTITY_TEXT
public XmlParser.ExternalIdentifiers getEntityIds(String ename)
ename
- The name of the external entity.
getEntityType(java.lang.String)
public String getEntityValue(String ename)
ename
- The name of the internal entity.
getEntityType(java.lang.String)
private void setInternalEntity(String eName, String value) throws SAXException
SAXException
private void setExternalEntity(String eName, int eClass, XmlParser.ExternalIdentifiers ids, String nName)
private void setNotation(String nname, XmlParser.ExternalIdentifiers ids) throws SAXException
SAXException
public int getLineNumber()
public int getColumnNumber()
private char readCh() throws SAXException, IOException
The readDataChunk () method maintains the buffer.
If we hit the end of an entity, try to pop the stack and keep going.
(This approach doesn't really enforce XML's rules about entity boundaries, but this is not currently a validating parser.)
This routine also attempts to keep track of the current position in external entities, but it's not entirely accurate.
SAXException
IOException
(char)
,
readDataChunk()
,
readBuffer
,
line
private void unread(char c) throws SAXException
This method usually pushes the character back onto the readBuffer.
I don't think that this would ever be called with readBufferPos = 0, because the methods always read a character before unreading it, but just in case, I've added a boundary condition.
c
- The character to push back.
SAXException
readCh()
,
(char[])
,
readBuffer
private void unread(char[] ch, int length) throws SAXException
NOTE: you must never push back characters that you haven't actually read: use pushString () instead.
SAXException
readCh()
,
(char)
,
readBuffer
,
pushString(java.lang.String, java.lang.String)
private void pushURL(boolean isPE, String ename, XmlParser.ExternalIdentifiers ids, Reader aReader, InputStream aStream, String aEncoding, boolean doResolve) throws SAXException, IOException
url
- The java.net.URL object for the entity.
SAXException
IOException
SAXDriver.resolveEntity(boolean, java.lang.String, org.xml.sax.InputSource, java.lang.String)
,
pushString(java.lang.String, java.lang.String)
,
sourceType
,
pushInput(java.lang.String)
,
detectEncoding()
,
sourceType
,
readBuffer
private String tryEncodingDecl(String encoding) throws SAXException, IOException
Because this part starts to fill parser buffers with this data, it's tricky to set up a reader so that Java's built-in decoders can be used for the character encodings that aren't built into this parser (such as EUC-JP, KOI8-R, Big5, etc.).
SAXException
IOException
detectEncoding
private void warnAboutLackOfEncodingDecl(String encoding) throws SAXException
characterEncoding
-
SAXException
private void detectEncoding() throws SAXException, IOException
The trick here (as suggested in the XML standard) is that any entity not in UTF-8, or in UCS-2 with a byte-order mark, must begin with an XML declaration or an encoding declaration; we simply have to look for "<?xml" in various encodings.
This method has no way to distinguish among 8-bit encodings. Instead, it sets up for UTF-8, then (possibly) revises its assumption later in setupDecoding (). Any ASCII-derived 8-bit encoding should work, but most will be rejected later by setupDecoding ().
SAXException
IOException
(byte[], byte, byte, byte, byte)
,
(byte[], byte, byte)
,
#setupDecoding
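The signature check described above can be sketched roughly like this. It is a simplified illustration under stated assumptions, not the parser's code: the class EncodingSniffer, the method names, and the returned encoding labels are hypothetical, and UCS-4 signatures are left out for brevity.

```java
// Hypothetical sketch of signature-based encoding detection (not XmlParser's code).
final class EncodingSniffer {

    // Compare the first four bytes of the entity against a four-byte signature.
    static boolean matches(byte[] sig, int b1, int b2, int b3, int b4) {
        return sig[0] == (byte) b1 && sig[1] == (byte) b2
            && sig[2] == (byte) b3 && sig[3] == (byte) b4;
    }

    // Compare the first two bytes against a two-byte signature (byte-order mark).
    static boolean matches(byte[] sig, int b1, int b2) {
        return sig[0] == (byte) b1 && sig[1] == (byte) b2;
    }

    // Guess an encoding from the first four bytes. The labels are illustrative.
    static String sniff(byte[] sig) {
        if (sig.length < 4) {
            throw new IllegalArgumentException("need at least four bytes");
        }
        if (matches(sig, 0xFE, 0xFF)) return "UTF-16BE";             // byte-order mark, big-endian
        if (matches(sig, 0xFF, 0xFE)) return "UTF-16LE";             // byte-order mark, little-endian
        if (matches(sig, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE"; // "<?" as 16-bit units, no mark
        if (matches(sig, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE"; // "<?" as 16-bit units, no mark
        // "<?xm" in an ASCII-derived encoding, or no recognizable signature:
        // assume UTF-8 for now and revise once the encoding declaration is read.
        return "UTF-8";
    }
}
```

The 8-bit case deliberately falls through to a provisional UTF-8 guess, mirroring the two-stage approach in the description: the signature narrows the family of encodings, and the encoding declaration settles the rest.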
private static boolean tryEncoding(byte[] sig, byte b1, byte b2, byte b3, byte b4)
Utility routine for detectEncoding ().
Always looks for some part of "<?xml" in a specific encoding.
sig
- The first four bytes read.
b1
- The first byte of the signature.
b2
- The second byte of the signature.
b3
- The third byte of the signature.
b4
- The fourth byte of the signature.
detectEncoding()
private static boolean tryEncoding(byte[] sig, byte b1, byte b2)
Looks for a UCS-2 byte-order mark.
Utility routine for detectEncoding ().
sig
- The first four bytes read.
b1
- The first byte of the signature.
b2
- The second byte of the signature.
detectEncoding()
private void pushString(String ename, String s) throws SAXException
It is useful either as the expansion of an internal entity, or for backtracking during the parse.
Call pushCharArray () to do the actual work.
s
- The string to push back onto input.
SAXException
pushCharArray(java.lang.String, char[], int, int)
private void pushCharArray(String ename, char[] ch, int start, int length) throws SAXException
This method is useful for expanding an internal entity, or for unreading a string of characters. It creates a new readBuffer containing the characters in the array, instead of characters converted from an input byte stream.
ch
- The char array to push.
SAXException
pushString(java.lang.String, java.lang.String)
,
pushURL(boolean, java.lang.String, fi.iki.hsivonen.gnu.xml.aelfred2.XmlParser.ExternalIdentifiers, java.io.Reader, java.io.InputStream, java.lang.String, boolean)
,
readBuffer
,
sourceType
,
pushInput(java.lang.String)
private void pushInput(String ename) throws SAXException
This method saves all of the global variables associated with the current input source, so that they can be restored when a new input source has finished. It also tests for entity recursion.
The saved variables are pushed onto a stack implemented as a fixed-length array.
ename
- The name of the entity (if any) causing the new input.
SAXException
popInput()
,
sourceType
,
#externalEntity
,
readBuffer
,
readBufferPos
,
readBufferLength
,
line
,
characterEncoding
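The save-and-restore behaviour described above, including the recursion test, can be sketched as follows. This is only an illustration under assumptions: the class InputStack and its reduced set of fields are hypothetical, it uses an ArrayDeque where the description mentions a fixed-length array, and it throws unchecked exceptions rather than SAXException or EOFException.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of an input-source stack (not XmlParser's code).
final class InputStack {

    // Snapshot of the state belonging to one input source.
    private static final class Saved {
        final String entityName;
        final char[] readBuffer;
        final int readBufferPos;
        final int line;

        Saved(String entityName, char[] readBuffer, int readBufferPos, int line) {
            this.entityName = entityName;
            this.readBuffer = readBuffer;
            this.readBufferPos = readBufferPos;
            this.line = line;
        }
    }

    private final Deque<Saved> stack = new ArrayDeque<>();

    // State of the input source currently being read.
    String entityName;
    char[] readBuffer = new char[0];
    int readBufferPos;
    int line = 1;

    // Save the current source and switch to a new one; reject recursive entities.
    void push(String ename, char[] newBuffer) {
        if (ename != null) {
            if (ename.equals(entityName)) {
                throw new IllegalStateException("recursive reference to entity " + ename);
            }
            for (Saved s : stack) {
                if (ename.equals(s.entityName)) {
                    throw new IllegalStateException("recursive reference to entity " + ename);
                }
            }
        }
        stack.push(new Saved(entityName, readBuffer, readBufferPos, line));
        entityName = ename;
        readBuffer = newBuffer;
        readBufferPos = 0;
        line = 1;
    }

    // Restore the most recently saved source; throws NoSuchElementException if none is left.
    void pop() {
        Saved s = stack.pop();
        entityName = s.entityName;
        readBuffer = s.readBuffer;
        readBufferPos = s.readBufferPos;
        line = s.line;
    }
}
```

An entity is considered recursive when its name is already active anywhere on the stack, which is why the check walks every saved entry rather than only the top one.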
private void popInput() throws SAXException, IOException
This method restores all of the global variables associated with the current input source.
EOFException
- If there are no more entries on the input stack.
SAXException
IOException
pushInput(java.lang.String)
,
sourceType
,
readBuffer
,
readBufferPos
,
readBufferLength
,
line
,
characterEncoding
private boolean tryRead(char delim) throws SAXException, IOException
Note that the character will be removed from the input stream on success, but will be put back on failure. Do not attempt to read the character again if the method succeeds.
delim
- The character that should appear next. For a case-insensitive match, you must supply this in upper case.
SAXException
IOException
(String)
private boolean tryRead(String delim) throws SAXException, IOException
This is simply a convenience method.
Note that the string will be removed from the input stream on success, but will be put back on failure. Do not attempt to read the string again if the method succeeds.
This method will push back a character rather than an array whenever possible (probably the majority of cases).
delim
- The string that should appear next.
SAXException
IOException
(char)
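The consume-on-success, restore-on-failure contract described above can be sketched like this. It is a simplified model, not the parser's code: the class Lookahead is hypothetical and works on an in-memory char array, so "pushing back" is modelled by rewinding a position index instead of calling unread().

```java
// Hypothetical sketch of tryRead-style lookahead over an in-memory buffer.
final class Lookahead {
    private final char[] input;
    private int pos;

    Lookahead(String text) {
        this.input = text.toCharArray();
    }

    // Consume delim if it is the next character; otherwise leave the input untouched.
    boolean tryRead(char delim) {
        if (pos < input.length && input[pos] == delim) {
            pos++;
            return true;
        }
        return false;
    }

    // Consume the whole string if it appears next; on any mismatch, rewind to the start.
    boolean tryRead(String delim) {
        int mark = pos;
        for (int i = 0; i < delim.length(); i++) {
            if (!tryRead(delim.charAt(i))) {
                pos = mark; // "put back" everything consumed so far
                return false;
            }
        }
        return true;
    }

    int position() {
        return pos;
    }
}
```

A caller can therefore probe for several alternatives in sequence, for example a comment opener and then a DOCTYPE opener, without tracking what a failed probe consumed.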
private boolean tryRead(char[] ch) throws SAXException, IOException
SAXException
IOException
private boolean tryWhitespace() throws SAXException, IOException
This is simply a convenience method.
This method will push back a character rather than an array whenever possible (probably the majority of cases).
SAXException
IOException
private void parseUntil(char[] delim) throws SAXException, IOException
SAXException
IOException
private void prefetchASCIIEncodingDecl() throws SAXException, IOException
SAXException
IOException
private void readDataChunk() throws SAXException, IOException
This is simply a front-end that fills the rawReadBuffer with bytes, then calls the appropriate encoding handler.
SAXException
IOException
characterEncoding
,
rawReadBuffer
,
readBuffer
,
filterCR(boolean)
,
#copyUtf8ReadBuffer
,
#copyIso8859_1ReadBuffer
,
#copyUcs_2ReadBuffer
,
#copyUcs_4ReadBuffer
private void filterCR(boolean moreData)
moreData
- true iff more data might come from the same source
readDataChunk()
,
readBuffer
,
readBufferOverflow
private void warnAboutPrivateUseChar() throws SAXException
SAXException
private boolean isPrivateUse(char c)
private boolean isPrivateUse(int c)
private boolean isAstralPrivateUse(int c)
private boolean isNonCharacter(int c)
private void initializeVariables()