public class Parser extends Object
Parses the document into a tree of nodes using the
NodeTokenizer. Each node is defined by a Token, which is an
offset range in the document. Attributes in beginning
nodes are also parsed into token offsets by the AttributeTokenizer.
The resulting document tree represents the nodes in the target document. The
document can be an HTML fragment that is not well-formed or an XML
fragment of an XHTML document.
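
The idea of a node being defined by an offset range, rather than by a copy of the text, can be sketched as follows. This is not the Clay NodeTokenizer or Token class: `OffsetTokenSketch`, `SimpleToken`, and `tokenize` are hypothetical names, and the splitting rule (anything between `<` and `>` is one token, any other run of characters is body text) is a deliberate simplification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Clay NodeTokenizer or Token classes.
class OffsetTokenSketch {

    // A token records where a piece of the document starts and ends
    // instead of holding its own copy of the text.
    static final class SimpleToken {
        final StringBuffer document;
        final int begin; // inclusive offset
        final int end;   // exclusive offset

        SimpleToken(StringBuffer document, int begin, int end) {
            this.document = document;
            this.begin = begin;
            this.end = end;
        }

        // Resolve the text on demand from the shared document buffer.
        String text() {
            return document.substring(begin, end);
        }
    }

    // Simplified rule: "<...>" is one token, and any run of characters
    // between tags is one body-text token.
    static List<SimpleToken> tokenize(StringBuffer document) {
        List<SimpleToken> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < document.length()) {
            int end;
            if (document.charAt(pos) == '<') {
                int close = document.indexOf(">", pos);
                // An unterminated tag runs to the end of the document.
                end = (close < 0) ? document.length() : close + 1;
            } else {
                int open = document.indexOf("<", pos);
                end = (open < 0) ? document.length() : open;
            }
            tokens.add(new SimpleToken(document, pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        StringBuffer doc = new StringBuffer("<p>Hi<br></p>");
        for (SimpleToken t : tokenize(doc)) {
            System.out.println(t.begin + "-" + t.end + " " + t.text());
        }
    }
}
```

Because each token only stores offsets into the shared buffer, later passes (for example, attribute discovery) can narrow the range without copying the document.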
| Modifier and Type | Field and Description |
|---|---|
| static org.apache.shale.clay.parser.Parser.Rule[] | BEGIN_CDATA_RULES: Declare an array of Parser.Rules that validate a begin CDATA Token. |
| static org.apache.shale.clay.parser.Parser.Rule[] | BEGIN_COMMENT_TAG_RULES: Declare an array of Parser.Rules that validate a begin comment Token. |
| static org.apache.shale.clay.parser.Parser.Rule[] | BEGIN_TAG_RULES: Declare an array of Parser.Rules that validate a beginning Token. |
| static org.apache.shale.clay.parser.Parser.Rule[] | DOCTYPE_TAG_RULES: Declare an array of Parser.Rules that validate a document type Token. |
| static org.apache.shale.clay.parser.Parser.Rule[] | END_CDATA_RULES: Declare an array of Parser.Rules that validate an end CDATA Token. |
| static String | END_CHARSET_TOKEN: The end of the comment token used to override the template encoding type. |
| static org.apache.shale.clay.parser.Parser.Rule[] | END_COMMENT_TAG_RULES: Declare an array of Parser.Rules that validate an end comment Token. |
| static String | START_CHARSET_TOKEN: The start of the comment token used to override the template encoding type. |
| Constructor and Description |
|---|
| Parser() |
| Modifier and Type | Method and Description |
|---|---|
| protected Node | buildNode(Token token) |
| protected void | discoverNodeAttributes(Node node): If the Node is a starting tag and not a comment, use the AttributeTokenizer to realize the node attributes. |
| protected void | discoverNodeName(Node node) |
| protected void | discoverNodeOverrides(Node node) |
| protected void | discoverNodeShape(Node node): Determine if the Node is a starting, ending, or body text tag. |
| protected Node | findBeginingNode(Node current, Node node) |
| protected boolean | isNodeNameEqual(Node node1, Node node2): Compares two Node instances by name. |
| protected boolean | isOptionalEndingTag(String nodeName): Determines if an HTML nodeName is a type of tag that can optionally have an ending tag. |
| protected boolean | isSelfTerminating(String nodeName): Checks to see if the nodeName is within the SELF_TERMINATING table of values. |
| protected boolean | isValidOptionalEndingTagParent(String nodeName, String parentNodeName): Checks to see if an optional ending tag has a valid parent. |
| List | parse(StringBuffer document): Parse a document fragment into graphs of Node. |
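
The methods above suggest a stack-based approach: beginning tags are pushed, ending tags are matched by name against the stack, and self-terminating tags never wait for an ending tag. The sketch below shows that general technique under stated assumptions. It is not the Clay Parser implementation: `LenientTreeSketch`, `SimpleNode`, and `build` are hypothetical names, the input is a pre-split list of tag names rather than a document, and the sample self-terminating set is assumed because the real SELF_TERMINATING table is not listed here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the Clay Parser implementation.
class LenientTreeSketch {

    // Assumed sample values; the real SELF_TERMINATING table may differ.
    static final Set<String> SELF_TERMINATING_SAMPLE =
            new HashSet<>(Arrays.asList("br", "hr", "img", "input"));

    static final class SimpleNode {
        final String name;
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name) {
            this.name = name;
        }

        // Render as name(child child ...) for easy inspection.
        @Override
        public String toString() {
            if (children.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    static boolean isSelfTerminating(String nodeName) {
        return SELF_TERMINATING_SAMPLE.contains(nodeName.toLowerCase());
    }

    // Input is a list of tag names: "div" begins a node, "/div" ends it.
    static List<SimpleNode> build(List<String> tags) {
        List<SimpleNode> roots = new ArrayList<>();
        Deque<SimpleNode> stack = new ArrayDeque<>();
        for (String tag : tags) {
            if (tag.startsWith("/")) {
                String name = tag.substring(1);
                // Look from the top of the stack down for the beginning node.
                boolean found = false;
                for (SimpleNode open : stack) {
                    if (open.name.equalsIgnoreCase(name)) {
                        found = true;
                        break;
                    }
                }
                if (!found) {
                    continue; // stray ending tag: ignore it
                }
                // Pop through the match; nodes above it are closed implicitly.
                SimpleNode popped;
                do {
                    popped = stack.pop();
                } while (!popped.name.equalsIgnoreCase(name));
            } else {
                SimpleNode node = new SimpleNode(tag);
                if (stack.isEmpty()) {
                    roots.add(node);
                } else {
                    stack.peek().children.add(node);
                }
                if (!isSelfTerminating(tag)) {
                    stack.push(node);
                }
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        // prints [ul(li(br)), p]
        System.out.println(build(Arrays.asList("ul", "li", "br", "/ul", "p")));
    }
}
```

The result is a list of roots rather than a single node, which mirrors why `parse` returns a List: a fragment that is not well-formed can contain several top-level nodes.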
public static final String START_CHARSET_TOKEN
The start of the comment token used to override the template encoding type.
public static final String END_CHARSET_TOKEN
The end of the comment token used to override the template encoding type.
public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_CDATA_RULES
Declare an array of Parser.Rules that validate a begin CDATA Token.
public static final org.apache.shale.clay.parser.Parser.Rule[] END_CDATA_RULES
Declare an array of Parser.Rules that validate an end CDATA Token.
public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_COMMENT_TAG_RULES
Declare an array of Parser.Rules that validate a begin comment Token.
public static final org.apache.shale.clay.parser.Parser.Rule[] END_COMMENT_TAG_RULES
Declare an array of Parser.Rules that validate an end comment Token.
public static final org.apache.shale.clay.parser.Parser.Rule[] DOCTYPE_TAG_RULES
Declare an array of Parser.Rules that validate a document type Token.
public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_TAG_RULES
Declare an array of Parser.Rules that validate a beginning Token.
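
One way an array of rules could validate a token is for each rule to check a single character at a fixed offset, with the token accepted only when every rule matches. The sketch below illustrates that idea. The structure of Parser.Rule is not documented on this page, so `RuleArraySketch`, `CharRule`, `validate`, and both sample rule arrays are assumptions made for illustration.

```java
// Hypothetical illustration only; the real Parser.Rule class may look different.
class RuleArraySketch {

    // One rule: a character expected at a fixed offset, counted from the
    // start of the token (fromBegin = true) or from its end (fromBegin = false).
    static final class CharRule {
        final char expected;
        final boolean fromBegin;
        final int offset;

        CharRule(char expected, boolean fromBegin, int offset) {
            this.expected = expected;
            this.fromBegin = fromBegin;
            this.offset = offset;
        }

        boolean matches(CharSequence token) {
            int index = fromBegin ? offset : token.length() - 1 - offset;
            return index >= 0 && index < token.length()
                    && token.charAt(index) == expected;
        }
    }

    // Assumed rule set: a begin comment token starts with "<!--".
    static final CharRule[] BEGIN_COMMENT_SAMPLE = {
        new CharRule('<', true, 0),
        new CharRule('!', true, 1),
        new CharRule('-', true, 2),
        new CharRule('-', true, 3),
    };

    // Assumed rule set: an end comment token ends with "-->".
    static final CharRule[] END_COMMENT_SAMPLE = {
        new CharRule('>', false, 0),
        new CharRule('-', false, 1),
        new CharRule('-', false, 2),
    };

    // A token is valid only if every rule in the array matches.
    static boolean validate(CharSequence token, CharRule[] rules) {
        for (CharRule rule : rules) {
            if (!rule.matches(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(validate("<!-- note", BEGIN_COMMENT_SAMPLE));
        System.out.println(validate("<div>", BEGIN_COMMENT_SAMPLE));
        System.out.println(validate("note -->", END_COMMENT_SAMPLE));
    }
}
```

Keeping the rules as data makes it straightforward to declare one array per token kind, as the constants above do for CDATA, comment, document type, and beginning tags.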
protected boolean isOptionalEndingTag(String nodeName)
Determines if an HTML nodeName is a type of tag that can optionally have an ending tag.
Parameters:
nodeName - the name of the HTML node
Returns:
true if the nodeName is in the OPTIONAL-ENDING_TAG array; otherwise, false is returned
protected boolean isValidOptionalEndingTagParent(String nodeName, String parentNodeName)
Checks to see if an optional ending tag has a valid parent. This is used to detect an implicit ending tag.
Parameters:
nodeName - name of the optional ending tag
parentNodeName - name of the parent
Returns:
true if the parentNodeName is a valid parent for the nodeName; otherwise, a false value is returned
protected Node findBeginingNode(Node current, Node node)
Parameters:
current - top of the stack
node - ending node
public List parse(StringBuffer document)
Parse a document fragment into graphs of Node. The resulting
type is a list because the fragment might not be well-formed.
Parameters:
document - input source
Returns:
Node
protected boolean isNodeNameEqual(Node node1, Node node2)
Compares two Node instances by name.
This method is used to match a beginning tag with an ending tag
while building the document stack. Returns true if
the node name properties are the same.
Parameters:
node1 - first node
node2 - second node
Returns:
true if they are the same
protected boolean isSelfTerminating(String nodeName)
Checks to see if the nodeName is within the SELF_TERMINATING
table of values.
Parameters:
nodeName - to check for self termination
Returns:
true if self terminating; otherwise, false
protected Node buildNode(Token token)
Parameters:
token - node offset in the document
protected void discoverNodeShape(Node node)
Determine if the Node is a starting, ending, or body text
tag. The array of Parser.Shapes is used to determine the type of
Node the Token represents.
Parameters:
node - target node
protected void discoverNodeName(Node node)
Parameters:
node - target
protected void discoverNodeAttributes(Node node)
If the Node is a starting tag and not a comment,
use the AttributeTokenizer to realize the node attributes.
Parameters:
node - target

Copyright © 2004-2013 Apache Software Foundation. All Rights Reserved.