Class EndTag
- All Implemented Interfaces:
java.lang.CharSequence, java.lang.Comparable<Segment>
public final class EndTag extends Tag
Represents the end tag of an element in a specific source document.
An end tag always has a type that is a subclass of EndTagType, meaning it always starts with the characters '</'.
EndTag instances are obtained using one of the following methods:
- Element.getEndTag()
- Tag.getNextTag()
- Tag.getPreviousTag()
- Source.getPreviousEndTag(int pos)
- Source.getPreviousEndTag(int pos, String name)
- Source.getPreviousTag(int pos)
- Source.getPreviousTag(int pos, TagType)
- Source.getNextEndTag(int pos)
- Source.getNextEndTag(int pos, String name)
- Source.getNextEndTag(int pos, String name, EndTagType)
- Source.getNextTag(int pos)
- Source.getNextTag(int pos, TagType)
- Source.getEnclosingTag(int pos)
- Source.getEnclosingTag(int pos, TagType)
- Source.getTagAt(int pos)
- Segment.getAllTags()
- Segment.getAllTags(TagType)
The Tag superclass defines the getName() method used to get the name of this end tag.
See also the XML 1.0 specification for end tags.
-
Method Summary
Modifier and Type, Method, Description:
- static java.lang.String generateHTML(java.lang.String tagName): Generates the HTML text of a normal end tag with the specified tag name.
- java.lang.String getDebugInfo(): Returns a string representation of this object useful for debugging purposes.
- Element getElement(): Returns the element that is ended by this end tag.
- EndTagType getEndTagType(): Returns the type of this end tag.
- TagType getTagType(): Returns the type of this tag.
- boolean isUnregistered(): Indicates whether this tag has a syntax that does not match any of the registered tag types.
- java.lang.String tidy(): Returns an XML representation of this end tag.
-
Methods inherited from class net.htmlparser.jericho.Tag
getName, getNameSegment, getNextTag, getPreviousTag, getUserData, isXMLName, isXMLNameChar, isXMLNameStartChar, setUserData
-
Methods inherited from class net.htmlparser.jericho.Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getMaxDepthIndicator, getNodeIterator, getRenderer, getRowColumnVector, getSource, getStyleURISegments, getTextExtractor, getURIAttributes, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
-
Method Detail
-
getElement
public Element getElement()
Returns the element that is ended by this end tag.
Returns null if this end tag is not properly matched to any start tag in the source document.
This method is much less efficient than the StartTag.getElement() method.
IMPLEMENTATION NOTE: This method is relatively inefficient because more than one start tag type can have the same corresponding end tag type, so it is not possible to know for certain which type of start tag this end tag is matched to (see EndTagType.getCorrespondingStartTagType() for more explanation). Because of this uncertainty, the implementation of this method must check every start tag preceding this end tag, calling its StartTag.getElement() method to see whether it is terminated by this end tag.
- Specified by: getElement in class Tag
- Returns: the element that is ended by this end tag.
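The backward scan described in the implementation note can be pictured with a small stand-alone model. This is only a sketch and not the library's code: the class EndTagMatchSketch, the StartTagModel type and the findOwner method are hypothetical names, and the model assumes each start tag already knows the position of the end tag that terminates its element (which is what StartTag.getElement() would determine in the real library).

```java
import java.util.Arrays;
import java.util.List;

public class EndTagMatchSketch {
    // Hypothetical stand-in for a start tag: its name, its position in the
    // source, and the position of the end tag that terminates its element
    // (-1 if the element has no end tag).
    public static final class StartTagModel {
        public final String name;
        public final int begin;
        public final int endTagBegin;

        public StartTagModel(String name, int begin, int endTagBegin) {
            this.name = name;
            this.begin = begin;
            this.endTagBegin = endTagBegin;
        }
    }

    // Checks the start tags that precede the end tag, nearest first, and
    // returns the one whose element is terminated by the end tag located at
    // endTagBegin. Assumes startTags is ordered by position in the source.
    public static StartTagModel findOwner(List<StartTagModel> startTags, int endTagBegin) {
        for (int i = startTags.size() - 1; i >= 0; i--) {
            StartTagModel candidate = startTags.get(i);
            if (candidate.begin >= endTagBegin) {
                continue; // this start tag does not precede the end tag
            }
            if (candidate.endTagBegin == endTagBegin) {
                return candidate;
            }
        }
        return null; // mirrors the documented null result for an unmatched end tag
    }

    public static void main(String[] args) {
        // Model of "<div><p>text</p></div>": </p> begins at 12, </div> at 16.
        List<StartTagModel> startTags = Arrays.asList(
                new StartTagModel("div", 0, 16),
                new StartTagModel("p", 5, 12));
        System.out.println(findOwner(startTags, 16).name);
        System.out.println(findOwner(startTags, 12).name);
        System.out.println(findOwner(startTags, 30));
    }
}
```

In this model the cost grows with the number of preceding start tags, which is the reason the documentation recommends StartTag.getElement() when the start tag is already at hand.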
-
getEndTagType
public EndTagType getEndTagType()
Returns the type of this end tag.
This is equivalent to (EndTagType)getTagType().
- Returns: the type of this end tag.
-
getTagType
public TagType getTagType()
Description copied from class: Tag
Returns the type of this tag.
- Specified by: getTagType in class Tag
- Returns: the type of this tag.
-
isUnregistered
public boolean isUnregistered()
Description copied from class: Tag
Indicates whether this tag has a syntax that does not match any of the registered tag types.
The only requirement of an unregistered tag type is that it starts with '<' and there is a closing '>' character at some position after it in the source document.
The absence or presence of a '/' character after the initial '<' determines whether an unregistered tag is respectively a StartTag with a type of StartTagType.UNREGISTERED or an EndTag with a type of EndTagType.UNREGISTERED.
There are no restrictions on the characters that might appear between these delimiters, including other '<' characters. This may result in a '>' character that is identified as the closing delimiter of two separate tags, one an unregistered tag, and the other a tag of any type that begins in the middle of the unregistered tag. As explained below, unregistered tags are usually only found when specifically looking for them, so it is up to the user to detect and deal with any such nonsensical results.
Unregistered tags are only returned by the Source.getTagAt(int pos) method, named search methods, where the specified name matches the first characters inside the tag, and by tag type search methods, where the specified tagType is either StartTagType.UNREGISTERED or EndTagType.UNREGISTERED.
Open tag searches and other searches always ignore unregistered tags, although every discovery of an unregistered tag is logged by the parser.
The logic behind this design is that unregistered tag types are usually the result of a '<' character in the text that was mistakenly left unencoded, or a less-than operator inside a script, or some other occurrence which is of no interest to the user. By returning unregistered tags in named and tag type search methods, the library allows the user to specifically search for tags with a certain syntax that does not match any existing TagType. This expediency feature avoids the need for the user to create a custom tag type to define the syntax before searching for these tags. By not returning unregistered tags in the less specific search methods, it is providing only the information that most users are interested in.
- Specified by: isUnregistered in class Tag
- Returns: true if this tag has a syntax that does not match any of the registered tag types, otherwise false.
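The delimiter rules above can be tried out with a small stand-alone model. This is a sketch under stated assumptions, not the parser's implementation: the class UnregisteredTagSketch and its methods are hypothetical, and the model applies only the minimal '<' ... '>' requirement without first checking whether the text matches a registered tag type, as the real parser would.

```java
public class UnregisteredTagSketch {
    // Returns the index of the closing '>' for text starting with '<' at pos,
    // or -1 if pos does not hold a '<' or no '>' follows it.
    public static int closingDelimiter(String source, int pos) {
        if (pos < 0 || pos >= source.length() || source.charAt(pos) != '<') {
            return -1;
        }
        return source.indexOf('>', pos + 1);
    }

    // Classifies the text at pos as "START" or "END" using only the rule about
    // the '/' character after the initial '<'. Returns null when the minimal
    // requirement for an unregistered tag is not met.
    public static String classify(String source, int pos) {
        if (closingDelimiter(source, pos) == -1) {
            return null;
        }
        boolean slashFollows = pos + 1 < source.length() && source.charAt(pos + 1) == '/';
        return slashFollows ? "END" : "START";
    }

    public static void main(String[] args) {
        // An unencoded less-than sign followed later by a '>' character.
        System.out.println(classify("a < b and c > d", 2));
        // A '/' directly after '<' makes it an end tag in this model.
        System.out.println(classify("</ foo>", 0));
        // Both '<' characters share the same closing '>' at index 5.
        System.out.println(closingDelimiter("<1 <b>", 0));
        System.out.println(closingDelimiter("<1 <b>", 3));
    }
}
```

The last two calls show the overlap the documentation warns about: a tag that begins inside an unregistered tag ends at the same '>' character.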
-
tidy
public java.lang.String tidy()
Returns an XML representation of this end tag.
The tidying of the tag is carried out as follows:
- if this end tag is a NORMAL end tag then any white space before the closing angle bracket is removed.
- otherwise the original source text of the entire tag is returned.
- Specified by: tidy in class Tag
- Returns: an XML representation of this end tag.
- See Also: StartTag.tidy()
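The two tidying rules can be modelled in a few lines of plain Java. This is a sketch only: the class EndTagTidySketch and the isNormal flag are hypothetical stand-ins for the library's own type check, and the regular expression's notion of white space (\s) is an assumption that may differ from the library's definition.

```java
public class EndTagTidySketch {
    // Applies the documented rules to the raw source text of an end tag:
    // a normal end tag loses any white space before its closing '>',
    // any other end tag is returned unchanged.
    public static String tidy(String endTagText, boolean isNormal) {
        if (!isNormal) {
            return endTagText;
        }
        return endTagText.replaceFirst("\\s+>$", ">");
    }

    public static void main(String[] args) {
        System.out.println(tidy("</div  >", true));
        System.out.println(tidy("</div\n>", true));
        System.out.println(tidy("<![endif]  >", false));
    }
}
```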
-
generateHTML
public static java.lang.String generateHTML(java.lang.String tagName)
Generates the HTML text of a normal end tag with the specified tag name.
- Example:
The following method call:
EndTag.generateHTML("INPUT")
returns the following output:
</INPUT>
- Parameters: tagName - the name of the end tag.
- Returns: the HTML text of a normal end tag with the specified tag name.
- See Also: StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
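Judging from the documented example, the output is the tag name wrapped in '</' and '>'. The stand-alone sketch below reproduces that shape; the class EndTagGenerateSketch and the method generateEndTag are hypothetical, and the sketch assumes the name is emitted exactly as given, which is all the example confirms.

```java
public class EndTagGenerateSketch {
    // Builds the text of a normal end tag from a tag name, leaving the
    // name's case and characters untouched.
    public static String generateEndTag(String tagName) {
        return "</" + tagName + ">";
    }

    public static void main(String[] args) {
        System.out.println(generateEndTag("INPUT")); // prints </INPUT>
    }
}
```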
-
getDebugInfo
public java.lang.String getDebugInfo()
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.
- Overrides: getDebugInfo in class Segment
- Returns: a string representation of this object useful for debugging purposes.