Package org.apache.lucene.analysis.ja
Class Token
- java.lang.Object
-
- org.apache.lucene.analysis.ja.Token
-
public class Token extends java.lang.Object
Analyzed token with morphological data from its dictionary.
-
-
Field Summary
Modifier and Type                 Field
private Dictionary                dictionary
private int                       length
private int                       offset
private int                       position
private int                       positionLength
private char[]                    surfaceForm
private JapaneseTokenizer.Type    type
private int                       wordId
-
Constructor Summary
Constructor
Token(int wordId, char[] surfaceForm, int offset, int length, JapaneseTokenizer.Type type, int position, Dictionary dictionary)
-
Method Summary
Modifier and Type         Method                                   Description
java.lang.String          getBaseForm()
java.lang.String          getInflectionForm()
java.lang.String          getInflectionType()
int                       getLength()
int                       getOffset()
java.lang.String          getPartOfSpeech()
int                       getPosition()                            Get the index of this token in the input text.
int                       getPositionLength()                      Get the length (in tokens) of this token.
java.lang.String          getPronunciation()
java.lang.String          getReading()
char[]                    getSurfaceForm()
java.lang.String          getSurfaceFormString()
JapaneseTokenizer.Type    getType()                                Returns the type of this token.
boolean                   isKnown()                                Returns true if this token is a known word.
boolean                   isUnknown()                              Returns true if this token is an unknown word.
boolean                   isUser()                                 Returns true if this token is defined in the user dictionary.
void                      setPositionLength(int positionLength)    Set the position length (in tokens) of this token.
java.lang.String          toString()
-
-
-
Field Detail
-
dictionary
private final Dictionary dictionary
-
wordId
private final int wordId
-
surfaceForm
private final char[] surfaceForm
-
offset
private final int offset
-
length
private final int length
-
position
private final int position
-
positionLength
private int positionLength
-
type
private final JapaneseTokenizer.Type type
-
-
Constructor Detail
-
Token
public Token(int wordId, char[] surfaceForm, int offset, int length, JapaneseTokenizer.Type type, int position, Dictionary dictionary)
-
-
Method Detail
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
getSurfaceForm
public char[] getSurfaceForm()
- Returns:
- surfaceForm
-
getOffset
public int getOffset()
- Returns:
- offset into surfaceForm
-
getLength
public int getLength()
- Returns:
- length of surfaceForm
-
getSurfaceFormString
public java.lang.String getSurfaceFormString()
- Returns:
- surfaceForm as a String
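
The three accessors above fit together: getSurfaceForm() returns a character buffer, while getOffset() and getLength() identify the slice of that buffer belonging to this token. The following is a minimal sketch of that relationship, not the Lucene implementation; the class SurfaceSliceSketch and its helper method are hypothetical stand-ins, and it assumes the string form is simply the slice [offset, offset + length) of the buffer.

```java
// Hypothetical stand-in used only for illustration; it is not the Lucene Token class.
final class SurfaceSliceSketch {

    // Assumption: offset and length select a slice of the surfaceForm buffer.
    static String surfaceFormString(char[] surfaceForm, int offset, int length) {
        return new String(surfaceForm, offset, length);
    }

    public static void main(String[] args) {
        // Six kanji given as Unicode escapes: Kansai Kokusai Kuukou (Kansai International Airport).
        char[] buffer = "\u95a2\u897f\u56fd\u969b\u7a7a\u6e2f".toCharArray();
        // Slice out the middle word (kokusai, "international"): offset 2, length 2.
        System.out.println(surfaceFormString(buffer, 2, 2));
    }
}
```

With a real token, the equivalent call would be new String(token.getSurfaceForm(), token.getOffset(), token.getLength()), which under the assumption above should match getSurfaceFormString().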
-
getReading
public java.lang.String getReading()
- Returns:
- the reading, or null if the token has no reading
-
getPronunciation
public java.lang.String getPronunciation()
- Returns:
- the pronunciation, or null if the token has no pronunciation
-
getPartOfSpeech
public java.lang.String getPartOfSpeech()
- Returns:
- part of speech.
-
getInflectionType
public java.lang.String getInflectionType()
- Returns:
- inflection type or null
-
getInflectionForm
public java.lang.String getInflectionForm()
- Returns:
- inflection form or null
-
getBaseForm
public java.lang.String getBaseForm()
- Returns:
- base form or null if token is not inflected
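
Several of the morphological accessors above (getReading, getPronunciation, getInflectionType, getInflectionForm, getBaseForm) may return null, so callers need a fallback. The sketch below shows one common choice, falling back to the surface form when no base form exists. The class and method names are hypothetical and the example works on plain strings rather than on a Token instance.

```java
// Hypothetical helper for illustration; not part of the Lucene API.
final class NullSafeAttributesSketch {

    // Returns the base form when present, otherwise the surface form.
    static String baseFormOrSurface(String baseForm, String surfaceForm) {
        return baseForm != null ? baseForm : surfaceForm;
    }

    public static void main(String[] args) {
        // Uninflected word: no base form is available, so the surface form is used.
        System.out.println(baseFormOrSurface(null, "neko"));
        // Inflected word: the base form is preferred over the surface form.
        System.out.println(baseFormOrSurface("taberu", "tabeta"));
    }
}
```

With a real token, the call would be baseFormOrSurface(token.getBaseForm(), token.getSurfaceFormString()).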
-
getType
public JapaneseTokenizer.Type getType()
Returns the type of this token.
- Returns:
- token type, not null
-
isKnown
public boolean isKnown()
Returns true if this token is a known word.
- Returns:
- true if this token is in the standard dictionary, false otherwise
-
isUnknown
public boolean isUnknown()
Returns true if this token is an unknown word.
- Returns:
- true if this token is an unknown word, false otherwise
-
isUser
public boolean isUser()
Returns true if this token is defined in the user dictionary.
- Returns:
- true if this token is in the user dictionary, false otherwise
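
The three predicates isKnown(), isUnknown() and isUser() can be read as views of the value returned by getType(). The sketch below assumes that each predicate is a plain comparison against one constant of JapaneseTokenizer.Type (KNOWN, UNKNOWN, USER), so exactly one of them holds for any token. The enum and class here are local stand-ins, not the Lucene types.

```java
// Hypothetical stand-in for illustration; not the Lucene Token class.
final class TokenTypeSketch {

    // Local stand-in mirroring the constants of JapaneseTokenizer.Type.
    enum Type { KNOWN, UNKNOWN, USER }

    // Assumption: each predicate is a comparison against a single type constant.
    static boolean isKnown(Type type)   { return type == Type.KNOWN; }
    static boolean isUnknown(Type type) { return type == Type.UNKNOWN; }
    static boolean isUser(Type type)    { return type == Type.USER; }

    public static void main(String[] args) {
        for (Type type : Type.values()) {
            System.out.println(type
                + " known=" + isKnown(type)
                + " unknown=" + isUnknown(type)
                + " user=" + isUser(type));
        }
    }
}
```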
-
getPosition
public int getPosition()
Get the index of this token in the input text.
- Returns:
- position of the token
-
setPositionLength
public void setPositionLength(int positionLength)
Set the position length (in tokens) of this token. For normal tokens this is 1; for compound tokens it's > 1.
-
getPositionLength
public int getPositionLength()
Get the length (in tokens) of this token. For normal tokens this is 1; for compound tokens it's > 1.
- Returns:
- position length of token
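
Position length is the only mutable property of a token, and it matters when a compound is emitted alongside its parts. The sketch below illustrates the idea with a hypothetical SketchToken class: a compound spanning three segments is given position length 3, while each segment keeps 1. Both the default of 1 and the rule "position length equals the number of parts" are assumptions made for this illustration, not statements about the Lucene implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the Lucene Token class.
final class PositionLengthSketch {

    static final class SketchToken {
        private final String surface;
        private int positionLength = 1; // assumption: a normal token spans one position

        SketchToken(String surface) { this.surface = surface; }

        String getSurface() { return surface; }
        int getPositionLength() { return positionLength; }
        void setPositionLength(int positionLength) { this.positionLength = positionLength; }
    }

    // Builds a compound token that spans as many positions as it has parts.
    static SketchToken compoundOver(String surface, List<SketchToken> parts) {
        SketchToken compound = new SketchToken(surface);
        compound.setPositionLength(parts.size());
        return compound;
    }

    public static void main(String[] args) {
        List<SketchToken> parts = Arrays.asList(
            new SketchToken("kansai"), new SketchToken("kokusai"), new SketchToken("kuukou"));
        SketchToken compound = compoundOver("kansaikokusaikuukou", parts);
        System.out.println(compound.getSurface() + " spans " + compound.getPositionLength() + " positions");
    }
}
```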
-
-