Class BinaryDictionary
- java.lang.Object
-
- org.apache.lucene.analysis.ko.dict.BinaryDictionary
-
- All Implemented Interfaces:
Dictionary
- Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary
public abstract class BinaryDictionary extends java.lang.Object implements Dictionary
Base class for a binary-encoded in-memory dictionary.
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
- static class BinaryDictionary.ResourceScheme: Used to specify where (dictionary) resources get loaded from.
Nested classes/interfaces inherited from interface org.apache.lucene.analysis.ko.dict.Dictionary
Dictionary.Morpheme
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- private java.nio.ByteBuffer buffer
- static java.lang.String DICT_FILENAME_SUFFIX
- static java.lang.String DICT_HEADER
- static int HAS_READING: Flag that the entry has reading data.
- static int HAS_SINGLE_POS: Flag that the entry has a single part of speech (leftPOS).
- private POS.Tag[] posDict
- static java.lang.String POSDICT_FILENAME_SUFFIX
- static java.lang.String POSDICT_HEADER
- private java.lang.String resourcePath
- private BinaryDictionary.ResourceScheme resourceScheme
- private int[] targetMap
- static java.lang.String TARGETMAP_FILENAME_SUFFIX
- static java.lang.String TARGETMAP_HEADER
- private int[] targetMapOffsets
- static int VERSION
-
Constructor Summary
Constructors (Modifier, Constructor)
- protected BinaryDictionary()
- protected BinaryDictionary(BinaryDictionary.ResourceScheme resourceScheme, java.lang.String resourcePath)
-
Method Summary
Methods (Modifier and Type, Method: Description)
- static java.io.InputStream getClassResource(java.lang.Class<?> clazz, java.lang.String suffix)
- private static java.io.InputStream getClassResource(java.lang.String path)
- int getLeftId(int wordId): Get the left id of the specified word.
- POS.Tag getLeftPOS(int wordId): Get the left POS.Tag of the specified word.
- Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len): Get the morphemes of the specified word (e.g. 가깝으나: 가깝 + 으나).
- POS.Type getPOSType(int wordId): Get the POS.Type of the specified word (morpheme, compound, inflect or pre-analysis).
- java.lang.String getReading(int wordId): Get the reading of the specified word (mainly used for Hanja to Hangul conversion).
- protected java.io.InputStream getResource(java.lang.String suffix)
- static java.io.InputStream getResource(BinaryDictionary.ResourceScheme scheme, java.lang.String path)
- int getRightId(int wordId): Get the right id of the specified word.
- POS.Tag getRightPOS(int wordId): Get the right POS.Tag of the specified word.
- int getWordCost(int wordId): Get the word cost of the specified word.
- private boolean hasReadingData(int wordId)
- private boolean hasSinglePOS(int wordId)
- void lookupWordIds(int sourceId, IntsRef ref)
- private java.lang.String readString(int offset)
-
-
-
Field Detail
-
TARGETMAP_FILENAME_SUFFIX
public static final java.lang.String TARGETMAP_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
DICT_FILENAME_SUFFIX
public static final java.lang.String DICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
POSDICT_FILENAME_SUFFIX
public static final java.lang.String POSDICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
DICT_HEADER
public static final java.lang.String DICT_HEADER
- See Also:
- Constant Field Values
-
TARGETMAP_HEADER
public static final java.lang.String TARGETMAP_HEADER
- See Also:
- Constant Field Values
-
POSDICT_HEADER
public static final java.lang.String POSDICT_HEADER
- See Also:
- Constant Field Values
-
VERSION
public static final int VERSION
- See Also:
- Constant Field Values
-
resourceScheme
private final BinaryDictionary.ResourceScheme resourceScheme
-
resourcePath
private final java.lang.String resourcePath
-
buffer
private final java.nio.ByteBuffer buffer
-
targetMapOffsets
private final int[] targetMapOffsets
-
targetMap
private final int[] targetMap
-
posDict
private final POS.Tag[] posDict
-
HAS_SINGLE_POS
public static final int HAS_SINGLE_POS
Flag that the entry has a single part of speech (leftPOS).
- See Also:
- Constant Field Values
-
HAS_READING
public static final int HAS_READING
Flag that the entry has reading data; otherwise the reading is the surface form.
- See Also:
- Constant Field Values
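To illustrate how bit flags such as HAS_SINGLE_POS and HAS_READING are typically tested, here is a minimal sketch. It is not the Lucene implementation: the flag values (1 and 2), the assumption that the flags sit in the low bits of a 16-bit field in the buffer, and the class name FlagSketch are all hypothetical. The real values are listed under "Constant Field Values".

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; values and layout are assumptions, not taken from Lucene.
final class FlagSketch {
  // Assumed flag values: each flag occupies its own bit.
  static final int HAS_SINGLE_POS = 1;
  static final int HAS_READING = 2;

  // Assumption: the flags are stored in the low bits of the 16-bit
  // field that starts at the given byte offset.
  static boolean hasFlag(ByteBuffer buffer, int offset, int flag) {
    return (buffer.getShort(offset) & flag) != 0;
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(2);
    // Store an entry that has reading data but no single-POS flag.
    buffer.putShort(0, (short) HAS_READING);
    System.out.println("has reading: " + hasFlag(buffer, 0, HAS_READING));
    System.out.println("single POS:  " + hasFlag(buffer, 0, HAS_SINGLE_POS));
  }
}
```

Because each flag uses a separate bit, an entry can carry both flags, either one, or neither.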
-
-
Constructor Detail
-
BinaryDictionary
protected BinaryDictionary() throws java.io.IOException
- Throws:
java.io.IOException
-
BinaryDictionary
protected BinaryDictionary(BinaryDictionary.ResourceScheme resourceScheme, java.lang.String resourcePath) throws java.io.IOException
- Parameters:
resourceScheme - scheme for loading resources (FILE or CLASSPATH).
resourcePath - where to load resources (dictionaries) from. If null, with CLASSPATH scheme only, use this class's name as the path.
- Throws:
java.io.IOException
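The defaulting rule for resourcePath can be sketched as follows. This is an illustration under stated assumptions, not the actual constructor body: the names PathDefaultSketch, Scheme and resolvePath are hypothetical, "this class's name as the path" is assumed to mean the class name with '.' replaced by '/', and rejecting a null path for the FILE scheme is an assumption inferred from the "CLASSPATH scheme only" wording.

```java
// Hypothetical stand-in for the constructor's path handling; not Lucene code.
final class PathDefaultSketch {
  // Stand-in for BinaryDictionary.ResourceScheme.
  enum Scheme { FILE, CLASSPATH }

  static String resolvePath(Scheme scheme, String resourcePath, Class<?> owner) {
    if (resourcePath != null) {
      return resourcePath; // An explicit path is used as given.
    }
    if (scheme != Scheme.CLASSPATH) {
      // Assumption: a null path is only meaningful for CLASSPATH.
      throw new IllegalArgumentException("resourcePath is required for scheme " + scheme);
    }
    // Assumption: the class name is turned into a classpath-style path.
    return owner.getName().replace('.', '/');
  }

  public static void main(String[] args) {
    System.out.println(resolvePath(Scheme.CLASSPATH, null, PathDefaultSketch.class));
  }
}
```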
-
-
Method Detail
-
getResource
protected final java.io.InputStream getResource(java.lang.String suffix) throws java.io.IOException
- Throws:
java.io.IOException
-
getResource
public static java.io.InputStream getResource(BinaryDictionary.ResourceScheme scheme, java.lang.String path) throws java.io.IOException
- Throws:
java.io.IOException
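A scheme-based loader of this shape can be sketched with standard JDK calls only. The code below is a hedged illustration rather than the Lucene source: ResourceLoadSketch, its nested Scheme enum and the open method are hypothetical names, and throwing FileNotFoundException for a missing classpath resource is an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch of dispatching on a resource scheme; not Lucene code.
final class ResourceLoadSketch {
  // Stand-in for BinaryDictionary.ResourceScheme.
  enum Scheme { FILE, CLASSPATH }

  static InputStream open(Scheme scheme, String path) throws IOException {
    switch (scheme) {
      case FILE:
        // Read directly from the file system.
        return Files.newInputStream(Paths.get(path));
      case CLASSPATH: {
        // Look the resource up through this class's class loader.
        InputStream in = ResourceLoadSketch.class.getClassLoader().getResourceAsStream(path);
        if (in == null) {
          // Assumption: a missing resource is reported as an IOException subtype.
          throw new FileNotFoundException("Not found on classpath: " + path);
        }
        return in;
      }
      default:
        throw new IllegalStateException("Unknown scheme: " + scheme);
    }
  }
}
```

Callers would normally wrap the returned stream in a try-with-resources statement so that it is closed after the dictionary data has been read.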
-
getClassResource
public static java.io.InputStream getClassResource(java.lang.Class<?> clazz, java.lang.String suffix) throws java.io.IOException
- Throws:
java.io.IOException
-
getClassResource
private static java.io.InputStream getClassResource(java.lang.String path) throws java.io.IOException
- Throws:
java.io.IOException
-
lookupWordIds
public void lookupWordIds(int sourceId, IntsRef ref)
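This page does not describe how lookupWordIds uses the targetMap and targetMapOffsets fields. One common arrangement for such a pair of arrays is an offset table into a flat id array, sketched below purely as an assumption: the layout, the class name TargetMapSketch and the lookup method are hypothetical, and the sketch returns a copied int[] instead of filling an IntsRef.

```java
import java.util.Arrays;

// Hypothetical sketch of an offsets-plus-flat-array lookup; not Lucene code.
final class TargetMapSketch {
  // Assumption: offsets[i] (inclusive) to offsets[i + 1] (exclusive) delimits
  // the word ids for source id i, so offsets has one more entry than there
  // are source ids.
  static int[] lookup(int[] targetMap, int[] offsets, int sourceId) {
    return Arrays.copyOfRange(targetMap, offsets[sourceId], offsets[sourceId + 1]);
  }

  public static void main(String[] args) {
    int[] targetMap = {10, 20, 30, 40};
    int[] offsets = {0, 1, 4};
    System.out.println(Arrays.toString(lookup(targetMap, offsets, 1)));
  }
}
```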
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get the left id of the specified word.
- Specified by:
getLeftId in interface Dictionary
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get the right id of the specified word.
- Specified by:
getRightId in interface Dictionary
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get the word cost of the specified word.
- Specified by:
getWordCost in interface Dictionary
-
getPOSType
public POS.Type getPOSType(int wordId)
Description copied from interface: Dictionary
Get the POS.Type of the specified word (morpheme, compound, inflect or pre-analysis).
- Specified by:
getPOSType in interface Dictionary
-
getLeftPOS
public POS.Tag getLeftPOS(int wordId)
Description copied from interface: Dictionary
Get the left POS.Tag of the specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getLeftPOS in interface Dictionary
-
getRightPOS
public POS.Tag getRightPOS(int wordId)
Description copied from interface: Dictionary
Get the right POS.Tag of the specified word. For POS.Type.MORPHEME and POS.Type.COMPOUND the left and right POS are the same.
- Specified by:
getRightPOS in interface Dictionary
-
getReading
public java.lang.String getReading(int wordId)
Description copied from interface: Dictionary
Get the reading of the specified word (mainly used for Hanja to Hangul conversion).
- Specified by:
getReading in interface Dictionary
-
getMorphemes
public Dictionary.Morpheme[] getMorphemes(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get the morphemes of the specified word (e.g. 가깝으나: 가깝 + 으나).
- Specified by:
getMorphemes in interface Dictionary
-
readString
private java.lang.String readString(int offset)
-
hasSinglePOS
private boolean hasSinglePOS(int wordId)
-
hasReadingData
private boolean hasReadingData(int wordId)
-
-