Package org.apache.lucene.analysis.cjk
Class CJKWidthCharFilter
- java.lang.Object
  - java.io.Reader
    - org.apache.lucene.analysis.CharFilter
      - org.apache.lucene.analysis.charfilter.BaseCharFilter
        - org.apache.lucene.analysis.cjk.CJKWidthCharFilter
- All Implemented Interfaces:
  java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable
public class CJKWidthCharFilter extends BaseCharFilter
A CharFilter that normalizes CJK width differences:
- Folds fullwidth ASCII variants into the equivalent Basic Latin
- Folds halfwidth Katakana variants into the equivalent kana
NOTE: this char filter is the exact counterpart of CJKWidthFilter.
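The first folding rule can be illustrated with a short standalone sketch. This is not Lucene's implementation: the class name WidthFoldSketch and the method foldFullwidthAscii are hypothetical, and the sketch covers only the fullwidth ASCII range (U+FF01 to U+FF5E), which sits at a fixed offset of 0xFEE0 from Basic Latin. Halfwidth Katakana folding needs a lookup table and is left out here.

```java
public class WidthFoldSketch {
    // Fullwidth ASCII variants occupy U+FF01..U+FF5E in Unicode.
    private static final int FULLWIDTH_START = 0xFF01;
    private static final int FULLWIDTH_END = 0xFF5E;
    // Subtracting this offset yields the Basic Latin equivalent (U+0021..U+007E).
    private static final int FULLWIDTH_OFFSET = 0xFEE0;

    // Hypothetical helper: folds fullwidth ASCII variants, leaves everything else untouched.
    public static String foldFullwidthAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= FULLWIDTH_START && c <= FULLWIDTH_END) {
                out.append((char) (c - FULLWIDTH_OFFSET));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is "Lucene 123" written with fullwidth letters and digits.
        String fullwidth = "\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45 \uFF11\uFF12\uFF13";
        System.out.println(foldFullwidthAscii(fullwidth)); // prints "Lucene 123"
    }
}
```

Because each fullwidth character maps to exactly one Basic Latin character, this part of the folding never changes the text length. Only the voice-mark combination for halfwidth Katakana merges two characters into one.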
-
-
Field Summary
Fields (Modifier and Type, Field):
- private static int HW_KATAKANA_SEMI_VOICED_MARK
- private static int HW_KATAKANA_VOICED_MARK
- private int inputOff
- private static byte[] KANA_COMBINE_SEMI_VOICED
- private static byte[] KANA_COMBINE_VOICED
- private static char[] KANA_NORM
- private int prevChar
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
-
-
Constructor Summary
Constructors (Constructor, Description):
- CJKWidthCharFilter(java.io.Reader in): Default constructor that takes a Reader.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- private int combineVoiceMark(int ch, int voiceMark): Returns the combined char if the voice mark was successfully combined, otherwise the original char.
- int read()
- int read(char[] cbuf, int off, int len)
Methods inherited from class org.apache.lucene.analysis.charfilter.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
-
Methods inherited from class org.apache.lucene.analysis.CharFilter
close, correctOffset
-
-
-
-
Field Detail
-
KANA_NORM
private static final char[] KANA_NORM
-
KANA_COMBINE_VOICED
private static final byte[] KANA_COMBINE_VOICED
-
KANA_COMBINE_SEMI_VOICED
private static final byte[] KANA_COMBINE_SEMI_VOICED
-
HW_KATAKANA_VOICED_MARK
private static final int HW_KATAKANA_VOICED_MARK
- See Also:
- Constant Field Values
-
HW_KATAKANA_SEMI_VOICED_MARK
private static final int HW_KATAKANA_SEMI_VOICED_MARK
- See Also:
- Constant Field Values
-
prevChar
private int prevChar
-
inputOff
private int inputOff
-
-
Method Detail
-
read
public int read() throws java.io.IOException
- Overrides:
  read in class java.io.Reader
- Throws:
  java.io.IOException
-
combineVoiceMark
private int combineVoiceMark(int ch, int voiceMark)
Returns the combined char if the voice mark was successfully combined, otherwise the original char.
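The contract described above (combined char on success, original char otherwise) can be sketched in isolation. The class VoiceMarkSketch, its method combine, and the two small lookup maps are hypothetical and cover only a few kana for illustration. The actual method is private and presumably relies on the KANA_COMBINE_VOICED and KANA_COMBINE_SEMI_VOICED tables, whose contents are not shown on this page.

```java
import java.util.Map;

public class VoiceMarkSketch {
    // Halfwidth Katakana voiced (U+FF9E) and semi-voiced (U+FF9F) sound marks.
    public static final int HW_VOICED_MARK = 0xFF9E;
    public static final int HW_SEMI_VOICED_MARK = 0xFF9F;

    // Illustrative, deliberately incomplete tables (assumption, not Lucene's data):
    // KA (U+30AB) -> GA (U+30AC), HA (U+30CF) -> BA (U+30D0)
    private static final Map<Integer, Integer> VOICED =
            Map.of(0x30AB, 0x30AC, 0x30CF, 0x30D0);
    // HA (U+30CF) -> PA (U+30D1)
    private static final Map<Integer, Integer> SEMI_VOICED =
            Map.of(0x30CF, 0x30D1);

    // Returns the combined char if ch accepts the given mark, otherwise ch unchanged.
    public static int combine(int ch, int voiceMark) {
        if (voiceMark == HW_VOICED_MARK) {
            return VOICED.getOrDefault(ch, ch);
        }
        if (voiceMark == HW_SEMI_VOICED_MARK) {
            return SEMI_VOICED.getOrDefault(ch, ch);
        }
        return ch;
    }

    public static void main(String[] args) {
        // KA + voiced mark combines into GA (U+30AC).
        System.out.println(Integer.toHexString(combine(0x30AB, HW_VOICED_MARK)));
        // KA has no semi-voiced form, so the original char comes back.
        System.out.println(Integer.toHexString(combine(0x30AB, HW_SEMI_VOICED_MARK)));
    }
}
```

Returning the original char on failure lets a caller detect success with a simple comparison between the input and the result.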
-
read
public int read(char[] cbuf, int off, int len) throws java.io.IOException
- Specified by:
  read in class java.io.Reader
- Throws:
  java.io.IOException
-
-