Package org.apache.lucene.analysis.icu
Class ICUNormalizer2CharFilter
- java.lang.Object
-
- java.io.Reader
-
- org.apache.lucene.analysis.CharFilter
-
- org.apache.lucene.analysis.charfilter.BaseCharFilter
-
- org.apache.lucene.analysis.icu.ICUNormalizer2CharFilter
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable
public final class ICUNormalizer2CharFilter extends BaseCharFilter
Normalizes input text with ICU's Normalizer2 before it reaches the tokenizer.
-
-
Field Summary
Fields (Modifier and Type, Field):
private boolean afterQuickCheckYes
private int charCount
private int checkedInputBoundary
private java.lang.StringBuilder inputBuffer
private boolean inputFinished
private com.ibm.icu.text.Normalizer2 normalizer
private java.lang.StringBuilder resultBuffer
private CharacterUtils.CharacterBuffer tmpBuffer
-
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
-
-
Constructor Summary
Constructors (Constructor, Description):
ICUNormalizer2CharFilter(java.io.Reader in)
Creates a new ICUNormalizer2CharFilter that applies NFKC normalization and case folding and removes default ignorable code points (NFKC_Casefold).
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer)
Creates a new ICUNormalizer2CharFilter with the specified Normalizer2.
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer, int bufferSize)
-
Method Summary
Methods (Modifier and Type, Method):
private int normalizeInputUpto(int length)
private int outputFromResultBuffer(char[] cbuf, int begin, int len)
int read(char[] cbuf, int off, int len)
private int readAndNormalizeFromInput()
private int readFromInputWhileSpanQuickCheckYes()
private int readFromIoNormalizeUptoBoundary()
private void readInputToBuffer()
private void recordOffsetDiff(int inputLength, int outputLength)
-
Methods inherited from class org.apache.lucene.analysis.charfilter.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
-
Methods inherited from class org.apache.lucene.analysis.CharFilter
close, correctOffset
-
-
-
-
Field Detail
-
normalizer
private final com.ibm.icu.text.Normalizer2 normalizer
-
inputBuffer
private final java.lang.StringBuilder inputBuffer
-
resultBuffer
private final java.lang.StringBuilder resultBuffer
-
inputFinished
private boolean inputFinished
-
afterQuickCheckYes
private boolean afterQuickCheckYes
-
checkedInputBoundary
private int checkedInputBoundary
-
charCount
private int charCount
-
tmpBuffer
private final CharacterUtils.CharacterBuffer tmpBuffer
-
-
Constructor Detail
-
ICUNormalizer2CharFilter
public ICUNormalizer2CharFilter(java.io.Reader in)
Creates a new ICUNormalizer2CharFilter that applies NFKC normalization and case folding and removes default ignorable code points (NFKC_Casefold).
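As a rough illustration of what NFKC_Casefold-style normalization does to text, the sketch below uses only the JDK. It is not this class's implementation: java.text.Normalizer stands in for ICU's Normalizer2, and the class name NfkcCasefoldSketch and the method approximate are hypothetical. It applies NFKC and then lower-cases, which only approximates full case folding and does not remove default ignorable code points.

```java
import java.text.Normalizer;
import java.util.Locale;

public class NfkcCasefoldSketch {

    // Approximation only: NFKC followed by locale-independent lower-casing.
    // Real NFKC_Casefold applies full Unicode case folding and also drops
    // default ignorable code points, which this sketch does not do.
    public static String approximate(String s) {
        String nfkc = Normalizer.normalize(s, Normalizer.Form.NFKC);
        return nfkc.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // U+FF26 is FULLWIDTH LATIN CAPITAL LETTER F, U+FB01 is LATIN SMALL LIGATURE FI.
        // NFKC maps them to "F" and "fi"; lower-casing then yields "ffi".
        System.out.println(approximate("\uFF26\uFB01")); // prints "ffi"
    }
}
```

With the actual class, the same kind of mapping is applied to the wrapped java.io.Reader as characters are read from it.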
-
ICUNormalizer2CharFilter
public ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer)
Create a new Normalizer2CharFilter with the specified Normalizer2
Parameters:
in - text
normalizer - normalizer to use
-
ICUNormalizer2CharFilter
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer, int bufferSize)
-
-
Method Detail
-
read
public int read(char[] cbuf, int off, int len) throws java.io.IOException
Specified by:
read in class java.io.Reader
Throws:
java.io.IOException
-
readInputToBuffer
private void readInputToBuffer() throws java.io.IOException
Throws:
java.io.IOException
-
readAndNormalizeFromInput
private int readAndNormalizeFromInput()
-
readFromInputWhileSpanQuickCheckYes
private int readFromInputWhileSpanQuickCheckYes()
-
readFromIoNormalizeUptoBoundary
private int readFromIoNormalizeUptoBoundary()
-
normalizeInputUpto
private int normalizeInputUpto(int length)
-
recordOffsetDiff
private void recordOffsetDiff(int inputLength, int outputLength)
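Normalization can change the length of the text, so offsets in the normalized output have to be mapped back to offsets in the original input. The sketch below illustrates the general idea behind recording a cumulative offset difference and later applying it. It is an assumption-laden stand-in, not Lucene's code: the class OffsetCorrectionSketch is hypothetical, a TreeMap replaces whatever storage BaseCharFilter uses internally, and the method names only mirror the inherited addOffCorrectMap and correct.

```java
import java.util.Map;
import java.util.TreeMap;

public class OffsetCorrectionSketch {

    // Key: offset in the normalized output.
    // Value: cumulative diff to add to reach the original input offset.
    private final TreeMap<Integer, Integer> cumulativeDiffs = new TreeMap<>();

    // Hypothetical analogue of addOffCorrectMap(off, cumulativeDiff).
    public void addOffCorrectMap(int outputOffset, int cumulativeDiff) {
        cumulativeDiffs.put(outputOffset, cumulativeDiff);
    }

    // Hypothetical analogue of correct(currentOff): apply the diff recorded
    // at the greatest output offset that is <= the requested offset.
    public int correct(int outputOffset) {
        Map.Entry<Integer, Integer> e = cumulativeDiffs.floorEntry(outputOffset);
        return e == null ? outputOffset : outputOffset + e.getValue();
    }

    public static void main(String[] args) {
        // Input "e" + U+0301 + "x" (3 chars) composes to "é" + "x" (2 chars).
        // From output offset 1 onward, input offsets are one greater.
        OffsetCorrectionSketch sketch = new OffsetCorrectionSketch();
        sketch.addOffCorrectMap(1, 1);
        System.out.println(sketch.correct(0)); // prints 0
        System.out.println(sketch.correct(1)); // prints 2
    }
}
```

A sorted map keeps the lookup simple for illustration; parallel sorted arrays with binary search would serve the same purpose with less overhead.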
-
outputFromResultBuffer
private int outputFromResultBuffer(char[] cbuf, int begin, int len)
-
-