- java.lang.Object
-
- org.jcodings.Encoding
-
- org.jcodings.MultiByteEncoding
-
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
CanBeTrailTableEncoding,EmacsMuleEncoding,EucEncoding,GB18030Encoding,UnicodeEncoding
public abstract class MultiByteEncoding extends Encoding
-
-
Constructor Summary
Constructors
Modifier, Constructor, Description
protected MultiByteEncoding(String name, int minLength, int maxLength, int[] EncLen, int[][] Trans, short[] CTypeTable)
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, Description
void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
    onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
protected void asciiApplyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
protected CaseFoldCodeItem[] asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end)
protected int asciiMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
    onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd)
    Oniguruma equivalent: case_map
protected boolean isCodeCTypeInternal(int code, int ctype)
    ONIGENC_IS_XXXXXX_CODE_CTYPE
boolean isNewLine(byte[] bytes, int p, int end)
    onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
int length(byte c)
    Returns the character length given the character head: returns 1 for single-byte encodings, or performs a direct length table lookup for multibyte ones.
protected int lengthForTwoUptoFour(byte[] bytes, int p, int end, int b, int s)
protected int mb2CodeToMbc(int code, byte[] bytes, int p)
protected int mb2CodeToMbcLength(int code)
protected boolean mb2IsCodeCType(int code, int ctype)
protected int mb4CodeToMbc(int code, byte[] bytes, int p)
protected int mb4CodeToMbcLength(int code)
protected boolean mb4IsCodeCType(int code, int ctype)
int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
    onigenc_ascii_mbc_case_fold
protected int mbnMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
protected int mbnMbcToCode(byte[] bytes, int p, int end)
protected int missing(int n)
protected int missing(int b, int delta)
int propertyNameToCType(byte[] bytes, int p, int end)
    onigenc_minimum_property_name_to_ctype, notably overridden by Unicode encodings
protected int safeLengthForUptoFour(byte[] bytes, int p, int end)
protected int safeLengthForUptoThree(byte[] bytes, int p, int end)
protected int safeLengthForUptoTwo(byte[] bytes, int p, int end)
int strCodeAt(byte[] bytes, int p, int end, int index)
int strLength(byte[] bytes, int p, int end)
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, codeToMbc, codeToMbcLength, ctypeCodeRange, digitVal, equals, getCharset, getCharsetName, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isCodeCType, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isReverseMatchAllowed, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, leftAdjustCharHead, length, load, maxLength, maxLengthDistance, mbcodeStartPosition, mbcToCode, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
-
-
-
Field Detail
-
EncLen
protected final int[] EncLen
-
A
protected static final int A
- See Also:
- Constant Field Values
-
F
protected static final int F
- See Also:
- Constant Field Values
-
Trans
protected final int[][] Trans
-
TransZero
protected final int[] TransZero
-
-
Constructor Detail
-
MultiByteEncoding
protected MultiByteEncoding(String name, int minLength, int maxLength, int[] EncLen, int[][] Trans, short[] CTypeTable)
-
-
Method Detail
-
length
public int length(byte c)
Description copied from class: Encoding
Returns the character length given the character head: returns 1 for single-byte encodings, or performs a direct length table lookup for multibyte ones.
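The table lookup mentioned in this description can be pictured with a minimal standalone sketch. It does not use jcodings: the class name `LengthTableSketch` and the UTF-8-style table contents are assumptions made for illustration only. In the real class, each subclass supplies its own `EncLen` table through the constructor.

```java
public class LengthTableSketch {
    // Hypothetical 256-entry table indexed by the unsigned head byte.
    // The values follow UTF-8 lead-byte lengths; they are an illustration,
    // not a table taken from jcodings.
    static final int[] ENC_LEN = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            if (i >= 0xC2 && i <= 0xDF) ENC_LEN[i] = 2;
            else if (i >= 0xE0 && i <= 0xEF) ENC_LEN[i] = 3;
            else if (i >= 0xF0 && i <= 0xF4) ENC_LEN[i] = 4;
            else ENC_LEN[i] = 1; // ASCII, and bytes that cannot start a character
        }
    }

    // Java bytes are signed, so mask with 0xff to get an index in 0..255.
    static int length(byte c) {
        return ENC_LEN[c & 0xff];
    }

    public static void main(String[] args) {
        System.out.println(length((byte) 'a'));  // 1
        System.out.println(length((byte) 0xE3)); // 3
    }
}
```

The mask matters: without `& 0xff`, any head byte at or above 0x80 would be a negative index.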
-
missing
protected final int missing(int n)
-
missing
protected final int missing(int b, int delta)
-
safeLengthForUptoFour
protected final int safeLengthForUptoFour(byte[] bytes, int p, int end)
-
lengthForTwoUptoFour
protected final int lengthForTwoUptoFour(byte[] bytes, int p, int end, int b, int s)
-
safeLengthForUptoThree
protected final int safeLengthForUptoThree(byte[] bytes, int p, int end)
-
safeLengthForUptoTwo
protected final int safeLengthForUptoTwo(byte[] bytes, int p, int end)
-
mbnMbcToCode
protected final int mbnMbcToCode(byte[] bytes, int p, int end)
-
caseMap
public int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd)
Description copied from class: Encoding
Oniguruma equivalent: case_map
-
mbnMbcCaseFold
protected final int mbnMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
-
mb2CodeToMbcLength
protected final int mb2CodeToMbcLength(int code)
-
mb4CodeToMbcLength
protected final int mb4CodeToMbcLength(int code)
-
mb2CodeToMbc
protected final int mb2CodeToMbc(int code, byte[] bytes, int p)
-
mb4CodeToMbc
protected final int mb4CodeToMbc(int code, byte[] bytes, int p)
-
mb2IsCodeCType
protected final boolean mb2IsCodeCType(int code, int ctype)
-
mb4IsCodeCType
protected final boolean mb4IsCodeCType(int code, int ctype)
-
strLength
public int strLength(byte[] bytes, int p, int end)
-
strCodeAt
public int strCodeAt(byte[] bytes, int p, int end, int index)
-
isCodeCTypeInternal
protected final boolean isCodeCTypeInternal(int code, int ctype)
ONIGENC_IS_XXXXXX_CODE_CTYPE
-
isNewLine
public boolean isNewLine(byte[] bytes, int p, int end)
onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
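The Oniguruma name referenced here suggests a check for a single LF byte (0x0a) at the current position. The following standalone sketch shows that reading. The class `NewLineSketch` is hypothetical, and the actual jcodings implementation may differ in detail.

```java
import java.nio.charset.StandardCharsets;

public class NewLineSketch {
    // True when position p lies inside [p, end) and the byte there is LF (0x0a).
    static boolean isNewLine(byte[] bytes, int p, int end) {
        return p < end && bytes[p] == (byte) 0x0a;
    }

    public static void main(String[] args) {
        byte[] text = "a\nb".getBytes(StandardCharsets.US_ASCII);
        System.out.println(isNewLine(text, 0, text.length)); // false
        System.out.println(isNewLine(text, 1, text.length)); // true
        System.out.println(isNewLine(text, 3, text.length)); // false: p is at end
    }
}
```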
-
asciiMbcCaseFold
protected final int asciiMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
onigenc_ascii_mbc_case_fold
- Specified by:
mbcCaseFold in class Encoding
- Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
lower - a buffer that receives the case-folded character
Oniguruma equivalent: mbc_case_fold
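As a rough illustration of the parameter contract described here (a position holder that points at the character head, and an output buffer for the folded character), the sketch below folds a single ASCII byte. It is standalone and makes several assumptions: the `IntHolder` class is a minimal stand-in rather than the jcodings type, the `flag` parameter is omitted, and `end` is accepted but unused.

```java
public class AsciiCaseFoldSketch {
    // Minimal stand-in for a mutable position holder (hypothetical, not the jcodings class).
    static final class IntHolder {
        int value;
    }

    // Folds the ASCII byte at pp.value into lower[0], advances pp by one,
    // and returns the number of bytes written to lower.
    static int asciiMbcCaseFold(byte[] bytes, IntHolder pp, int end, byte[] lower) {
        byte b = bytes[pp.value];
        lower[0] = (b >= 'A' && b <= 'Z') ? (byte) (b + 0x20) : b;
        pp.value++;
        return 1;
    }

    public static void main(String[] args) {
        byte[] input = { 'A', 'b' };
        byte[] lower = new byte[1];
        IntHolder pp = new IntHolder();
        int written = asciiMbcCaseFold(input, pp, input.length, lower);
        System.out.println(written);         // 1
        System.out.println((char) lower[0]); // a
        System.out.println(pp.value);        // 1
    }
}
```

Passing the position through a holder lets the method both return the folded length and report how far it advanced in the input.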
-
asciiApplyAllCaseFold
protected final void asciiApplyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
-
applyAllCaseFold
public void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
- Specified by:
applyAllCaseFold in class Encoding
- Parameters:
flag - case fold flag
fun - case folding functor (see ApplyCaseFold)
arg - case folding functor argument (see ApplyCaseFoldArg)
Oniguruma equivalent: apply_all_case_fold
-
asciiCaseFoldCodesByString
protected final CaseFoldCodeItem[] asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end)
-
caseFoldCodesByString
public CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
- Specified by:
caseFoldCodesByString in class Encoding
-
propertyNameToCType
public int propertyNameToCType(byte[] bytes, int p, int end)
onigenc_minimum_property_name_to_ctype, notably overridden by Unicode encodings
- Specified by:
propertyNameToCType in class Encoding
-
-