- java.lang.Object
-
- org.jcodings.Encoding
-
- org.jcodings.MultiByteEncoding
-
- org.jcodings.unicode.UnicodeEncoding
-
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
FixedWidthUnicodeEncoding, NonStrictUTF8Encoding, UTF16BEEncoding, UTF16LEEncoding, UTF8Encoding
public abstract class UnicodeEncoding extends MultiByteEncoding
-
-
Constructor Summary
Constructors

| Modifier | Constructor | Description |
|---|---|---|
| protected | UnicodeEncoding(String name, int minLength, int maxLength, int[] EncLen) | |
| protected | UnicodeEncoding(String name, int minLength, int maxLength, int[] EncLen, int[][] Trans) | |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| void | applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg) | onigenc_ascii_apply_all_case_fold / used also by multibyte encodings |
| protected void | asciiApplyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg) | |
| protected CaseFoldCodeItem[] | asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end) | |
| protected int | asciiMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower) | |
| CaseFoldCodeItem[] | caseFoldCodesByString(int flag, byte[] bytes, int p, int end) | onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings |
| int | caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd) | Oniguruma equivalent: case_map |
| protected int[] | ctypeCodeRange(int ctype) | |
| String | getCharsetName() | The name of the equivalent Java Charset for this encoding. |
| boolean | isCodeCType(int code, int ctype) | Perform a check whether given code is of given character type (e.g. used by isWord(someByte) and similar methods) |
| protected boolean | isCodeCTypeInternal(int code, int ctype) | ONIGENC_IS_XXXXXX_CODE_CTYPE |
| boolean | isNewLine(byte[] bytes, int p, int end) | onigenc_is_mbc_newline_0x0a / used also by multibyte encodings |
| int | mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold) | onigenc_ascii_mbc_case_fold |
| int | propertyNameToCType(byte[] name, int p, int end) | onigenc_minimum_property_name_to_ctype, notably overridden by Unicode encodings |
-
Methods inherited from class org.jcodings.MultiByteEncoding
length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, codeToMbc, codeToMbcLength, ctypeCodeRange, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isReverseMatchAllowed, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, leftAdjustCharHead, length, load, maxLength, maxLengthDistance, mbcodeStartPosition, mbcToCode, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
-
-
-
Method Detail
-
getCharsetName
public String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides:
getCharsetName in class Encoding
- Returns:
the name of the equivalent Java Charset for this encoding
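The default-and-override behaviour described above can be sketched with plain JDK classes. This is a minimal illustration, not the jcodings implementation: SketchEncoding, SketchAliasEncoding, CharsetNameDemo and the name "MY-UTF8" are hypothetical, and only java.nio.charset.Charset.forName is a real API.

```java
import java.nio.charset.Charset;

// Hypothetical stand-in for an encoding; not the jcodings Encoding class.
class SketchEncoding {
    private final String name;

    SketchEncoding(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Default behaviour: the charset name is the encoding name itself.
    String getCharsetName() {
        return getName();
    }
}

// Hypothetical subclass whose own name is not a known Java charset name.
class SketchAliasEncoding extends SketchEncoding {
    SketchAliasEncoding() {
        super("MY-UTF8");
    }

    // Override to point at the equivalent Java Charset.
    @Override
    String getCharsetName() {
        return "UTF-8";
    }
}

class CharsetNameDemo {
    // Resolve the java.nio Charset that corresponds to an encoding.
    static Charset resolve(SketchEncoding enc) {
        return Charset.forName(enc.getCharsetName());
    }

    public static void main(String[] args) {
        System.out.println(resolve(new SketchAliasEncoding()));
    }
}
```

A caller that needs to decode bytes with the JDK would resolve the Charset this way instead of passing the encoding's own name to Charset.forName, which may not recognise it.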
-
isCodeCType
public boolean isCodeCType(int code, int ctype)
Description copied from class: Encoding
Perform a check whether given code is of given character type (e.g. used by isWord(someByte) and similar methods)
- Specified by:
isCodeCType in class Encoding
- Parameters:
code - a code point of a character
ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
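A character type check of this kind is commonly implemented as a binary search over a sorted table of code point ranges. The sketch below assumes a table layout of {count, from1, to1, from2, to2, ...}. The class CodeRangeSketch, the table DIGIT_RANGES and its contents are hypothetical and chosen only for illustration; they are not taken from this class.

```java
class CodeRangeSketch {
    // Hypothetical range table: first element is the number of ranges,
    // followed by inclusive (from, to) pairs sorted by code point.
    static final int[] DIGIT_RANGES = {
        2,
        0x0030, 0x0039, // ASCII digits
        0x0660, 0x0669  // Arabic-Indic digits
    };

    // Binary search for the first range whose end is >= code,
    // then check that code is not below that range's start.
    static boolean isInCodeRange(int[] ranges, int code) {
        int low = 0;
        int high = ranges[0];
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (code > ranges[mid * 2 + 2]) {
                low = mid + 1; // code lies beyond the end of range mid
            } else {
                high = mid;
            }
        }
        return low < ranges[0] && code >= ranges[low * 2 + 1];
    }

    public static void main(String[] args) {
        System.out.println(isInCodeRange(DIGIT_RANGES, '5'));
        System.out.println(isInCodeRange(DIGIT_RANGES, 'A'));
    }
}
```

With a table per character type, a check such as "is this code point a digit" reduces to one lookup of the table for that type followed by this search.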
-
ctypeCodeRange
protected final int[] ctypeCodeRange(int ctype)
-
propertyNameToCType
public int propertyNameToCType(byte[] name, int p, int end)
onigenc_minimum_property_name_to_ctype, notably overridden by Unicode encodings
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
onigenc_ascii_mbc_case_fold
- Parameters:
flag - case fold flag
pp - an IntHolder that points at character head
fold - a buffer where to extract case folded character
Oniguruma equivalent: mbc_case_fold
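The calling convention described by the parameters (a mutable position, an output buffer, and a returned length) can be sketched for the ASCII-only case as follows. This is an illustration under stated assumptions, not the Unicode folding performed by this class: Pos is a hypothetical stand-in for IntHolder, and asciiFold and foldAll are names invented for the example.

```java
import java.nio.charset.StandardCharsets;

class AsciiFoldSketch {
    // Hypothetical stand-in for a mutable position holder such as IntHolder.
    static final class Pos {
        int value;
    }

    // Folds the single byte at pp.value to lower case, writes it to fold[0],
    // advances pp past the consumed byte and returns the folded length (1).
    static int asciiFold(byte[] bytes, Pos pp, int end, byte[] fold) {
        if (pp.value >= end) {
            throw new IllegalArgumentException("position is at or past end");
        }
        int b = bytes[pp.value] & 0xff;
        if (b >= 'A' && b <= 'Z') {
            b += 'a' - 'A';
        }
        fold[0] = (byte) b;
        pp.value++;
        return 1;
    }

    // Drives asciiFold over a whole ASCII string.
    static String foldAll(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        byte[] fold = new byte[1];
        StringBuilder out = new StringBuilder();
        Pos pp = new Pos();
        while (pp.value < bytes.length) {
            int n = asciiFold(bytes, pp, bytes.length, fold);
            out.append(new String(fold, 0, n, StandardCharsets.US_ASCII));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(foldAll("Hello, World"));
    }
}
```

Because the callee advances the position itself, the caller's loop does not need to know how many input bytes each character occupies, which is what makes the same shape usable for multibyte encodings.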
-
applyAllCaseFold
public void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
- Parameters:
flag - case fold flag
fun - case folding functor (look at: ApplyCaseFold)
arg - case folding functor argument (look at: ApplyCaseFoldArg)
Oniguruma equivalent: apply_all_case_fold
-
caseFoldCodesByString
public CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
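For orientation, the ASCII-only variant of this lookup can be sketched as below: given the character at position p, it returns the code points that match it when case is ignored. Item is a hypothetical, simplified stand-in for CaseFoldCodeItem, and the flag parameter is omitted; the real class covers Unicode folds, which this sketch does not attempt.

```java
import java.nio.charset.StandardCharsets;

class AsciiFoldCodesSketch {
    // Hypothetical, simplified fold item: bytes consumed plus alternative codes.
    static final class Item {
        final int byteLen;
        final int[] codes;

        Item(int byteLen, int... codes) {
            this.byteLen = byteLen;
            this.codes = codes;
        }
    }

    static final Item[] NONE = new Item[0];

    // Returns the other-case code for an ASCII letter at p, or no items.
    static Item[] asciiFoldCodes(byte[] bytes, int p, int end) {
        if (p >= end) {
            return NONE;
        }
        int b = bytes[p] & 0xff;
        if (b >= 'A' && b <= 'Z') {
            return new Item[] { new Item(1, b + 0x20) };
        }
        if (b >= 'a' && b <= 'z') {
            return new Item[] { new Item(1, b - 0x20) };
        }
        return NONE;
    }

    public static void main(String[] args) {
        byte[] text = "k9".getBytes(StandardCharsets.US_ASCII);
        System.out.println(asciiFoldCodes(text, 0, text.length).length);
        System.out.println(asciiFoldCodes(text, 1, text.length).length);
    }
}
```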
-
caseMap
public final int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd)
Description copied from class: Encoding
Oniguruma equivalent: case_map
- Overrides:
caseMap in class MultiByteEncoding
-
isCodeCTypeInternal
protected final boolean isCodeCTypeInternal(int code, int ctype)
ONIGENC_IS_XXXXXX_CODE_CTYPE
-
isNewLine
public boolean isNewLine(byte[] bytes, int p, int end)
onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
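As the Oniguruma name suggests, the check amounts to testing for the byte 0x0a at position p, guarded by the end bound. The sketch below shows that test in isolation; NewlineSketch and countNewlines are hypothetical names, and the byte-by-byte scan assumes an encoding such as UTF-8 in which 0x0a never occurs inside a multibyte character.

```java
import java.nio.charset.StandardCharsets;

class NewlineSketch {
    // True when there is a byte at p (p < end) and it is 0x0a.
    static boolean isNewline(byte[] bytes, int p, int end) {
        return p < end && bytes[p] == 0x0a;
    }

    // Counts newline bytes in [p, end); valid only under the assumption above.
    static int countNewlines(byte[] bytes, int p, int end) {
        int count = 0;
        for (int i = p; i < end; i++) {
            if (isNewline(bytes, i, end)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] text = "a\nb\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(countNewlines(text, 0, text.length));
    }
}
```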
-
asciiMbcCaseFold
protected final int asciiMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
-
asciiApplyAllCaseFold
protected final void asciiApplyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
-
asciiCaseFoldCodesByString
protected final CaseFoldCodeItem[] asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end)
-
-