Class UnicodeBidiAlgorithm
- java.lang.Object
-
- org.apache.fop.complexscripts.bidi.UnicodeBidiAlgorithm
-
- All Implemented Interfaces:
BidiConstants
public final class UnicodeBidiAlgorithm extends java.lang.Object implements BidiConstants
The UnicodeBidiAlgorithm class implements functionality prescribed by the Unicode Bidirectional Algorithm, Unicode Standard Annex #9.
This work was originally authored by Glenn Adams (gadams@apache.org).
-
-
Field Summary
Fields
private static org.apache.commons.logging.Log log
    logging instance
-
Constructor Summary
Constructors
private UnicodeBidiAlgorithm()
-
Method Summary
Methods
private static int convertToScalar(int chHi, int chLo)
    Convert UTF-16 surrogate pair to Unicode scalar value.
private static boolean convertToScalar(java.lang.CharSequence cs, int[] chars)
    Convert character sequence (a UTF-16 encoded string) to an array of Unicode scalar values expressed as integers.
private static int[] copySequence(int[] ta)
private static int directionOfLevel(int level)
private static void dump(java.lang.String header, int[] chars, int[] classes, int defaultLevel, int[] levels)
private static int findNextNonRetainedFormattingLevel(int[] wca, int[] ea, int start, int lPrev)
private static int[] getClasses(int[] chars)
private static java.lang.String getClassName(int bc)
private static int getLevelRunLength(int[] ea, int start)
private static int getRetainedFormattingRunLength(int[] wca, int start)
private static boolean isNeutral(int bc)
private static boolean isRetainedFormatting(int bc)
private static boolean isRetainedFormatting(int[] ca, int s, int e)
private static boolean isStrong(int bc)
private static int levelOfEmbedding(int embedding)
private static int[] levelsFromEmbeddings(int[] ea, int[] la)
private static int max(int x, int y)
private static java.lang.String padLeft(int n, int width)
private static java.lang.String padLeft(java.lang.String s, int width)
private static java.lang.String padRight(java.lang.String s, int width)
private static void resolveAdjacentBoundaryNeutrals(int[] wca, int start, int end, int index, int bcNew)
private static void resolveExplicit(int[] wca, int defaultLevel, int[] ea)
private static void resolveImplicit(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
static int[] resolveLevels(int[] chars, int[] classes, int defaultLevel, int[] levels, boolean useRuleL1)
    Resolve the directionality levels of each character in a character sequence.
static int[] resolveLevels(int[] chars, int defaultLevel, int[] levels)
    Resolve the directionality levels of each character in a character sequence.
static int[] resolveLevels(java.lang.CharSequence cs, Direction defaultLevel)
    Resolve the directionality levels of each character in a character sequence.
private static void resolveNeutrals(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
private static int resolveRun(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int levelPrev)
private static void resolveRuns(int[] wca, int defaultLevel, int[] ea, int[] la)
private static void resolveSeparators(int[] ica, int[] wca, int dl, int[] la)
    Resolve separators and boundary neutral levels to account for UAX#9 3.4 L1 while taking into account retention of formatting codes (5.2).
private static void resolveWeak(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
private static boolean startsWithRetainedFormattingRun(int[] wca, int[] ea, int start)
private static boolean triggersBidi(int ch)
    Determine if character CH triggers bidirectional processing.
-
Method Detail
-
resolveLevels
public static int[] resolveLevels(java.lang.CharSequence cs, Direction defaultLevel)
Resolve the directionality levels of each character in a character sequence. If a character is encoded in the character sequence as a Unicode surrogate pair, then both members of the pair are assigned the same directionality level.
- Parameters:
cs - input character sequence representing a UTF-16 encoded string
defaultLevel - the default paragraph level, which must be zero (LR) or one (RL)
- Returns:
- null if bidirectional processing is not required; otherwise, returns an array of integers, where each integer corresponds to exactly one UTF-16 encoding element present in the input character sequence, and where each integer denotes the directionality level of the corresponding encoding element
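The return contract described above (null when no bidirectional processing is needed, otherwise one level per UTF-16 code unit) can be illustrated with a small sketch. This is not FOP's implementation: it uses the JDK's `java.text.Bidi` as a stand-in, and the class name `ResolveLevelsContractSketch`, the method `levelsPerUtf16Unit`, and the boolean `rtlParagraph` parameter are hypothetical names chosen for illustration.

```java
import java.text.Bidi;

// Illustrative stand-in for the documented contract; NOT the FOP implementation.
final class ResolveLevelsContractSketch {

    // Returns null if the text needs no bidi processing; otherwise one level
    // per UTF-16 code unit of the input string.
    static int[] levelsPerUtf16Unit(String text, boolean rtlParagraph) {
        char[] units = text.toCharArray();
        if (!Bidi.requiresBidi(units, 0, units.length)) {
            return null; // nothing in the text triggers bidi processing
        }
        int flags = rtlParagraph
            ? Bidi.DIRECTION_RIGHT_TO_LEFT
            : Bidi.DIRECTION_LEFT_TO_RIGHT;
        Bidi bidi = new Bidi(text, flags);
        int[] levels = new int[units.length];
        for (int i = 0; i < units.length; i++) {
            levels[i] = bidi.getLevelAt(i); // level of the code unit at index i
        }
        return levels;
    }
}
```

A caller would treat a null result as "lay the text out in logical order" and otherwise index the returned array with the same indices it uses for the input string.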
-
resolveLevels
public static int[] resolveLevels(int[] chars, int defaultLevel, int[] levels)
Resolve the directionality levels of each character in a character sequence.
- Parameters:
chars - array of input characters represented as Unicode scalar values
defaultLevel - the default paragraph level, which must be zero (LR) or one (RL)
levels - array to receive levels, one for each character in chars array
- Returns:
- null if bidirectional processing is not required; otherwise, returns an array of integers, where each integer corresponds to exactly one UTF-16 encoding element present in the input character sequence, and where each integer denotes the directionality level of the corresponding encoding element
-
resolveLevels
public static int[] resolveLevels(int[] chars, int[] classes, int defaultLevel, int[] levels, boolean useRuleL1)
Resolve the directionality levels of each character in a character sequence.
- Parameters:
chars - array of input characters represented as Unicode scalar values
classes - array containing one bidi class per character in chars array
defaultLevel - the default paragraph level, which must be zero (LR) or one (RL)
levels - array to receive levels, one for each character in chars array
useRuleL1 - true if rule L1 should be used
- Returns:
- null if bidirectional processing is not required; otherwise, returns an array of integers, where each integer corresponds to exactly one UTF-16 encoding element present in the input character sequence, and where each integer denotes the directionality level of the corresponding encoding element
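As a hedged illustration of how a returned level array might be consumed: in UAX #9, even levels are left-to-right and odd levels are right-to-left, and consecutive positions with the same level form a level run. The sketch below groups a level array into runs. The class `LevelRunsSketch` and its methods are hypothetical helpers, not part of this class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing one way to interpret a resolved level array.
final class LevelRunsSketch {

    // Per UAX #9, odd embedding levels are right-to-left, even are left-to-right.
    static boolean isRightToLeft(int level) {
        return (level & 1) == 1;
    }

    // Splits levels into maximal runs; each entry is {start, endExclusive, level}.
    static List<int[]> levelRuns(int[] levels) {
        List<int[]> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= levels.length; i++) {
            if (i == levels.length || levels[i] != levels[start]) {
                runs.add(new int[] {start, i, levels[start]});
                start = i;
            }
        }
        return runs;
    }
}
```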
-
copySequence
private static int[] copySequence(int[] ta)
-
resolveExplicit
private static void resolveExplicit(int[] wca, int defaultLevel, int[] ea)
-
directionOfLevel
private static int directionOfLevel(int level)
-
levelOfEmbedding
private static int levelOfEmbedding(int embedding)
-
levelsFromEmbeddings
private static int[] levelsFromEmbeddings(int[] ea, int[] la)
-
resolveRuns
private static void resolveRuns(int[] wca, int defaultLevel, int[] ea, int[] la)
-
findNextNonRetainedFormattingLevel
private static int findNextNonRetainedFormattingLevel(int[] wca, int[] ea, int start, int lPrev)
-
getLevelRunLength
private static int getLevelRunLength(int[] ea, int start)
-
startsWithRetainedFormattingRun
private static boolean startsWithRetainedFormattingRun(int[] wca, int[] ea, int start)
-
getRetainedFormattingRunLength
private static int getRetainedFormattingRunLength(int[] wca, int start)
-
resolveRun
private static int resolveRun(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int levelPrev)
-
resolveWeak
private static void resolveWeak(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
-
resolveNeutrals
private static void resolveNeutrals(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
-
resolveAdjacentBoundaryNeutrals
private static void resolveAdjacentBoundaryNeutrals(int[] wca, int start, int end, int index, int bcNew)
-
resolveImplicit
private static void resolveImplicit(int[] wca, int defaultLevel, int[] ea, int[] la, int start, int end, int level, int sor, int eor)
-
resolveSeparators
private static void resolveSeparators(int[] ica, int[] wca, int dl, int[] la)
Resolve separators and boundary neutral levels to account for UAX#9 3.4 L1 while taking into account retention of formatting codes (5.2).
- Parameters:
ica - original input class array (sequence)
wca - working copy of original input class array (sequence), as modified by prior steps
dl - default paragraph level
la - array of output levels to be adjusted, as produced by bidi algorithm
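For orientation, rule L1 of UAX #9 resets segment separators, paragraph separators, and any whitespace that precedes them or ends the line to the paragraph level. The following is a simplified sketch of that rule only. It assumes a single line of input, derives bidi classes from `java.lang.Character.getDirectionality` instead of the `ica`/`wca` class arrays, and deliberately ignores isolate formatting characters, boundary neutrals, and retained formatting codes, so it does not reproduce what `resolveSeparators` does. The names `RuleL1Sketch` and `applyL1` are hypothetical.

```java
// Simplified illustration of UAX #9 rule L1; NOT the FOP implementation.
final class RuleL1Sketch {

    // Adjusts levels in place for one line of text given as code points.
    static void applyL1(int[] codePoints, int[] levels, int paragraphLevel) {
        // Scan backward: whitespace is reset while we are still "trailing",
        // i.e. at end of line or directly before a separator.
        boolean trailing = true;
        for (int i = codePoints.length - 1; i >= 0; i--) {
            byte d = Character.getDirectionality(codePoints[i]);
            if (d == Character.DIRECTIONALITY_SEGMENT_SEPARATOR
                    || d == Character.DIRECTIONALITY_PARAGRAPH_SEPARATOR) {
                levels[i] = paragraphLevel;
                trailing = true;
            } else if (d == Character.DIRECTIONALITY_WHITESPACE && trailing) {
                levels[i] = paragraphLevel;
            } else {
                trailing = false;
            }
        }
    }
}
```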
-
isStrong
private static boolean isStrong(int bc)
-
isNeutral
private static boolean isNeutral(int bc)
-
isRetainedFormatting
private static boolean isRetainedFormatting(int bc)
-
isRetainedFormatting
private static boolean isRetainedFormatting(int[] ca, int s, int e)
-
max
private static int max(int x, int y)
-
getClasses
private static int[] getClasses(int[] chars)
-
convertToScalar
private static boolean convertToScalar(java.lang.CharSequence cs, int[] chars) throws java.lang.IllegalArgumentException
Convert character sequence (a UTF-16 encoded string) to an array of Unicode scalar values expressed as integers. If a valid UTF-16 surrogate pair is encountered, it is converted to two integers, the first being the equivalent Unicode scalar value, and the second being negative one (-1). This special mechanism is used to track the use of surrogate pairs while working with Unicode scalar values, and permits maintaining indices that apply both to the input UTF-16 and output scalar value sequences.
- Parameters:
cs - a UTF-16 encoded character sequence
chars - an integer array to accept the converted scalar values, where the length of the array must be the same as the length of the input character sequence
- Returns:
- a boolean indicating that content is present that triggers bidirectional processing
- Throws:
java.lang.IllegalArgumentException- if the input sequence is not a valid UTF-16 string, e.g., if it contains an isolated UTF-16 surrogate
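The -1 placeholder scheme described above can be sketched as follows. This is a minimal illustration written from the description, not the FOP source: the class `ScalarConversionSketch` and method `toScalars` are hypothetical, and unlike the documented method it allocates and returns the array instead of filling a caller-supplied one and reporting whether bidi processing is triggered.

```java
// Sketch of the documented surrogate-tracking scheme; NOT the FOP source.
final class ScalarConversionSketch {

    // Output has the same length as the input; the slot of each low surrogate
    // holds -1 so that indices line up with the UTF-16 input.
    static int[] toScalars(CharSequence cs) {
        int n = cs.length();
        int[] chars = new int[n];
        for (int i = 0; i < n; i++) {
            char c = cs.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= n || !Character.isLowSurrogate(cs.charAt(i + 1))) {
                    throw new IllegalArgumentException(
                        "isolated high surrogate at index " + i);
                }
                chars[i] = Character.toCodePoint(c, cs.charAt(i + 1));
                chars[i + 1] = -1; // placeholder for the second pair member
                i++;               // skip the low surrogate just consumed
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException(
                    "isolated low surrogate at index " + i);
            } else {
                chars[i] = c;
            }
        }
        return chars;
    }
}
```

Because the output keeps the input's length, a level computed for index i of the scalar array can be written back to index i of a per-code-unit level array, with the -1 slot taking the level of its preceding entry.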
-
convertToScalar
private static int convertToScalar(int chHi, int chLo)
Convert UTF-16 surrogate pair to Unicode scalar value.
- Parameters:
chHi - high (most significant or first) surrogate
chLo - low (least significant or second) surrogate
- Returns:
- a unicode scalar value
- Throws:
java.lang.IllegalArgumentException- if one of the input surrogates is not valid
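The standard UTF-16 decoding arithmetic behind such a conversion is shown below as a sketch. The class `SurrogatePairSketch` and method `toScalar` are hypothetical names, and the exact validation performed by the FOP method is an assumption based on the documented exception.

```java
// Standard UTF-16 surrogate pair decoding; names are illustrative only.
final class SurrogatePairSketch {

    static int toScalar(int chHi, int chLo) {
        if (chHi < 0xD800 || chHi > 0xDBFF) {
            throw new IllegalArgumentException("not a high surrogate");
        }
        if (chLo < 0xDC00 || chLo > 0xDFFF) {
            throw new IllegalArgumentException("not a low surrogate");
        }
        // Ten payload bits from each surrogate, offset past the BMP.
        return (((chHi - 0xD800) << 10) | (chLo - 0xDC00)) + 0x10000;
    }
}
```

For valid pairs this agrees with the JDK's `Character.toCodePoint(char, char)`.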
-
triggersBidi
private static boolean triggersBidi(int ch)
Determine if character CH triggers bidirectional processing. Bidirectional processing is triggered if CH is a strong right-to-left character, an Arabic letter or number, or a right-to-left embedding or override character.
- Parameters:
ch - a Unicode scalar value
- Returns:
- true if character triggers bidirectional processing
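The test described above can be approximated with the JDK's own bidi class data. The sketch below assumes that the listed categories map to the `java.lang.Character` directionality constants R, AL, AN, RLE, and RLO. FOP uses its own class tables, so this is an approximation, and `TriggersBidiSketch` is a hypothetical name.

```java
// Approximation using JDK directionality data; NOT the FOP implementation.
final class TriggersBidiSketch {

    static boolean triggersBidi(int ch) {
        switch (Character.getDirectionality(ch)) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:           // strong R
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:    // Arabic letter
            case Character.DIRECTIONALITY_ARABIC_NUMBER:           // Arabic number
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING: // RLE
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE:  // RLO
                return true;
            default:
                return false;
        }
    }
}
```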
-
dump
private static void dump(java.lang.String header, int[] chars, int[] classes, int defaultLevel, int[] levels)
-
getClassName
private static java.lang.String getClassName(int bc)
-
padLeft
private static java.lang.String padLeft(int n, int width)
-
padLeft
private static java.lang.String padLeft(java.lang.String s, int width)
-
padRight
private static java.lang.String padRight(java.lang.String s, int width)