Package org.apache.lucene.index
Class TermsHashPerField
- java.lang.Object
-
- org.apache.lucene.index.TermsHashPerField
-
- All Implemented Interfaces:
java.lang.Comparable<TermsHashPerField>
- Direct Known Subclasses:
FreqProxTermsWriterPerField, TermVectorsConsumerPerField
abstract class TermsHashPerField extends java.lang.Object implements java.lang.Comparable<TermsHashPerField>
This class stores streams of information per term without knowing the size of the stream ahead of time. Each stream typically encodes one level of information, such as term frequency per document or term proximity. Internally this class allocates a linked list of slices that can be read by a ByteSliceReader for each term. Terms are first deduplicated in a BytesRefHash; once this is done, internal data structures point to the current offset of each stream that can be written to.
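The idea of deduplicating terms and then keeping several append-only streams per term can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class name PerTermStreamsSketch and its methods are hypothetical, a HashMap stands in for BytesRefHash, and growable ByteArrayOutputStream buffers stand in for the pooled, linked byte slices that the real class allocates.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (hypothetical): not Lucene code, and no slice pooling is modelled. */
final class PerTermStreamsSketch {
  private final int streamCount;
  // Stand-in for BytesRefHash: maps each distinct term to a dense term ID.
  private final Map<String, Integer> termIds = new HashMap<>();
  // One array of streams per term ID; each stream grows on demand.
  private final List<ByteArrayOutputStream[]> streams = new ArrayList<>();

  PerTermStreamsSketch(int streamCount) {
    this.streamCount = streamCount;
  }

  /** Returns the term ID, assigning a new one the first time a term is seen. */
  int add(String term) {
    Integer existing = termIds.get(term);
    if (existing != null) {
      return existing;
    }
    int termID = streams.size();
    termIds.put(term, termID);
    ByteArrayOutputStream[] perTerm = new ByteArrayOutputStream[streamCount];
    for (int i = 0; i < streamCount; i++) {
      perTerm[i] = new ByteArrayOutputStream();
    }
    streams.add(perTerm);
    return termID;
  }

  /** Appends one byte to the given stream of the given term. */
  void writeByte(int termID, int stream, byte b) {
    streams.get(termID)[stream].write(b);
  }

  /** Returns a copy of everything written so far to one stream of one term. */
  byte[] read(int termID, int stream) {
    return streams.get(termID)[stream].toByteArray();
  }

  public static void main(String[] args) {
    PerTermStreamsSketch sketch = new PerTermStreamsSketch(2);
    int termID = sketch.add("lucene");
    sketch.writeByte(termID, 0, (byte) 1); // e.g. a doc/freq stream
    sketch.writeByte(termID, 1, (byte) 5); // e.g. a position stream
    System.out.println(sketch.read(termID, 1).length);
  }
}
```

The point of the sketch is the shape of the data: one dense ID per distinct term, and streamCount independent write cursors per ID, none of which needs to know its final size in advance.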
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
private static class TermsHashPerField.PostingsBytesStartArray
-
Field Summary
Fields (Modifier and Type, Field, Description)
(package private) ByteBlockPool bytePool
private BytesRefHash bytesHash
private boolean doNextCall
private java.lang.String fieldName
private static int HASH_INIT_SIZE
(package private) IndexOptions indexOptions
private IntBlockPool intPool
private int lastDocID
private TermsHashPerField nextPerField
(package private) ParallelPostingsArray postingsArray
private int[] sortedTermIDs
private int streamAddressOffset
private int streamCount
private int[] termStreamAddressBuffer
-
Constructor Summary
Constructors (Constructor, Description)
TermsHashPerField(int streamCount, IntBlockPool intPool, ByteBlockPool bytePool, ByteBlockPool termBytePool, Counter bytesUsed, TermsHashPerField nextPerField, java.lang.String fieldName, IndexOptions indexOptions)
    streamCount: how many streams this field stores per term.
-
Method Summary
Methods (Modifier and Type, Method, Description)
private void add(int textStart, int docID)
(package private) void add(BytesRef termBytes, int docID)
    Called once per inverted token.
(package private) abstract void addTerm(int termID, int docID)
    Called when a previously seen term is seen again.
private boolean assertDocId(int docId)
int compareTo(TermsHashPerField other)
(package private) abstract ParallelPostingsArray createPostingsArray(int size)
    Creates a new postings array of the specified size.
(package private) void finish()
    Finish adding all instances of this field to the current document.
(package private) java.lang.String getFieldName()
(package private) TermsHashPerField getNextPerField()
(package private) int getNumTerms()
(package private) int[] getSortedTermIDs()
    Returns the sorted term IDs.
(package private) void initReader(ByteSliceReader reader, int termID, int stream)
private void initStreamSlices(int termID, int docID)
(package private) abstract void newPostingsArray()
    Called when the postings array is initialized or resized.
(package private) abstract void newTerm(int termID, int docID)
    Called when a term is seen for the first time.
private int positionStreamSlice(int termID, int docID)
(package private) void reinitHash()
(package private) void reset()
(package private) void sortTerms()
    Collapse the hash table and sort in-place; also sets this.sortedTermIDs to the results. This method must not be called twice unless reset() or reinitHash() was called.
(package private) boolean start(IndexableField field, boolean first)
    Start adding a new field instance; first is true if this is the first time this field name was seen in the document.
(package private) void writeByte(int stream, byte b)
(package private) void writeBytes(int stream, byte[] b, int offset, int len)
(package private) void writeVInt(int stream, int i)
-
-
-
Field Detail
-
HASH_INIT_SIZE
private static final int HASH_INIT_SIZE
- See Also:
- Constant Field Values
-
nextPerField
private final TermsHashPerField nextPerField
-
intPool
private final IntBlockPool intPool
-
bytePool
final ByteBlockPool bytePool
-
termStreamAddressBuffer
private int[] termStreamAddressBuffer
-
streamAddressOffset
private int streamAddressOffset
-
streamCount
private final int streamCount
-
fieldName
private final java.lang.String fieldName
-
indexOptions
final IndexOptions indexOptions
-
bytesHash
private final BytesRefHash bytesHash
-
postingsArray
ParallelPostingsArray postingsArray
-
lastDocID
private int lastDocID
-
sortedTermIDs
private int[] sortedTermIDs
-
doNextCall
private boolean doNextCall
-
-
Constructor Detail
-
TermsHashPerField
TermsHashPerField(int streamCount, IntBlockPool intPool, ByteBlockPool bytePool, ByteBlockPool termBytePool, Counter bytesUsed, TermsHashPerField nextPerField, java.lang.String fieldName, IndexOptions indexOptions)
streamCount: how many streams this field stores per term. E.g. doc(+freq) is 1 stream, prox+offset is a second.
-
-
Method Detail
-
reset
void reset()
-
initReader
final void initReader(ByteSliceReader reader, int termID, int stream)
-
sortTerms
final void sortTerms()
Collapse the hash table and sort in-place; also sets this.sortedTermIDs to the results. This method must not be called twice unless reset() or reinitHash() was called.
-
getSortedTermIDs
final int[] getSortedTermIDs()
Returns the sorted term IDs. sortTerms() must be called before.
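As an illustration of what "sorted term IDs" means, the sketch below orders term IDs by the bytes of the terms they refer to, leaving the term data itself untouched. It is not Lucene's implementation: the class SortedTermIdsSketch and the method sortTermIds are hypothetical, and the comparison by unsigned byte value is an assumption, since this page does not state the sort order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical sketch: sorts term IDs by term bytes instead of moving the terms. */
final class SortedTermIdsSketch {

  /** termBytes[termID] holds the bytes of that term; returns IDs in sorted term order. */
  static int[] sortTermIds(byte[][] termBytes) {
    return IntStream.range(0, termBytes.length)
        .boxed()
        // Assumption: terms compare as unsigned bytes, shorter prefix first.
        .sorted((a, b) -> Arrays.compareUnsigned(termBytes[a], termBytes[b]))
        .mapToInt(Integer::intValue)
        .toArray();
  }

  public static void main(String[] args) {
    byte[][] terms = {
      "banana".getBytes(StandardCharsets.UTF_8), // termID 0
      "apple".getBytes(StandardCharsets.UTF_8),  // termID 1
      "cherry".getBytes(StandardCharsets.UTF_8)  // termID 2
    };
    System.out.println(Arrays.toString(sortTermIds(terms))); // prints [1, 0, 2]
  }
}
```

A caller would then walk the returned array and look up each term's streams by ID, which is why the sorted IDs are only meaningful after the sort has run.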
-
reinitHash
final void reinitHash()
-
add
private void add(int textStart, int docID) throws java.io.IOException
- Throws:
java.io.IOException
-
initStreamSlices
private void initStreamSlices(int termID, int docID) throws java.io.IOException
- Throws:
java.io.IOException
-
assertDocId
private boolean assertDocId(int docId)
-
add
void add(BytesRef termBytes, int docID) throws java.io.IOException
Called once per inverted token. This is the primary entry point (for first TermsHash); postings use this API.
- Throws:
java.io.IOException
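The relationship between add and the abstract callbacks newTerm and addTerm can be sketched as a template method: the base class deduplicates the term and then dispatches to one of the two hooks. The classes below are hypothetical stand-ins, not the Lucene classes; terms are plain strings, a HashMap replaces BytesRefHash, and the counting subclass exists only to make the dispatch visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical base class: deduplicates terms, then calls newTerm or addTerm. */
abstract class TermDispatchSketch {
  private final Map<String, Integer> termIds = new HashMap<>();

  /** Called once per token; decides whether the term is new or already known. */
  final void add(String term, int docID) {
    Integer termID = termIds.get(term);
    if (termID == null) {
      termID = termIds.size(); // next dense ID
      termIds.put(term, termID);
      newTerm(termID, docID);
    } else {
      addTerm(termID, docID);
    }
  }

  /** Hook for the first occurrence of a term. */
  abstract void newTerm(int termID, int docID);

  /** Hook for every later occurrence of a term. */
  abstract void addTerm(int termID, int docID);
}

/** Hypothetical subclass that only counts occurrences per term ID. */
final class CountingTermsSketch extends TermDispatchSketch {
  final Map<Integer, Integer> freq = new HashMap<>();

  @Override
  void newTerm(int termID, int docID) {
    freq.put(termID, 1);
  }

  @Override
  void addTerm(int termID, int docID) {
    freq.merge(termID, 1, Integer::sum);
  }

  public static void main(String[] args) {
    CountingTermsSketch counts = new CountingTermsSketch();
    for (String token : new String[] {"to", "be", "or", "not", "to", "be"}) {
      counts.add(token, 0);
    }
    System.out.println(counts.freq); // term IDs 0 ("to") and 1 ("be") map to 2
  }
}
```

In this sketch a subclass never sees the term text, only the term ID, which mirrors the int termID parameters of newTerm and addTerm.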
-
positionStreamSlice
private int positionStreamSlice(int termID, int docID) throws java.io.IOException
- Throws:
java.io.IOException
-
writeByte
final void writeByte(int stream, byte b)
-
writeBytes
final void writeBytes(int stream, byte[] b, int offset, int len)
-
writeVInt
final void writeVInt(int stream, int i)
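This page does not describe the byte format that writeVInt produces. Assuming it follows the same variable-length integer layout as Lucene's DataOutput.writeVInt (seven payload bits per byte, low-order group first, high bit set on every byte except the last), the encoding can be sketched as follows. The class VIntSketch and its methods are hypothetical helpers and do not write into a term stream.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a variable-length int encoding (assumed format, see lead-in). */
final class VIntSketch {

  /** Encodes i using 7 bits per byte; the high bit marks "more bytes follow". */
  static byte[] encodeVInt(int i) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((i & ~0x7F) != 0) {
      out.write((i & 0x7F) | 0x80); // low 7 bits plus continuation flag
      i >>>= 7;                     // unsigned shift so negative values terminate
    }
    out.write(i); // final byte, high bit clear
    return out.toByteArray();
  }

  /** Decodes a single value produced by encodeVInt. */
  static int decodeVInt(byte[] bytes) {
    int value = 0;
    int shift = 0;
    for (byte b : bytes) {
      value |= (b & 0x7F) << shift;
      shift += 7;
      if ((b & 0x80) == 0) {
        break; // last byte of this value
      }
    }
    return value;
  }

  public static void main(String[] args) {
    byte[] encoded = encodeVInt(300);
    System.out.println(encoded.length);      // prints 2
    System.out.println(decodeVInt(encoded)); // prints 300
  }
}
```

Under this format small values, which dominate in deltas and frequencies, take a single byte, while a negative int always takes five.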
-
getNextPerField
final TermsHashPerField getNextPerField()
-
getFieldName
final java.lang.String getFieldName()
-
compareTo
public final int compareTo(TermsHashPerField other)
- Specified by:
compareTo in interface java.lang.Comparable<TermsHashPerField>
-
finish
void finish() throws java.io.IOException
Finish adding all instances of this field to the current document.
- Throws:
java.io.IOException
-
getNumTerms
final int getNumTerms()
-
start
boolean start(IndexableField field, boolean first)
Start adding a new field instance; first is true if this is the first time this field name was seen in the document.
-
newTerm
abstract void newTerm(int termID, int docID) throws java.io.IOException
Called when a term is seen for the first time.
- Throws:
java.io.IOException
-
addTerm
abstract void addTerm(int termID, int docID) throws java.io.IOException
Called when a previously seen term is seen again.
- Throws:
java.io.IOException
-
newPostingsArray
abstract void newPostingsArray()
Called when the postings array is initialized or resized.
-
createPostingsArray
abstract ParallelPostingsArray createPostingsArray(int size)
Creates a new postings array of the specified size.
-
-