Class STUniformSplitTermsWriter
- java.lang.Object
  - org.apache.lucene.codecs.FieldsConsumer
    - org.apache.lucene.codecs.uniformsplit.UniformSplitTermsWriter
      - org.apache.lucene.codecs.uniformsplit.sharedterms.STUniformSplitTermsWriter
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class STUniformSplitTermsWriter extends UniformSplitTermsWriter
Extends UniformSplitTermsWriter by sharing the terms of all fields in the same dictionary and by writing all the fields of a term in the same block line.
The block file contains all the term blocks for all fields. Each block line, for a single term, may hold TermState entries for multiple fields. The block file also contains the fields metadata at the end of the file.
The dictionary file contains a single trie (FST bytes) for all fields.
This structure is suited to indexes with many fields. In that case the shared-terms dictionary trie is much smaller than the combined per-field dictionary tries would be.
This FieldsConsumer requires a custom merge(MergeState, NormsProducer) method for efficiency. The regular merge would scan the fields sequentially, which internally would scan the whole shared-terms dictionary as many times as there are fields. The custom merge instead directly scans the internal shared-terms dictionary of all segments to merge, so the dictionary is scanned once regardless of the number of fields.
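To make the shared layout concrete, here is a minimal sketch in plain Java. It does not use Lucene classes: `SharedTermsLayoutSketch`, `FieldState` and `toSharedLines` are hypothetical names, and a doc frequency stands in for a real TermState. It only illustrates the idea of one line per term that carries the state of every field containing that term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only, not Lucene code: models one "block line" per term, shared by all fields. */
final class SharedTermsLayoutSketch {

  /** Hypothetical stand-in for a per-field term state (here just a doc frequency). */
  static final class FieldState {
    final String field;
    final int docFreq;

    FieldState(String field, int docFreq) {
      this.field = field;
      this.docFreq = docFreq;
    }

    @Override
    public String toString() {
      return field + ":" + docFreq;
    }
  }

  /** Converts field -> (term -> docFreq) into term -> list of field states. */
  static TreeMap<String, List<FieldState>> toSharedLines(Map<String, Map<String, Integer>> perField) {
    TreeMap<String, List<FieldState>> lines = new TreeMap<>();
    // Visit fields in sorted order so the states within each line have a stable order.
    Map<String, Map<String, Integer>> sortedFields = new TreeMap<>(perField);
    for (Map.Entry<String, Map<String, Integer>> field : sortedFields.entrySet()) {
      for (Map.Entry<String, Integer> term : field.getValue().entrySet()) {
        lines.computeIfAbsent(term.getKey(), k -> new ArrayList<>())
            .add(new FieldState(field.getKey(), term.getValue()));
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Integer>> perField = Map.of(
        "title", Map.of("lucene", 2, "search", 1),
        "body", Map.of("lucene", 5, "index", 3));
    // Each distinct term appears once, whatever the number of fields that contain it.
    toSharedLines(perField).forEach((term, states) -> System.out.println(term + " -> " + states));
  }
}
```

In this toy input the two fields hold four terms in total but only three distinct ones, so the shared structure has three lines. The saving grows with the number of fields that repeat the same terms.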
-
Nested Class Summary
Nested Classes (Modifier and Type, Class)
- private static class STUniformSplitTermsWriter.FieldsIterator
- private class STUniformSplitTermsWriter.FieldTerms
- private class STUniformSplitTermsWriter.MergingFieldTerms
- protected class STUniformSplitTermsWriter.SegmentPostings
- private class STUniformSplitTermsWriter.SegmentTerms
- private static interface STUniformSplitTermsWriter.SharedTermsWriter
- private class STUniformSplitTermsWriter.TermIterator<T>
- private class STUniformSplitTermsWriter.TermIteratorQueue<T>
-
Field Summary
-
Fields inherited from class org.apache.lucene.codecs.uniformsplit.UniformSplitTermsWriter
blockEncoder, blockOutput, DEFAULT_DELTA_NUM_LINES, DEFAULT_TARGET_NUM_BLOCK_LINES, deltaNumLines, dictionaryOutput, fieldInfos, fieldMetadataWriter, MAX_NUM_BLOCK_LINES, maxDoc, postingsWriter, targetNumBlockLines
-
Constructor Summary
Constructors (Modifier, Constructor)
- STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder)
- protected STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder, FieldMetadata.Serializer fieldMetadataWriter, java.lang.String codecName, int versionCurrent, java.lang.String termsBlocksExtension, java.lang.String dictionaryExtension)
- STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, BlockEncoder blockEncoder)
-
Method Summary
-
Methods inherited from class org.apache.lucene.codecs.uniformsplit.UniformSplitTermsWriter
close, validateSettings, writeDictionary, writeEncodedFieldsMetadata, writeFieldsMetadata, writeFieldTerms, writePostingLine, writeUnencodedFieldsMetadata
-
Constructor Detail
-
STUniformSplitTermsWriter
public STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, BlockEncoder blockEncoder) throws java.io.IOException
- Throws:
java.io.IOException
-
STUniformSplitTermsWriter
public STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder) throws java.io.IOException
- Throws:
java.io.IOException
-
STUniformSplitTermsWriter
protected STUniformSplitTermsWriter(PostingsWriterBase postingsWriter, SegmentWriteState state, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder, FieldMetadata.Serializer fieldMetadataWriter, java.lang.String codecName, int versionCurrent, java.lang.String termsBlocksExtension, java.lang.String dictionaryExtension) throws java.io.IOException
- Throws:
java.io.IOException
-
Method Detail
-
write
public void write(Fields fields, NormsProducer normsProducer) throws java.io.IOException
Description copied from class: FieldsConsumer
Write all fields, terms and postings. This is the "pull" API, allowing you to iterate more than once over the postings, somewhat analogous to using a DOM API to traverse an XML tree.
Notes:
- You must compute index statistics, including each Term's docFreq and totalTermFreq, as well as the summary sumTotalTermFreq, sumTotalDocFreq and docCount.
- You must skip terms that have no docs and fields that have no terms, even though the provided Fields API will expose them; this typically requires lazily writing the field or term until you've actually seen the first term or document.
- The provided Fields instance is limited: you cannot call any methods that return statistics/counts; you cannot pass a non-null live docs when pulling docs/positions enums.
- Overrides:
write in class UniformSplitTermsWriter
- Throws:
java.io.IOException
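The notes above ask a writer to compute statistics and to skip empty terms and fields by writing lazily. A minimal sketch of that pattern follows, in plain Java rather than against the Lucene API: `LazyFieldWriteSketch` is a hypothetical name, the input map of per-document term frequencies is an assumed stand-in for the Fields instance, and docCount is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only, not Lucene code: emits a field only once a term with docs is seen. */
final class LazyFieldWriteSketch {

  /**
   * Input: field -> term -> per-document term frequencies (one entry per matching doc).
   * Output: the lines that would be "written", with empty terms and empty fields skipped.
   */
  static List<String> write(Map<String, Map<String, int[]>> fields) {
    List<String> out = new ArrayList<>();
    Map<String, Map<String, int[]>> sortedFields = new TreeMap<>(fields);
    for (Map.Entry<String, Map<String, int[]>> field : sortedFields.entrySet()) {
      boolean fieldStarted = false; // the field header is written lazily
      long sumDocFreq = 0;
      long sumTotalTermFreq = 0;
      Map<String, int[]> sortedTerms = new TreeMap<>(field.getValue());
      for (Map.Entry<String, int[]> term : sortedTerms.entrySet()) {
        int docFreq = term.getValue().length;
        if (docFreq == 0) {
          continue; // skip terms that have no docs
        }
        long totalTermFreq = 0;
        for (int freq : term.getValue()) {
          totalTermFreq += freq;
        }
        if (!fieldStarted) {
          out.add("field " + field.getKey());
          fieldStarted = true;
        }
        out.add("  term " + term.getKey() + " docFreq=" + docFreq
            + " totalTermFreq=" + totalTermFreq);
        sumDocFreq += docFreq;
        sumTotalTermFreq += totalTermFreq;
      }
      if (fieldStarted) {
        // Field-level summary statistics, accumulated from the per-term ones.
        out.add("  stats sumDocFreq=" + sumDocFreq + " sumTotalTermFreq=" + sumTotalTermFreq);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Map<String, int[]>> fields = Map.of(
        "a", Map.of("x", new int[] {2, 1}, "y", new int[0]),
        "b", Map.of("z", new int[0]));
    write(fields).forEach(System.out::println);
  }
}
```

Deferring the field header until the first non-empty term is what lets a field whose terms all have no docs disappear from the output entirely.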
-
writeSegment
private void writeSegment(STUniformSplitTermsWriter.SharedTermsWriter termsWriter) throws java.io.IOException
Writes the new segment with the provided STUniformSplitTermsWriter.SharedTermsWriter, which can be either a single-segment writer or a multi-segment merging writer.
- Throws:
java.io.IOException
-
writeSingleSegment
private java.util.Collection<FieldMetadata> writeSingleSegment(Fields fields, NormsProducer normsProducer, STBlockWriter blockWriter, IndexDictionary.Builder dictionaryBuilder) throws java.io.IOException
- Throws:
java.io.IOException
-
createFieldMetadataList
private java.util.List<FieldMetadata> createFieldMetadataList(java.util.Iterator<FieldInfo> fieldInfos, int maxDoc)
-
createFieldTermsQueue
private STUniformSplitTermsWriter.TermIteratorQueue<STUniformSplitTermsWriter.FieldTerms> createFieldTermsQueue(Fields fields, java.util.List<FieldMetadata> fieldMetadataList) throws java.io.IOException
- Throws:
java.io.IOException
-
groupByTerm
private <T> void groupByTerm(STUniformSplitTermsWriter.TermIteratorQueue<T> termIteratorQueue, STUniformSplitTermsWriter.TermIterator<T> topTermIterator, java.util.List<STUniformSplitTermsWriter.TermIterator<T>> groupedTermIterators)
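The page gives no description for groupByTerm, so the following is only a guess at the general technique its signature suggests: a priority queue of sorted term iterators from which every iterator positioned on the smallest term is popped as one group. The sketch is plain Java; `GroupByTermSketch`, `TermCursor` and `mergeGrouped` are hypothetical names and do not mirror the private Lucene classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only, not Lucene code: k-way merge grouping cursors that share a term. */
final class GroupByTermSketch {

  /** Hypothetical stand-in for a term iterator tagged with its source (e.g. a field name). */
  static final class TermCursor implements Comparable<TermCursor> {
    final String source;
    private final Iterator<String> terms; // must yield terms in sorted order
    String term; // current term, or null when exhausted

    TermCursor(String source, List<String> sortedTerms) {
      this.source = source;
      this.terms = sortedTerms.iterator();
      next();
    }

    /** Advances to the next term; returns false when exhausted. */
    boolean next() {
      term = terms.hasNext() ? terms.next() : null;
      return term != null;
    }

    @Override
    public int compareTo(TermCursor other) {
      int cmp = term.compareTo(other.term);
      return cmp != 0 ? cmp : source.compareTo(other.source);
    }
  }

  /** Returns one entry per distinct term, formatted "term=[sources]", in term order. */
  static List<String> mergeGrouped(List<TermCursor> cursors) {
    PriorityQueue<TermCursor> queue = new PriorityQueue<>();
    for (TermCursor cursor : cursors) {
      if (cursor.term != null) {
        queue.add(cursor);
      }
    }
    List<String> lines = new ArrayList<>();
    List<TermCursor> group = new ArrayList<>();
    while (!queue.isEmpty()) {
      group.clear();
      TermCursor top = queue.poll();
      group.add(top);
      // Pop every other cursor currently positioned on the same term.
      while (!queue.isEmpty() && queue.peek().term.equals(top.term)) {
        group.add(queue.poll());
      }
      List<String> sources = new ArrayList<>();
      for (TermCursor cursor : group) {
        sources.add(cursor.source);
      }
      lines.add(top.term + "=" + sources);
      // Advance the grouped cursors; re-queue only those that still have terms.
      for (TermCursor cursor : group) {
        if (cursor.next()) {
          queue.add(cursor);
        }
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    List<TermCursor> cursors = List.of(
        new TermCursor("body", List.of("index", "lucene")),
        new TermCursor("title", List.of("lucene", "search")));
    mergeGrouped(cursors).forEach(System.out::println);
  }
}
```

Cursors are only advanced after they have been removed from the queue, which keeps the heap ordering valid while their current term changes.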
-
writePostingLines
private void writePostingLines(BytesRef term, java.util.List<? extends STUniformSplitTermsWriter.TermIterator<STUniformSplitTermsWriter.FieldTerms>> groupedFieldTerms, NormsProducer normsProducer, java.util.List<FieldMetadataTermState> termStates) throws java.io.IOException
- Throws:
java.io.IOException
-
nextTermForIterators
private <T> void nextTermForIterators(java.util.List<? extends STUniformSplitTermsWriter.TermIterator<T>> termIterators, STUniformSplitTermsWriter.TermIteratorQueue<T> termIteratorQueue) throws java.io.IOException
- Throws:
java.io.IOException
-
writeFieldMetadataList
private int writeFieldMetadataList(java.util.Collection<FieldMetadata> fieldMetadataList) throws java.io.IOException
- Throws:
java.io.IOException
-
writeDictionary
protected void writeDictionary(int fieldsNumber, IndexDictionary.Builder dictionaryBuilder) throws java.io.IOException
- Throws:
java.io.IOException
-
merge
public void merge(MergeState mergeState, NormsProducer normsProducer) throws java.io.IOException
Description copied from class: FieldsConsumer
Merges in the fields from the readers in mergeState. The default implementation skips and maps around deleted documents, and calls FieldsConsumer.write(Fields,NormsProducer). Implementations can override this method for more sophisticated merging (bulk-byte copying, etc).
- Overrides:
merge in class FieldsConsumer
- Throws:
java.io.IOException
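The class description explains why this writer overrides merge: a field-by-field merge would rescan the whole shared-terms dictionary once per field. The sketch below only counts dictionary entries visited under each strategy. It is plain Java, not the Lucene implementation: `MergeScanCostSketch` and both method names are hypothetical, and a term-to-fields map stands in for the shared dictionary.

```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative only, not Lucene code: compares entries visited by two merge strategies. */
final class MergeScanCostSketch {

  /**
   * Field-by-field strategy: every field triggers a full scan of the shared dictionary.
   * Returns {entries visited, (term, field) pairs merged}.
   */
  static long[] perFieldScan(SortedMap<String, Set<String>> sharedDictionary, Set<String> fields) {
    long visits = 0;
    long merged = 0;
    for (String field : fields) {
      for (Set<String> termFields : sharedDictionary.values()) {
        visits++; // the term is visited even when this field does not contain it
        if (termFields.contains(field)) {
          merged++; // stands in for merging this term's postings for this field
        }
      }
    }
    return new long[] {visits, merged};
  }

  /**
   * Single-scan strategy: each term is visited once and all its fields are handled together.
   * Returns {entries visited, (term, field) pairs merged}.
   */
  static long[] singleScan(SortedMap<String, Set<String>> sharedDictionary) {
    long visits = 0;
    long merged = 0;
    for (Set<String> termFields : sharedDictionary.values()) {
      visits++;
      merged += termFields.size();
    }
    return new long[] {visits, merged};
  }

  public static void main(String[] args) {
    SortedMap<String, Set<String>> dictionary = new TreeMap<>();
    dictionary.put("index", Set.of("body"));
    dictionary.put("lucene", Set.of("body", "title"));
    dictionary.put("search", Set.of("title"));
    long[] perField = perFieldScan(dictionary, Set.of("body", "title"));
    long[] single = singleScan(dictionary);
    System.out.println("per-field scan: visits=" + perField[0] + " merged=" + perField[1]);
    System.out.println("single scan:    visits=" + single[0] + " merged=" + single[1]);
  }
}
```

Both strategies merge the same (term, field) pairs. Only the number of dictionary entries visited differs, growing with the number of fields in the first case and staying at the dictionary size in the second.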
-
mergeSegments
private java.util.Collection<FieldMetadata> mergeSegments(MergeState mergeState, NormsProducer normsProducer, java.util.List<STUniformSplitTermsWriter.TermIterator<STUniformSplitTermsWriter.SegmentTerms>> segmentTermsList, STBlockWriter blockWriter, IndexDictionary.Builder dictionaryBuilder) throws java.io.IOException
- Throws:
java.io.IOException
-
createMergingFieldTermsMap
private java.util.Map<java.lang.String,STUniformSplitTermsWriter.MergingFieldTerms> createMergingFieldTermsMap(java.util.List<FieldMetadata> fieldMetadataList, int numSegments)
-
createSegmentTermsQueue
private STUniformSplitTermsWriter.TermIteratorQueue<STUniformSplitTermsWriter.SegmentTerms> createSegmentTermsQueue(java.util.List<STUniformSplitTermsWriter.TermIterator<STUniformSplitTermsWriter.SegmentTerms>> segmentTermsList) throws java.io.IOException
- Throws:
java.io.IOException
-
combineSegmentsFields
private void combineSegmentsFields(java.util.List<STUniformSplitTermsWriter.TermIterator<STUniformSplitTermsWriter.SegmentTerms>> groupedSegmentTerms, java.util.Map<java.lang.String,java.util.List<STUniformSplitTermsWriter.SegmentPostings>> fieldPostingsMap)
-
combinePostingsPerField
private void combinePostingsPerField(BytesRef term, java.util.Map<java.lang.String,STUniformSplitTermsWriter.MergingFieldTerms> fieldTermsMap, java.util.Map<java.lang.String,java.util.List<STUniformSplitTermsWriter.SegmentPostings>> fieldPostingsMap, java.util.List<STUniformSplitTermsWriter.MergingFieldTerms> groupedFieldTerms)
-