Class CheckIndex
- java.lang.Object
-
- org.apache.lucene.index.CheckIndex
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public final class CheckIndex extends java.lang.Object implements java.io.Closeable
Basic tool and API to check the health of an index and write a new segments file that removes references to problematic segments.
As this tool checks every byte in the index, it can take a long time to run on a large index.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static class | CheckIndex.ConstantRelationIntersectVisitor |
private static interface | CheckIndex.DocValuesIteratorSupplier |
static class | CheckIndex.Options | Run-time configuration options for CheckIndex commands.
static class | CheckIndex.Status | Returned from checkIndex() detailing the health and status of the index.
static class | CheckIndex.VerifyPointsVisitor | Walks the entire N-dimensional points space, verifying that all points fall within the last cell's boundaries.
-
Field Summary
Fields
Modifier and Type | Field | Description
private static boolean | assertsOn |
private boolean | checksumsOnly |
private boolean | closed |
private Directory | dir |
private boolean | doSlowChecks |
private boolean | failFast |
private java.io.PrintStream | infoStream |
private boolean | verbose |
private Lock | writeLock |
-
Constructor Summary
Constructors
Constructor | Description
CheckIndex(Directory dir) | Create a new CheckIndex on the directory.
CheckIndex(Directory dir, Lock writeLock) | Expert: create a CheckIndex on the directory with the specified lock.
-
Method Summary
All Methods / Static Methods / Instance Methods / Concrete Methods
Modifier and Type | Method | Description
static boolean | assertsOn() | Check whether asserts are enabled or not.
private static void | checkBinaryDocValues(java.lang.String fieldName, int maxDoc, BinaryDocValues bdv, BinaryDocValues bdv2) |
private static void | checkDocValues(FieldInfo fi, DocValuesProducer dvReader, int maxDoc, java.io.PrintStream infoStream, CheckIndex.Status.DocValuesStatus status) |
private static void | checkDVIterator(FieldInfo fi, int maxDoc, CheckIndex.DocValuesIteratorSupplier producer) |
private static CheckIndex.Status.TermIndexStatus | checkFields(Fields fields, Bits liveDocs, int maxDoc, FieldInfos fieldInfos, NormsProducer normsProducer, boolean doPrint, boolean isVectors, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks) | Checks that the Fields API is consistent with itself.
(package private) static void | checkImpacts(Impacts impacts, int lastTarget) |
CheckIndex.Status | checkIndex() | Returns a CheckIndex.Status instance detailing the state of the index.
CheckIndex.Status | checkIndex(java.util.List<java.lang.String> onlySegments) | Returns a CheckIndex.Status instance detailing the state of the index.
private static void | checkNumericDocValues(java.lang.String fieldName, NumericDocValues ndv, NumericDocValues ndv2) |
private static boolean | checkSingleTermRange(java.lang.String field, int maxDoc, Terms terms, BytesRef minTerm, BytesRef maxTerm, FixedBitSet normalDocs, FixedBitSet intersectDocs) | Tests Terms.intersect on this range and validates that it returns the same doc IDs as a non-intersect TermsEnum.
private static void | checkSoftDeletes(java.lang.String softDeletesField, SegmentCommitInfo info, SegmentReader reader, java.io.PrintStream infoStream, boolean failFast) |
private static void | checkSortedDocValues(java.lang.String fieldName, int maxDoc, SortedDocValues dv, SortedDocValues dv2) |
private static void | checkSortedNumericDocValues(java.lang.String fieldName, int maxDoc, SortedNumericDocValues ndv, SortedNumericDocValues ndv2) |
private static void | checkSortedSetDocValues(java.lang.String fieldName, int maxDoc, SortedSetDocValues dv, SortedSetDocValues dv2) |
void | close() |
int | doCheck(CheckIndex.Options opts) | Actually perform the index check.
private static int | doMain(java.lang.String[] args) |
boolean | doSlowChecks() |
private void | ensureOpen() |
void | exorciseIndex(CheckIndex.Status result) | Repairs the index using the result previously returned from checkIndex().
boolean | getChecksumsOnly() | See setChecksumsOnly(boolean).
private static long | getDocsFromTermRange(java.lang.String field, int maxDoc, TermsEnum termsEnum, FixedBitSet docsSeen, BytesRef minTerm, BytesRef maxTerm, boolean isIntersect) | Visits all terms in the range minTerm (inclusive) to maxTerm (exclusive), marking all doc IDs encountered into docsSeen, and returning the total number of terms visited.
boolean | getFailFast() | See setFailFast(boolean).
static void | main(java.lang.String[] args) | Command-line interface to check and exorcise corrupt segments from an index.
private static void | msg(java.io.PrintStream out, java.lang.String msg) |
private static double | nsToSec(long ns) |
static CheckIndex.Options | parseOptions(java.lang.String[] args) | Parse command line args into fields.
void | setChecksumsOnly(boolean v) | If true, only validate physical integrity for all files.
void | setDoSlowChecks(boolean v) | If true, additional slow checks are performed.
void | setFailFast(boolean v) | If true, just throw the original exception immediately when corruption is detected, rather than continuing to iterate to other segments looking for more corruption.
void | setInfoStream(java.io.PrintStream out) | Set infoStream where messages should go.
void | setInfoStream(java.io.PrintStream out, boolean verbose) | Set infoStream where messages should go.
private static boolean | testAsserts() |
static CheckIndex.Status.DocValuesStatus | testDocValues(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test docvalues.
static CheckIndex.Status.FieldInfoStatus | testFieldInfos(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test field infos.
static CheckIndex.Status.FieldNormStatus | testFieldNorms(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test field norms.
static CheckIndex.Status.LiveDocStatus | testLiveDocs(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test live docs.
static CheckIndex.Status.PointsStatus | testPoints(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test the points index.
static CheckIndex.Status.TermIndexStatus | testPostings(CodecReader reader, java.io.PrintStream infoStream) | Test the term index.
static CheckIndex.Status.TermIndexStatus | testPostings(CodecReader reader, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) | Test the term index.
static CheckIndex.Status.IndexSortStatus | testSort(CodecReader reader, Sort sort, java.io.PrintStream infoStream, boolean failFast) | Tests index sort order.
static CheckIndex.Status.StoredFieldStatus | testStoredFields(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) | Test stored fields.
static CheckIndex.Status.TermVectorStatus | testTermVectors(CodecReader reader, java.io.PrintStream infoStream) | Test term vectors.
static CheckIndex.Status.TermVectorStatus | testTermVectors(CodecReader reader, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) | Test term vectors.
-
-
-
Field Detail
-
infoStream
private java.io.PrintStream infoStream
-
dir
private Directory dir
-
writeLock
private Lock writeLock
-
closed
private volatile boolean closed
-
doSlowChecks
private boolean doSlowChecks
-
failFast
private boolean failFast
-
verbose
private boolean verbose
-
checksumsOnly
private boolean checksumsOnly
-
assertsOn
private static boolean assertsOn
-
-
Constructor Detail
-
CheckIndex
public CheckIndex(Directory dir) throws java.io.IOException
Create a new CheckIndex on the directory.
- Throws:
java.io.IOException
-
CheckIndex
public CheckIndex(Directory dir, Lock writeLock)
Expert: create a CheckIndex on the directory with the specified lock. This should not be used except in unit tests. It exists only to support special tests (such as TestIndexWriterExceptions*) that would otherwise be more complicated to debug if they had to close the writer for each check.
-
-
Method Detail
-
ensureOpen
private void ensureOpen()
-
close
public void close() throws java.io.IOException
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Throws:
java.io.IOException
-
setDoSlowChecks
public void setDoSlowChecks(boolean v)
If true, additional slow checks are performed. This will likely drastically increase the time it takes to run CheckIndex!
-
doSlowChecks
public boolean doSlowChecks()
-
setFailFast
public void setFailFast(boolean v)
If true, just throw the original exception immediately when corruption is detected, rather than continuing to iterate to other segments looking for more corruption.
-
getFailFast
public boolean getFailFast()
See setFailFast(boolean).
-
getChecksumsOnly
public boolean getChecksumsOnly()
See setChecksumsOnly(boolean).
-
setChecksumsOnly
public void setChecksumsOnly(boolean v)
If true, only validate physical integrity for all files. Note that the returned nested status objects (e.g. storedFieldStatus) will be null.
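A minimal sketch of a checksums-only pass, assuming Lucene core is on the classpath. The class name `ChecksumsOnlyDemo` and the helper `buildDemoIndex` are hypothetical and exist only so the example can run on its own against a small in-memory index. It illustrates the note above: the nested per-segment status objects stay null in this mode.

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class ChecksumsOnlyDemo {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    } // closing the writer commits the document
    return dir;
  }

  /** Validates only the physical integrity (checksums) of the index files. */
  static CheckIndex.Status checksumsOnly(Directory dir) throws IOException {
    try (CheckIndex checker = new CheckIndex(dir)) {
      checker.setChecksumsOnly(true);
      return checker.checkIndex();
    }
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      CheckIndex.Status status = checksumsOnly(dir);
      System.out.println("clean = " + status.clean);
      for (CheckIndex.Status.SegmentInfoStatus seg : status.segmentInfos) {
        // In checksums-only mode the deeper per-segment results are not populated.
        System.out.println(seg.name + " storedFieldStatus = " + seg.storedFieldStatus);
      }
    }
  }
}
```

Code that reads the nested status fields should therefore null-check them whenever checksums-only mode may have been enabled.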
-
setInfoStream
public void setInfoStream(java.io.PrintStream out, boolean verbose)
Set infoStream where messages should go. If null, no messages are printed. If verbose is true then more details are printed.
-
setInfoStream
public void setInfoStream(java.io.PrintStream out)
Set infoStream where messages should go. See setInfoStream(PrintStream,boolean).
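As an illustration of the info stream, the following sketch captures the checker's messages in a string instead of printing them to the console. The class name `InfoStreamDemo` and its helpers are hypothetical, and the demo index is a throwaway in-memory one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class InfoStreamDemo {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    }
    return dir;
  }

  /** Runs a check and returns everything CheckIndex wrote to its info stream. */
  static String report(Directory dir, boolean verbose) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (PrintStream out = new PrintStream(buffer, true, "UTF-8");
         CheckIndex checker = new CheckIndex(dir)) {
      checker.setInfoStream(out, verbose);
      checker.checkIndex();
    }
    return buffer.toString("UTF-8");
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      System.out.println(report(dir, false));
    }
  }
}
```

Capturing the stream this way can be useful for attaching the report to a log entry or a monitoring alert.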
-
msg
private static void msg(java.io.PrintStream out, java.lang.String msg)
-
checkIndex
public CheckIndex.Status checkIndex() throws java.io.IOException
Returns a CheckIndex.Status instance detailing the state of the index.
As this method checks every byte in the index, it can take a long time to run on a large index.
WARNING: make sure you only call this when the index is not opened by any writer.
- Throws:
java.io.IOException
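A minimal sketch of a programmatic health check, assuming Lucene core is on the classpath and that no IndexWriter has the index open. The class name `CheckIndexExample` and the helpers `isClean` and `buildDemoIndex` are hypothetical. A real caller would typically pass a Directory opened on an existing index.

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class CheckIndexExample {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    } // the writer is closed here, so the index is no longer open for writing
    return dir;
  }

  /** Returns true if CheckIndex reported no problems for the index in dir. */
  static boolean isClean(Directory dir) throws IOException {
    // CheckIndex is Closeable, so try-with-resources releases it after the check.
    try (CheckIndex checker = new CheckIndex(dir)) {
      checker.setInfoStream(System.out); // optional: print the report
      CheckIndex.Status status = checker.checkIndex();
      return status.clean;
    }
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      System.out.println("clean = " + isClean(dir));
    }
  }
}
```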
-
checkIndex
public CheckIndex.Status checkIndex(java.util.List<java.lang.String> onlySegments) throws java.io.IOException
Returns a CheckIndex.Status instance detailing the state of the index.
As this method checks every byte in the specified segments, it can take a long time to run on a large index.
- Parameters:
onlySegments - list of specific segment names to check
- Throws:
java.io.IOException
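The sketch below restricts a check to named segments. It assumes the segment names are read from the latest commit via `SegmentInfos.readLatestCommit`; the class `SegmentCheckDemo` and its helper methods are hypothetical. In practice the names would usually come from an earlier report or from the operator (for example `_2` or `_a`).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SegmentCommitInfo;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class SegmentCheckDemo {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    }
    return dir;
  }

  /** Lists the segment names referenced by the latest commit. */
  static List<String> segmentNames(Directory dir) throws IOException {
    List<String> names = new ArrayList<>();
    for (SegmentCommitInfo sci : SegmentInfos.readLatestCommit(dir)) {
      names.add(sci.info.name);
    }
    return names;
  }

  /** Checks only the segments named in the latest commit (here: all of them). */
  static CheckIndex.Status checkListedSegments(Directory dir) throws IOException {
    List<String> names = segmentNames(dir);
    try (CheckIndex checker = new CheckIndex(dir)) {
      return checker.checkIndex(names);
    }
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      CheckIndex.Status status = checkListedSegments(dir);
      System.out.println("segments = " + segmentNames(dir));
      System.out.println("clean = " + status.clean + ", partial = " + status.partial);
    }
  }
}
```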
-
testSort
public static CheckIndex.Status.IndexSortStatus testSort(CodecReader reader, Sort sort, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Tests index sort order.
- Throws:
java.io.IOException
-
testLiveDocs
public static CheckIndex.Status.LiveDocStatus testLiveDocs(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test live docs.
- Throws:
java.io.IOException
-
testFieldInfos
public static CheckIndex.Status.FieldInfoStatus testFieldInfos(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test field infos.
- Throws:
java.io.IOException
-
testFieldNorms
public static CheckIndex.Status.FieldNormStatus testFieldNorms(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test field norms.
- Throws:
java.io.IOException
-
getDocsFromTermRange
private static long getDocsFromTermRange(java.lang.String field, int maxDoc, TermsEnum termsEnum, FixedBitSet docsSeen, BytesRef minTerm, BytesRef maxTerm, boolean isIntersect) throws java.io.IOException
Visits all terms in the range minTerm (inclusive) to maxTerm (exclusive), marking all doc IDs encountered into docsSeen, and returning the total number of terms visited.
- Throws:
java.io.IOException
-
checkSingleTermRange
private static boolean checkSingleTermRange(java.lang.String field, int maxDoc, Terms terms, BytesRef minTerm, BytesRef maxTerm, FixedBitSet normalDocs, FixedBitSet intersectDocs) throws java.io.IOException
Tests Terms.intersect on this range and validates that it returns the same doc IDs as a non-intersect TermsEnum. Returns true if any fake terms were seen.
- Throws:
java.io.IOException
-
checkFields
private static CheckIndex.Status.TermIndexStatus checkFields(Fields fields, Bits liveDocs, int maxDoc, FieldInfos fieldInfos, NormsProducer normsProducer, boolean doPrint, boolean isVectors, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks) throws java.io.IOException
Checks that the Fields API is consistent with itself.
- Throws:
java.io.IOException
-
checkImpacts
static void checkImpacts(Impacts impacts, int lastTarget)
-
testPostings
public static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, java.io.PrintStream infoStream) throws java.io.IOException
Test the term index.
- Throws:
java.io.IOException
-
testPostings
public static CheckIndex.Status.TermIndexStatus testPostings(CodecReader reader, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) throws java.io.IOException
Test the term index.
- Throws:
java.io.IOException
-
testPoints
public static CheckIndex.Status.PointsStatus testPoints(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test the points index.
- Throws:
java.io.IOException
-
testStoredFields
public static CheckIndex.Status.StoredFieldStatus testStoredFields(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test stored fields.
- Throws:
java.io.IOException
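The static test methods can also be called on individual segment readers. The sketch below runs testStoredFields on each leaf of a DirectoryReader. It assumes that the leaves of a plain DirectoryReader are CodecReader instances (this is an assumption of the sketch, made explicit by the cast); the class `StoredFieldsDemo` and its helpers are hypothetical.

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.CodecReader;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class StoredFieldsDemo {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    }
    return dir;
  }

  /** Sums the stored-field counts reported by testStoredFields for every segment. */
  static long totalStoredFields(Directory dir) throws IOException {
    long total = 0;
    try (DirectoryReader reader = DirectoryReader.open(dir)) {
      for (LeafReaderContext ctx : reader.leaves()) {
        // Assumption: each leaf is a segment reader that extends CodecReader.
        CodecReader segment = (CodecReader) ctx.reader();
        // infoStream = null suppresses output; failFast = true rethrows on corruption.
        CheckIndex.Status.StoredFieldStatus status =
            CheckIndex.testStoredFields(segment, null, true);
        total += status.totFields;
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      System.out.println("stored fields = " + totalStoredFields(dir));
    }
  }
}
```

The other test* methods with the same (CodecReader, PrintStream, boolean) signature can be substituted in the loop in the same way.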
-
testDocValues
public static CheckIndex.Status.DocValuesStatus testDocValues(CodecReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
Test docvalues.
- Throws:
java.io.IOException
-
checkDVIterator
private static void checkDVIterator(FieldInfo fi, int maxDoc, CheckIndex.DocValuesIteratorSupplier producer) throws java.io.IOException
- Throws:
java.io.IOException
-
checkBinaryDocValues
private static void checkBinaryDocValues(java.lang.String fieldName, int maxDoc, BinaryDocValues bdv, BinaryDocValues bdv2) throws java.io.IOException
- Throws:
java.io.IOException
-
checkSortedDocValues
private static void checkSortedDocValues(java.lang.String fieldName, int maxDoc, SortedDocValues dv, SortedDocValues dv2) throws java.io.IOException
- Throws:
java.io.IOException
-
checkSortedSetDocValues
private static void checkSortedSetDocValues(java.lang.String fieldName, int maxDoc, SortedSetDocValues dv, SortedSetDocValues dv2) throws java.io.IOException
- Throws:
java.io.IOException
-
checkSortedNumericDocValues
private static void checkSortedNumericDocValues(java.lang.String fieldName, int maxDoc, SortedNumericDocValues ndv, SortedNumericDocValues ndv2) throws java.io.IOException
- Throws:
java.io.IOException
-
checkNumericDocValues
private static void checkNumericDocValues(java.lang.String fieldName, NumericDocValues ndv, NumericDocValues ndv2) throws java.io.IOException
- Throws:
java.io.IOException
-
checkDocValues
private static void checkDocValues(FieldInfo fi, DocValuesProducer dvReader, int maxDoc, java.io.PrintStream infoStream, CheckIndex.Status.DocValuesStatus status) throws java.lang.Exception
- Throws:
java.lang.Exception
-
testTermVectors
public static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, java.io.PrintStream infoStream) throws java.io.IOException
Test term vectors.
- Throws:
java.io.IOException
-
testTermVectors
public static CheckIndex.Status.TermVectorStatus testTermVectors(CodecReader reader, java.io.PrintStream infoStream, boolean verbose, boolean doSlowChecks, boolean failFast) throws java.io.IOException
Test term vectors.
- Throws:
java.io.IOException
-
exorciseIndex
public void exorciseIndex(CheckIndex.Status result) throws java.io.IOException
Repairs the index using the result previously returned from checkIndex(). Note that this does not remove any of the unreferenced files after it's done; you must separately open an IndexWriter, which deletes unreferenced files when it's created.
WARNING: this writes a new segments file into the index, effectively removing all documents in broken segments from the index. BE CAREFUL.
- Throws:
java.io.IOException
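A cautious sketch of the repair flow described above: check first, exorcise only if the index is not clean, then open an IndexWriter so unreferenced files are deleted. The class `ExorciseDemo` and its helpers are hypothetical, and the guard on `missingSegments` is a design choice of this sketch. On a real index, take a backup before running anything like this, because exorcising discards every document in the broken segments.

```java
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class ExorciseDemo {

  /** Hypothetical helper: builds a one-document in-memory index. */
  static Directory buildDemoIndex() throws IOException {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    }
    return dir;
  }

  /** Returns true if broken segments were dropped; false if nothing was changed. */
  static boolean repairIfBroken(Directory dir) throws IOException {
    boolean repaired = false;
    try (CheckIndex checker = new CheckIndex(dir)) {
      CheckIndex.Status status = checker.checkIndex();
      // Sketch-level guard: without a readable segments file there is nothing to rewrite.
      if (!status.clean && !status.missingSegments) {
        checker.exorciseIndex(status); // LOSES DATA: drops all docs in broken segments
        repaired = true;
      }
    }
    if (repaired) {
      // Opening (and closing) an IndexWriter deletes files that are no longer referenced.
      try (IndexWriter cleanup = new IndexWriter(dir, new IndexWriterConfig())) {
        cleanup.commit();
      }
    }
    return repaired;
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = buildDemoIndex()) {
      System.out.println("repaired = " + repairIfBroken(dir));
    }
  }
}
```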
-
testAsserts
private static boolean testAsserts()
-
assertsOn
public static boolean assertsOn()
Check whether asserts are enabled or not.
- Returns:
true iff asserts are enabled
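For readers unfamiliar with how Java code can detect whether assertions are enabled, here is a standalone sketch of one common idiom. It uses only the JDK; the class `AssertProbe` and the method `markAssertsOn` are hypothetical names, and the sketch is not claimed to be the implementation used by CheckIndex.

```java
final class AssertProbe {

  private static boolean assertsOn;

  /** Side effect used by the probe: records that an assert statement was evaluated. */
  private static boolean markAssertsOn() {
    assertsOn = true;
    return true;
  }

  /** Returns true iff assertions are enabled for this class. */
  static boolean assertsOn() {
    // The expression of an assert statement is evaluated only when assertions are
    // enabled (for example with -ea), so the flag is set only in that case.
    assert markAssertsOn();
    return assertsOn;
  }

  public static void main(String[] args) {
    System.out.println("asserts enabled: " + assertsOn());
  }
}
```

Running the class with and without `-ea` shows the two possible results.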
-
main
public static void main(java.lang.String[] args) throws java.io.IOException, java.lang.InterruptedException
Command-line interface to check and exorcise corrupt segments from an index.
Run it like this:
    java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex pathToIndex [-exorcise] [-verbose] [-segment X] [-segment Y]
- -exorcise: actually write a new segments_N file, removing any problematic segments. *LOSES DATA*
- -segment X: only check the specified segment(s). This can be specified multiple times, to check more than one segment, e.g. -segment _2 -segment _a. You can't use this with the -exorcise option.
WARNING: -exorcise should only be used on an emergency basis, as it will cause documents (perhaps many) to be permanently removed from the index. Always make a backup copy of your index before running this! Do not run this tool on an index that is actively being written to. You have been warned!
Run without -exorcise, this tool will open the index, report version information and report any exceptions it hits and what action it would take if -exorcise were specified. With -exorcise, this tool will remove any segments that have issues and write a new segments_N file. This means all documents contained in the affected segments will be removed.
This tool exits with exit code 1 if the index cannot be opened or has any corruption, else 0.
- Throws:
java.io.IOException
java.lang.InterruptedException
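Because the tool reports its result through the process exit code, it can be driven from another JVM. The sketch below builds the command line shown above and interprets the exit code. It uses only the JDK; the class `CheckIndexCli`, its methods, and the `classpath` argument are hypothetical, and the `-cp` entry is an assumption about how Lucene is made available to the child process. The sketch deliberately never passes -exorcise.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class CheckIndexCli {

  /** Builds a read-only CheckIndex command line, optionally limited to some segments. */
  static List<String> buildCommand(String classpath, String indexPath, List<String> segments) {
    List<String> cmd = new ArrayList<>();
    cmd.add("java");
    cmd.add("-cp");
    cmd.add(classpath); // assumption: a classpath that contains Lucene core
    cmd.add("-ea:org.apache.lucene...");
    cmd.add("org.apache.lucene.index.CheckIndex");
    cmd.add(indexPath);
    for (String segment : segments) {
      cmd.add("-segment");
      cmd.add(segment);
    }
    return cmd;
  }

  /** Runs the command; exit code 0 means the index opened and no corruption was found. */
  static boolean runAndReportClean(List<String> cmd) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(cmd).inheritIO().start();
    return process.waitFor() == 0;
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    if (args.length < 2) {
      System.out.println("usage: CheckIndexCli <classpath> <pathToIndex> [segment...]");
      return;
    }
    List<String> segments = new ArrayList<>();
    for (int i = 2; i < args.length; i++) {
      segments.add(args[i]);
    }
    List<String> cmd = buildCommand(args[0], args[1], segments);
    System.out.println("running: " + String.join(" ", cmd));
    System.out.println("clean = " + runAndReportClean(cmd));
  }
}
```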
-
doMain
private static int doMain(java.lang.String[] args) throws java.io.IOException, java.lang.InterruptedException
- Throws:
java.io.IOException
java.lang.InterruptedException
-
parseOptions
public static CheckIndex.Options parseOptions(java.lang.String[] args)
Parse command line args into fields.
- Parameters:
args - The command line arguments
- Returns:
An Options struct
- Throws:
java.lang.IllegalArgumentException - if any of the CLI args are invalid
-
doCheck
public int doCheck(CheckIndex.Options opts) throws java.io.IOException, java.lang.InterruptedException
Actually perform the index check.
- Parameters:
opts - The options to use for this check
- Returns:
0 iff the index is clean, 1 otherwise
- Throws:
java.io.IOException
java.lang.InterruptedException
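A sketch of combining parseOptions and doCheck to run a command-line style check in-process, without the JVM exit that a command-line run implies. It assumes that CheckIndex.Options exposes `getIndexPath()` and `setOut(PrintStream)`; these accessors are not shown on this page, so treat them as assumptions to verify against the Options documentation. The class `OptionsDemo` and its helpers are hypothetical.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

final class OptionsDemo {

  /** Hypothetical helper: writes a one-document index to a temp directory, returns its path. */
  static String buildDemoIndex() throws IOException {
    Path path = Files.createTempDirectory("checkindex-demo");
    try (Directory dir = FSDirectory.open(path);
         IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello index", Field.Store.YES));
      writer.addDocument(doc);
    }
    return path.toString();
  }

  /** Parses CLI-style args and runs the check; returns 0 iff the index is clean. */
  static int check(String[] args, PrintStream out) throws IOException, InterruptedException {
    // parseOptions throws IllegalArgumentException if any argument is invalid.
    CheckIndex.Options opts = CheckIndex.parseOptions(args);
    opts.setOut(out); // assumed accessor: where the report is printed
    try (Directory dir = FSDirectory.open(Paths.get(opts.getIndexPath())); // assumed accessor
         CheckIndex checker = new CheckIndex(dir)) {
      return checker.doCheck(opts);
    }
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    String[] effectiveArgs = args.length > 0 ? args : new String[] {buildDemoIndex()};
    System.out.println("exit code = " + check(effectiveArgs, System.out));
  }
}
```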
-
checkSoftDeletes
private static void checkSoftDeletes(java.lang.String softDeletesField, SegmentCommitInfo info, SegmentReader reader, java.io.PrintStream infoStream, boolean failFast) throws java.io.IOException
- Throws:
java.io.IOException
-
nsToSec
private static double nsToSec(long ns)
-
-