Package org.apache.pdfbox.pdfparser
The pdfparser package contains classes to parse PDF documents and objects within the document.

Class Summary

Class: Description
BaseParser: This class is used to contain parsing logic that will be used by all parsers.
BruteForceParser: Brute force parser to be used as a last resort if a malformed PDF can't be read.
COSParser: COS parser which first reads startxref and the xref tables in order to know the valid objects, and parses only these objects.
EndstreamFilterStream: This class is only for the readUntilEndStream method, to prevent a final CR LF or LF (but not a final CR!) from being written to the output, unless the beginning of the stream is assumed to be ASCII.
FDFParser
PDFObjectStreamParser: This will parse a PDF 1.5 object stream and extract the object with the given object number from the stream.
PDFParser
PDFStreamParser: This will parse a PDF byte stream and extract operands and such.
PDFXRefStream
PDFXrefStreamParser: This will parse a PDF 1.5 (or better) xref stream and extract the xref information from the stream.
PDFXrefStreamParser.ObjectNumbers
XrefTrailerResolver: This class collects all xref/trailer objects and creates correct xref/trailer information after all objects are read, using startxref and 'Prev' information (unused xref/trailer objects are discarded).
XrefTrailerResolver.XrefTrailerObj: A class which represents an xref/trailer object.

Enum Summary

Enum: Description
XrefTrailerResolver.XRefType: The XRefType of a trailer.
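
The COSParser description above says that parsing starts from startxref. As a rough illustration of that first step only, the sketch below scans a byte array backwards for the last "startxref" keyword and reads the decimal offset that follows it. This is not PDFBox code: the class name StartXrefSketch and the method findStartXref are hypothetical, and the sketch assumes the whole file is already in memory and does not guard against very long digit runs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of org.apache.pdfbox.pdfparser.
final class StartXrefSketch {
    private static final byte[] KEYWORD =
            "startxref".getBytes(StandardCharsets.ISO_8859_1);

    /**
     * Returns the number that follows the last "startxref" keyword,
     * or -1 if the keyword is missing or no digits follow it.
     */
    static long findStartXref(byte[] pdf) {
        // Search from the end, because the keyword sits near the end of the file.
        for (int i = pdf.length - KEYWORD.length; i >= 0; i--) {
            if (matchesAt(pdf, i)) {
                return readOffset(pdf, i + KEYWORD.length);
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] data, int pos) {
        for (int k = 0; k < KEYWORD.length; k++) {
            if (data[pos + k] != KEYWORD[k]) {
                return false;
            }
        }
        return true;
    }

    private static long readOffset(byte[] data, int pos) {
        // Skip the whitespace between the keyword and the number.
        while (pos < data.length && isWhitespace(data[pos])) {
            pos++;
        }
        long value = 0;
        int digits = 0;
        while (pos < data.length && data[pos] >= '0' && data[pos] <= '9') {
            value = value * 10 + (data[pos] - '0');
            pos++;
            digits++;
        }
        return digits == 0 ? -1 : value;
    }

    private static boolean isWhitespace(byte b) {
        return b == ' ' || b == '\r' || b == '\n' || b == '\t' || b == '\f' || b == 0;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n1 0 obj\nendobj\nstartxref\n1234\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findStartXref(sample));
    }
}
```

A real parser would then seek to that offset, read the xref table or xref stream found there, and follow any 'Prev' entries, which is the bookkeeping the XrefTrailerResolver description refers to.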