Package org.apache.pdfbox.cos
Class COSDocument
java.lang.Object
  org.apache.pdfbox.cos.COSBase
    org.apache.pdfbox.cos.COSDocument
All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, COSObjectable
public class COSDocument extends COSBase implements java.io.Closeable
This is the in-memory representation of the PDF document. You must call close() on this object when you are done using it.
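Because the class implements java.io.Closeable, the close() requirement above is usually met with try-with-resources. The following is a minimal sketch of that pattern only. ScratchBackedDocument and CloseSketch are hypothetical stand-ins written for this illustration, not PDFBox classes.

```java
import java.io.Closeable;

// Hypothetical stand-in for a document that owns scratch storage; not the PDFBox class.
final class ScratchBackedDocument implements Closeable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // A real implementation would release stream storage and delete temp files here.
        closed = true;
    }
}

final class CloseSketch {
    // Opens the stand-in in try-with-resources and returns it so the caller can inspect it.
    static ScratchBackedDocument useAndClose() {
        ScratchBackedDocument seen;
        try (ScratchBackedDocument doc = new ScratchBackedDocument()) {
            seen = doc;
            // ... work with the document here ...
        }
        return seen; // close() has already run at this point
    }
}
```

The same shape should apply to any Closeable: the resource is closed when the try block exits, whether normally or through an exception.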
-
-
Field Summary
Fields
Modifier and Type | Field | Description
private boolean | closed |
private COSDocumentState | documentState |
private boolean | hasHybridXRef |
private long | highestXRefObjectNumber | Used for incremental saving, to avoid XRef object numbers from being reused.
private boolean | isDecrypted | Signal that document is already decrypted.
private boolean | isXRefStream |
private static org.apache.commons.logging.Log | LOG | Log instance.
private java.util.Map<COSObjectKey,COSObject> | objectPool | Maps ObjectKeys to a COSObject.
private ICOSParser | parser |
private long | startXref |
private RandomAccessStreamCache | streamCache |
private java.util.List<COSStream> | streams | List containing all streams which are created when creating a new pdf.
private COSDictionary | trailer | Document trailer dictionary.
private float | version |
private java.util.Map<COSObjectKey,java.lang.Long> | xrefTable | Maps object and generation id to object byte offsets.
-
Constructor Summary
Constructors
Constructor | Description
COSDocument() | Constructor.
COSDocument(ICOSParser parser) | Constructor.
COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction) | Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction, ICOSParser parser) | Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
-
Method Summary
Modifier and Type | Method | Description
void | accept(ICOSVisitor visitor) | Visitor pattern double dispatch method.
void | addXRefTable(java.util.Map<COSObjectKey,java.lang.Long> xrefTableValues) | Populate XRef HashMap with given values.
void | close() | This will close all storage and delete the tmp files.
COSStream | createCOSStream() | Creates a new COSStream using the current configuration for scratch files.
COSStream | createCOSStream(COSDictionary dictionary, long startPosition, long streamLength) | Creates a new COSStream using the current configuration for scratch files.
COSArray | getDocumentID() | This will get the document ID.
COSDocumentState | getDocumentState() | Returns the COSDocumentState of this COSDocument.
COSDictionary | getEncryptionDictionary() | This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
long | getHighestXRefObjectNumber() | Internal PDFBox use only.
COSDictionary | getLinearizedDictionary() | Get the dictionary containing the linearization information if the pdf is linearized.
COSObject | getObjectFromPool(COSObjectKey key) | This will get an object from the pool.
private java.util.List<COSObject> | getObjectsByType(java.util.List<COSObjectKey> keys, COSName type1, COSName type2) |
java.util.List<COSObject> | getObjectsByType(COSName type) | This will get all dictionary objects by type.
java.util.List<COSObject> | getObjectsByType(COSName type1, COSName type2) | This will get all dictionary objects by type.
long | getStartXref() | Return the startXref position of the parsed document.
private RandomAccessStreamCache | getStreamCache(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction) |
COSDictionary | getTrailer() | This will get the document trailer.
float | getVersion() | This will get the version extracted from the header of this PDF document.
java.util.Map<COSObjectKey,java.lang.Long> | getXrefTable() | Returns the xrefTable which is a mapping of ObjectKeys to byte offsets in the file.
boolean | hasHybridXRef() | Determines if the pdf has hybrid cross references, both plain tables and streams.
boolean | isClosed() | Returns true if this document has been closed.
boolean | isDecrypted() | Indicates if an encrypted pdf is already decrypted after parsing.
boolean | isEncrypted() | This will tell if this is an encrypted document.
boolean | isXRefStream() | Determines if the trailer is an XRef stream or not.
void | setDecrypted() | Signals that the document is decrypted completely.
void | setDocumentID(COSArray id) | This will set the document ID.
void | setEncryptionDictionary(COSDictionary encDictionary) | This will set the encryption dictionary; this should only be called when encrypting the document.
void | setHasHybridXRef() | Marks the pdf as a document using hybrid cross references.
void | setHighestXRefObjectNumber(long highestXRefObjectNumber) | Internal PDFBox use only.
void | setIsXRefStream(boolean isXRefStreamValue) | Sets isXRefStream to the given value.
void | setStartXref(long startXrefValue) | This method sets the startxref value of the document.
void | setTrailer(COSDictionary newTrailer) | This will set the document trailer.
void | setVersion(float versionValue) | This will set the header version of this PDF document.
-
-
-
Field Detail
-
LOG
private static final org.apache.commons.logging.Log LOG
Log instance.
-
version
private float version
-
objectPool
private final java.util.Map<COSObjectKey,COSObject> objectPool
Maps ObjectKeys to a COSObject. Note that references to these objects are also stored in COSDictionary objects that map a name to a specific object.
-
xrefTable
private final java.util.Map<COSObjectKey,java.lang.Long> xrefTable
Maps object and generation id to object byte offsets.
-
streams
private final java.util.List<COSStream> streams
List containing all streams which are created when creating a new pdf.
-
trailer
private COSDictionary trailer
Document trailer dictionary.
-
isDecrypted
private boolean isDecrypted
Signals that the document is already decrypted.
-
startXref
private long startXref
-
closed
private boolean closed
-
isXRefStream
private boolean isXRefStream
-
hasHybridXRef
private boolean hasHybridXRef
-
streamCache
private final RandomAccessStreamCache streamCache
-
highestXRefObjectNumber
private long highestXRefObjectNumber
Used for incremental saving, to prevent XRef object numbers from being reused.
-
parser
private final ICOSParser parser
-
documentState
private final COSDocumentState documentState
-
-
Constructor Detail
-
COSDocument
public COSDocument()
Constructor. Uses main memory to buffer PDF streams.
-
COSDocument
public COSDocument(ICOSParser parser)
Constructor. Uses main memory to buffer PDF streams.
Parameters:
parser - Parser to be used to parse the document on demand
-
COSDocument
public COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
Parameters:
streamCacheCreateFunction - a function to create an instance of a stream cache
-
COSDocument
public COSDocument(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction, ICOSParser parser)
Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
Parameters:
streamCacheCreateFunction - a function to create an instance of a stream cache
parser - Parser to be used to parse the document on demand
-
-
Method Detail
-
getStreamCache
private RandomAccessStreamCache getStreamCache(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
-
createCOSStream
public COSStream createCOSStream()
Creates a new COSStream using the current configuration for scratch files.
Returns:
the new COSStream
-
createCOSStream
public COSStream createCOSStream(COSDictionary dictionary, long startPosition, long streamLength) throws java.io.IOException
Creates a new COSStream using the current configuration for scratch files. Not for public use. Only COSParser should call this method.
Parameters:
dictionary - the corresponding dictionary
startPosition - the start position within the source
streamLength - the stream length
Returns:
the new COSStream
Throws:
java.io.IOException - if the random access view can't be read
-
getLinearizedDictionary
public COSDictionary getLinearizedDictionary()
Get the dictionary containing the linearization information if the pdf is linearized.
Returns:
the dictionary containing the linearization information
-
getObjectsByType
public java.util.List<COSObject> getObjectsByType(COSName type)
This will get all dictionary objects by type.
Parameters:
type - The type of the object.
Returns:
This will return all objects with the specified type.
-
getObjectsByType
public java.util.List<COSObject> getObjectsByType(COSName type1, COSName type2)
This will get all dictionary objects by type.
Parameters:
type1 - The first possible type of the object, mandatory.
type2 - The second possible type of the object, usually an abbreviation, optional.
Returns:
This will return all objects with the specified type(s).
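The two-type lookup described above can be pictured as a filter over dictionaries that accepts either of two type names. The sketch below is an illustration under stated assumptions, not the PDFBox implementation: dictionaries are modeled as Map<String, String>, the type is assumed to live under a "Type" entry, and TypeFilterSketch and byType are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TypeFilterSketch {
    // Returns the dictionaries whose "Type" entry equals type1 or, if given, type2.
    // type2 may be null, which mirrors the "optional" second type in the description.
    static List<Map<String, String>> byType(List<Map<String, String>> dictionaries,
                                            String type1, String type2) {
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> dictionary : dictionaries) {
            String type = dictionary.get("Type");
            if (type != null && (type.equals(type1) || type.equals(type2))) {
                matches.add(dictionary);
            }
        }
        return matches;
    }
}
```

Dictionaries without a "Type" entry are skipped rather than treated as a match.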
-
getObjectsByType
private java.util.List<COSObject> getObjectsByType(java.util.List<COSObjectKey> keys, COSName type1, COSName type2)
-
setVersion
public void setVersion(float versionValue)
This will set the header version of this PDF document.
Parameters:
versionValue - The version of the PDF document.
-
getVersion
public float getVersion()
This will get the version extracted from the header of this PDF document.
Returns:
The header version.
-
setDecrypted
public void setDecrypted()
Signals that the document is decrypted completely.
-
isDecrypted
public boolean isDecrypted()
Indicates if an encrypted pdf is already decrypted after parsing.
Returns:
true indicates that the pdf is decrypted.
-
isEncrypted
public boolean isEncrypted()
This will tell if this is an encrypted document.
Returns:
true if this document is encrypted.
-
getEncryptionDictionary
public COSDictionary getEncryptionDictionary()
This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
Returns:
The encryption dictionary.
-
setEncryptionDictionary
public void setEncryptionDictionary(COSDictionary encDictionary)
This will set the encryption dictionary. This should only be called when encrypting the document.
Parameters:
encDictionary - The encryption dictionary.
-
getDocumentID
public COSArray getDocumentID()
This will get the document ID.
Returns:
The document id.
-
setDocumentID
public void setDocumentID(COSArray id)
This will set the document ID. This should be an array of two strings. This method cannot be used to remove the document id by passing null or an empty array; it will be recreated. Only the first existing string is used when writing; the second one is always recreated. If you don't want this, you'll have to modify the COSWriter class; look for COSName.ID.
Parameters:
id - The document id.
-
getTrailer
public COSDictionary getTrailer()
This will get the document trailer.
Returns:
the document trailer dict
-
setTrailer
public void setTrailer(COSDictionary newTrailer)
This will set the document trailer. Source comment: MIT added; maybe this should not be supported, as the trailer is a persistence construct.
Parameters:
newTrailer - the document trailer dictionary
-
getHighestXRefObjectNumber
public long getHighestXRefObjectNumber()
Internal PDFBox use only. Get the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
Returns:
The object number of the highest XRef stream, or 0 if there was no XRef stream.
-
setHighestXRefObjectNumber
public void setHighestXRefObjectNumber(long highestXRefObjectNumber)
Internal PDFBox use only. Sets the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
Parameters:
highestXRefObjectNumber - The object number of the highest XRef stream.
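To illustrate why this number matters for incremental saving, here is a sketch of how a writer might pick the next free object number. This is an assumption about usage, not PDFBox's actual writer logic, and ObjectNumberSketch and nextFreeObjectNumber are hypothetical names.

```java
import java.util.Collection;

final class ObjectNumberSketch {
    // Picks a number above every object number already in use and above the
    // highest XRef stream object number, so neither can be reused by accident.
    static long nextFreeObjectNumber(Collection<Long> usedObjectNumbers,
                                     long highestXRefObjectNumber) {
        long highest = highestXRefObjectNumber;
        for (long number : usedObjectNumbers) {
            highest = Math.max(highest, number);
        }
        return highest + 1;
    }
}
```

With a value of 0 (no XRef stream), the result depends only on the object numbers already in use.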
-
accept
public void accept(ICOSVisitor visitor) throws java.io.IOException
Visitor pattern double dispatch method.
-
close
public void close() throws java.io.IOException
This will close all storage and delete the tmp files.
Specified by:
close in interface java.lang.AutoCloseable
Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException - if there is an error closing resources.
-
isClosed
public boolean isClosed()
Returns true if this document has been closed.
Returns:
true if the document is already closed, false otherwise
-
getObjectFromPool
public COSObject getObjectFromPool(COSObjectKey key)
This will get an object from the pool.
Parameters:
key - The object key.
Returns:
The object in the pool or a new one if it has not been parsed yet.
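The "existing object or a new one" behaviour described above is a get-or-create lookup. The sketch below shows that general pattern with java.util.Map.computeIfAbsent. It is not the PDFBox implementation: PoolSketch and PooledObject are hypothetical, and a plain String stands in for the object key.

```java
import java.util.HashMap;
import java.util.Map;

final class PoolSketch {
    // Hypothetical placeholder for an indirect object that may not be parsed yet.
    static final class PooledObject {
        final String key;

        PooledObject(String key) {
            this.key = key;
        }
    }

    private final Map<String, PooledObject> pool = new HashMap<>();

    // Returns the pooled object for the key, creating an empty placeholder on first access.
    PooledObject getFromPool(String key) {
        return pool.computeIfAbsent(key, PooledObject::new);
    }
}
```

Repeated lookups with the same key return the same instance, so every reference to an indirect object can share one placeholder.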
-
addXRefTable
public void addXRefTable(java.util.Map<COSObjectKey,java.lang.Long> xrefTableValues)
Populate XRef HashMap with given values. Each entry maps ObjectKeys to byte offsets in the file.
Parameters:
xrefTableValues - xref table entries to be added
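As a rough model of the mapping described above, the sketch below keys byte offsets by object number and generation. XrefSketch, Key, addAll and offsetOf are hypothetical names for this illustration; the real class uses COSObjectKey, whose internals are not shown in this document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class XrefSketch {
    // Hypothetical key: object number plus generation number.
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return number == that.number && generation == that.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    private final Map<Key, Long> xrefTable = new HashMap<>();

    // Merges the given entries into the table; later entries overwrite equal keys.
    void addAll(Map<Key, Long> entries) {
        xrefTable.putAll(entries);
    }

    // Returns the byte offset for the object, or null if the table has no entry.
    Long offsetOf(long number, int generation) {
        return xrefTable.get(new Key(number, generation));
    }
}
```

The key needs value-based equals and hashCode; otherwise a lookup with a freshly built key would never find an entry.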
-
getXrefTable
public java.util.Map<COSObjectKey,java.lang.Long> getXrefTable()
Returns the xrefTable, which is a mapping of ObjectKeys to byte offsets in the file.
Returns:
mapping of ObjectKeys to byte offsets
-
setStartXref
public void setStartXref(long startXrefValue)
This method sets the startxref value of the document. This will only be needed for incremental updates.
Parameters:
startXrefValue - the value for startXref
-
getStartXref
public long getStartXref()
Return the startXref position of the parsed document. This will only be needed for incremental updates.
Returns:
a long with the old position of the startxref
-
isXRefStream
public boolean isXRefStream()
Determines if the trailer is an XRef stream or not.
Returns:
true if the trailer is an XRef stream
-
setIsXRefStream
public void setIsXRefStream(boolean isXRefStreamValue)
Sets isXRefStream to the given value. You need to take care that the version of your PDF is 1.5 or higher.
Parameters:
isXRefStreamValue - the new value for isXRefStream
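The version requirement above can be checked by the caller before the flag is set. The sketch below shows only that guard. XRefStreamGuardSketch and supportsXRefStream are hypothetical helper names, and the float argument is assumed to hold the header version in the same form as the value returned by getVersion().

```java
final class XRefStreamGuardSketch {
    // Cross-reference streams were introduced in PDF 1.5, so older header
    // versions should not be marked as using them.
    static boolean supportsXRefStream(float headerVersion) {
        return headerVersion >= 1.5f;
    }
}
```

A caller could raise the header version first, or skip setting the flag, when this check returns false.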
-
hasHybridXRef
public boolean hasHybridXRef()
Determines if the pdf has hybrid cross references, both plain tables and streams.
Returns:
true if the pdf has hybrid cross references
-
setHasHybridXRef
public void setHasHybridXRef()
Marks the pdf as a document using hybrid cross references.
-
getDocumentState
public COSDocumentState getDocumentState()
Returns the COSDocumentState of this COSDocument.
Returns:
The COSDocumentState of this COSDocument.
See Also:
COSDocumentState
-
-