Class COSDocument

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, COSObjectable

    public class COSDocument
    extends COSBase
    implements java.io.Closeable
    This is the in-memory representation of the PDF document. You must call close() on this object when you are done using it.
    • Field Detail

      • LOG

        private static final org.apache.commons.logging.Log LOG
        Log instance.
      • version

        private float version
      • objectPool

        private final java.util.Map<COSObjectKey,​COSObject> objectPool
        Maps ObjectKeys to a COSObject. Note that references to these objects are also stored in COSDictionary objects that map a name to a specific object.
      • xrefTable

        private final java.util.Map<COSObjectKey,​java.lang.Long> xrefTable
        Maps object and generation id to object byte offsets.
      • streams

        private final java.util.List<COSStream> streams
        List containing all streams which are created when creating a new PDF.
      • trailer

        private COSDictionary trailer
        Document trailer dictionary.
      • isDecrypted

        private boolean isDecrypted
        Signal that document is already decrypted.
      • startXref

        private long startXref
      • closed

        private boolean closed
      • isXRefStream

        private boolean isXRefStream
      • hasHybridXRef

        private boolean hasHybridXRef
      • highestXRefObjectNumber

        private long highestXRefObjectNumber
        Used for incremental saving, to prevent XRef object numbers from being reused.
    • Constructor Detail

      • COSDocument

        public COSDocument()
        Constructor. Uses main memory to buffer PDF streams.
      • COSDocument

        public COSDocument​(ICOSParser parser)
        Constructor. Uses main memory to buffer PDF streams.
        Parameters:
        parser - Parser to be used to parse the document on demand
      • COSDocument

        public COSDocument​(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
        Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
        Parameters:
        streamCacheCreateFunction - a function to create an instance of a stream cache
      • COSDocument

        public COSDocument​(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction,
                           ICOSParser parser)
        Constructor that will use the provided function to create a stream cache for the storage of the PDF streams.
        Parameters:
        streamCacheCreateFunction - a function to create an instance of a stream cache
        parser - Parser to be used to parse the document on demand
    • Method Detail

      • createCOSStream

        public COSStream createCOSStream()
        Creates a new COSStream using the current configuration for scratch files.
        Returns:
        the new COSStream
      • createCOSStream

        public COSStream createCOSStream​(COSDictionary dictionary,
                                         long startPosition,
                                         long streamLength)
                                  throws java.io.IOException
        Creates a new COSStream using the current configuration for scratch files. Not for public use. Only COSParser should call this method.
        Parameters:
        dictionary - the corresponding dictionary
        startPosition - the start position within the source
        streamLength - the stream length
        Returns:
        the new COSStream
        Throws:
        java.io.IOException - if the random access view can't be read
      • getLinearizedDictionary

        public COSDictionary getLinearizedDictionary()
        Get the dictionary containing the linearization information if the PDF is linearized.
        Returns:
        the dictionary containing the linearization information
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType​(COSName type)
        This will get all dictionary objects of the given type.
        Parameters:
        type - The type of the object.
        Returns:
        This will return all objects with the specified type.
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType​(COSName type1,
                                                          COSName type2)
        This will get all dictionary objects matching either of the given types.
        Parameters:
        type1 - The first possible type of the object, mandatory.
        type2 - The second possible type of the object, usually an abbreviation, optional.
        Returns:
        This will return all objects with the specified type(s).
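        The two-type lookup can be illustrated with a minimal sketch. It assumes each dictionary is a plain Map whose "Type" entry holds the type name; TypeFilterSketch and objectsByType are hypothetical stand-ins, not PDFBox classes, and the real method may work differently internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: each "dictionary" is a plain Map and its type is the
// value stored under the "Type" key. This is not the PDFBox implementation.
final class TypeFilterSketch {

    // Returns every dictionary whose "Type" entry equals type1 or, if given, type2.
    static List<Map<String, String>> objectsByType(
            List<Map<String, String>> dictionaries, String type1, String type2) {
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> dictionary : dictionaries) {
            String type = dictionary.get("Type");
            // equals(null) is false, so a null type2 simply never matches.
            if (type != null && (type.equals(type1) || type.equals(type2))) {
                matches.add(dictionary);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Map<String, String>> dictionaries = List.of(
                Map.of("Type", "Font"),
                Map.of("Type", "Page"),
                Map.of("Subtype", "Image"));
        System.out.println(objectsByType(dictionaries, "Font", null).size());
    }
}
```

        Dictionaries without a "Type" entry are skipped, which is why the third map in the example never matches.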
      • setVersion

        public void setVersion​(float versionValue)
        This will set the header version of this PDF document.
        Parameters:
        versionValue - The version of the PDF document.
      • getVersion

        public float getVersion()
        This will get the version extracted from the header of this PDF document.
        Returns:
        The header version.
      • setDecrypted

        public void setDecrypted()
        Signals that the document is decrypted completely.
      • isDecrypted

        public boolean isDecrypted()
        Indicates whether an encrypted PDF has already been decrypted after parsing.
        Returns:
        true if the PDF is decrypted.
      • isEncrypted

        public boolean isEncrypted()
        This will tell if this is an encrypted document.
        Returns:
        true if this document is encrypted.
      • getEncryptionDictionary

        public COSDictionary getEncryptionDictionary()
        This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
        Returns:
        The encryption dictionary.
      • setEncryptionDictionary

        public void setEncryptionDictionary​(COSDictionary encDictionary)
        This will set the encryption dictionary. This should only be called when encrypting the document.
        Parameters:
        encDictionary - The encryption dictionary.
      • getDocumentID

        public COSArray getDocumentID()
        This will get the document ID.
        Returns:
        The document id.
      • setDocumentID

        public void setDocumentID​(COSArray id)
        This will set the document ID. This should be an array of two strings. This method cannot be used to remove the document ID by passing null or an empty array; it will be recreated. Only the first existing string is used when writing; the second one is always recreated. If you don't want this, you'll have to modify the COSWriter class; look for COSName.ID.
        Parameters:
        id - The document id.
      • getTrailer

        public COSDictionary getTrailer()
        This will get the document trailer.
        Returns:
        the document trailer dict
      • setTrailer

        public void setTrailer​(COSDictionary newTrailer)
        This will set the document trailer. Note: this method arguably should not be supported, because the trailer is a persistence construct.
        Parameters:
        newTrailer - the document trailer dictionary
      • getHighestXRefObjectNumber

        public long getHighestXRefObjectNumber()
        Internal PDFBox use only. Get the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
        Returns:
        The object number of the highest XRef stream, or 0 if there was no XRef stream.
      • setHighestXRefObjectNumber

        public void setHighestXRefObjectNumber​(long highestXRefObjectNumber)
        Internal PDFBox use only. Sets the object number of the highest XRef stream. This is needed to avoid reusing such a number in incremental saving.
        Parameters:
        highestXRefObjectNumber - The object number of the highest XRef stream.
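        To show why this number matters for incremental saving, here is a minimal sketch of allocating a fresh object number. It assumes a writer picks a number above both the highest object number in use and the recorded highest XRef stream object number; ObjectNumberSketch and nextFreeObjectNumber are hypothetical names, and the actual allocation logic in PDFBox is not shown in this documentation.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical helper; not part of PDFBox.
final class ObjectNumberSketch {

    // Picks a free object number: one above both the highest number in use
    // and the recorded highest XRef stream object number.
    static long nextFreeObjectNumber(Collection<Long> usedObjectNumbers,
                                     long highestXRefObjectNumber) {
        long highest = highestXRefObjectNumber;
        for (long number : usedObjectNumbers) {
            highest = Math.max(highest, number);
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        // Objects 1 to 3 are in use and an XRef stream occupied object 7.
        System.out.println(nextFreeObjectNumber(List.of(1L, 2L, 3L), 7L));
    }
}
```

        With no XRef stream recorded (value 0), the result depends only on the object numbers already in use.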
      • accept

        public void accept​(ICOSVisitor visitor)
                    throws java.io.IOException
        Visitor pattern double-dispatch method.
        Specified by:
        accept in class COSBase
        Parameters:
        visitor - The object to notify when visiting this object.
        Throws:
        java.io.IOException - If an error occurs while visiting this object.
      • close

        public void close()
                   throws java.io.IOException
        This will close all storage and delete the temporary files.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException - If there is an error closing resources.
      • isClosed

        public boolean isClosed()
        Returns true if this document has been closed.
        Returns:
        true if the document is already closed, false otherwise
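        Because the class implements java.io.Closeable, the usual way to guarantee the close() call is a try-with-resources block. The sketch below uses a hypothetical stand-in class, ClosableDocumentSketch, rather than COSDocument itself, so that it runs without PDFBox on the classpath; the same pattern should apply to any Closeable.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a document that owns temporary storage.
final class ClosableDocumentSketch implements Closeable {

    private boolean closed;

    @Override
    public void close() throws IOException {
        // A real document would release stream storage and temporary files here.
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Uses the stand-in inside try-with-resources and returns it afterwards,
    // so the caller can confirm that close() ran automatically.
    static ClosableDocumentSketch useAndClose() {
        ClosableDocumentSketch document = new ClosableDocumentSketch();
        try (ClosableDocumentSketch resource = document) {
            if (resource.isClosed()) {
                throw new IllegalStateException("document closed too early");
            }
            // Work with the open document here.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return document;
    }

    public static void main(String[] args) {
        System.out.println(useAndClose().isClosed());
    }
}
```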
      • getObjectFromPool

        public COSObject getObjectFromPool​(COSObjectKey key)
        This will get an object from the pool.
        Parameters:
        key - The object key.
        Returns:
        The object in the pool or a new one if it has not been parsed yet.
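        The "existing or new" behaviour described above resembles a get-or-create lookup. The following sketch shows that idea with Map.computeIfAbsent; ObjectPoolSketch, Key and PooledObject are hypothetical stand-ins for the pool, COSObjectKey and COSObject, and the sketch makes no claim about how PDFBox implements the method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-ins; none of these types belong to PDFBox.
final class ObjectPoolSketch {

    // Stand-in key: object number plus generation number.
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return number == that.number && generation == that.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    // Stand-in pooled object that only remembers its key.
    static final class PooledObject {
        final Key key;

        PooledObject(Key key) {
            this.key = key;
        }
    }

    private final Map<Key, PooledObject> pool = new HashMap<>();

    // Returns the pooled object for the key, creating a placeholder on first access.
    PooledObject getObjectFromPool(Key key) {
        return pool.computeIfAbsent(key, PooledObject::new);
    }

    public static void main(String[] args) {
        ObjectPoolSketch sketch = new ObjectPoolSketch();
        PooledObject first = sketch.getObjectFromPool(new Key(12, 0));
        PooledObject second = sketch.getObjectFromPool(new Key(12, 0));
        System.out.println(first == second);
    }
}
```

        Two equal keys yield the same instance, so every reference to an object number and generation shares one placeholder.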
      • addXRefTable

        public void addXRefTable​(java.util.Map<COSObjectKey,​java.lang.Long> xrefTableValues)
        Populate XRef HashMap with given values. Each entry maps ObjectKeys to byte offsets in the file.
        Parameters:
        xrefTableValues - xref table entries to be added
      • getXrefTable

        public java.util.Map<COSObjectKey,​java.lang.Long> getXrefTable()
        Returns the xrefTable which is a mapping of ObjectKeys to byte offsets in the file.
        Returns:
        mapping of ObjectKeys to byte offsets
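        As an illustration of the key-to-offset mapping, the sketch below models an xref table as a Map from a stand-in key to a byte offset. XrefTableSketch and its Key class are hypothetical, and the sketch assumes that entries added later replace earlier entries for the same key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of an xref table; not the PDFBox implementation.
final class XrefTableSketch {

    // Stand-in key: object number plus generation number.
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return number == that.number && generation == that.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    private final Map<Key, Long> xrefTable = new HashMap<>();

    // Merges the given entries; an existing key gets the new offset (assumption).
    void addXRefTable(Map<Key, Long> xrefTableValues) {
        xrefTable.putAll(xrefTableValues);
    }

    Map<Key, Long> getXrefTable() {
        return xrefTable;
    }

    public static void main(String[] args) {
        XrefTableSketch table = new XrefTableSketch();
        table.addXRefTable(Map.of(new Key(1, 0), 15L, new Key(2, 0), 120L));
        // A later section moves object 2 to a new byte offset.
        table.addXRefTable(Map.of(new Key(2, 0), 480L));
        System.out.println(table.getXrefTable().get(new Key(2, 0)));
    }
}
```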
      • setStartXref

        public void setStartXref​(long startXrefValue)
        This method sets the startxref value of the document. It is only needed for incremental updates.
        Parameters:
        startXrefValue - the value for startXref
      • getStartXref

        public long getStartXref()
        Returns the startXref position of the parsed document. It is only needed for incremental updates.
        Returns:
        a long with the old position of the startxref
      • isXRefStream

        public boolean isXRefStream()
        Determines if the trailer is an XRef stream or not.
        Returns:
        true if the trailer is an XRef stream
      • setIsXRefStream

        public void setIsXRefStream​(boolean isXRefStreamValue)
        Sets isXRefStream to the given value. Make sure that the version of your PDF is 1.5 or higher.
        Parameters:
        isXRefStreamValue - the new value for isXRefStream
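        The version requirement exists because cross-reference streams were introduced in PDF 1.5. A caller could guard the setting with a check like the one sketched below; XRefStreamVersionSketch and its methods are hypothetical helpers, not PDFBox API, and raising the version automatically is only one possible policy.

```java
// Hypothetical helper; not part of PDFBox.
final class XRefStreamVersionSketch {

    // Cross-reference streams were introduced in PDF 1.5.
    static final float MIN_XREF_STREAM_VERSION = 1.5f;

    // Reports whether a document with this header version may use an XRef stream.
    static boolean supportsXRefStream(float version) {
        return version >= MIN_XREF_STREAM_VERSION;
    }

    // Returns the version to write: raised to 1.5 if an XRef stream is requested
    // and the current version is lower, otherwise unchanged.
    static float versionFor(float currentVersion, boolean useXRefStream) {
        return useXRefStream
                ? Math.max(currentVersion, MIN_XREF_STREAM_VERSION)
                : currentVersion;
    }

    public static void main(String[] args) {
        System.out.println(supportsXRefStream(1.4f));
        System.out.println(versionFor(1.4f, true));
    }
}
```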
      • hasHybridXRef

        public boolean hasHybridXRef()
        Determines if the PDF has hybrid cross references, that is, both plain tables and streams.
        Returns:
        true if the PDF has hybrid cross references
      • setHasHybridXRef

        public void setHasHybridXRef()
        Marks the PDF as a document using hybrid cross references.