Class PDFStreamParser


  • public class PDFStreamParser
    extends BaseParser
    Parses a PDF byte stream and extracts its tokens, such as operators and their operands.
    • Field Detail

      • LOG

        private static final org.apache.commons.logging.Log LOG
        Log instance.
      • streamObjects

        private final java.util.List<java.lang.Object> streamObjects
      • MAX_BIN_CHAR_TEST_LENGTH

        private static final int MAX_BIN_CHAR_TEST_LENGTH
        See Also:
        Constant Field Values
      • binCharTestArr

        private final byte[] binCharTestArr
    • Constructor Detail

      • PDFStreamParser

        @Deprecated
        public PDFStreamParser(PDStream stream)
                        throws java.io.IOException
        Deprecated.
        Constructor.
        Parameters:
        stream - The stream to parse.
        Throws:
        java.io.IOException - If there is an error initializing the stream.
      • PDFStreamParser

        @Deprecated
        public PDFStreamParser(COSStream stream)
                        throws java.io.IOException
        Deprecated.
        Constructor.
        Parameters:
        stream - The stream to parse.
        Throws:
        java.io.IOException - If there is an error initializing the stream.
      • PDFStreamParser

        public PDFStreamParser(PDContentStream contentStream)
                        throws java.io.IOException
        Constructor.
        Parameters:
        contentStream - The content stream to parse.
        Throws:
        java.io.IOException - If there is an error initializing the stream.
      • PDFStreamParser

        public PDFStreamParser(byte[] bytes)
                        throws java.io.IOException
        Constructor.
        Parameters:
        bytes - the bytes to parse.
        Throws:
        java.io.IOException - If there is an error initializing the stream.
    • Method Detail

      • parse

        public void parse()
                   throws java.io.IOException
        Parses all the tokens in the stream and closes the stream when parsing is finished. The parsed tokens can then be retrieved with getTokens().
        Throws:
        java.io.IOException - If there is an error while parsing the stream.
      • getTokens

        public java.util.List<java.lang.Object> getTokens()
        This will get the tokens that were parsed from the stream by the parse() method.
        Returns:
        All of the tokens in the stream.
      • parseNextToken

        public java.lang.Object parseNextToken()
                                        throws java.io.IOException
        This will parse the next token in the stream.
        Returns:
        The next token in the stream or null if there are no more tokens in the stream.
        Throws:
        java.io.IOException - If an I/O error occurs while parsing the stream.
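The pull-style contract described above (parseNextToken() returns null once the stream is exhausted, and parse() collects every token for getTokens()) can be illustrated with a toy stand-in. The sketch below is not PDFBox code: ToyTokenReader is a hypothetical class that splits on whitespace only, works on an in-memory array, and ignores PDF syntax, I/O errors and stream closing. It also assumes that parse() behaves like calling parseNextToken() until it returns null, which the descriptions suggest but do not state.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy stand-in; NOT the PDFBox implementation.
final class ToyTokenReader {
    private final byte[] data;
    private int pos = 0;
    private final List<Object> tokens = new ArrayList<>();

    ToyTokenReader(byte[] data) {
        this.data = data;
    }

    // Returns the next whitespace-delimited token, or null when no tokens remain.
    Object parseNextToken() {
        // Skip any whitespace in front of the token.
        while (pos < data.length && Character.isWhitespace(data[pos] & 0xFF)) {
            pos++;
        }
        if (pos >= data.length) {
            return null;
        }
        int start = pos;
        while (pos < data.length && !Character.isWhitespace(data[pos] & 0xFF)) {
            pos++;
        }
        return new String(data, start, pos - start, StandardCharsets.US_ASCII);
    }

    // Assumed relationship: parse() pulls tokens until null and stores them.
    void parse() {
        Object token;
        while ((token = parseNextToken()) != null) {
            tokens.add(token);
        }
    }

    // Returns the tokens collected by parse().
    List<Object> getTokens() {
        return tokens;
    }
}
```

A caller that needs every token would call parse() once and then read getTokens(); a caller that wants to stop early could instead loop on parseNextToken() and break when it has seen enough or when null is returned.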
      • hasNoFollowingBinData

        private boolean hasNoFollowingBinData(SequentialSource pdfSource)
                                       throws java.io.IOException
        Looks ahead at a number of bytes and checks that they contain only ASCII characters (no control sequences etc.) and that these characters begin with a sequence of 1-3 non-blank characters between blanks.
        Returns:
        true if next bytes are probably printable ASCII characters starting with a PDF operator, otherwise false
        Throws:
        java.io.IOException
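The look-ahead heuristic described above can be sketched in isolation. The following is a simplified illustration, not the PDFBox implementation: the class and method names are hypothetical, the caller is assumed to have already copied the look-ahead bytes into an array, and "blank" is assumed to mean the PDF white-space characters NUL, tab, LF, FF, CR and space.

```java
// Hypothetical sketch of the heuristic; NOT the PDFBox implementation.
final class BinDataHeuristicSketch {

    // Assumption: "blank" means one of the PDF white-space characters.
    static boolean isBlank(int b) {
        return b == 0x00 || b == 0x09 || b == 0x0A
            || b == 0x0C || b == 0x0D || b == 0x20;
    }

    // Returns true if the look-ahead bytes are all blank or printable ASCII
    // and the first run of non-blank bytes is 1 to 3 bytes long.
    static boolean looksLikeOperatorText(byte[] lookahead) {
        // Any control byte or byte above 0x7E is treated as binary data.
        for (byte raw : lookahead) {
            int b = raw & 0xFF;
            if (!isBlank(b) && (b < 0x21 || b > 0x7E)) {
                return false;
            }
        }
        int i = 0;
        // Skip the leading blanks.
        while (i < lookahead.length && isBlank(lookahead[i] & 0xFF)) {
            i++;
        }
        int start = i;
        // Measure the first run of non-blank bytes.
        while (i < lookahead.length && !isBlank(lookahead[i] & 0xFF)) {
            i++;
        }
        int runLength = i - start;
        return runLength >= 1 && runLength <= 3;
    }
}
```

For example, a window holding " Q\nq 1 0 0" would pass because "Q" is a one-character run, while a window containing a byte such as 0x89 would be rejected as binary. A real parser would also need to push the inspected bytes back to the source, which this sketch leaves out.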
      • readOperator

        protected java.lang.String readOperator()
                                         throws java.io.IOException
        This will read an operator from the stream.
        Returns:
        The operator that was read from the stream.
        Throws:
        java.io.IOException - If there is an error reading from the stream.
      • isSpaceOrReturn

        private boolean isSpaceOrReturn(int c)
      • hasNextSpaceOrReturn

        private boolean hasNextSpaceOrReturn()
                                      throws java.io.IOException
        Checks if the next char is a space or a return.
        Returns:
        true if the next char is a space or a return
        Throws:
        java.io.IOException - if an I/O error occurs while reading the stream
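A peek-style check like the one described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PDFBox code: PeekSketch is a hypothetical class, a java.io.PushbackInputStream stands in for the parser's own source, and "return" is assumed to cover both CR and LF, which the documentation does not spell out.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical sketch; NOT the PDFBox implementation.
final class PeekSketch {

    // Assumption: "space or return" means space, CR or LF.
    static boolean isSpaceOrReturn(int c) {
        return c == ' ' || c == '\r' || c == '\n';
    }

    // Checks the next byte without consuming it.
    static boolean hasNextSpaceOrReturn(PushbackInputStream in) throws IOException {
        int c = in.read();
        if (c != -1) {
            // Push the byte back so the next read sees it again.
            in.unread(c);
        }
        // At end of stream c is -1, which is neither a space nor a return.
        return isSpaceOrReturn(c);
    }
}
```

Because the byte is pushed back, calling the check repeatedly gives the same answer until the caller actually reads from the stream.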