Package org.apache.pdfbox.tools
Class FilteredTextStripper
java.lang.Object
  org.apache.pdfbox.contentstream.PDFStreamEngine
    org.apache.pdfbox.text.LegacyPDFStreamEngine
      org.apache.pdfbox.text.PDFTextStripper
        org.apache.pdfbox.tools.FilteredTextStripper
class FilteredTextStripper extends PDFTextStripper
A PDFTextStripper that processes only glyphs whose angle is 0.
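As a rough illustration of filtering on angle 0, here is a minimal sketch in plain Java. It does not use PDFBox and is not the library's implementation. `GlyphAngleSketch`, `Glyph`, `angleDegrees`, and `keepUpright` are hypothetical names, and deriving the angle from entries b and d of a glyph's matrix [a b c d e f] with `atan2` is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class GlyphAngleSketch {

    /** Hypothetical stand-in for a positioned glyph: its text plus two matrix entries. */
    public static final class Glyph {
        final String unicode;
        final double b; // entry b of the assumed matrix [a b c d e f]
        final double d; // entry d of the assumed matrix [a b c d e f]

        public Glyph(String unicode, double b, double d) {
            this.unicode = unicode;
            this.b = b;
            this.d = d;
        }
    }

    /** Rotation angle in whole degrees, assuming a pure rotation with b = sin and d = cos. */
    public static int angleDegrees(double b, double d) {
        return (int) Math.round(Math.toDegrees(Math.atan2(b, d)));
    }

    /** Keeps only glyphs whose angle rounds to 0 and concatenates their text. */
    public static String keepUpright(List<Glyph> glyphs) {
        StringBuilder kept = new StringBuilder();
        for (Glyph g : glyphs) {
            if (angleDegrees(g.b, g.d) == 0) {
                kept.append(g.unicode);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("A", 0.0, 1.0)); // upright
        glyphs.add(new Glyph("B", 1.0, 0.0)); // rotated by 90 degrees
        glyphs.add(new Glyph("C", 0.0, 1.0)); // upright
        System.out.println(keepUpright(glyphs)); // prints "AC"
    }
}
```

In a real subclass, the same check would presumably sit inside an overridden `processTextPosition(TextPosition text)` that calls the superclass method only when the angle is 0, so that rotated glyphs never reach the list of characters on the page.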
Field Summary
Fields inherited from class org.apache.pdfbox.text.PDFTextStripper
charactersByArticle, document, LINE_SEPARATOR, output
Constructor Summary
Constructors
FilteredTextStripper()
Method Summary
Modifier and Type: protected void
Method: processTextPosition(TextPosition text)
Description: This will process a TextPosition object and add the text to the list of characters on a page.
Methods inherited from class org.apache.pdfbox.text.PDFTextStripper
endArticle, endDocument, endPage, getAddMoreFormatting, getArticleEnd, getArticleStart, getAverageCharTolerance, getCharactersByArticle, getCurrentPageNo, getDropThreshold, getEndBookmark, getEndPage, getIndentThreshold, getLineSeparator, getListItemPatterns, getOutput, getPageEnd, getPageStart, getParagraphEnd, getParagraphStart, getSeparateByBeads, getSortByPosition, getSpacingTolerance, getStartBookmark, getStartPage, getSuppressDuplicateOverlappingText, getText, getWordSeparator, matchPattern, processPage, processPages, setAddMoreFormatting, setArticleEnd, setArticleStart, setAverageCharTolerance, setDropThreshold, setEndBookmark, setEndPage, setIndentThreshold, setLineSeparator, setListItemPatterns, setPageEnd, setPageStart, setParagraphEnd, setParagraphStart, setShouldSeparateByBeads, setSortByPosition, setSpacingTolerance, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, startArticle, startArticle, startDocument, startPage, writeCharacters, writeLineSeparator, writePage, writePageEnd, writePageStart, writeParagraphEnd, writeParagraphSeparator, writeParagraphStart, writeString, writeString, writeText, writeWordSeparator
Methods inherited from class org.apache.pdfbox.text.LegacyPDFStreamEngine
computeFontHeight, showGlyph
Methods inherited from class org.apache.pdfbox.contentstream.PDFStreamEngine
addOperator, applyTextAdjustment, beginMarkedContentSequence, beginText, decreaseLevel, endMarkedContentSequence, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getLevel, getResources, getTextLineMatrix, getTextMatrix, increaseLevel, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showFontGlyph, showForm, showGlyph, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator
Method Detail
processTextPosition
protected void processTextPosition(TextPosition text)
Description copied from class: PDFTextStripper
This will process a TextPosition object and add the text to the list of characters on a page. It takes care of overlapping text.
Overrides:
processTextPosition in class PDFTextStripper
Parameters:
text - The text to process.