Package org.apache.pdfbox.text
Class LegacyPDFStreamEngine
- java.lang.Object
  - org.apache.pdfbox.contentstream.PDFStreamEngine
    - org.apache.pdfbox.text.LegacyPDFStreamEngine
- Direct Known Subclasses: PDFMarkedContentExtractor, PDFTextStripper
class LegacyPDFStreamEngine extends PDFStreamEngine
LEGACY text calculations which are known to be incorrect but are depended on by PDFTextStripper. This class exists only so that we don't break the code of users who have their own subclasses of PDFTextStripper. It replaces the mostly empty implementation of showGlyph() in PDFStreamEngine with a heuristic implementation which is backwards compatible. DO NOT USE THIS CODE UNLESS YOU ARE WORKING WITH PDFTextStripper. THIS CODE IS DELIBERATELY INCORRECT, USE PDFStreamEngine INSTEAD.
-
Field Summary
Fields (modifier and type, followed by the field name):
- private java.util.Map<COSDictionary,java.lang.Float> fontHeightMap
- private GlyphList glyphList
- private static org.apache.commons.logging.Log LOG
- private int pageRotation
- private PDRectangle pageSize
- private Matrix translateMatrix
-
Constructor Summary
Constructors (constructor, followed by its description):
- LegacyPDFStreamEngine(): Constructor.
-
Method Summary
Methods (modifier, type, and signature, followed by the description):
- protected float computeFontHeight(PDFont font): Compute the font height.
- void processPage(PDPage page): This will initialize and process the contents of the stream.
- protected void processTextPosition(TextPosition text): A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
- protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, java.lang.String unicode, Vector displacement): Called when a glyph is to be processed.
-
Methods inherited from class org.apache.pdfbox.contentstream.PDFStreamEngine
addOperator, applyTextAdjustment, beginMarkedContentSequence, beginText, decreaseLevel, endMarkedContentSequence, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getLevel, getResources, getTextLineMatrix, getTextMatrix, increaseLevel, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showFontGlyph, showForm, showGlyph, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator
-
Field Detail
-
LOG
private static final org.apache.commons.logging.Log LOG
-
pageRotation
private int pageRotation
-
pageSize
private PDRectangle pageSize
-
translateMatrix
private Matrix translateMatrix
-
glyphList
private final GlyphList glyphList
-
fontHeightMap
private final java.util.Map<COSDictionary,java.lang.Float> fontHeightMap
-
Method Detail
-
processPage
public void processPage(PDPage page) throws java.io.IOException
This will initialize and process the contents of the stream.
- Overrides: processPage in class PDFStreamEngine
- Parameters:
  page - the page to process
- Throws:
  java.io.IOException - if there is an error accessing the stream.
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, java.lang.String unicode, Vector displacement) throws java.io.IOException
Called when a glyph is to be processed. The heuristic calculations here were originally written by Ben Litchfield for PDFStreamEngine.
- Overrides: showGlyph in class PDFStreamEngine
- Parameters:
  textRenderingMatrix - the current text rendering matrix, Trm
  font - the current font
  code - internal PDF character code for the glyph
  unicode - the Unicode text for this glyph, or null if the PDF does not provide it
  displacement - the displacement (i.e. advance) of the glyph in text space
- Throws:
  java.io.IOException - if the glyph cannot be processed
-
computeFontHeight
protected float computeFontHeight(PDFont font) throws java.io.IOException
Compute the font height. Override this if you want to use your own calculations.
- Parameters:
  font - the font.
- Returns:
  the font height.
- Throws:
  java.io.IOException - if there is an error while getting the font bounding box.
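The override point described for computeFontHeight can be illustrated with a small JDK-only sketch. Everything in it is a hypothetical stand-in, not PDFBox code: SketchFont replaces PDFont, HeightEngine replaces the engine, and the two formulas are placeholders rather than the library's calculation. The per-font cache is an assumption suggested by the fontHeightMap field listed above; the page itself does not say how that field is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a font: an identifier plus a bounding-box height
// in glyph-space units. This is not PDFBox's PDFont.
final class SketchFont {
    final String id;
    final float bboxHeight;

    SketchFont(String id, float bboxHeight) {
        this.id = id;
        this.bboxHeight = bboxHeight;
    }
}

// Hypothetical engine that caches one computed height per font (assumed design).
class HeightEngine {
    private final Map<String, Float> heightCache = new HashMap<>();
    private int computeCalls = 0;

    // Returns the cached height, calling the overridable hook on first use only.
    final float fontHeight(SketchFont font) {
        Float cached = heightCache.get(font.id);
        if (cached == null) {
            computeCalls++;
            cached = computeFontHeight(font);
            heightCache.put(font.id, cached);
        }
        return cached;
    }

    // Overridable calculation; this default formula is an illustrative placeholder.
    protected float computeFontHeight(SketchFont font) {
        return font.bboxHeight / 1000f;
    }

    // Number of times the hook was actually invoked.
    int getComputeCalls() {
        return computeCalls;
    }
}

// Subclass that substitutes its own (equally illustrative) calculation.
class HalfHeightEngine extends HeightEngine {
    @Override
    protected float computeFontHeight(SketchFont font) {
        return font.bboxHeight / 2000f;
    }
}

final class FontHeightSketch {
    public static void main(String[] args) {
        HeightEngine engine = new HalfHeightEngine();
        SketchFont font = new SketchFont("F1", 1000f);
        System.out.println(engine.fontHeight(font));   // prints 0.5
        System.out.println(engine.fontHeight(font));   // prints 0.5 (served from the cache)
        System.out.println(engine.getComputeCalls());  // prints 1
    }
}
```

With PDFBox itself, the analogous step would presumably be to override computeFontHeight(PDFont) in a PDFTextStripper subclass, since that is the class this page says the engine exists to support.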
-
processTextPosition
protected void processTextPosition(TextPosition text)
A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
- Parameters:
  text - The text to be processed.
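The "event interface" idea behind processTextPosition can be sketched with JDK-only stand-ins. The types below are hypothetical and deliberately simplified: SketchTextPosition is not PDFBox's TextPosition, and SketchEngine is not the real engine. The do-nothing base hook is an assumption about the pattern, not a statement about the library's implementation.

```java
// Hypothetical stand-in for TextPosition: the glyph's text and a position.
final class SketchTextPosition {
    final String unicode;
    final float x;
    final float y;

    SketchTextPosition(String unicode, float x, float y) {
        this.unicode = unicode;
        this.x = x;
        this.y = y;
    }
}

// Hypothetical engine: does the per-glyph work, then fires the hook.
class SketchEngine {
    // Stand-in for showGlyph: wraps the glyph data and hands it to the hook.
    final void showGlyph(String unicode, float x, float y) {
        processTextPosition(new SketchTextPosition(unicode, x, y));
    }

    // Event hook for subclasses; the base version here does nothing (assumed).
    protected void processTextPosition(SketchTextPosition text) {
    }
}

// Subclass that reacts to each event by collecting the glyph text in order.
class CollectingEngine extends SketchEngine {
    private final StringBuilder collected = new StringBuilder();

    @Override
    protected void processTextPosition(SketchTextPosition text) {
        collected.append(text.unicode);
    }

    String getCollected() {
        return collected.toString();
    }
}

final class TextHookSketch {
    public static void main(String[] args) {
        CollectingEngine engine = new CollectingEngine();
        engine.showGlyph("H", 10f, 700f);
        engine.showGlyph("i", 16f, 700f);
        System.out.println(engine.getCollected()); // prints Hi
    }
}
```

The design point is that the engine owns the glyph calculations while subclasses only decide what to do with each reported position, which matches the division of labor this page describes between the engine and PDFTextStripper.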