
PdfReader

PdfReader parses existing PDF files into the phpdftk object model. It handles classic xref tables, cross-reference streams, object streams, incremental updates, and encrypted PDFs.

use Phpdftk\Pdf\Reader\PdfReader;

// From file
$pdf = PdfReader::fromFile('document.pdf');
// From string (e.g., HTTP response body)
$pdf = PdfReader::fromString($bytes);
// From stream resource
$pdf = PdfReader::fromStream(fopen('php://stdin', 'rb'));
// Encrypted PDF
$pdf = PdfReader::fromFile('secured.pdf', password: 'secret');
// Public-key encrypted PDF
$pdf = PdfReader::fromFilePublicKey('secured.pdf', $certPem, $keyPem);

echo $pdf->getVersion(); // "1.7"
echo $pdf->getPageCount(); // 42
// Linearization detection
if ($pdf->isLinearized()) {
    $params = $pdf->getLinearizationParameters();
    echo "Web-optimized, {$params['pageCount']} pages";
}

// All pages as raw dictionaries
$pages = $pdf->getPages();
// Single page by 0-based index
$page = $pdf->getPage(0);
// Typed Page objects (hydrated into Core\Document\Page)
$typedPages = $pdf->getTypedPages();
$typedPage = $pdf->getTypedPage(0);

$catalog = $pdf->getCatalog(); // raw PdfDictionary
$typed = $pdf->getTypedCatalog(); // hydrated Core\Document\Catalog
$trailer = $pdf->getTrailer();
$info = $pdf->getInfo();

// By object number
$obj = $pdf->getObject(42);
// By reference
$target = $pdf->resolveReference($ref);
// Typed hydration of any object
$typed = $pdf->getTypedObject(42);

// Single page (0-based index)
$text = $pdf->extractText(0);
// All pages concatenated
$allText = $pdf->extractAllText("\n\n");
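
The page count and per-page extraction calls above combine naturally into a loop. The following is a minimal sketch, not part of phpdftk: `extractTextByPage` is a hypothetical helper, and it assumes that `getPageCount()` returns an integer and that `extractText()` returns a string for each 0-based index, as the snippets above suggest.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of phpdftk): collects the text of every
 * page into an array keyed by 0-based page index.
 *
 * Assumes $pdf exposes getPageCount(): int and extractText(int): string.
 */
function extractTextByPage(object $pdf): array
{
    $textByPage = [];
    $pageCount = $pdf->getPageCount();

    for ($index = 0; $index < $pageCount; $index++) {
        $textByPage[$index] = $pdf->extractText($index);
    }

    return $textByPage;
}
```

With a real document this would be called as `extractTextByPage(PdfReader::fromFile('document.pdf'))`. Keeping pages separate, rather than calling `extractAllText()`, makes it easy to report which page a match came from.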

In lenient mode, the reader recovers from common PDF issues:

$pdf = PdfReader::fromFile('damaged.pdf', strict: false);
// Check what was wrong
foreach ($pdf->getParseWarnings() as $warning) {
    echo "Warning: $warning\n";
}

Recoverable issues include displaced headers, malformed xref tables, and missing trailers (reconstructed via object scanning).
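
One way to use lenient mode is as a fallback: try a normal open first and retry leniently only if that fails. The sketch below is an illustration under assumptions, not documented phpdftk behavior. It assumes that opening without `strict: false` parses strictly and signals failure by throwing, which the text above does not state. `openWithFallback` is a hypothetical helper that takes two callables so the pattern stays independent of the reader itself.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of phpdftk): runs the strict opener and,
 * if it throws anything, runs the lenient opener instead.
 *
 * Returns [result, usedLenient] so the caller knows whether to inspect
 * parse warnings.
 */
function openWithFallback(callable $strictOpen, callable $lenientOpen): array
{
    try {
        return [$strictOpen(), false];
    } catch (\Throwable $error) {
        // Assumption: a strict parse failure surfaces as an exception.
        return [$lenientOpen(), true];
    }
}

// Intended usage with the reader (requires phpdftk, so left as a comment):
// [$pdf, $usedLenient] = openWithFallback(
//     fn () => PdfReader::fromFile('damaged.pdf'),
//     fn () => PdfReader::fromFile('damaged.pdf', strict: false),
// );
```

When the second element is `true`, the document was recovered, and `getParseWarnings()` is the place to look for what was repaired.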

The reader automatically handles all standard encryption methods:

| Method | Version |
| --- | --- |
| RC4 40-bit | V=1 R=2 |
| RC4 128-bit | V=2 R=3 |
| AES-128 | V=4 R=4 |
| AES-256 | V=5 R=6 |
| Public-key (Adobe.PubSec) | AES-128/256 |
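
For logging or diagnostics, the V and R pairs listed above can be turned into a readable label. This is a small sketch that only mirrors the table. `describeEncryption` is a hypothetical helper, not a phpdftk function, and it does not inspect crypt filters or cover the public-key handler.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of phpdftk): maps the V and R values of a
 * standard encryption dictionary to the method names in the table above.
 */
function describeEncryption(int $v, int $r): string
{
    // match compares with ===, so each [V, R] pair must match exactly.
    return match ([$v, $r]) {
        [1, 2] => 'RC4 40-bit',
        [2, 3] => 'RC4 128-bit',
        [4, 4] => 'AES-128',
        [5, 6] => 'AES-256',
        default => 'unknown',
    };
}
```

Any pair not listed in the table falls through to `'unknown'` rather than guessing a method.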