PdfDocEncodingTable
in package
Final: Yes
PDFDocEncoding: the encoding used for PDF text strings (document information dictionary, bookmarks, annotations) that do not start with the UTF-16BE byte order mark (U+FEFF, bytes 0xFE 0xFF).
Per PDF spec ISO 32000-2:2020, Table D.2.
Maps byte values 0-255 directly to Unicode code points (not glyph names).
Table of Contents
Methods
- decode() : string
- Decode a PDFDocEncoding byte string to a UTF-8 string.
- decodeTextString() : string
- Decode a PDF text string — auto-detects UTF-16BE (BOM) vs PDFDocEncoding.
- getTable() : array<int, int|null>
- Get the table mapping byte values (0-255) to Unicode code points.
Methods
decode()
Decode a PDFDocEncoding byte string to a UTF-8 string.
public
static decode(string $bytes) : string
PDF text strings use either PDFDocEncoding (single-byte) or UTF-16BE (indicated by a BOM prefix 0xFE 0xFF). This method handles only the PDFDocEncoding case.
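The single-byte lookup described above can be sketched as follows. This is a minimal illustration and not the class's actual implementation: the names sketchPdfDocTable() and sketchPdfDocDecode() are hypothetical, the table is deliberately partial (only three of the 0x80-0xA0 entries are filled in, and control characters are left out), and substituting U+FFFD for undefined bytes is an assumption rather than documented behaviour. It relies on the mbstring extension for mb_chr().

```php
<?php
// Hypothetical, partial stand-in for the PDFDocEncoding table.
// Index = byte value, value = Unicode code point, null = undefined.
function sketchPdfDocTable(): array
{
    $table = array_fill(0, 256, null);

    // Printable ASCII maps to itself.
    for ($b = 0x20; $b <= 0x7E; $b++) {
        $table[$b] = $b;
    }
    // 0xA1-0xFF match Latin-1, except 0xAD, which is undefined.
    for ($b = 0xA1; $b <= 0xFF; $b++) {
        $table[$b] = $b;
    }
    $table[0xAD] = null;

    // A few entries where PDFDocEncoding differs from Latin-1.
    $table[0x80] = 0x2022; // bullet
    $table[0x84] = 0x2014; // em dash
    $table[0xA0] = 0x20AC; // euro sign

    return $table;
}

// Decode byte by byte: look up each byte's code point and emit it as UTF-8.
// Assumption: bytes without a mapping become U+FFFD (replacement character).
function sketchPdfDocDecode(string $bytes): string
{
    $table = sketchPdfDocTable();
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $codePoint = $table[ord($bytes[$i])];
        $out .= $codePoint === null ? "\u{FFFD}" : mb_chr($codePoint, 'UTF-8');
    }
    return $out;
}

echo sketchPdfDocDecode("Caf\xE9 \x80 \xA0" . "5"), "\n"; // prints "Café • €5"
```

The example shows why a plain Latin-1 conversion is not enough: bytes such as 0x80 and 0xA0 carry different characters in PDFDocEncoding.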
Parameters
- $bytes : string
Return values
string
decodeTextString()
Decode a PDF text string — auto-detects UTF-16BE (BOM) vs PDFDocEncoding.
public
static decodeTextString(string $bytes) : string
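The auto-detection can be pictured as a check on the first two bytes. The sketch below is an assumption about the general approach, not the method's actual code: sketchDecodeTextString() is a hypothetical name, and the single-byte fallback is passed in as a callable so the example stays self-contained. The fallback used in the demo is a Latin-1 conversion that only stands in for a real PDFDocEncoding decoder. It relies on the mbstring extension for mb_convert_encoding().

```php
<?php
// Hypothetical sketch: choose a decoder based on the UTF-16BE byte order mark.
function sketchDecodeTextString(string $bytes, callable $singleByteDecoder): string
{
    // A text string starting with 0xFE 0xFF is UTF-16BE.
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        // Drop the two BOM bytes, then convert the rest to UTF-8.
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16BE');
    }

    // Otherwise treat the bytes as a single-byte encoded string.
    return $singleByteDecoder($bytes);
}

// Stand-in fallback for the demo only: Latin-1 is NOT PDFDocEncoding.
$fallback = fn (string $s): string => mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');

echo sketchDecodeTextString("\xFE\xFF\x00H\x00i", $fallback), "\n"; // prints "Hi"
echo sketchDecodeTextString("Hi", $fallback), "\n";                 // prints "Hi"
```

In real use the fallback would presumably be the class's own decode() method.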
Parameters
- $bytes : string
Return values
string
getTable()
public
static getTable() : array<int, int|null>
Return values
array<int, int|null>: maps each byte value (0-255) to its Unicode code point; null means the byte is undefined.
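A table of this shape can be used to find undefined bytes or to build a reverse lookup for encoding. The sketch below assumes nothing beyond the documented array<int, int|null> shape. The $sample array is a hypothetical four-entry excerpt, not the real 256-entry table, and sketchReverseTable() is an illustrative helper that is not part of the class.

```php
<?php
// Build a code point => byte lookup from a byte => code point table.
// Undefined (null) entries are skipped; the first byte wins on duplicates.
function sketchReverseTable(array $table): array
{
    $reverse = [];
    foreach ($table as $byte => $codePoint) {
        if ($codePoint !== null && !isset($reverse[$codePoint])) {
            $reverse[$codePoint] = $byte;
        }
    }
    return $reverse;
}

// Hypothetical excerpt in the same shape as the documented return value.
$sample = [
    0x41 => 0x0041, // 'A'
    0x80 => 0x2022, // bullet
    0x9F => null,   // undefined
    0xA0 => 0x20AC, // euro sign
];

$reverse   = sketchReverseTable($sample);
$undefined = array_keys($sample, null, true); // bytes with no mapping

printf("U+2022 is byte 0x%02X\n", $reverse[0x2022]); // prints "U+2022 is byte 0x80"
printf("undefined bytes: %d\n", count($undefined));  // prints "undefined bytes: 1"
```

With the real table, the same loop would be fed the result of getTable() in place of $sample.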