phpdftk API Documentation

PdfDocEncodingTable
in package

Final: Yes

PDFDocEncoding is the encoding used for PDF text strings (Info dictionary, bookmarks, annotations) that do not start with the UTF-16BE byte order mark (U+FEFF, bytes 0xFE 0xFF).

Per PDF spec ISO 32000-2:2020, Table D.2.

Maps byte values 0-255 directly to Unicode code points (not glyph names).

Table of Contents

Methods

decode()  : string
Decode a PDFDocEncoding byte string to a UTF-8 string.
decodeTextString()  : string
Decode a PDF text string, auto-detecting UTF-16BE (BOM) versus PDFDocEncoding.
getTable()  : array<int, int|null>

Methods

decode()

Decode a PDFDocEncoding byte string to a UTF-8 string.

public static decode(string $bytes) : string

PDF text strings use either PDFDocEncoding (single-byte) or UTF-16BE (indicated by a BOM prefix 0xFE 0xFF). This method handles only the PDFDocEncoding case.

Parameters
$bytes : string
Return values
string
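
A single-byte decode of this kind amounts to one table lookup per byte. The following is a minimal sketch of that idea, not the library's implementation: sketchDecode() and the partial $table are hypothetical, only a few well-known PDFDocEncoding entries are filled in, and replacing undefined bytes with U+FFFD is an assumption (the behaviour of decode() for undefined bytes is not documented here). It requires the mbstring extension.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decode single-byte input through a
 * byte => Unicode code point table and return UTF-8.
 *
 * @param array<int, int|null> $table
 */
function sketchDecode(string $bytes, array $table): string
{
    $out = '';
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $codePoint = $table[ord($bytes[$i])] ?? null;
        // Assumption: undefined or missing bytes become U+FFFD.
        $out .= mb_chr($codePoint ?? 0xFFFD, 'UTF-8');
    }
    return $out;
}

// Partial, illustrative table: printable ASCII maps to itself.
$table = [];
for ($b = 0x20; $b <= 0x7E; $b++) {
    $table[$b] = $b;
}
$table[0x7F] = null;   // undefined in PDFDocEncoding
$table[0x80] = 0x2022; // BULLET
$table[0xA0] = 0x20AC; // EURO SIGN

$decoded = sketchDecode("\x80 5\xA0", $table);
```

With the library itself, the equivalent call would be PdfDocEncodingTable::decode($bytes); the namespace to import is not shown on this page.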

decodeTextString()

Decode a PDF text string, auto-detecting UTF-16BE (BOM) versus PDFDocEncoding.

public static decodeTextString(string $bytes) : string
Parameters
$bytes : string
Return values
string
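
The auto-detection described above can be sketched as a check for the two BOM bytes followed by a dispatch. This is an illustration under stated assumptions, not the library's code: sketchDecodeTextString() is a hypothetical name, the single-byte path is passed in as a callable, and the $asciiOnly stand-in is only valid for printable ASCII input. It requires the mbstring extension.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick a decoder based on the UTF-16BE BOM.
 *
 * @param callable(string): string $decodeSingleByte fallback for non-BOM input
 */
function sketchDecodeTextString(string $bytes, callable $decodeSingleByte): string
{
    // A leading 0xFE 0xFF marks the string as UTF-16BE.
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        // Strip the BOM, then convert the remainder to UTF-8.
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16BE');
    }
    // Otherwise treat the input as single-byte encoded.
    return $decodeSingleByte($bytes);
}

// Stand-in for a real PDFDocEncoding decoder (assumption: ASCII-only input).
$asciiOnly = static fn (string $b): string => $b;

$fromUtf16 = sketchDecodeTextString("\xFE\xFF\x00H\x00i", $asciiOnly);
$fromPlain = sketchDecodeTextString("Hi", $asciiOnly);
```

Both calls yield the same UTF-8 string, which is the point of the combined entry point: callers do not need to inspect the raw bytes themselves.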

getTable()

public static getTable() : array<int, int|null>
Return values
array<int, int|null>

Maps each byte value (0-255) to its Unicode code point; null marks a byte that is undefined in PDFDocEncoding.
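
To show how an array of this shape can be consumed, the sketch below lists the undefined bytes and builds a reverse (code point to byte) lookup. The $table literal is a hypothetical four-entry fragment used so the example runs on its own; with the library, the full table would come from PdfDocEncodingTable::getTable().

```php
<?php
declare(strict_types=1);

// Hypothetical fragment in the documented shape: byte => code point or null.
$table = [
    0x41 => 0x41,   // "A"
    0x7F => null,   // undefined
    0x80 => 0x2022, // BULLET
    0xA0 => 0x20AC, // EURO SIGN
];

// Bytes with no mapping (strict comparison against null).
$undefinedBytes = array_keys($table, null, true);

// Drop null entries first, since array_flip() only accepts int or string values.
$defined = array_filter($table, static fn (?int $cp): bool => $cp !== null);

// Reverse lookup: Unicode code point => byte value.
$byteFor = array_flip($defined);
```

Here $byteFor[0x20AC] gives 0xA0, the byte that stands for the euro sign in this fragment.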


        