ABCpdf .NET is world class when it comes to fonts and text. This
backgrounder explains what goes on behind the scenes.
ABCpdf has a Deep, Native Understanding of PDF Font Structures
Unlike libraries that simply "draw text", ABCpdf .NET has an
intricate understanding of how fonts are embedded and referenced
within a PDF file.
-
Font Embedding: ABCpdf .NET automatically
and correctly handles font embedding, which is crucial for document
portability. It subsets fonts (including only the glyphs actually
used) to minimize file size.
-
Font Programs: It can parse and work with
different font types (TrueType, OpenType, Type 1, Type 3, CFF, CID
fonts) at a low level, understanding their encoding maps (CMaps),
glyph widths, and metrics. ABCpdf does not rely on system functions,
as these are often imprecise and incomplete. Instead, it determines
metrics and finds features by directly reading and parsing the
original font files.
-
Unicode Compliance: From its inception,
ABCpdf .NET was designed with Unicode in mind. It maps each
character in your input string to the correct glyph index in the
font, so that characters like é, ß,
or ち appear as intended.
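The character-to-glyph mapping and subsetting described above can be illustrated with a small sketch. This is not ABCpdf's API or its implementation: the `CMAP` table, its glyph indices, and the helpers `glyph_ids` and `glyphs_for_subset` are invented for illustration, standing in for the character map that a real font file would supply.

```python
# Hypothetical character map: Unicode code point -> glyph index.
# A real font's character map plays this role; these indices are invented.
CMAP = {
    ord("A"): 36,
    ord("V"): 57,
    ord("é"): 112,
    ord("ß"): 150,
    ord("ち"): 4021,
}

NOTDEF = 0  # glyph index 0 is conventionally the "missing glyph"


def glyph_ids(text, cmap=CMAP):
    """Map each character of text to a glyph index (0 if unmapped)."""
    return [cmap.get(ord(ch), NOTDEF) for ch in text]


def glyphs_for_subset(texts, cmap=CMAP):
    """Collect the sorted glyph indices actually used by the given strings."""
    used = {NOTDEF}  # a subset keeps the missing-glyph entry
    for text in texts:
        used.update(glyph_ids(text, cmap))
    return sorted(used)
```

Only the glyphs returned by `glyphs_for_subset` would need to be carried in an embedded subset, which is why subsetting keeps files small when a document uses a few characters from a large font.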
Advanced Text Layout and Rendering Engine
The core layout mechanisms are built for precision.
-
Glyph Positioning: It does not just place
characters; it places glyphs with exact control
over their position, including kerning (adjusting the space between
specific character pairs, like "AV") and ligatures (combining
characters like "fi" into a single glyph).
-
Text State Control: It provides meticulous
control over the PDF text state parameters: character spacing, word
spacing, horizontal and vertical scaling, and text rise (for
superscript/subscript). This allows for professional-grade
micro-typography.
-
Correct Spacing and Line Breaks: The
layout algorithms calculate string widths and line breaks based on
the actual embedded font metrics, not on guesses from the system
font. This guarantees that what you design is what you get, on
every device.
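As a rough illustration of measuring text from font metrics rather than guessing, the sketch below computes a string width from per-glyph advances, pair kerning, and the character and word spacing parameters, then uses that width for greedy line breaking. The metric tables and the function names `text_width` and `break_lines` are assumptions made for this example, with invented values in units of 1/1000 em; they do not describe how ABCpdf itself lays out text.

```python
# Hypothetical metrics in font units (1000 units per em); values are invented.
ADVANCE = {"A": 667, "V": 667, "T": 611, "o": 556, " ": 278}
KERN = {("A", "V"): -74, ("V", "A"): -74, ("T", "o"): -111}
DEFAULT_ADVANCE = 500


def text_width(text, font_size, char_spacing=0.0, word_spacing=0.0):
    """Width of text in points from advances, pair kerning and spacing."""
    units = 0
    for i, ch in enumerate(text):
        units += ADVANCE.get(ch, DEFAULT_ADVANCE)
        if i + 1 < len(text):
            # Kerning adjusts the gap between this glyph and the next one.
            units += KERN.get((ch, text[i + 1]), 0)
    width = units * font_size / 1000.0
    width += char_spacing * len(text)        # applied after every glyph
    width += word_spacing * text.count(" ")  # applied to each space
    return width


def break_lines(text, font_size, max_width):
    """Greedy line breaking at spaces, measured with text_width."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and text_width(candidate, font_size) > max_width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines
```

With these made-up metrics, the kerned pair "AV" measures narrower than "AA" at the same size, so a break decision near the margin can differ depending on whether kerning is taken into account.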
Robust Support for Complex Scripts
This is a major differentiator. Many simpler PDF libraries fail
miserably with non-Western languages.
-
Right-to-Left (RTL) Languages: ABCpdf .NET
has robust support for bi-directional languages like Arabic,
Hebrew, and Farsi. It handles the complex shaping rules where a
character's form changes based on its position in a word (initial,
medial, final, isolated).
-
CJK (Chinese, Japanese, Korean)
Support: It expertly handles the thousands of glyphs
in these languages, including vertical writing layout and correct
line-breaking rules.
-
Indic Scripts: Supports the complex
conjuncts and reordering rules required for languages like Hindi,
Tamil, or Bengali.
-
Cursive and Pseudo-random Scripts: Supports
the contextual alternates needed for cursive scripts, in which the
letters are designed to connect smoothly to one another, often with
flowing strokes. Also supports pseudo-random fonts, which include
multiple glyphs for the same character and switch between them to
avoid repetitive patterns.
-
Shaping and Breaking: Uses a variety of
shaping and breaking engines for control over complex scripts.
Typically, ABCpdf uses the official Unicode engine along with
HarfBuzz and sometimes Uniscribe. HarfBuzz is the
industry-standard, open-source text shaping engine used by Android,
Chrome, Firefox, and most Linux systems. It supports OpenType
features and all major world scripts.
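Two of the ideas above, text direction and position-dependent letter forms, can be sketched in a few lines. This is a deliberately simplified illustration and not the algorithm used by ABCpdf, HarfBuzz, or Uniscribe: `base_direction` only looks for the first strongly directional character, and `positional_forms` uses a small hand-written `JOINING` table covering a handful of Arabic letters. Both function names and the table are assumptions made for this example.

```python
import unicodedata

# Simplified joining classes for a few Arabic letters.
# "D" = dual-joining, "R" = right-joining (joins only to the preceding letter).
JOINING = {
    "\u0628": "D",  # BEH
    "\u062A": "D",  # TEH
    "\u0643": "D",  # KAF
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    "\u0627": "R",  # ALEF
    "\u062F": "R",  # DAL
    "\u0631": "R",  # REH
    "\u0648": "R",  # WAW
}


def base_direction(text):
    """Guess paragraph direction from the first strong character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "ltr"
        if bidi in ("R", "AL"):
            return "rtl"
    return "ltr"  # no strong character found


def positional_forms(word):
    """Pick isolated/initial/medial/final for each letter, in logical order."""
    forms = []
    for i, ch in enumerate(word):
        kind = JOINING.get(ch)
        prev_kind = JOINING.get(word[i - 1]) if i > 0 else None
        next_kind = JOINING.get(word[i + 1]) if i + 1 < len(word) else None
        joins_prev = kind is not None and prev_kind == "D"
        joins_next = kind == "D" and next_kind is not None
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms
```

For the word كتاب (KAF, TEH, ALEF, BEH) this gives initial, medial, final, isolated: the ALEF does not connect forward, so the last letter stands alone. A production shaping engine additionally handles marks, ligatures, and the full Unicode joining data, which is why the work is delegated to dedicated engines.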
The Styled Text System
The styled text system makes complex text layout easy.
-
HTML-like Styling: You can style text with
properties such as indent, padding, text style, and font, in a way
similar to HTML.
-
Automatic Layout: Styled text is backed by a
sophisticated layout engine. You add styled text, and ABCpdf
.NET automatically handles pagination, line breaks and alignment
across multiple pages.
-
Rich Text: You can control many features,
from obvious ones like font weight and italicisation through to
less obvious ones like highlight, multiple strike-through and text
direction.
Meticulous Attention to the PDF Specification
The ABCpdf .NET developers are experts on the ISO-standardized
PDF specification (PDF 2.0, PDF 1.7, etc.). This means the library
produces files that are not just visually correct but are also
structurally sound and compliant. This is critical for:
-
Accessibility (PDF/UA): Generating tagged
PDFs with a logical reading order, alt text for images, and proper
language specification for screen readers. Correct font
handling is a prerequisite for this.
-
Archive (PDF/A): Generating PDFs where
text is inserted using the correct font subsets, entries and values
required by long-term archival standards.
-
Text Extraction and Search: Because text
is stored and mapped correctly, extracting text from an ABCpdf
.NET-generated PDF for search indexing or other processing is
highly accurate and reliable. You will not get gibberish where
special characters or ligatures should be.
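The reason correct mapping matters for extraction can be shown with a toy example. In a PDF, a font can carry a ToUnicode mapping from the codes used in the page content back to Unicode text, and a single code may map to more than one character. The sketch below imitates that with a plain dictionary; the `TO_UNICODE` table, its glyph numbers, and the `extract_text` helper are hypothetical and are not part of ABCpdf's API.

```python
# Hypothetical glyph-to-text map, playing the role of a ToUnicode mapping.
TO_UNICODE = {
    17: "o",
    23: "f",
    42: "c",
    58: "e",
    301: "fi",  # the "fi" ligature glyph maps back to two characters
}

REPLACEMENT = "\uFFFD"  # shown when a glyph has no mapping


def extract_text(glyph_run, to_unicode=TO_UNICODE):
    """Turn a run of glyph numbers back into searchable text."""
    return "".join(to_unicode.get(glyph, REPLACEMENT) for glyph in glyph_run)
```

Here `extract_text([17, 23, 301, 42, 58])` yields "office" even though the page shows a single ligature glyph for "fi". Without the entry for glyph 301, the same run would contain a replacement character, which is the kind of gibberish that missing or incorrect mappings produce.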
Comparison to Simpler Alternatives
Many other tools (like simple report generators or basic
wrappers) take a naive approach:
-
They "Print" Text: They often use a
platform graphics API (e.g., GDI+ on Windows, or Java2D) to
draw text as a series of shapes (curves and lines) onto the page.
This results in larger files (as the text is not real text but
paths) and makes the content non-selectable, non-searchable, and
inaccessible.
-
No Font Embedding: They rely on the system
fonts, so the document will look different on another machine that
does not have those fonts installed.
-
No Complex Text Layout: They completely
break on RTL or CJK languages.
In Summary:
ABCpdf .NET excels at fonts and text because it respects text as a
complex data structure and because it understands the raw font
formats. It knows about the semantics, encoding, and intricate rules
of written language and translates that knowledge into a precise,
portable, and compliant PDF document. It is the difference between a
word processor and a typewriter.