Cid Font F1 F2 F3 F4 May 2026
The Architecture of CID Fonts: Decoding the Roles of F1, F2, F3, and F4
In the realm of digital typography, particularly for complex scripts like Chinese, Japanese, and Korean (CJK), the limitations of traditional font formats such as Type 1 quickly became apparent. The need to handle thousands of glyphs efficiently led to the development of CID-keyed fonts (Character Identifier fonts). Within the technical documentation and internal structuring of these fonts, the designators F1, F2, F3, and F4 serve critical, distinct roles. These are not merely arbitrary labels but represent a logical hierarchy for processing character identifiers, mapping them to glyphs, and managing font resources. Understanding F1 through F4 is essential to grasping how modern CJK typesetting systems operate with speed and precision.
First, it is necessary to establish the foundational concept of a CID-keyed font. Unlike fonts that rely on a single-byte encoding (at most 256 encoded glyphs, as in a typical Latin Type 1 font), a CID-keyed font separates the character encoding from the glyph descriptions. A CID is a number that identifies a glyph within a character collection such as Adobe-Japan1, independent of any particular encoding. A CMap (Character Map) then translates from an external encoding (like Shift-JIS or Unicode) to these internal CIDs. The labels F1 through F4 used in this discussion are not names defined by the CID-keyed font specifications; they serve here as shorthand for four roles in the mapping and glyph retrieval process, and in PDF files the same labels appear as font resource names assigned by the producing software.
F1 typically represents the font’s primary CIDFont resource. It acts as the central dictionary or container that holds the glyph descriptions (charstrings) indexed by their CID numbers. In essence, F1 is the core visual database. When a rendering engine receives a CID, it queries F1 to find the corresponding vector outline for that character. F1 also contains crucial metadata, such as the default metrics (widths, heights) and the supplement number, which indicates the version of the character collection. Without F1, the raw CIDs would have no visual form; it is the "glyph library."
F2 and F3 are more specialized, often functioning as subsidiary or composite dictionaries. In complex scripts, a single final glyph may be composed of multiple parts. For example, a CJK character might consist of a radical and a phonetic component, or a vertical writing variant may require rotated or shifted glyphs. F2 commonly stores composite character data—instructions on how to combine base glyphs (referenced via their CIDs in F1) to form a new, higher-level character. F3, in turn, might hold variation or stylistic alternates, such as different glyph forms for the same CID (e.g., traditional vs. simplified, or printing vs. handwriting style). By organizing this data across F2 and F3, the font achieves modularity and avoids redundant storage of similar glyph parts.
F4 often serves the most dynamic role: the CMap or processing context. Unlike the static dictionaries F1-F3, F4 represents the active mapping interface between an input encoding (like Unicode text) and the internal CIDs used by F1. In some technical descriptions, F4 is the "virtual font" or the composite font object that ties together multiple F1 resources (e.g., one for Japanese, one for Latin) and selects which F2/F3 rules apply based on the context (e.g., horizontal vs. vertical writing mode). It is through F4 that a text renderer decides which CID to request from F1 and how to instruct F2/F3 to modify that glyph.
In practical operation, the four functions work in a pipeline. When a document containing Japanese text is rendered:
- The text engine receives a Unicode character.
- It consults the F4 composite font to find the correct CMap and determine which CIDFont (F1) to use.
- It uses that CMap to translate the Unicode value into a specific CID.
- The CID is passed to F1, which retrieves the base glyph outline.
- If the character requires composition (e.g., a kanji with a repeating element), F2 provides the composition rules.
- If a stylistic variant is requested (e.g., a specific regional form), F3 supplies the alternate glyph data.
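The lookup steps above can be sketched as a pair of table lookups. This is a minimal illustration under stated assumptions, not how any particular renderer is implemented: the `TOY_CMAP` and `TOY_GLYPHS` tables, the CID values in them, and the function name `lookup_glyph` are all hypothetical.

```python
# Toy model of the code -> CID -> glyph lookup. All tables and CID
# values here are made up for illustration; real CMaps and CIDFonts
# are far larger and are not stored as Python dictionaries.

# Hypothetical CMap: Unicode code point -> CID.
TOY_CMAP = {
    0x65E5: 3284,  # U+65E5, CID value invented for this example
    0x672C: 3768,  # U+672C, CID value invented for this example
}

# Hypothetical CIDFont glyph table: CID -> glyph description.
TOY_GLYPHS = {
    0: "<.notdef outline>",
    3284: "<outline for CID 3284>",
    3768: "<outline for CID 3768>",
}


def lookup_glyph(char: str) -> tuple[int, str]:
    """Map one character to (CID, glyph), falling back to CID 0."""
    cid = TOY_CMAP.get(ord(char), 0)  # unmapped codes go to CID 0
    glyph = TOY_GLYPHS.get(cid, TOY_GLYPHS[0])
    return cid, glyph


for ch in "\u65e5\u672cA":
    print(repr(ch), lookup_glyph(ch))
```

The fallback to CID 0 mirrors the convention that CID 0 is the .notdef glyph, so a character missing from the CMap still yields something drawable.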
The separation of duties among F1, F2, F3, and F4 confers immense advantages: efficiency (reusing common glyph parts), compactness (no need to store every CJK character as a unique, atomic glyph), extensibility (supporting an additional encoding only requires a new CMap, not a new glyph set), and flexibility (switching between horizontal and vertical writing or regional variants becomes a matter of changing the F4 context, not the entire font).
In conclusion, the designators F1 through F4 in CID-keyed fonts are not superficial technical labels but represent a sophisticated, layered architecture for multilingual typography. F1 acts as the glyph repository, F2 manages composition, F3 handles variations, and F4 orchestrates the mapping and context. Together, they solve the historic problem of representing thousands of complex characters without bloating file sizes or compromising rendering speed. For designers, engineers, and typographers working with East Asian languages, understanding this F1-F4 framework is not merely academic—it is essential to harnessing the full power of digital type.
In PDF document structures, CIDFont+F1, F2, F3, and F4 are internal labels assigned by PDF-generation software (like Adobe Distiller or Microsoft Print to PDF) when it cannot or chooses not to embed the original font names. These are not "real" font names you can find in a standard font library; rather, they are placeholders for Character Identifier (CID) fonts used to handle large character sets or encoding issues.
Breakdown of CID Font Labels
The labels F1 through F4 (and beyond) are generally assigned incrementally by the PDF producer. While the exact mapping can vary between documents, they typically represent different styles or weights of the primary fonts used in the original source:
CIDFont+F1: Often represents the primary typeface in Bold style (e.g., Arial Bold).
CIDFont+F2: Typically represents the primary typeface in Regular style (e.g., Arial Regular).
CIDFont+F3 and CIDFont+F4: These usually correspond to other variations like Italic, Bold Italic, or secondary typefaces used in the document.
Technical Overview
Structure: A CID-keyed font is a "composite" font that uses Character IDs (CIDs) to index glyphs, making it more efficient for languages with thousands of characters, such as Chinese, Japanese, or Korean (CJK).
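Encoding: These fonts often use the Identity-H (horizontal) or Identity-V (vertical) encoding, which maps each 2-byte character code directly to the CID of the same value; for embedded TrueType-based CID fonts, that CID usually doubles as the glyph index in the font file. Because the codes carry no Unicode meaning of their own, text extraction can fail or return garbage when the font's ToUnicode map is missing or incomplete.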
Anonymization: Because these names are generated by the PDF producer during the export process, they do not tell you the original font's name. To identify the actual font, you often need a tool or library that can look inside the embedded font program itself, such as iTextSharp.
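As a rough illustration of the Identity-H point above: under Identity-H and Identity-V, every character code is two bytes, read big-endian, and the code is used directly as the CID. The helper name `identity_h_cids` and the sample bytes below are assumptions made for this sketch.

```python
def identity_h_cids(data: bytes) -> list[int]:
    """Split a shown string into CIDs, assuming an Identity-H/V encoding.

    Each code is 2 bytes, big-endian, and equals the CID itself.
    """
    if len(data) % 2:
        raise ValueError("Identity-H strings must have an even byte length")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# Sample bytes as they might appear in a PDF hex string such as <00240048>.
sample = bytes.fromhex("00240048")
print(identity_h_cids(sample))  # [36, 72]
```

The resulting numbers identify glyphs, not characters, so turning them back into readable text still depends on a ToUnicode map or on the embedded font program.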
In the quiet architecture of digital documentation, there exists a phenomenon that is simultaneously a glitch, an aesthetic, and a philosophical statement: The CID Font Hierarchy.
When you see the sequence F1, F2, F3, F4, you are not looking at a mistake. You are looking at the exposed skeleton of communication. You are seeing the ghost in the machine refusing to wear its skin.
Here is a deep dive into the quiet tragedy of the CID Font.
PDF internals — how F1..F4 appear
- In a PDF page’s Resources dictionary you’ll often see: /Font << /F1 12 0 R /F2 13 0 R /F3 14 0 R >> Each Fi is a resource name mapping to a Font object.
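- A Font object may be a Type0 (composite) font whose DescendantFonts entry references a CIDFont (subtype CIDFontType0 or CIDFontType2) with a specified CIDSystemInfo.
- Text operators reference fonts by resource name: BT /F1 14 Tf selects the font registered as F1 at size 14, and a following Tj or TJ operator shows text in that font.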
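To make the resource-name indirection concrete, here is a small sketch that pulls the font references and the Tf selections out of text fragments shaped like the examples above. It is only an illustration: the `resources` and `content` strings are hypothetical, and real PDFs usually keep these in compressed streams that need a proper PDF parser rather than regular expressions.

```python
import re

# Hypothetical fragments, shaped like the examples above.
resources = "/Font << /F1 12 0 R /F2 13 0 R /F3 14 0 R >>"
content = "BT /F1 14 Tf (Hello) Tj ET BT /F3 9 Tf (World) Tj ET"

# Resource name -> (object number, generation number).
font_refs = {
    name: (int(obj), int(gen))
    for name, obj, gen in re.findall(r"/(\w+)\s+(\d+)\s+(\d+)\s+R", resources)
}

# Every "/Name size Tf" font selection in the content stream.
selections = [
    (name, float(size))
    for name, size in re.findall(r"/(\w+)\s+([\d.]+)\s+Tf", content)
]

print(font_refs)
print(selections)
```

Joining the two results shows which indirect object each Tf operator ends up using; for instance, /F3 in the content stream resolves to object 14 0.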
How to Identify Which Font F1 Actually Is
You don't have to guess. Here is how to map F1 to the real font name:
In Adobe Acrobat Pro:
- Open the PDF.
- Go to File > Properties > Fonts.
- Look at the list. You will see entries labeled F1, F2, etc.
- Immediately next to the tag is the Actual Font Name (e.g., F1 - "KozMinPr6N-Regular").
Using Command Line (pdffonts - Linux/macOS):
pdffonts document.pdf
Output:
name type         encoding      emb sub uni object ID
---- ------------ ------------- --- --- --- ---------
F1   CID Type 0   Identity-H    yes yes yes     12  0
F2   CID TrueType UniJIS-UCS2-H yes yes yes     14  0
This shows F1 is an embedded, subset PostScript CID font (type CID Type 0, with yes under both emb and sub), while F2 is an embedded, subset TrueType-based CID font.
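If you need this information in a script rather than on screen, the listing can be split into fields. The sketch below is a rough parser for text shaped like the output above; the function name `parse_pdffonts` is hypothetical, and it assumes font names contain no spaces and that every row ends with the encoding, the three yes/no flags, and a two-number object ID.

```python
def parse_pdffonts(output: str) -> list[dict]:
    """Parse pdffonts-style text into one dict per font row.

    Assumes font names contain no spaces and that each row ends with
    encoding, emb, sub, uni, and a two-number object ID.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(parts) < 8 or parts[0] == "name" or set(line.strip()) <= {"-", " "}:
            continue
        encoding, emb, sub, uni, obj, gen = parts[-6:]
        rows.append({
            "name": parts[0],
            "type": " ".join(parts[1:-6]),  # the type may contain spaces
            "encoding": encoding,
            "embedded": emb == "yes",
            "subset": sub == "yes",
            "unicode": uni == "yes",
            "object": (int(obj), int(gen)),
        })
    return rows


sample = """\
name type         encoding      emb sub uni object ID
---- ------------ ------------- --- --- --- ---------
F1   CID Type 0   Identity-H    yes yes yes     12  0
F2   CID TrueType UniJIS-UCS2-H yes yes yes     14  0
"""

for row in parse_pdffonts(sample):
    print(row["name"], row["type"], row["encoding"], row["embedded"])
```

In practice the input string could come from running the tool yourself, for example with `subprocess.run(["pdffonts", "document.pdf"], capture_output=True, text=True).stdout`, assuming pdffonts is installed and on the PATH.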
pdffonts (from Xpdf/Poppler)
List all fonts in a PDF, showing whether they are CID fonts and their internal names:
pdffonts document.pdf
Output example:
name type         encoding   emb sub uni object ID
---- ------------ ---------- --- --- --- ---------
F1   CID Type 0   Identity-H yes yes yes      7  0
F2   CID TrueType Identity-V yes yes yes     10  0