Typography | Lingo | Google Fonts | Font Library
UTF-8 Unicode Blocks | Code Point Tables | Browser Test | A–Z Index | Quackit.com
- Basic Latin (U+0000–007F)
- Latin-1 Supplement (U+0080–00FF)
- Latin Extended-A (U+0100–017F)
- Greek and Coptic (U+0370–03FF)
- Currency Symbols (U+20A0–20CF)
- Letterlike Symbols (U+2100–214F)
- Miscellaneous Technical (U+2300–23FF)
- Geometric Shapes (U+25A0–25FF)
- Dingbats (U+2700–27BF)
- Mathematical Alphanumeric Symbols (U+1D400–1D7FF)
- Transport and Map Symbols (U+1F680–1F6FF)
- Emoticons (U+1F600–1F64F) … are not emojis, but adopted many (2008/2010) into this code block.
- Emoji | Unicode Emoji Characters | Unicode Version 10.0 | Full Emoji List
- Emojis are pictographs, whereas emoticons are typographs. Emoji span several Unicode blocks, and their renderings vary widely per platform. They originated in Japan (1997) for use with SMS and were popularized in the West when several mobile operators began including them (2010).
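Because each block is just an inclusive range of Code Points, finding the block of a character is a range lookup. The sketch below is illustrative only: `BLOCKS` and `block_of` are hypothetical names, and the table holds only the handful of blocks listed above, not the full Unicode block list.

```python
from typing import Optional

# Inclusive (first, last, name) ranges for the blocks listed above.
# Geometric Shapes is 25A0-25FF per the Unicode block definitions.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x20A0, 0x20CF, "Currency Symbols"),
    (0x2100, 0x214F, "Letterlike Symbols"),
    (0x2300, 0x23FF, "Miscellaneous Technical"),
    (0x25A0, 0x25FF, "Geometric Shapes"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x1D400, 0x1D7FF, "Mathematical Alphanumeric Symbols"),
    (0x1F600, 0x1F64F, "Emoticons"),
    (0x1F680, 0x1F6FF, "Transport and Map Symbols"),
]


def block_of(ch: str) -> Optional[str]:
    """Return the listed block containing ch, or None if it lies outside this subset."""
    code_point = ord(ch)
    for first, last, name in BLOCKS:
        if first <= code_point <= last:
            return name
    return None


for ch in "A§●🚀😀":
    print(ch, f"U+{ord(ch):04X}", block_of(ch))
```

A character such as "…" (U+2026) returns `None` here only because its block, General Punctuation, is not in this short table.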
Unicode is a character-set
- Unicode Charset :: char <—> int
A glyph maps to a Code Point.
UTF-8 is an encoding
- UTF-8 Encoding :: int <—> bin
A Code Point maps to byte(s).
- One Unicode glyph is 1 to 4 bytes in UTF-8; this encoding allows for more than a million glyphs.
- UTF-8 is the dominant encoding of Unicode, especially on the web; UTF-16 and UTF-32 encode the same Code Points differently.
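The two mappings above (char <—> int, then int <—> bytes) can be observed directly with Python built-ins. This is a minimal sketch; `utf8_info` is a hypothetical helper name, and the sample characters are chosen only to show the 1, 2, 3, and 4 byte cases.

```python
def utf8_info(ch: str) -> tuple:
    """Return (Code Point notation, UTF-8 bytes) for a single character."""
    code_point = ord(ch)              # character set: char -> int
    encoded = ch.encode("utf-8")      # encoding: Code Point -> byte(s)
    # Both mappings are reversible.
    assert chr(code_point) == ch
    assert encoded.decode("utf-8") == ch
    return f"U+{code_point:04X}", encoded


for ch in ["A", "§", "…", "🚀"]:
    notation, encoded = utf8_info(ch)
    print(f"{ch}  {notation}  {len(encoded)} byte(s)  {encoded.hex(' ')}")

# A  U+0041  1 byte(s)  41
# §  U+00A7  2 byte(s)  c2 a7
# …  U+2026  3 byte(s)  e2 80 a6
# 🚀  U+1F680  4 byte(s)  f0 9f 9a 80
```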
Unicode Code Point
A namespaced integer defining a single Unicode character, a.k.a. glyph, a.k.a. rune, a.k.a. symbol. The integer is most often referenced by its hexadecimal representation preceded by "U+". For example, the popular "🚀" glyph is defined under Unicode as:
Character: 🚀
Code Point: U+1F680
Name: ROCKET
Block: Transport and Map Symbols
Convert between Unicode and Bytes (Encode/Decode)
Convert between Unicode and HTML Entity/Symbol using the hex representation of the Unicode Code Point: replace "U+" with "&#x", drop any leading zeros, and append ";". E.g., U+00A7 <—> &#xA7; (§)
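The conversion rule above is mechanical enough to sketch in a few lines. The function names `code_point_to_entity` and `entity_to_code_point` are hypothetical, introduced here for illustration; `html.unescape` is the standard-library call that decodes an entity back to its character.

```python
import html


def code_point_to_entity(notation: str) -> str:
    """'U+00A7' -> '&#xA7;': swap the 'U+' prefix for '&#x', drop leading zeros, append ';'."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"expected 'U+XXXX' notation, got {notation!r}")
    value = int(notation[2:], 16)     # parsing as an integer discards leading zeros
    return f"&#x{value:X};"


def entity_to_code_point(entity: str) -> str:
    """'&#xA7;' -> 'U+00A7' (also accepts decimal or named entities)."""
    ch = html.unescape(entity)
    if len(ch) != 1:
        raise ValueError(f"{entity!r} does not decode to a single character")
    return f"U+{ord(ch):04X}"


print(code_point_to_entity("U+00A7"))                 # &#xA7;
print(entity_to_code_point("&#xA7;"))                 # U+00A7
print(html.unescape(code_point_to_entity("U+1F680")))  # 🚀
```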
Disambiguation
A Code Point is a unique identifier, and one that is unambiguous, always and everywhere, unlike all other relevant typographical terms.
In the context of Unicode/UTF-8 (character-set/encoding), all references to the visually rendered (character, glyph, rune, symbol) are synonymous. However, in the context of typefaces, such references may differ. The whole purpose of a typeface is to render its set of characters distinctly from those of others, with further variations amongst its own font-family (light, bold, condensed, …). So, in such contexts, a glyph, character, rune, symbol or whatever, references a particular design and/or rendering of such. That is, for any one Code Point, e.g., "A", "LATIN CAPITAL LETTER A (U+0041)", there are many glyphs, one from each font, each rendering distinctly by design (normal, italics, extended, …). In such a context, the distinct renderings are often referred to as glyphs, while the (unique) Code Point is often referred to as the character or the symbol. This ambiguous lingo is pervasive across the whole of typography. See Typographic Terms.
HTML Entities
Every Unicode Code Point converts to its HTML entity for browser rendering. An HTML entity, a.k.a. HTML Symbol, has several representations, all of which are equivalent: hexadecimal (&#x*;), decimal (&#*;), and, for some, named (&NAME;). Note that an HTML Entity Name is typically not its Unicode Code Point Name. For example, the ellipsis symbol (…) is Unicode Code Point U+2026 with Unicode Name "HORIZONTAL ELLIPSIS". That maps to HTML Entity &#x2026; (hex), &#8230; (dec), &hellip; (named).
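A quick way to check that the three representations are equivalent, and that the entity name differs from the Unicode name, is to decode them with the standard library. This is only a sketch; `decode_all` is a hypothetical helper name.

```python
import html
import unicodedata
from html.entities import name2codepoint


def decode_all(entities):
    """Decode each HTML entity string to the character it represents."""
    return [html.unescape(entity) for entity in entities]


# Hex, decimal, and named forms of the ellipsis all decode to the same character.
decoded = decode_all(["&#x2026;", "&#8230;", "&hellip;"])
assert decoded == ["…", "…", "…"]

# The HTML entity name and the Unicode name are different identifiers.
print(f"U+{name2codepoint['hellip']:04X}")   # U+2026
print(unicodedata.name("…"))                 # HORIZONTAL ELLIPSIS
```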
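| symbol | code | hex | dec | Unicode name |
|---|---|---|---|---|
| A | U+0041 | &#x41; | &#65; | LATIN CAPITAL LETTER A |
| § | U+00A7 | &#xA7; | &#167; | SECTION SIGN |
| © | U+00A9 | &#xA9; | &#169; | COPYRIGHT SIGN |
| ֍ | U+058D | &#x58D; | &#1421; | RIGHT-FACING ARMENIAN ETERNITY SIGN |
| ᛟ | U+16DF | &#x16DF; | &#5855; | RUNIC LETTER OTHALAN ETHEL O |
| ☧ | U+2627 | &#x2627; | &#9767; | CHI RHO |
| ☩ | U+2629 | &#x2629; | &#9769; | CROSS OF JERUSALEM |
| ⋮ | U+22EE | &#x22EE; | &#8942; | VERTICAL ELLIPSIS |
| ︙ | U+FE19 | &#xFE19; | &#65049; | PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS |
| ☰ | U+2630 | &#x2630; | &#9776; | TRIGRAM FOR HEAVEN |
| ✕ | U+2715 | &#x2715; | &#10005; | MULTIPLICATION X |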
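Rows like those in the table above can be derived from the character alone. The sketch below assumes Python's bundled `unicodedata` database knows the character (true for all symbols in the table on current Python 3 releases); `table_row` is a hypothetical helper name.

```python
import unicodedata


def table_row(ch: str) -> tuple:
    """Build (symbol, code, hex entity, dec entity, Unicode name) for one character."""
    code_point = ord(ch)
    return (
        ch,
        f"U+{code_point:04X}",      # Code Point notation, zero-padded to 4 hex digits
        f"&#x{code_point:X};",      # hexadecimal HTML entity
        f"&#{code_point};",         # decimal HTML entity
        unicodedata.name(ch),       # official Unicode character name
    )


for ch in "A§©☰✕":
    print(" | ".join(table_row(ch)))
```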
Render identically (per browser) regardless of representation:
HEX:A|§|©|֍|ᛟ|☧|☩|⋮|︙|☰|✕ … 𐍈|狼|𐎿|𓃲|⳩
DEC:A|§|©|֍|ᛟ|☧|☩|⋮|︙|☰|✕ … 𐍈|狼|𐎿|𓃲|⳩
🐐|🐏|🐓|🐘|🐙|🐝|🐞|🐧|🐨|🐯|🦑|🐢|🐪|🐫|🐬|🐷|🐍|🐉|🐲|🐳|🐸|🦉|🦇|🐺|🦊|🐻|🐼|🦍|🦖
- Others:
✱|⊛|✥|✲|❊|🞾|🞿|⁕|Ϻ|Ͼ|ॐ|⛉|🄰|🅅 ✔|✖|✘|✚|❝|❞|❠|🙶|🙷|🙸|❖|❤|❶|❷|❿
…|❡|¶|⁋|⸿|※|⁘|⁛|⁜|№|₿|•|℠|™
•|●|⚪|⬤|웃|℗|⌒|♠|♣|♥|♦
🛒|🎃|💀|🏁|😀|🤢|👌|🥡|🏠|🚀|🤖|🤡
🌡|⛈|🌪|☃|❄|🌌|⚡|🥇|🚹|🚺|🚮
💻|📱 ƿ૯ωძɿ૯ƿɿ૯
Non-breaking space (U+00A0, &nbsp;): (foo bar)
Non-breaking hyphen (U+2011, &#8209;): ‑ (foo‑bar)
Typographic Terms
- Typesetting — The composition of text by means of arranging physical types, or the digital equivalent, e.g., TeX typesetting system.
- LaTeX — Document preparation system (1983) including its own markup tagging conventions; widely used in academia.
- Language — Character set; alphabetic writing system:
- LGC — Latin/Greek/Cyrillic
- CJK — Chinese/Japanese/Korean
- Typeface — One font family (see Typeface Classifications); may comprise many fonts of varying style & weight; -thin, -medium, -bold, -bold-italic, -black, -condensed, -expanded-light, …
- Font — One complete set of unique glyphs. One or more constitute a font family (typeface); the earliest digital fonts were designed as bitmaps for rendering at a specified pixel size, but today virtually all are vector-based (see Outline Font).
- Extended (meaning per context):
- Larger character set(s); more glyphs; typically declared per Unicode table(s), e.g., "Latin-1 Sup", "Latin Extended-A", "Currency Symbols", …
- A synonym for an expanded font; wider glyphs (horizontally stretched) especially relative to its font family (typeface).
- Subsetting — Rebuilding/converting a font per some subset of its character set; typically per character type or Unicode table(s); to reduce TX/RX file size.
- Glyph — One character/rune/symbol (visual) rendering; an elemental symbol; designed and rendered per either vector or bitmap (per font); a font is a set of glyphs; every Unicode character (glyph/rune/symbol) has a unique Name and Code Point (namespaced integer).
- X-height — Distance between the baseline and the mean line of a glyph.
- Kerning — Mortising; adjusting spacing between characters of a proportional font, e.g., "Wa", "AV"; designed, per-font effect(s).
- Ligature — Joining adjacent chars into 1 glyph, e.g., "fl". Note the ampersand, "&", is a ligature of Latin "et"; designed, per-font effect(s).
- Diacritic (Diacritic Mark, Diacritical Point, or Diacritical Sign) — Accent symbol(s) added to the basic glyph to compose certain letters of a character set. Many glyphs of the Cyrillic character set include such symbols.
- Bitmap/Pixel/Raster Font — Defines each of its glyphs by a matrix of pixel declarations (describing every pixel thereof); designed and rendered to its specified pixel dimensions, exclusively; faster and simpler to render than vector fonts, but does not scale; degrades if rendered to any font size (px) other than that of its design. Older technology than Outline Font.
- Outline Font (Vector Font) — Computer font implemented using vector graphics; an image (glyph) consisting of lines and curves (Bézier splines) defining only its boundary (outline).
- (Adobe) PostScript — Computer language for creating vector graphics, created by Adobe. The universal standard for Outline Fonts:
- PostScript Type 1
- PostScript Type 3
- TrueType (TTF; .ttf): Developed by Apple & Microsoft (1980s); competitor to Adobe Type 1 fonts used in PostScript.
- OpenType (OTF; .otf): Successor to TrueType; developed by Microsoft & Adobe; now a standard; (Web) Open Font Format (OFF/WOFF).
- Embedded Open Type (EOT; .eot) — Compressed form of OTF; designed and implemented by and for Microsoft (for Internet Explorer).
- Web Open Font Format (WOFF; .woff & .woff2) — Compressed format of TTF (.ttf) or OTF (.otf); the de facto standard for web-served, browser-rendered fonts:
- WOFF 1.0 (.woff) is supported by most modern browsers; Chrome (v36+), Firefox (v35+), and Opera (v26+); MIME type font/woff.
- WOFF 2.0 (.woff2) compression (~30% smaller) is not as widely supported; MIME type font/woff2.
- SVG Fonts — Font description per SVG <font> element; rejected by most browser vendors; "… not meant for compatibility with other formats … currently [2019] supported only in Safari and Android Browser … removed from Chrome 38 (and Opera 25) … Firefox has postponed its implementation indefinitely to concentrate on WOFF"
- ClearType — Sub-pixel font rendering technology by Microsoft and utilized in Windows OS; improves legibility (important); sacrifices color accuracy (less important).
- Typeface Classifications
- Buckets of ambiguous notions; multiple contradictory meanings, even within the same context. E.g., both 'Humanist' and 'Realist' may refer to either serif or sans-serif; 'Old Style' is serif, whereas 'Old English' is Blackletter; 'Gothic' is sans-serif, whereas 'Gothic Script' is Blackletter; 'Geometric' is sans-serif, but may reference a certain subset of typefaces therein. Here's the hellscape:
- Roman: An ambiguous term having different meanings per context:
- Typeface Category: Serif (vs. Blackletter or Gaelic/Irish)
- Style: normal (vs. italic).
- Weight: normal (vs. bold).
- Variety of Cyrillic script (vs. Slavonic).
- Serif Categories:
- Roman
- Old Style (Humanist)
- Transitional (Baroque)
- Modern (Didone)
- Slab (Egyptian, Realist)
- Sans-serif Categories:
- Grotesque (Grotesk{German})
- Gothic
- Realist (if modern)
- Neo-grotesque
- Geometric
- Humanist
- Blackletter: A class of typeface; mimics handwriting; scripts of old (1100–1800) W. Europe; also called Textura, Gothic Script, Gothic Minuscule, or Old English. (Antiqua, by contrast, is the roman style that competed with and eventually displaced it.)
- Blackletter categories:
- Fraktur
- Kurrentschrift
- Sütterlin
- Vox-ATypI (Association Typographique Internationale) — An older (1962), defunct (2010), typeface classification scheme:
- Classicals: Humanist, Garalde, Transitional
- Moderns: Didone, Mechanistic, Lineal, Grotesque, Neo-grotesque, Geometric, Humanist
- Calligraphics: Glyphic, Script, Graphic, Blackletter, Gaelic
- Non-Latin/Exotic: Greek, Cyrillic, Hebrew, Arabic, Chinese
Typographic Properties :: Material Design
Grammatical Usage
Hyphen / En dash / Em dash
Hyphen (-)
- As a compound modifier before a noun:
dog-friendly apartments
- As a separator for prefix, suffix, compound number, or line-break:
mid-October, president-elect, forty-five, inter-
national
- As a field separator:
1-800-123-4567
En dash (–)
- As a range separator:
pages 32–37, Ja–Li, 6:30–8:00, 1995–
Em dash (—)
- In place of parentheses, perhaps as a second parenthetical:
Start main notion — inject parenthetical notion — continue with main (perhaps containing this too) notion.
- Apply the em dash with or without spaces, but do so consistently.
- In place of a colon:
The verdict was in — guilty.
- To indicate some kind of lexical distinction:
… label, input, button — they're all HTML Form elements.
That's wizardry — peddling a perfect inversion of reality.
- Use 2 consecutively to indicate missing letter(s):
The mob was chanting something like ad——n [addiction?].
- Use 3 consecutively to indicate missing word(s):
They ——— before, and then ——— drug store.
The Em dash is less formal than its alternatives; less prevalent in business and academia; more so in consumer-facing prose. Especially useful for depicting verbal dialogue.
Round/Square Brackets
Round brackets, a.k.a. parentheses, are mainly used to separate off information that isn’t essential to the meaning of the rest of the sentence:
- Mount Denali (in Alaska) is the highest mountain in North America.
There are several books on the subject (see page 120).
Square brackets, a.k.a. brackets in American usage (braces are the curly kind), are mainly used to enclose words added by someone other than the original writer or speaker, typically in order to clarify the situation:
- He [the cousin] left the house long before that happened.