
U8g2 Font Format

The data format of U8g2 fonts is based on the BDF font format. Glyph bitmaps are compressed with a run-length encoding algorithm, and glyph header fields are stored as bit fields of variable width to minimize the flash memory footprint.

Font Header
Offset (Byte)	Size	Content
0.	1 Byte	Number of glyphs
1.	1 Byte	Bounding Box Mode (not relevant for decoding)
2.	1 Byte	m₀ Bit width of Zero-Bit run-length encoding in bitmap
3.	1 Byte	m₁ Bit width of One-Bit run-length encoding in bitmap
4.	1 Byte	Bit width of glyph bitmap width
5.	1 Byte	Bit width of glyph bitmap height
6.	1 Byte	Bit width of glyph bitmap x offset
7.	1 Byte	Bit width of glyph bitmap y offset
8.	1 Byte	Bit width of glyph character pitch
9.	1 Byte	Font Bounding Box width
10.	1 Byte	Font Bounding Box height
11.	1 Byte	Font Bounding Box x Offset
12.	1 Byte	Font Bounding Box y Offset
13.	1 Byte	Ascent (size above baseline) of letter "A"
14.	1 Byte	Descent (size below baseline) of letter "g"
15.	1 Byte	Ascent of "("
16.	1 Byte	Descent of "("
17+18.	2 Bytes	Array offset of glyph "A"
19+20.	2 Bytes	Array offset of glyph "a"
21+22.	2 Bytes	Array offset of glyph 0x0100

Glyph (variable length)

Offset	Size (uncompressed)	Content
0.	1/2 Byte(s)	Unicode of character/glyph
1. (+1)	1 Byte	jump offset to next glyph
	1 Byte	glyph bitmap width (variable width)
	1 Byte	glyph bitmap height (variable width)
	1 Byte	glyph bitmap x offset (variable width)
	1 Byte	glyph bitmap y offset (variable width)
	1 Byte	character pitch (variable width)
	n Bytes	Bitmap (horizontal, RLE)

Font header

The font header (see table above) is always 23 Bytes long. It contains decoding parameters for the glyph bitmap data, font outline data, such as overall font bounding box dimensions, and index offsets for frequently used character ranges.
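As a minimal sketch, the header table can be mapped onto a C struct as shown here. The struct and function names are hypothetical and not taken from the U8g2 sources. The sketch also assumes that the two-byte array offsets are stored high byte first and that the bounding box, ascent and descent values are signed bytes; the table itself states neither.

```c
#include <stdint.h>

#define FONT_HEADER_SIZE 23u  /* fixed header length stated above */

/* Hypothetical struct; one member per row of the header table. */
typedef struct {
    uint8_t  glyph_count;
    uint8_t  bbx_mode;
    uint8_t  bits_per_0;       /* m0: bit width of zero runs */
    uint8_t  bits_per_1;       /* m1: bit width of one runs */
    uint8_t  bits_per_width;
    uint8_t  bits_per_height;
    uint8_t  bits_per_x;
    uint8_t  bits_per_y;
    uint8_t  bits_per_pitch;
    int8_t   bbx_width;        /* signedness is an assumption */
    int8_t   bbx_height;
    int8_t   bbx_x_offset;
    int8_t   bbx_y_offset;
    int8_t   ascent_A;
    int8_t   descent_g;
    int8_t   ascent_paren;
    int8_t   descent_paren;
    uint16_t offset_A;
    uint16_t offset_a;
    uint16_t offset_0x100;
} font_header_t;

/* Assumption: 16-bit values are stored high byte first. */
static uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Copy the 23 header bytes at 'font' into the struct, field by field. */
void font_header_parse(const uint8_t *font, font_header_t *h)
{
    h->glyph_count     = font[0];
    h->bbx_mode        = font[1];
    h->bits_per_0      = font[2];
    h->bits_per_1      = font[3];
    h->bits_per_width  = font[4];
    h->bits_per_height = font[5];
    h->bits_per_x      = font[6];
    h->bits_per_y      = font[7];
    h->bits_per_pitch  = font[8];
    h->bbx_width       = (int8_t)font[9];
    h->bbx_height      = (int8_t)font[10];
    h->bbx_x_offset    = (int8_t)font[11];
    h->bbx_y_offset    = (int8_t)font[12];
    h->ascent_A        = (int8_t)font[13];
    h->descent_g       = (int8_t)font[14];
    h->ascent_paren    = (int8_t)font[15];
    h->descent_paren   = (int8_t)font[16];
    h->offset_A        = read_u16_be(font + 17);
    h->offset_a        = read_u16_be(font + 19);
    h->offset_0x100    = read_u16_be(font + 21);
}
```

Reading the bytes one by one, rather than casting the font pointer to a struct, keeps the sketch independent of struct padding and host byte order.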

Glyph data

The data for each glyph starts with its Unicode code point, stored in 1 byte for code points below 0x100 and in 2 bytes for code points from 0x100 upwards. The jump offset to the next glyph follows.

To find the glyph for a given Unicode code point, the decoder steps through the array from one jump offset to the next until the code point is either found or exceeded. For the most important code range, 32...255 (decimal), the font header provides the array offsets of key glyphs ('A', 'a' and 0x100) to minimize search overhead.
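The search described above could look roughly like the following sketch for code points below 0x100. It rests on several assumptions that this page does not state: the array offsets count from the first byte after the 23-byte header, the jump offset counts from the start of the glyph record, glyphs are stored in ascending code point order, and a jump offset of 0 marks the end of the list. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FONT_HEADER_SIZE 23u

/* Assumption: 16-bit header values are stored high byte first. */
static uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/*
 * Look up a glyph with a code point below 0x100.
 * Returns a pointer to the first byte after the code point and the
 * jump offset (the start of the bit-packed glyph data), or NULL if
 * the font does not contain the glyph.
 */
const uint8_t *find_glyph_8bit(const uint8_t *font, uint8_t code)
{
    const uint8_t *p = font + FONT_HEADER_SIZE;

    /* Use the shortcuts from the font header to skip part of the list. */
    if (code >= 'a')
        p += read_u16_be(font + 19);   /* array offset of glyph 'a' */
    else if (code >= 'A')
        p += read_u16_be(font + 17);   /* array offset of glyph 'A' */

    for (;;) {
        uint8_t jump = p[1];
        if (jump == 0)
            return NULL;               /* assumed end-of-list marker */
        if (p[0] == code)
            return p + 2;              /* found */
        if (p[0] > code)
            return NULL;               /* exceeded: glyph not in font */
        p += jump;                     /* step to the next glyph record */
    }
}
```

Code points from 0x100 upwards would start at the third header offset and compare two-byte code points instead; that part is left out here.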

All subsequent glyph data is not aligned to byte boundaries; instead, the bit width of each field is provided by the font header (see the "Bit width of ..." entries). Because each property is stored with only as many bits as it needs rather than in full bytes, the stored data shrinks noticeably without much decoding overhead.

Glyph bitmaps

u8g2_order

Glyph bitmaps are 1-bit, horizontally packed bitmaps with a tight-fit bounding box (see image above). They are compressed by an ad-hoc run-length algorithm. The bit array is a series of sequences, each consisting of

m₀ Bits (see font header) denoting the number of zeros
m₁ Bits (see font header) denoting the number of ones
n Bits == 1 (to be counted) denoting the number of additional repetitions of this zeros-and-ones pattern and
1 Bit == 0 as stop marker for each sequence.

Runs may continue across the end of a bitmap line. Glyph bitmaps do not contain end markers, since their widths and heights are known.
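The decompression scheme above can be sketched as follows. This is an illustration under stated assumptions, not the U8g2 decoder itself: bits are assumed to be consumed starting at the least significant bit of each byte, the output is one byte per pixel for clarity, the input is trusted font data, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader state: current byte and bit position in it. */
typedef struct {
    const uint8_t *ptr;
    uint8_t bit_pos;
} bit_reader_t;

/* Read 'count' bits, least significant bit first (assumed bit order). */
static unsigned read_bits(bit_reader_t *r, unsigned count)
{
    unsigned value = 0;
    for (unsigned i = 0; i < count; i++) {
        unsigned bit = (unsigned)(*r->ptr >> r->bit_pos) & 1u;
        value |= bit << i;
        if (++r->bit_pos == 8) {
            r->bit_pos = 0;
            r->ptr++;
        }
    }
    return value;
}

/*
 * Decode width * height pixels into 'pixels' (one byte per pixel, 0 or 1).
 * m0 and m1 are the run-length bit widths from the font header.
 */
void rle_decode(const uint8_t *data, unsigned m0, unsigned m1,
                size_t pixel_count, uint8_t *pixels)
{
    bit_reader_t r = { data, 0 };
    size_t pos = 0;

    while (pos < pixel_count) {
        unsigned zeros = read_bits(&r, m0);
        unsigned ones  = read_bits(&r, m1);
        if (zeros + ones == 0)
            break;                      /* guard against malformed data */
        do {
            /* Runs simply continue into the next bitmap line. */
            for (unsigned i = 0; i < zeros && pos < pixel_count; i++)
                pixels[pos++] = 0;
            for (unsigned i = 0; i < ones && pos < pixel_count; i++)
                pixels[pos++] = 1;
            /* A 1 bit repeats the pattern, a 0 bit ends the sequence. */
        } while (read_bits(&r, 1) != 0 && pos < pixel_count);
    }
}
```

Because the pixel count is known from the glyph width and height, the loop stops on its own and no end marker is needed in the data.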

Bounding boxes

u8g2_font_boundingbox

In contrast to some other graphics systems, the coordinate axes of the bounding boxes point to the right and upwards. For most characters, the x-offset is the horizontal distance between the leftmost pixel of the glyph and the left border, so the x-offset shown in the picture above is quite exaggerated. The y-offset is the vertical distance between the baseline and the lowest pixel. For most glyphs without a descender it is zero; for glyphs with a descender it is negative.
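On a display whose y axis points downwards, these upward-pointing offsets have to be flipped when a glyph is placed. The following sketch shows one way to compute the top-left screen pixel of a glyph bitmap. It assumes that the baseline is the boundary directly below the lowest pixel row of a glyph without descender; the type and function names are hypothetical.

```c
/* Hypothetical screen coordinate pair; y grows downwards. */
typedef struct {
    int x;
    int y;
} point_t;

/*
 * Top-left screen pixel of a glyph bitmap.
 * pen_x, baseline_y: current drawing position on the screen.
 * x_offset, y_offset, height: decoded glyph header values, with
 * y_offset measured upwards from the baseline.
 */
point_t glyph_top_left(int pen_x, int baseline_y,
                       int x_offset, int y_offset, int height)
{
    point_t p;
    p.x = pen_x + x_offset;
    /* The top edge lies y_offset + height above the baseline. */
    p.y = baseline_y - (y_offset + height);
    return p;
}
```

For example, a 7 pixel high glyph with y-offset 0 drawn at baseline 20 starts at screen row 13, while a glyph of the same height with y-offset -2 starts at row 15 and extends two rows below the baseline.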

Generally, the font bounding box is the outline of all glyph bounding boxes, with all offsets taken into account. Because of some large glyphs (e.g. '@' and '|'), the font bounding box can be quite large, larger than the typical character pitch or line height.

In some fonts, the font bounding box is slightly larger than the outline of all contained glyphs. This happens when glyphs have been removed. However, it makes no difference to font decoding or display.

Fonts can be generated in different bounding box modes:

[t=1]: Transparent mode. All glyph bounding boxes are tight fit. This generates minimal flash memory footprint.
[h=2]: Height mode. All glyph bounding boxes are horizontally tight fit, but have the same height. This allows overwriting text lines without clearing.
[m=3]: Monospaced mode. All glyph bounding boxes have the same size, which is equal to font bounding box size.

The bounding box mode is the second-to-last character of the font name (see font names).

Pitches and distances

Line pitch

For tabular data, the height of the font bounding box is a valid choice for the line pitch. For plain text, the ascent and descent of 'A', 'g' and '(', which are provided by the font header, can serve as the basis for a typographically correct line pitch.

Horizontal pitch and line break

Horizontal pitch after each glyph is provided by the glyph header data.

No provisions for kerning have been made. Advanced kerning could be implemented manually by counting the white space between glyphs in display memory and calculating a kerning correction from it. However, kerning is best avoided on LCD screens for better readability.

For calculating line breaks, the width and x-offset of the glyph bounding box are suitable.
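One possible reading of this is sketched below: every glyph except the last one advances the pen by its pitch, and the last glyph contributes its x-offset plus its bitmap width, so that trailing white space does not force an early line break. The struct and function names are hypothetical, and the metrics are assumed to have been decoded from the glyph headers already.

```c
#include <stddef.h>

/* Hypothetical container for already decoded glyph header values. */
typedef struct {
    int width;     /* glyph bitmap width */
    int x_offset;  /* glyph bitmap x offset */
    int pitch;     /* character pitch */
} glyph_metrics_t;

/*
 * Width in pixels that a run of glyphs occupies on a line, measured
 * from the pen start position to the rightmost pixel of the last glyph.
 */
int text_width(const glyph_metrics_t *glyphs, size_t count)
{
    int width = 0;

    if (count == 0)
        return 0;
    /* All glyphs but the last advance the pen by their pitch. */
    for (size_t i = 0; i + 1 < count; i++)
        width += glyphs[i].pitch;
    /* The last glyph only needs room for its visible pixels. */
    width += glyphs[count - 1].x_offset + glyphs[count - 1].width;
    return width;
}
```

A line break is due as soon as the width of the current line plus the next word exceeds the display width.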

Font Decoder

Glyph data (glyph header data as well as the glyph bitmap) is stored in bit arrays independent of byte boundaries to minimize the flash memory footprint (see section Glyph data). Each field can be accessed independently and statelessly, since all offsets are provided by the font header. However, the decoder functions are designed to decode all glyph data in a fixed sequence to increase efficiency.

A state variable of type u8g2_font_decode_t contains the decoder state as well as the decoded glyph header data. The functions u8g2_font_decode_get_unsigned_bits and u8g2_font_decode_get_signed_bits fetch the byte-independent data, store it in standard fixed-width integer types and update the decoder state variable.
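The following sketch illustrates how such bit fetch functions can work. It is not a copy of the U8g2 functions: the state struct is a reduced, hypothetical stand-in for u8g2_font_decode_t, bits are assumed to be taken starting at the least significant bit of each byte, and signed values are assumed to be stored with a bias of 2^(count-1). Field widths are limited to 1...8 bits.

```c
#include <stdint.h>

/* Hypothetical, reduced decoder state: read pointer and bit position. */
typedef struct {
    const uint8_t *ptr;
    uint8_t bit_pos;   /* 0...7, next unread bit in *ptr */
} bit_decoder_t;

/* Fetch 'count' bits (1...8) as an unsigned value and advance the state. */
uint8_t get_unsigned_bits(bit_decoder_t *d, uint8_t count)
{
    unsigned val = (unsigned)d->ptr[0] >> d->bit_pos;
    unsigned end = (unsigned)d->bit_pos + count;

    if (end >= 8) {
        /* The field reaches the end of the current byte. */
        d->ptr++;
        end -= 8;
        if (end > 0)   /* remaining bits come from the next byte */
            val |= (unsigned)d->ptr[0] << (count - end);
    }
    d->bit_pos = (uint8_t)end;
    return (uint8_t)(val & ((1u << count) - 1u));
}

/*
 * Fetch 'count' bits (1...8) as a signed value.
 * Assumption: the stored value is the signed value plus 2^(count-1).
 */
int8_t get_signed_bits(bit_decoder_t *d, uint8_t count)
{
    int v = get_unsigned_bits(d, count);
    return (int8_t)(v - (1 << (count - 1)));
}
```

With this in place, a glyph header could be decoded by calling the two functions once per field, in the order given in the glyph table and with the bit widths taken from the font header.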