Okay, good morning everyone.
I'm going to talk about the
Linux Font Rendering stack,
and that is what I learned in the last
four and a half years,
while I worked at the
city administration of Munich.
For introduction to font rendering,
I can say writing is
the most used interface,
or probably the most used interface
between humans and the machines.
So all of you probably use it everyday
on your computer.
Screen text is replacing text on paper
more and more
and this is still an ongoing process.
The way from a string in a computer to
display text is surprisingly complex,
and writing in itself is quite profound.
For example, it's influenced by history
so probably all, or at least most,
writing systems are derived from some
pictograph writing systems.
This is also true for Roman letters.
And also the Roman letters themselves,
they have changed a lot over the last 2000 years.
In the Roman Empire they used some
different letters than we use now.
So the goal of this talk is to raise
awareness of font rendering in computing,
with the focus on the Linux desktop.
Now to the history,
and then I'll talk about typography.
Text display in early computing.
In early computing it was just
a way to display results.
Before that there were
only flashing lights
where you can read the results
and...
[audience member] ??... VT100
?? raise your hand
[Max] And the output device receives
a page or a character stream,
and then the presentation is the job
of the output device.
So for example, a typewriter
or a printer or a terminal.
And this is a VT100
you can see on the right side.
So this is a real terminal.
But going on in history
there's the home computer era.
In the home computer era you have
fixed character sets
and most of the time due to system
limitations, memory limitations,
you have graphics mode and text mode.
In the graphics mode you can basically
set the pixels more or less as you want,
and in the text mode you can use fonts.
The screen configuration at that time
is more or less defined
by the state of technology.
So every screen at that time has more or
less the same resolution,
and that's something that the engineers
of that time relied on,
and also software engineers relied on.
The text rendering was just
using bitmap fonts.
That means you have a fixed size character
and you just define it pixelwise
so it's some kind of hand-crafted
characters.
You have a given raster and then
you really display
these hard-defined characters.
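As a rough illustration of such a hand-crafted character, here is a minimal sketch of an 8x8 bitmap glyph. The bit pattern for the letter "A" and the function name `render_bitmap` are made up for this example and do not come from any real font.

```python
# Hypothetical 8x8 bitmap for the letter "A". Each integer is one row of
# pixels, with the most significant bit on the left.
GLYPH_A = [
    0b00011000,
    0b00100100,
    0b01000010,
    0b01000010,
    0b01111110,
    0b01000010,
    0b01000010,
    0b00000000,
]


def render_bitmap(rows, width=8):
    """Turn row bitmasks into text lines: '#' for a set pixel, '.' otherwise."""
    lines = []
    for row in rows:
        bits = [(row >> (width - 1 - x)) & 1 for x in range(width)]
        lines.append("".join("#" if bit else "." for bit in bits))
    return lines


if __name__ == "__main__":
    print("\n".join(render_bitmap(GLYPH_A)))
```

Displaying the character is then just copying these bits to the screen, which is why a bitmap font only looks right at the one size it was drawn for.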
So then I need to introduce a few
typography terms.
First, there's a glyph or a character.
That's...well you can say a symbol.
So I think most people can understand what
that means.
Then a font, this is a set of glyphs.
And then we have a collection of fonts,
that is called typeface or font family.
Those are closely related fonts with
the same visual appearance,
and they differ only in slopes or widths
or weights for example.
Usually with typesetting, with moving
letters in the printing press,
you really had, for weight for example,
different pieces/letters/hardware.
And then the point size,
or typographic unit.
This is a measure of the size, and
derives directly from the printing press
with moving letters.
There are different definitions, and what
we probably all use as point size is the
DTP point – desktop publishing point,
where one point is defined as 1/72 inch,
which is 0.3527 (with the 7 repeating) millimeters.
It is interesting to note that this has
nothing to do with pixels.
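The arithmetic can be sketched in a few lines. The function names are chosen for this example only; the one fact taken from the talk is that a DTP point is 1/72 inch. Pixels only enter once a display resolution (dots per inch) is assumed.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # DTP point: 1 pt = 1/72 inch


def pt_to_mm(points):
    """Convert DTP points to millimeters."""
    return points * MM_PER_INCH / POINTS_PER_INCH


def pt_to_px(points, dpi):
    """Pixels needed for a point size on a display with `dpi` pixels per inch."""
    return points * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    print(pt_to_mm(1))
    print(pt_to_px(12, 96), pt_to_px(12, 300))
```

The same 12 pt text needs a different number of pixels on every display, which is why the point size alone says nothing about pixels.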
[audience] (unintelligible question)
[Max] Please use the microphone
if you have a question.
[audience] ...3527, what's the bar
over the 7 mean?
[Max] It means that the 7 goes
on and on and on forever.
So this is typography of the
Latin alphabet...terminology.
There's something called kerning.
You can see an example of kerning
on the right side.
So in the printing press you have letters
and they usually have a gap.
To improve the readability, and to
make all the text appear more uniform
the typesetter reduces the gap.
With moving letters they have special
letter pairs...
if you go on with this
there are ligatures,
which are replacement glyphs for close
standing characters.
An example of this is
a double f, fi or fl.
So you can see also in the slides here
that double f is such a ligature.
It's one connected glyph.
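A minimal sketch of ligature substitution, under a simplifying assumption: it replaces letter sequences with the ligature code points that Unicode happens to define for Latin script. Real fonts usually do this inside the font with substitution tables rather than with these code points; the table and function name below are only for illustration.

```python
# Unicode presentation forms for a few Latin ligatures, used here only to
# make the substitution visible.
LIGATURES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
}


def apply_ligatures(text):
    """Replace the longest matching letter sequence with its ligature glyph."""
    sequences = sorted(LIGATURES, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for seq in sequences:
            if text.startswith(seq, i):
                out.append(LIGATURES[seq])
                i += len(seq)
                break
        else:
            # No ligature starts here, keep the character as it is.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(apply_ligatures("office fluff"))
```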
And then there's a speciality
if you reverse all of the ?? [static].
The term non-proportional font.
So you have a font where every character,
every glyph, has the same width.
This is also called monospace font.
Sometimes you distinguish half-width
and full width letters.
This is also something you would see
on a typewriter for example.
Early bitmap fonts are usually
monospaced fonts.
Okay, let's go to
the text rendering stack.
Text rendering today.
In all computing nowadays,
typography is introduced.
So you won't find many devices on the market
which don't use accurate typography.
Glyphs are now represented as
size-independent outlines.
We can see on the right side there's
a font editor: FontForge.
This is the way that such fonts
are created.
So you can see here in the picture
the outline of the "S".
It's a size-independent outline and
you can just use that for
every kind of display.
And then you do something that is called
rasterization, or sampling, or rendering
for displaying it at the actual display.
In the past this allowed for new
applications for computers.
First it allowed word processing and
computer-based typesetting.
Before that typesetting was
a very hard job.
It allowed internationalization.
It allowed universal graphical interfaces,
which go
hand-in-hand with internationalization.
Come to think of it, not so long ago it
was not so common that you could
buy devices where you can easily switch
the language to
any language you would like.
This is just a recent development.
So now I will give you an overview of
the formats we see on Linux systems.
First I talk about the bitmap formats.
Today we have bitmap formats in the
PC Screen Font format,
PSF or PSFU.
These are the fonts that are used for the
Linux VT, also known as the console.
Usually you can store in such a font
256 or 512 glyphs.
Some of them contain
a Unicode translation table.
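A minimal sketch of reading the header of the older PSF variant (PSF1), assuming the commonly documented layout: two magic bytes, one mode byte and one byte giving the glyph size. The newer PSF2 variant has a different, larger header that this sketch does not handle, and only two of the mode bits are looked at.

```python
import struct

PSF1_MAGIC = b"\x36\x04"
PSF1_MODE512 = 0x01     # font has 512 glyphs instead of 256
PSF1_MODEHASTAB = 0x02  # a Unicode table follows the glyph data


def parse_psf1_header(data):
    """Decode the first four bytes of a PSF1 console font."""
    if data[:2] != PSF1_MAGIC:
        raise ValueError("not a PSF1 font")
    mode, charsize = struct.unpack("BB", data[2:4])
    return {
        "glyph_count": 512 if mode & PSF1_MODE512 else 256,
        "has_unicode_table": bool(mode & PSF1_MODEHASTAB),
        # PSF1 glyphs are 8 pixels wide, so bytes per glyph equals the height.
        "bytes_per_glyph": charsize,
    }


if __name__ == "__main__":
    print(parse_psf1_header(b"\x36\x04\x03\x10"))
```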
And on the other hand there are
X Window System Bitmap Fonts.
There are three formats:
There's Server Normal Format (SNF),
Glyph Bitmap Distribution Format (BDF).
And Portable Compiled Format (PCF).
Nowadays PCF is the only one that
you can find on a Linux system, if at all,
and the first two are deprecated.
And also the X Window Bitmap Fonts
themselves are not very common these days.
To the outline font formats.
There are PostScript formats.
PostScript formats have different versions,
as I would call them. They are called types.
Type 1 is the one that is still relevant,
to a certain degree.
Those are the ones you can probably
find on a standard Debian installation
for example.
It uses cubic Bézier curves.
The file format is divided into
different files.
So for every font you have
Printer Font ASCII
or a Printer Font Binary file,
and you have
a Printer Font Metric or
Adobe Font Metric file.
Then you have the TrueType file format.
This is more common. I guess everyone
heard of that already.
It uses quadratic Bézier curves.
It can contain optional code for the
TrueType hinting virtual machine.
What that means I will explain later.
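To make the difference between the two curve types concrete, here is a small sketch that evaluates a quadratic and a cubic Bézier curve and converts a quadratic into the equivalent cubic (degree elevation). The function names are chosen for this example; points are plain (x, y) tuples.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Point on a quadratic Bezier curve (one control point) at parameter t."""
    u = 1 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))


def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve (two control points) at parameter t."""
    u = 1 - t
    return tuple(u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def quadratic_to_cubic(p0, p1, p2):
    """Degree elevation: the cubic that traces exactly the same curve."""
    c1 = tuple(a + 2 / 3 * (b - a) for a, b in zip(p0, p1))
    c2 = tuple(c + 2 / 3 * (b - c) for b, c in zip(p1, p2))
    return p0, c1, c2, p2


if __name__ == "__main__":
    quad = ((0.0, 0.0), (1.0, 2.0), (2.0, 0.0))
    print(quadratic_bezier(*quad, 0.5))
    print(cubic_bezier(*quadratic_to_cubic(*quad), 0.5))
```

Every quadratic curve can be written as a cubic this way, but not the other way around, so converting cubic outlines to quadratic ones generally needs approximation.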
There's a third format,
the OpenType format,
which has two possible glyph formats:
one is TrueType,
and the other is Compact Font Format,
which is based on PostScript Type 2.
So you can say what is new in this format,
or what's the difference?
Well it supports so-called Smartfonts.
That means you can have language-specific
ligatures or character substitutions.
For example, kerning classes, which means
you have a class of characters like
the A and different variations.
So these are A with different diacritic
symbols,
and you can just use one kerning class for
all of those characters and don't have to
reinvent the wheel for every A with
diacritic symbols anew.
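The idea of kerning classes can be sketched with two small tables. The class names, the glyph assignments and the kerning value are hypothetical and not taken from any real font.

```python
# Hypothetical class assignment: every variant of "A" shares one class.
GLYPH_CLASS = {
    "A": "A_like", "Á": "A_like", "Ä": "A_like", "Â": "A_like",
    "V": "V_like", "W": "V_like",
}

# Hypothetical kerning values in font units, stored once per class pair.
CLASS_KERNING = {
    ("A_like", "V_like"): -80,
    ("V_like", "A_like"): -80,
}


def kerning(left, right):
    """Look up the kerning adjustment for a glyph pair via their classes."""
    pair = (GLYPH_CLASS.get(left), GLYPH_CLASS.get(right))
    return CLASS_KERNING.get(pair, 0)


if __name__ == "__main__":
    for left, right in [("A", "V"), ("Ä", "W"), ("A", "A")]:
        print(left, right, kerning(left, right))
```

Two table entries cover all eight combinations of the four A variants with V and W, instead of one entry per glyph pair.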
Now to the font rendering techniques that
are used today.
This only applies to the outline fonts
because well rendering bitmap fonts is
obviously quite easy. You just paint the
pixels that are in the bitmap.
So rasterization is all about using
outline fonts.
The one library that is used all
over the free and open source world
is FreeType.
It is used on Linux systems, BSD desktops,
Android, and ReactOS, and some others.
For example it is contained in Ghostscript
and therefore in most of the printers.
It's also on iOS [exhale].
So the naive rasterization algorithm would
be just lay a pixel raster over the
outline font, over the character you want
to display, and if the center of the pixel
is inside the outline then you
set the color to black.
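A minimal sketch of this naive algorithm, assuming the outline has already been flattened into a polygon of straight segments (real outlines are curves) and using the even-odd rule to decide whether a pixel center is inside. The shapes and function names are invented for the example.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd test: count how many edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, width, height):
    """Set a pixel ('#') when its center lies inside the outline."""
    rows = []
    for py in range(height):
        row = ""
        for px in range(width):
            row += "#" if point_in_polygon(px + 0.5, py + 0.5, polygon) else "."
        rows.append(row)
    return rows


# Hypothetical outlines in pixel coordinates.
SQUARE = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
THIN_STEM = [(1.6, 0.0), (1.9, 0.0), (1.9, 4.0), (1.6, 4.0)]

if __name__ == "__main__":
    print("\n".join(rasterize(SQUARE, 4, 4)))
    print("\n".join(rasterize(THIN_STEM, 4, 4)))
```

The thin stem falls between two columns of pixel centers, so it vanishes completely, which is the kind of lost detail that shows up at low resolutions.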
The problem is the so called
aliasing effect.
This is what you can see
in the picture below.
So those are the same words,
rendered at different resolutions.
The left one has 10x the resolution
of the right one.
You can see that, for example,
the 'w' is quite deformed.
Or the curl of the 'g';
there's even a part missing.
So you have details of the font
which are lost,
and you have artefacts.
Somehow, especially in the early computing
days you wanted to reduce those,
and therefore you used a technique
called hinting.
This is to avoid such artefacts and
improve the readability
at lower resolutions.
Therefore the outline is adjusted to fit
the pixel raster at the rasterization.
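The adjustment can be sketched for the simplest case, a vertical stem. This is only an illustration of the grid-fitting idea, not the algorithm of any particular hinting engine; the function name and the one-pixel minimum width are assumptions made for the example.

```python
def grid_fit_stem(left, right):
    """Snap a vertical stem to whole pixels, keeping it at least one pixel wide.

    `left` and `right` are the x coordinates of the stem edges in pixel units.
    """
    snapped_left = round(left)
    width = max(1, round(right - left))
    return snapped_left, snapped_left + width


if __name__ == "__main__":
    # A stem 0.3 pixels wide would fall between pixel centers and disappear.
    print(grid_fit_stem(1.6, 1.9))
    print(grid_fit_stem(3.2, 4.4))
```

After snapping, the stem always covers a full pixel column, but its position and width no longer match the original outline exactly.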
Some of the fonts can contain
instructions, the so-called hints
where the name comes from:
"hinting".
You can see in the example here,
in the picture above,
that it's quite a good result.
This is much more readable
than the word above.
However, with the use of hinting, there
are also characteristics of the font lost
which is obvious because
if you change the outline
then obviously characteristics get lost.
And especially important for
word processing,
what you see is what you get
word processing,
is that the tracking
of the font is changed.
So tracking means the width
of single characters.
The picture below, this is a picture I
took from LibreOffice.
If you have a close look, you can see
that the gap between those 'i' characters
is not always the same.
But if you do word processing, you actually
want to have a result which looks the same
as on the printout.
So I would recommend not
to hint fonts in
a what you see is what you get
word processor.
If you have a look beyond the backyard,
on macOS there exists no hinting,
and in the Windows world, hinting can't
be turned off,
so this is hardwired in the
font rendering of the Windows system.
Well another approach to improve
readability is anti-aliasing
which applies multisampling.
So for every pixel you take samples at
different spots
and then you compute from
that a gray value,
which is a measure of how much of the
area of the pixel is covered by the glyph.
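A minimal sketch of this multisampling, using a filled disk as a stand-in for a glyph outline. The disk, the 4x4 sample grid and the function names are assumptions for illustration.

```python
def inside_disk(x, y, cx=4.0, cy=4.0, r=3.0):
    """Stand-in for a glyph outline: a filled disk."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


def coverage(px, py, inside, samples=4):
    """Estimate the covered fraction of pixel (px, py) from a grid of samples."""
    hits = 0
    for sy in range(samples):
        for sx in range(samples):
            x = px + (sx + 0.5) / samples
            y = py + (sy + 0.5) / samples
            if inside(x, y):
                hits += 1
    return hits / (samples * samples)


def render_gray(width, height, inside, samples=4):
    """Coverage per pixel: 0.0 is background, 1.0 is fully covered."""
    return [[coverage(px, py, inside, samples) for px in range(width)]
            for py in range(height)]


if __name__ == "__main__":
    for row in render_gray(8, 8, inside_disk):
        print(" ".join(f"{value:.2f}" for value in row))
```

Pixels on the edge of the shape get intermediate values, which is what smooths the staircase effect.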
On the picture you can see again
the simple raster word,
and then in the middle you see the word
with anti-aliasing,
and below you can see a combination of
hinting and antialiasing.
Suddenly the text becomes very readable
also on low resolutions.
This is what was usually used in
computing in the 90s.
So there is another approach to improve
and therefore I have to
talk a bit about LCD displays.
So in the picture on the left, above you
can see tube monitor pixels and
on the bottom you can see
LCD monitor pixels.
All of the pixels consist of so-called
subpixels with different colors.
The mixture of the three different colors
gives them a wide range
of different colors.
With the tube monitors this is not used,
but with LCD displays,
depending on the
configuration of the subpixels,
it can be used to improve the
resolution in one direction.
Therefore you have to know how the
configuration of the display is.
So usually, one pixel, which you can see
in the picture below right
(these are the usual computer monitors),
consists of a red, green and a
blue subpixel in this order.
But you have to keep in mind, especially
with tablets, or smartphones,
you can rotate the screen, so you have to
keep that in mind.
Also there are other subpixel
configurations.
For example, depending on the technology
there can be additional red or green
or even white subpixels.
So there are also multiple possibilities
of the configuration.
We can use this configuration as I said
to improve the resolution
and in the usual case
this is horizontally.
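A minimal sketch of the idea for one row of pixels, assuming an RGB subpixel order and black text on a white background. The coverage values are sampled at three times the horizontal resolution and each group of three becomes one pixel. The function name is invented, and real renderers additionally filter neighboring subpixels to reduce color fringes, which this sketch leaves out.

```python
def subpixel_row(coverage_3x):
    """Map coverage sampled at 3x horizontal resolution to RGB pixels.

    Coverage 1.0 means the subpixel is fully covered by black text, so its
    intensity is 0. Coverage 0.0 leaves the subpixel at full intensity.
    """
    if len(coverage_3x) % 3 != 0:
        raise ValueError("need three coverage values per pixel")
    pixels = []
    for i in range(0, len(coverage_3x), 3):
        r, g, b = (round(255 * (1 - c)) for c in coverage_3x[i:i + 3])
        pixels.append((r, g, b))
    return pixels


if __name__ == "__main__":
    # An edge that starts at the blue subpixel of the first pixel.
    print(subpixel_row([0, 0, 1, 1, 1, 0]))
```

The colored values at the edges are the source of the color haze when the assumed subpixel order does not match the actual display.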
You can see on the picture
on the left side,
first there's the naive,
simple rendered character,
then the character
just using anti-aliasing,
and then there's the
subpixel rendered character.
This one is the most readable
or most sharp character.
Not sharp!... but it's most correctly
rendered according to the outline font.
Depending on the technique of the display,
you can see a color haze
around the characters.
So this happens when the software
and the display technology
don't match each other.
So this is what you can see
on the picture on the right.
Okay, let's talk about the font rendering
software parts in the Linux desktop.
There's the so-called server-side
text rendering.
So in the X server there's the
Core Font subsystem.
With that, X11 clients can request the
server to display a text
by sending a string via libXfont
and using the so-called
X Logical Font Description (XLFD).
Here's an example for the Adobe Courier font.
Then the X server has to
render the text using the font.
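An XLFD name is a string of 14 fields separated by hyphens. The following sketch splits such a name into its fields. The example string is a typical Courier name chosen for illustration, not necessarily the one shown on the slide, and the dictionary keys are shortened names picked for this example.

```python
XLFD_FIELDS = [
    "foundry", "family", "weight", "slant", "setwidth", "add_style",
    "pixel_size", "point_size", "resolution_x", "resolution_y",
    "spacing", "average_width", "charset_registry", "charset_encoding",
]


def parse_xlfd(name):
    """Split a fully specified XLFD font name into its 14 fields."""
    parts = name.split("-")
    if parts[0] != "" or len(parts) != len(XLFD_FIELDS) + 1:
        raise ValueError("not a fully specified XLFD name")
    return dict(zip(XLFD_FIELDS, parts[1:]))


if __name__ == "__main__":
    example = "-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1"
    for key, value in parse_xlfd(example).items():
        print(f"{key}: {value!r}")
```

The point size field is given in tenths of a point, so 120 means 12 pt, and the two resolution fields tie the name to a specific screen resolution.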
If you imagine a terminal server setup
with thin clients,
then the thin client runs the X server
and the terminal server runs the X client
and every thin client then has to
handle the font rendering.
So I don't know if
this is the only reason,
but nowadays server-side font
rendering is not so common.
Probably not used anymore.
Now I have to talk first
about font management.
In Linux systems there is
a software called fontconfig,
which manages installed fonts
on the system
and it configures for example how to
substitute fonts.
For example if in a document there's a
font to render a text
and the font is not
available on the system
then in the fontconfig system there can
be rules to replace the font
with a similar font.
This is heavily used in Linux systems.
There are also rules for what font to
use if the current selected font
doesn't contain a character
you want to display.
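The fallback for missing characters can be sketched as a walk through an ordered list of fonts. This is a simplified model, not fontconfig's actual matching logic; the font names and their character coverage are invented for the example.

```python
# Hypothetical fonts, each reduced to the set of characters it can display.
FONTS = {
    "Example Sans": set("ABCabc"),
    "Example Greek": set("αβγ"),
    "Example CJK": set("日本語"),
}
FALLBACK_ORDER = ["Example Sans", "Example Greek", "Example CJK"]


def font_for_char(ch, preferred):
    """Return the first font that contains `ch`, trying `preferred` first."""
    candidates = [preferred] + [f for f in FALLBACK_ORDER if f != preferred]
    for name in candidates:
        if ch in FONTS.get(name, ()):
            return name
    return None


def split_into_runs(text, preferred):
    """Group consecutive characters that end up in the same font."""
    runs = []
    for ch in text:
        font = font_for_char(ch, preferred)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


if __name__ == "__main__":
    print(split_into_runs("abα日", "Example Sans"))
```

One line of text can end up drawn from several fonts, and the order of the fallback list decides which font wins when more than one contains a character.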
You can see on the picture on the right
that this is also used...
although I have to admit,
I had to take this picture on a Debian 5.0
system in LibreOffice
because nowadays
it works a bit different,
but this is a good example where
you can see that those characters
come from different fonts actually.
There's also a command line tool
which is quite nice.
fc-match command line tool.
So for example if you want to see what is
the replacement font for, let's say, Arial
then you get the output:
okay, it's Nimbus Sans.
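The same query can be made from a script. This sketch simply calls the `fc-match` tool with a font name, as in the example above, and returns its output line. The function name is made up, and the result depends entirely on the fonts and rules installed on the machine.

```python
import shutil
import subprocess


def best_match(pattern):
    """Ask fontconfig which installed font it would use for `pattern`.

    Returns None when fc-match is not installed or reports an error.
    """
    if shutil.which("fc-match") is None:
        return None
    result = subprocess.run(
        ["fc-match", pattern],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(best_match("Arial"))
```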
And you can also set rendering options,
which means you can set anti-aliasing
or you can turn off hinting.
Usually these configurations are stored in
/etc/fonts
and there is also a per-user configuration
in .config/fontconfig
and they are written as XML snippets.
In the picture you can see an example.
This is the replacement rule for
Carlito and Calibri.
Carlito is a replacement font for the
nowadays often used
Calibri font from Microsoft.
So this configuration says the one way,
Carlito is the same as Calibri,
and the other way around.
This is just an example,
you can have a look at /etc/fonts
and find a lot of such snippets.
It defines how fonts are
displayed on the system.
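To show what such a snippet contains, here is a sketch that reads alias rules with Python's standard XML parser. The embedded snippet is written in the style of fontconfig's alias rules for illustration and is not a copy of the one on the slide; the function name is made up.

```python
import xml.etree.ElementTree as ET

# Illustrative rules in the style of a fontconfig alias file.
SNIPPET = """<fontconfig>
  <alias binding="same">
    <family>Calibri</family>
    <accept><family>Carlito</family></accept>
  </alias>
  <alias binding="same">
    <family>Carlito</family>
    <default><family>Calibri</family></default>
  </alias>
</fontconfig>"""


def read_aliases(xml_text):
    """Return (requested family, rule kind, substitute family) triples."""
    root = ET.fromstring(xml_text)
    rules = []
    for alias in root.iter("alias"):
        source = alias.findtext("family")
        for kind in ("prefer", "accept", "default"):
            for family in alias.findall(f"{kind}/family"):
                rules.append((source, kind, family.text))
    return rules


if __name__ == "__main__":
    for rule in read_aliases(SNIPPET):
        print(rule)
```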
Another thing, and this is the reason
why I had to take the screenshot
on an older system, there's a piece of
software called HarfBuzz.
HarfBuzz is Persian for OpenType,
so this software relies on the OpenType
font format.
You can see on the right side
the HarfBuzz logo
that actually says
HarfBuzz in Persian.
Earlier I talked about ligatures.
In some languages ligatures are required
to render text correctly,
and this is an example on the left of
Devanagari, which is an Indic script.
So if you have the first two characters
combined then this is the rule
of how to replace those characters
with a third glyph.
There were early implementations
by Qt and Pango
and those were integrated in HarfBuzz.
Well those parts are now known as
HarfBuzz Old,
and the current HarfBuzz is a rewrite.
Nowadays it's also used for
the so-called simple scripts,
meaning especially Latin script,
and is integrated into Qt, GTK,
LibreOffice, Firefox, Android, and XeTeX.
This is what I use for the slides here,
so all the slides here are
also rendered using HarfBuzz.
And of course the whole
font rendering stack.
HarfBuzz also has fancy features like
variable widths or weight
with only one font or you can define
in the font characters with
variable widths and without stepping you
can change it using HarfBuzz.
Those techniques are used in
client-side rendering.
One library that was often used
until recently is Xft.
Here the applications, meaning the
X clients, render the text
based on FreeType and fontconfig.
Then the X server
only displays the results.
Well, there is some caching involved,
and it requires
an extension to the protocol.
In the widget libraries this was also
used a long time,
but now Qt, for example, has its own
code for font rendering
based on HarfBuzz, FreeType,
and fontconfig.
There's the combination of
Pango and Cairo,
which is used in the GTK environment.
Pango derives from Greek and Japanese:
pan means all and go means languages.
So this is the background to write
in all languages.
This also uses HarfBuzz, FreeType,
and fontconfig.
And if you look at Wayland clients,
they only do client-side rendering.
To sum it up, we have a variety of
techniques for text display.
Some have more historical than
practical value nowadays.
The modern font rendering stack
is quite complex.
But writing is one of the
main interfaces to computers,
so developers should be aware
of the complexity.
Keep that in mind.
Thank you.
[applause]
Are there any questions?
[audience] More a request than a question,
I find the topic very interesting
and I'd like to check it again. I know
your video was not recorded
or I think it was not recorded...
[Max] Well it was I think.
[audience] Okay so nevermind, but still
if you are able to share your presentation
[Max] Yes
[audience] I would be very
happy to look at it. Thank you.
[audience] [tests mic] What's
the problem due to the
complexity from the software stack
in your open ??.
[Max] The problem of what?
[audience] What's the problem it came from
this complex software stack
in your open ??.
There's no problem with that,
or there's some problem
with these lot of layers, very lot of
components to render the font.
[Max] Okay, I try to rephrase the question
[audience] Please.
[Max] What is the reason there are so many
layers in the font rendering stack today.
[Max] Is that correct?
[audience] And it caused some problem. Do
you know any problem
with current font rendering system.
[Max] Colored font rendering system?
[audience] Current, now.
[Max] Ah current font-
[audience 2] The question is
whether the complexity
means that there are
difficulties for developers.
Is it too complex?
Does that cause problems?
[Max] Well the complexity is hidden behind
the font rendering stack.
So it's a huge collection of software,
usually there's HarfBuzz, fontconfig,
and FreeType heavily used nowadays.
But it's all encapsulated
in these libraries
and it's used all over
the free software world
and it makes it easy to write applications
which are easily translatable.
[audience] Yes but for example,
the fontconfig is not so flexible.
For example, when we're using
mainly Japanese
and other display of the language,
then fontconfig chooses the
non-Japanese font like the Chinese one.
I know some distributions like Ubuntu make
language specific fontconfig file
for each language to deal with but
Debian doesn't have such a mechanism.
So in my opinion the fontconfig
is a bit terrible so
is there any ?? meant or any hack for it?
[Max] Okay, the fontconfig system comes
from the time where XML was popular.
[audience] Mhmm
[Max] As you can see.
Well the problem here is
not so much the software stack
more than the Unicode system.
So in the Unicode system there are
same code points for
Japanese and Chinese characters somewhere
although they have different appearances.
In the fontconfig system there are
replacement rules for characters
and it is just a list and usually Chinese
fonts are listed first.
So that is the reason why in the Japanese
language you get Chinese characters
instead of Japanese characters.
So I also think there's room to improve
the software stack, yes.
[audience] Thank you.
[Max] Okay, done. Thank you.
[applause]