Okay, good morning everyone. I'm going to talk about the Linux font rendering stack, and that is what I learned over the last four and a half years. I worked at the city administration of Munich. As an introduction to font rendering, I can say writing is the most used interface, or probably the most used interface, between humans and machines. So all of you probably use it every day on your computer. Screen text is replacing text on paper more and more, and this is still an ongoing process. The way from a string in a computer to displayed text is surprisingly complex, and writing in itself is quite profound. For example, it's influenced by history, so probably all, or at least most, writing systems are derived from some pictographic writing system. This is also true for Roman letters. And the Roman letters themselves have changed a lot since 2000 years ago. In the Roman Empire they used somewhat different letters than we use now. So the goal of this talk is to raise awareness of font rendering in computing, with a focus on the Linux desktop. Now to the history, and then I'll talk about typography. Text display in early computing. In early computing, text was just a way to display results. Before that there were only flashing lights where you could read the results and... [audience member] ??... VT100 ?? raise your hand [Max] And the output device receives a page or a character stream, and then the presentation is the job of the output device. So for example, a typewriter or a printer or a terminal. And this is a VT100 you can see on the right side. So this is a real terminal. But going on in history there's the home computer era. In the home computer era you have fixed character sets and, most of the time due to system limitations, memory limitations, you have a graphics mode and a text mode. In the graphics mode you can basically set the pixels more or less as you want, and in the text mode you can use fonts.
The screen configuration at that time was more or less defined by the state of technology. So every screen at that time had more or less the same resolution, and that's something that the hardware engineers of that time relied on, and software engineers relied on it too. Text rendering just used bitmap fonts. That means you have a fixed-size character and you define it pixel by pixel, so these are a kind of hand-crafted characters. You have a given raster and then you display these hard-defined characters. So then I need to introduce a few typography terms. First, there's the glyph or character. That's... well, you can say a symbol. So I think most people can understand what that means. Then a font: this is a set of glyphs. And then we have a collection of fonts, which is called a typeface or font family. Those are closely related fonts with the same visual appearance, and they differ only in slope or width or weight, for example. Usually with typesetting, with movable type in the printing press, you really had, for each weight for example, different physical pieces of type. And then the point size, or typographic unit. This is a measure of size, and it derives directly from the printing press with movable type. There are different definitions, and what we probably all use as point size is the DTP point (desktop publishing point), where one point is defined as 1/72 inch, which is 0.352777... millimeters (0.3527 with a repeating 7). It is interesting to note that this has nothing to do with pixels. [question] (unintelligible question) [Max] Please use the microphone if you have a question. [question] ...3257, what's the bar over the 7 mean? [Max] It means that the 7 goes on and on and on forever. So this is typography of the Latin alphabet... terminology. There's something called kerning. You can see an example of kerning on the right side. So in the printing press you have letters and they usually have a gap.
To improve the readability, and to make the text appear more uniform, the typesetter reduces the gap. With movable type they had special letter pairs for this... If you go on with this there are ligatures, which are replacement glyphs for closely standing characters. Examples of this are double f, fi or fl. So you can see also in the slides here that the double f is such a ligature. It's one connected glyph. And then there's a speciality if you reverse all of the ?? [static]. The term non-proportional font. So you have a font where every character, every glyph, has the same width. This is also called a monospace font. Sometimes you distinguish half-width and full-width letters. This is also something you would see on a typewriter, for example. Early bitmap fonts are usually monospaced fonts. Okay, let's go to the text rendering stack. Text rendering today. In all computing nowadays, typography has been introduced. So you won't find many devices on the market which don't use accurate typography. Glyphs are now represented as size-independent outlines. On the right side you can see a font editor, FontForge. This is the way that such fonts are created. So you can see here in the picture the outline of the "S". It's a size-independent outline and you can use it for every kind of display. And then you do something that is called rasterization, or sampling, or rendering, to show it on the actual display. In the past this allowed for new applications for computers. First it allowed word processing and computer-based typesetting. Before that, typesetting was a very hard job. It allowed internationalization. It allowed universal graphical interfaces, which go hand in hand with internationalization. Come to think of it, not so long ago it was not so common that you could buy devices where you can easily switch the language to any language you would like. This is a fairly recent development. So now I will give you an overview of the formats we see on Linux systems.
First I'll talk about the bitmap formats. Today we have bitmap fonts in the PC Screen Font format, PSF or PSFU. These are the fonts that are used for the Linux VT, also known as the console. Usually you can store 256 or 512 glyphs in such a font. Some of them contain a Unicode translation table. And on the other hand there are the X Window System bitmap fonts. There are three formats: the Server Natural Format (SNF), the Glyph Bitmap Distribution Format (BDF), and the Portable Compiled Format (PCF). Nowadays PCF is the only one that you can find on a Linux system, if at all, and the first two are deprecated. And also the X Window bitmap fonts themselves are not very common these days. Now to the outline font formats. There are the PostScript formats. PostScript formats have different versions, as I would call them; they are called types. Type 1 is the one that is still relevant, to a certain degree. Those are the ones you can probably find on a standard Debian installation, for example. It uses cubic Bézier curves. A font is divided into different files. So for every font you have a Printer Font ASCII (PFA) or a Printer Font Binary (PFB) file, and you have a Printer Font Metrics (PFM) or Adobe Font Metrics (AFM) file. Then you have the TrueType file format. This is more common. I guess everyone has heard of that already. It uses quadratic Bézier curves. It can contain optional code for the TrueType hinting virtual machine. What that means I will explain later. There's a third format, the OpenType format, which has two possible glyph formats: one is TrueType, and the other is the Compact Font Format, which is based on PostScript Type 2. So you can ask what is new in this format, or what's the difference? Well, it supports so-called smart fonts. That means you can have language-specific ligatures or character substitutions. For example, kerning classes, which means you have a class of characters like the A and different variations.
So these are A with different diacritical marks, and you can use one kerning class for all of those characters and don't have to reinvent the wheel for every A with a diacritical mark. Now to the font rendering techniques that are used today. This only applies to outline fonts because, well, rendering bitmap fonts is obviously quite easy. You just paint the pixels that are in the bitmap. So rasterization is all about outline fonts. The one library that is used all over the free and open source world is FreeType. It is used on Linux systems, BSD desktops, Android, and ReactOS, and also some others. For example it is contained in Ghostscript and therefore in most printers. It's also on iOS [exhale]. So the naive rasterization algorithm would be to just lay a pixel raster over the outline, over the character you want to display, and if the center of a pixel is inside the outline then you set its color to black. The problem is the so-called aliasing effect. This is what you can see in the picture below. So those are the same words, rendered at different resolutions. The left one has 10x the resolution of the right one. You can see that, for example, the 'w' is quite deformed. Or the curl of the 'g'; there's even a part missing. So you have details of the font which are lost, and you have artefacts. Especially in the early computing days you wanted to reduce those, and therefore you used a technique called hinting. This is to avoid such artefacts and improve the readability at lower resolutions. For that, the outline is adjusted to fit the pixel raster at rasterization. Some fonts can contain instructions for this, the so-called hints, which is where the name comes from. You can see in the example here, in the picture above, that it's quite a good result. This is much more readable than the word above.
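The naive rasterization described in the talk (test each pixel center against the outline) can be sketched in Python roughly as follows. This is only an illustration under simplifying assumptions, not how FreeType works: the outline is taken to be a plain polygon instead of Bézier curves, and the names `point_in_polygon` and `rasterize` are made up for this example.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd test: count how often a ray going right from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, width, height):
    """Return one string per pixel row: '#' if the pixel center is inside the outline."""
    rows = []
    for row in range(height):
        line = ""
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5  # center of this pixel
            line += "#" if point_in_polygon(cx, cy, polygon) else "."
        rows.append(line)
    return rows


# A box that covers some pixel centers, and a thin stem that covers none.
box = [(1, 1), (3, 1), (3, 2.6), (1, 2.6)]
stem = [(1.6, 0), (2.0, 0), (2.0, 4), (1.6, 4)]
```

On a 4x4 raster, `rasterize(box, 4, 4)` marks the four pixels whose centers fall inside the box, while `rasterize(stem, 4, 4)` marks nothing at all: the stem is 0.4 pixels wide and lies between two columns of pixel centers. That is the kind of lost detail the talk points out in the curl of the 'g'.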
However, with the use of hinting, characteristics of the font are also lost, which is obvious because if you change the outline then characteristics get lost. And, especially important for what-you-see-is-what-you-get word processing, the tracking of the font is changed. So tracking here means the width of single characters. The picture below is one I took from LibreOffice. If you have a close look, you can see that the gap between those 'i' characters is not always the same. But if you do word processing, you actually want to have a result which looks the same as on the printout. So I would recommend not to hint fonts in a what-you-see-is-what-you-get word processor. If you have a look beyond your own backyard: on macOS there exists no hinting, and in the Windows world hinting can't be turned off, so this is hardwired in the font rendering of the Windows system. Well, another approach to improve readability is anti-aliasing, which applies multisampling. So for every pixel you take samples at different spots and then you compute from that a gray value, which is a measure of how much of the area of the pixel is covered by the glyph. In the picture you can see again the simply rastered word, then in the middle you see the word with anti-aliasing, and below you can see a combination of hinting and anti-aliasing. Suddenly the text becomes very readable, also at low resolutions. This is what was usually used in computing in the 90s. There is another approach to improve readability, and for that I have to talk a bit about LCD displays. In the picture on the left, above you can see tube monitor pixels and on the bottom you can see LCD monitor pixels. All of the pixels consist of so-called subpixels with different colors. The mixture of the three different colors gives them a wide range of different colors.
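The gray-value computation used for anti-aliasing, as described above, can be sketched like this. It is a minimal illustration under assumptions that are not from the talk: samples sit on a regular n x n grid inside each pixel, the glyph is given as an inside-test function, and `coverage`, `stem` and `gray_row` are hypothetical names.

```python
def coverage(inside, px, py, n=4):
    """Estimate how much of pixel (px, py) is covered, as a value between 0 and 1."""
    hits = 0
    for j in range(n):
        for i in range(n):
            # Sample positions are spread evenly over the unit pixel square.
            sx = px + (i + 0.5) / n
            sy = py + (j + 0.5) / n
            if inside(sx, sy):
                hits += 1
    return hits / (n * n)


def stem(x, y):
    """A vertical stem one pixel wide that straddles two pixel columns."""
    return 0.75 <= x < 1.75


def gray_row(inside, width, py=0, n=4):
    """Coverage values for one row of pixels, left to right."""
    return [coverage(inside, px, py, n) for px in range(width)]
```

For this stem, `gray_row(stem, 3)` gives a quarter coverage for the first pixel, three quarters for the second and none for the third. A center-only test would paint just the second pixel fully black, so the gray values keep the information that the stem actually sits a quarter pixel further left.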
With tube monitors the subpixel structure is not used for this, but with LCD displays, depending on the configuration of the subpixels, it can be used to improve the resolution in one direction. For that you have to know what the subpixel configuration of the display is. So usually, one pixel, which you can see in the picture below on the right; these are the usual computer monitors. One pixel consists of a red, a green and a blue subpixel, in this order. But especially with tablets or smartphones you can rotate the screen, so you have to keep that in mind. Also there are other subpixel configurations. For example, depending on the technology there can be additional red or green or even white subpixels. So there are multiple possible configurations. We can use this configuration, as I said, to improve the resolution, and in the usual case this is horizontally. You can see on the picture on the left side: first there's the naive, simply rendered character, then the character just using anti-aliasing, and then there's the subpixel-rendered character. This one is the most readable or sharpest character. Not sharp!... but it's the most correctly rendered according to the outline font. Depending on the technology of the display, you can see a color haze around the characters. This happens when the software and the display technology don't match each other. This is what you can see on the picture on the right. Okay, let's talk about the font rendering software parts in the Linux desktop. There's the so-called server-side text rendering. So in the X server there's the core font subsystem. With that, X11 clients can request the server to display a text by sending a string and naming the font with a so-called X Logical Font Description (XLFD); the server handles the fonts via libXfont. Here's an example for the Adobe Courier font. The X server then has to render the text using the font.
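The slide with the XLFD example is not part of this transcript. As a stand-in, the sketch below splits a typical XLFD name for Adobe Courier into its 14 fields. The concrete string is an assumption (a form commonly seen in the 75 dpi bitmap font sets), and `parse_xlfd` and `nominal_pixel_height` are hypothetical helpers, not part of Xlib.

```python
XLFD_FIELDS = (
    "foundry", "family_name", "weight_name", "slant", "setwidth_name",
    "add_style_name", "pixel_size", "point_size", "resolution_x",
    "resolution_y", "spacing", "average_width", "charset_registry",
    "charset_encoding",
)

# Assumed example name; the one shown in the talk is not known.
EXAMPLE = "-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1"


def parse_xlfd(name):
    """Split a fully specified XLFD name into a dict of its 14 fields."""
    if not name.startswith("-"):
        raise ValueError("an XLFD name starts with '-'")
    parts = name[1:].split("-")
    if len(parts) != len(XLFD_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(XLFD_FIELDS), len(parts)))
    return dict(zip(XLFD_FIELDS, parts))


def nominal_pixel_height(fields):
    """Convert the point size (stored in tenths of a point) to pixels at the font's resolution."""
    points = int(fields["point_size"]) / 10
    return points * int(fields["resolution_y"]) / 72  # one DTP point is 1/72 inch


fields = parse_xlfd(EXAMPLE)
```

The name carries both a pixel size and a point size because, as mentioned earlier in the talk, points say nothing about pixels until a resolution is given: 12 points at 75 dpi work out to 12.5 pixels, and this bitmap font declares a pixel size of 12.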
If you imagine a terminal server setup with thin clients, then the thin client runs the X server and the terminal server runs the X clients, and every thin client then has to handle the font rendering. I don't know if this is the only reason, but nowadays server-side font rendering is not so common. Probably it's not used anymore. Now I have to talk first about font management. On Linux systems there is a piece of software called fontconfig, which manages the fonts installed on the system, and it configures, for example, how to substitute fonts. For example, if a document uses a font to render a text and the font is not available on the system, then in the fontconfig system there can be rules to replace the font with a similar font. This is heavily used on Linux systems. There are also rules for what font to use if the currently selected font doesn't contain a character you want to display. You can see on the picture on the right that this is also used... although I have to admit, I had to take this picture on a Debian 5.0 system in LibreOffice because nowadays it works a bit differently, but this is a good example where you can see that those characters actually come from different fonts. There's also a command line tool which is quite nice, the fc-match command line tool. So for example if you want to see what the replacement font is for, let's say, Arial, then you get the output: okay, it's Nimbus Sans. And you can also set rendering options, which means you can turn on anti-aliasing or turn off hinting. Usually these configurations are stored in /etc/fonts, and there is also a per-user configuration in ~/.config/fontconfig, and it's in the form of XML snippets. In the picture you can see an example. This is the replacement rule for Carlito and Calibri. Carlito is a replacement font for the nowadays often used Calibri font from Microsoft. So this configuration says in the one direction that Carlito is the same as Calibri, and the other way around.
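The picture with the rule is not part of this transcript. A rule of this kind might look like the XML embedded in the sketch below; it is written from fontconfig's alias syntax as an illustration and is not a copy of the file any distribution ships (the XML header lines are left out). The Python around it only parses the snippet to make the two directions explicit, and `alias_pairs` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RULES = """
<fontconfig>
  <!-- If Calibri is requested, Carlito is an acceptable stand-in. -->
  <alias binding="same">
    <family>Calibri</family>
    <accept><family>Carlito</family></accept>
  </alias>
  <!-- And the other way around. -->
  <alias binding="same">
    <family>Carlito</family>
    <accept><family>Calibri</family></accept>
  </alias>
</fontconfig>
"""


def alias_pairs(xml_text):
    """Return (requested family, accepted replacement) pairs from alias rules."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for alias in root.findall("alias"):
        requested = alias.findtext("family")  # the family named directly in the rule
        for accepted in alias.findall("accept/family"):
            pairs.append((requested, accepted.text))
    return pairs
```

Here `alias_pairs(RULES)` lists Calibri to Carlito and Carlito to Calibri. On a system with such a rule and Carlito installed, asking `fc-match Calibri` would be expected to report Carlito, in the same way as the Arial example from the talk.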
This is just an example; you can have a look at /etc/fonts and find a lot of such snippets. They define how fonts are displayed on the system. Another thing, and this is the reason why I had to take the screenshot on an older system: there's a piece of software called HarfBuzz. HarfBuzz is Persian for OpenType, so this software relies on the OpenType font format. You can see on the right side the HarfBuzz logo, which actually says HarfBuzz in Persian. Before, I talked about ligatures. In some languages ligatures are required to render text correctly, and this is an example on the left in Devanagari, which is an Indic script. So if you have the first two characters combined, then this is the rule for how to replace those characters with a third glyph. There were early implementations by Qt and Pango, and those were integrated into HarfBuzz. Well, those parts are now known as HarfBuzz Old, and the current HarfBuzz is a rewrite. Nowadays it's also used for the so-called simple scripts, meaning especially Latin script, and it is integrated into Qt, GTK, LibreOffice, Firefox, Android, and XeTeX. This is what I use for the slides here, so all the slides here are also rendered using HarfBuzz. And of course the whole font rendering stack. HarfBuzz also has fancy features like variable width or weight with only one font; that is, you can define characters with variable widths in the font and change the width steplessly using HarfBuzz. Those techniques are used in client-side rendering. One library that was often used until recently is Xft. Here the applications, meaning the X clients, render the text based on FreeType and fontconfig. Then the X server only displays the results. Well, there is some caching involved, and it requires an extension to the protocol. In the widget libraries this was also used for a long time, but now Qt, for example, has its own code for font rendering based on HarfBuzz, FreeType, and fontconfig.
There's the combination of Pango and Cairo, which is used in the GTK environment. Pango derives from Greek and Japanese: pan means all and go means language. So the idea behind it is to write in all languages. This also uses HarfBuzz, FreeType, and fontconfig. And if you look at Wayland clients, they only do client-side rendering. To sum it up, we have a variety of techniques for text display. Some have more historical than practical value nowadays. The modern font rendering stack is quite complex. But writing is one of the main interfaces to computers, so developers should be aware of the complexity. Keep that in mind. Thank you. [applause] Are there any questions? [audience] More a request than a question: I find the topic very interesting and I'd like to check it again. I know your video was not recorded, or I think it was not recorded... [Max] Well, it was, I think. [audience] Okay, so never mind, but still, if you are able to share your presentation [Max] Yes [audience] I would be very happy to look at it. [audience] [tests mic] What's the problem due to the compressed ?? from the software stack in your open ??. [Max] The problem of what? [audience] What's the problem it came from this complex software stack in your open ??. There's no problem with that, or there's some problem with these lot of layers, very lot of components to render the font. [Max] Okay, I'll try to rephrase the question [question] Please [Max] What is the reason there are so many layers in the font rendering stack today? [Max] Is that correct? [audience] And it caused some problem. Do you know any problem with current rendering system. [Max] Colored font rendering system? [audience] Current, now. [Max] Ah current font- [audience 2] The question is whether the complexity means that there are difficulties for developers. Is it too complex? Does that cause problems? [Max] Well, the complexity is hidden behind the font rendering stack.
So it's a huge collection of software; usually HarfBuzz, fontconfig, and FreeType are heavily used nowadays. But it's all encapsulated in these libraries, and it's used all over the free software world, and it makes it easy to write applications which are easily translatable. [audience] Yes, but for example, fontconfig is not so flexible. For example, when we're using mainly[?] Japanese and another display language, then fontconfig chooses a non-Japanese font like a Chinese one. I know some distributions like Ubuntu make a language-specific fontconfig file for each language to deal with this, but Debian doesn't have such a mechanism. So in my opinion fontconfig is a bit terrible, so is there any ?? meant or hack for it? [Max] Okay, the fontconfig system comes from the time when XML was popular. [audience] Mhmm [Max] As you can see. Well, the problem here is not so much the software stack but rather the Unicode system. In the Unicode system, Japanese and Chinese characters share the same code points in places, although they have different appearances. In the fontconfig system there are replacement rules for characters; it is just a list, and usually Chinese fonts are listed first. So that is the reason why with the Japanese language you get Chinese characters instead of Japanese characters. So I also think there's room to improve the software stack, yes. [audience] Thank you. [Max] Okay, done. Thank you. [applause]