36c3 intro
Herald: Good morning again. Thanks. First
off for today is by Hannes Mehnert. It's
titled "Leaving Legacy Behind". It's about
the reduction of carbon footprint through
micro kernels in MirageOS. Give a warm
welcome to Hannes.
Applause
Hannes Mehnert: Thank you. So let's talk a
bit about legacy, the legacy we have.
Nowadays we usually run services on a Unix-based operating system; the layering is illustrated here on the left. At the lowest layer we have the hardware: some physical CPU, some block devices, maybe a network interface card, and some memory, some non-persistent memory. On top of that, we usually run the Unix kernel, so to say.
That is marked here in brown, and it consists of a filesystem. Then it has a scheduler, it has some process management, it has network stacks, so the TCP/IP stack, and it also has some user management and hardware drivers. So it has drivers for the physical hard drive, for the network interface and so on.
That's the brown stuff. The kernel runs in privileged mode. It exposes a system call API and/or a socket API to the actual applications we want to run, which are here in orange. So the actual application is on top, which is the application binary, and it may depend on some configuration files distributed across the filesystem with some file permissions set on them. Then the application itself likely also depends on a programming language runtime, which may be a Java virtual machine if you run Java, a Python interpreter if you run Python, or a Ruby interpreter if you run Ruby, and so on.
Then additionally we usually have a system library, libc, which is basically just the runtime library of the C programming language, and it exposes a much nicer interface than the system calls. We may as well have OpenSSL or another crypto library as part of the application binary, which is also here in orange. So what's the job of the kernel? The brown stuff has a virtual memory subsystem, and it should separate the pieces of orange stuff from each other. You have multiple applications running there, and the brown stuff is responsible for ensuring that the different pieces of orange stuff don't interfere with each other, that they are not randomly writing into each other's memory and so on. Now if the orange stuff is compromised, so if some attacker from the network or from wherever else is able to find a flaw in the orange stuff, the kernel is still responsible for strict isolation between the pieces of orange stuff. So as long as the attacker only gets access to the orange stuff, it should be very well contained.
But then we look at the bridge between the brown and the orange stuff, so between kernel and user space, and there we have an API which is roughly 600 system calls, at least on my FreeBSD machine here. So the width of this API is 600 different functions, which is quite big, and it's quite easy to hide some flaws in there. As soon as you're able to find a flaw in any of those system calls, you can escalate your privileges, and then you basically run in brown mode, in kernel mode, and you have access to the raw physical hardware. And you can also read arbitrary memory from any process
running there. So now over the years it
actually evolved and we added some more
layers, namely hypervisors. So at the
lowest layer, we still have the hardware
stack, but on top of the hardware we now
have a hypervisor, whose responsibility it
is to split the physical hardware into
pieces and slice it up and run different
virtual machines. So now we have the blue stuff, which is the hypervisor. And on top
of that, we have multiple brown things and
multiple orange things as well. So now the
hypervisor is responsible for distributing
the CPUs to virtual machines. And the
memory to virtual machines and so on. It
is also responsible for selecting which
virtual machine to run on which physical
CPU. So it actually includes the scheduler
as well. And the hypervisor's responsibility is again to isolate the
different virtual machines from each
other. Initially, hypervisors were done
mostly in software. Nowadays, there are a lot of CPU features available which give you hardware support; this makes hypervisors fast, and you don't have to trust so much software anymore, but you have to trust the hardware. That's extended page tables and the VT-d and VT-x stuff. OK, so that's the legacy we have
right now. So when you ship a binary, you actually care about the tip of the iceberg. That is the code you actually write and care about, care about deeply, because it should work well and you want to run it. But at the bottom you have the whole operating system, and that is code the operating system insists that you need. You can't get the tip without the bottom of the iceberg. So you will always have process management and user management, and likely the filesystem as well, around on a Unix system. Then
in addition, back in May, I think, there was a blog entry from someone who analyzed the findings of Google Project Zero, which is a security research and red team that tries to find flaws in widely used applications. They found in a year maybe 110 different vulnerabilities, which they reported and so on. And someone analyzed what these 110 vulnerabilities were about, and it turned out that for more than two thirds of them, the root cause of the flaw was memory corruption.
And memory corruption means arbitrary reads or writes of arbitrary memory which the process is not supposed to access. So why does that happen? It happens because on a Unix system we mainly use programming languages where we have tight control over the memory management. We do it ourselves: we allocate the memory ourselves and we free it ourselves. There is a lot of boilerplate we need to write down, and that is also a lot of boilerplate which we can get wrong. So now we talked
a bit about legacy. Let's talk about the
goals of this talk. The goal is on the one side to be more secure, so to reduce the attack vectors, because C is a language from the 70s, and we have languages from the 80s or even the 90s which offer automated memory management and memory safety, languages such as Java or Rust or Python or something like that. But it turns out not many people are writing operating systems
in those languages. Another point here is
I want to reduce the attack surface. We have seen this huge stack here, and I want to minimize the orange and the brown parts. Then, as an implication of that, I also want to reduce the runtime complexity, because it is actually pretty cumbersome to figure out what is wrong. Why does your application not start? If the whole reason is that some file on your hard disk has the wrong filesystem permissions, then it's pretty hard to track down if you're not a Unix expert who has lived in the system for years or
at least months. And then the final goal,
thanks to the topic of this conference and
to some analysis I did, is to actually
reduce the carbon footprint. So if you run
a service, that service certainly does some computation, and this computation takes some CPU time in order to be evaluated. And if we condense down the complexity and the code size, we also reduce the amount of computation which needs to be done. These are the goals. So
what are MirageOS unikernels? That is
basically the project I have been involved in for six years or so. The general idea
is that each service is isolated in a
separate MirageOS unikernel. So your DNS
resolver or your web server doesn't run on
this general purpose UNIX system as a
process, but you have a separate virtual
machine for each of them. So you have one
unikernel which only does DNS resolution
and in that unikernel you don't even need
a user management. You don't even need
process management because there's only a
single process. There's a DNS resolver.
Actually, a DNS resolver also doesn't
really need a file system. So we got rid
of that. We also don't really need virtual
memory because we only have one process.
So we don't need virtual memory and we
just use a single address space. So
everything is mapped in a single address
space. We use a programming language called OCaml, which is a functional programming language that provides us with memory safety. It has automated memory management, and we use this memory management and the isolation, which the programming language guarantees us by its type system. We use that to say, okay, we can
all live in a single address space and
it'll still be safe as long as the
components are safe. And as long as we
minimize the components which are by
definition unsafe. So we need to run some
C code there as well. Now, if we have a single service, we only put in the libraries and the stuff we actually need for that service. So as I mentioned, the DNS resolver won't need user management, it doesn't need a shell. Why would I need a shell? What would I do with it? And so on. So
we have a lot of libraries, a lot of OCaml
libraries, which are picked for the individual services or which are mixed and matched for
the different services. So libraries are
developed independently of the whole
system or of the unikernel and are reused
across the different components or across
the different services. Some further
limitations, which I take as freedom and simplicity: not only do we have a single address space, we are also only focusing on a single core and have a single process. So we don't really have the concept of processes. We also don't
work in a preemptive way. So preemptive
means that if you run on a CPU as a
function or as a program, you can at any
time be interrupted because something
which is much more important than you needs to get access to the CPU. And we don't do
that. We do co-operative tasks. So we are
never interrupted. We don't even have
interrupts. So there are no interrupts.
And as I mentioned, it's executed as a
virtual machine. So what does that look like? Now we have the same picture as previously: at the bottom the hypervisor, then the host system, which is the brownish stuff, and on top of that maybe some virtual machines. Some of them run a Unix system via KVM and qemu, using some virtio; that is on the right and on the left. And in the middle we have this MirageOS unikernel, where on the host system we don't run any qemu, but a minimized so-called tender, which is this solo5-hvt monitor process.
That's something which just allocates some host system resources for the virtual machine and then interacts with the virtual machine. What solo5-hvt does in this case is to set up the memory, load the unikernel image, which is a statically linked ELF binary, and set up the virtual CPU. The CPU needs some initialization, and then booting is just jumping to an address. It's already in 64-bit mode; there's no need to boot via 16 or 32-bit
modes. Now, solo5-hvt and the MirageOS unikernel also have an interface between them, and that interface is called hypercalls. It is rather small: it only contains 14 different functions in total. The main functions are yield, a way to get the argument vector, and clocks. Actually, two clocks: one is a POSIX clock, which takes care of this whole timestamping and timezone business, and the other one is a monotonic clock, which by its name guarantees that time passes monotonically. Then there is the console interface. The console interface is only one way: we only output data, we never read from the console. And there are block devices and network interfaces, and that's all the hypercalls we have.
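To make the width of that interface concrete, here is a rough sketch of those hypercalls written as an OCaml module signature. It is purely illustrative: the names and types are approximations, not the actual solo5 API, which is a C header.

    (* illustrative sketch only: approximate names and types, not the real solo5 API *)
    module type HYPERCALLS = sig
      val yield : deadline:int64 -> unit            (* sleep until a deadline or until I/O is ready *)
      val clock_wall : unit -> int64                (* POSIX wall clock, nanoseconds *)
      val clock_monotonic : unit -> int64           (* monotonically increasing clock *)
      val console_write : bytes -> unit             (* output only; there is no console read *)
      val block_read : off:int64 -> bytes -> unit   (* read from the block device *)
      val block_write : off:int64 -> bytes -> unit  (* write to the block device *)
      val net_read : bytes -> int                   (* receive a packet, returning its length *)
      val net_write : bytes -> unit                 (* transmit a packet *)
      val exit : int -> unit                        (* terminate the unikernel *)
    end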
To look a bit further into the details of what a MirageOS unikernel looks like: here I
pictured on the left again the tender at the bottom, and then the hypercalls. In pink I have the pieces of code in the MirageOS unikernel which still contain some C code, and in green I have the pieces of code which do not include any C code, only OCaml code. So
looking at the C code, which is dangerous because in C we have to deal with memory management on our own, which means it's a bit brittle and we need to carefully review that code: there is the OCaml runtime, which is around 25,000 lines of code. Then we have a library called nolibc, which is basically a minimal C library implementing malloc and string comparison and some basic functions needed by the OCaml runtime; that's roughly 8,000 lines of code. That nolibc also provides a lot of stubs which just exit or return null, for the OCaml runtime, because we use an unmodified OCaml runtime to be able to upgrade our software more easily. We don't have any patches for the OCaml runtime.
Then we have a library called solo5-bindings, which is basically something that translates into hypercalls, or which can access the hypercalls and communicates with the host system via hypercalls. That is roughly 2,000 lines of code. Then we have a math library, for sine and cosine and tangent and so on, and that is just openlibm, which originally comes from the FreeBSD project and has roughly 20,000 lines of
code. So that's it. So I talked a bit
about solo5, about the bottom layer and I
will go a bit more into detail about the
solo5 stuff, which is really the stuff
which you run at the bottom
of a MirageOS unikernel. There's another choice:
You can also run Xen or Qubes OS at
the bottom of the MirageOS unikernel. But
I'm focusing here mainly on solo5. So
solo5 is a sandboxed execution environment for unikernels. It handles resources from the host system, but only statically: you say at startup time how much memory it will take, how many network interfaces and which ones are taken, and how many block devices and which ones are taken by the virtual machine. You don't have any dynamic resource management, so you can't add a new network interface at a later point in time. That's just not supported, and it makes the code much simpler. We don't even have dynamic allocation inside of solo5. We have a hypercall interface; as I
mentioned, it's only 14 functions. We have
bindings for different targets. So we can run on KVM, which is the hypervisor developed for Linux, but also on bhyve, which is the FreeBSD hypervisor, or VMM, which is the OpenBSD hypervisor. We also target other systems such as Genode, which is an operating system based on a microkernel, written mainly in C++, and virtio, which is a protocol usually spoken between the host system and the guest system; virtio is used in a lot of cloud deployments. qemu, for example, provides you with a virtio protocol implementation. And the last set of bindings for solo5 is seccomp. Linux seccomp is a
filter in the Linux kernel where you can restrict your process to only use a certain set of system calls, and we use seccomp so you can deploy without a virtual machine in that case, but you are restricted in which system calls you can use. Solo5 also provides you with the host system tender where applicable. In the virtio case it is not applicable; in the Genode case it is also not applicable. In KVM we
already saw solo5-hvt, which is a hardware virtualized tender. It is just a small binary: if you run qemu, that is at least hundreds of thousands of lines of code; in the solo5-hvt case, it's more like thousands of lines of code. So here we
have a comparison, from left to right, of how solo5, the host system kernel and the guest system work together. In the middle we have a virtual machine, a common Linux qemu/KVM based virtual machine for example, and on the right hand side we have the host system and a container. Containers are also a technology where you try to restrict as much access as you can from a process, so it is contained and a potential compromise is also well isolated and contained. On the left hand side we see that solo5 is basically some bits and pieces in the host system, that is the solo5-hvt, and then some bits and pieces in the unikernel, that is the solo5 bindings I mentioned earlier, and that is used to communicate between the host and the guest system. In the middle we see that the API between the host system and the virtual machine is much bigger, commonly using virtio, and virtio is really
a huge protocol which does feature negotiation and all sorts of things where you can always do something wrong, like you can do something wrong in a floppy disk driver, and that led to some exploitable vulnerability, although nowadays most operating systems don't really need a floppy disk driver anymore. And on the right hand side, you can see that the host system interface for a container is much bigger than for a virtual machine, because the host system interface for a container is exactly those system calls you saw earlier. So it's around 600 different calls. And in order to
evaluate the security, you need basically
to audit all of them. So that's just a
brief comparison between those. If we look
into more detail at what shapes solo5 can have: here on the left side we can see it running with a hardware virtualized tender, where you have Linux, FreeBSD or OpenBSD at the bottom, and the solo5 blob, which is the blue thing here in the middle, and then on top you have the unikernel. On the right hand side you can see the Linux seccomp process, with a much smaller solo5 blob, because it doesn't need to do that much anymore: all the hypercalls are basically translated to system calls. So you actually get rid of them, and you don't need to communicate between the host and the guest system, because with seccomp you run as a host system process, so you don't have this virtualization. That is the advantage of using seccomp: you can deploy it without having access to the virtualization features of the CPU. Now, to get an even
smaller shape. There's another backend I
haven't talked to you about yet. It's called Muen. It's a separation kernel developed in Ada. So now we try to get rid of this huge Unix system below it, which is this big kernel thingy here. Muen is an open source project developed in Switzerland, in Ada as I mentioned, and it uses SPARK, which is a proof system, to guarantee memory isolation between the different components. Muen now goes a step
further and it says, "Oh yeah. For you as
a guest system, you don't do static
allocations and you don't do dynamic
resource management." We as a host system,
we as a hypervisor, we don't do any
dynamic resource allocation as well. So it
only does static resource management. So
at compile time of your Muen separation
kernel you decide how many virtual
machines or how many unikernels you are
running and which resources are given to
them. You even specify which communication
channels are there. So if one of your
virtual machines needs to talk to another
one, you need to specify that at
compile time and at runtime you don't have
any dynamic resource management. So that
again makes the code much easier, much,
much less complex. And you get to much
fewer lines of code. So to conclude with
this Mirage, and also Muen and solo5, I like to cite Antoine de Saint-Exupéry: "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." I mean, obviously the most secure system is a system which doesn't exist.
Laughter
Let's look a bit further
into the decisions of MirageOS.
Why do we use this strange
programming language called OCaml and
what's it all about? And what are the case
studies? So OCaml has been around for more than 20 years. It's a multi-paradigm programming language. The goal for us, and for OCaml, is usually to have declarative code. To achieve declarative code you need to provide developers with some orthogonal abstraction facilities, such as variables and functions, which you likely know if you're a software developer, and also higher-order functions, which just means that a function is able to take a function as input. In OCaml we try to always focus on the problem and not be distracted by boilerplate. A running example again would be memory management: we don't deal with that manually, we have computers to actually deal with that. In OCaml you have
a very expressive static type system, which can spot a lot of invariants, or violations of invariants, at build time. So the program won't compile if you don't handle all the potential return values of your functions. Now, a type system, you may know it from Java, is a bit painful if you have to express at every location where you want to have a variable which type this variable is. What OCaml provides is type inference, similar to Scala and other languages, so you don't need to write all the types manually. And unlike in Java, types are erased during compilation. So types are only information about values that the compiler has at compile time; at runtime they are all erased, so they don't exist, you don't see them.
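As a small illustration of those last points (plain stock OCaml, nothing MirageOS-specific): a variant type, a function whose type is fully inferred, and a pattern match that the compiler checks for exhaustiveness at build time.

    (* plain OCaml, just to illustrate inference and exhaustiveness checking *)
    type response =
      | Reply of string
      | No_reply
      | Error_code of int

    (* the type [response -> string] is inferred; no annotations are required *)
    let describe = function
      | Reply body -> "reply: " ^ body
      | No_reply -> "no reply"
      | Error_code n -> "error " ^ string_of_int n
      (* omitting one of these cases is reported at compile time *)

    let () = print_endline (describe (Error_code 42))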
And OCaml compiles to native machine code,
which I think is important for security
and performance. Because otherwise you run
an interpreter or an abstract machine and
you have to emulate something else and
that is never as fast as native code. OCaml
has one distinct feature, which is its module system. You have all your values, your types and functions, and each of those is defined inside a so-called module. The simplest module is just the file name, but you can nest modules, so you can explicitly say: this value, this binding, now lives in a submodule. Each module you can also give a type: it has a set of types and a set of functions, and that is called its signature, which is the interface of the module. Then you have another abstraction mechanism in OCaml,
which is functors. Functors are basically compile-time functions from module to module, so they allow parameterisation. For example, you can implement a generic map data structure, whose implementation is maybe a binary tree; all you require is some comparison for the keys, and that requirement is modeled in OCaml by a module. So you have a module called Map and a functor called Make, and Make takes some module which implements this comparison function and then provides you with a map data structure for that key type.
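As a concrete example from the OCaml standard library (again plain stock OCaml, not MirageOS-specific): Map.Make is instantiated with the String module, which supplies the comparison for the keys.

    (* Map.Make is a functor: it takes a module providing [compare] for the keys *)
    module StringMap = Map.Make (String)

    let () =
      let m = StringMap.empty |> StringMap.add "foo" 1 |> StringMap.add "bar" 2 in
      Printf.printf "foo -> %d\n" (StringMap.find "foo" m)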
In MirageOS we actually use the module system quite a bit more, because we have all these resources which are different between Xen and KVM and so on. Each of the different resources, like a network interface, has a signature and target-specific implementations. So we have the TCP/IP stack, which sits much higher than the network card, but it doesn't really care whether you run on Xen or on KVM: you just program against this abstract interface, against the interface of the network device, and you don't need to write any code in your TCP/IP stack to run on Xen or to run on KVM.
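To sketch that idea, here is a simplified, hypothetical device signature and a functor written against it. The real MirageOS interfaces differ (they live in separate libraries and involve Lwt and Cstruct), so treat this only as an illustration of programming against an abstract device.

    (* hypothetical, simplified sketch; not the actual MirageOS signatures *)
    module type NETWORK = sig
      type t
      val connect : string -> t        (* open the device with the given name *)
      val read : t -> bytes            (* receive one packet *)
      val write : t -> bytes -> unit   (* transmit one packet *)
    end

    (* a tiny "stack" written once, independent of whether the device is Xen or KVM backed *)
    module Echo (N : NETWORK) = struct
      let run name =
        let dev = N.connect name in
        N.write dev (N.read dev)       (* echo the first packet back *)
    end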
MirageOS also doesn't really use the complete OCaml programming language. OCaml provides you with an object system, and we barely use that. OCaml also allows mutable state, and we barely use that either; we mostly use immutable data whenever sensible. We also have a value-passing style, so we put state and data in as inputs, where state is just some abstract state and data is just a byte vector in a protocol implementation. The output is then a new state, which may be modified, and maybe some reply, so some other byte vector or some application data. Or the output may as well be an error, because the incoming data and state may be invalid or may violate some constraints. Errors are also explicitly typed, so they are declared in the API, and the caller of a function needs to handle all these errors explicitly.
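A minimal sketch of that value-passing style, with hypothetical types that are not taken from any concrete MirageOS library: a pure step function takes the current state and an input buffer, and returns either a new state plus a reply, or an explicitly typed error that the caller must handle.

    (* hypothetical types, purely illustrative *)
    type state = { packets_seen : int }

    type error = [ `Too_short | `Bad_version ]

    let step (st : state) (data : bytes) : (state * bytes, error) result =
      if Bytes.length data < 4 then Error `Too_short
      else
        (* no mutation: return a fresh state together with the reply buffer *)
        Ok ({ packets_seen = st.packets_seen + 1 }, data)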
As I said, we are single core, but we have some promise-based, or event-based, concurrent programming. And yeah, we have the ability to express really strong invariants, like "this is a read-only buffer", in the type system. And the type system is, as I mentioned, compile time only, no runtime overhead. So it's all pretty nice and good. So let's take a
look at some of the case studies. The
first one is a unikernel called the Bitcoin Pinata. It started in 2015, when we were happy with our from-scratch developed TLS stack. TLS is Transport Layer Security, which is what you use if you browse to an HTTPS site. So we have a TLS stack in OCaml and we wanted to do some marketing for it. The Bitcoin Pinata is basically a unikernel which uses TLS and provides you with TLS endpoints, and it contains the private key for a Bitcoin wallet which is filled with, well, used to be filled with, 10 bitcoins. This means it's a security bait. So if you can compromise
the system itself, you get the private key
and you can do whatever you want with it.
And being on the Bitcoin blockchain also means it's transparent, so everyone can see whether it has been hacked or not. It has been online for three years and it was not hacked. But the bitcoins we got were only borrowed from friends of ours, and they have since been reused in other projects. It's still online, and you can see here on the right that we had some HTTP traffic, an aggregate of maybe 600,000 hits. Now I have a size
comparison of the Bitcoin Pinata: on the left you can see the unikernel, which is less than 10 megabytes in size, or in source code maybe a hundred thousand lines of code. On the right hand side you have a very similar thing, but running as a Linux service, so it runs openssl s_server, which is basically the most minimal TLS server you can get on a Linux system using OpenSSL. There we have a size of maybe 200 megabytes and maybe two million lines of code. So that's roughly a factor of 25. In other examples, we got even less code and a much bigger effect. As for performance analysis,
well, in 2015 we did some evaluation of our TLS stack, and it turned out we're in the same ballpark as other implementations. Another case study is a CalDAV server, which we developed last year with a grant from the Prototypefund, which is a German government funding program. It is interoperable with other clients. It stores data in a remote git repository, so we don't use any block device or persistent storage; we store it in a git repository, so whenever you add a calendar event, it actually does a git push. We also recently got some integration with CalDAV web, which is a user interface done in JavaScript, and we just bundle that with it. It's
online, open source, there is a demo
server and the data repository online.
Yes, some statistics, and I zoom in directly on the CPU usage. We had the luck that for half of the month we used it as a process on a FreeBSD system; that was roughly the first half, until here. Then at some point we thought, oh yeah, let's migrate it to a MirageOS unikernel and not run the FreeBSD system below it. You can see here on the x axis the time, so that was the month of June, starting with the first of June on the left and the last of June on the right. On the y axis you have the number of CPU seconds on the left, or the number of CPU ticks on the right. The CPU ticks are virtual CPU ticks, which are debug counters from the hypervisor, so from bhyve and FreeBSD in that system. And what you can see here is this massive drop, by a factor of roughly 10, and that is when we switched from a Unix virtual machine with the process to a freestanding unikernel. So we actually use much fewer resources. If we look at the bigger picture here, we also see that the memory dropped by a factor of 10 or even more; this is now a logarithmic scale here on the y axis. The network bandwidth increased quite a bit, because now we also do all the monitoring traffic via the network interface and so on. Okay, that's CalDAV.
Another case study is authoritative DNS servers; I just recently wrote a tutorial on that, which I will skip because I'm a bit short on time. Another case study is a firewall for QubesOS. QubesOS is a reasonably secure operating system which uses Xen for isolation of workspaces and applications, such as a PDF reader. So whenever you receive a PDF, you start a virtual machine which only runs once, well, which is just run to open and read your PDF. And Qubes Mirage firewall is now a tiny replacement for the Linux based firewall, written in OCaml, and instead of roughly 300 MB it only uses 32 MB of memory. Recently there is also some support for dynamic firewall rules as defined by Qubes 4.0; that is not yet merged into master, but it's under
review. Libraries in MirageOS: since we write everything from scratch in OCaml, we don't have every protocol, but we have quite a few protocols. There are also more unikernels right now, which you can see here in the slides, also online in the Fahrplan, so you
can click on the links later. Reproducible builds: so far, for security purposes, we don't ship binaries. But I plan to ship binaries, and in order to ship binaries I don't want to ship non-reproducible binaries. What are reproducible builds? Well, it means that if you have the same source code, you should get bit-identical binary output. Issues are temporary file names and timestamps and so on. In December we managed in MirageOS to get some tooling on track to actually test the reproducibility of unikernels, and we fixed some issues, and now all the tested MirageOS unikernels are reproducible, which are basically most of them from this list.
Another topic is supply chain security, which is important, I think, and this is still a work in progress; we still haven't deployed it widely. But there are some test repositories out there to provide signatures signed by the actual authors of a library, carried all the way across until the user of the library can verify them, and some decentralized authorization and delegation of that. What about deployment? Well, in
conventional orchestration systems such as Kubernetes and so on, we don't yet have a proper integration of MirageOS, but we would like to get some proper integration there. We already generate some libvirt.xml files from Mirage, so for each unikernel you get the libvirt.xml and you can run that in your libvirt based orchestration system. For Xen, we also generate .xl and .xe files, which I personally don't really know much about, but that's it. On the
other side, I developed an orchestration system called Albatross, because I was a bit worried whether, if I now have those tiny unikernels which are megabytes in size, I should trust the big Kubernetes, which is maybe a million lines of code, running on the host system with privileges. So I thought, well, let's try to come up with a minimal orchestration system which allows me some console access: I want to see the debug messages, or whenever it fails to boot I want to see the output of the console. I want to get some metrics, like the Grafana screenshot you just saw. And that's basically it. Then, since I also developed
a TLS stack, I thought, well, why not just use it for remote deployment? In TLS you have mutual authentication, you can have client certificates, and a certificate itself is more or less an authenticated key-value store, because you have those extensions in X.509 version 3 and you can put arbitrary data in there, with keys being so-called object identifiers and values being whatever else. X.509 certificates have the great advantage that during a TLS handshake they are transferred on the wire not in the base64 or PEM encoding you usually see them in, but in a binary encoding, which is much nicer in terms of the amount of bits you transfer. So it's not transferred in base64, but basically directly in raw form. And with Albatross you can basically do a TLS handshake, and in the client certificate you present, you already have the unikernel image and the name and the boot arguments, and you just deploy it directly.
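Just to illustrate the kind of information such a client certificate would carry (hypothetical field names and values, not the actual Albatross wire format or object identifiers):

    (* hypothetical sketch of deployment data carried in certificate extensions *)
    type deployment = {
      name : string;              (* name of the unikernel *)
      image : string;             (* the statically linked unikernel ELF image *)
      boot_args : string list;    (* boot arguments passed to the unikernel *)
    }

    let example =
      { name = "my-service"; image = "(bytes of the ELF binary)"; boot_args = [ "--example-arg=value" ] }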
In X.509 you also have a chain of certificate authorities which you send along, and this chain of certificate authorities also contains some extensions in order to specify which policies are active: how many virtual machines are you able to deploy on my system, how much memory do you have access to, and which bridges or which network interfaces do you have access to? So Albatross is really a minimal orchestration system running as a family of Unix processes. It's maybe 3,000 lines of OCaml code or so, plus the TLS stack and so on that it uses. But yeah, it seems to work pretty well; I at least use it for more than two dozen unikernels at any point in time. What about the
community? Well, the whole MirageOS project started around 2008 at the University of Cambridge, so it used to be a research project, and it still has a lot of ongoing student projects at the University of Cambridge. But now it's an open source, permissively licensed, mostly BSD licensed thing, where we have a community event every half a year, a retreat in Morocco, where we also use our own unikernels, like the DHCP server and the DNS resolver and so on. We just use them to test them and to see how they behave and whether they work for us. We have quite a lot of open source contributors from all over, and some of the MirageOS libraries have also been used, or are still used, in the Docker technology, Docker for Mac and Docker for Windows, which emulate the guest system or need some wrappers, and there a lot of OCaml code is used. So to finish
my talk, I would like to show another slide, which is that Rome wasn't built in a day. So to conclude where we are: we have a radical approach to operating systems development. We have security from the ground up, with much less code, and we also have much fewer attack vectors, because we use a memory-safe language. We have reduced the carbon footprint, as I mentioned at the start of the talk, because we use much less CPU time, but also much less memory, so we use fewer resources. MirageOS itself and OCaml have reasonable performance. We have seen some statistics about the TLS stack, that it was in the same ballpark as OpenSSL and PolarSSL, which is nowadays mbed TLS. And MirageOS unikernels, since they don't really need to negotiate features and wait for the SCSI bus and so on, actually boot in milliseconds, not in seconds; they do no hardware probing and so on, but know at startup time what to expect. I
would like to thank everybody who is and was involved in this whole technology stack, because I myself program quite a bit of OCaml, but I wouldn't have been able to do that on my own; it is just a bit too big. MirageOS currently spans around maybe 200 different git
repositories with the libraries, mostly
developed on GitHub and open source. I
am at the moment working in a nonprofit
company in Germany, which is called the
Center for the Cultivation of Technology
with a project called robur. So we work in
a collective way to develop full-stack
MirageOS unikernels. That's why I'm happy
to do that from Dublin. And if you're
interested, please talk to us. I have some
selected related talks; there are many more talks about MirageOS, but here is just a short list, so if you're interested in certain aspects, please help yourself and view them.
That's all from me.
Applause
Herald: Thank you very much. There's a bit
over 10 minutes of time for questions. If
you have any questions go to the
microphone. There are several microphones
around the room. Go ahead.
Question: Thank you very much for the talk
-
Herald: A point of order: thanking the
speaker can be done afterwards. Questions
are questions, so short sentences ending
with a question mark. Sorry, do go ahead.
Question: If I want to try this at home,
what do I need? Is a raspi sufficient? No,
it isn't.
Hannes: That is an excellent question. So
I usually develop it on such a thinkpad
machine, but we also support ARM64. So if you have a Raspberry Pi 3+, which I think has the virtualization bits, and a Linux kernel recent enough to support KVM on that Raspberry Pi 3+, then you can try it out there.
Herald: Next question.
Question: Well, currently most MirageOS
unikernels are used for running server
applications. And so obviously all this static preconfiguration of OCaml and maybe Ada SPARK is fine for that. But what do you think about... will it ever be possible to use the same approach, with all this static preconfiguration, for very dynamic end user desktop systems, for example, which at least currently use quite a lot of plug-and-play?
Hannes: Do you have an example? What are
you thinking about?
Question: Well, I'm not that much into
the topic of this SPARK stuff, but you said that all the communication paths have to
be defined in advance. So especially with
plug-and-play devices like all this USB
stuff, we either have to allow everything
in advance or we may have to reboot parts
of the unikernels in between to allow
rerouting stuff.
Hannes: Yes. Yes. So I mean if you want to
design a USB plug-and-play system, you can
think of it as you plug in somewhere the
USB stick and then you start the unikernel
which only has access to that USB stick.
But having a unikernel... well, I wouldn't design a unikernel which randomly does plug and play with the outer world, basically. One of the applications I've listed here at the top is a picture viewer, which is a unikernel that at the moment, I think, has static data embedded in it, but is able, on Qubes OS or on Unix with SDL, to display the images. And you can think of some way, via a network or so, to access the images, so you don't need to compile the images in, but can have a git repository or a TCP server or whatever in order to receive the images. So what I didn't mention is that
MirageOS, instead of being general purpose, with a shell, where you can do everything, is such that each service, each unikernel, is a single-service thing. So you can't do everything with it, and I think that is an advantage from a lot of points of view. I agree that if you have a highly dynamic system, you may have some trouble integrating that.
Herald: Are there any other questions?
No, it appears not. In which case,
thank you again, Hannes.
Warm applause for Hannes.
Applause
Outro music
subtitles created by c3subtitles.de
in the year 2020. Join, and help us!