34C3 - Bringing Linux back to server boot ROMs with NERF and Heads

  • 0:15 - 0:20
    Herald Angel: So, most of the cloud
    services rely on closed-source, proprietary
  • 0:20 - 0:24
    server firmware, with known security
    implications for the tenants.
  • 0:24 - 0:30
    So that's where LinuxBoot comes
    to the rescue, because it wants to replace
  • 0:30 - 0:36
    this closed-source firmware with an open,
    Linux-based version. And our next speaker,
  • 0:36 - 0:41
    Trammell Hudson, is an integral part of
    that project, and he's here to provide you
  • 0:41 - 0:46
    an overview of the LinuxBoot project.
    Thank you very much and please give a warm
  • 0:46 - 0:48
    round of applause to Trammell
    Hudson, please!
  • 0:48 - 0:56
    applause
    Trammell: Thank you!
  • 0:56 - 1:02
    Securing the boot process is really
    fundamental to having secure systems
  • 1:02 - 1:10
    because vulnerabilities in firmware
    can affect any security that the operating
  • 1:10 - 1:15
    system tries to provide. And for that
    reason I think it's really important that
  • 1:15 - 1:21
    we replace the proprietary vendor
    firmwares with open source, like Linux.
  • 1:21 - 1:26
    And this is not a new idea. My
    collaborator Ron Minnich started a project
  • 1:26 - 1:32
    called LinuxBIOS back in the 90s when he
    was at Los Alamos National Labs. They
  • 1:32 - 1:36
    built the world's third fastest
    supercomputer out of a Linux cluster that
  • 1:36 - 1:45
    used LinuxBIOS in the ROM to make it more
    reliable. LinuxBIOS turned into coreboot
  • 1:45 - 1:53
    in 2005; the Linux part was removed, and it
    became a generic bootloader, and it now
  • 1:53 - 1:59
    powers the Chromebooks, as well as projects
    like Heads, the slightly more secure laptop
  • 1:59 - 2:05
    firmware that I presented last year at
    CCC. Unfortunately it doesn't support any
  • 2:05 - 2:13
    server mainboards anymore. Most servers
    are running a variant of Intel's UEFI
  • 2:13 - 2:19
    firmware, which is a project that Intel
    started to replace the somewhat aging
  • 2:19 - 2:25
    16-bit real mode BIOS of the 80s and 90s.
    And, like a lot of second systems, it's
  • 2:25 - 2:30
    pretty complicated. If you've been to any
    talks on firmware security you've probably
  • 2:30 - 2:37
    seen this slide before. It goes through
    multiple phases as the system boots, the
  • 2:37 - 2:45
    first phase does a cryptographic
    verification of the pre-EFI phase. This,
  • 2:45 - 2:50
    the PEI phase, is responsible for bringing up
    the memory controller, the CPU interconnect,
  • 2:50 - 2:55
    and a few other critical devices. It also
    enables paging and long mode and then
  • 2:55 - 3:03
    jumps into the driver execution
    environment, or DXE, phase. This is where
  • 3:03 - 3:08
    UEFI option ROMs are executed, as well
    as where all of the remaining devices are
  • 3:08 - 3:15
    initialized. Once the PCI bus and USB
    buses have been walked and enumerated, it
  • 3:15 - 3:20
    transfers to the boot device selection
    phase, which figures out which disk or USB
  • 3:20 - 3:27
    stick or network to boot from. That loads
    a boot loader from that device which
  • 3:27 - 3:32
    eventually loads the real operating system
    that ends up
  • 3:32 - 3:38
    running on the machine. What we're
    proposing is that we replace all of this
  • 3:38 - 3:46
    with the LinuxBoot kernel and runtime. We
    can do all of the device enumeration in
  • 3:46 - 3:50
    Linux, it already has support for doing
    this, and then we can use more
  • 3:50 - 3:55
    sophisticated protocols and tools to
    locate the real kernel that we want to
  • 3:55 - 4:04
    run, and use the kexec system call to
    start that new kernel (sketched just
    below). And the
  • 4:04 - 4:09
    reason we want to use Linux here is
    because it gives us the ability to have a
  • 4:09 - 4:14
    more secure system. It gives us a lot more
    flexibility and hopefully it lets us
  • 4:14 - 4:21
    create a more resilient system out of it.
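    (Illustrative sketch of that locate-and-kexec handoff,
    assuming the kexec-tools utilities are present in the
    LinuxBoot initramfs; the paths and command line here are
    hypothetical.)

      KERNEL=/boot/vmlinuz          # hypothetical path to the real kernel
      INITRD=/boot/initrd.img       # hypothetical initrd
      CMDLINE="root=/dev/sda1 ro"   # hypothetical kernel command line

      # Stage the target kernel in memory...
      kexec --load "$KERNEL" --initrd="$INITRD" --command-line="$CMDLINE"
      # ...then jump straight into it, skipping the firmware boot path.
      kexec --exec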
    On the security front, one of the big areas
  • 4:21 - 4:28
    where we get some benefit is that we reduce
    the attack surface. In the DXE phase, these
  • 4:28 - 4:35
    drivers are an enormous amount of code: on
    the Intel S2600 there are over 400 modules
  • 4:35 - 4:41
    that get loaded. They do things like the
    option ROMs that I mentioned, and if you
  • 4:41 - 4:44
    want an example of how dangerous option
    ROMs can be, you can look at my
  • 4:44 - 4:51
    Thunderstrike talks from a few years ago.
    They also do things like display the boot
  • 4:51 - 4:55
    splash, the vendor logo, and this has been
    a place where quite a few buffer overflows
  • 4:55 - 5:00
    have been found in vendor firmwares in the
    past. They have a complete network stack
  • 5:00 - 5:10
    IPv4 and v6 as well as HTTP and HTTPS.
    They have legacy device drivers for things
  • 5:10 - 5:14
    like floppy drives, and again, these sorts
    of dusty corners are where vulnerabilities
  • 5:14 - 5:21
    in Xen have been found that allowed a
    hypervisor break. There are also modules
  • 5:21 - 5:26
    like the Microsoft OEM activation that we
    just don't know what they do, or things
  • 5:26 - 5:36
    like a Y2K rollover module that probably
    hasn't been tested in two decades. So the
  • 5:36 - 5:41
    final OS bootloader phase is actually not
    part of UEFI, but it's typically, in the
  • 5:41 - 5:46
    Linux system, GRUB, the Grand Unified
    Bootloader. And y'all -- many of you are
  • 5:46 - 5:51
    probably familiar with its interface, but
    did you know that it has its own file
  • 5:51 - 5:59
    system, video, and network drivers? Almost
    250 thousand lines of code make up
  • 5:59 - 6:04
    GRUB. I don't bring up the size of this to
    complain about the space it takes, but
  • 6:04 - 6:10
    because of how much it increases our
    attack surface. You might think that
  • 6:10 - 6:14
    having three different operating systems
    involved in this boot process gives us a
  • 6:14 - 6:19
    defense in depth, but I would argue that
    we are subject to the weakest link in this
  • 6:19 - 6:24
    chain because if you can compromise UEFI,
    you can compromise GRUB, and if you can
  • 6:24 - 6:28
    compromise GRUB, you can compromise the
    Linux kernel that you want to run on the
  • 6:28 - 6:35
    machine. So there are lots of ways these
    attacks could be launched. As I mentioned
  • 6:35 - 6:40
    UEFI has a network device driver, GRUB has
    a network device driver, and of course
  • 6:40 - 6:44
    Linux has a network device driver. This
    means that a remote attacker could
  • 6:44 - 6:48
    potentially get code execution during the
    boot process.
  • 6:48 - 6:56
    UEFI has a USB driver, GRUB has a
    USB driver, and of course Linux has a USB
  • 6:56 - 7:03
    driver. There have been bugs found in USB
    stacks -- which unfortunately are very
  • 7:03 - 7:08
    complex -- and a buffer overflow in a USB
    descriptor handler could allow a local
  • 7:08 - 7:13
    attacker to plug in a rogue device and
    take control of the firmware during the
  • 7:13 - 7:19
    boot. Of course UEFI has a FAT driver,
    GRUB has a FAT driver, Linux has a FAT
  • 7:19 - 7:26
    driver. This gives an attacker a place to
    gain persistence and perhaps leverage code
  • 7:26 - 7:35
    execution during the initial file system
    or partition walk. So what we argue is
  • 7:35 - 7:40
    that we should have the operating system
    that has the most contributors, and the
  • 7:40 - 7:47
    most code review, and the most frequent
    update schedule, for these roles. Linux
  • 7:47 - 7:53
    has a lot more eyes on it, and it has a
    much more rapid update schedule than
  • 7:53 - 8:01
    pretty much any vendor firmware. You might
    ask, why do we keep the PEI and the SEC
  • 8:01 - 8:08
    phase from the UEFI firmware? Couldn't we
    use coreboot in its place? The
  • 8:08 - 8:13
    problem is that vendors are not
    documenting the memory controller or the
  • 8:13 - 8:18
    CPU interconnect. Instead they're
    providing an opaque binary blob called the
  • 8:18 - 8:25
    firmware support package, or FSP, that
    does the memory controller and the CPU
  • 8:25 - 8:33
    initialization. On most modern coreboot
    systems,
  • 8:33 - 8:38
    coreboot actually calls into the FSP to do
    this initialization. And on a lot of the
  • 8:38 - 8:43
    devices the FSP has grown in scope, so it
    now includes video device drivers and
  • 8:43 - 8:49
    power management, and it's actually larger
    than the PEI phase on some of the servers
  • 8:49 - 8:58
    that we're dealing with. The other wrinkle
    is that most modern CPUs don't come out of
  • 8:58 - 9:02
    reset into the legacy reset vector
    anymore. Instead, they execute an
  • 9:02 - 9:07
    authenticated code module, called Boot
    Guard, that's signed by Intel, and the CPU
  • 9:07 - 9:15
    will not start up if that's not present.
    The good news is that this Boot Guard ACM
  • 9:15 - 9:23
    measures the PEI phase into the TPM, which
    allows us to detect attempts to modify it
  • 9:23 - 9:28
    from malicious attacks. The bad news is
    that we are not able to change it on many
  • 9:28 - 9:34
    of these systems. But even with that in
    place, we still have a much, much more
  • 9:34 - 9:40
    flexible system. If you've ever worked
    with the UEFI shell or with GRUB's menu
  • 9:40 - 9:47
    config, it's really not as flexible, and
    the tooling is not anywhere near as
  • 9:47 - 9:51
    mature, as being able to write things with
    shell scripts, or with Go, or with real
  • 9:51 - 9:58
    languages. Additionally, we can configure
    the LinuxBoot kernel with standard
  • 9:58 - 10:04
    Linux config tools. UEFI supports booting
    from FAT file systems, but with LinuxBoot
  • 10:04 - 10:09
    we can boot from any of the hundreds of
    file systems that Linux supports. We can
  • 10:09 - 10:16
    boot from encrypted filesystems, since we
    have LUKS and cryptsetup (see sketch below). Most UEFI
  • 10:16 - 10:20
    firmwares can only boot from the network
    device that is installed on the server
  • 10:20 - 10:25
    motherboard. We can boot from any network
    device that Linux supports, and we can use
  • 10:25 - 10:31
    proper protocols; we're not limited to PXE
    and TFTP. We can use SSL, we can do
  • 10:31 - 10:38
    cryptographic measurements of the kernels
    that we receive. And the runtime that
  • 10:38 - 10:43
    makes up LinuxBoot is also very flexible.
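    (Illustrative sketch of the encrypted-disk boot mentioned
    above, assuming cryptsetup and kexec-tools; device names and
    paths are hypothetical, not any project's actual script.)

      # Unlock the encrypted root partition; plain UEFI cannot do this.
      cryptsetup open /dev/sda2 cryptroot    # prompts for the passphrase
      mount -o ro /dev/mapper/cryptroot /mnt

      # Locate the real kernel on the unlocked filesystem and kexec it.
      kexec --load /mnt/boot/vmlinuz --initrd=/mnt/boot/initrd.img \
            --command-line="root=/dev/mapper/cryptroot ro"
      kexec --exec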
    Last year I presented the Heads runtime
  • 10:43 - 10:50
    for laptops. This is a very security-
    focused initial ramdisk that attempts to
  • 10:50 - 10:55
    provide a slightly more secure, measured,
    and attested firmware, and this works
  • 10:55 - 11:01
    really well with LinuxBoot. My
    collaborator Ron Minnich is working on a
  • 11:01 - 11:06
    Go-based firmware, called NERF, and this
    is written entirely in just-in-time
  • 11:06 - 11:11
    compiled Go, which is really nice because
    it gives you memory safety, and is very
  • 11:11 - 11:16
    popular inside of Google. Being able to
    tailor the device drivers that are
  • 11:16 - 11:22
    included also allows the system to boot
    much faster. UEFI on the Open Compute
  • 11:22 - 11:28
    Winterfell takes about eight minutes to
    start up. With NERF -- excuse me, with
  • 11:28 - 11:33
    LinuxBoot and NERF -- it starts up in
    about 20 seconds. I found similar results
  • 11:33 - 11:39
    on the Intel mainboard that I'm working on
    and hopefully we will get a video. Here's
  • 11:39 - 11:45
    one in action. This is from power-on: it
    executes the PEI phase out of the
  • 11:45 - 11:52
    ROM and then jumps into a small wrapper
    around the Linux kernel, which then prints
  • 11:52 - 11:59
    to the serial port, and we now have the
    Linux printk output, and we have an
  • 11:59 - 12:03
    interactive shell in about 20 seconds,
    which is quite a bit better than the four
  • 12:03 - 12:11
    minutes that the system used to take. It
    scrolled by pretty fast, but you might have
  • 12:11 - 12:16
    noticed from the printk output that
    the Linux kernel thinks it's running under
  • 12:16 - 12:20
    EFI. This is because we have a small wrapper
    around the kernel, but for the most part
  • 12:20 - 12:26
    the kernel is able to do all of the PCI
    and device enumeration that it needs to do,
  • 12:26 - 12:30
    because it already does that, since it
    doesn't trust the vendor BIOSes in a lot
  • 12:30 - 12:39
    of cases. So I'm really glad that the
    Congress has added a track on technical
  • 12:39 - 12:45
    resiliency and I would encourage Congress
    to also add a track on resiliency of our
  • 12:45 - 12:51
    social systems, because it's really vital
    that we deal with both online and offline
  • 12:51 - 12:56
    harassment, and I think that that will help
    us make a safer and more secure Congress
  • 12:56 - 13:06
    as well.
    applause
  • 13:06 - 13:12
    So last year, when I presented
    Heads, I proposed three criteria for a
  • 13:12 - 13:16
    resilient technical system: that they need
    to be built with open-source software,
  • 13:16 - 13:20
    they need to be reproducibly built, and
    they need to be measured into some sort of
  • 13:20 - 13:27
    cryptographic hardware. The "open", you
    know, I think, in this crowd, is not
  • 13:27 - 13:34
    controversial. But the reason that we need
    it is because a lot of the server vendors
  • 13:34 - 13:38
    don't actually control their own firmware;
    they license it from independent BIOS
  • 13:38 - 13:45
    vendors who then tailor it for whatever
    current model of machine the
  • 13:45 - 13:51
    manufacturer is making. This means that
    they typically don't support older
  • 13:51 - 13:56
    hardware and, if there are
    vulnerabilities, it's necessary that we be
  • 13:56 - 14:01
    able to make these patches on our own
    schedule and we need to be able to self-
  • 14:01 - 14:07
    help when it comes to our own security.
    The other problem is that closed source
  • 14:07 - 14:13
    systems can hide vulnerabilities for
    decades — this is especially true for very
  • 14:13 - 14:17
    privileged devices like the management
    engine. There have been several talks here at
  • 14:17 - 14:24
    Congress about the concerns that we have
    with the management engine. Some vendors
  • 14:24 - 14:30
    are even violating our trust entirely and
    using their place in the firmware
  • 14:30 - 14:38
    to install malware or adware onto the
    systems. So for this reason we really need
  • 14:38 - 14:47
    our own control over this firmware.
    Reproducibility is becoming much more of
  • 14:47 - 14:54
    an issue, and the goal here is to be able
    to ensure that everyone who builds the
  • 14:54 - 14:59
    LinuxBoot firmware gets exactly the same
    result that everyone else does. This is a
  • 14:59 - 15:04
    requirement to be able to ensure that
    we're not introducing accidental
  • 15:04 - 15:09
    vulnerabilities through picking up the
    wrong library, or intentional ones through
  • 15:09 - 15:16
    compiler supply chain attacks, such as Ken
    Thompson's Trusting Trust article. With
  • 15:16 - 15:22
    the LinuxBoot firmware, our kernel and
    initial ramdisk are reproducibly built, so
  • 15:22 - 15:28
    we get exactly the same hashes on the
    firmware. Unfortunately we don't control
  • 15:28 - 15:34
    the UEFI portions that we're using — the
    PEI and the SEC phase — so those aren't
  • 15:34 - 15:42
    included in our reproducibility
    right now. "Measured" is another place
  • 15:42 - 15:48
    where we need to take into account the
    runtime security of the system. So
  • 15:48 - 15:53
    reproducible builds handle the compile
    time, but measuring what's running into
  • 15:53 - 15:59
    cryptographic coprocessors — like the TPM
    — gives us the ability to make
  • 15:59 - 16:03
    attestations as to what is actually
    running on the system. On the Heads
  • 16:03 - 16:09
    firmware we do this to the user that the
    firmware can produce a one-time secret
  • 16:09 - 16:14
    that you can compare against your phone to
    know that it has not been tampered with.
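    (Illustrative sketch of that one-time-secret flow, assuming
    tpm2-tools and oathtool; the context file, PCR selection, and
    the prior sealing step are hypothetical and elide the details
    of Heads' actual implementation.)

      # Unseal a TOTP secret that was sealed against the firmware's
      # PCR values; unsealing fails if the measured firmware changed.
      SECRET=$(tpm2_unseal -c /boot/totp.ctx -p pcr:sha256:0,1,2)

      # Derive the current six-digit code; the user compares it with
      # the authenticator app on their phone.
      oathtool --totp --base32 "$SECRET"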
  • 16:14 - 16:18
    In the server case it uses remote
    attestation to be able to prove to the
  • 16:18 - 16:25
    user that the code that is running is what
    they expect. This is a collaboration with
  • 16:25 - 16:31
    the Mass Open Cloud Project, out of Boston
    University and MIT, that is attempting to
  • 16:31 - 16:38
    provide a hardware root of trust for the
    servers, so that you can know that a cloud
  • 16:38 - 16:46
    provider has not tampered with your
    system. The TPM is not invulnerable, as
  • 16:46 - 16:50
    Christopher Tarnovsky showed at DEFCON,
    but the level of effort that it takes to
  • 16:50 - 16:55
    break into a TPM, to decap it, and to read
    out the bits with a microscope raises the
  • 16:55 - 17:02
    bar really significantly. And part of
    resiliency is making honest trade-offs
  • 17:02 - 17:09
    about security threats versus the
    difficulty in launching the attacks, and
  • 17:09 - 17:15
    if the TPM prevents remote attacks or
    prevents software-only attacks, that is a
  • 17:15 - 17:23
    sufficiently high bar for a lot of these
    applications. We have quite a bit of
  • 17:23 - 17:29
    ongoing research with this. As I mentioned
    the management engine is an area of great
  • 17:29 - 17:35
    concern and we are working on figuring out
    how to remove most of its capabilities, so
  • 17:35 - 17:41
    that it's not able to interfere with the
    running system. There's another device in
  • 17:41 - 17:46
    most server motherboards called the
    baseboard management controller, or the BMC, that
  • 17:46 - 17:53
    has a similar level of access to memory
    and devices. So we're concerned about
  • 17:53 - 17:58
    what's running on there, and there's a
    project out of Facebook called OpenBMC
  • 17:58 - 18:05
    that is an open source Linux distribution
    to run on that coprocessor, and what
  • 18:05 - 18:11
    Facebook has done through the Open Compute
    Initiative is, they have their OEMs pre-
  • 18:11 - 18:20
    installing that on the new Open Compute
    nodes, switches, and storage systems. And
  • 18:20 - 18:27
    this is really where we need to get with
    LinuxBoot as well. Right now it requires
  • 18:27 - 18:31
    physical access to the SPI Flash and a
    hardware programmer to be able to install.
  • 18:31 - 18:37
    That's not a hurdle for everyone, but this
    is not something that we want people to be
  • 18:37 - 18:44
    doing in their server rooms. We want OEMs
    to be providing these systems that are
  • 18:44 - 18:49
    secure by default so that it's not
    necessary to break out your chip clip to
  • 18:49 - 18:55
    make this happen. But if you do want to
    contribute, right now we support three
  • 18:55 - 19:03
    different main boards: The Intel S2600,
    which is a modern Wolf Pass CPU, the Mass
  • 19:03 - 19:09
    Open Cloud is working with the Dell R630,
    which is a Haswell, I believe, and then
  • 19:09 - 19:15
    Ron Minnich and John Murrie are working on
    the Open Compute Hardware, and this is
  • 19:15 - 19:22
    again a — in conjunction with OpenBMC — a
    real potential for having free software in
  • 19:22 - 19:28
    our firmware again. So, if you'd like more
    info, we have a website. There's some
  • 19:28 - 19:36
    install instructions and we'd love to help
    you build more secure, more flexible, and
  • 19:36 - 19:40
    more resilient systems. And I really want
    to thank everyone for coming here today,
  • 19:40 - 19:42
    and I'd love to answer any questions that
    you might have!
  • 19:42 - 19:51
    applause
  • 19:51 - 19:53
    Herald: Thank you very much, Trammell Hudson,
  • 19:53 - 19:59
    for this talk. We have 10 minutes for Q&A,
    so please line up at the microphones if
  • 19:59 - 20:03
    you have any questions. But there are no
    questions from the Signal Angel and the
  • 20:03 - 20:05
    internet, so please, microphone number
    one.
  • 20:05 - 20:12
    Q: One quick question. Is Two Sigma using
    this for any of their internal systems,
  • 20:12 - 20:16
    and B, how much vendor outreach is there
    to try and take this beyond just the
  • 20:16 - 20:21
    Open Compute boards, to also get the vendors
    that were on your slides to adopt this?
  • 20:21 - 20:29
    A: So currently, we don't have any deployed
    systems taking advantage of it. It's still
  • 20:29 - 20:34
    very much at the research stage. I've
    been spending quite a bit of time visiting
  • 20:34 - 20:41
    OEMs, and one of my goals for 2018 is to
    have a mainstream OEM shipping it. The
  • 20:41 - 20:47
    Heads project is shipping firmware on some
    laptops from Librem, and I'm hoping we can
  • 20:47 - 20:54
    get LinuxBoot on servers as well.
    Herald: Microphone number 2, please.
  • 20:54 - 21:01
    Q: The question I have is about the size
    of Linux. So you mention that there are
  • 21:01 - 21:08
    problems with UEFI, and it's not open
    source, and stuff like that. But the issue
  • 21:08 - 21:16
    you mention is that the main part of
    UEFI is EDK, which is open source. And
  • 21:16 - 21:21
    then, I mean, I just have to guess that
    the HTTP client and stuff that they have
  • 21:21 - 21:28
    in the Apple boot, I assume it was, is for
    downloading their firmware, but how is
  • 21:28 - 21:33
    replacing something that's huge with
    something that's even bigger going to make
  • 21:33 - 21:37
    the thing more secure? Because I think
    the whole point of having a security
  • 21:37 - 21:43
    kernel is to have it really small to be
    verifiable and I don't see that happening
  • 21:43 - 21:49
    with Linux, because at the same time
    people are coming up with other things. I
  • 21:49 - 21:54
    don't remember the other hypervisor,
    which is supposed to be better than KVM,
  • 21:54 - 22:01
    because KVM is not really verifiable.
    A: So that's a great question. The
  • 22:01 - 22:07
    concern is that Linux is a huge TCB — a
    Trusted Computing Base — and that is
  • 22:07 - 22:13
    a big concern. Since we're already running
    Linux on the server, it essentially is
  • 22:13 - 22:22
    inside our TCB already, so it is large, it
    is difficult to verify. However the
  • 22:22 - 22:26
    lessons that we've learned in porting
    Linux to run in this environment make it
  • 22:26 - 22:31
    also very conceivable that we could build
    other systems. If you want to use a
  • 22:31 - 22:37
    certified — excuse me, a verified
    microkernel, that would be a great thing
  • 22:37 - 22:46
    to bring into the firmware and I'd
    love to figure out some way to make that
  • 22:46 - 22:55
    happen. The second question, just to point
    out: even though EDK2, which is the
  • 22:55 - 23:02
    open source component of UEFI, is open
    source, there's a huge amount of closed
  • 23:02 - 23:08
    source that goes into building a UEFI
    firmware, and we can't verify the closed
  • 23:08 - 23:14
    source part, and even the open source
    parts don't have the level of inspection
  • 23:14 - 23:22
    and correctness that the Linux kernel has
    gone through, and Linux systems that are
  • 23:22 - 23:31
    exposed on the internet. Most of the UEFI
    development is not focused on that level
  • 23:31 - 23:35
    of defense that Linux has to deal with
    every day.
  • 23:35 - 23:41
    H: Microphone number 2, please.
    Q: Thank you for your talk. Would it be
  • 23:41 - 23:49
    possible, apart from servers, to also
    support laptops? Especially
  • 23:49 - 23:55
    the ones locked down by Boot Guard?
    A: So the issue with Boot Guard on laptops
  • 23:55 - 24:02
    is that the CPU fuses are typically set in
    what's called a Verified Boot Mode, and
  • 24:02 - 24:08
    that will not exit the Boot Guard ACM if
    the firmware does not match the
  • 24:08 - 24:13
    manufacturer's hash. So this doesn't give
    us any way to take advantage — to
  • 24:13 - 24:19
    circumvent that. Most server chipsets are
    set in what's called Measured Boot Mode.
  • 24:19 - 24:25
    So the Boot Guard ACM just measures the
    next stage into the TPM, and then jumps
  • 24:25 - 24:31
    into it. So if an attacker has modified
    the firmware you will be able to detect it
  • 24:31 - 24:37
    during the attestation phase.
    H: Microphone number one, please — just
  • 24:37 - 24:46
    one question.
    Q: Thank you. On ARM it's much faster to
  • 24:46 - 24:52
    boot something. It's also much simpler:
    You have an address, you load the bin
  • 24:52 - 24:59
    file, and it boots. On x86 it's much more
    complex, and the amount of code you saw
  • 24:59 - 25:06
    for GRUB relates to that. So my
    question: I've seen Allwinner boards,
  • 25:06 - 25:17
    Cortex A8, booting in four seconds just to
    get a shell, and six seconds to get a Qt
  • 25:17 - 25:22
    app — so the Linux kernel plus a Qt app —
    to do a dashboard for a car — so five to
  • 25:22 - 25:30
    six seconds. So I'm wondering why is there
    such a big difference for a server to take
  • 25:30 - 25:38
    20 or 22 seconds? Is it the peripherals
    that need to be initialized, or what's the
  • 25:38 - 25:41
    reason for it?
    A: So there are several things that
  • 25:41 - 25:45
    contribute to the 20 seconds, and one of
    the things that we're looking into is
  • 25:45 - 25:51
    trying to profile that. We're able to swap
    out the PEI core and turn on a lot of
  • 25:51 - 25:56
    debugging. And what I've seen on the
    Dell system, a lot of that time is spent
  • 25:56 - 26:02
    waiting for the Management Engine to come
    online, and then there's also — it appears
  • 26:02 - 26:10
    to be a one-second timeout for every CPU
    in the system. They bring the CPUs on
  • 26:10 - 26:16
    one at a time, and it takes almost
    precisely 1 million microseconds for each
  • 26:16 - 26:22
    one. So there are things in the vendor
    firmware that we currently don't have the
  • 26:22 - 26:27
    ability to change — that appear to be the
    long pole in the
  • 26:27 - 26:33
    tent on the boot process.
    H: Microphone 3 in the back, please.
  • 26:33 - 26:41
    Q: You addressed a lot about security, but
    my question is rather, there's a lot of
  • 26:41 - 26:48
    settings — for example BIOS settings, UEFI
    settings, and there's stuff like remote
  • 26:48 - 26:55
    booting — which is a whole bunch of weird
    protocols, proprietary stuff, and stuff
  • 26:55 - 27:02
    that's really hard to handle. If you have
    a large installation, for example, you
  • 27:02 - 27:10
    can't just say: Okay deploy all my boot
    orders for the BIOS settings. Are you
  • 27:10 - 27:14
    going to address that in some unified,
    nice way, where I can say, okay I have
  • 27:14 - 27:22
    this one protocol that runs on my Linux
    firmware that does that nicely?
  • 27:22 - 27:29
    A: That's exactly how most sites will
    deploy it: they will write their own
  • 27:29 - 27:35
    boot scripts that use
    normal protocols. So in the
  • 27:35 - 27:42
    Mass Open Cloud they are doing a wget over
    SSL that can then measure the received
  • 27:42 - 27:52
    kernel into the TPM and then kexec it. And
    that's done without requiring changes to
  • 27:52 - 27:57
    NVRAM variables, or all the sort of
    setup that you have to put into
  • 27:57 - 28:02
    configuring a UEFI system. That can be
    replaced with a very small shell script.
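    (Illustrative sketch of such a boot script, assuming wget,
    tpm2-tools, and kexec-tools; the URL, PCR index, and exact
    measurement tooling are hypothetical, not the Mass Open
    Cloud's actual code.)

      #!/bin/sh
      set -e
      # Fetch the kernel over TLS from a hypothetical boot server.
      wget -O /tmp/vmlinuz https://boot.example.com/vmlinuz

      # Measure the received kernel into a TPM PCR before running it,
      # so remote attestation can later prove what was booted.
      HASH=$(sha256sum /tmp/vmlinuz | cut -d' ' -f1)
      tpm2_pcrextend 8:sha256="$HASH"   # PCR 8 chosen for illustration

      kexec --load /tmp/vmlinuz --command-line="root=/dev/sda1 ro"
      kexec --exec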
  • 28:02 - 28:09
    H: We have time for one last question —
    and this is from the Signal Angel, because
  • 28:09 - 28:14
    the internet has a question.
    Q: Yes, the internet has two very simple
  • 28:14 - 28:18
    technical questions: Do you know if
    there's any progress, or do you know of
  • 28:18 - 28:24
    any ETAs, on the Talos II project? And are
    there any size concerns when writing
  • 28:24 - 28:27
    firmware in Go?
  • 28:27 - 28:33
    A: So the Talos II project is
    a POWER-based system, and right now we're
  • 28:33 - 28:39
    mostly focused on the x86 servers, since
    those are the most mainstream available sorts
  • 28:39 - 28:45
    of boards, and the Go firmware is actually
    quite small. I've mostly been working on
  • 28:45 - 28:51
    the Heads side, which is based on shell
    scripts. My understanding is that the
  • 28:51 - 28:56
    just-in-time compiled Go does not add more
    than a few hundred kilobytes to the ROM
  • 28:56 - 29:03
    image and only a few hundred milliseconds
    to the boot time. The advantage of Go is
  • 29:03 - 29:11
    that it is memory safe, and it's an actual
    programming language, so it allows the
  • 29:11 - 29:15
    initialization scripts to be verified in a
    way that can be very
  • 29:15 - 29:19
    difficult to do with shell scripts.
    H: So thank you very much for answering
  • 29:19 - 29:22
    all these questions. Please
    give a warm round of applause to
  • 29:22 - 29:30
    Trammell Hudson. Thank you very much!
    applause
  • 29:30 - 29:34
    postroll music
  • 29:34 - 29:52
    subtitles created by c3subtitles.de
    in the year 2020. Join, and help us!