Software transparency: package security beyond signatures and reproducible builds

  • Not Synced
    This will be an academic talk
    as announced.
  • Not Synced
    I will try to bring some of the research
    I did during my PhD into the real world.
  • Not Synced
    We are going to talk about the security
    of software distribution and
  • Not Synced
    I'm going to propose a security feature
    that adds on top of
  • Not Synced
    the signatures we have up to today
  • Not Synced
    and also the reproducible builds that
    we already have to a very large degree.
  • Not Synced
    I am going to highlight a few points where
    I think infrastructure changes are required
  • Not Synced
    to accommodate this system and I would
    also appreciate any feedback
  • Not Synced
    you might have.
  • Not Synced
    I'm going to ??? a bit of motivation for
    why we should care about this.
  • Not Synced
    In the security of software distribution
  • Not Synced
    we already do have
    cryptographic signatures
  • Not Synced
    I've just put up a few examples of
    recent attacks that involved
  • Not Synced
    the distribution of software where
    people who presumably thought
  • Not Synced
    they knew what they were doing had
    grave problems with software distribution.
  • Not Synced
    For example, the Juniper backdoors,
    pretty famous.
  • Not Synced
    Juniper discovered two backdoors
    in the code and
  • Not Synced
    nobody really knew where they were
    coming from.
  • Not Synced
    Another example would be
    Chrome extension developers
  • Not Synced
    who got their credentials phished and
    subsequently their extensions backdoored
  • Not Synced
    or another example, a signed update to
    a piece of banking software actually included
  • Not Synced
    malware and infected several banks.
  • Not Synced
    I hope this is motivation for us to
    consider these kinds of attacks
  • Not Synced
    to be possible and to prepare ourselves.
  • Not Synced
    I have two main goals in the system
    I am going to propose.
  • Not Synced
    The first is to relax trust in the archive.
  • Not Synced
    In particular, what I want to achieve is
    a level of security even if
  • Not Synced
    the archive is compromised and
    the specific thing I am going to do is
  • Not Synced
    to detect targeted backdoors.
  • Not Synced
    That means backdoors that are distributed
    only to a subset of the population and
  • Not Synced
    what we can achieve is to force
    the attacker to deliver the malware
  • Not Synced
    to everybody, thereby greatly decreasing
    their degree of stealth and increasing
  • Not Synced
    their danger of detection.
  • Not Synced
    This would work to our advantage.
  • Not Synced
    The second goal is forensic auditability
  • Not Synced
    which overlaps to a surprising degree
    with the first one in technical terms,
  • Not Synced
    in terms of implementation.
  • Not Synced
    So, what I want to ensure is that we have
  • Not Synced
    inspectable source code for every binary.
  • Not Synced
    We do have of course the source code
    available from our packages, but
  • Not Synced
    only for the most recent version,
    everything else is a best effort
  • Not Synced
    by the code archiving services.
  • Not Synced
    The mapping between those and the binaries
    can be verified once we have
  • Not Synced
    reproducible builds to a large extent.
  • Not Synced
    I want to make sure that we can identify
    the maintainer responsible for distribution
  • Not Synced
    of a particular package and the system
    is also interested in providing
  • Not Synced
    attribution of where something went wrong,
  • Not Synced
    so that we are not in a situation where we
    notice something went wrong but
  • Not Synced
    we don't really know where we have to look
    in order to find the problems
  • Not Synced
    but that we really have specific and
    secured indication of
  • Not Synced
    where a compromise
    was coming from.
  • Not Synced
    Let's quickly recap how our software
    distribution works.
  • Not Synced
    We have the maintainers who upload
    their code to the archive.
  • Not Synced
    The archive has access to a signing key
    which signs the releases.
  • Not Synced
    Actually, metadata covering all the actual
    binary packages.
  • Not Synced
    These are distributed over
    the mirror network
  • Not Synced
    from where the apt clients will download
    the package metadata.
  • Not Synced
    That means the hash sums for the packages,
    their dependencies and so on
  • Not Synced
    as well as the actual packages themselves.
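
The hash-sum check mentioned here can be illustrated with a minimal sketch. It assumes a heavily simplified, single-stanza version of a Packages file with a `SHA256` field; the helper names `parse_stanza` and `package_matches` and the sample package bytes are hypothetical, and a real client does considerably more (for example, checking the signature on the release metadata first).

```python
import hashlib


def parse_stanza(text: str) -> dict[str, str]:
    """Parse one simplified 'Key: value' metadata stanza (no multi-line fields)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def package_matches(stanza: dict[str, str], package_bytes: bytes) -> bool:
    """Compare the downloaded bytes against the hash sum listed in the metadata."""
    return hashlib.sha256(package_bytes).hexdigest() == stanza["SHA256"]


# Hypothetical package contents and the metadata stanza describing them.
package_bytes = b"pretend this is hello_1.0_amd64.deb"
stanza_text = f"""Package: hello
Version: 1.0
Depends: libc6
SHA256: {hashlib.sha256(package_bytes).hexdigest()}
"""
stanza = parse_stanza(stanza_text)

print(package_matches(stanza, package_bytes))              # package as published
print(package_matches(stanza, package_bytes + b"tamper"))  # modified by a mirror
```

Because the metadata itself is covered by the archive signature, a mirror that modifies a package cannot also adjust the listed hash sum without being noticed.
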
  • Not Synced
    This central architecture has an important
    advantage,
  • Not Synced
    namely that the mirror network need not
    be trusted, right?
  • Not Synced
    We have the signature that covers all
    the contents of binary and source packages
  • Not Synced
    and the metadata, so the mirror network
    need not be trusted.
  • Not Synced
    On the other hand, it makes the archive and
    the signing key a very interesting target
  • Not Synced
    for attackers because this central point
    controls all the signing operations.
  • Not Synced
    So this is a place where we need to be
    particularly careful and perhaps
  • Not Synced
    even do better than
    cryptographic signatures.
  • Not Synced
    This is where the main focus of this talk
    will be, although I will also consider
  • Not Synced
    the uploaders to some extent.
  • Not Synced
    We want to achieve two things:
  • Not Synced
    resistance against key compromise and
    targeted backdoors and
  • Not Synced
    to get some better support for auditing
    in case things go wrong.
  • Not Synced
    The approach that we choose to do this is
  • Not Synced
    we want to make sure that everybody runs
    exactly the same software
  • Not Synced
    or at least the parts of it they choose
    to install.
  • Not Synced
    If we think about that for a moment,
    this gives us a number of advantages.
  • Not Synced
    For example, all the analysis that's done
    on a piece of software immediately
  • Not Synced
    carries over to all other users of
    the software, right?
  • Not Synced
    Because if we haven't made sure that
    everybody installs the same software,
  • Not Synced
    they might not have exactly
    the same version and perhaps
  • Not Synced
    some backdoored version.
  • Not Synced
    This also ensures that we cannot suffer
    targeted backdoors by increasing
  • Not Synced
    the detection risk of attackers
  • Not Synced
    and we also want to have a cryptographic
    proof of where something went wrong.
  • Not Synced
    Now, to look at some pictures,
    I will present the data structure that
  • Not Synced
    we use in order to achieve these goals.
  • Not Synced
    The data structure is a hash tree,
    a Merkle tree which is
  • Not Synced
    a data structure that operates over a list.
  • Not Synced
    So we have a list of these squares here
    which represent the list items.
  • Not Synced
    In our case, this is going to be
    the files containing the package metadata,
  • Not Synced
    that is, dependencies and hash sums of
    packages,
  • Not Synced
    and also the source packages themselves
    are going to be elements in this list.
  • Not Synced
    The tree works as follows.
  • Not Synced
    It uses a cryptographic hash function
  • Not Synced
    which is a collision-resistant compressing
    function
  • Not Synced
    and the labels of the inner nodes
    of the tree are computed as
  • Not Synced
    the hashes of the children. Ok?
  • Not Synced
    Once we have computed the root hash,
    the root label,
  • Not Synced
    we have fixed all the elements and
    none of the elements can be changed
  • Not Synced
    without changing the root hash.
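
As a rough illustration of how such a root hash is computed, here is a minimal sketch. The talk does not specify the exact hashing scheme; this sketch assumes SHA-256 with the leaf and inner-node prefixes used by Certificate Transparency (RFC 6962), and the function names and sample list entries are hypothetical.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Label of a leaf: hash of the list item (prefix 0x00 marks a leaf)."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Label of an inner node: hash of its two children (prefix 0x01)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(items: list[bytes]) -> bytes:
    """Compute the root label over the whole list, recursively."""
    if not items:
        return hashlib.sha256(b"").digest()
    if len(items) == 1:
        return leaf_hash(items[0])
    k = split_point(len(items))
    return node_hash(merkle_root(items[:k]), merkle_root(items[k:]))


# Hypothetical list entries standing in for metadata files and source packages.
entries = [b"Release", b"Packages", b"foo_1.0.dsc", b"bar_2.1.dsc"]
root = merkle_root(entries)

# Changing any single element yields a different root hash.
tampered_entries = [b"Release", b"Packages", b"foo_1.0.dsc (backdoored)", b"bar_2.1.dsc"]
tampered_root = merkle_root(tampered_entries)

print(root.hex())
print(root != tampered_root)
```

The distinct prefixes for leaves and inner nodes are a design choice borrowed from RFC 6962 so that a leaf can never be mistaken for an inner node.
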
  • Not Synced
    We can exploit this in order to
    efficiently prove
  • Not Synced
    the two following properties for ???
  • Not Synced
    First of all, we can efficiently prove
    the inclusion of a given element
  • Not Synced
    in the list.
  • Not Synced
    If we know the tree root ???,
    this works as follows:
  • Not Synced
    let's make a quick example, we see
    the third list item is marked with an X
  • Not Synced
    and if I know the tree root, then
    the server operating the tree structure
  • Not Synced
    will only need to give me the three grey
    marked labels,
  • Not Synced
    the three marked node values and then
    I can recompute the root hash and
  • Not Synced
    be convinced that this element actually
    was contained in the list.
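
The inclusion check described above can be sketched as follows, assuming the same RFC 6962-style hashing as an illustrative choice (not confirmed by the talk) and a list of eight hypothetical entries, so that the proof for the third item consists of exactly three node labels. All function names are hypothetical.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(items: list[bytes]) -> bytes:
    if len(items) == 1:
        return leaf_hash(items[0])
    k = split_point(len(items))
    return node_hash(merkle_root(items[:k]), merkle_root(items[k:]))


def inclusion_proof(items: list[bytes], index: int) -> list[bytes]:
    """Server side: sibling labels on the path from the leaf up to the root."""
    if len(items) == 1:
        return []
    k = split_point(len(items))
    if index < k:
        return inclusion_proof(items[:k], index) + [merkle_root(items[k:])]
    return inclusion_proof(items[k:], index - k) + [merkle_root(items[:k])]


def root_from_proof(leaf: bytes, index: int, size: int, proof: list[bytes]) -> bytes:
    """Client side: recompute the root from one leaf label and the sibling labels."""
    if size == 1:
        if proof:
            raise ValueError("proof too long")
        return leaf
    if not proof:
        raise ValueError("proof too short")
    k = split_point(size)
    if index < k:
        return node_hash(root_from_proof(leaf, index, k, proof[:-1]), proof[-1])
    return node_hash(proof[-1], root_from_proof(leaf, index - k, size - k, proof[:-1]))


entries = [f"entry-{i}".encode() for i in range(8)]
root = merkle_root(entries)          # the tree root the client already knows
proof = inclusion_proof(entries, 2)  # proof for the third list item

recomputed = root_from_proof(leaf_hash(entries[2]), 2, len(entries), proof)
print(len(proof), recomputed == root)
```

The client only needs the item itself, its position, the list size and a number of labels that grows logarithmically with the list, rather than the whole list.
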
  • Not Synced
    The second property is that we can also
    efficiently verify the append-only operation
  • Not Synced
    of the list.
  • Not Synced
    So we can have a log server operating
    this kind of structure and
  • Not Synced
    the log server need not be trusted,
  • Not Synced
    it's not going to be a trusted third party
    but rather, its operation can be
  • Not Synced
    verified from the outside.
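
One way to verify append-only operation from the outside is a consistency proof between an older and a newer tree root. The following is a minimal sketch modelled on the consistency proofs of Certificate Transparency (RFC 6962); whether the proposed system uses exactly this construction is an assumption, and all function names and sample entries are hypothetical.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(items: list[bytes]) -> bytes:
    if len(items) == 1:
        return leaf_hash(items[0])
    k = split_point(len(items))
    return node_hash(merkle_root(items[:k]), merkle_root(items[k:]))


def consistency_proof(items: list[bytes], m: int, whole: bool = True) -> list[bytes]:
    """Log side: labels showing that the first m items are a prefix of items."""
    n = len(items)
    if m == n:
        return [] if whole else [merkle_root(items)]
    k = split_point(n)
    if m <= k:
        return consistency_proof(items[:k], m, whole) + [merkle_root(items[k:])]
    return consistency_proof(items[k:], m - k, False) + [merkle_root(items[:k])]


def roots_from_proof(m, n, proof, old_root, whole=True):
    """Auditor side: rebuild (old root, new root) for this subtree from the proof."""
    if m == n:
        if whole:  # this subtree is exactly the old tree
            if proof:
                raise ValueError("proof too long")
            return old_root, old_root
        if len(proof) != 1:
            raise ValueError("malformed proof")
        return proof[0], proof[0]
    if not proof:
        raise ValueError("proof too short")
    k = split_point(n)
    if m <= k:
        old, new_left = roots_from_proof(m, k, proof[:-1], old_root, whole)
        return old, node_hash(new_left, proof[-1])
    old_right, new_right = roots_from_proof(m - k, n - k, proof[:-1], old_root, False)
    return node_hash(proof[-1], old_right), node_hash(proof[-1], new_right)


def verify_append_only(m, n, old_root, new_root, proof) -> bool:
    """True if the tree of size n with new_root extends the tree of size m."""
    if not 0 < m <= n:
        return False
    try:
        old, new = roots_from_proof(m, n, proof, old_root)
    except ValueError:
        return False
    return old == old_root and new == new_root


entries = [f"entry-{i}".encode() for i in range(7)]
old_root = merkle_root(entries[:3])  # root the auditor saw earlier
new_root = merkle_root(entries)      # root the log presents now
honest = verify_append_only(3, 7, old_root, new_root, consistency_proof(entries, 3))

# A log that rewrote an old entry cannot produce a proof matching the old root.
rewritten = [b"evil"] + entries[1:]
forged = verify_append_only(3, 7, old_root, merkle_root(rewritten),
                            consistency_proof(rewritten, 3))
print(honest, forged)
```

In this sketch the auditor only has to remember the last tree size and root it has seen; everything else comes from the untrusted log and is checked against those two values.
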
  • Not Synced
    So, what does this design look like?
  • Not Synced
    The theoretical foundation is called
    a transparency overlay and
  • Not Synced
    in our system it looks like this:
  • Not Synced
    We have the archive as per usual,
  • Not Synced
    we have a log server and the archive will
    submit package metadata, the release file,
  • Not Synced
    the packages file containing dependencies
    and so on and the source code
  • Not Synced
    into this log server.
  • Not Synced
    The apt client will be augmented with
    an auditor component and
  • Not Synced
    this auditor component is responsible for
    verifying the correct log operation
  • Not Synced
    as well as the inclusion of the downloaded
    release into the log.
Video Language:
English
Team:
Debconf
Project:
2018_mini-debconf-hamburg
Duration:
25:04
