Return to Video

Debian_dependency_resolution_in_polynomial_time.webm

  • Not Synced
    (Offscreen) So welcome to the talk from Niels
  • Not Synced
    about dependency resolution in
    polynomial time.
  • Not Synced
    (Neils) I'm Neils, I'm doing my second
    talk today.
  • Not Synced
    This time about something slightly
    more technical.
  • Not Synced
    I wanted to do dependency resolution
    in deterministic polynomial time,
  • Not Synced
    which is what this is all about.
  • Not Synced
    I didn't get as far with this as I wanted.
  • Not Synced
    I wanted to have a tool ready and have
    some results, which I didn't quite finish.
  • Not Synced
    So these will be my findings work in
    progress notes.
  • Not Synced
    We have six topics I have to cover.
  • Not Synced
    A quick introduction.
  • Not Synced
    Then something about the hard problem.
  • Not Synced
    Then the part the part where it actually
    turns out to be highly tractable
  • Not Synced
    in practice, if you don't think too much
    about it.
  • Not Synced
    And then there is some installing,
    upgrading, and installing again
  • Not Synced
    for how to improve your solvers.
  • Not Synced
    Starting with introduction.
  • Not Synced
    I'm hoping mostly to debunk the myth
    that dependency resolution is a very
  • Not Synced
    hard problem that we cannot solve,
  • Not Synced
    and we should remove packages in the hopes
    that they will somehow keep the archive
  • Not Synced
    from exploding, and the dependency
    resolvers to break, and apt to die,
  • Not Synced
    and god knows what.
  • Not Synced
    I've been working on Britney, which is a
    different problem to solve, but it
  • Not Synced
    involves the same techniques, or uses some
    of the same techniques.
  • Not Synced
    Some will be directly applicable to apt,
    some will not.
  • Not Synced
    There's not a one size fits all solution, sorry.
  • Not Synced
    And I much less haven't wrote it yet,
    even if there was.
  • Not Synced
    So defining the problem.
  • Not Synced
    When you try to install a package on your
    system,
  • Not Synced
    there's actually two problems
    being solved.
  • Not Synced
    One is the part where apt figures out if I
    have to install Eclipse.
  • Not Synced
    You'll be needing a lot of Java packages,
    you'll be needing a lot of other stuff.
  • Not Synced
    And so it figures out which packages are
    needed for this to make sense.
  • Not Synced
    The separate part of the problem,
    the second problem is,
  • Not Synced
    once you have this list, they have to be
    unpackaged and configured in a certain
  • Not Synced
    order for this all to make sense.
  • Not Synced
    They are in theory two distinct problems.
  • Not Synced
    I'll be talking about the first problem,
  • Not Synced
    because that's actually
    dependency resolution.
  • Not Synced
    The other thing is just ordering.
  • Not Synced
    The ordering thing is certainly also very
    important part.
  • Not Synced
    I don't want to dismiss that, it's just
  • Not Synced
    it is in fact theoretically a polynomial
    time problem.
  • Not Synced
    So to solve the ordering problem, you
    basically compute all the action-
  • Not Synced
    Given a set of actions that need
    to be done,
  • Not Synced
    then there are some constraints between
    some of them,
  • Not Synced
    like you unpack things before you
    configure it.
  • Not Synced
    Maybe you deconfigure something else
    before that.
  • Not Synced
    So you end up with a list of partial ??
    constraints.
  • Not Synced
    From that you build a graph with ordering.
  • Not Synced
    It's fairly simple to do, in practice,
    if you have the right tools for it.
  • Not Synced
    Then without cycles, this is a trivial
    things where it just orders it
  • Not Synced
    and gives you this is the order,
    go fix.
  • Not Synced
    When there's cycles you will get a single
    action consisting of multiple things
  • Not Synced
    to do at the same time, which is really
    impossible to do.
  • Not Synced
    It turns out that the way we defined this
    whole thing,
  • Not Synced
    if you have a package that does not have a
    post install script,
  • Not Synced
    it does not need to be configured separately.
  • Not Synced
    That means that it's just unpacked or
    not unpacked.
  • Not Synced
    That tends to break most cylcles.
  • Not Synced
    If you want, there's problems with
    ordering constraints.
  • Not Synced
    Help us to find a way to get rid of most
    of the post install scripts,
  • Not Synced
    such as the ones just running lvconfig
    and nothing else.
  • Not Synced
    That could solve some of the cycles,
  • Not Synced
    and otherwise it's just feeding it to dpkg
    and it works.
  • Not Synced
    If you have a cycle, it's a separate
    problem fixing that.
  • Not Synced
    I won't be covering that.
  • Not Synced
    As I said, the runtime is polynomial.
  • Not Synced
    It's something like the size of the
    package, a regular graph.
  • Not Synced
    Which is fairly polynomial.
  • Not Synced
    This even covers finding cycles.
  • Not Synced
    Again breaking cycles is just removing
    post install scripts.
  • Not Synced
    And of course if you think this is too
    simple,
  • Not Synced
    you can always implement more features
    on top of it,
  • Not Synced
    such as trying to optimize for minimal
    downtime of services and all that.
  • Not Synced
    You can make any trivial problem very hard
    very soon with that.
  • Not Synced
    So, the hard problem.
  • Not Synced
    First of all, the players.
  • Not Synced
    We've got apt, aptitude, cupt, whatever.
  • Not Synced
    We've got Britney.
  • Not Synced
    We've got DOSE, edos, whatever its
    name is today.
  • Not Synced
    They solve the hard problem.
  • Not Synced
    Or they should be trying to do.
  • Not Synced
    Notable tools not affected by it.
  • Not Synced
    dpkg.
  • Not Synced
    So dpkg surely figures out if you are
    missing dependencies. It says,
  • Not Synced
    "I cannot resolve this, and I don't know
    where to fetch it, so I'm just giving up."
  • Not Synced
    And that is the only thing you can do.
  • Not Synced
    Which means it only verifies the solution
    which is known to be polynomial,
  • Not Synced
    and is therefore is not a game player.
  • Not Synced
    DAK rm is also doing a polynomial time
    check, it just happens to be so
  • Not Synced
    for other reasons.
  • Not Synced
    That's basically the known players.
  • Not Synced
    There are probably others.
  • Not Synced
    But we're moving onto the
    hard problem itself
  • Not Synced
    So the problem we actually have is,
  • Not Synced
    we've got version dependencies,
    we've got alternative choices,
  • Not Synced
    we've got virtual packages.
  • Not Synced
    All three makes this problem hard.
  • Not Synced
    You basically have to remove all possible
    alternative choices, guesses, whatever,
  • Not Synced
    to make this simple.
  • Not Synced
    It just so happens if you move version
    dependencies among other things,
  • Not Synced
    things become really fun.
  • Not Synced
    ?? to solve the problem, but upgrading
    gets really really spicy.
  • Not Synced
    So how do we ensure that dpkg
    is configured before your new package,
  • Not Synced
    use a feature new dev package
    is installed.
  • Not Synced
    That's sort of impossible.
  • Not Synced
    It becomes simple, but we end up with a
    broken dependency ??,
  • Not Synced
    so it's not really an option.
  • Not Synced
    Technically there's also a way to make
    it simple by having multiple versions
  • Not Synced
    of things, and removing negative
    dependencies,
  • Not Synced
    but that's not necessarily easy to do either.
  • Not Synced
    Presumably file configs will get in our
    way or something.
  • Not Synced
    We have to redefine where we place all of
    the files, that's not going to be
  • Not Synced
    something we're doing anytime soon.
  • Not Synced
    So yeah, it's not really an option.
  • Not Synced
    To give you a better understanding of the
    problem, please consider this example.
  • Not Synced
    If we have coreutils at some version,
    depending on either this version
  • Not Synced
    of libc or a newer one.
  • Not Synced
    From this can we immediately conclude
  • Not Synced
    that if libc 2.19 is known to be
    installable,
  • Not Synced
    that coreutils will also be installable?
  • Not Synced
    How many people think we can immediately
    conclude something from this?
  • Not Synced
    How many people think we cannot
    conclude anything from this?
  • Not Synced
    We'll that's good.
  • Not Synced
    So it turns out that with negative
    dependencies and without negative
  • Not Synced
    dependencies, it doesn't really matter.
  • Not Synced
    With negative dependencies, libc or
    whatever it depends on could
  • Not Synced
    just conflict with coreutils, we're
    broken, it's out.
  • Not Synced
    And without negative dependencies, you
    could have one of the things where
  • Not Synced
    it depends on a version that's newer or
    older, and no.
  • Not Synced
    It's not really trivial to do.
  • Not Synced
    You can't conclude something locally,
    in theory.
  • Not Synced
    That's something that can
    become a problem.
  • Not Synced
    Anyway, it is highly tractable
    in practice.
  • Not Synced
    Because if you do have a break, conflicts,
    or otherwise negative dependency,
  • Not Synced
    it tends to be upper bound.
  • Not Synced
    So if the previous version of coreutils
    was installable with that version,
  • Not Synced
    the new version will also probably be.
  • Not Synced
    Likewise, most dependencies, circular or
    otherwise, tend to be upper bound.
  • Not Synced
    There are cases of version ranges which
    is a lower bound and upper bound
  • Not Synced
    at the same time.
  • Not Synced
    They exist, they are just not so
    common yet.
  • Not Synced
    And that's a good thing.
  • Not Synced
    The number of possible solutions to
    any clause
  • Not Synced
    actually tends to be fairly limited.
  • Not Synced
    Of course there are the exceptions.
  • Not Synced
    Packages from unstable.
  • Not Synced
    They might just be missing, in which case
    it's really hard to solve the dependency.
  • Not Synced
    You've got mutually exclusive packages
    all the sendmail stuff, MTAs.
  • Not Synced
    The version ranges I mentioned before.
  • Not Synced
    And then of course the strictly equal
    versions, which we see inside
  • Not Synced
    packages built from the same
    source packages.
  • Not Synced
    The redeeming feature here: same source.
  • Not Synced
    Because that is actually very simple
    to figure out
  • Not Synced
    if they are from the same source.
  • Not Synced
    So they tend to be upgraded ??
    anyhow.
  • Not Synced
    The new versions tend to be available
    at the same time sorta thing.
  • Not Synced
    The problem made hard in a nutshell, is
    this contrived example for example.
  • Not Synced
    You have a package you want to try,
    and it depends on any number of foos.
  • Not Synced
    Each of the foos can either be solved by
  • Not Synced
    picking up the bar package
    or the good one.
  • Not Synced
    Which might be a hint to which one we
    would choose if we want to solve it.
  • Not Synced
    And the bar packages, depends on a
    bad package that does not work
  • Not Synced
    with the starting-package,
    and we are broken.
  • Not Synced
    And the good package just doesn't have any
    dependencies and is therefore good.
  • Not Synced
    In the perfect example you saw,
  • Not Synced
    you try foo, good, foo1, and then good
    and be done.
  • Not Synced
    In practice, if you don't have some way
    of ordering these so you know
  • Not Synced
    one of them is better than the other.
  • Not Synced
    You now have an n times m trace,
    where you have to guess up and down,
  • Not Synced
    which one am I doing, backtrack back and
    forth and all, and it's going to be horrible.
  • Not Synced
    And this contrived example can of course
    be solved by some solvers that can see
  • Not Synced
    the pattern, but you can always a
    more contrived pattern
  • Not Synced
    that's harder to understand.
  • Not Synced
    So I tried to keep it somewhat simple.
  • Not Synced
    The good thing is, very few people write
    software like this.
  • Not Synced
    And that is the most redeeming feature
    about the package ??.
  • Not Synced
    We do have exceptions.
  • Not Synced
    Aspell-dictionaries, ispell-dictionaries.
  • Not Synced
    Basically if you depend on
    ispell-dictionary,
  • Not Synced
    which is a virtual package provided by 50
    different ispell packages,
  • Not Synced
    or aspell packages, and then they depend
    on something else after that.
  • Not Synced
    So in theory they can be a part of
    creating this problem.
  • Not Synced
    Fortunately, they themselves are fairly
    simple once you've passed them.
  • Not Synced
    You also have multiarch foreign, or
    multiarch allowed with
  • Not Synced
    package interdependencies.
  • Not Synced
    So if you have multiple architectures,
    you in theory can have any number
  • Not Synced
    of things satisfying the same clause.
  • Not Synced
    This gives extra unnecssary work for most
    resolvers,
  • Not Synced
    if you enable them to do
    multiarch and cross things.
  • Not Synced
    We have 12 of them, but I suspect most
    users of multiarch are somewhat limited
  • Not Synced
    to maybe 3.
  • Not Synced
    Possible exception being people like
    writing resolvers and trying them out,
  • Not Synced
    torture test them.
  • Not Synced
    The real problem, if you had to do this in
    the archive you would need to do,
  • Not Synced
    for example something like, write
    indifferent awk implementations.
  • Not Synced
    We have three.
  • Not Synced
    And then afterwards, for example,
    have m different distinct,
  • Not Synced
    but equally valid libc implementations.
  • Not Synced
    It's not something we're likely to do in
    the near future.
  • Not Synced
    And you have to do this on multiple
    levels,
  • Not Synced
    because you can just presolve the
    essential set most of the time.
  • Not Synced
    The real problem: if you had to do this in
    the archive, you would need to do,
  • Not Synced
    for example, something like write
    N different awk implementations.
  • Not Synced
    We have 3.
  • Not Synced
    Then afterwards for example, have
    M different distinct but equally valid
  • Not Synced
    libc implementations.
  • Not Synced
    It's not something we're likely to do
    in the near future.
  • Not Synced
    And you have to do this on multiple
    levels.
  • Not Synced
    Because you can just presolve the
    essential set most of the time,
  • Not Synced
    this particular thing is not very
    interesting.
  • Not Synced
    And the data packages can in theory
    blow up,
  • Not Synced
    when you have truly interchangable data,
    like we do with the aspell packages.
  • Not Synced
    But they tend to either not have any
    dependencies after the aspell,
  • Not Synced
    after the data package, or they have a
    simple loopback into what you came from.
  • Not Synced
    So the blowup blowup tends to be limited
    to one level.
  • Not Synced
    If you have enough of those, it's still
    going to be a problem,
  • Not Synced
    but it keeps it simple.
  • Not Synced
    Onto installing.
  • Not Synced
    The easier case, as mentioned is when you
    have a single suit
  • Not Synced
    and a single architecture.
  • Not Synced
    All things in the graph collapse into
    what looks like a regular graph,
  • Not Synced
    and therefore become simple,
    trivial.
  • Not Synced
    This actually happens so much that
    if you take Lintian,
  • Not Synced
    it has 26 distinct dependency clauses.
  • Not Synced
    If you look at each of them,
  • Not Synced
    when you only have a single suite
    and architecture, there's at most
  • Not Synced
    one package directly solving each
    clause locally.
  • Not Synced
    Then further down the graph somewhere
    you depend on for example, debconf,
  • Not Synced
    or debconf-2.0 which is one of these
    alternatives, giving rise to its ??
  • Not Synced
    Linitan itself is not subject to becoming
    a backtrack point.
  • Not Synced
    There's no guesswork when you reach
    Lintian, which of its dependencies
  • Not Synced
    should I pick, which version of
    it and all.
  • Not Synced
    That's not there, you have to go further
    down to see that.
Title:
Debian_dependency_resolution_in_polynomial_time.webm
Video Language:
English
Team:
Debconf
Project:
2015_debconf15

English subtitles

Incomplete

Revisions Compare revisions