WEBVTT 99:59:59.999 --> 99:59:59.999 (Offscreen) So welcome to the talk from Niels 99:59:59.999 --> 99:59:59.999 about dependency resolution in polynomial time. 99:59:59.999 --> 99:59:59.999 (Neils) I'm Neils, I'm doing my second talk today. 99:59:59.999 --> 99:59:59.999 This time about something slightly more technical. 99:59:59.999 --> 99:59:59.999 I wanted to do dependency resolution in deterministic polynomial time, 99:59:59.999 --> 99:59:59.999 which is what this is all about. 99:59:59.999 --> 99:59:59.999 I didn't get as far with this as I wanted. 99:59:59.999 --> 99:59:59.999 I wanted to have a tool ready and have some results, which I didn't quite finish. 99:59:59.999 --> 99:59:59.999 So these will be my findings work in progress notes. 99:59:59.999 --> 99:59:59.999 We have six topics I have to cover. 99:59:59.999 --> 99:59:59.999 A quick introduction. 99:59:59.999 --> 99:59:59.999 Then something about the hard problem. 99:59:59.999 --> 99:59:59.999 Then the part the part where it actually turns out to be highly tractable 99:59:59.999 --> 99:59:59.999 in practice, if you don't think too much about it. 99:59:59.999 --> 99:59:59.999 And then there is some installing, upgrading, and installing again 99:59:59.999 --> 99:59:59.999 for how to improve your solvers. 99:59:59.999 --> 99:59:59.999 Starting with introduction. 99:59:59.999 --> 99:59:59.999 I'm hoping mostly to debunk the myth that dependency resolution is a very 99:59:59.999 --> 99:59:59.999 hard problem that we cannot solve, 99:59:59.999 --> 99:59:59.999 and we should remove packages in the hopes that they will somehow keep the archive 99:59:59.999 --> 99:59:59.999 from exploding, and the dependency resolvers to break, and apt to die, 99:59:59.999 --> 99:59:59.999 and god knows what. 99:59:59.999 --> 99:59:59.999 I've been working on Britney, which is a different problem to solve, but it 99:59:59.999 --> 99:59:59.999 involves the same techniques, or uses some of the same techniques. 99:59:59.999 --> 99:59:59.999 Some will be directly applicable to apt, some will not. 99:59:59.999 --> 99:59:59.999 There's not a one size fits all solution, sorry. 99:59:59.999 --> 99:59:59.999 And I much less haven't wrote it yet, even if there was. 99:59:59.999 --> 99:59:59.999 So defining the problem. 99:59:59.999 --> 99:59:59.999 When you try to install a package on your system, 99:59:59.999 --> 99:59:59.999 there's actually two problems being solved. 99:59:59.999 --> 99:59:59.999 One is the part where apt figures out if I have to install Eclipse. 99:59:59.999 --> 99:59:59.999 You'll be needing a lot of Java packages, you'll be needing a lot of other stuff. 99:59:59.999 --> 99:59:59.999 And so it figures out which packages are needed for this to make sense. 99:59:59.999 --> 99:59:59.999 The separate part of the problem, the second problem is, 99:59:59.999 --> 99:59:59.999 once you have this list, they have to be unpackaged and configured in a certain 99:59:59.999 --> 99:59:59.999 order for this all to make sense. 99:59:59.999 --> 99:59:59.999 They are in theory two distinct problems. 99:59:59.999 --> 99:59:59.999 I'll be talking about the first problem, 99:59:59.999 --> 99:59:59.999 because that's actually dependency resolution. 99:59:59.999 --> 99:59:59.999 The other thing is just ordering. 99:59:59.999 --> 99:59:59.999 The ordering thing is certainly also very important part. 99:59:59.999 --> 99:59:59.999 I don't want to dismiss that, it's just 99:59:59.999 --> 99:59:59.999 it is in fact theoretically a polynomial time problem. 99:59:59.999 --> 99:59:59.999 So to solve the ordering problem, you basically compute all the action- 99:59:59.999 --> 99:59:59.999 Given a set of actions that need to be done, 99:59:59.999 --> 99:59:59.999 then there are some constraints between some of them, 99:59:59.999 --> 99:59:59.999 like you unpack things before you configure it. 99:59:59.999 --> 99:59:59.999 Maybe you deconfigure something else before that. 99:59:59.999 --> 99:59:59.999 So you end up with a list of partial ?? constraints. 99:59:59.999 --> 99:59:59.999 From that you build a graph with ordering. 99:59:59.999 --> 99:59:59.999 It's fairly simple to do, in practice, if you have the right tools for it. 99:59:59.999 --> 99:59:59.999 Then without cycles, this is a trivial things where it just orders it 99:59:59.999 --> 99:59:59.999 and gives you this is the order, go fix. 99:59:59.999 --> 99:59:59.999 When there's cycles you will get a single action consisting of multiple things 99:59:59.999 --> 99:59:59.999 to do at the same time, which is really impossible to do. 99:59:59.999 --> 99:59:59.999 It turns out that the way we defined this whole thing, 99:59:59.999 --> 99:59:59.999 if you have a package that does not have a post install script, 99:59:59.999 --> 99:59:59.999 it does not need to be configured separately. 99:59:59.999 --> 99:59:59.999 That means that it's just unpacked or not unpacked. 99:59:59.999 --> 99:59:59.999 That tends to break most cylcles. 99:59:59.999 --> 99:59:59.999 If you want, there's problems with ordering constraints. 99:59:59.999 --> 99:59:59.999 Help us to find a way to get rid of most of the post install scripts, 99:59:59.999 --> 99:59:59.999 such as the ones just running lvconfig and nothing else. 99:59:59.999 --> 99:59:59.999 That could solve some of the cycles, 99:59:59.999 --> 99:59:59.999 and otherwise it's just feeding it to dpkg and it works. 99:59:59.999 --> 99:59:59.999 If you have a cycle, it's a separate problem fixing that. 99:59:59.999 --> 99:59:59.999 I won't be covering that. 99:59:59.999 --> 99:59:59.999 As I said, the runtime is polynomial. 99:59:59.999 --> 99:59:59.999 It's something like the size of the package, a regular graph. 99:59:59.999 --> 99:59:59.999 Which is fairly polynomial. 99:59:59.999 --> 99:59:59.999 This even covers finding cycles. 99:59:59.999 --> 99:59:59.999 Again breaking cycles is just removing post install scripts. 99:59:59.999 --> 99:59:59.999 And of course if you think this is too simple, 99:59:59.999 --> 99:59:59.999 you can always implement more features on top of it, 99:59:59.999 --> 99:59:59.999 such as trying to optimize for minimal downtime of services and all that. 99:59:59.999 --> 99:59:59.999 You can make any trivial problem very hard very soon with that. 99:59:59.999 --> 99:59:59.999 So, the hard problem. 99:59:59.999 --> 99:59:59.999 First of all, the players. 99:59:59.999 --> 99:59:59.999 We've got apt, aptitude, cupt, whatever. 99:59:59.999 --> 99:59:59.999 We've got Britney. 99:59:59.999 --> 99:59:59.999 We've got DOSE, edos, whatever its name is today. 99:59:59.999 --> 99:59:59.999 They solve the hard problem. 99:59:59.999 --> 99:59:59.999 Or they should be trying to do. 99:59:59.999 --> 99:59:59.999 Notable tools not affected by it. 99:59:59.999 --> 99:59:59.999 dpkg. 99:59:59.999 --> 99:59:59.999 So dpkg surely figures out if you are missing dependencies. It says, 99:59:59.999 --> 99:59:59.999 "I cannot resolve this, and I don't know where to fetch it, so I'm just giving up." 99:59:59.999 --> 99:59:59.999 And that is the only thing you can do. 99:59:59.999 --> 99:59:59.999 Which means it only verifies the solution which is known to be polynomial, 99:59:59.999 --> 99:59:59.999 and is therefore is not a game player. 99:59:59.999 --> 99:59:59.999 DAK rm is also doing a polynomial time check, it just happens to be so 99:59:59.999 --> 99:59:59.999 for other reasons. 99:59:59.999 --> 99:59:59.999 That's basically the known players. 99:59:59.999 --> 99:59:59.999 There are probably others. 99:59:59.999 --> 99:59:59.999 But we're moving onto the hard problem itself 99:59:59.999 --> 99:59:59.999 So the problem we actually have is, 99:59:59.999 --> 99:59:59.999 we've got version dependencies, we've got alternative choices, 99:59:59.999 --> 99:59:59.999 we've got virtual packages. 99:59:59.999 --> 99:59:59.999 All three makes this problem hard. 99:59:59.999 --> 99:59:59.999 You basically have to remove all possible alternative choices, guesses, whatever, 99:59:59.999 --> 99:59:59.999 to make this simple. 99:59:59.999 --> 99:59:59.999 It just so happens if you move version dependencies among other things, 99:59:59.999 --> 99:59:59.999 things become really fun. 99:59:59.999 --> 99:59:59.999 ?? to solve the problem, but upgrading gets really really spicy. 99:59:59.999 --> 99:59:59.999 So how do we ensure that dpkg is configured before your new package, 99:59:59.999 --> 99:59:59.999 use a feature new dev package is installed. 99:59:59.999 --> 99:59:59.999 That's sort of impossible. 99:59:59.999 --> 99:59:59.999 It becomes simple, but we end up with a broken dependency ??, 99:59:59.999 --> 99:59:59.999 so it's not really an option. 99:59:59.999 --> 99:59:59.999 Technically there's also a way to make it simple by having multiple versions 99:59:59.999 --> 99:59:59.999 of things, and removing negative dependencies, 99:59:59.999 --> 99:59:59.999 but that's not necessarily easy to do either. 99:59:59.999 --> 99:59:59.999 Presumably file configs will get in our way or something. 99:59:59.999 --> 99:59:59.999 We have to redefine where we place all of the files, that's not going to be 99:59:59.999 --> 99:59:59.999 something we're doing anytime soon. 99:59:59.999 --> 99:59:59.999 So yeah, it's not really an option. 99:59:59.999 --> 99:59:59.999 To give you a better understanding of the problem, please consider this example. 99:59:59.999 --> 99:59:59.999 If we have coreutils at some version, depending on either this version 99:59:59.999 --> 99:59:59.999 of libc or a newer one. 99:59:59.999 --> 99:59:59.999 From this can we immediately conclude 99:59:59.999 --> 99:59:59.999 that if libc 2.19 is known to be installable, 99:59:59.999 --> 99:59:59.999 that coreutils will also be installable? 99:59:59.999 --> 99:59:59.999 How many people think we can immediately conclude something from this? 99:59:59.999 --> 99:59:59.999 How many people think we cannot conclude anything from this? 99:59:59.999 --> 99:59:59.999 We'll that's good. 99:59:59.999 --> 99:59:59.999 So it turns out that with negative dependencies and without negative 99:59:59.999 --> 99:59:59.999 dependencies, it doesn't really matter. 99:59:59.999 --> 99:59:59.999 With negative dependencies, libc or whatever it depends on could 99:59:59.999 --> 99:59:59.999 just conflict with coreutils, we're broken, it's out. 99:59:59.999 --> 99:59:59.999 And without negative dependencies, you could have one of the things where 99:59:59.999 --> 99:59:59.999 it depends on a version that's newer or older, and no. 99:59:59.999 --> 99:59:59.999 It's not really trivial to do. 99:59:59.999 --> 99:59:59.999 You can't conclude something locally, in theory. 99:59:59.999 --> 99:59:59.999 That's something that can become a problem. 99:59:59.999 --> 99:59:59.999 Anyway, it is highly tractable in practice. 99:59:59.999 --> 99:59:59.999 Because if you do have a break, conflicts, or otherwise negative dependency, 99:59:59.999 --> 99:59:59.999 it tends to be upper bound. 99:59:59.999 --> 99:59:59.999 So if the previous version of coreutils was installable with that version, 99:59:59.999 --> 99:59:59.999 the new version will also probably be. 99:59:59.999 --> 99:59:59.999 Likewise, most dependencies, circular or otherwise, tend to be upper bound. 99:59:59.999 --> 99:59:59.999 There are cases of version ranges which is a lower bound and upper bound 99:59:59.999 --> 99:59:59.999 at the same time. 99:59:59.999 --> 99:59:59.999 They exist, they are just not so common yet. 99:59:59.999 --> 99:59:59.999 And that's a good thing. 99:59:59.999 --> 99:59:59.999 The number of possible solutions to any clause 99:59:59.999 --> 99:59:59.999 actually tends to be fairly limited. 99:59:59.999 --> 99:59:59.999 Of course there are the exceptions. 99:59:59.999 --> 99:59:59.999 Packages from unstable. 99:59:59.999 --> 99:59:59.999 They might just be missing, in which case it's really hard to solve the dependency. 99:59:59.999 --> 99:59:59.999 You've got mutually exclusive packages all the sendmail stuff, MTAs. 99:59:59.999 --> 99:59:59.999 The version ranges I mentioned before. 99:59:59.999 --> 99:59:59.999 And then of course the strictly equal versions, which we see inside 99:59:59.999 --> 99:59:59.999 packages built from the same source packages. 99:59:59.999 --> 99:59:59.999 The redeeming feature here: same source. 99:59:59.999 --> 99:59:59.999 Because that is actually very simple to figure out 99:59:59.999 --> 99:59:59.999 if they are from the same source. 99:59:59.999 --> 99:59:59.999 So they tend to be upgraded ?? anyhow. 99:59:59.999 --> 99:59:59.999 The new versions tend to be available at the same time sorta thing. 99:59:59.999 --> 99:59:59.999 The problem made hard in a nutshell, is this contrived example for example. 99:59:59.999 --> 99:59:59.999 You have a package you want to try, and it depends on any number of foos. 99:59:59.999 --> 99:59:59.999 Each of the foos can either be solved by 99:59:59.999 --> 99:59:59.999 picking up the bar package or the good one. 99:59:59.999 --> 99:59:59.999 Which might be a hint to which one we would choose if we want to solve it. 99:59:59.999 --> 99:59:59.999 And the bar packages, depends on a bad package that does not work 99:59:59.999 --> 99:59:59.999 with the starting-package, and we are broken. 99:59:59.999 --> 99:59:59.999 And the good package just doesn't have any dependencies and is therefore good. 99:59:59.999 --> 99:59:59.999 In the perfect example you saw, 99:59:59.999 --> 99:59:59.999 you try foo, good, foo1, and then good and be done. 99:59:59.999 --> 99:59:59.999 In practice, if you don't have some way of ordering these so you know 99:59:59.999 --> 99:59:59.999 one of them is better than the other. 99:59:59.999 --> 99:59:59.999 You now have an n times m trace, where you have to guess up and down, 99:59:59.999 --> 99:59:59.999 which one am I doing, backtrack back and forth and all, and it's going to be horrible. 99:59:59.999 --> 99:59:59.999 And this contrived example can of course be solved by some solvers that can see 99:59:59.999 --> 99:59:59.999 the pattern, but you can always a more contrived pattern 99:59:59.999 --> 99:59:59.999 that's harder to understand. 99:59:59.999 --> 99:59:59.999 So I tried to keep it somewhat simple. 99:59:59.999 --> 99:59:59.999 The good thing is, very few people write software like this. 99:59:59.999 --> 99:59:59.999 And that is the most redeeming feature about the package ??. 99:59:59.999 --> 99:59:59.999 We do have exceptions. 99:59:59.999 --> 99:59:59.999 Aspell-dictionaries, ispell-dictionaries. 99:59:59.999 --> 99:59:59.999 Basically if you depend on ispell-dictionary, 99:59:59.999 --> 99:59:59.999 which is a virtual package provided by 50 different ispell packages, 99:59:59.999 --> 99:59:59.999 or aspell packages, and then they depend on something else after that. 99:59:59.999 --> 99:59:59.999 So in theory they can be a part of creating this problem. 99:59:59.999 --> 99:59:59.999 Fortunately, they themselves are fairly simple once you've passed them. 99:59:59.999 --> 99:59:59.999 You also have multiarch foreign, or multiarch allowed with 99:59:59.999 --> 99:59:59.999 package interdependencies. 99:59:59.999 --> 99:59:59.999 So if you have multiple architectures, you in theory can have any number 99:59:59.999 --> 99:59:59.999 of things satisfying the same clause. 99:59:59.999 --> 99:59:59.999 This gives extra unnecssary work for most resolvers, 99:59:59.999 --> 99:59:59.999 if you enable them to do multiarch and cross things. 99:59:59.999 --> 99:59:59.999 We have 12 of them, but I suspect most users of multiarch are somewhat limited 99:59:59.999 --> 99:59:59.999 to maybe 3. 99:59:59.999 --> 99:59:59.999 Possible exception being people like writing resolvers and trying them out, 99:59:59.999 --> 99:59:59.999 torture test them. 99:59:59.999 --> 99:59:59.999 The real problem, if you had to do this in the archive you would need to do, 99:59:59.999 --> 99:59:59.999 for example something like, write indifferent awk implementations. 99:59:59.999 --> 99:59:59.999 We have three. 99:59:59.999 --> 99:59:59.999 And then afterwards, for example, have m different distinct, 99:59:59.999 --> 99:59:59.999 but equally valid libc implementations. 99:59:59.999 --> 99:59:59.999 It's not something we're likely to do in the near future. 99:59:59.999 --> 99:59:59.999 And you have to do this on multiple levels, 99:59:59.999 --> 99:59:59.999 because you can just presolve the essential set most of the time.