1 99:59:59,999 --> 99:59:59,999 (Offscreen) So welcome to the talk from Niels 2 99:59:59,999 --> 99:59:59,999 about dependency resolution in polynomial time. 3 99:59:59,999 --> 99:59:59,999 (Neils) I'm Neils, I'm doing my second talk today. 4 99:59:59,999 --> 99:59:59,999 This time about something slightly more technical. 5 99:59:59,999 --> 99:59:59,999 I wanted to do dependency resolution in deterministic polynomial time, 6 99:59:59,999 --> 99:59:59,999 which is what this is all about. 7 99:59:59,999 --> 99:59:59,999 I didn't get as far with this as I wanted. 8 99:59:59,999 --> 99:59:59,999 I wanted to have a tool ready and have some results, which I didn't quite finish. 9 99:59:59,999 --> 99:59:59,999 So these will be my findings work in progress notes. 10 99:59:59,999 --> 99:59:59,999 We have six topics I have to cover. 11 99:59:59,999 --> 99:59:59,999 A quick introduction. 12 99:59:59,999 --> 99:59:59,999 Then something about the hard problem. 13 99:59:59,999 --> 99:59:59,999 Then the part the part where it actually turns out to be highly tractable 14 99:59:59,999 --> 99:59:59,999 in practice, if you don't think too much about it. 15 99:59:59,999 --> 99:59:59,999 And then there is some installing, upgrading, and installing again 16 99:59:59,999 --> 99:59:59,999 for how to improve your solvers. 17 99:59:59,999 --> 99:59:59,999 Starting with introduction. 18 99:59:59,999 --> 99:59:59,999 I'm hoping mostly to debunk the myth that dependency resolution is a very 19 99:59:59,999 --> 99:59:59,999 hard problem that we cannot solve, 20 99:59:59,999 --> 99:59:59,999 and we should remove packages in the hopes that they will somehow keep the archive 21 99:59:59,999 --> 99:59:59,999 from exploding, and the dependency resolvers to break, and apt to die, 22 99:59:59,999 --> 99:59:59,999 and god knows what. 23 99:59:59,999 --> 99:59:59,999 I've been working on Britney, which is a different problem to solve, but it 24 99:59:59,999 --> 99:59:59,999 involves the same techniques, or uses some of the same techniques. 25 99:59:59,999 --> 99:59:59,999 Some will be directly applicable to apt, some will not. 26 99:59:59,999 --> 99:59:59,999 There's not a one size fits all solution, sorry. 27 99:59:59,999 --> 99:59:59,999 And I much less haven't wrote it yet, even if there was. 28 99:59:59,999 --> 99:59:59,999 So defining the problem. 29 99:59:59,999 --> 99:59:59,999 When you try to install a package on your system, 30 99:59:59,999 --> 99:59:59,999 there's actually two problems being solved. 31 99:59:59,999 --> 99:59:59,999 One is the part where apt figures out if I have to install Eclipse. 32 99:59:59,999 --> 99:59:59,999 You'll be needing a lot of Java packages, you'll be needing a lot of other stuff. 33 99:59:59,999 --> 99:59:59,999 And so it figures out which packages are needed for this to make sense. 34 99:59:59,999 --> 99:59:59,999 The separate part of the problem, the second problem is, 35 99:59:59,999 --> 99:59:59,999 once you have this list, they have to be unpackaged and configured in a certain 36 99:59:59,999 --> 99:59:59,999 order for this all to make sense. 37 99:59:59,999 --> 99:59:59,999 They are in theory two distinct problems. 38 99:59:59,999 --> 99:59:59,999 I'll be talking about the first problem, 39 99:59:59,999 --> 99:59:59,999 because that's actually dependency resolution. 40 99:59:59,999 --> 99:59:59,999 The other thing is just ordering. 41 99:59:59,999 --> 99:59:59,999 The ordering thing is certainly also very important part. 42 99:59:59,999 --> 99:59:59,999 I don't want to dismiss that, it's just 43 99:59:59,999 --> 99:59:59,999 it is in fact theoretically a polynomial time problem. 44 99:59:59,999 --> 99:59:59,999 So to solve the ordering problem, you basically compute all the action- 45 99:59:59,999 --> 99:59:59,999 Given a set of actions that need to be done, 46 99:59:59,999 --> 99:59:59,999 then there are some constraints between some of them, 47 99:59:59,999 --> 99:59:59,999 like you unpack things before you configure it. 48 99:59:59,999 --> 99:59:59,999 Maybe you deconfigure something else before that. 49 99:59:59,999 --> 99:59:59,999 So you end up with a list of partial ?? constraints. 50 99:59:59,999 --> 99:59:59,999 From that you build a graph with ordering. 51 99:59:59,999 --> 99:59:59,999 It's fairly simple to do, in practice, if you have the right tools for it. 52 99:59:59,999 --> 99:59:59,999 Then without cycles, this is a trivial things where it just orders it 53 99:59:59,999 --> 99:59:59,999 and gives you this is the order, go fix. 54 99:59:59,999 --> 99:59:59,999 When there's cycles you will get a single action consisting of multiple things 55 99:59:59,999 --> 99:59:59,999 to do at the same time, which is really impossible to do. 56 99:59:59,999 --> 99:59:59,999 It turns out that the way we defined this whole thing, 57 99:59:59,999 --> 99:59:59,999 if you have a package that does not have a post install script, 58 99:59:59,999 --> 99:59:59,999 it does not need to be configured separately. 59 99:59:59,999 --> 99:59:59,999 That means that it's just unpacked or not unpacked. 60 99:59:59,999 --> 99:59:59,999 That tends to break most cylcles. 61 99:59:59,999 --> 99:59:59,999 If you want, there's problems with ordering constraints. 62 99:59:59,999 --> 99:59:59,999 Help us to find a way to get rid of most of the post install scripts, 63 99:59:59,999 --> 99:59:59,999 such as the ones just running lvconfig and nothing else. 64 99:59:59,999 --> 99:59:59,999 That could solve some of the cycles, 65 99:59:59,999 --> 99:59:59,999 and otherwise it's just feeding it to dpkg and it works. 66 99:59:59,999 --> 99:59:59,999 If you have a cycle, it's a separate problem fixing that. 67 99:59:59,999 --> 99:59:59,999 I won't be covering that. 68 99:59:59,999 --> 99:59:59,999 As I said, the runtime is polynomial. 69 99:59:59,999 --> 99:59:59,999 It's something like the size of the package, a regular graph. 70 99:59:59,999 --> 99:59:59,999 Which is fairly polynomial. 71 99:59:59,999 --> 99:59:59,999 This even covers finding cycles. 72 99:59:59,999 --> 99:59:59,999 Again breaking cycles is just removing post install scripts. 73 99:59:59,999 --> 99:59:59,999 And of course if you think this is too simple, 74 99:59:59,999 --> 99:59:59,999 you can always implement more features on top of it, 75 99:59:59,999 --> 99:59:59,999 such as trying to optimize for minimal downtime of services and all that. 76 99:59:59,999 --> 99:59:59,999 You can make any trivial problem very hard very soon with that. 77 99:59:59,999 --> 99:59:59,999 So, the hard problem. 78 99:59:59,999 --> 99:59:59,999 First of all, the players. 79 99:59:59,999 --> 99:59:59,999 We've got apt, aptitude, cupt, whatever. 80 99:59:59,999 --> 99:59:59,999 We've got Britney. 81 99:59:59,999 --> 99:59:59,999 We've got DOSE, edos, whatever its name is today. 82 99:59:59,999 --> 99:59:59,999 They solve the hard problem. 83 99:59:59,999 --> 99:59:59,999 Or they should be trying to do. 84 99:59:59,999 --> 99:59:59,999 Notable tools not affected by it. 85 99:59:59,999 --> 99:59:59,999 dpkg. 86 99:59:59,999 --> 99:59:59,999 So dpkg surely figures out if you are missing dependencies. It says, 87 99:59:59,999 --> 99:59:59,999 "I cannot resolve this, and I don't know where to fetch it, so I'm just giving up." 88 99:59:59,999 --> 99:59:59,999 And that is the only thing you can do. 89 99:59:59,999 --> 99:59:59,999 Which means it only verifies the solution which is known to be polynomial, 90 99:59:59,999 --> 99:59:59,999 and is therefore is not a game player. 91 99:59:59,999 --> 99:59:59,999 DAK rm is also doing a polynomial time check, it just happens to be so 92 99:59:59,999 --> 99:59:59,999 for other reasons. 93 99:59:59,999 --> 99:59:59,999 That's basically the known players. 94 99:59:59,999 --> 99:59:59,999 There are probably others. 95 99:59:59,999 --> 99:59:59,999 But we're moving onto the hard problem itself 96 99:59:59,999 --> 99:59:59,999 So the problem we actually have is, 97 99:59:59,999 --> 99:59:59,999 we've got version dependencies, we've got alternative choices, 98 99:59:59,999 --> 99:59:59,999 we've got virtual packages. 99 99:59:59,999 --> 99:59:59,999 All three makes this problem hard. 100 99:59:59,999 --> 99:59:59,999 You basically have to remove all possible alternative choices, guesses, whatever, 101 99:59:59,999 --> 99:59:59,999 to make this simple. 102 99:59:59,999 --> 99:59:59,999 It just so happens if you move version dependencies among other things, 103 99:59:59,999 --> 99:59:59,999 things become really fun. 104 99:59:59,999 --> 99:59:59,999 ?? to solve the problem, but upgrading gets really really spicy. 105 99:59:59,999 --> 99:59:59,999 So how do we ensure that dpkg is configured before your new package, 106 99:59:59,999 --> 99:59:59,999 use a feature new dev package is installed. 107 99:59:59,999 --> 99:59:59,999 That's sort of impossible. 108 99:59:59,999 --> 99:59:59,999 It becomes simple, but we end up with a broken dependency ??, 109 99:59:59,999 --> 99:59:59,999 so it's not really an option. 110 99:59:59,999 --> 99:59:59,999 Technically there's also a way to make it simple by having multiple versions 111 99:59:59,999 --> 99:59:59,999 of things, and removing negative dependencies, 112 99:59:59,999 --> 99:59:59,999 but that's not necessarily easy to do either. 113 99:59:59,999 --> 99:59:59,999 Presumably file configs will get in our way or something. 114 99:59:59,999 --> 99:59:59,999 We have to redefine where we place all of the files, that's not going to be 115 99:59:59,999 --> 99:59:59,999 something we're doing anytime soon. 116 99:59:59,999 --> 99:59:59,999 So yeah, it's not really an option. 117 99:59:59,999 --> 99:59:59,999 To give you a better understanding of the problem, please consider this example. 118 99:59:59,999 --> 99:59:59,999 If we have coreutils at some version, depending on either this version 119 99:59:59,999 --> 99:59:59,999 of libc or a newer one. 120 99:59:59,999 --> 99:59:59,999 From this can we immediately conclude 121 99:59:59,999 --> 99:59:59,999 that if libc 2.19 is known to be installable, 122 99:59:59,999 --> 99:59:59,999 that coreutils will also be installable? 123 99:59:59,999 --> 99:59:59,999 How many people think we can immediately conclude something from this? 124 99:59:59,999 --> 99:59:59,999 How many people think we cannot conclude anything from this? 125 99:59:59,999 --> 99:59:59,999 We'll that's good. 126 99:59:59,999 --> 99:59:59,999 So it turns out that with negative dependencies and without negative 127 99:59:59,999 --> 99:59:59,999 dependencies, it doesn't really matter. 128 99:59:59,999 --> 99:59:59,999 With negative dependencies, libc or whatever it depends on could 129 99:59:59,999 --> 99:59:59,999 just conflict with coreutils, we're broken, it's out. 130 99:59:59,999 --> 99:59:59,999 And without negative dependencies, you could have one of the things where 131 99:59:59,999 --> 99:59:59,999 it depends on a version that's newer or older, and no. 132 99:59:59,999 --> 99:59:59,999 It's not really trivial to do. 133 99:59:59,999 --> 99:59:59,999 You can't conclude something locally, in theory. 134 99:59:59,999 --> 99:59:59,999 That's something that can become a problem. 135 99:59:59,999 --> 99:59:59,999 Anyway, it is highly tractable in practice. 136 99:59:59,999 --> 99:59:59,999 Because if you do have a break, conflicts, or otherwise negative dependency, 137 99:59:59,999 --> 99:59:59,999 it tends to be upper bound. 138 99:59:59,999 --> 99:59:59,999 So if the previous version of coreutils was installable with that version, 139 99:59:59,999 --> 99:59:59,999 the new version will also probably be. 140 99:59:59,999 --> 99:59:59,999 Likewise, most dependencies, circular or otherwise, tend to be upper bound. 141 99:59:59,999 --> 99:59:59,999 There are cases of version ranges which is a lower bound and upper bound 142 99:59:59,999 --> 99:59:59,999 at the same time. 143 99:59:59,999 --> 99:59:59,999 They exist, they are just not so common yet. 144 99:59:59,999 --> 99:59:59,999 And that's a good thing. 145 99:59:59,999 --> 99:59:59,999 The number of possible solutions to any clause 146 99:59:59,999 --> 99:59:59,999 actually tends to be fairly limited. 147 99:59:59,999 --> 99:59:59,999 Of course there are the exceptions. 148 99:59:59,999 --> 99:59:59,999 Packages from unstable. 149 99:59:59,999 --> 99:59:59,999 They might just be missing, in which case it's really hard to solve the dependency. 150 99:59:59,999 --> 99:59:59,999 You've got mutually exclusive packages all the sendmail stuff, MTAs. 151 99:59:59,999 --> 99:59:59,999 The version ranges I mentioned before. 152 99:59:59,999 --> 99:59:59,999 And then of course the strictly equal versions, which we see inside 153 99:59:59,999 --> 99:59:59,999 packages built from the same source packages. 154 99:59:59,999 --> 99:59:59,999 The redeeming feature here: same source. 155 99:59:59,999 --> 99:59:59,999 Because that is actually very simple to figure out 156 99:59:59,999 --> 99:59:59,999 if they are from the same source. 157 99:59:59,999 --> 99:59:59,999 So they tend to be upgraded ?? anyhow. 158 99:59:59,999 --> 99:59:59,999 The new versions tend to be available at the same time sorta thing. 159 99:59:59,999 --> 99:59:59,999 The problem made hard in a nutshell, is this contrived example for example. 160 99:59:59,999 --> 99:59:59,999 You have a package you want to try, and it depends on any number of foos. 161 99:59:59,999 --> 99:59:59,999 Each of the foos can either be solved by 162 99:59:59,999 --> 99:59:59,999 picking up the bar package or the good one. 163 99:59:59,999 --> 99:59:59,999 Which might be a hint to which one we would choose if we want to solve it. 164 99:59:59,999 --> 99:59:59,999 And the bar packages, depends on a bad package that does not work 165 99:59:59,999 --> 99:59:59,999 with the starting-package, and we are broken. 166 99:59:59,999 --> 99:59:59,999 And the good package just doesn't have any dependencies and is therefore good. 167 99:59:59,999 --> 99:59:59,999 In the perfect example you saw, 168 99:59:59,999 --> 99:59:59,999 you try foo, good, foo1, and then good and be done. 169 99:59:59,999 --> 99:59:59,999 In practice, if you don't have some way of ordering these so you know 170 99:59:59,999 --> 99:59:59,999 one of them is better than the other. 171 99:59:59,999 --> 99:59:59,999 You now have an N times M trace, where you have to guess up and down, which 172 99:59:59,999 --> 99:59:59,999 one am I doing, backtrack back and forth and all, and it's going to be horrible. 173 99:59:59,999 --> 99:59:59,999 And this contrived example can of course be solved by some solvers that can see 174 99:59:59,999 --> 99:59:59,999 the pattern, but you can always a more contrived pattern 175 99:59:59,999 --> 99:59:59,999 that's harder to understand. 176 99:59:59,999 --> 99:59:59,999 So I tried to keep it somewhat simple. 177 99:59:59,999 --> 99:59:59,999 The good thing is, very few people write software like this. 178 99:59:59,999 --> 99:59:59,999 And that is the most redeeming feature about the package ??. 179 99:59:59,999 --> 99:59:59,999 We do have exceptions. 180 99:59:59,999 --> 99:59:59,999 Aspell-dictionaries, ispell-dictionaries. 181 99:59:59,999 --> 99:59:59,999 Basically if you depend on ispell-dictionary, 182 99:59:59,999 --> 99:59:59,999 which is a virtual package provided by 50 different ispell packages, 183 99:59:59,999 --> 99:59:59,999 or aspell packages, and then they depend on something else after that. 184 99:59:59,999 --> 99:59:59,999 So in theory they can be a part of creating this problem. 185 99:59:59,999 --> 99:59:59,999 Fortunately, they themselves are fairly simple once you've passed them. 186 99:59:59,999 --> 99:59:59,999 You also have multiarch foreign, or multiarch allowed with 187 99:59:59,999 --> 99:59:59,999 package interdependencies. 188 99:59:59,999 --> 99:59:59,999 So if you have multiple architectures, you in theory can have any number 189 99:59:59,999 --> 99:59:59,999 of things satisfying the same clause. 190 99:59:59,999 --> 99:59:59,999 This gives extra unnecssary work for most resolvers, 191 99:59:59,999 --> 99:59:59,999 if you enable them to do multiarch and cross things. 192 99:59:59,999 --> 99:59:59,999 We have 12 of them, but I suspect most users of multiarch are somewhat limited 193 99:59:59,999 --> 99:59:59,999 to maybe 3. 194 99:59:59,999 --> 99:59:59,999 Possible exception being people like writing resolvers and trying them out, 195 99:59:59,999 --> 99:59:59,999 torture test them. 196 99:59:59,999 --> 99:59:59,999 The real problem, if you had to do this in the archive you would need to do, 197 99:59:59,999 --> 99:59:59,999 for example something like, write indifferent awk implementations. 198 99:59:59,999 --> 99:59:59,999 We have three. 199 99:59:59,999 --> 99:59:59,999 And then afterwards, for example, have m different distinct, 200 99:59:59,999 --> 99:59:59,999 but equally valid libc implementations. 201 99:59:59,999 --> 99:59:59,999 It's not something we're likely to do in the near future. 202 99:59:59,999 --> 99:59:59,999 And you have to do this on multiple levels, 203 99:59:59,999 --> 99:59:59,999 because you can just presolve the essential set most of the time. 204 99:59:59,999 --> 99:59:59,999 The real problem: if you had to do this in the archive, you would need to do, 205 99:59:59,999 --> 99:59:59,999 for example, something like write N different awk implementations. 206 99:59:59,999 --> 99:59:59,999 We have 3. 207 99:59:59,999 --> 99:59:59,999 Then afterwards for example, have M different distinct but equally valid 208 99:59:59,999 --> 99:59:59,999 libc implementations. 209 99:59:59,999 --> 99:59:59,999 It's not something we're likely to do in the near future. 210 99:59:59,999 --> 99:59:59,999 And you have to do this on multiple levels. 211 99:59:59,999 --> 99:59:59,999 Because you can just presolve the essential set most of the time, 212 99:59:59,999 --> 99:59:59,999 this particular thing is not very interesting. 213 99:59:59,999 --> 99:59:59,999 And the data packages can in theory blow up, 214 99:59:59,999 --> 99:59:59,999 when you have truly interchangable data, like we do with the aspell packages. 215 99:59:59,999 --> 99:59:59,999 But they tend to either not have any dependencies after the aspell, 216 99:59:59,999 --> 99:59:59,999 after the data package, or they have a simple loopback into what you came from. 217 99:59:59,999 --> 99:59:59,999 So the blowup blowup tends to be limited to one level. 218 99:59:59,999 --> 99:59:59,999 If you have enough of those, it's still going to be a problem, 219 99:59:59,999 --> 99:59:59,999 but it keeps it simple. 220 99:59:59,999 --> 99:59:59,999 Onto installing. 221 99:59:59,999 --> 99:59:59,999 The easier case, as mentioned is when you have a single suit 222 99:59:59,999 --> 99:59:59,999 and a single architecture. 223 99:59:59,999 --> 99:59:59,999 All things in the graph collapse into what looks like a regular graph, 224 99:59:59,999 --> 99:59:59,999 and therefore become simple, trivial. 225 99:59:59,999 --> 99:59:59,999 This actually happens so much that if you take Lintian, 226 99:59:59,999 --> 99:59:59,999 it has 26 distinct dependency clauses. 227 99:59:59,999 --> 99:59:59,999 If you look at each of them, 228 99:59:59,999 --> 99:59:59,999 when you only have a single suite and architecture, there's at most 229 99:59:59,999 --> 99:59:59,999 one package directly solving each clause locally. 230 99:59:59,999 --> 99:59:59,999 Then further down the graph somewhere you depend on for example, debconf, 231 99:59:59,999 --> 99:59:59,999 or debconf-2.0 which is one of these alternatives, giving rise to its ?? 232 99:59:59,999 --> 99:59:59,999 Linitan itself is not subject to becoming a backtrack point. 233 99:59:59,999 --> 99:59:59,999 There's no guesswork when you reach Lintian, which of its dependencies 234 99:59:59,999 --> 99:59:59,999 should I pick, which version of it and all. 235 99:59:59,999 --> 99:59:59,999 That's not there, you have to go further down to see that. 236 99:59:59,999 --> 99:59:59,999 Actually, but the time you reach the debconf thing, 237 99:59:59,999 --> 99:59:59,999 it has already been solved by something else, as I recall. 238 99:59:59,999 --> 99:59:59,999 Oh, no wait. The dependencies of debconf is already covered by the time you're 239 99:59:59,999 --> 99:59:59,999 forced to resolve this choice, so it's not really a big issue, either. 240 99:59:59,999 --> 99:59:59,999 You just pick the debconf one, c-debconf currently depends on debconf, 241 99:59:59,999 --> 99:59:59,999 so it's trivial that way. 242 99:59:59,999 --> 99:59:59,999 The question is, when do we have these conditions. 243 99:59:59,999 --> 99:59:59,999 We do that a lot actually. 244 99:59:59,999 --> 99:59:59,999 If you do builts in unstable for example. 245 99:59:59,999 --> 99:59:59,999 ??, only one suite only one architecture, piece of cake. 246 99:59:59,999 --> 99:59:59,999 If you install packages of a pure Jessie system, with multiarch, 247 99:59:59,999 --> 99:59:59,999 you have the single suite one, which limits a lot of them, 248 99:59:59,999 --> 99:59:59,999 and then you have not the single architecture. 249 99:59:59,999 --> 99:59:59,999 That's what it is, but in pure Squeeze, you won't have multiarch because 250 99:59:59,999 --> 99:59:59,999 because it didn't work back then. 251 99:59:59,999 --> 99:59:59,999 So that's even simpler. 252 99:59:59,999 --> 99:59:59,999 Also you can do unstable + experimental, or stable + backports if you fiddle a bit 253 99:59:59,999 --> 99:59:59,999 with the end resolver and say that it is not allowed to pick experimental, 254 99:59:59,999 --> 99:59:59,999 unless it is explicitly requested to pick experimental, 255 99:59:59,999 --> 99:59:59,999 and from there it only picks what it absolutely needs to solve it. 256 99:59:59,999 --> 99:59:59,999 You basically get the same result with a single suite restriction, 257 99:59:59,999 --> 99:59:59,999 but that requires you to actually code something for it. 258 99:59:59,999 --> 99:59:59,999 And then there's Britney, with the testing migration thing, and that's because 259 99:59:59,999 --> 99:59:59,999 she basically takes things, moves it into a new version of testing, 260 99:59:59,999 --> 99:59:59,999 and tries to test that this is still the same solution. 261 99:59:59,999 --> 99:59:59,999 So she forces it into a single suite currently. 262 99:59:59,999 --> 99:59:59,999 So these are common cases where it happens. 263 99:59:59,999 --> 99:59:59,999 This is not everywhere where it happens, 264 99:59:59,999 --> 99:59:59,999 so the stable upgrades are still funny enough, not a single suite and all that. 265 99:59:59,999 --> 99:59:59,999 It happens because there is only one package and architecture. 266 99:59:59,999 --> 99:59:59,999 Only one instance of package, you only have one version, one architecture that 267 99:59:59,999 --> 99:59:59,999 solves that particular dependency. 268 99:59:59,999 --> 99:59:59,999 Whereas with multiarch you could have, i386 and the amd64 version. 269 99:59:59,999 --> 99:59:59,999 If you do an upgrade you could have the old version and the new version 270 99:59:59,999 --> 99:59:59,999 that may or may not satisfy it. 271 99:59:59,999 --> 99:59:59,999 Also we have this thing where we don't like libraries to have 272 99:59:59,999 --> 99:59:59,999 alternative implementations, it sort of breaks things horribly. 273 99:59:59,999 --> 99:59:59,999 Especially when they are not actually agreeing on the interface they provide. 274 99:59:59,999 --> 99:59:59,999 There's the social aspect of it that we don't really bother having 275 99:59:59,999 --> 99:59:59,999 200 interchangeable implementations of everything. 276 99:59:59,999 --> 99:59:59,999 I think the record is something like 277 99:59:59,999 --> 99:59:59,999 five or six different versions of sendmail, implemenatations. 278 99:59:59,999 --> 99:59:59,999 ?? 279 99:59:59,999 --> 99:59:59,999 I don't think so, but ?? might have. 280 99:59:59,999 --> 99:59:59,999 And even when you do have one of these explosions, 281 99:59:59,999 --> 99:59:59,999 it has to actually hide the breakage beneath one of these choice 282 99:59:59,999 --> 99:59:59,999 explosion alternative explosions for it to be a problem. 283 99:59:59,999 --> 99:59:59,999 You might have aspell over here on the right hand side, 284 99:59:59,999 --> 99:59:59,999 and you have a breakage over here which ?? to find out. 285 99:59:59,999 --> 99:59:59,999 So if you use the resolver to just the first explosion to solve the other part, 286 99:59:59,999 --> 99:59:59,999 it realizes this is not working, I'll bail out. 287 99:59:59,999 --> 99:59:59,999 There's a couple things that we do to make this easier. 288 99:59:59,999 --> 99:59:59,999 The mos interesting thing is of course, can we still make this simple without 289 99:59:59,999 --> 99:59:59,999 the single suite and single architecture restriction. 290 99:59:59,999 --> 99:59:59,999 We can to some extent. 291 99:59:59,999 --> 99:59:59,999 And that's where we move to upgrading, 292 99:59:59,999 --> 99:59:59,999 in deterministic polynomial time. 293 99:59:59,999 --> 99:59:59,999 This is where I spent most of my time working. 294 99:59:59,999 --> 99:59:59,999 So to start with, when we do an upgrade from stable to stable, 295 99:59:59,999 --> 99:59:59,999 and here I'm taking pure stable, no backports and all. 296 99:59:59,999 --> 99:59:59,999 The general recommended way to do it is to replace Wheezy with Jessie, 297 99:59:59,999 --> 99:59:59,999 hit apt-get update, and upgrade afterwards, 298 99:59:59,999 --> 99:59:59,999 and apt replaces all the old packages with the new version, 299 99:59:59,999 --> 99:59:59,999 if there is a new version. 300 99:59:59,999 --> 99:59:59,999 We also want all the packages from Wheezy that did not have any version in Jessie 301 99:59:59,999 --> 99:59:59,999 to be removed, and there might be a new essential package somewhere. 302 99:59:59,999 --> 99:59:59,999 So long story short, upgrade if it's present, remove if it is not present, 303 99:59:59,999 --> 99:59:59,999 and then install the new essential packages, if any. 304 99:59:59,999 --> 99:59:59,999 They don't tend to change that often. 305 99:59:59,999 --> 99:59:59,999 I'll be ignoring the installing part for now. 306 99:59:59,999 --> 99:59:59,999 It is of course valid and interesting, but we'll skip it. 307 99:59:59,999 --> 99:59:59,999 So let's take a simple example, somewhat contrived, somewhat made up the numbers. 308 99:59:59,999 --> 99:59:59,999 They are not too far from reality, but they are not exact either. 309 99:59:59,999 --> 99:59:59,999 We have have a system that we are upgrading from Wheezy to Jessie. 310 99:59:59,999 --> 99:59:59,999 I claim that there are 30 thousand packages in Wheezy, 311 99:59:59,999 --> 99:59:59,999 we've got 35 thousand in Jessie. 312 99:59:59,999 --> 99:59:59,999 The system we are working with has two thousand packages installed. 313 99:59:59,999 --> 99:59:59,999 Mine here has something like 1200. 314 99:59:59,999 --> 99:59:59,999 If you do a simple dpkg -L 315 99:59:59,999 --> 99:59:59,999 and then do a line count, you can see how many you have on your system, 316 99:59:59,999 --> 99:59:59,999 plus-minus five or so, to give you an idea of how many packages 317 99:59:59,999 --> 99:59:59,999 you're actually working with. 318 99:59:59,999 --> 99:59:59,999 We assume that all Wheezy package got a new version in Jessie, 319 99:59:59,999 --> 99:59:59,999 because I'm about to do a pop quiz! 320 99:59:59,999 --> 99:59:59,999 So with these numbers, what is the maximum problem size of this upgrade. 321 99:59:59,999 --> 99:59:59,999 Any takers? 322 99:59:59,999 --> 99:59:59,999 Is it 30? 323 99:59:59,999 --> 99:59:59,999 Anyone for 35? 324 99:59:59,999 --> 99:59:59,999 37? 325 99:59:59,999 --> 99:59:59,999 One for 65. 326 99:59:59,999 --> 99:59:59,999 One for 67.5 thousand. 327 99:59:59,999 --> 99:59:59,999 And who believes that I was an asshole and did not include the right answer? 328 99:59:59,999 --> 99:59:59,999 Oh, so little faith. 329 99:59:59,999 --> 99:59:59,999 I'm glad you believe me. 330 99:59:59,999 --> 99:59:59,999 The right answer is 35 thousand. 331 99:59:59,999 --> 99:59:59,999 The trick is, when we do the upgrade, we replace Wheezy with Jessie, 332 99:59:59,999 --> 99:59:59,999 we do an update, so all the Wheezy packages not installed 333 99:59:59,999 --> 99:59:59,999 on our system disappears. 334 99:59:59,999 --> 99:59:59,999 Then you get all of the Jessie packages, with the assumption that 335 99:59:59,999 --> 99:59:59,999 no Jessie packages have a new version compared to Wheezy. 336 99:59:59,999 --> 99:59:59,999 You've got the two thousand from Wheezy on your system, 337 99:59:59,999 --> 99:59:59,999 plus the 35 thousand packages from Jessie. 338 99:59:59,999 --> 99:59:59,999 So that means your average stable-to-stable upgrade, 339 99:59:59,999 --> 99:59:59,999 if you do it this way, is actually only about 40% of the worst case 340 99:59:59,999 --> 99:59:59,999 that it could've been if you kept the old stable as well. 341 99:59:59,999 --> 99:59:59,999 There's also another awesome feature with removing the old stable repository. 342 99:59:59,999 --> 99:59:59,999 It means that every time you upgrade a package, 343 99:59:59,999 --> 99:59:59,999 your effective problem size decreased by one. 344 99:59:59,999 --> 99:59:59,999 That's not always useful, it's not always awesome, 345 99:59:59,999 --> 99:59:59,999 and it's not always the thing that solves your problem, 346 99:59:59,999 --> 99:59:59,999 but it means that if you can actually make apt upgrade a single thing 347 99:59:59,999 --> 99:59:59,999 every now and then, eventually it might be able to figure out the rest of 348 99:59:59,999 --> 99:59:59,999 the upgrade path, after ?? 200 packages or so. 349 99:59:59,999 --> 99:59:59,999 As we upgrade, we end up to moving towards a single suite situation, 350 99:59:59,999 --> 99:59:59,999 so the more things we upgrade, the more we get to a single suite situation. 351 99:59:59,999 --> 99:59:59,999 Again, most ?? pure stable to stable upgrades. 352 99:59:59,999 --> 99:59:59,999 And of course the right answer would've been 65 thousand packages 353 99:59:59,999 --> 99:59:59,999 if you kept your stable. 354 99:59:59,999 --> 99:59:59,999 So that would've been a possible answer as well. 355 99:59:59,999 --> 99:59:59,999 Now upgrading. 356 99:59:59,999 --> 99:59:59,999 This should be as easy as mark new essential packages for install, 357 99:59:59,999 --> 99:59:59,999 Mark everything that has a new version for upgrade, remove stuff, 358 99:59:59,999 --> 99:59:59,999 and then figure out if something is missing, then install that. 359 99:59:59,999 --> 99:59:59,999 This solves upgrading by installing, so I'm not really interested in doing this, 360 99:59:59,999 --> 99:59:59,999 because installing is hard. 361 99:59:59,999 --> 99:59:59,999 We can do something smarter than that for upgrading. 362 99:59:59,999 --> 99:59:59,999 When we do upgrades in a polynomial deterministic time, 363 99:59:59,999 --> 99:59:59,999 I'm not going to give you a 100% solution, that's not going to work. 364 99:59:59,999 --> 99:59:59,999 If I could, I would be very very rich. 365 99:59:59,999 --> 99:59:59,999 So, I intended this to sort of find some easy low hanging fruits 366 99:59:59,999 --> 99:59:59,999 that can be solved cheaply, and then you can throw a general purpose resolver at 367 99:59:59,999 --> 99:59:59,999 the rest of the problem, after you've done the simple parts. 368 99:59:59,999 --> 99:59:59,999 Or you can feed your solver a partial solution and ask it to fix it up 369 99:59:59,999 --> 99:59:59,999 so it would be a slightly smaller problem size. 370 99:59:59,999 --> 99:59:59,999 It relies on two things. 371 99:59:59,999 --> 99:59:59,999 One, it exploits a lot of common patterns on how we do dependencies. 372 99:59:59,999 --> 99:59:59,999 And the other thing is, if you have a valid system state, 373 99:59:59,999 --> 99:59:59,999 so if you have a system where all of your packages are in fact installable, 374 99:59:59,999 --> 99:59:59,999 which dpkg tends to enforce, 375 99:59:59,999 --> 99:59:59,999 you can verify that is true in polynomial time. 376 99:59:59,999 --> 99:59:59,999 You can also verify a change to it in polynomial time. 377 99:59:59,999 --> 99:59:59,999 So here I'm going to do a bit of theory, and a bit of English inbetween. 378 99:59:59,999 --> 99:59:59,999 We start with a not-broken system, that is we have a set of installable 379 99:59:59,999 --> 99:59:59,999 packages that are mutually co-installable and all that. 380 99:59:59,999 --> 99:59:59,999 We call that 'I'. 381 99:59:59,999 --> 99:59:59,999 Then we can add things to I, we can remove things from this set, we can take something 382 99:59:59,999 --> 99:59:59,999 in the set and replace it with something else, basically add and remove. 383 99:59:59,999 --> 99:59:59,999 That can be done in linear time. 384 99:59:59,999 --> 99:59:59,999 If I take a random package, mash it in, that's constant, maybe, 385 99:59:59,999 --> 99:59:59,999 depending on your implementation. 386 99:59:59,999 --> 99:59:59,999 The theory is we can compute a new set, where we remove something, 387 99:59:59,999 --> 99:59:59,999 add something else, we get a new set. 388 99:59:59,999 --> 99:59:59,999 Then we can verify this new set is indeed still a valid solution in polynomial time. 389 99:59:59,999 --> 99:59:59,999 So, a constant time modification can be verified in polynomial time. 390 99:59:59,999 --> 99:59:59,999 The issue of the day is, randomly feeding packages in and out of the set 391 99:59:59,999 --> 99:59:59,999 is not going to work very well. 392 99:59:59,999 --> 99:59:59,999 So we need something smarter than just random modifications. 393 99:59:59,999 --> 99:59:59,999 Ah well, somebody might actually write something that works with that but anyway. 394 99:59:59,999 --> 99:59:59,999 So the first thing I did, as I mentioned earlier, 395 99:59:59,999 --> 99:59:59,999 we have these exclusively equal dependencies inside binaries, 396 99:59:59,999 --> 99:59:59,999 ?? binaries from the same source. 397 99:59:59,999 --> 99:59:59,999 So I grouped binaries from the same source. 398 99:59:59,999 --> 99:59:59,999 If I was to mark one of them for upgrade, 399 99:59:59,999 --> 99:59:59,999 I would immediately pull the rest of them as well, if they were installed. 400 99:59:59,999 --> 99:59:59,999 It also happens to sort out a very common case of breaks-replaces, 401 99:59:59,999 --> 99:59:59,999 when you move files between two binaries in the same source. 402 99:59:59,999 --> 99:59:59,999 Then you just try to upgrade each of these groups, in some order. 403 99:59:59,999 --> 99:59:59,999 Preferably deterministically, but I didn't get that far. 404 99:59:59,999 --> 99:59:59,999 And if the result of one of these modifications is upgradable, 405 99:59:59,999 --> 99:59:59,999 we can test that fairly cheap, we commit it, and you just rinse-and-repeat 406 99:59:59,999 --> 99:59:59,999 until you run out of things to test, that leads to something new. 407 99:59:59,999 --> 99:59:59,999 This works largely depending on what you have installed, 408 99:59:59,999 --> 99:59:59,999 unsurpisingly. 409 99:59:59,999 --> 99:59:59,999 So if you have very few packages, it may work better than if you have more, 410 99:59:59,999 --> 99:59:59,999 And sometimes it works better when you have more, and that's really exciting. 411 99:59:59,999 --> 99:59:59,999 So I did an example Wheezy installation based on what I have installed 412 99:59:59,999 --> 99:59:59,999 on my machine. 413 99:59:59,999 --> 99:59:59,999 Not quite, but close, well related. 414 99:59:59,999 --> 99:59:59,999 So libc for example was immediately upgradable by this procedure. 415 99:59:59,999 --> 99:59:59,999 As well as some Java package, worked fine. 416 99:59:59,999 --> 99:59:59,999 man-db, eclipse, xterm, then I had to do a couple packages before that, 417 99:59:59,999 --> 99:59:59,999 which were all upgradable. 418 99:59:59,999 --> 99:59:59,999 I think libc was primarily ?? and maybe some libc linux thing. 419 99:59:59,999 --> 99:59:59,999 But there are actually a set of packages you can upgrade with this. 420 99:59:59,999 --> 99:59:59,999 This is of course not testing on configurations. 421 99:59:59,999 --> 99:59:59,999 In this particular configuration, I could upgrade dpkg like this, 422 99:59:59,999 --> 99:59:59,999 but I don't expect you to be able to do that in all cases. 423 99:59:59,999 --> 99:59:59,999 In fact I find it highly unlikely because in Wheezy to Jessie, 424 99:59:59,999 --> 99:59:59,999 dpkg has tons of breaks for all sorts of things, 425 99:59:59,999 --> 99:59:59,999 so if you have the right thing there, you break something else, 426 99:59:59,999 --> 99:59:59,999 and that has to be upgraded at the same time. 427 99:59:59,999 --> 99:59:59,999 And that is sure to build a loop eventually. 428 99:59:59,999 --> 99:59:59,999 So basically what we are exploiting here is the greater-than-equal version, 429 99:59:59,999 --> 99:59:59,999 which is the common way of doing things. 430 99:59:59,999 --> 99:59:59,999 We rebuild a package, it gets the new dependencies on a higher 431 99:59:59,999 --> 99:59:59,999 version of things. 432 99:59:59,999 --> 99:59:59,999 That means the foo in stable ??. 433 99:59:59,999 --> 99:59:59,999 Right so, everything depending on foo is equally happy with the version 434 99:59:59,999 --> 99:59:59,999 in Jessie as well, cause it's a lower bound, and Jessie has a newer version. 435 99:59:59,999 --> 99:59:59,999 So that works very well. 436 99:59:59,999 --> 99:59:59,999 This is the common case for libraries without ABI transitions. 437 99:59:59,999 --> 99:59:59,999 Apparently including libc 438 99:59:59,999 --> 99:59:59,999 and enables you to upgrade from the inside out, from the core of the onion and out, 439 99:59:59,999 --> 99:59:59,999 if you think of the graph as an onion. 440 99:59:59,999 --> 99:59:59,999 The algorithm is fairly dumb, unsurprisingly. 441 99:59:59,999 --> 99:59:59,999 It cannot solve Perl, it can't talk Python because they tend to involve 442 99:59:59,999 --> 99:59:59,999 pulling in a new package so you have to install that and all that. 443 99:59:59,999 --> 99:59:59,999 In Perl it could technically solve if it would merge groups together 444 99:59:59,999 --> 99:59:59,999 into larger groups, but anyway the runtime of this is something like I2 * I+E. 445 99:59:59,999 --> 99:59:59,999 Which is upper bound by I to the power of 4. 446 99:59:59,999 --> 99:59:59,999 So it's fairly cheap, it is fairly polynomial. 447 99:59:59,999 --> 99:59:59,999 It's very trivial to do. 448 99:59:59,999 --> 99:59:59,999 We can do better than that as mentioned, if we can have it figure out that 449 99:59:59,999 --> 99:59:59,999 dpkg needs a new version of libc, we could upgrade those together. 450 99:59:59,999 --> 99:59:59,999 That's not a problem on my system, but it might be on others. 451 99:59:59,999 --> 99:59:59,999 Then there's the part where dpkg is breaking all of this stuff, 452 99:59:59,999 --> 99:59:59,999 so you might have to upgrade it at the same time, or before dpkg. 453 99:59:59,999 --> 99:59:59,999 Then end up with some sort of big loop or three structure of groups you have to 454 99:59:59,999 --> 99:59:59,999 merge together or upgrade together, and it should just work. 455 99:59:59,999 --> 99:59:59,999 There's also the part were we have to handle rename packages. 456 99:59:59,999 --> 99:59:59,999 This comes into ?? basically. 457 99:59:59,999 --> 99:59:59,999 They become an API bump, which will be very useful for stretch due to dcc5. 458 99:59:59,999 --> 99:59:59,999 Basically if you want to do a same restriction on this you could do something 459 99:59:59,999 --> 99:59:59,999 like, we have a clause that is not satisfied, but the thing we need for it 460 99:59:59,999 --> 99:59:59,999 is a package introduced in the new source, then we pull that. 461 99:59:59,999 --> 99:59:59,999 If it has the same dependencies, we've already solved it, that sort of thing. 462 99:59:59,999 --> 99:59:59,999 There's also the part where people rename the package from foo to foo replacement 463 99:59:59,999 --> 99:59:59,999 or something like that. 464 99:59:59,999 --> 99:59:59,999 Again, if it is from the same source, we might do some magic here, 465 99:59:59,999 --> 99:59:59,999 some heuristics here. 466 99:59:59,999 --> 99:59:59,999 And also at some point you will end up needing to install stuff, 467 99:59:59,999 --> 99:59:59,999 that is actually required for Wheezy to Jessie 468 99:59:59,999 --> 99:59:59,999 because it pulls in a new a system. 469 99:59:59,999 --> 99:59:59,999 If it had trivial no-guess solutions, 470 99:59:59,999 --> 99:59:59,999 by which I mean the first thing is of the alternative choices is a real package, 471 99:59:59,999 --> 99:59:59,999 and there is only one of them, 472 99:59:59,999 --> 99:59:59,999 you could solve this automatically in a deterministic way. 473 99:59:59,999 --> 99:59:59,999 And otherwise, give up and try something else. 474 99:59:59,999 --> 99:59:59,999 So this is the basic algorithm, and the basic ideas I've put down, 475 99:59:59,999 --> 99:59:59,999 and I've been fiddling a bit with, and didn't get very far with so far. 476 99:59:59,999 --> 99:59:59,999 Installing part 2. 477 99:59:59,999 --> 99:59:59,999 So after we've learned this with upgrading, we can now, 478 99:59:59,999 --> 99:59:59,999 with single suite single architecture it is trivial. 479 99:59:59,999 --> 99:59:59,999 Upgrading we can usually ?? it, reduce the problem size to some extent. 480 99:59:59,999 --> 99:59:59,999 We can do better on the installing part as well. 481 99:59:59,999 --> 99:59:59,999 We can look at these aspell packages again, 482 99:59:59,999 --> 99:59:59,999 because they are one of the big problems. 483 99:59:59,999 --> 99:59:59,999 We have a couple packages that appear to be almost identical, 484 99:59:59,999 --> 99:59:59,999 and then we also have packages that are clearly superior to others. 485 99:59:59,999 --> 99:59:59,999 Packages that are identical, shows up like this. 486 99:59:59,999 --> 99:59:59,999 So we ?? long enough in the aspell package from Russia and the one 487 99:59:59,999 --> 99:59:59,999 from Ukraine, something very special happens. 488 99:59:59,999 --> 99:59:59,999 You will notice that primary difference between the fields I've selected are 489 99:59:59,999 --> 99:59:59,999 the package name and the version. 490 99:59:59,999 --> 99:59:59,999 Exiting, because that means that if I can install aspell-uk, 491 99:59:59,999 --> 99:59:59,999 that's the only thing on the system, 492 99:59:59,999 --> 99:59:59,999 then the same solution is valid for the Russian one. 493 99:59:59,999 --> 99:59:59,999 So ?? they are different, but semantically they differ only by name and version. 494 99:59:59,999 --> 99:59:59,999 Sometimes you can have version dependencies that always 495 99:59:59,999 --> 99:59:59,999 satisfied ?? for example, then that's the same game actually. 496 99:59:59,999 --> 99:59:59,999 This is a simplified view because in theory you could have a package 497 99:59:59,999 --> 99:59:59,999 that refuses to work with the Ukranian one, but not the Russian one, 498 99:59:59,999 --> 99:59:59,999 and vice versa, so they actually become distinct solutions. 499 99:59:59,999 --> 99:59:59,999 But the general use of aspell dictionaries tends to be any one of them, 500 99:59:59,999 --> 99:59:59,999 and I really don't care which one you pick. 501 99:59:59,999 --> 99:59:59,999 So we find those by saying they have the same effective dependencies clauses. 502 99:59:59,999 --> 99:59:59,999 We can also look at the negative dependencies, that's fairly important too, 503 99:59:59,999 --> 99:59:59,999 they have to have the same there. 504 99:59:59,999 --> 99:59:59,999 Here we need to remember that negative dependencies, we do not think of them 505 99:59:59,999 --> 99:59:59,999 as just directed, they're not. 506 99:59:59,999 --> 99:59:59,999 So it really doesn't matter if foo breaks bar or bar breaks foo, 507 99:59:59,999 --> 99:59:59,999 the important part is they don't work together. 508 99:59:59,999 --> 99:59:59,999 Then we have assert that they satisfy the same clause. 509 99:59:59,999 --> 99:59:59,999 Both of them are a valid solution to the same clause as the other one. 510 99:59:59,999 --> 99:59:59,999 And this becomes interesting when you have dependency ranges, 511 99:59:59,999 --> 99:59:59,999 which is not truly a dependency range, but you have two clauses that says, 512 99:59:59,999 --> 99:59:59,999 I will need foo greater than version 1, and I need it strictly 513 99:59:59,999 --> 99:59:59,999 less than version 2, then one of them contains both of them, 514 99:59:59,999 --> 99:59:59,999 and the other one does not, when you are doing upgrades for example. 515 99:59:59,999 --> 99:59:59,999 It has to satisfy the same clauses as well. 516 99:59:59,999 --> 99:59:59,999 A little trick thing that I discovered a little too late, but it's fun. 517 99:59:59,999 --> 99:59:59,999 These things things can generally be done in polynomial time, that ballpark. 518 99:59:59,999 --> 99:59:59,999 I haven't really computed the actual results. 519 99:59:59,999 --> 99:59:59,999 But we can go further than that because equivalent or identical packages 520 99:59:59,999 --> 99:59:59,999 are easy to find, but there is something better than that. 521 99:59:59,999 --> 99:59:59,999 We can also have the case where one of the packages is clearly better than the other. 522 99:59:59,999 --> 99:59:59,999 For example, it has fewer of the same effective dependencies. 523 99:59:59,999 --> 99:59:59,999 So I need less to solve this package. 524 99:59:59,999 --> 99:59:59,999 It has fewer of the same negative dependencies, fewer things to 525 99:59:59,999 --> 99:59:59,999 not work with this, this is also good. 526 99:59:59,999 --> 99:59:59,999 And finally it solves at least as many dependency clauses as the other one. 527 99:59:59,999 --> 99:59:59,999 This is typically something you find in upgrade scenarios. 528 99:59:59,999 --> 99:59:59,999 So your version and the new version do not have and different dependencies 529 99:59:59,999 --> 99:59:59,999 for example, then don't have any new config relations, but the new version 530 99:59:59,999 --> 99:59:59,999 might solve more ?? dependencies. 531 99:59:59,999 --> 99:59:59,999 It might have more packages it can solve. 532 99:59:59,999 --> 99:59:59,999 In that case, you can just unconditionally take the new version, 533 99:59:59,999 --> 99:59:59,999 because if a solution works for the new version, 534 99:59:59,999 --> 99:59:59,999 it may or may not work for the old one, 535 99:59:59,999 --> 99:59:59,999 but if the solution works with the old one, 536 99:59:59,999 --> 99:59:59,999 it definitely works with the new one. 537 99:59:59,999 --> 99:59:59,999 So that's sort of a freebie. 538 99:59:59,999 --> 99:59:59,999 You're just solving for one of them, if that doesn't work, 539 99:59:59,999 --> 99:59:59,999 you know you don't have to try the other, that's the important part. 540 99:59:59,999 --> 99:59:59,999 And the point with these is basically that identical packages are just 541 99:59:59,999 --> 99:59:59,999 two-way substitutable. 542 99:59:59,999 --> 99:59:59,999 This being a one-way substitution instead. 543 99:59:59,999 --> 99:59:59,999 I haven't figured out how to find these, in general, more effectively than 544 99:59:59,999 --> 99:59:59,999 I do the identical ones. 545 99:59:59,999 --> 99:59:59,999 The identical ones are easy to find, but I would like to get some where 546 99:59:59,999 --> 99:59:59,999 we can find these superior packages trivially, because they are more 547 99:59:59,999 --> 99:59:59,999 useful in general, or there might be more of them as well. 548 99:59:59,999 --> 99:59:59,999 This is as far as I got, including writing slides five minutes before the talk. 549 99:59:59,999 --> 99:59:59,999 So are there any questions? 550 99:59:59,999 --> 99:59:59,999 There's a mic there, or there's a runner. 551 99:59:59,999 --> 99:59:59,999 [question] Thanks for your interesting talk. I have two questions in fact. 552 99:59:59,999 --> 99:59:59,999 The first question concerns your problem of finding equivalent packages. 553 99:59:59,999 --> 99:59:59,999 Are you aware of the ?? tool. 554 99:59:59,999 --> 99:59:59,999 [Neils] I have through the article about it. 555 99:59:59,999 --> 99:59:59,999 The thing I remember with that was that it wrote somewhere that 556 99:59:59,999 --> 99:59:59,999 this should blow up exponentially but it doesn't. 557 99:59:59,999 --> 99:59:59,999 [question] We these are all theoretically hard problems, or course. 558 99:59:59,999 --> 99:59:59,999 [Neils] [laughing] Yes. [question] So this tool in fact 559 99:59:59,999 --> 99:59:59,999 tries to group to find out where to classify packages with respect to 560 99:59:59,999 --> 99:59:59,999 the switch other packages that are coninstallable. 561 99:59:59,999 --> 99:59:59,999 [Neils] Yes. [question] And this works. 562 99:59:59,999 --> 99:59:59,999 One of the steps is to group together packages which behave the same. 563 99:59:59,999 --> 99:59:59,999 With respect to coninstallability with other packages. 564 99:59:59,999 --> 99:59:59,999 So this would be your notion of ?? if I understood it correctly. 565 99:59:59,999 --> 99:59:59,999 [Neils] It sounds like it. [question] At least it is very similar, 566 99:59:59,999 --> 99:59:59,999 so maybe we can do this offline later, but this is definitely something you 567 99:59:59,999 --> 99:59:59,999 should look at I think. 568 99:59:59,999 --> 99:59:59,999 And my other question is about this deterministic ptime installability. 569 99:59:59,999 --> 99:59:59,999 Of course you are well aware of the fact that this is a result which holds under 570 99:59:59,999 --> 99:59:59,999 some restrictions of the problems, of course, 571 99:59:59,999 --> 99:59:59,999 as you don't plan to solve P = NP, so can you, for instance, if you already know 572 99:59:59,999 --> 99:59:59,999 that it has this good complexity when you don't have any alternatives. 573 99:59:59,999 --> 99:59:59,999 As you've already said these are explicitly, not implicitly, 574 99:59:59,999 --> 99:59:59,999 and also when you don't any conflicts, either implicitly or explicitly, 575 99:59:59,999 --> 99:59:59,999 you seem to have a different class. 576 99:59:59,999 --> 99:59:59,999 Can you clarify precisely under which conditions on the dependency graph 577 99:59:59,999 --> 99:59:59,999 you get a ptime complexity of the problem. 578 99:59:59,999 --> 99:59:59,999 [Neils] So actually the negative dependencies are not an issue. 579 99:59:59,999 --> 99:59:59,999 If you take a package and the entire transitive dependencies below it 580 99:59:59,999 --> 99:59:59,999 can always be solved by looking elsewhere in this. 581 99:59:59,999 --> 99:59:59,999 So you have a set of transitive dependencies, and you can always 582 99:59:59,999 --> 99:59:59,999 solve it by following a clause where there's only one option, 583 99:59:59,999 --> 99:59:59,999 or there's a package you've already picked earlier. 584 99:59:59,999 --> 99:59:59,999 Then locally this becomes a simple regular graph, well you can massage it into one. 585 99:59:59,999 --> 99:59:59,999 I don't think it happens often enough for us to rely on it 100%, 586 99:59:59,999 --> 99:59:59,999 I think there may be cases, or corners of the graph where it happens locally, 587 99:59:59,999 --> 99:59:59,999 and then when you leave that part, go further down the stack, 588 99:59:59,999 --> 99:59:59,999 or outside that part, you get back to the original problem, 589 99:59:59,999 --> 99:59:59,999 but will have parts of it that are locally polynomial, 590 99:59:59,999 --> 99:59:59,999 which will be interesting to find. 591 99:59:59,999 --> 99:59:59,999 [question] So whenever the dependency problem fails, 592 99:59:59,999 --> 99:59:59,999 and you have to do it manually, I've always thought it would be optimal 593 99:59:59,999 --> 99:59:59,999 if a user could submit their way to resolve the issue to a public place 594 99:59:59,999 --> 99:59:59,999 and share it so others can see. 595 99:59:59,999 --> 99:59:59,999 And then it would be easier to solve dependency. 596 99:59:59,999 --> 99:59:59,999 You have things like pop-con, where you already upload what package 597 99:59:59,999 --> 99:59:59,999 you've installed, so it would be quite a small step to upload the problems 598 99:59:59,999 --> 99:59:59,999 you have, and the solutions. 599 99:59:59,999 --> 99:59:59,999 So I'd like to hear if you have any views on that. 600 99:59:59,999 --> 99:59:59,999 [Neils] I think it could be interesting to have that, 601 99:59:59,999 --> 99:59:59,999 but mostly as a way of generating test data for solvers. 602 99:59:59,999 --> 99:59:59,999 My take is that as a user, this problem should be simple enough that 603 99:59:59,999 --> 99:59:59,999 we can solve this in the tools, so the user doesn't have to 604 99:59:59,999 --> 99:59:59,999 manually fix the problem. 605 99:59:59,999 --> 99:59:59,999 But I said, feedback getting actual test case data might be useful for that. 606 99:59:59,999 --> 99:59:59,999 There was a question there. 607 99:59:59,999 --> 99:59:59,999 [question] You went off your slides. You say that in order for upgrade 608 99:59:59,999 --> 99:59:59,999 from stable distribution to the next stable one, 609 99:59:59,999 --> 99:59:59,999 it is only considered to be a ?? upgrade when you have only packages 610 99:59:59,999 --> 99:59:59,999 in the new stable, which means that if you have a package which no longer exists 611 99:59:59,999 --> 99:59:59,999 in the nearest table, the package is removed, is that really a good thing? 612 99:59:59,999 --> 99:59:59,999 [Neils] This is the short version of it. 613 99:59:59,999 --> 99:59:59,999 Usually when there is not a new version in the stable release, 614 99:59:59,999 --> 99:59:59,999 it's because we've removed it, and we no longer support it. 615 99:59:59,999 --> 99:59:59,999 I suppose there may be cases where the user might still want your version 616 99:59:59,999 --> 99:59:59,999 of the package to be there, if it's installable and all. 617 99:59:59,999 --> 99:59:59,999 This is one of the cases where we may be debating over what is 618 99:59:59,999 --> 99:59:59,999 practically useful, and what's the theoretical idea of doing this upgrade. 619 99:59:59,999 --> 99:59:59,999 That might collide. 620 99:59:59,999 --> 99:59:59,999 So this was just useful for me to get an idea of where I was headed, 621 99:59:59,999 --> 99:59:59,999 what problem I was trying to solve basically. 622 99:59:59,999 --> 99:59:59,999 [question] So maybe I can just give a remark to your question. 623 99:59:59,999 --> 99:59:59,999 So the problem when you are trying to precompute solutions to 624 99:59:59,999 --> 99:59:59,999 insolubility problems, then it of course also depends on which suite you are 625 99:59:59,999 --> 99:59:59,999 taking the packages you want to install or upgrade, 626 99:59:59,999 --> 99:59:59,999 but also what you have currently installed on your machine. 627 99:59:59,999 --> 99:59:59,999 This of course, there are hardly two users who have installed the same packages 628 99:59:59,999 --> 99:59:59,999 on their machine. 629 99:59:59,999 --> 99:59:59,999 For that reason, it is difficult to precompute this kind of problem, 630 99:59:59,999 --> 99:59:59,999 I would say. 631 99:59:59,999 --> 99:59:59,999 And the other remark is that in practice, even for very hard instances, 632 99:59:59,999 --> 99:59:59,999 like you have many different suites from which you install without having any 633 99:59:59,999 --> 99:59:59,999 pin preferences among them, stuff like that, in practice you get very good 634 99:59:59,999 --> 99:59:59,999 solutions with other solver technologies like ??, or other solvers, 635 99:59:59,999 --> 99:59:59,999 which can be used as external solvers in apt for precisely the same reason 636 99:59:59,999 --> 99:59:59,999 that you have already identified, but in practice, 637 99:59:59,999 --> 99:59:59,999 even theoretically hard problems are often much much easier than what you 638 99:59:59,999 --> 99:59:59,999 could obtain in the worst case. 639 99:59:59,999 --> 99:59:59,999 [Neils] I think we're running out of time. 640 99:59:59,999 --> 99:59:59,999 So if there are no more questions... 641 99:59:59,999 --> 99:59:59,999 last one. 642 99:59:59,999 --> 99:59:59,999 [question] So do you have any preliminary results from timings, 643 99:59:59,999 --> 99:59:59,999 or something like that, at least for your system where you've tried it. 644 99:59:59,999 --> 99:59:59,999 [Neils] No. What I've got is from Britney actually. 645 99:59:59,999 --> 99:59:59,999 I added some steps to Britney, and we got a dataset from Kali Linux. 646 99:59:59,999 --> 99:59:59,999 They were trying to go from their version of Wheezy to Jessie, 647 99:59:59,999 --> 99:59:59,999 which was in August, so that was before Jessie froze. 648 99:59:59,999 --> 99:59:59,999 And basically we were working with 70 thousand nodes, or packages. 649 99:59:59,999 --> 99:59:59,999 So basically, all of Wheezy, plus all of Jessie, had a unique version each. 650 99:59:59,999 --> 99:59:59,999 The interesting this is that, 651 99:59:59,999 --> 99:59:59,999 for example, the average node in that graph had like 3 to 4 dependencies, 652 99:59:59,999 --> 99:59:59,999 The median was 3 dependency clauses, and the average as 4.3. 653 99:59:59,999 --> 99:59:59,999 Each of these clauses had an average of 654 99:59:59,999 --> 99:59:59,999 -ah I can't read this shit anymore- 655 99:59:59,999 --> 99:59:59,999 had a median of 2 possible options for it, and an average of 1.8. 656 99:59:59,999 --> 99:59:59,999 And this is before Britney accounts for excluding things. 657 99:59:59,999 --> 99:59:59,999 So this is the roll graph and then, 658 99:59:59,999 --> 99:59:59,999 based on that she selects something that is in testing, 659 99:59:59,999 --> 99:59:59,999 and everything outside of that she ignores. 660 99:59:59,999 --> 99:59:59,999 So when I have two options, a lot of the time she would throw out one of them, 661 99:59:59,999 --> 99:59:59,999 because that one is not in testing. 662 99:59:59,999 --> 99:59:59,999 So that's the number's I have. 663 99:59:59,999 --> 99:59:59,999 I have more if you want to see them but, 664 99:59:59,999 --> 99:59:59,999 it's not very detailed, it's not to the point where I think I can predict 665 99:59:59,999 --> 99:59:59,999 something useful from it. 666 99:59:59,999 --> 99:59:59,999 [Moderator] So okay Neils, thank you for your talk, 667 99:59:59,999 --> 99:59:59,999 and this was a really good insight into the problems of dependency resolving. 668 99:59:59,999 --> 99:59:59,999 Thank you very much. 669 99:59:59,999 --> 99:59:59,999 [appluase]