WEBVTT 00:00:01.014 --> 00:00:04.185 (lift) 00:00:04.185 --> 00:00:07.244 (lift 12 - Feb 24 2012 - Geneva) 00:00:07.244 --> 00:00:10.044 (Rufus Pollock - Stories) 00:00:10.044 --> 00:00:11.788 [Rufus Pollock] Just to say for those of you who don't know: 00:00:11.788 --> 00:00:13.666 the Open Knowledge Foundation is a not-profit -- not for profit 00:00:13.666 --> 00:00:15.611 founded in 2004 00:00:15.611 --> 00:00:17.865 and which builds tools and communities 00:00:17.865 --> 00:00:20.934 to create, use and share open information 00:00:20.934 --> 00:00:24.585 and that's information that anyone can use, reuse and redistribute. 00:00:24.585 --> 00:00:28.321 And as such, we've been working on open data for quite a long time 00:00:28.321 --> 00:00:30.011 since we started in 2004. 00:00:30.011 --> 00:00:34.817 And today, I want to start the story by going back in time 5000 years, 00:00:34.817 --> 00:00:37.610 to ancient Mesopotamia. 00:00:37.610 --> 00:00:41.393 There, between the Tigris and the Euphrates rivers, 00:00:42.069 --> 00:00:44.390 flourished the Sumerian civilization. 00:00:44.390 --> 00:00:47.298 And they were confronted by a problem. 00:00:47.298 --> 00:00:50.269 They were confronted by the limitations of human memory 00:00:50.899 --> 00:00:54.338 in the recording of taxes, food and other goods. 00:00:54.338 --> 00:00:59.642 And those ancient civil servants and businessmen hit on a novel solution: 00:01:00.380 --> 00:01:04.666 What they decided to do was they would start counting things with small clay chits, 00:01:04.666 --> 00:01:09.234 which they would bake inside of a clay -- a little clay box 00:01:09.234 --> 00:01:12.617 and then mark, on the outside of that box, what they were counting. 00:01:12.617 --> 00:01:15.303 You know, was it grain, was it tax payments, whatever. 
00:01:16.150 --> 00:01:19.786 And so, born out of necessity for a state and a society, 00:01:20.632 --> 00:01:25.773 came one of the great information technology revolutions of all time: writing. 00:01:25.773 --> 00:01:28.172 The Sumerians invented writing via cuneiform. 00:01:28.910 --> 00:01:34.039 And if we fast-forward from that a few thousand years, we come to the UK census. 00:01:34.039 --> 00:01:37.577 Again, it's always interesting that states, governments are often at the forefront 00:01:37.577 --> 00:01:42.681 of at least driving information technology and information systems innovations. 00:01:42.681 --> 00:01:44.654 The UK census: again, the state, 00:01:44.654 --> 00:01:46.565 this is during the Napoleonic Wars, 00:01:46.565 --> 00:01:48.601 desired to count the population more accurately: 00:01:48.601 --> 00:01:51.995 and we have the first UK census in 1801. 00:01:51.995 --> 00:01:56.189 And in the US, they also had censuses, in fact starting in 1790. 00:01:56.819 --> 00:01:59.383 And one of the problems encountered in the 1880 census 00:01:59.383 --> 00:02:01.592 was they tabulated the census by hand. 00:02:02.345 --> 00:02:05.699 And by the 1880 census, it was taking seven years 00:02:05.699 --> 00:02:06.822 to tabulate the census. 00:02:06.822 --> 00:02:10.241 So after it got taken in 1880, it wasn't until 1887 00:02:10.241 --> 00:02:12.892 they actually had any data they could use. 00:02:12.892 --> 00:02:16.004 And they calculated that for the next census in 1890, 00:02:16.004 --> 00:02:18.164 they wouldn't be finished by 1900. 00:02:18.164 --> 00:02:21.936 They still wouldn't have the results of the census by the time they started the next one. 00:02:21.936 --> 00:02:24.233 They had a crisis of information technology. 00:02:24.233 --> 00:02:26.979 And what they went and did is they commissioned Herman Hollerith 00:02:26.979 --> 00:02:29.747 to build the first automatic tabulator. 
00:02:29.747 --> 00:02:32.835 And for those of you who know your company history, of course, 00:02:32.835 --> 00:02:34.513 Herman Hollerith's company went on 00:02:34.513 --> 00:02:35.899 to be one of the founders, if you like, 00:02:35.899 --> 00:02:38.808 one of the companies that came and created IBM. 00:02:38.808 --> 00:02:42.258 And IBM, by the sixties, were building this 00:02:42.258 --> 00:02:44.374 -- they replaced those hand -- 00:02:44.374 --> 00:02:45.905 those kind of wooden, mechanical tabulators 00:02:45.905 --> 00:02:48.524 with this stuff: digital tabulators, 00:02:48.524 --> 00:02:50.375 the modern computer of this age. 00:02:50.375 --> 00:02:52.610 And again, much of this -- I don't know if you guys know -- 00:02:52.610 --> 00:02:53.705 IBM would have gone bankrupt 00:02:53.705 --> 00:02:58.477 if it hadn't been for Franklin Roosevelt passing the Social Security Act in the States, 00:02:58.477 --> 00:03:01.132 which necessitated a huge amount of new tabulation. 00:03:01.132 --> 00:03:04.629 So, again, a lot of innovation in this space came out of government need 00:03:04.629 --> 00:03:06.370 and also, of course, the nuclear program, 00:03:06.370 --> 00:03:08.641 the other great needer of computational power. 00:03:09.317 --> 00:03:11.899 And today, today, 00:03:12.623 --> 00:03:15.485 we find ourselves again in the midst of a revolution. 00:03:16.438 --> 00:03:19.331 It's a revolution driven by two needs: 00:03:19.331 --> 00:03:22.027 ones that have been the same throughout history as I've just shown, 00:03:22.027 --> 00:03:23.886 information complexity, which is the necessity, 00:03:24.456 --> 00:03:27.575 and information technology, which is the opportunity. 00:03:28.544 --> 00:03:32.702 And what we're doing in this case is a policy innovation, if you like. 00:03:32.702 --> 00:03:36.468 We are innovating by opening up information. 
00:03:37.052 --> 00:03:39.436 So just take the obvious example, government, 00:03:39.436 --> 00:03:41.097 as I said, often the innovator. 00:03:41.097 --> 00:03:43.308 In the last -- 3 years ago, you go back 3 years, 00:03:43.308 --> 00:03:45.829 there's almost no open government data initiatives 00:03:45.829 --> 00:03:46.688 in the world. 00:03:46.688 --> 00:03:48.442 Today there are dozens. 00:03:48.442 --> 00:03:51.162 The UK, the US, Finland, Kenya, The Netherlands, 00:03:51.162 --> 00:03:53.049 and there's new ones almost every week. 00:03:53.049 --> 00:03:57.407 There's been a launch of an official kind of movement as a part of the UN 00:03:57.407 --> 00:04:00.097 called the Open Government Partnership in which countries sign up, 00:04:00.097 --> 00:04:02.433 and among other things, they open up their data. 00:04:03.002 --> 00:04:05.325 And of course, it's been, in the UK and other countries, 00:04:05.325 --> 00:04:06.562 Tim Berners-Lee has been involved. 00:04:06.562 --> 00:04:09.106 I've helped advise the government around this in the UK. 00:04:09.106 --> 00:04:11.221 But it's not just government, it's also companies. 00:04:11.651 --> 00:04:13.982 Companies are opening up data. 00:04:13.982 --> 00:04:15.690 Very interestingly, last year, 00:04:15.690 --> 00:04:19.092 Nike started an open data initiative there 00:04:19.092 --> 00:04:21.372 to open up supply chain and sustainability data, 00:04:21.372 --> 00:04:23.931 for themselves and also for their suppliers, 00:04:23.931 --> 00:04:26.800 which I think is a very interesting change. 00:04:26.800 --> 00:04:28.004 And it's also communities. 00:04:28.004 --> 00:04:29.715 Often, in fact, back there in the beginning, 00:04:29.715 --> 00:04:31.927 this incredible map that you saw in an earlier slide, 00:04:31.927 --> 00:04:35.002 is a OpenStreetMap activity, around the world. 00:04:35.002 --> 00:04:38.073 People adding to this crowd-built map of the world. 
00:04:38.073 --> 00:04:41.074 And in the last 6 years, OpenStreetMap, 00:04:41.074 --> 00:04:42.445 from a bottom-up community, 00:04:42.445 --> 00:04:44.435 have built a complete, comprehensive, 00:04:44.435 --> 00:04:47.918 map of the world, of fully open data. 00:04:48.872 --> 00:04:50.898 So I've just gone on about Open Data, 00:04:50.898 --> 00:04:52.766 and one thing I'm aware of, of this audience, 00:04:52.766 --> 00:04:54.035 is you might not all know what it is. 00:04:54.035 --> 00:04:59.152 So I'm going to take a brief moment, a brief moment, to say what it is. 00:04:59.152 --> 00:05:01.493 What does it mean when I say 'open'? 00:05:01.493 --> 00:05:05.557 And was it, you know, what's different from anything else? What's different from simply public data? 00:05:05.557 --> 00:05:07.083 So there's actually a definition, 00:05:07.083 --> 00:05:10.177 a definition we the Open Knowledge Foundation helped write, it's very simple. 00:05:10.177 --> 00:05:13.671 In a nutshell, a piece of information, a piece of data, 00:05:13.671 --> 00:05:18.384 is open if anyone is free to use, reuse, 00:05:18.384 --> 00:05:20.797 and redistribute it, subject only at most 00:05:20.797 --> 00:05:22.891 to a requirement to attribute and share alike. 00:05:23.214 --> 00:05:25.784 And anyone means anyone! 00:05:25.784 --> 00:05:28.055 It doesn't mean -- there can't be any commercial restrictions. 00:05:28.055 --> 00:05:32.262 You can't say: hey, here's this data, but only people using it for non-commercial purposes. 00:05:32.262 --> 00:05:34.849 Or only people working in education. 00:05:34.849 --> 00:05:38.051 Or only people living in the developing world, or the developed world. 00:05:38.051 --> 00:05:40.743 There can't be any restrictions like that. 00:05:41.343 --> 00:05:43.189 And there's a reason for this, by the way, 00:05:43.209 --> 00:05:48.615 and it isn't just because one's obsessed about if you like, trademarking an attractive term. 
00:05:49.315 --> 00:05:51.081 It's because it's about interoperability. 00:05:51.291 --> 00:05:54.617 One of my experiences at this conference, which I remember from previous trips to Geneva, 00:05:54.627 --> 00:05:56.974 is I've been unable to plug in my laptop! 00:05:56.974 --> 00:06:02.048 Even though I have a French adaptor, in fact, these wonderful Swiss plugs here, are, you know, 00:06:02.048 --> 00:06:03.582 these wonderful, small octagonal shape. 00:06:03.582 --> 00:06:05.379 And even with my adaptor I can't plug in. 00:06:05.379 --> 00:06:07.347 Right? And it's called interoperability. 00:06:07.347 --> 00:06:10.929 When we travel around to different countries, our power adaptors don't actually fit in. 00:06:10.929 --> 00:06:12.581 We have to buy something. 00:06:12.581 --> 00:06:16.755 And the point about this definition, and the point about caring about Open Data, 00:06:16.755 --> 00:06:18.317 is, it's about interoperability. 00:06:18.317 --> 00:06:22.112 The dream of Open Data is interoperability. 00:06:22.112 --> 00:06:26.058 Of seamlessly being able to share and interweave information. 00:06:27.898 --> 00:06:31.704 And if every time I get information from two different people I have to consult a lawyer, 00:06:31.704 --> 00:06:35.300 I have to work out whether I'm allowed to do it, whether I'm allowed to put these things together, 00:06:35.300 --> 00:06:37.634 we lose that dream, that dream is shattered. 00:06:37.634 --> 00:06:42.166 And the key point is, this definition, and those conditions, ensure interoperability. 00:06:42.166 --> 00:06:45.744 If you comply with them, we know that any piece of info, of Open Data, 00:06:45.744 --> 00:06:47.880 will work with any other piece of Open Data. 00:06:48.681 --> 00:06:52.932 And also, it's worth saying for a quick moment, what kind of data, and to emphasize a point. 00:06:52.932 --> 00:06:55.985 Just to foreclose those kinds of questions, otherwise I always get asked. 
00:06:55.985 --> 00:06:58.809 When we talk about opening up data, in general, 00:06:58.809 --> 00:07:01.026 we're not talking about personal data. 00:07:01.026 --> 00:07:04.161 We're not talking about opening up your private health records 00:07:04.161 --> 00:07:08.302 or opening up your personal tax information. 00:07:08.302 --> 00:07:11.267 We're talking about information that is non-personal in nature. 00:07:11.267 --> 00:07:15.667 And for the government for example: transport, geodata, statistics, electoral, legal. 00:07:15.667 --> 00:07:19.510 Stuff that the UK has, in fact, for example been opening up over the last few years. 00:07:19.510 --> 00:07:23.381 This financial information, on government spending, this information on health outcomes, 00:07:23.381 --> 00:07:28.625 on prescriptions, this information on educational outcomes, this information on the law. 00:07:28.625 --> 00:07:30.765 This information -- statistical information. 00:07:30.785 --> 00:07:32.691 That's the kind of thing that we're talking about. 00:07:34.186 --> 00:07:37.393 Now, I want to say, it's in this story, we have this story of over time. 00:07:37.393 --> 00:07:38.996 But why governments are doing it now? 00:07:39.596 --> 00:07:40.598 And why Open Data? 00:07:41.268 --> 00:07:43.930 So, okay, for thousands of years, governments innovate, 00:07:43.930 --> 00:07:47.274 but why do they innovate at this particular moment and in this way? 00:07:47.274 --> 00:07:51.976 So I want to start here with a quick story, a story of medicine gone wrong. 00:07:52.006 --> 00:07:54.484 It is from a great book by a guy called Stephen Klaidman. 00:07:54.484 --> 00:07:55.918 It's in fact one of the things 00:07:55.918 --> 00:07:57.781 that made me think quite deeply about this: 00:07:57.781 --> 00:07:59.852 why I was interested in Open Data. 00:08:01.172 --> 00:08:02.917 In that picture there, you can see 00:08:02.917 --> 00:08:05.726 what was the Redding Medical Centre in Northern California. 
00:08:05.726 --> 00:08:10.471 There, in 2002, in the Summer of 2002, John Corapi, 00:08:11.231 --> 00:08:12.401 in typical American style, 00:08:12.401 --> 00:08:15.374 an ex-accountant from Vegas turned Catholic priest, 00:08:15.374 --> 00:08:17.243 [scattered laughter] 00:08:17.783 --> 00:08:22.274 ...arrived at the Redding Medical Centre having been referred by his doctor for having chest pains. 00:08:22.784 --> 00:08:28.419 He had a cardiogram by the local cardiologist and was told that he needed an immediate heart bypass, 00:08:28.419 --> 00:08:31.484 that he was at serious risk, and that he should come back later that day, 00:08:31.484 --> 00:08:34.514 or at the latest, tomorrow, to have open heart surgery. 00:08:35.764 --> 00:08:37.985 Rather shocked, and dazed by this news, 00:08:37.985 --> 00:08:41.225 he returned home to pack his bags in order to return to hospital. 00:08:41.225 --> 00:08:45.102 He called up his best friend, who was still an accountant in Vegas, 00:08:46.032 --> 00:08:52.568 whose partner was a hospital nurse, and who advised him that he should get a second opinion, 00:08:52.568 --> 00:08:55.904 that, according to his partner, it was not, you know, 00:08:55.904 --> 00:08:58.981 it was very unusual that you would need to have immediate open heart surgery, 00:08:58.981 --> 00:09:00.235 and that he should get a second opinion. 00:09:00.975 --> 00:09:04.507 Rather doubtful about this, because he was extremely worried, he did get on a plane. 00:09:04.507 --> 00:09:07.919 He went to Vegas, he got seen by another specialist... 00:09:07.919 --> 00:09:11.785 who, to his complete surprise, told him there was nothing wrong with his heart. 00:09:12.805 --> 00:09:15.289 He saw another specialist, just to make sure. 00:09:15.289 --> 00:09:18.563 They told him also, there was nothing wrong with his heart. 00:09:19.343 --> 00:09:25.067 Relieved, and rather, you know, happy, he returned home and just wanted to really forget about it. 
00:09:25.067 --> 00:09:27.389 But his friend said: "No, what's going on here? Something's wrong". 00:09:27.389 --> 00:09:32.613 And they went in to see the CEO of the Tenet Healthcare, the people running this hospital 00:09:32.613 --> 00:09:35.654 (which, by the way, was a private hospital), and said: 00:09:35.654 --> 00:09:38.614 "Look, something's wrong, what's going on, what are you going to do about this?" 00:09:38.614 --> 00:09:40.256 And basically they were told: not very much. 00:09:40.256 --> 00:09:44.581 You know, mistakes get made, it's bad luck, don't worry about it, 00:09:44.581 --> 00:09:46.233 we'll look into it, but thank you very much. 00:09:46.763 --> 00:09:51.631 They weren't convinced by this, and eventually they decided to contact the FBI. 00:09:51.631 --> 00:09:53.826 The reason they contacted the FBI, by the way, 00:09:53.826 --> 00:09:56.401 is it's a private healthcare provider in the United States, 00:09:56.401 --> 00:10:00.476 they provide Medicare provision of healthcare to the Federal Government. 00:10:00.476 --> 00:10:04.202 So, if the Federal Government is getting defrauded, the FBI can get involved. 00:10:04.982 --> 00:10:06.850 The FBI started investigating. 00:10:08.281 --> 00:10:12.081 Eventually it turned out, that hundreds, probably thousands of people 00:10:12.081 --> 00:10:15.854 over a ten or longer year period, had been operated on unnecessarily. 00:10:16.704 --> 00:10:19.561 Most of them had had serious procedures performed on them, 00:10:19.561 --> 00:10:22.189 open heart surgery, some had died as a result. 00:10:22.189 --> 00:10:24.325 Obviously it's quite a serious operation. 00:10:24.325 --> 00:10:27.437 Some people had basically been condemned to a lifetime of pain. 
00:10:27.437 --> 00:10:31.437 One of the most traumatic examples was a 36-year-old, he had been cut open, 00:10:31.437 --> 00:10:33.000 which is obviously what happens in open heart surgery, 00:10:33.000 --> 00:10:35.369 and his chest had never knitted back together correctly. 00:10:35.999 --> 00:10:38.125 Basically, he would be in pain for the rest of his life. 00:10:39.395 --> 00:10:43.000 So, hundreds, thousands of people had been harmed. 00:10:43.610 --> 00:10:45.968 One of the interesting things was that in this community 00:10:45.968 --> 00:10:48.159 there was already some suspicion, there were anecdotes. 00:10:48.159 --> 00:10:50.853 I mean, one of the ones I really liked from this book was the story that went: 00:10:50.853 --> 00:10:56.021 'Don't get a flat tyre outside of Redding Medical Centre because you'll end up with a heart bypass.' 00:10:56.021 --> 00:10:57.303 [scattered laughter] 00:10:57.303 --> 00:11:00.258 You know, but the thing was, there was no data. 00:11:00.728 --> 00:11:04.563 People were you know, a bit suspicious, but it was among doctors who knew, 00:11:04.563 --> 00:11:06.867 you know, in the community, and who wants to doubt it. 00:11:06.867 --> 00:11:12.171 And guess what? Also, Redding Medical Centre had one of the best mortality rates, 00:11:13.001 --> 00:11:15.350 for cardiac procedures in the United States, 00:11:15.350 --> 00:11:19.609 because if you operate on healthy people, you have a good mortality rate! 
00:11:19.619 --> 00:11:21.129 [scattered laughter] 00:11:21.129 --> 00:11:23.390 So, the other thing, though, 00:11:23.390 --> 00:11:25.452 and this is the point that comes to Open Data for me 00:11:25.452 --> 00:11:28.722 the other red flag if you had been looking at the data, 00:11:28.741 --> 00:11:31.927 was these two things: one is incredibly low mortality rate, 00:11:31.927 --> 00:11:35.351 and (B) that it had almost the highest number of procedures 00:11:35.351 --> 00:11:37.464 for the population that it covered in the United States, 00:11:38.144 --> 00:11:39.634 which should be a red flag, right? 00:11:39.634 --> 00:11:42.618 Because, one, it's just a massive outlier on that basis, and also, 00:11:42.618 --> 00:11:45.815 the more people you should be operating on, the more you're doing marginal cases, 00:11:45.815 --> 00:11:49.450 the higher should be your mortality rate unless something very odd is going on. 00:11:50.030 --> 00:11:53.015 The thought was: what if people had been looking at this data? 00:11:53.015 --> 00:11:56.045 What if we'd - if this data had been open and public, 00:11:56.045 --> 00:11:59.517 and not maybe just for particular researchers to look at or the government? 00:11:59.927 --> 00:12:04.129 And it kind of reminded me of a phrase that's very famous in Open Source software, which is: 00:12:04.129 --> 00:12:05.965 "To many eyes, all bugs are shallow". 00:12:05.965 --> 00:12:10.504 What's great about Open Source software is lots of people can look at it, lots of people can fix it. 00:12:10.504 --> 00:12:14.730 And for me, what this was saying was: to many eyes, all anomalies are noticeable. 00:12:14.730 --> 00:12:16.679 It's somewhat of an exaggeration, 00:12:16.679 --> 00:12:18.908 but what happens if rather than ten or twenty people 00:12:18.908 --> 00:12:22.077 who worked in monitoring Medicare provision in the US government, 00:12:22.077 --> 00:12:23.877 we'd had thousands or millions of people? 
00:12:23.877 --> 00:12:26.919 If the local journalists or citizens, who had suspicions, 00:12:26.919 --> 00:12:28.747 had been able to go and look at that data and say: 00:12:28.747 --> 00:12:32.485 "Whoa! What's going on here? This isn't just anecdotes, there's some data". 00:12:34.205 --> 00:12:40.225 And so, and it's not just then, about kind of spotting healthcare errors, or issues, or risks, 00:12:40.225 --> 00:12:42.415 it's also about things like apps and services 00:12:42.415 --> 00:12:43.857 that you can build with Open Data. 00:12:43.867 --> 00:12:46.667 This is a great app built by mySociety in the UK, 00:12:46.667 --> 00:12:47.640 called Mapumental. 00:12:47.640 --> 00:12:48.974 And the question is, I don't know if people know, 00:12:48.974 --> 00:12:50.646 London house prices are very expensive, 00:12:50.646 --> 00:12:52.510 I don't know whether they rival Geneva's, 00:12:52.510 --> 00:12:55.238 but they're, it's a pretty difficult thing. 00:12:55.238 --> 00:12:57.978 And one of the questions was, if I have to work somewhere, 00:12:57.978 --> 00:13:01.752 and I want to know where I can live, and afford, 00:13:01.752 --> 00:13:05.757 and I can commute to work in a certain time, and it's not too ugly, 00:13:05.757 --> 00:13:07.583 this is what this app does. 00:13:07.583 --> 00:13:11.200 You can choose the price, you can say where you're going to work, 00:13:11.200 --> 00:13:14.195 you can choose the commute time, and you can choose the scenicness. 00:13:14.195 --> 00:13:17.167 And it will show you, on this map, where you can live. 00:13:18.427 --> 00:13:20.796 Another example, more about transparency, 00:13:20.796 --> 00:13:22.746 is a project we did called "Where Does My Money Go?". 
00:13:23.976 --> 00:13:25.406 It's an interactive version, 00:13:25.406 --> 00:13:26.211 you can kind of draw it out, 00:13:26.211 --> 00:13:29.114 so what it starts with, is one, is it tells you what your tax is, 00:13:29.114 --> 00:13:30.821 something that most people often don't know, 00:13:30.821 --> 00:13:33.668 and it will tell you how much you're paying each day 00:13:33.668 --> 00:13:36.254 to a particular area of society. 00:13:36.254 --> 00:13:37.328 And the dream for me, 00:13:37.328 --> 00:13:39.127 a dream that we're on the way to realising, 00:13:39.127 --> 00:13:42.817 is in this visualisation, you can drill down into areas. 00:13:42.817 --> 00:13:45.092 And my dream is to keep drilling down. 00:13:45.472 --> 00:13:47.633 So depending on what day we have, I want to go down, 00:13:47.633 --> 00:13:49.628 right down through those bubbles, step by step, 00:13:49.628 --> 00:13:52.403 until I see the money spent on street lights on my street, 00:13:52.403 --> 00:13:55.270 on filling in potholes, on collecting my rubbish. 00:13:56.190 --> 00:13:57.138 And for two reasons: 00:13:57.138 --> 00:13:59.704 One, obviously there's a question, particularly in some countries, 00:13:59.704 --> 00:14:01.016 of inefficiency or corruption, 00:14:01.436 --> 00:14:05.176 but also, just because most of us don't feel very happy about paying tax. 00:14:06.066 --> 00:14:08.157 It's not one of those things people welcome! 00:14:08.157 --> 00:14:09.817 But it's something that we should. 00:14:09.817 --> 00:14:11.960 Government does an awful lot for us, 00:14:11.960 --> 00:14:14.287 and having a better sense of where it's going 00:14:14.287 --> 00:14:17.120 could make us feel an awful lot better about paying that tax. 00:14:17.120 --> 00:14:24.887 In the way that when we go to a restaurant, we don't, when we get the bill we don't necessarily feel bad. We feel "Wow, that was a great meal. That was worth it." 00:14:24.887 --> 00:14:31.240 But why Open? 
I've given you examples, and you know, we see a lot of apps and services. Why is Open relevant here? 00:14:31.240 --> 00:14:42.055 This goes back to what I said about the information technology, the revolution. So it's the challenge and the opportunity. It's the challenge that we see today, is exploding informational complexity. 00:14:42.055 --> 00:14:56.434 I mean, another great story, in the 1820s, all bank clearing in the largest financial centre in the world was done in a single room, where one person from each bank gathered and they'd go round the room pulling out gold, and swapping it around, between different banks. 00:14:56.434 --> 00:14:59.256 And that's how they did bank clearing. 00:14:59.256 --> 00:15:10.507 Today we have billions of transactions a minute. And the way we as humans deal with complexity is by dividing and conquering it. We split it up into manageable chunks that we deal with. 00:15:10.507 --> 00:15:22.079 The other answer, and this answer's particularly relevant about Open Data, is information technology. Today, a smartphone has as much computing power as the system that ran the Apollo moon landings. 00:15:22.079 --> 00:15:39.967 And in an even better example of storage, one terabyte of storage today is a hundred dollars. In 1994, this would have cost 400,000 dollars. I can have every financial transaction the UK government, or the US government made last year, or even for the last decade, on my laptop. 00:15:39.967 --> 00:15:44.490 That was not possible for an average citizen a decade ago. 00:15:44.490 --> 00:15:51.778 So it's mass participation information access, processing, and production. It's decentralisation. And the claim here is that openness is key. 00:15:51.778 --> 00:16:13.208 It's because it's about scaling. What we are doing is weaving data together. Like I said, we deal with complexity by splitting it up, we componentise, we split data up into blocks that we recombine. 
But if we are going to recombine information, we need to put Humpty Dumpty back together again, it won't work most of the time if it is closed. 00:16:13.208 --> 00:16:17.292 We need Open Data to scale and to componentise. 00:16:17.292 --> 00:16:31.262 And it's a point just to make here in this respect, that you might think: "Well you know, you're talking about Open Data, you know, this could be true of anything! Why don't we have like, Open Cars, and Open Shoes, and you know, why don't we just share everything, man! It would be so beautiful!". 00:16:31.262 --> 00:16:38.750 Right? And the sad thing is, is that that hasn't generally worked as a way of organising most production in our society. 00:16:38.750 --> 00:16:48.843 Instead, we have private property, and so we don't do that much openness relatively. But there's something different about digital information. We all know it, but it's worth emphasising, which is, it's very cheaply copied. 00:16:48.843 --> 00:16:56.689 I mean, give me a copy of your data isn't a problem if you're the government. Give me a copy of your car, or your house, or whatever, is. 00:16:56.689 --> 00:17:05.488 And it's also about innovation here. I mean, in a way it's almost the purest aspect of markets. Markets are about moving things to the person who could use them most best. 00:17:05.488 --> 00:17:10.900 And that's true of data. The best thing to do with your data will likely be thought of by someone else. 00:17:10.900 --> 00:17:22.968 And vice versa! You will think of the best thing to do with someone else's data. And Open Data allows us, in the most frictionless, easiest way, to move data to where it can be most optimally used. 00:17:22.968 --> 00:17:24.415 Particularly, if you're government. 00:17:24.415 --> 00:17:29.277 So in short, it's about better understanding. It's about better government. It's about better research. It's about better economy. 
00:17:29.277 --> 00:17:37.240 And something also for companies and governments, I think it's about better engagement. It's about a closer relationship, sometimes, between your citizens and you as the government. 00:17:37.240 --> 00:17:40.600 Between you, even possibly, as a company, and your users. 00:17:40.600 --> 00:17:43.972 So I wanted to kind of finish here by saying where we're going. 00:17:43.972 --> 00:17:50.017 The story was, of this talk, was, you know, where are we? Why have we got here? And where are we going? 00:17:50.017 --> 00:18:03.287 So one answer is just more use. So right now, I just said at the beginning, Open Data is relatively young. This vast outpouring, for example, of government data, that anyone can freely use, reuse, and redistribute, is really new. Even if it's done three years ago. 00:18:03.287 --> 00:18:08.959 For example, in the UK, much of the most useful data that could be released has only been released in the last six months or a year. 00:18:08.959 --> 00:18:19.160 You want prescription data? Are you a pharmaceutical company? And you want to know what kind of prescription habits are going on in the UK? I would emphasise: at an anonymised or somewhat aggregate level. 00:18:19.160 --> 00:18:28.420 Do you want to know about what crimes are going on? Are you building a real estate website and you want data on environment, or you want data on unemployment, or other information about where properties are situated? 00:18:28.420 --> 00:18:35.351 You can now get that. So I think there's going to be a lot more use from business. There'll be a lot more use from everyone. 00:18:35.351 --> 00:18:38.446 But I think particularly business is going to wake up to the opportunities here. 00:18:38.446 --> 00:18:48.666 I think it's also going to lead to more data. One is, government is going to be more data. I think also businesses are going to realise, and communities, that they want to share back some of that data, some of the data they have. 
00:18:48.666 --> 00:19:00.990 It's not going to be their kind of crown jewels, and it's not going to, it's often going to start out with data that's not core to their business. Right? It's kind of like Nike, they realised that by opening and sharing data, they can scale in a way they can't on their own. 00:19:00.990 --> 00:19:14.017 And does it mean that richer data, going back -- how could I leave out Hegel and Marx in a talk like this -- "Quantity changes quality" as Hegel told us. And more data, going back to that woven ball, more data actually means better data. 00:19:14.017 --> 00:19:24.040 It means richer data, it's a qualitative difference in what we can do. Geodata on its own isn't that useful. Transport data on its own isn't useful. Geodata plus transport data is useful! 00:19:24.040 --> 00:19:31.912 And we're going to be seeing data refining. Data is the new oil, right? So, we're going to refine it. And that's going to be a big business. Higher quality data. 00:19:31.912 --> 00:19:42.440 And I want to leave you with a couple of thoughts. So, one is: Some people say, "well, okay, but, you know, selling data is big business". And it is, but going forward in some of these things like software, data is going to be a platform. 00:19:42.440 --> 00:19:47.548 It's not a commodity. Businesses built purely on selling data, I just don't think are going to make it. 00:19:47.548 --> 00:19:52.067 You need to be building on your data, not attempting to purely sell it. 00:19:52.067 --> 00:20:02.223 And the other answer is to be modest. So I said: where are we going? I don't know if people know, and this takes us back to an earlier age, an age of electricity and steam, of Faraday. 
00:20:02.223 --> 00:20:16.694 So he's demonstrating electricity at the Royal Society, and Gladstone, the future Prime Minister of England, sees him do this stuff, you know, the frog legs move, and Gladstone's like: "well, I mean, this is party trick, Faraday, it's great, but, what's really, you know, what's electricity going to amount to?" 00:20:16.694 --> 00:20:20.832 And Faraday says to him: "Well, what's the use of a baby?" 00:20:20.832 --> 00:20:23.739 You know, a baby when it's young is not very useful. 00:20:23.739 --> 00:20:25.718 [scattered laughter] 00:20:25.718 --> 00:20:36.165 But it grows up into something! And that is where we are going today. We are the beginning of the Open Data journey. And partly is, we don't know what it's going to grow up into. 00:20:36.165 --> 00:20:37.774 Thank you very much! 00:20:37.774 --> 00:20:40.558 [Applause] 00:20:40.558 --> 00:20:57.332 QUESTIONER: Um, citizens and um, I guess patients in hospitals, assume that the institutions have all this data and it's very well organised, and it's a question of will. Have you encountered cases in which they simply don't have it, or they have it, and it's just such a mess that they're too embarrassed to give it out? 00:20:57.332 --> 00:21:17.527 RUFUS: Absolutely. I mean, one story that kind of intrigues me, is we've been building this "Where Does My Money Go?" open spending project. And one of the things the government mandated was giving out, rather than just high-level financial information, giving out information at a detailed level. You know, so they now publish, for example, spending data from each government department monthly, every transaction over 25,000 pounds. 00:21:17.527 --> 00:21:31.130 Every purchase they make, every mobile phone provider they contract with, we get that data. 
And one of the intriguing things, of their mandating this, was it turned out, before, they had no way, before they did this, of actually seeing, on any regular basis, what their department spent money on. 00:21:31.130 --> 00:21:38.839 Because in fact, the only thing they reported up onto, in central government to Treasury, was kind of like, how much did you spend against Project X that you were allocated budget for? 00:21:38.839 --> 00:21:46.454 You know, departments, were actually really intrigued, they say "Oh, well that other department's going with Vodafone, and we're with Orange, and look how much they're paying per month!" 00:21:46.454 --> 00:21:59.519 So I think in essence it is really driving changes in government, and yeah, there are people, I think, who'd been worried about data quality when giving out data. I was just talking to the Department of Education last week and they said -- you know, one of the things -- they had financial information from schools, and which they were slowly being mandated to publish. 00:21:59.519 --> 00:22:05.974 And schools are suddenly all ringing up, saying: "Well we never really bothered to really update that information to be accurate! Uh, we really want to do it right now". 00:22:05.974 --> 00:22:08.659 So I think that definitely does happen, yep. 00:22:08.659 --> 00:22:12.617 QUESTIONER: Are you seeing now new roles in government, to help facilitate this? 00:22:12.617 --> 00:22:27.580 RUFUS: Yeah. I mean, to take another example, I, sorry. Both in government, so the UK government has a transparency kind of 'czar' if you like. Also I learnt, is Nike hired an Open Data evangelist. One of the things they, while they were implementing this programme, they actually hired explicitly, an Open Data evangelist. 00:22:27.580 --> 00:22:32.406 So yeah, I think we are, we're definitely seeing this in government. Both in the tech level, but also at the policy level.
00:22:32.406 --> 00:22:43.837 And I think it's not just government, it will also be companies doing this, and so on, who will be saying: "We need an Open Data expert. We need to be aware of what's going on here and be able to plan it as part of our strategy." 00:22:43.837 --> 00:22:55.085 QUESTIONER: A final question. You mentioned that, kind of outsourcing, almost, some of this data refining, outside government or the big institutions, has helped them. Can you tell us any stories of kind of gratitude being expressed by the government? 00:22:55.085 --> 00:23:10.447 RUFUS: Well, I mean, to kind of, yeah. I mean there was an interesting example actually where we had some complaint because the open spending data I told you about, where we're aggregating the government spending and financial data -- you know, the site had a few performance issues, occasionally, as we loaded more data in. 00:23:10.447 --> 00:23:20.864 I remember kind of getting this call kind of going: "Well, you know, we're a little bit upset". You know, data.gov.uk, and it turned out the reason was, the Treasury kept looking at this data, and they were annoyed when the site was going down. 00:23:20.864 --> 00:23:25.878 So that was really intriguing to me, that we were kind of one of the best, at least, up-to-date aggregators out there. 00:23:25.878 --> 00:23:33.184 Um, I think you are already seeing people doing stuff with the data and kind of doing stuff, sometimes for free. And you don't have to have the shiny front-end. 00:23:33.184 --> 00:23:46.246 I mean, one of the things, we went about, we went on about, I know Tim Berners-Lee went on about -- raw data now, you know, you can build fewer shiny front-ends, and just release raw data. And you know, someone else will help you build the app, the front-end, the interface. 00:23:46.246 --> 00:23:54.122 And help you innovate about it.
What is the best way to provide healthcare data to citizens, or education data to citizens, so they make better and more informed choices? 00:23:54.122 --> 00:23:56.767 I don't know, and the government probably doesn't know. 00:23:56.767 --> 00:24:02.799 But somewhere out there, someone is going to innovate and really provide the best way for us to deliver that kind of information to citizens. 00:24:02.799 --> 00:24:04.250 QUESTIONER: Thank you very much. 00:24:04.250 --> 00:24:05.496 RUFUS: Thank you. 00:24:05.496 --> 00:24:06.901 [Applause] 00:24:06.901 --> 00:24:09.423 lift _ Video Production ACTUA 00:24:09.423 --> 00:24:11.751 Copyright (c) 2012 Lift conference