WEBVTT 99:59:59.999 --> 99:59:59.999 The world we live in is awash with data 99:59:59.999 --> 99:59:59.999 that comes pouring in from everywhere around us. 99:59:59.999 --> 99:59:59.999 On its own, this data is just noise and confusion. 99:59:59.999 --> 99:59:59.999 To make sense of data, to find the meaning in it, 99:59:59.999 --> 99:59:59.999 we need a powerful branch of science: statistics. 99:59:59.999 --> 99:59:59.999 Believe me, there's nothing boring about statistics, 99:59:59.999 --> 99:59:59.999 especially not today, when we can make the data sing. 99:59:59.999 --> 99:59:59.999 With statistics we can really make sense of the world. 99:59:59.999 --> 99:59:59.999 Are statistics, the data deluge as it's been called, 99:59:59.999 --> 99:59:59.999 leading us to a greater understanding 99:59:59.999 --> 99:59:59.999 of life on Earth and the world beyond? 99:59:59.999 --> 99:59:59.999 Thanks to the incredible power of today's computers, 99:59:59.999 --> 99:59:59.999 it may fundamentally transform the process of scientific discovery. 99:59:59.999 --> 99:59:59.999 I kid you not, statistics is now the sexiest subject around. 99:59:59.999 --> 99:59:59.999 Did you know that there are one million boats in Sweden? 99:59:59.999 --> 99:59:59.999 That's one boat per nine people. 99:59:59.999 --> 99:59:59.999 It's the highest number of boats per person in Europe. 99:59:59.999 --> 99:59:59.999 Being a statistician, you don't like telling your profession at dinner parties, 99:59:59.999 --> 99:59:59.999 but really, statisticians shouldn't be shy, 99:59:59.999 --> 99:59:59.999 because they always want to understand what's going on. 99:59:59.999 --> 99:59:59.999 Statistics gives us a perspective on the world we live in 99:59:59.999 --> 99:59:59.999 that we can't get in any other way. 99:59:59.999 --> 99:59:59.999 Statistics tells us whether the things we think and believe are actually true. 99:59:59.999 --> 99:59:59.999 Statistics are far more useful than we usually like to admit. 
99:59:59.999 --> 99:59:59.999 In the last recession, there was this famous call to a talk radio station. 99:59:59.999 --> 99:59:59.999 The man complained: "In times like this, when unemployment rates are up to 13%, 99:59:59.999 --> 99:59:59.999 and income has fallen by 5%, and suicide rates are climbing, 99:59:59.999 --> 99:59:59.999 I get so angry that the government is wasting money on things like correctional statistics." 99:59:59.999 --> 99:59:59.999 I'm not officially a statistician; strictly speaking, my field is global health. 99:59:59.999 --> 99:59:59.999 But I got really obsessed with stats when I realised how many people in Sweden 99:59:59.999 --> 99:59:59.999 don't know anything about the rest of the world. 99:59:59.999 --> 99:59:59.999 I started, at our medical university, Karolinska Institute, 99:59:59.999 --> 99:59:59.999 an undergraduate course called Global Health. 99:59:59.999 --> 99:59:59.999 These students coming to us actually have the highest grades you can get in the Swedish college system. 99:59:59.999 --> 99:59:59.999 So I thought maybe they know everything I'm going to teach them. 99:59:59.999 --> 99:59:59.999 So I did a pre-test when they came. 99:59:59.999 --> 99:59:59.999 One of the questions, from which I learnt a lot, was: 99:59:59.999 --> 99:59:59.999 Which country has the highest child mortality of these five pairs? 99:59:59.999 --> 99:59:59.999 I won't put you to the test here, but it's Turkey which is higher there, 99:59:59.999 --> 99:59:59.999 Poland, Russia, Pakistan and South Africa. 99:59:59.999 --> 99:59:59.999 And these were the results of the Swedish students. 99:59:59.999 --> 99:59:59.999 1.8 answers right out of 5 possible, 99:59:59.999 --> 99:59:59.999 that means that there was a place for a professor in International Health 99:59:59.999 --> 99:59:59.999 and for my course. 99:59:59.999 --> 99:59:59.999 But one late night when I was compiling my report 99:59:59.999 --> 99:59:59.999 I really realised my discovery. 
99:59:59.999 --> 99:59:59.999 I have shown that Swedish top students know 99:59:59.999 --> 99:59:59.999 statistically significantly less about the world than the chimpanzees. 99:59:59.999 --> 99:59:59.999 Because the chimpanzee would score half right. 99:59:59.999 --> 99:59:59.999 If I gave them two bananas with Sri Lanka and Turkey 99:59:59.999 --> 99:59:59.999 they would be right in half of the cases. 99:59:59.999 --> 99:59:59.999 But the students are not there. 99:59:59.999 --> 99:59:59.999 I also did an unethical study of the professors of the Karolinska Institute 99:59:59.999 --> 99:59:59.999 that hands out the Nobel Prize in Medicine, and they are only on par with the chimpanzee. 99:59:59.999 --> 99:59:59.999 Today, there's more information accessible than ever before, 99:59:59.999 --> 99:59:59.999 and I work with my team at the Gapminder Foundation 99:59:59.999 --> 99:59:59.999 using new tools that help everyone make sense of the changing world. 99:59:59.999 --> 99:59:59.999 We draw on the masses of data that are now freely available 99:59:59.999 --> 99:59:59.999 from international institutions like the UN and the World Bank. 99:59:59.999 --> 99:59:59.999 It's become my mission to share my insights from this data 99:59:59.999 --> 99:59:59.999 with anyone who will listen, and to reveal how statistics is nothing to be frightened of. 99:59:59.999 --> 99:59:59.999 I'm going to provide you with a view of the global health situation across mankind, 99:59:59.999 --> 99:59:59.999 and I'm going to do that in a hopefully enjoyable way. So relax. 99:59:59.999 --> 99:59:59.999 We made this software which displays it like this: 99:59:59.999 --> 99:59:59.999 every bubble here is a country, this is China, this is India. 99:59:59.999 --> 99:59:59.999 The size of the bubble is the population. 
99:59:59.999 --> 99:59:59.999 And I'm going to stage a race here 99:59:59.999 --> 99:59:59.999 between this sort of yellow Ford here, and the red Toyota down there, 99:59:59.999 --> 99:59:59.999 and the brownish Volvo. 99:59:59.999 --> 99:59:59.999 The Toyota has a very bad start down here, 99:59:59.999 --> 99:59:59.999 and the United States' Ford is going off road there, 99:59:59.999 --> 99:59:59.999 and the Volvo is doing quite fine, this is the war, 99:59:59.999 --> 99:59:59.999 the Toyota got off track, now the Toyota is coming on the healthier side of Sweden. 99:59:59.999 --> 99:59:59.999 That's the point when I sold the Volvo and bought the Toyota. 99:59:59.999 --> 99:59:59.999 This is the Great Leap Forward, when China fell down, 99:59:59.999 --> 99:59:59.999 it was central planning by Mao Tse Tung, 99:59:59.999 --> 99:59:59.999 China recovered and said "never more stupid central planning", but they went up here. 99:59:59.999 --> 99:59:59.999 No, there was one more inequity, look there! United States! 99:59:59.999 --> 99:59:59.999 Oh, they broke my frame! 99:59:59.999 --> 99:59:59.999 Washington D.C. is so rich over there, but it's not as healthy as Kerala, India. 99:59:59.999 --> 99:59:59.999 It's quite interesting, isn't it? 99:59:59.999 --> 99:59:59.999 Welcome to the USA, world leaders in big cars 99:59:59.999 --> 99:59:59.999 and free data. 99:59:59.999 --> 99:59:59.999 There are many here who share my vision 99:59:59.999 --> 99:59:59.999 of making public data accessible and useful for everyone. 99:59:59.999 --> 99:59:59.999 The city of San Francisco is in the lead, opening up its data on everything. 99:59:59.999 --> 99:59:59.999 Even the Police Dept. is releasing all its crime reports. 99:59:59.999 --> 99:59:59.999 This official crime data has been turned into a wonderful interactive map 99:59:59.999 --> 99:59:59.999 by two of the city's computer whizzes. 99:59:59.999 --> 99:59:59.999 It's community statistics in action. 
99:59:59.999 --> 99:59:59.999 Crimespotting is a map of crime reports 99:59:59.999 --> 99:59:59.999 from the San Francisco Police Dept. 99:59:59.999 --> 99:59:59.999 showing dots on maps for citizens to be able to see patterns of crime 99:59:59.999 --> 99:59:59.999 in their neighbourhoods in San Francisco. 99:59:59.999 --> 99:59:59.999 The map is not just about individual crimes 99:59:59.999 --> 99:59:59.999 but about broader patterns that show you where crime is clustered around the city, 99:59:59.999 --> 99:59:59.999 which areas have high crime, which areas have relatively low crime. 99:59:59.999 --> 99:59:59.999 We're here at the top of Jones Street, on uphill, quite a nice neighbourhood, 99:59:59.999 --> 99:59:59.999 and what the crime maps show us is the relationship between topography and crime. 99:59:59.999 --> 99:59:59.999 The higher up the hill, the less crime there is. 99:59:59.999 --> 99:59:59.999 We crossed over the border into the flats. 99:59:59.999 --> 99:59:59.999 Essentially, as soon as you get into the kind of lower-lying areas of Jones Street, 99:59:59.999 --> 99:59:59.999 the crime just skyrockets. 99:59:59.999 --> 99:59:59.999 So we're in the uptown Tenderloin District, 99:59:59.999 --> 99:59:59.999 it's one of the oldest and most dangerous neighbourhoods in San Francisco. 99:59:59.999 --> 99:59:59.999 This is where you go to buy drugs, right around here. 99:59:59.999 --> 99:59:59.999 You see lots of aggravated assault, lots of thefts. 99:59:59.999 --> 99:59:59.999 Basically, a huge part of the city's crime happens right in these four- or six-block areas. 99:59:59.999 --> 99:59:59.999 If you've been hearing police sirens in your neighbourhood, 99:59:59.999 --> 99:59:59.999 you can use the map to find out why. 99:59:59.999 --> 99:59:59.999 If you are out at night in an unfamiliar part of town 99:59:59.999 --> 99:59:59.999 you can check the map for streets to avoid. 
99:59:59.999 --> 99:59:59.999 If a neighbour gets burgled, you can see: 99:59:59.999 --> 99:59:59.999 is it a one-off, or has there been a spike in local crime? 99:59:59.999 --> 99:59:59.999 If you commute through a neighbourhood and you're worried about its safety, 99:59:59.999 --> 99:59:59.999 the fact that we have the ability to turn off all the night-time and middle-of-the-day crimes 99:59:59.999 --> 99:59:59.999 and show you just the things that are happening during your commute 99:59:59.999 --> 99:59:59.999 is a statistical operation, but I think to the people that are interacting with the thing 99:59:59.999 --> 99:59:59.999 it feels very much more like they are just sort of browsing a website 99:59:59.999 --> 99:59:59.999 or shopping on Amazon. They're looking at data, 99:59:59.999 --> 99:59:59.999 and they don't realise that they're doing statistics. 99:59:59.999 --> 99:59:59.999 What's most exciting for me is that public statistics 99:59:59.999 --> 99:59:59.999 is making citizens more powerful and the authorities more accountable. 99:59:59.999 --> 99:59:59.999 We have community meetings that the police attend, 99:59:59.999 --> 99:59:59.999 and what citizens are now doing is bringing printouts of the maps 99:59:59.999 --> 99:59:59.999 to show where crimes are taking place, 99:59:59.999 --> 99:59:59.999 and they're demanding services from the police department, 99:59:59.999 --> 99:59:59.999 which is now having to change how they police, 99:59:59.999 --> 99:59:59.999 how they provide policing services, 99:59:59.999 --> 99:59:59.999 because the data is showing what is working and what is not. 99:59:59.999 --> 99:59:59.999 People in San Francisco are also using public data 99:59:59.999 --> 99:59:59.999 to map social inequalities, and see how to improve society, 99:59:59.999 --> 99:59:59.999 and the possibilities are endless. 
99:59:59.999 --> 99:59:59.999 Our dream would be that the government announced that 99:59:59.999 --> 99:59:59.999 this data project would really focus on live information, 99:59:59.999 --> 99:59:59.999 on stuff that was being reported and pushed out into the world as it was happening. 99:59:59.999 --> 99:59:59.999 Trash pickup, traffic accidents, buses, 99:59:59.999 --> 99:59:59.999 and through the kind of stats-gathering power of the internet 99:59:59.999 --> 99:59:59.999 it's possible to really see the workings of the city 99:59:59.999 --> 99:59:59.999 displayed as a unified interface. 99:59:59.999 --> 99:59:59.999 That's where we are heading, 99:59:59.999 --> 99:59:59.999 towards a world of free data with all the statistical insights that come from it 99:59:59.999 --> 99:59:59.999 accessible to everyone, empowering us as citizens 99:59:59.999 --> 99:59:59.999 and letting us hold our rulers to account. 99:59:59.999 --> 99:59:59.999 It's a long way from where statistics began. 99:59:59.999 --> 99:59:59.999 Statistics are essential to monitor our government in our societies. 99:59:59.999 --> 99:59:59.999 But it was our rulers out there who started the collection of statistics 99:59:59.999 --> 99:59:59.999 in the first place, in order to monitor us. 99:59:59.999 --> 99:59:59.999 In fact the word statistics comes from state. 99:59:59.999 --> 99:59:59.999 Modern statistics began two centuries ago. 99:59:59.999 --> 99:59:59.999 Once it got going it spread and never stopped. 99:59:59.999 --> 99:59:59.999 And guess who was first. 99:59:59.999 --> 99:59:59.999 The Chinese have Confucius, the Italians have Da Vinci, 99:59:59.999 --> 99:59:59.999 and the British have Shakespeare, and we have the Tabellverket, 99:59:59.999 --> 99:59:59.999 the first ever systematic collection of statistics. 99:59:59.999 --> 99:59:59.999 Since the year 1749 we have collected data on every birth, marriage and death 99:59:59.999 --> 99:59:59.999 and we are proud of it. 
99:59:59.999 --> 99:59:59.999 The Tabellverket recorded information from every parish in Sweden. 99:59:59.999 --> 99:59:59.999 It was a huge quantity of data and it was the first time any government 99:59:59.999 --> 99:59:59.999 could get an accurate picture of its people. 99:59:59.999 --> 99:59:59.999 Sweden had been the greatest military power in Northern Europe 99:59:59.999 --> 99:59:59.999 but by 1749 our star was really fading and other countries were growing stronger. 99:59:59.999 --> 99:59:59.999 At least, though, we were a large power, thought to have 20 million people, 99:59:59.999 --> 99:59:59.999 enough to rival Britain and France. 99:59:59.999 --> 99:59:59.999 But we were in for a nasty surprise. 99:59:59.999 --> 99:59:59.999 The first analysis of the Tabellverket revealed that Sweden only had 2 million inhabitants. 99:59:59.999 --> 99:59:59.999 Sweden was not only a power in decline, it also had a very small population. 99:59:59.999 --> 99:59:59.999 The government was horrified by this finding. 99:59:59.999 --> 99:59:59.999 What if the enemy found out? 99:59:59.999 --> 99:59:59.999 But the Tabellverket also showed that many women died in childbirth. 99:59:59.999 --> 99:59:59.999 And many children died young, and the government took action to improve the health of the people. 99:59:59.999 --> 99:59:59.999 That was the beginning of modern Sweden. 99:59:59.999 --> 99:59:59.999 It took more than 50 years before the Austrians, Belgians, Danes, Dutch, 99:59:59.999 --> 99:59:59.999 Germans, Italians and finally the British caught up with Sweden 99:59:59.999 --> 99:59:59.999 in collecting and using statistics. 99:59:59.999 --> 99:59:59.999 It was called political arithmetic, and it was a lovely phrase to use for statistics. 
99:59:59.999 --> 99:59:59.999 Governments could have much more control and understanding of the society, 99:59:59.999 --> 99:59:59.999 how it's working, how it's developing, 99:59:59.999 --> 99:59:59.999 and essentially, so they could control it better. 99:59:59.999 --> 99:59:59.999 It wasn't just governments who woke up to the power of statistics. 99:59:59.999 --> 99:59:59.999 Right across Europe, 19th century society went mad for facts. 99:59:59.999 --> 99:59:59.999 And despite its late start, Britain, with its Royal Statistical Society in London, 99:59:59.999 --> 99:59:59.999 was soon a statisticians' nirvana. 99:59:59.999 --> 99:59:59.999 I love looking at old copies of the Royal Statistical Society journal, 99:59:59.999 --> 99:59:59.999 because it's full of this stuff. 99:59:59.999 --> 99:59:59.999 There's a wonderful paper from the 1840s 99:59:59.999 --> 99:59:59.999 which shows a map of England and the rates of bastardy of each county 99:59:59.999 --> 99:59:59.999 so you can identify very quickly the areas with high rates of bastardy. 99:59:59.999 --> 99:59:59.999 Being in East Anglia, it makes me slightly laugh 99:59:59.999 --> 99:59:59.999 that Norfolk was on top of the bastardy league in the 1840s. 99:59:59.999 --> 99:59:59.999 One of the founders of the Royal Statistical Society 99:59:59.999 --> 99:59:59.999 was the great Victorian mathematician and inventor Charles Babbage. 99:59:59.999 --> 99:59:59.999 In 1842 he read the latest poem by an equally great Victorian, 99:59:59.999 --> 99:59:59.999 Alfred Tennyson. 99:59:59.999 --> 99:59:59.999 "Vision of Sin" contained the lines: 99:59:59.999 --> 99:59:59.999 "Fill the cup and fill the can, Have a rouse before the morn. 99:59:59.999 --> 99:59:59.999 Every moment dies a man, Every moment one is born." 99:59:59.999 --> 99:59:59.999 So keen a statistician was Babbage that he could not contain himself. 
99:59:59.999 --> 99:59:59.999 He dashed off a letter to Tennyson explaining that because of population growth 99:59:59.999 --> 99:59:59.999 the line should read: 99:59:59.999 --> 99:59:59.999 "Every moment dies a man, And 1 1/16 is born." 99:59:59.999 --> 99:59:59.999 "I may add that the exact figure is 1.167 99:59:59.999 --> 99:59:59.999 but something must be conceded to the laws of metre." 99:59:59.999 --> 99:59:59.999 In the 19th century scholars all over Europe 99:59:59.999 --> 99:59:59.999 did amazing work in measuring their societies. 99:59:59.999 --> 99:59:59.999 They hoovered up data on almost everything, 99:59:59.999 --> 99:59:59.999 but numbers alone don't tell you anything: 99:59:59.999 --> 99:59:59.999 you have to analyse them, and that's what makes statistics. 99:59:59.999 --> 99:59:59.999 When the first statisticians began to get to grips with analysing their data 99:59:59.999 --> 99:59:59.999 they seized upon the average, and they took the average of everything. 99:59:59.999 --> 99:59:59.999 What's so great about an average 99:59:59.999 --> 99:59:59.999 is that you can take a whole mass of data and reduce it to a single number. 99:59:59.999 --> 99:59:59.999 Though each of us is unique, our collective lives produce averages 99:59:59.999 --> 99:59:59.999 that characterise whole populations. 99:59:59.999 --> 99:59:59.999 I looked at my local newspaper one week 99:59:59.999 --> 99:59:59.999 and saw that a pensioner had accidentally put a foot on the accelerator 99:59:59.999 --> 99:59:59.999 and crashed her friend against the wall. 99:59:59.999 --> 99:59:59.999 Devastating, hideous, horrible thing to happen. 99:59:59.999 --> 99:59:59.999 And there was a second one about a young man who didn't have a driving licence 99:59:59.999 --> 99:59:59.999 who was driving a car under the influence of drugs and alcohol 99:59:59.999 --> 99:59:59.999 and crashed into a pedestrian and killed him. 
99:59:59.999 --> 99:59:59.999 What is remarkable, absolutely remarkable, 99:59:59.999 --> 99:59:59.999 if you look at the number of people who die each year 99:59:59.999 --> 99:59:59.999 in traffic accidents, is that it's nearly a constant. 99:59:59.999 --> 99:59:59.999 What? 99:59:59.999 --> 99:59:59.999 All these individual events, somehow when you sum them all up 99:59:59.999 --> 99:59:59.999 it's the same number every year, 99:59:59.999 --> 99:59:59.999 and every year two and a half times as many men die 99:59:59.999 --> 99:59:59.999 in traffic accidents as women, and it's a constant. 99:59:59.999 --> 99:59:59.999 And every year the rate in Belgium is double 99:59:59.999 --> 99:59:59.999 the rate in England. There are these remarkable regularities 99:59:59.999 --> 99:59:59.999 so that these individual particular events sum up into a social phenomenon. 99:59:59.999 --> 99:59:59.999 (Lecture) Let's see what Sweden has done, 99:59:59.999 --> 99:59:59.999 we used to boast of fast social progress. 99:59:59.999 --> 99:59:59.999 (Narration) In my lectures, to tell stories about the changing world 99:59:59.999 --> 99:59:59.999 I use averages for entire countries, whether the average for income, 99:59:59.999 --> 99:59:59.999 child mortality, family size or carbon output. 99:59:59.999 --> 99:59:59.999 (Lecture) OK, I give you Singapore, the year I was born. 99:59:59.999 --> 99:59:59.999 Singapore had twice the child mortality of Sweden. 99:59:59.999 --> 99:59:59.999 The most tropical country in the world. A marshland on the Equator. 99:59:59.999 --> 99:59:59.999 And here we go. It took a little time for them to get independence 99:59:59.999 --> 99:59:59.999 but they started to grow their economy, and they made the social investments, 99:59:59.999 --> 99:59:59.999 they got rid of malaria, they got a magnificent health system 99:59:59.999 --> 99:59:59.999 that beats both the UK's and Sweden's. 
99:59:59.999 --> 99:59:59.999 We thought it would never happen but they would win over Sweden! 99:59:59.999 --> 99:59:59.999 But useful as averages are, they don't tell you the whole story. 99:59:59.999 --> 99:59:59.999 On average, Swedish people have slightly less than two legs. 99:59:59.999 --> 99:59:59.999 That is because a few people have one leg or no legs, and no one has three legs, 99:59:59.999 --> 99:59:59.999 so almost everybody in Sweden has more than the average number of legs. 99:59:59.999 --> 99:59:59.999 The variation in data is just as important as the average. 99:59:59.999 --> 99:59:59.999 But how do you get a handle on variation? 99:59:59.999 --> 99:59:59.999 For this you transform numbers into shapes. 99:59:59.999 --> 99:59:59.999 Let's look again at the number of adult women in Sweden for different heights. 99:59:59.999 --> 99:59:59.999 Plotting the data as a shape shows us how much their heights vary from the average 99:59:59.999 --> 99:59:59.999 and how wide that variation is. 99:59:59.999 --> 99:59:59.999 The shape a set of data makes is called its distribution. 99:59:59.999 --> 99:59:59.999 (Lecture) This is the income distribution of China, 1970. 99:59:59.999 --> 99:59:59.999 This is the income distribution of the United States, 1970. 99:59:59.999 --> 99:59:59.999 Almost no overlap. And what has happened? 99:59:59.999 --> 99:59:59.999 China is growing. It's not so equal any longer. 99:59:59.999 --> 99:59:59.999 And it's appearing here, overlooking the United States 99:59:59.999 --> 99:59:59.999 almost like a ghost, isn't it? It's scary! 
99:59:59.999 --> 99:59:59.999 The statisticians who first explored distributions 99:59:59.999 --> 99:59:59.999 discovered one shape that turned up again and again. 99:59:59.999 --> 99:59:59.999 The Victorian scholar Francis Galton was so fascinated 99:59:59.999 --> 99:59:59.999 he built a machine that could reproduce it 99:59:59.999 --> 99:59:59.999 and he found it fitted so many different sets of measurements 99:59:59.999 --> 99:59:59.999 that he named it the Normal Distribution. 99:59:59.999 --> 99:59:59.999 Whether it was people's arm spans, lung capacity or even their exam results, 99:59:59.999 --> 99:59:59.999 the Normal Distribution shape recurred time and time again. 99:59:59.999 --> 99:59:59.999 And the statisticians soon found many other regular shapes, 99:59:59.999 --> 99:59:59.999 each produced by a certain kind of natural or social process. 99:59:59.999 --> 99:59:59.999 And every statistician has their favourite. 99:59:59.999 --> 99:59:59.999 The Poisson distribution, I think it's my favourite, it's an absolute cracker. 99:59:59.999 --> 99:59:59.999 The Poisson shape describes how likely it is that out-of-the-ordinary things will happen. 99:59:59.999 --> 99:59:59.999 Imagine a London bus stop that we know will, on average, get three buses an hour. 99:59:59.999 --> 99:59:59.999 We won't always get three buses, of course. 99:59:59.999 --> 99:59:59.999 Amazingly, the Poisson shape will show us the probability that in any given hour 99:59:59.999 --> 99:59:59.999 we'll get 4, 5 or 6 buses, or no buses at all. 99:59:59.999 --> 99:59:59.999 The exact shape changes with the average, 99:59:59.999 --> 99:59:59.999 but whether it is how many people will win the lottery jackpot each week 99:59:59.999 --> 99:59:59.999 or how many people will phone a call centre each minute, 99:59:59.999 --> 99:59:59.999 the Poisson shape will give the probabilities. 
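The bus-stop example above can be worked through numerically. The following is only a sketch: it assumes bus arrivals really do follow a Poisson distribution with the narration's average of three per hour, and the function name `poisson_pmf` is chosen here for illustration. It uses the standard Poisson formula P(k) = e^(-mean) * mean^k / k!.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the long-run average is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Average taken from the narration: three buses an hour.
MEAN_BUSES_PER_HOUR = 3.0

if __name__ == "__main__":
    # Print the chance of seeing 0, 1, ..., 6 buses in any given hour.
    for buses in range(7):
        probability = poisson_pmf(buses, MEAN_BUSES_PER_HOUR)
        print(f"{buses} buses: {probability:.3f}")
```

Changing `MEAN_BUSES_PER_HOUR` changes the shape, which is what the narration means by "the exact shape changes with the average".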
99:59:59.999 --> 99:59:59.999 A wonderful example where this does apply: in the late 19th century, 99:59:59.999 --> 99:59:59.999 someone counted each year the number of Prussian officers, 99:59:59.999 --> 99:59:59.999 cavalry officers, that had been kicked to death by their horses. 99:59:59.999 --> 99:59:59.999 Some years there were none, some years one, some years two,... up to seven. 99:59:59.999 --> 99:59:59.999 One particularly bad year. 99:59:59.999 --> 99:59:59.999 But this distribution, how many years there were with one, two, three, four 99:59:59.999 --> 99:59:59.999 Prussian cavalry officers kicked to death by their horses, 99:59:59.999 --> 99:59:59.999 beautifully obeys the Poisson distribution. 99:59:59.999 --> 99:59:59.999 So statisticians use shapes to reveal the patterns in the data, 99:59:59.999 --> 99:59:59.999 but we also use images of all kinds to communicate statistics to a wider public, 99:59:59.999 --> 99:59:59.999 because if the story in the numbers is told by a beautiful and clever image 99:59:59.999 --> 99:59:59.999 then everyone understands. 99:59:59.999 --> 99:59:59.999 Of the pioneers of statistical graphics, my favourite is Florence Nightingale. 99:59:59.999 --> 99:59:59.999 There are not many people who realise that actually she was known as a passionate statistician 99:59:59.999 --> 99:59:59.999 and not just the Lady of the Lamp. 99:59:59.999 --> 99:59:59.999 She said that to understand God's thoughts we must study statistics, 99:59:59.999 --> 99:59:59.999 for these are the measure of His purpose. 99:59:59.999 --> 99:59:59.999 Statistics was for her a religious duty and a moral imperative. 99:59:59.999 --> 99:59:59.999 When Florence was nine years old, she started collecting data. 99:59:59.999 --> 99:59:59.999 Her data was different fruits and vegetables she found. 
99:59:59.999 --> 99:59:59.999 She put them into different tables, trying to organise them in some standard form, 99:59:59.999 --> 99:59:59.999 so we have one of Nightingale's first statistical tables, at the age of nine. 99:59:59.999 --> 99:59:59.999 In the mid-1850s, Florence Nightingale went to Crimea 99:59:59.999 --> 99:59:59.999 to care for British casualties at war. 99:59:59.999 --> 99:59:59.999 She was horrified by what she discovered. 99:59:59.999 --> 99:59:59.999 For all the soldiers being blown to bits on the battlefield 99:59:59.999 --> 99:59:59.999 there were many, many more soldiers dying from diseases 99:59:59.999 --> 99:59:59.999 caught in the army's filthy hospitals. 99:59:59.999 --> 99:59:59.999 So Florence Nightingale began counting the dead. 99:59:59.999 --> 99:59:59.999 For two years she recorded mortality data in meticulous detail. 99:59:59.999 --> 99:59:59.999 When the war was over, she persuaded the government 99:59:59.999 --> 99:59:59.999 to set up a Royal Commission of Enquiry. 99:59:59.999 --> 99:59:59.999 And gathered her data in a devastating report. 99:59:59.999 --> 99:59:59.999 What has cemented her place in the statistical history books is the graphics she used. 99:59:59.999 --> 99:59:59.999 And one in particular, the Polar Area Graph. 99:59:59.999 --> 99:59:59.999 For each month of the war, a huge blue wedge represented the soldiers 99:59:59.999 --> 99:59:59.999 who had died of preventable diseases. 99:59:59.999 --> 99:59:59.999 The much smaller red wedges were deaths from wounds, 99:59:59.999 --> 99:59:59.999 and the black wedges deaths from accidents and other causes. 99:59:59.999 --> 99:59:59.999 Nightingale's graphics were so clear, they were impossible to ignore. 99:59:59.999 --> 99:59:59.999 The usual thing around Florence Nightingale's time 99:59:59.999 --> 99:59:59.999 was just to produce tables and tables of figures. Absolutely tedious stuff. 
99:59:59.999 --> 99:59:59.999 Unless you are a dedicated statistician, it's quite difficult to spot the patterns naturally. 99:59:59.999 --> 99:59:59.999 But visualisations tell a story. They tell a story immediately. 99:59:59.999 --> 99:59:59.999 The use of colour, the use of shape, can really tell a powerful story. 99:59:59.999 --> 99:59:59.999 And these days, we can make things move as well. 99:59:59.999 --> 99:59:59.999 Florence Nightingale would've loved to play with it, 99:59:59.999 --> 99:59:59.999 she would've produced wonderful animations, I'm absolutely certain about it. 99:59:59.999 --> 99:59:59.999 Today, a hundred and fifty years on, 99:59:59.999 --> 99:59:59.999 Nightingale's graphics are rightly regarded as a classic. 99:59:59.999 --> 99:59:59.999 They led to a revolution in nursing and health care, in hygiene in hospitals worldwide. 99:59:59.999 --> 99:59:59.999 We've saved innumerable lives. 99:59:59.999 --> 99:59:59.999 Statistical graphics has become an art of its very own. 99:59:59.999 --> 99:59:59.999 Led by designers who are passionate about visualising data. 99:59:59.999 --> 99:59:59.999 This is the Billion Pound O Gram. 99:59:59.999 --> 99:59:59.999 This image arose out of frustration with the reporting 99:59:59.999 --> 99:59:59.999 of billion-pound amounts in the media. 99:59:59.999 --> 99:59:59.999 500 trillion pounds for this war, 50 million pounds for this hospital, 99:59:59.999 --> 99:59:59.999 this does not make sense, these figures are too enormous to get your mind around. 99:59:59.999 --> 99:59:59.999 So I scraped this data from various news sources and created this diagram, 99:59:59.999 --> 99:59:59.999 so the squares here are scaled according to the billion-pound amounts. 99:59:59.999 --> 99:59:59.999 When you see numbers visualised like this, 99:59:59.999 --> 99:59:59.999 you start to have a different kind of relationship with them. 99:59:59.999 --> 99:59:59.999 You can see patterns, see the scale of them. 
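Scaling squares to amounts, as described above, can be sketched in a few lines. This is an illustration under assumptions, not the designer's actual method: it assumes each square's area (not its side) is proportional to the amount, so the side grows with the square root. The function name `square_side` and the sample amounts are hypothetical.

```python
import math


def square_side(amount: float, reference_amount: float, reference_side: float = 1.0) -> float:
    """Side length of a square whose AREA is proportional to `amount`.

    A square of `reference_amount` is drawn with side `reference_side`.
    """
    if amount < 0 or reference_amount <= 0:
        raise ValueError("amounts must be positive")
    return reference_side * math.sqrt(amount / reference_amount)


if __name__ == "__main__":
    # Hypothetical amounts in billions of pounds, for illustration only.
    for amount in (1, 4, 25, 100):
        side = square_side(amount, reference_amount=1)
        print(f"{amount:>4} billion -> side {side:.1f}, area {side * side:.1f}")
```

Scaling the side directly by the amount would exaggerate large figures, because the eye compares areas: a square with ten times the side covers a hundred times the space.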
99:59:59.999 --> 99:59:59.999 Here, this little square, 37 billion, this was the predicted cost of the Iraq war in 2003. 99:59:59.999 --> 99:59:59.999 As you can see it has grown exponentially over the last few years 99:59:59.999 --> 99:59:59.999 to a total cost of about 2,500 billion. 99:59:59.999 --> 99:59:59.999 It's funny, because when you visualise statistics like this, you understand them. 99:59:59.999 --> 99:59:59.999 And when you understand them, you can put things into perspective. 99:59:59.999 --> 99:59:59.999 Visualisation is right at the heart of my own work too. 99:59:59.999 --> 99:59:59.999 I teach Global Health. 99:59:59.999 --> 99:59:59.999 I know that having the data is not enough, 99:59:59.999 --> 99:59:59.999 I have to show it in ways people both enjoy and understand. 99:59:59.999 --> 99:59:59.999 Now I'm going to try something I've never done before. 99:59:59.999 --> 99:59:59.999 Animating the data in real space. 99:59:59.999 --> 99:59:59.999 With a bit of technical assistance from the crew. 99:59:59.999 --> 99:59:59.999 So here we go! 99:59:59.999 --> 99:59:59.999 First an axis for health, life expectancy from 25 years to 75 years. 99:59:59.999 --> 99:59:59.999 Down here an axis for wealth, income per person, $400, $4,000 and $40,000. 99:59:59.999 --> 99:59:59.999 So down here is poor and sick. And up here is rich and healthy. 99:59:59.999 --> 99:59:59.999 Now I'm going to show you the world 200 years ago, in 1810. 99:59:59.999 --> 99:59:59.999 Here come all the countries: Europe brown, Asia red, 99:59:59.999 --> 99:59:59.999 Middle East green, Africa South-of-Sahara blue, and America is yellow. 99:59:59.999 --> 99:59:59.999 And the size of the country bubble shows the size of the population. 99:59:59.999 --> 99:59:59.999 And in 1810 it was pretty crowded down there, isn't it? 99:59:59.999 --> 99:59:59.999 All countries were sick and poor, life expectancy would be below 40 in all countries. 
99:59:59.999 --> 99:59:59.999 Only the UK and the Netherlands were slightly better off, but not much. 99:59:59.999 --> 99:59:59.999 And now, I'll start the world! 99:59:59.999 --> 99:59:59.999 The Industrial Revolution makes countries in Europe and elsewhere move away from the rest. 99:59:59.999 --> 99:59:59.999 But the colonised countries in Asia and Africa are stuck down there. 99:59:59.999 --> 99:59:59.999 Eventually the Western countries get healthier and healthier. 99:59:59.999 --> 99:59:59.999 Now we slow down to see the impact of the First World War and the Spanish flu epidemic. 99:59:59.999 --> 99:59:59.999 What a catastrophe! 99:59:59.999 --> 99:59:59.999 Now I'll speed up through the 1920s and 1930s 99:59:59.999 --> 99:59:59.999 and in spite of the Great Depression, Western countries forged on towards greater wealth and health. 99:59:59.999 --> 99:59:59.999 Japan and some others try to follow but most countries stay down here. 99:59:59.999 --> 99:59:59.999 After the tragedies of the Second World War 99:59:59.999 --> 99:59:59.999 we stop a bit to look at the world in 1948. 99:59:59.999 --> 99:59:59.999 1948 was a great year, the war was over, Sweden topped the medal table at the Winter Olympics, 99:59:59.999 --> 99:59:59.999 and I was born, but the differences between the countries of the world were wider than ever. 99:59:59.999 --> 99:59:59.999 The United States was in the front, Japan was catching up, Brazil was way behind, 99:59:59.999 --> 99:59:59.999 Iran was getting a little richer from oil, but still had short lives. 99:59:59.999 --> 99:59:59.999 The Asian giants, China, India, Pakistan, Bangladesh and Indonesia, 99:59:59.999 --> 99:59:59.999 they were still poor and sick down here. 99:59:59.999 --> 99:59:59.999 But look what is about to happen. In my lifetime, former colonies gained independence 99:59:59.999 --> 99:59:59.999 and finally they started to get healthier, and healthier, and healthier. 
99:59:59.999 --> 99:59:59.999 And in the 1970s, countries in Asia and Latin America 99:59:59.999 --> 99:59:59.999 started to catch up with the Western countries. 99:59:59.999 --> 99:59:59.999 They became the emerging economies. 99:59:59.999 --> 99:59:59.999 Some in Africa follow, some in Africa are stuck in civil wars, and others are hit by HIV. 99:59:59.999 --> 99:59:59.999 And now we can see the world today, in the most up-to-date statistics. 99:59:59.999 --> 99:59:59.999 Most people today live in the middle, 99:59:59.999 --> 99:59:59.999 but there are huge differences at the same time 99:59:59.999 --> 99:59:59.999 between the best of countries and the worst of countries 99:59:59.999 --> 99:59:59.999 and there are also huge inequalities within countries. 99:59:59.999 --> 99:59:59.999 These bubbles show country averages, but I can split them. 99:59:59.999 --> 99:59:59.999 Take China, I can split it into provinces. 99:59:59.999 --> 99:59:59.999 There goes Shanghai, it has the same health and wealth as Italy today. 99:59:59.999 --> 99:59:59.999 And then there's the poor inland province of Guizhou. It's like Pakistan. 99:59:59.999 --> 99:59:59.999 And if I split it further, the rural parts are like Ghana in Africa. 99:59:59.999 --> 99:59:59.999 And yet, despite the enormous disparities today, we have seen 200 years of remarkable progress. 99:59:59.999 --> 99:59:59.999 That huge historical gap between the West and the rest is now closing. 99:59:59.999 --> 99:59:59.999 We have become an entirely new converging world. 99:59:59.999 --> 99:59:59.999 And I see a clear trend into the future, with aid, trade, green technology and peace. 99:59:59.999 --> 99:59:59.999 It's fully possible that everyone can make it to the healthy-wealthy corner. 99:59:59.999 --> 99:59:59.999 What you've just seen in the last few minutes is a story of 200 countries 99:59:59.999 --> 99:59:59.999 shown over 200 years and beyond. It involved plotting 120,000 numbers. 
99:59:59.999 --> 99:59:59.999 Pretty neat, eh? 99:59:59.999 --> 99:59:59.999 With statistics we can start to see things as they really are. 99:59:59.999 --> 99:59:59.999 From tables of data, to averages, distributions and visualisations, 99:59:59.999 --> 99:59:59.999 statistics gives us a clear description of the world. 99:59:59.999 --> 99:59:59.999 But with statistics we can not only discover what is happening 99:59:59.999 --> 99:59:59.999 but also explore why, by using the powerful analytical method of correlation. 99:59:59.999 --> 99:59:59.999 Just looking at one thing at a time doesn't tell you very much. 99:59:59.999 --> 99:59:59.999 You have to look at the relationships between things. 99:59:59.999 --> 99:59:59.999 How they change. How they vary together. That's what correlation is about. 99:59:59.999 --> 99:59:59.999 That's how we start to understand the processes that are really going on 99:59:59.999 --> 99:59:59.999 in the world and in society. 99:59:59.999 --> 99:59:59.999 Most of us would recognise today that crime correlates to poverty, 99:59:59.999 --> 99:59:59.999 that infection correlates to poor sanitation, 99:59:59.999 --> 99:59:59.999 and that knowledge of statistics correlates to being great at dancing. 99:59:59.999 --> 99:59:59.999 Correlations can be very tricky. 99:59:59.999 --> 99:59:59.999 I've got a joke about silly correlations. 99:59:59.999 --> 99:59:59.999 There was this American who was afraid of heart attacks. 99:59:59.999 --> 99:59:59.999 He found out that the Japanese ate very little fat, and almost didn't drink wine, 99:59:59.999 --> 99:59:59.999 and had far fewer heart attacks than the Americans. 99:59:59.999 --> 99:59:59.999 But on the other hand, he found out that the French eat as much fat as the Americans 99:59:59.999 --> 99:59:59.999 and they drink much more wine, but they also have fewer heart attacks. 99:59:59.999 --> 99:59:59.999 So he concluded that what kills you is speaking English. 
99:59:59.999 --> 99:59:59.999 The best example of a really ground-breaking correlation 99:59:59.999 --> 99:59:59.999 was the link that was established in the 1950s between smoking and lung cancer. 99:59:59.999 --> 99:59:59.999 Not long after the Second World War, a British doctor, Richard Doll, 99:59:59.999 --> 99:59:59.999 investigated lung cancer patients in twenty London hospitals, 99:59:59.999 --> 99:59:59.999 and he became certain that the only thing they had in common was smoking, 99:59:59.999 --> 99:59:59.999 so certain that he stopped smoking himself. 99:59:59.999 --> 99:59:59.999 But other people weren't so sure. 99:59:59.999 --> 99:59:59.999 Lots of the discussion of early data linking smoking and lung cancer was: 99:59:59.999 --> 99:59:59.999 it can't be smoking, surely, that thing we've done all our lives, that can't be bad for you. 99:59:59.999 --> 99:59:59.999 Maybe it's genes, maybe people who are genetically predisposed to get lung cancer 99:59:59.999 --> 99:59:59.999 are also genetically predisposed to smoke. 99:59:59.999 --> 99:59:59.999 Maybe it's not the smoking, maybe it's air pollution, 99:59:59.999 --> 99:59:59.999 that smokers are somehow more exposed to air pollution than non-smokers. 99:59:59.999 --> 99:59:59.999 Maybe it's not smoking, maybe it's poverty. 99:59:59.999 --> 99:59:59.999 So now we have three possible explanations apart from chance. 99:59:59.999 --> 99:59:59.999 To verify that his correlation did imply cause and effect, 99:59:59.999 --> 99:59:59.999 Richard Doll created the biggest statistical study of smoking yet. 99:59:59.999 --> 99:59:59.999 He began tracking the lives of 40,000 British doctors, 99:59:59.999 --> 99:59:59.999 some of whom smoked, some of whom didn't. 99:59:59.999 --> 99:59:59.999 And gathered enough data to correlate the amount the doctors smoked 99:59:59.999 --> 99:59:59.999 with their likelihood of getting cancer. 
99:59:59.999 --> 99:59:59.999 Eventually, he did not only show a correlation between smoking and lung cancer 99:59:59.999 --> 99:59:59.999 but also a correlation between stopping smoking and reducing the risk. 99:59:59.999 --> 99:59:59.999 This was science at its best. 99:59:59.999 --> 99:59:59.999 What correlations do not replace is human thought. 99:59:59.999 --> 99:59:59.999 We could think about what it means. 99:59:59.999 --> 99:59:59.999 What a good scientist does if he or she comes up with a correlation 99:59:59.999 --> 99:59:59.999 is try as hard as he or she possibly can to disprove it, 99:59:59.999 --> 99:59:59.999 to break it down, to get rid of it, to try to refute it, 99:59:59.999 --> 99:59:59.999 and if it withstands all those efforts at demolishing it, and it's still standing up, 99:59:59.999 --> 99:59:59.999 then we might really have something here. 99:59:59.999 --> 99:59:59.999 However brilliant the scientist, data is still the oxygen of science. 99:59:59.999 --> 99:59:59.999 The good news is that the more we have, the more correlations we'll find, 99:59:59.999 --> 99:59:59.999 the more theories we'll test, and the more discoveries we are likely to make. 99:59:59.999 --> 99:59:59.999 And history shows how our total sum of information grows in huge leaps 99:59:59.999 --> 99:59:59.999 as we develop new technologies. 99:59:59.999 --> 99:59:59.999 The invention of the printing press kicked off the first data and information explosion. 99:59:59.999 --> 99:59:59.999 If you piled up all the books that had been printed by the year 1700 99:59:59.999 --> 99:59:59.999 they would make sixty stacks, each as high as Mount Everest. 99:59:59.999 --> 99:59:59.999 Then, starting in the 19th century, there came a second information revolution. 99:59:59.999 --> 99:59:59.999 With the telegraph, gramophone, camera, and later radio and TV. 99:59:59.999 --> 99:59:59.999 The total amount of information exploded. 
99:59:59.999 --> 99:59:59.999 And by the 1950s the information available to us all had multiplied six thousand times. 99:59:59.999 --> 99:59:59.999 Then, thanks to the computer, and later the Internet, we went digital, 99:59:59.999 --> 99:59:59.999 and the amount of data we have now is unimaginably vast. 99:59:59.999 --> 99:59:59.999 A single letter printed in a book is equivalent to a byte of data. 99:59:59.999 --> 99:59:59.999 A single page equals a kilobyte or two. 99:59:59.999 --> 99:59:59.999 Five megabytes is enough for the complete works of Shakespeare. 99:59:59.999 --> 99:59:59.999 10 gigabytes, that's a DVD movie. 99:59:59.999 --> 99:59:59.999 2 terabytes is the tens of millions of photos added to Facebook every day. 99:59:59.999 --> 99:59:59.999 10 petabytes is the data recorded every second by the world's largest particle accelerator, 99:59:59.999 --> 99:59:59.999 so much only a tiny fraction is kept. 99:59:59.999 --> 99:59:59.999 6 exabytes is what you'd have if you sequenced the genomes of every single person on Earth. 99:59:59.999 --> 99:59:59.999 But really, that's nothing. In 2009, the Internet added up to 600 exabytes, 99:59:59.999 --> 99:59:59.999 and in 2010, in just one year, that will double to more than one zettabyte. 99:59:59.999 --> 99:59:59.999 But in the real world, if we turned all this data into print 99:59:59.999 --> 99:59:59.999 it would make ninety stacks of books, each reaching from here all the way to the Sun. 99:59:59.999 --> 99:59:59.999 The data deluge is staggering. But with today's computers and statistics, 99:59:59.999 --> 99:59:59.999 I'm confident we can handle it. 99:59:59.999 --> 99:59:59.999 When it comes to all the data on the Internet, 99:59:59.999 --> 99:59:59.999 the powerhouse of statistical analysis is the Silicon Valley giant Google. 99:59:59.999 --> 99:59:59.999 The average person over their lifetime 99:59:59.999 --> 99:59:59.999 is exposed to about a hundred million words of conversation. 
99:59:59.999 --> 99:59:59.999 So if you multiply that by the six billion people on the planet 99:59:59.999 --> 99:59:59.999 that amount of words is equal to the amount of words 99:59:59.999 --> 99:59:59.999 that Google has available at any one instant of time. 99:59:59.999 --> 99:59:59.999 Google's computers hoover up and file away 99:59:59.999 --> 99:59:59.999 every document, web page and image they can find. 99:59:59.999 --> 99:59:59.999 Then they hunt for patterns and correlations in all this data 99:59:59.999 --> 99:59:59.999 doing statistics on a massive scale. 99:59:59.999 --> 99:59:59.999 And for me, Google has one project that is particularly exciting: 99:59:59.999 --> 99:59:59.999 statistical language translation. 99:59:59.999 --> 99:59:59.999 If you do want to provide access to all the web's information 99:59:59.999 --> 99:59:59.999 no matter what language is spoken. 99:59:59.999 --> 99:59:59.999 There's so much information on the Internet, you cannot hope to translate it all by hand 99:59:59.999 --> 99:59:59.999 into every possible language, we figured we have to be able to do machine translation. 99:59:59.999 --> 99:59:59.999 In the past, programmers tried to teach their computers to see each language as a set of grammatical rules. 99:59:59.999 --> 99:59:59.999 Much like languages are taught at school. 99:59:59.999 --> 99:59:59.999 But this didn't work, because no set of rules could capture language in all its subtlety and ambiguity. 99:59:59.999 --> 99:59:59.999 Having eaten our lunch, the coach departed. 99:59:59.999 --> 99:59:59.999 That's obviously incorrect. Written like that, it would imply that the coach has eaten the lunch. 99:59:59.999 --> 99:59:59.999 It would be far better to say: Having eaten our lunch, we departed in the coach. 99:59:59.999 --> 99:59:59.999 Those rules are helpful, they are useful most of the time, 99:59:59.999 --> 99:59:59.999 but they don't turn out to be true all the time. 
99:59:59.999 --> 99:59:59.999 And the insight of using statistical machine translation 99:59:59.999 --> 99:59:59.999 is saying: if we have all these exceptions anyways, maybe you can get by without having any rules, 99:59:59.999 --> 99:59:59.999 maybe we can treat everything as an exception, and that's essentially what we've done. 99:59:59.999 --> 99:59:59.999 What the computer is doing when it's learning how to translate 99:59:59.999 --> 99:59:59.999 is to learn correlations between words and between phrases 99:59:59.999 --> 99:59:59.999 so we feed the system very large amounts of data 99:59:59.999 --> 99:59:59.999 and the system sees if a certain word or phrase correlates very often to the other language. 99:59:59.999 --> 99:59:59.999 Google's website currently offers translation between any of 57 different languages. 99:59:59.999 --> 99:59:59.999 It does this purely statistically, having correlated a huge collection of multilingual texts. 99:59:59.999 --> 99:59:59.999 The people who built the system don't need to know Chinese 99:59:59.999 --> 99:59:59.999 in order to build the Chinese system. They don't need to know Arabic. 99:59:59.999 --> 99:59:59.999 The expertise that is needed is basically knowledge of statistics, of computer science, 99:59:59.999 --> 99:59:59.999 of infrastructure, 99:59:59.999 --> 99:59:59.999 to build these very large computer systems we are building for doing that. 99:59:59.999 --> 99:59:59.999 I hooked up with Google from my office in Stockholm, to try the translator by myself. 99:59:59.999 --> 99:59:59.999 I will type some Swedish sentences. 99:59:59.999 --> 99:59:59.999 (Types in Swedish) 99:59:59.999 --> 99:59:59.999 (Reads on the screen) Sweden's finance minister has a ponytail and a gold ring in your ear. 99:59:59.999 --> 99:59:59.999 It's almost exactly correct, it's amazing. 99:59:59.999 --> 99:59:59.999 He comes from the conservative party, that's the kind of Sweden we have today. 
99:59:59.999 --> 99:59:59.999 I will type one more sentence. 99:59:59.999 --> 99:59:59.999 In his same-sex partnerships has Stockholm's new bishop and his partners a three-year son. 99:59:59.999 --> 99:59:59.999 It's almost perfect, there's one important thing, it's "her". 99:59:59.999 --> 99:59:59.999 It's a lesbian partnership. 99:59:59.999 --> 99:59:59.999 OK, those kinds of words like "her" are one of the challenges in translation, 99:59:59.999 --> 99:59:59.999 to get those right. 99:59:59.999 --> 99:59:59.999 When it comes to bishops, one can excuse it. 99:59:59.999 --> 99:59:59.999 Right, I think that more often than not it would be probably a "his". 99:59:59.999 --> 99:59:59.999 I will write one more sentence. (Reads aloud in Swedish) 99:59:59.999 --> 99:59:59.999 When Sweden is taking part in Olympic gold, is not to win but to beat Norway. 99:59:59.999 --> 99:59:59.999 But they are very good in Winter Olympics, so we can't make it, but we are trying. 99:59:59.999 --> 99:59:59.999 Very good, very good. 99:59:59.999 --> 99:59:59.999 This is absolutely amazing, 99:59:59.999 --> 99:59:59.999 and I'm impressed that it picked up words like "same-sex partnerships" 99:59:59.999 --> 99:59:59.999 which are very new to the language. 99:59:59.999 --> 99:59:59.999 The translator is good, but if it succeeds, what will be next, that'll be remarkable. 99:59:59.999 --> 99:59:59.999 One of the exciting possibilities is combining the machine translation technology 99:59:59.999 --> 99:59:59.999 with the speech recognition technology. 99:59:59.999 --> 99:59:59.999 Both of these are statistical in nature. 99:59:59.999 --> 99:59:59.999 The machine translation relies on the statistics of mapping from one language to another, 99:59:59.999 --> 99:59:59.999 and similarly speech recognition relies on the statistics of mapping from a sound form to the words. 
99:59:59.999 --> 99:59:59.999 When we put them together, now we have the capability 99:59:59.999 --> 99:59:59.999 of having instant conversations between two people who don't speak a common language. 99:59:59.999 --> 99:59:59.999 I can talk to you in my language, you hear me in your language, 99:59:59.999 --> 99:59:59.999 and you can answer back in real time, we can make that translation, 99:59:59.999 --> 99:59:59.999 we can bring people together and allow them to speak. 99:59:59.999 --> 99:59:59.999 The Internet is just one of many technologies created to gather massive amounts of data. 99:59:59.999 --> 99:59:59.999 Scientists studying our Earth and our environment 99:59:59.999 --> 99:59:59.999 now use an incredible range of instruments to measure the processes of our planet. 99:59:59.999 --> 99:59:59.999 All around us, sensors are continuously measuring temperature, water flow and ocean currents. 99:59:59.999 --> 99:59:59.999 High in orbit, satellites are busy imaging cloud formations, forest growth and snow cover. 99:59:59.999 --> 99:59:59.999 Scientists speak of instrumenting the Earth. 99:59:59.999 --> 99:59:59.999 And pointing up to the skies above, 99:59:59.999 --> 99:59:59.999 our powerful new telescopes are mapping the Universe. 99:59:59.999 --> 99:59:59.999 What's happening in astronomy is typical of how profoundly this torrent of data 99:59:59.999 --> 99:59:59.999 is transforming science. 99:59:59.999 --> 99:59:59.999 Astronomers are now addressing many enduring mysteries of the cosmos 99:59:59.999 --> 99:59:59.999 by applying statistical methods to all this new data. 
99:59:59.999 --> 99:59:59.999 The galaxy is a very big place and it has billions of stars in it 99:59:59.999 --> 99:59:59.999 so to put together a coherent picture of the whole galaxy requires 99:59:59.999 --> 99:59:59.999 having enormous amounts of data, and before you could do a large sky survey 99:59:59.999 --> 99:59:59.999 with sensitive digital detectors, so that you can map many stars at once, 99:59:59.999 --> 99:59:59.999 it was very difficult to gather enough data on enough of the galaxy. 99:59:59.999 --> 99:59:59.999 In the past, large surveys of the night sky had to be done 99:59:59.999 --> 99:59:59.999 by exposing thousands of large photographic plates, 99:59:59.999 --> 99:59:59.999 but these surveys could take 25 years or more to complete. 99:59:59.999 --> 99:59:59.999 Then, in the 1990s, came digital astronomy, 99:59:59.999 --> 99:59:59.999 and a huge increase in both the amount and the accessibility of data. 99:59:59.999 --> 99:59:59.999 The Sloan Sky Survey is the world's biggest yet, 99:59:59.999 --> 99:59:59.999 using a massive digital sensor mounted 99:59:59.999 --> 99:59:59.999 on the back of a custom-built telescope in New Mexico. 99:59:59.999 --> 99:59:59.999 It's scanned the sky night after night for eight years 99:59:59.999 --> 99:59:59.999 building up a composite picture in unprecedented resolution. 99:59:59.999 --> 99:59:59.999 The Sloan data is some of the best, deepest survey data 99:59:59.999 --> 99:59:59.999 we have in astronomy, 99:59:59.999 --> 99:59:59.999 both in our galaxy and galaxies away from ours. 99:59:59.999 --> 99:59:59.999 All the Sloan data is on the Internet 99:59:59.999 --> 99:59:59.999 and with it astronomers have identified 99:59:59.999 --> 99:59:59.999 millions of hitherto unknown stars and galaxies. 99:59:59.999 --> 99:59:59.999 They also comb the database for statistical patterns 99:59:59.999 --> 99:59:59.999 which will prove, disprove or suggest new theories. 
99:59:59.999 --> 99:59:59.999 So we have this idea that galaxies grow, 99:59:59.999 --> 99:59:59.999 they become large galaxies 99:59:59.999 --> 99:59:59.999 like the one we live in, the Milky Way. 99:59:59.999 --> 99:59:59.999 Not all at once, not smoothly, 99:59:59.999 --> 99:59:59.999 but by continuously incorporating, 99:59:59.999 --> 99:59:59.999 cannibalising smaller galaxies, 99:59:59.999 --> 99:59:59.999 they dissolve them and they become part of the bigger galaxy. 99:59:59.999 --> 99:59:59.999 It's a startling idea 99:59:59.999 --> 99:59:59.999 and in the Sloan data there's the evidence to support it. 99:59:59.999 --> 99:59:59.999 Groups of stars that came from cannibalised galaxies 99:59:59.999 --> 99:59:59.999 stand out in the Sloan data as statistically different from other stars 99:59:59.999 --> 99:59:59.999 because they move at a different velocity. 99:59:59.999 --> 99:59:59.999 Each big spike on one of these distribution graphs 99:59:59.999 --> 99:59:59.999 means Professor Rockosi has found a group of stars 99:59:59.999 --> 99:59:59.999 all travelling in a different way to the rest. 99:59:59.999 --> 99:59:59.999 They are the telltale patterns she's looking for. 99:59:59.999 --> 99:59:59.999 The evidence is accumulating that in fact 99:59:59.999 --> 99:59:59.999 this really is how galaxies grow, 99:59:59.999 --> 99:59:59.999 or an important way of how galaxies grow, 99:59:59.999 --> 99:59:59.999 this is important to understand how galaxies form, 99:59:59.999 --> 99:59:59.999 not only ours but every galaxy. 99:59:59.999 --> 99:59:59.999 The more data there is the more discoveries can be made 99:59:59.999 --> 99:59:59.999 and the technology is getting better all the time. 99:59:59.999 --> 99:59:59.999 The next big survey telescope starts its work in 2015. 99:59:59.999 --> 99:59:59.999 It will leave Sloan in the dust. 99:59:59.999 --> 99:59:59.999 Sloan has taken eight years to cover one quarter of the night sky. 
99:59:59.999 --> 99:59:59.999 The new telescope will scan the entire sky in even greater resolution 99:59:59.999 --> 99:59:59.999 every three days. 99:59:59.999 --> 99:59:59.999 The vast amounts of data we have today 99:59:59.999 --> 99:59:59.999 allow researchers in all sorts of fields 99:59:59.999 --> 99:59:59.999 to test their theories on a previously unimaginable scale 99:59:59.999 --> 99:59:59.999 but it may even change the fundamental way science is done. 99:59:59.999 --> 99:59:59.999 With the power of today's computers applied to all this data 99:59:59.999 --> 99:59:59.999 the machines might be able to guide the researchers. 99:59:59.999 --> 99:59:59.999 This is a profoundly important, one of the most significant points in science, 99:59:59.999 --> 99:59:59.999 certainly one of the most exciting, 99:59:59.999 --> 99:59:59.999 the potential to transform not only how scientists do science 99:59:59.999 --> 99:59:59.999 but what science is possible. 99:59:59.999 --> 99:59:59.999 What will power that transformation of how science is done 99:59:59.999 --> 99:59:59.999 is going to be computation. 99:59:59.999 --> 99:59:59.999 Many of the dynamics of the natural world 99:59:59.999 --> 99:59:59.999 like the interplay between the rainforest and the atmosphere 99:59:59.999 --> 99:59:59.999 are so complex that we don't yet really understand them. 99:59:59.999 --> 99:59:59.999 But now computers are generating tens of thousands of simulations 99:59:59.999 --> 99:59:59.999 of how these biological systems might work. 99:59:59.999 --> 99:59:59.999 It's like creating thousands of hypothetical parallel worlds. 99:59:59.999 --> 99:59:59.999 Each of these simulations is analysed with statistics 99:59:59.999 --> 99:59:59.999 to see if any are a good match for what is observed in each. 99:59:59.999 --> 99:59:59.999 The computers can now automatically generate, 99:59:59.999 --> 99:59:59.999 test and discard hypotheses with scarcely any human insight. 
99:59:59.999 --> 99:59:59.999 This new application of statistics will become 99:59:59.999 --> 99:59:59.999 absolutely vital for the future of science. 99:59:59.999 --> 99:59:59.999 It's creating a new paradigm in the way we do science 99:59:59.999 --> 99:59:59.999 which is characterised as data-centric or data-driven 99:59:59.999 --> 99:59:59.999 rather than hypothesis- or experiment-driven. 99:59:59.999 --> 99:59:59.999 It's an exciting time in terms of science, computation and statistics. 99:59:59.999 --> 99:59:59.999 If all this sounds a bit abstract to you 99:59:59.999 --> 99:59:59.999 how about one final frontier? 99:59:59.999 --> 99:59:59.999 Could statistics make sense of your feelings? 99:59:59.999 --> 99:59:59.999 In California (where else!), one computer scientist 99:59:59.999 --> 99:59:59.999 is harvesting the Internet to try to define the patterns 99:59:59.999 --> 99:59:59.999 of our innermost thoughts and emotions. 99:59:59.999 --> 99:59:59.999 This is the Madness Movement, 99:59:59.999 --> 99:59:59.999 it represents a skyscraper's view of the world. 99:59:59.999 --> 99:59:59.999 Each brightly coloured dot is an individual feeling 99:59:59.999 --> 99:59:59.999 expressed by someone out there in a blog or a tweet 99:59:59.999 --> 99:59:59.999 and when you click on the dot 99:59:59.999 --> 99:59:59.999 it explodes to reveal the underlying feeling of that person. 
99:59:59.999 --> 99:59:59.999 This is what people say they're feeling today: 99:59:59.999 --> 99:59:59.999 better 99:59:59.999 --> 99:59:59.999 safe 99:59:59.999 --> 99:59:59.999 crappy 99:59:59.999 --> 99:59:59.999 well 99:59:59.999 --> 99:59:59.999 pretty 99:59:59.999 --> 99:59:59.999 special 99:59:59.999 --> 99:59:59.999 sorry 99:59:59.999 --> 99:59:59.999 alone 99:59:59.999 --> 99:59:59.999 Every minute WeFeelFine crawls the world's blogs, 99:59:59.999 --> 99:59:59.999 takes all the sentences that start with the words "I feel" or "I'm feeling" 99:59:59.999 --> 99:59:59.999 and pushes them into a database. 99:59:59.999 --> 99:59:59.999 We collect all the feelings and we count the most common: 99:59:59.999 --> 99:59:59.999 better 99:59:59.999 --> 99:59:59.999 bad 99:59:59.999 --> 99:59:59.999 good 99:59:59.999 --> 99:59:59.999 right 99:59:59.999 --> 99:59:59.999 guilty 99:59:59.999 --> 99:59:59.999 sick 99:59:59.999 --> 99:59:59.999 the same 99:59:59.999 --> 99:59:59.999 like shit 99:59:59.999 --> 99:59:59.999 sorry 99:59:59.999 --> 99:59:59.999 well 99:59:59.999 --> 99:59:59.999 We can take a look at any one feeling and analyse it. 99:59:59.999 --> 99:59:59.999 Right now a lot of people are feeling happy. 99:59:59.999 --> 99:59:59.999 We can take a look at these people, and break them down by age, gender or location. 99:59:59.999 --> 99:59:59.999 Since bloggers have public profiles, we have that information 99:59:59.999 --> 99:59:59.999 and we can ask questions like, "Are women happier than men?" 99:59:59.999 --> 99:59:59.999 or "Is England happier than the United States?" 99:59:59.999 --> 99:59:59.999 We find that as people get older, they get happier. 99:59:59.999 --> 99:59:59.999 For younger people, happiness associates with excitement 99:59:59.999 --> 99:59:59.999 whereas older people associate happiness more with peacefulness. 
99:59:59.999 --> 99:59:59.999 We also find that women feel loved more often than men, 99:59:59.999 --> 99:59:59.999 but also more guilty. 99:59:59.999 --> 99:59:59.999 While men feel good more often than women, but also more alone. 99:59:59.999 --> 99:59:59.999 As people live more and more of their lives online 99:59:59.999 --> 99:59:59.999 they leave behind digital traces 99:59:59.999 --> 99:59:59.999 with which we can statistically analyse 99:59:59.999 --> 99:59:59.999 what it means to be human. 99:59:59.999 --> 99:59:59.999 Where does all this leave us? 99:59:59.999 --> 99:59:59.999 We generate unimaginable quantities of data 99:59:59.999 --> 99:59:59.999 about everything you can think of 99:59:59.999 --> 99:59:59.999 and we analyse it to reveal the patterns. 99:59:59.999 --> 99:59:59.999 Now not only experts but all of us can understand 99:59:59.999 --> 99:59:59.999 the stories in the numbers. 99:59:59.999 --> 99:59:59.999 Instead of being led astray by prejudice, 99:59:59.999 --> 99:59:59.999 with statistics at our fingertips, our eyes can be open 99:59:59.999 --> 99:59:59.999 for a fact-based view of the world. 99:59:59.999 --> 99:59:59.999 More than ever before we can become authors of our own destiny. 99:59:59.999 --> 99:59:59.999 And that's pretty exciting, isn't it? 99:59:59.999 --> 99:59:59.999 (Music)