WEBVTT 00:00:07.808 --> 00:00:10.839 A toothpaste brand claims their product will destroy more plaque 00:00:10.839 --> 00:00:12.910 than any product ever made. 00:00:12.910 --> 00:00:16.411 A politician tells you their plan will create the most jobs. 00:00:16.411 --> 00:00:18.951 We're so used to hearing these kinds of exaggerations 00:00:18.951 --> 00:00:20.850 in advertising and politics 00:00:20.850 --> 00:00:23.131 that we might not even bat an eye. 00:00:23.131 --> 00:00:26.111 But what about when the claim is accompanied by a graph? 00:00:26.111 --> 00:00:28.471 After all, a graph isn't an opinion. 00:00:28.471 --> 00:00:32.611 It represents cold, hard numbers, and who can argue with those? 00:00:32.611 --> 00:00:36.403 Yet, as it turns out, there are plenty of ways graphs can mislead 00:00:36.403 --> 00:00:38.192 and outright manipulate. 00:00:38.192 --> 00:00:40.745 Here are some things to look out for. 00:00:40.745 --> 00:00:45.760 In this 1992 ad, Chevy claimed to make the most reliable trucks in America 00:00:45.760 --> 00:00:47.510 using this graph. 00:00:47.510 --> 00:00:51.963 Not only does it show that 98% of all Chevy trucks sold in the last ten years 00:00:51.963 --> 00:00:53.592 are still on the road, 00:00:53.592 --> 00:00:57.338 but it looks like they're twice as dependable as Toyota trucks. 00:00:57.338 --> 00:01:00.634 That is, until you take a closer look at the numbers on the left 00:01:00.634 --> 00:01:05.472 and see that the figure for Toyota is about 96.5%. 00:01:05.472 --> 00:01:09.313 The scale only goes between 95 and 100%. 00:01:09.313 --> 00:01:12.963 If it went from 0 to 100, it would look like this. 00:01:12.963 --> 00:01:16.243 This is one of the most common ways graphs misrepresent data, 00:01:16.243 --> 00:01:18.333 by distorting the scale. 00:01:18.333 --> 00:01:20.804 Zooming in on a small portion of the y-axis 00:01:20.804 --> 00:01:25.703 exaggerates a barely detectable difference between the things being compared.
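The arithmetic behind the truck example can be checked with a few lines of Python. This is only a rough sketch: the helper name `visible_ratio` is made up for illustration, and it assumes a bar is drawn from the bottom of the y-axis up to its value, so its visible height is the value minus the axis baseline. The two percentages are the ones quoted in the narration.

```python
def visible_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the y-axis starts at `baseline`.

    Hypothetical helper: assumes each bar runs from the axis baseline to its value.
    """
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (a - baseline) / (b - baseline)


# Percentages quoted in the transcript (Toyota's is described as "about 96.5%").
chevy, toyota = 98.0, 96.5

truncated = visible_ratio(chevy, toyota, baseline=95.0)  # axis runs 95 to 100
full = visible_ratio(chevy, toyota, baseline=0.0)        # axis runs 0 to 100

print(f"axis from 95: one bar is {truncated:.2f}x the height of the other")
print(f"axis from 0:  one bar is {full:.3f}x the height of the other")
```

Under these assumptions the truncated axis makes one bar twice as tall as the other, while a zero-based axis leaves them within about 1.6% of each other.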
00:01:25.703 --> 00:01:27.974 And it's especially misleading with bar graphs 00:01:27.974 --> 00:01:31.023 since we assume the difference in the size of the bars 00:01:31.023 --> 00:01:33.233 is proportional to the values. 00:01:33.233 --> 00:01:36.125 But the scale can also be distorted along the x-axis, 00:01:36.125 --> 00:01:40.414 usually in line graphs showing something changing over time. 00:01:40.414 --> 00:01:44.747 This chart showing the rise in American unemployment from 2008 to 2010 00:01:44.747 --> 00:01:47.996 manipulates the x-axis in two ways. 00:01:47.996 --> 00:01:50.395 First of all, the scale is inconsistent, 00:01:50.395 --> 00:01:53.416 compressing the 15-month span after March 2009 00:01:53.416 --> 00:01:56.755 to look shorter than the preceding six months. 00:01:56.755 --> 00:02:00.106 Using more consistent data points gives a different picture 00:02:00.106 --> 00:02:03.705 with job losses tapering off by the end of 2009. 00:02:03.705 --> 00:02:06.675 And if you wonder why they were increasing in the first place, 00:02:06.675 --> 00:02:10.615 the timeline starts immediately after the U.S.'s biggest financial collapse 00:02:10.615 --> 00:02:12.626 since the Great Depression. 00:02:12.626 --> 00:02:15.219 These techniques are known as cherry picking. 00:02:15.219 --> 00:02:18.869 A time range can be carefully chosen to exclude the impact of a major event 00:02:18.869 --> 00:02:20.648 right outside it. 00:02:20.648 --> 00:02:24.762 And picking specific data points can hide important changes in between. 00:02:24.762 --> 00:02:27.356 Even when there's nothing wrong with the graph itself, 00:02:27.356 --> 00:02:30.937 leaving out relevant data can give a misleading impression. 00:02:30.937 --> 00:02:33.997 This chart of how many people watch the Super Bowl each year 00:02:33.997 --> 00:02:37.626 makes it look like the event's popularity is exploding. 00:02:37.626 --> 00:02:40.198 But it's not accounting for population growth. 
00:02:40.198 --> 00:02:41.967 The ratings have actually held steady 00:02:41.967 --> 00:02:45.109 because while the number of football fans has increased, 00:02:45.109 --> 00:02:47.959 their share of overall viewership has not. 00:02:47.959 --> 00:02:49.888 Finally, a graph can't tell you much 00:02:49.888 --> 00:02:53.318 if you don't know the full significance of what's being presented. 00:02:53.318 --> 00:02:56.457 Both of the following graphs use the same ocean temperature data 00:02:56.457 --> 00:02:59.719 from the National Centers for Environmental Information. 00:02:59.719 --> 00:03:02.490 So why do they seem to give opposite impressions? 00:03:02.490 --> 00:03:05.279 The first graph plots the average annual ocean temperature 00:03:05.279 --> 00:03:07.987 from 1880 to 2016, 00:03:07.987 --> 00:03:10.149 making the change look insignificant. 00:03:10.149 --> 00:03:12.878 But in fact, a rise of even half a degree Celsius 00:03:12.878 --> 00:03:15.799 can cause massive ecological disruption. 00:03:15.799 --> 00:03:17.219 This is why the second graph, 00:03:17.219 --> 00:03:19.858 which shows the average temperature variation each year, 00:03:19.858 --> 00:03:22.390 is far more significant. 00:03:22.390 --> 00:03:27.379 When they're used well, graphs can help us intuitively grasp complex data. 00:03:27.379 --> 00:03:31.180 But as visual software has enabled more usage of graphs throughout all media, 00:03:31.180 --> 00:03:35.900 it's also made them easier to use in a careless or dishonest way. 00:03:35.900 --> 00:03:39.560 So the next time you see a graph, don't be swayed by the lines and curves. 00:03:39.560 --> 00:03:40.882 Look at the labels, 00:03:40.882 --> 00:03:42.130 the numbers, 00:03:42.130 --> 00:03:43.048 the scale, 00:03:43.048 --> 00:03:44.360 and the context, 00:03:44.360 --> 00:03:46.780 and ask what story the picture is trying to tell.