1 00:00:26,335 --> 00:00:27,831 I'm here to tell you 2 00:00:28,021 --> 00:00:31,441 why I don't tell the truth about castles. 3 00:00:33,211 --> 00:00:34,913 You might think it's my job. 4 00:00:34,913 --> 00:00:36,664 After all, we expect professionals 5 00:00:36,664 --> 00:00:39,890 to speak with authority and give us clear-cut solutions, 6 00:00:40,110 --> 00:00:42,110 and that makes us very, very nervous 7 00:00:42,110 --> 00:00:45,476 because there's so much we simply don't know about history. 8 00:00:45,926 --> 00:00:46,932 And as a result, 9 00:00:46,932 --> 00:00:51,055 a lot of things have become established in our collective memory as the truth 10 00:00:51,055 --> 00:00:53,342 simply because someone said it once, 11 00:00:53,342 --> 00:00:55,101 it sounded convincing, 12 00:00:55,101 --> 00:00:56,866 and nobody since has stood up to say, 13 00:00:56,866 --> 00:00:59,627 "Well, we don't know exactly what it was like, 14 00:00:59,627 --> 00:01:01,404 but it wasn't like that." 15 00:01:03,124 --> 00:01:05,169 Take Greek temples. 16 00:01:05,169 --> 00:01:08,664 Everyone knows they're made of beautiful shining white marble. 17 00:01:08,664 --> 00:01:12,383 We've seen them that way for centuries, from postcards to museums, 18 00:01:12,863 --> 00:01:16,060 and that establishes certain seeing habits in our heads, 19 00:01:16,650 --> 00:01:21,724 where we've seen them this way so much that anything different just looks wrong. 20 00:01:22,094 --> 00:01:24,982 And yet today we know for a fact 21 00:01:24,982 --> 00:01:27,342 that they were painted in bright garish colors; 22 00:01:27,342 --> 00:01:30,049 we're just a little unclear on some of the details. 23 00:01:32,059 --> 00:01:35,856 I've colored this one in myself in about five minutes of research, 24 00:01:35,856 --> 00:01:38,771 so it's likely to be wrong in all the relevant places, 25 00:01:38,791 --> 00:01:41,772 and it's still more correct than the white one. 
26 00:01:42,722 --> 00:01:45,821 So why do we continue to show them in white? 27 00:01:45,821 --> 00:01:47,497 Well, there's two reasons for that: 28 00:01:47,497 --> 00:01:50,907 One is that we as humans like certainty. 29 00:01:51,187 --> 00:01:55,070 And so we would prefer to be absolutely certain 30 00:01:55,070 --> 00:01:58,069 even if it's the absolute certainty that we are absolutely wrong 31 00:01:58,069 --> 00:01:59,069 (Laughter) 32 00:01:59,069 --> 00:02:01,118 than to say, "Well, 33 00:02:01,118 --> 00:02:04,033 maybe it could have been approximately, 34 00:02:04,033 --> 00:02:05,974 I think, something like …" 35 00:02:05,974 --> 00:02:07,170 And then there's the fact 36 00:02:07,170 --> 00:02:10,466 that when we're trying to establish a new truth in people's heads, 37 00:02:10,466 --> 00:02:13,228 we want it to be the correct truth this time. 38 00:02:13,988 --> 00:02:16,607 But even if we're not entirely clear on all the details, 39 00:02:16,607 --> 00:02:18,754 that doesn't mean we can't make a statement. 40 00:02:19,384 --> 00:02:21,416 If you ask me right now what time it is, 41 00:02:21,416 --> 00:02:22,878 I can't tell you, 42 00:02:23,248 --> 00:02:27,525 but I don't have to shrug my shoulders and just say, "I have no idea." 43 00:02:28,045 --> 00:02:31,078 I know this event is on from 12:00 till 6:00, 44 00:02:31,078 --> 00:02:33,394 so that eliminates half the clock right there. 45 00:02:34,394 --> 00:02:37,368 We've had our first coffee break, we've not had the second, 46 00:02:37,368 --> 00:02:39,260 so it's between 2:00 and 5:00. 47 00:02:40,070 --> 00:02:44,710 I know there were people ahead of me, and I'm not being told I'm out of time, 48 00:02:45,190 --> 00:02:47,762 so it must be around 4:30. 49 00:02:49,362 --> 00:02:51,039 Is that correct? 50 00:02:51,039 --> 00:02:51,988 I don't know. 51 00:02:51,988 --> 00:02:55,420 It might not be the truth, but I don't have to tell you the truth. 
52 00:02:55,420 --> 00:02:58,317 I just have to know how correct I'm likely to be 53 00:02:58,587 --> 00:03:01,941 because how correct I am can be very, very important. 54 00:03:02,210 --> 00:03:04,587 Me telling you it's about 4:30 is pretty useless 55 00:03:04,587 --> 00:03:07,249 if you want to know whether you can still catch your bus; 56 00:03:07,249 --> 00:03:10,018 and in that case, we might have to ask more people, 57 00:03:10,018 --> 00:03:12,541 we might have to fill it in with more clues and so on; 58 00:03:12,541 --> 00:03:13,767 and that's science. 59 00:03:13,767 --> 00:03:15,217 We ask a question, 60 00:03:15,217 --> 00:03:18,744 and then we fill in the unknown and get more and more precise. 61 00:03:19,134 --> 00:03:21,940 So the scientific method is pretty well-established: 62 00:03:22,090 --> 00:03:24,563 You ask a question about the world around you, 63 00:03:24,563 --> 00:03:27,147 you research what you already know about it, 64 00:03:27,147 --> 00:03:31,228 you design an experiment to test what you don't know about it, 65 00:03:31,438 --> 00:03:34,571 you gather the data, you analyze them, and you reach a conclusion; 66 00:03:34,761 --> 00:03:37,241 and that conclusion could be, "I need more data." 67 00:03:37,411 --> 00:03:39,694 Then you go back, design another experiment, 68 00:03:39,694 --> 00:03:41,459 run it again, gather more data, 69 00:03:41,459 --> 00:03:43,810 and you get more data and more data and more data, 70 00:03:43,810 --> 00:03:45,482 and suddenly you're buried in data, 71 00:03:45,482 --> 00:03:47,224 and you're dealing with big data, 72 00:03:47,224 --> 00:03:49,423 where scientists now have this problem 73 00:03:49,423 --> 00:03:51,541 that there's so many data 74 00:03:51,541 --> 00:03:54,056 they can never read them all in one lifetime. 75 00:03:54,056 --> 00:03:56,230 They have to find new ways to deal with that. 76 00:03:56,890 --> 00:03:58,241 And then there's me. 77 00:03:58,971 --> 00:04:00,290 This is me. 
78 00:04:00,290 --> 00:04:03,353 You can tell I'm not the kind of scientist with a lab coat, 79 00:04:03,353 --> 00:04:06,337 and my data problem is slightly different. 80 00:04:06,557 --> 00:04:10,120 Basically, I'm dealing with one student's lab report 81 00:04:10,120 --> 00:04:11,940 that they dropped on the floor, 82 00:04:11,940 --> 00:04:14,163 lost half the pages and then shuffled the rest, 83 00:04:14,163 --> 00:04:16,854 and there's probably a coffee stain on the relevant bit. 84 00:04:16,934 --> 00:04:19,900 So what I've got is I've got half a broken castle, 85 00:04:20,483 --> 00:04:22,143 slightly burned, 86 00:04:23,066 --> 00:04:26,968 I've got a legal contract from 1388 that was written by a guy 87 00:04:26,968 --> 00:04:30,872 who managed to spell the name "Arnold" four different ways in three pages, 88 00:04:31,672 --> 00:04:33,387 I've got some rocks from the village 89 00:04:33,387 --> 00:04:36,433 that may or may not have belonged to this castle, 90 00:04:37,413 --> 00:04:39,278 I have got a map that was done by a guy 91 00:04:39,278 --> 00:04:42,775 for whom this was a 10-minute squiggle in an eight-year campaign, 92 00:04:43,345 --> 00:04:45,689 a painting that was drawn 93 00:04:45,689 --> 00:04:49,026 about 300 years after the castle burnt down, 94 00:04:49,356 --> 00:04:51,822 and a book that was probably propaganda. 95 00:04:52,252 --> 00:04:54,402 And of course I could go look in the archives, 96 00:04:54,402 --> 00:04:57,261 I can get another archaeological excavation going, and so on, 97 00:04:57,261 --> 00:05:00,213 but at some point, there's simply no way to gather more data. 98 00:05:00,213 --> 00:05:02,507 And then you expect me to take that 99 00:05:02,507 --> 00:05:05,864 and mash it all up into the truth about castles? 
100 00:05:09,304 --> 00:05:11,607 You want a reconstruction that's so realistic 101 00:05:11,607 --> 00:05:13,232 it feels like you're really there, 102 00:05:13,232 --> 00:05:16,826 like every little pebble in the courtyard is just right. 103 00:05:17,686 --> 00:05:19,969 There's a reason that a lot of sites and museums 104 00:05:19,969 --> 00:05:22,214 don't use the word "reconstruction"; 105 00:05:22,214 --> 00:05:23,764 instead, you find a picture, 106 00:05:23,764 --> 00:05:27,534 and next to it, it has the disclaimer "Artist's impression." 107 00:05:28,324 --> 00:05:30,657 And that doesn't mean they didn't do any research; 108 00:05:30,657 --> 00:05:33,276 it just means they didn't document what they researched. 109 00:05:33,276 --> 00:05:35,895 We don't know who they talked to, which books they read, 110 00:05:35,895 --> 00:05:37,648 which conclusions they drew, 111 00:05:37,648 --> 00:05:40,330 and which other theories they discarded. 112 00:05:40,330 --> 00:05:42,046 Now, imagine for a moment 113 00:05:42,046 --> 00:05:44,990 that we would treat a text the same way. 114 00:05:44,990 --> 00:05:46,092 You go into the museum. 115 00:05:46,092 --> 00:05:49,602 There's a plaque, and it says, "Author's impression." 116 00:05:49,602 --> 00:05:52,528 The author thinks there might have been a castle here. 117 00:05:52,738 --> 00:05:55,161 You wouldn't take that very seriously. 118 00:05:55,161 --> 00:05:59,127 So why do we treat text so differently from models? 119 00:05:59,127 --> 00:06:03,742 It's because we've come to a consensus on what makes a scientific text, 120 00:06:03,742 --> 00:06:05,636 and it's quite simply this. 
121 00:06:05,956 --> 00:06:08,110 When you're writing a scientific document, 122 00:06:08,110 --> 00:06:09,937 you put in footnotes, 123 00:06:09,937 --> 00:06:12,482 you cite works by previous scholars, 124 00:06:12,507 --> 00:06:14,405 you show your argumentation - 125 00:06:15,675 --> 00:06:18,329 you simply give your document provenance - 126 00:06:19,139 --> 00:06:21,783 because showing you a picture of the truth 127 00:06:22,453 --> 00:06:25,773 isn't going to help you without me explaining why it's true. 128 00:06:26,423 --> 00:06:30,088 The truth is, all of these are correct at the same time. 129 00:06:30,668 --> 00:06:33,295 That's the truth, but it's not a very helpful truth, 130 00:06:34,585 --> 00:06:38,560 because without context, data are not information. 131 00:06:38,560 --> 00:06:40,669 So I'll give you a little context. 132 00:06:43,707 --> 00:06:45,305 So for a little context, 133 00:06:45,305 --> 00:06:48,497 this first clock shows the time in Luxembourg, 134 00:06:48,497 --> 00:06:50,766 and the second one has the time in Tokyo, 135 00:06:51,116 --> 00:06:53,215 the third one is one of those annoying clocks 136 00:06:53,215 --> 00:06:55,501 everyone had in their kitchens about 10 years ago 137 00:06:55,501 --> 00:06:57,164 that actually run counterclockwise, 138 00:06:57,164 --> 00:06:59,644 and the fourth one is not a clock, it's a barometer - 139 00:06:59,644 --> 00:07:01,777 you just wouldn't know that by looking at it. 140 00:07:03,577 --> 00:07:04,776 So in historic research, 141 00:07:04,776 --> 00:07:07,424 when we deal with images, we know what to do: 142 00:07:07,424 --> 00:07:10,977 We give those provenance through metadata and paradata. 143 00:07:11,457 --> 00:07:13,487 Metadata you've probably heard. 144 00:07:13,507 --> 00:07:16,044 Metadata are data about the data. 
145 00:07:16,044 --> 00:07:18,523 You can see those when you're browsing your computer, 146 00:07:18,523 --> 00:07:20,859 and you can see who made a file, when it was made, 147 00:07:20,859 --> 00:07:22,704 when it was last opened, and so on. 148 00:07:23,104 --> 00:07:25,070 Paradata are slightly more complex. 149 00:07:25,070 --> 00:07:28,137 Paradata are data that give context for the data, 150 00:07:28,327 --> 00:07:30,943 so like how they were gathered, how they were processed, 151 00:07:30,943 --> 00:07:33,306 which decisions were made about them, and so on. 152 00:07:34,406 --> 00:07:35,953 The metadata for this image 153 00:07:35,953 --> 00:07:40,300 would be that it was taken by me on the first of June, 2017 154 00:07:40,300 --> 00:07:42,568 on a Sony compact camera. 155 00:07:42,568 --> 00:07:47,362 The paradata are that it was picture 111 in a series of 128 156 00:07:47,362 --> 00:07:50,564 and I took it on my first research trip to this castle. 157 00:07:51,314 --> 00:07:53,232 And I love to show this picture 158 00:07:53,232 --> 00:07:57,826 because this picture has everything in it that is wrong with models. 159 00:07:58,921 --> 00:08:01,786 You walk up the stairs in this castle, you come to the attic, 160 00:08:01,786 --> 00:08:04,502 and there's a big glass box with this model sitting in it. 161 00:08:04,502 --> 00:08:05,645 And what I love about it 162 00:08:05,645 --> 00:08:09,262 is that there are no data attached to it whatsoever. 163 00:08:09,262 --> 00:08:10,918 You don't have a scale bar. 164 00:08:11,958 --> 00:08:15,025 You don't have a date it was made or who made it. 165 00:08:15,055 --> 00:08:17,355 You don't have a date it's supposed to represent. 166 00:08:17,355 --> 00:08:18,647 There's nothing even to say 167 00:08:18,647 --> 00:08:21,603 that it's supposed to be this castle that you're standing in. 
168 00:08:22,483 --> 00:08:24,959 And if you're talking about decision-making processes 169 00:08:24,959 --> 00:08:26,018 in the reconstruction, 170 00:08:26,018 --> 00:08:28,565 if you take a closer look at that center tower there, 171 00:08:28,565 --> 00:08:30,326 it becomes very, very obvious 172 00:08:30,326 --> 00:08:33,896 the size of that tower was not based on an archaeological excavation 173 00:08:33,896 --> 00:08:36,386 or because there was a foundation there or something. 174 00:08:36,386 --> 00:08:38,923 No, that's the size of the toilet paper roll they had. 175 00:08:38,923 --> 00:08:41,138 (Laughter) 176 00:08:41,138 --> 00:08:43,393 And so this model makes me happy 177 00:08:43,393 --> 00:08:45,851 because it's everything I'm trying to avoid. 178 00:08:48,151 --> 00:08:51,113 And I'm not the only person trying to avoid this kind of thing. 179 00:08:51,113 --> 00:08:54,283 A lot of intelligent people are working on avoiding this. 180 00:08:54,593 --> 00:08:57,540 There are some hugely complex systems these days 181 00:08:57,540 --> 00:08:59,891 that go into great detail on data, 182 00:08:59,891 --> 00:09:02,775 metadata, paradata, how they all relate, and so forth; 183 00:09:02,775 --> 00:09:05,954 and my favorite one takes about six months to learn. 184 00:09:06,514 --> 00:09:08,724 Now that's bad enough for me as a researcher, 185 00:09:08,724 --> 00:09:10,790 but imagine that you, as a museum visitor, 186 00:09:10,790 --> 00:09:12,839 have to go on a six-month training course 187 00:09:12,839 --> 00:09:14,852 to understand what you're seeing. 188 00:09:15,522 --> 00:09:19,772 So, instead, I have a system that's just good enough for me. 189 00:09:19,772 --> 00:09:21,686 I simply take my model, 190 00:09:21,686 --> 00:09:26,792 and I tell you which parts are true and which ones are not. 191 00:09:26,792 --> 00:09:28,893 So probably true is the easiest. 
192 00:09:28,893 --> 00:09:31,826 That's the category of things that I think are true 193 00:09:31,826 --> 00:09:33,858 because they're still there, 194 00:09:33,858 --> 00:09:36,567 so that could be things like the castle ruins. 195 00:09:37,757 --> 00:09:39,696 Next, pretty close to true, 196 00:09:40,981 --> 00:09:42,696 we have a lot of evidence for those. 197 00:09:42,696 --> 00:09:43,699 So for example, 198 00:09:43,699 --> 00:09:46,621 I was saying foundations, towers on foundations - 199 00:09:46,621 --> 00:09:49,844 we fill in the gaps where we have good evidence. 200 00:09:50,254 --> 00:09:53,962 Third stage, extrapolation, could be true - maybe not. 201 00:09:53,962 --> 00:09:56,583 That's where I'm working on secondary and tertiary data, 202 00:09:56,583 --> 00:09:58,253 like the maps and images. 203 00:09:58,253 --> 00:10:00,360 And then there's my favorite category - 204 00:10:01,015 --> 00:10:03,495 the stuff that's not really true. 205 00:10:04,900 --> 00:10:07,399 Now, these things I need to put in my model 206 00:10:07,709 --> 00:10:10,328 because the model would be missing something without them. 207 00:10:10,328 --> 00:10:13,094 If I didn't put these in, I would be telling you a lie, 208 00:10:14,014 --> 00:10:16,142 but I have no idea what to really put in. 209 00:10:16,482 --> 00:10:17,816 It's an interesting problem. 210 00:10:17,816 --> 00:10:21,882 So that's things like I know the great hall had paintings on the walls, 211 00:10:21,882 --> 00:10:24,259 I will never know what exactly was painted on them, 212 00:10:24,259 --> 00:10:25,829 so I have to make something up, 213 00:10:25,829 --> 00:10:28,628 but if I left them as a blank stone the way they are now, 214 00:10:28,628 --> 00:10:30,668 that would be making a statement. 215 00:10:31,398 --> 00:10:34,773 And then, of course, I need to attach my metadata and my paradata, 216 00:10:34,773 --> 00:10:37,738 and tell you why it's in that category. 
217 00:10:38,058 --> 00:10:41,174 And finally, I need to make very, very sure 218 00:10:41,174 --> 00:10:43,531 that you don't only know why it's in that category 219 00:10:43,531 --> 00:10:45,892 but which part exactly I'm talking about. 220 00:10:46,492 --> 00:10:49,142 If you remember that clock from earlier, 221 00:10:49,472 --> 00:10:52,769 well, I can tell you for a fact that it's Friday afternoon. 222 00:10:53,089 --> 00:10:55,788 I can also tell you with absolute certainty 223 00:10:55,788 --> 00:10:59,842 that sometime in the last two millennia, we had a castle on this hill. 224 00:11:00,542 --> 00:11:02,537 What I cannot tell you 225 00:11:02,537 --> 00:11:06,119 is whether in that window, in 1548, 226 00:11:06,119 --> 00:11:09,719 we had an archway and that archway had a stone 227 00:11:09,719 --> 00:11:12,864 and that stone was exactly 312 millimeters wide. 228 00:11:13,234 --> 00:11:15,101 It could have been 317, 229 00:11:15,101 --> 00:11:17,800 but my drawing is going to say one way or the other. 230 00:11:19,600 --> 00:11:23,669 And that is the really, really interesting point for future researchers 231 00:11:23,949 --> 00:11:26,531 because if I've told you I have no idea what was here, 232 00:11:26,531 --> 00:11:29,530 they can use that point to research, and then they can say, 233 00:11:29,530 --> 00:11:32,540 "Look, we found more data, and actually you're completely wrong. 234 00:11:32,540 --> 00:11:35,025 It was 483 millimeters." 235 00:11:35,025 --> 00:11:36,569 And I can say, "Hooray!" 236 00:11:36,569 --> 00:11:39,615 because that advances our state of collective knowledge. 237 00:11:40,025 --> 00:11:42,127 So if I'm doing science properly, 238 00:11:42,127 --> 00:11:44,751 I want people to be able to prove me wrong. 
239 00:11:46,390 --> 00:11:49,887 So that's why I'm not going to tell you the truth about castles, 240 00:11:51,454 --> 00:11:54,344 and why I make it very, very clear to you 241 00:11:54,804 --> 00:11:56,336 when I'm just making it up. 242 00:11:56,336 --> 00:11:57,707 (Laughter) 243 00:11:58,487 --> 00:11:59,740 Thank you. 244 00:11:59,740 --> 00:12:02,454 (Applause)