(31C3 title, no sound) Alright, welcome! So, welcome again from me. It's great to be here! So many people, even at this late hour. I've been told this is prime time. That is awesome, at 11 p.m. I'm David, I'm a computer scientist from Bonn. And we can just start with the things that have happened so far at the congress. If you have been here at the congress or watched sessions on the stream - welcome again to the colleagues on the internet - then there will always be devices that you don't like to use so much anymore. [Laughter] Whoever attended the sessions of Tobias Engel and Karsten Nohl now uses their mobile phone with less confidence. And whoever was with starbug afterwards will not like to use iris scanners or fingerprint scanners anymore and may wear gloves more frequently now. So here is a little disclaimer: anyone who has an intimate relationship with their photocopier and intends to keep it that way should refrain from attending this session. We will do three things during this session. First of all, we will get to know one of the most prevalent and dangerous bugs of recent years. Secondly, we will understand the bug, in a manner that both nerds and muggles can follow. And last but not least, for the activists among us - maybe some are present here - we will derive some rules that may help a single person who has to deal with a powerful opponent, such as a global player. In your case it can be something completely different. That's why I will describe precisely how this dispute evolved over time and what kind of mistakes I made. The talk is structured kind of like a novel. First, there's a prologue, for the conspiracy theorists among you. The year is 2008. In the summer of 2008 the US was holding the primaries for the presidential election. Barack Obama was running against Hillary Clinton. In the US, like here, there's lots of intrigue in politics. So there were a few anonymous emails that were meant to benefit Mrs. Clinton.
Those mails claimed, among other things, that Obama had been born in Kenya as a Kenyan citizen. That would make him formally unfit to be president. To become president of the US, you have to be a 'natural born citizen' of the US. What exactly a 'natural born citizen' is, even the Americans themselves don't fully know. But there's a whole wiki article about the controversy, where you can read all about it. Two things are generally acknowledged: First, one has to be American. Second, one has to have been American at the time of birth. So if I come to the US, newly naturalized, that doesn't work. That Obama's middle name is Hussein was somewhat suboptimal too in that context. (laughs) Obama obviously had an interest in ending that 'argument' as quickly as possible. So he made his short birth certificate publicly available. I say 'short birth certificate' because, when he was born, a short and a long one were made. The short one is shown here on the left, you see it behind me. And I see it in front of me. But good conspiracy theorists aren't distracted by facts. (laughter and applause) Immediately, there were accusations that the birth certificate was faked. Supposedly, there was a stamp missing, and ... and ... and. Whatever you can come up with. You all can come up with it. On the right, you see a few bumper stickers by Obama's opponents. The lowermost explicitly calls for the birth certificate. The theory that Obama shouldn't be allowed to be president is rather widespread in the US. Obama won the primaries, and the following election, but the dispute simmered on. There was a whole scene of birthers that wanted to prove Obama is actually not American. After the whole thing hadn't calmed down for two and a half years - Obama already having been president for some time - in 2011 he had had enough. He published the scan of the long version of the birth certificate, on the right in the picture. You can already see there's much more information in it, and you could think: They'll leave him alone now.
But far from it. Shortly after the release there were accusations that the birth certificate was a clumsy forgery. Let's take a closer look. The left picture is a strong enlargement of the red box in the right picture. The numbers six and four are visible. These numbers have sharp, pixel-perfect edges. Yes, it's even visible on the projector. And the numbers are uniformly colored. To their right, the number one is blurred and unevenly colored. The one looks the way you would expect from a real scan. Why is there such a difference between two numbers in one and the same row of numbers? A few more examples. Again one can see numbers with sharp edges, or these checkboxes, in contrast to normal, slightly blurred numbers and boxes. I drew some red boxes around the checkboxes and the 'and'. There one can see a kind of shift. And it does really look as though somebody drew this using Paint. Meaning the ancient one, I am sure you remember it from your childhood. MS Paint on Windows 3.11. I used to sit at my father's workplace and steal his working hours. Or this one, particularly beautiful. This section of the frame is from the stamp at the bottom. There's a typo, in the stamp. Yeah sure, makes sense. We have heard that one before, typo in the stamp. I mean, of course one would think it's a forgery, the way it looks. And at the same time think that the intern at the White House is too stupid to use Photoshop. (laughter) PR-wise this was of course a massive failure. According to a Gallup poll in 2011, 5% of Americans believed Obama was definitely not born in the US. And a further 8% thought that he was 'probably not' born in the US. Well, that didn't work out. The White House had to backpedal pretty badly. To this day they get requests because of this. This was the prologue. We will now move on to the main plot and jump in time to 2013. On the 24th of June 2013 a company I was on friendly terms with called me. They had two big Xerox WorkCentres.
Xerox WorkCentres are those giant business copiers that stand everywhere nowadays. They are connected via Wi-Fi, can scan, print, copy, mail and cost as much as a small car. These printers aren't the ones your grandma uses, but have a few hundred users per device, maybe more. In this picture you can see a construction plan. The black areas aren't original, I just censored those afterwards, since otherwise I would not have been allowed to use it. I marked three spots in yellow on the plans. These spots are standardized blocks containing the floor area of the room. These spots will become more important soon. The company told me: "Hey David, when we scan a construction plan the numbers change. Could you take a look at it?" (laughter) On the left side, that's me. (laughter and applause) At this point I have to add that my relationship with them is really good. I worked my way through my computer science degree. Of course my parents also contributed, I won't deny that. But I did IT service for the company and they were really nice all the time, and of course I thought they were messing with me. For sure. Copier changes numbers?? Of course, makes sense. We've heard that before. They said: "Yes, come over and take a look at it. We need the device, it has to work." So I drove over there and took a look. Still being a bit on the watch for the joke. They have a Xerox WorkCentre 7535. Here are the three marked spots in the original, before scanning. I am not sure how well you can read it, so I will read it out loud. At the top it says 14.13 sqm (square meters), in the middle it's 21.11 sqm, and at the bottom 17.42 sqm. So I put the plans in the WorkCentre and scanned them. And here are the same spots after the scan. (laughter and applause) Interesting. Suddenly all rooms are 14.13 sqm. I thought, this can't be right. Completely impossible. This isn't happening. I was still thinking they were messing with me.
(laughs) While scanning - to clear that up from the beginning, since I got that question a dozen times on the internet - while scanning, the text recognition was turned off. The number substitution takes place in the raw pixel data. The company also had a second WorkCentre, the 7556. That's bigger and faster. Aside from these two kinds of WorkCentres that I mention here in the beginning, there are a lot more. It is a gigantic family of devices. In contrast to the smaller device, which spat out the same numbers every time,... (laughs) the larger one gave out different ones every time. (laughter) It is bigger and has more CPU power. (laughter) Look at those rows and how the values change. At "Stelle 2" (spot 2), that is the middle row, the first and last value is 14.13 sqm. And in the middle 21.11, once. That would have been the correct value, by the way. There is a chance to get it right. (laughter) In the other rows it looks similar. In case one of you needs one of those NSA random generators.... (laughs) (applause) Keep in mind that actually this is no... I am laughing as well, but it is no laughing matter. Note that the numbers are set into the layout perfectly. The error was only noticed because an obviously bigger room had a smaller floor area than a smaller one next to it. There's a broom cupboard with 100 sqm and next to it a ballroom with 4 sqm. (laughter) It hardly gets any meaner. The layout looks perfect. I do realise that the writing is really small. Now don't think this is some nasty corner case that I worked on for three months just to finally stick it to Xerox. We will look at other examples. This is simply the case in which the bug was originally noticed, and I didn't want to keep it from you. Here's the next one. This is a cost table. (laughter) Two sixes became eights. It's funny, I released the picture on my website, and I said: "Here a six became an eight." Then I got an e-mail: "No, at the top there's another one."
(loud laughing and applause) Again perfectly set. Why was it noticed this time? Because the numbers are supposed to be sorted by size. What I want to say is: normally it is impossible to notice. If I give you some columns of numbers that don't carry any obvious meaning, then you obviously could not see that there are wrong numbers. It always depends on there being semantic criteria that make it noticeable, that make it obviously implausible. Otherwise you have no chance to notice. Slowly I became a little worried. The neck gets longer. To make sure these were not just random events, I started working on reproducing the error on purpose. IT guy style, I invested a night and generated number columns in different sizes and fonts. I scanned those and experimented for a few hours. And, indeed, the error occurs again. These are my random numbers. We will be able to work with those some more. The eights marked in yellow should be sixes and do not belong there. Let's pause here for a moment. I promised you in the introduction that I would lay out the entire interaction with Xerox that would follow, over time, and tell you how I felt at the corresponding times, and emphasize the things that in my experience are extremely important when confronting a giant opponent. And I will keep that promise. I will tell you why at all times. But now I will say one thing up front. This thing I will discuss in different ways throughout the entire presentation. What never helps, in my view, is unfriendly tweeting and hating. (self-conscious applause) It's really nice that you are applauding, I wasn't sure that would happen. (laughter) I have nothing against Twitter as such. Nothing at all. But if you want to achieve something, you make yourself vulnerable with such behaviour. And above all you won't be taken seriously. You can always be accused of not wanting a proper discussion. That won't fit in 140 characters, no matter what any of you say.
(applause) Secondly, you can always be accused of seeking attention for yourself. Because almost everything is public on Twitter. At most, Twitter is useful for establishing first contact, when you ask for an e-mail address or a phone number. If I don't recommend Twitter, what do I recommend? Much more serious and straightforward is everything that is not public. That way one shows a willingness to work rationally and not an urge to scream around. That's mail or phone calls. So we called the Xerox support. Several times ... Often ... We phoned our way up through all the levels to the top level in Dublin - nobody knew anything. We also sought personal contact. Staff from the local Xerox retailer came over. That's not Xerox themselves, but a retail and support company. They were shocked - of course, right? And then they tried to reproduce it themselves. Bam! They reproduced it... (laughter and applause) That was .. we are laughing now. They were standing there with their heads hanging low. You are standing there selling these things and suddenly you question your existence. That's not cool at all. At Xerox - not the support company, but the entire, big Xerox, 140,000 employees - there was surprise, but no efforts were made to help us or the retail company. Meaning they took note of the problem. (laughs) (laughter) So there were no signs at all of greater interest and no advice on how to solve the problem. Then one guy came from Xerox Central, who updated the software, we had an ancient one installed. He installed the new software, the problem was still there. I thought: "Great, now we know the problem has existed in the firmware from three years ago until today." Hmmm. When for more than a week nothing happened on Xerox's side that gave any hope, I thought: "Now you have been accommodating enough!" So I wrote a blog article in German and English about what I just told you. In this article I offered test documents to download. Readers can print and scan them and check whether they are affected or not.
With that, the spread of the story started. I have to add, my blog is not really huge, really not. It has around 500-1000 readers per day. That's not a huge amount, but also not nothing, and most of the readers are computer scientists of some form, I know that from the e-mails I get. At the bottom of my slides from now on you can see a line. This line will continuously move further to the right. That's a plot of the clicks. It's not meant to show off with clicks, but in context it's great to see at what time one gets attention in what way, and also to see how fast it fades. We will see that immediately. This small bump - yes, it's visible. The line moved to the right and there's a peak of 3000 hits/hour. Those numbers are from Google Analytics, I have been told one has to multiply them by two, but for the order of magnitude it's enough. On the 2nd and 3rd of August the story hit several tech blogs. At this point I declare the long-known fefe a tech blog. (laughter) I know, I know, there's the first protest. But we can agree on the fact that fefe is read by a lot of IT people. Alright, I am not hearing any more protest. The peak you see here is because of blog.fefe.de. The message spreads, and I get more and more mails from readers that are affected. Most concerning is that I get e-mails with confirmations for a lot of Xerox WorkCentres that I don't even know. (laughter) I told you before, these things are one giant family of products. Very slowly I realise that this could turn into something bigger eventually. Lesson learned: It was good to release the test documents online with the article. Had the users not been able to check for themselves using the test documents, the story would never have had the impact it was soon to have. On the 4th of August the story arrived at tech portals around the world. On the slide is Hacker News by Y Combinator, that's one of the biggest of this kind, you probably know it.
From now on I get hundreds of technically versed e-mails a day. I say "technically versed" because there were also others that were less technical. Over the entire time I spent days channelling and sorting the messages I got. This enabled me to continue the reporting in a professional way and to get to the root of the bug with professional help. The whole thing becomes an avalanche and I am not allowed to sleep any more. Because the US press is on the phone constantly. You must not think that US journalists ever realise that there's a thing called time zones .... (laughter and applause) Here's another anecdote. One would think the US media journalists are competitors. Meaning if one had a special piece of information he would not pass it on to the others, right? As soon as the colleague from ABC had my phone number, ALL of them had it. I tell you, it's incredible! (laughs) Lesson learned: Write these things in multiple languages! English is important for the international audience. And also the language of the home market of the company you are confronting. In my case that's the USA, so English again, two birds with one stone. By the way: in the US Xerox is so strong that "to copy" is called "to xerox" there. They really say that in everyday conversation. The same way we say: "Hand me a Tempo!" (a paper tissue), just to give you an impression of how much standing the company and the brand have there. And when something like this goes around in the world of technology, what's next? Mass media. (some laughing) And there you get the whole package. We'll just click through here to illustrate it. This list is in no way complete, there were suddenly thousands of articles, all over the world.
And if I show an article, then - just as a disclaimer - that doesn't make a statement about the date of publishing, I just arrange it in a way that's good for the show. (some laughter) Browsing through: here is Heise, of course that makes me happy as a computer scientist, they covered the whole story in five articles or so. ZDF Hyperland, yes? I'm showing the German press a bit here. The German press was very reserved. Most articles were in fact from abroad. Hence the comment about the "home market". But here is a small anecdote about the German press. A journalist told me that he wanted to bring the story to the "Tagesschau". They told him: "Yeah, hmm, it's alright. But for this we want it to happen during real copying, and not just during scanning!" (laughter and applause) If anyone from the "Tagesschau" is watching, this applause is for you! (laughter) So I think: You geniuses! Pro tip: If you print a scan, then you have a copy! (laughter) With the difference that such a saved scan can cause harm even years later. But please! So I thought, no "Tagesschau" story, it's going around the world already anyway, not my problem if they are the only ones not covering it. Lesson learned: Stay professional and composed. Don't blow things up out of a thirst for attention. Every one of you can probably name some affair that went rather well for whoever made it public, and then in the decisive moment he tasted blood and made something up. That's bad, of course. Oh well. The Economist, that's really vintage, I liked this title: "Lies, damned lies and scans". That goes back to Mark Twain: "Lies, damned lies and statistics". Now, PR-wise, we're at a point where it gets expensive. The Economist has influence. ABC News - even more expensive. There are the colleagues with their phones. BBC, CNBC. Suddenly, it was everywhere. My PowerPoint is lagging, here it is again. Business Week, that is a popular business magazine.
Let me remind you: until now, no reaction from Xerox. Yes, three days in the press, worldwide. No reaction! And when you take that long, the tone gets really rough. I quote: "On the scale of things, that are too terrible to imagine, document altering scanners are somewhere up there with meat eating bacteria." (laughter) They are actually writing this in Business Week! (laughs) So I was called by a friend of mine: listen, you have to read this. Great! Remember that name, Peter Coy, he's an editor there, we will see him again a few more times over the course of this talk. So, my blog article is now at about 100,000 visitors per day. And still, no feedback from Xerox. In the meantime I was able to work out, with the help of many reader mails, what is actually happening. And that's what I am telling you now, so we'll make a small excursion into image compression. Here we have a test image that I made. It's a sundew, that's a plant, with a fly on it. The fly as well as the text belong to this test image, so that we have a nice variety of image content. Data transfer costs time, money and storage. Images consist, compared to text, of a great amount of data. And to send and save pictures completely uncompressed would be really expensive. And images are sent everywhere, yes? Every one of us benefits from that. I tell you, it goes all the way up to the highest circles. Just recently there was giant coverage, and even a parliamentary investigation, just because a former member of parliament transferred pictures. (laughter) (laughs) So now, this member of parliament can't wait for his pictures forever, so we have to compress the image data. (laughs again) Listen here! (laughs stupidly) (applause) Now we have two parts of my test image. One image part and one text part. And I enlarged them so much that you can see individual pixels. This is so we can see what goes wrong with different compression methods. There is lossless compression.
Here the image data stays as it is, it is just stored more efficiently somehow. Or we accept losses, that is, changes in the image data, to "squish" the data and make it even smaller. Here are the popular GIF images. Can I have a small show of hands, who thinks that GIF has lossy compression? Wow, that's a lot! Almost everyone. GIF is a lossless compression method. The downside is, it only supports 256 colours. The lower quality shown here stems not from the image being saved as a GIF, but from the colour reduction. To make it easier to see, I reduced the number of colours to 16. Here you see it nicely, uiuiui. So. The finished image is saved pixel by pixel, and then LZW compressed. LZW is an old compression algorithm, similar to ZIP. GIF is very well suited for graphics with few colours. And because pixels are still saved completely one by one, sharp edges are well represented. You can see, the text looks pretty good. It's less good for photographs, as you can see. Most widespread are JPEG images. And JPEG is lossy. The original image doesn't get saved pixel by pixel anymore, but instead gets split into 8x8 pixel blocks. And every block then gets approximated with cosine waves. How exactly this works mathematically, we can spare ourselves here. But it is good to know that this kind of compression is good for photographs, but bad for sharp edges, as you can see in the letters, yes, you can see artifacts, you can see some stains around them. But normally the image would be full of artifacts. I could hold up my notebook or something. Long story short: depending on the type of image, certain compression methods are good, and others aren't. That's why there is the JBIG2 format. This is one of the special words that I wrote down in three variants for the translators. Here you can split one image into multiple sub-images. The ones circled in red here are an example. These are sub-images. These sub-images we call "patches", English for "Flicken".
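As an aside on the JPEG remark above, here is a minimal sketch of why cosine-based compression struggles with sharp edges. It is not the real JPEG pipeline: it uses a single row of 8 pixels instead of an 8x8 block, and it simply drops high-frequency coefficients instead of quantizing them. The function names and the `keep` parameter are made up for this illustration.

```python
import math

N = 8  # JPEG uses 8x8 blocks; this sketch looks at one 8-pixel row only


def scale(k):
    """Normalisation factor of the orthonormal DCT-II."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(row):
    """Describe a row of pixel values as weights of N cosine waves."""
    return [
        scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for n, x in enumerate(row))
        for k in range(N)
    ]


def idct(coeffs):
    """Rebuild the pixel row from the cosine weights."""
    return [
        sum(scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]


def roundtrip(row, keep):
    """Keep only the `keep` lowest-frequency waves (assumed stand-in for
    JPEG's quantization step) and reconstruct the row."""
    coeffs = dct(row)
    return idct(coeffs[:keep] + [0.0] * (N - keep))


edge = [0, 0, 0, 0, 255, 255, 255, 255]  # a sharp black-to-white edge
exact = roundtrip(edge, keep=N)          # all waves kept: edge survives
smeared = roundtrip(edge, keep=3)        # few waves kept: edge is smeared

print([round(v) for v in exact])
print([round(v) for v in smeared])
```

With all eight waves the edge comes back unchanged. With only the three lowest frequencies, the hard step turns into a ramp that even leaves the 0..255 range, which is the kind of stain around letters mentioned above.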
As we see, there are parts of the image that don't belong to any patch. That's pretty cool, because the data for these won't need to be saved at all. You just say: background white. The trick here is, you can compress these separate patches with different compression methods. The text patches, for example, with GIF, I'm showing it just very roughly here. You probably can't use GIF in JBIG2. But the principle holds. And the photo patch, for example, with JPEG. For every patch its suited compression method. That's a real advancement. I probably won't have to explain to anyone here that if you know which patch contains what, you get good quality and probably a smaller file size. So, if you split the image into patches anyway, you might as well use a completely new high-tech compression method. You can split the original image much finer, and have every individual letter as its own patch. That's a lot of patches. A whole lot of patches. And you can do this with text pages and books. And it is used, I didn't just make that up now. So next we look at which patches are similar to each other. This step is called "pattern matching". I have marked four patches with arrows here. These patches are very similar. No wonder, you will say. All of them are small "e"s. They differ only by a few pixels. Through this pattern matching, you get a group of similar symbols. For this group, you only really save one of those symbols, and that one is used over and over in the compressed image, instead of its siblings. From these four marked "e"s, only one would really be saved, and it would then replace all the other ones. This way you can really save a lot of data, with minimal quality loss. Here is the final product. Still looks good, doesn't it? No artifacts visible. Takes a lot less data than without pattern matching. Did you see that? The pattern matching thinks the capital I is similar to the small L, so it replaces one with the other. This happens when the pattern matching works inaccurately.
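The pattern matching step described above can be sketched in a few lines. This is a toy model under loud assumptions, not the JBIG2 algorithm and not what any Xerox firmware does: the 5x7 glyphs, the pixel-difference measure and the `max_diff` threshold are all invented for illustration.

```python
# Hypothetical 5x7 glyphs; '#' is a black pixel, '.' is a white pixel.
SIX = (".###.", "#....", "#....", "####.", "#...#", "#...#", ".###.")
EIGHT = (".###.", "#...#", "#...#", ".###.", "#...#", "#...#", ".###.")
ONE = ("..#..", ".##..", "..#..", "..#..", "..#..", "..#..", ".###.")


def pixel_diff(a, b):
    """Count the pixels in which two equally sized patches differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(patches, max_diff):
    """Store one representative per group of similar patches.

    Returns the stored symbols and, per patch, the index of the symbol
    that stands in for it. max_diff is an assumed similarity threshold.
    """
    symbols, indices = [], []
    for patch in patches:
        for i, symbol in enumerate(symbols):
            if pixel_diff(patch, symbol) <= max_diff:
                indices.append(i)  # reuse a stored symbol
                break
        else:
            symbols.append(patch)  # nothing similar enough: store it
            indices.append(len(symbols) - 1)
    return symbols, indices


def decompress(symbols, indices):
    """Rebuild the page by pasting the stored symbols back in."""
    return [symbols[i] for i in indices]


page = [EIGHT, SIX, ONE, SIX]

strict = decompress(*compress(page, max_diff=0))  # only identical patches merge
sloppy = decompress(*compress(page, max_diff=4))  # "similar" patches merge

print(strict == page)
print(sloppy == [EIGHT, EIGHT, ONE, EIGHT])
```

In this toy the six and the eight differ in only three pixels, so a threshold of four treats them as the same symbol and both sixes come back as clean, perfectly placed eights. Nothing in the output signals that a substitution took place, which is the point made in the talk.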
Did you see this too? These are incredibly dangerous mistakes. Usual compression errors are not so bad. Then one letter is unreadable. You see it, and you know that something went wrong, "scan again please". But here you have actually wrong data that looks flawless. And it gets laid out perfectly because of the similarities. You have to actually read this to notice the mistake. And even then, you can only see the mistake when the document becomes obviously implausible, like in the construction plan. I don't know about you guys. But I don't read through all of the scans that I make, just to see if they contain any mistakes. But my friends, a politician who had to put a positive spin on this would say: "Scan a medicine dosage with a Xerox device in a retirement home, and there is a high chance that in no time you'll relieve the pension funds." (laughter) (applause) Now it is clear that this is also related to security. Until now, you could have asked, why does David give a talk about copying machines at the congress? But this is actually about a severe failure of a company, and that is a serious security issue. Is anyone here from Berlin? Maybe a show of hands? What were the construction plans for the airport scanned with? (laughter and applause) But you know what? Airports, medicine, rockets, airplanes... As big as this is, that's all trivial. It gets interesting with the question of where such scans were used in court as evidence, which can be re-examined now. Or the other way around, if one of you sues me with a Xerox scan, from now on I'll just tell you: "Ah, you know what, it's faulty!" (laughs) Now you can look for the original first, to prove me wrong. Nobody can prove anymore that a given part of the scan also comes from the part of the paper that you expect it to be from. The legal value is zero! There are hundreds of thousands of industrial copiers worldwide.
Those are business devices, every machine has many users, and even more documents were made by it and distributed wherever. And so that you get an idea: a large company called me, their mail processing works such that incoming letters get scanned immediately by machines, and from there on they only exist electronically. Have fun if those contain errors. So, we will come back to the implications later. But for now, back to the story. It's the 5th of August. We are three days after the first impact, and on the third day God created, finally, yes, a sign of life from Xerox. So they are paying attention after all! (laughs) (applause) Thank you. (laughs) The PR department of Xerox Germany calls me. The talk is very unproductive. They can't do anything without the Americans. At first, they thought it was a joke. I say, it's not. And then we said, we will stay in contact. (laughs) (laughter and applause) And so, the day after, on the 6th of August, things really got punchy for the first time. In the morning, I get a screenshot from a reader, showing a detail of the admin panel of his Xerox copier. There they talk about character substitution. Aha! For the record, now. We can all learn this here: There are three PDF compression levels. These are called "Normal", "Higher", and "High". Very marketing-appropriate. So, "Normal" is the mode that compresses the most. The reader says: on "Normal", the error occurs, in the higher levels it doesn't. My tests seem to confirm this. I'm putting it extra vaguely here, more on that later. (pauses to drink) I promised to show you my moods throughout this situation, in case something like it ever happens to you. And really: in the first moment my heart dropped. I was scared shitless of being the idiot that didn't read the manual, yes? (laughter) Because there is still no official Xerox statement, and I got a tip from the press that Xerox says exactly this in their statement. Lesson learned: What's the difference between the inside and the outside view?
Exactly this. No? Surely you think: "Hello? Why is David so agitated, it's clear that this type of document error should never happen, not even unknowingly." But from the inside... It looks different. Despite being scared, it's important: Stay calm, act rationally. Because of anxious moments like this, it's important that beforehand you never screech, and always de-escalate. Never rant beforehand. If you were always composed, you can appear confident and, in case of doubt, calmly and publicly ask: "Well, boys? Why did the support not tell me this two weeks ago, eh?" Lesson learned: Appear professional from the start, never hate. I'll repeat that again. So, now, forward defense. I presented the screenshot as a possible workaround and advised: Set compression to "Higher". Additionally I wrote that I was wondering a bit why the support couldn't tell me this over the course of a whole week. I also criticized that the setting is called "Normal". (laughs) And the possible consequences I showed you of course remain, because you can't see from a scan that it might contain errors. The goal was to give the thing a spin before Xerox fights back. What follows is a telephone conference with Rick Dastin. (murmur) I see, he is known in the audience, the worldwide vice president of Xerox. And Francis Tse, one of their chief engineers, who was responsible for the image compression. Guys, the boss does the support himself! (laughter and applause) Rick Dastin was in fact the first person working at Xerox who officially told me that the character substitution was in fact already known to Xerox. So, if you'd like to know what the support can't tell you after a week, then you say: "I want to talk to Rick Dastin!" (laughter) And here it was revealed that the theory that the pattern matching was at fault was true. Dastin also confirmed that the pattern matching is only used in "Normal" mode.
So after a bit of discussion, it was also clear that the support had fucked up, and that the name "Normal" might be badly chosen. I then suggested "Experimental". (laughter and applause) Maybe a note here: I'm really in a good mood, and this is a lot of fun, and we are all laughing, but in that moment I was a lot more nervous. Don't think it would be different for you. I'll be completely honest there. And then comes a clear "RTFM" from Xerox. First: "Normal" mode, David, is not even a factory setting! Dear customers, you're all stupid. Who would set it to such a thing! Second: That characters can get swapped is explained in the manual, in two separate places. Dear customers: double stupid! As for the factory setting: Of course that's only half the truth. For the customer, the factory setting is what the device gets delivered with. Xerox doesn't supply to big customers. Those sales go through third parties. If you order a Xerox copier, you do it through another company that isn't Xerox, and they will advise you, and there you can configure whatever you want before they ship it. And as for the manual: The notice is indeed in some manuals. But then I looked closer: On pages 107 and 328 in the text, yes? Now, we are all old enough to know how many people will read a 300-page manual before using a printer. (laughter) I also said that copiers generally shouldn't be designed in a way that allows those errors to occur at all. That can't be, no one expects that. (applause) The answer was: "Yes, it can be!" (laughter) "The market wants it this way, errors would just..." (laughter) That was indeed a statement that was made exactly like this. I'm quoting here, though of course that only referred to small file sizes. And errors would also be very rare. But I would be right, you can't prove that a document is free of errors. So, all in all the talk had a nice atmosphere. They really didn't try to squash me legally or anything. They listened very nicely, and the talk was super long too, 45 minutes or so.
And then I let myself get caught by them, like an amateur. You have to consider that I had never done anything on a scale like this. And a company like Xerox has professionals. I was already wondering why we were talking so peacefully for such a long time. Dastin is the vice president of a company operating worldwide, after all. And he probably has other stuff to do. And now it turns out that during the phone call, Xerox published a statement. Not bad at all. During that time I couldn't react, after all. And it had the beautiful title "Always listening to our customers"... at that very moment! (laughs) And they write in their statement: for error-free files, please use a compression setting of at least "Higher", and the error is described in the manual. RTFM. Lesson learned: Have someone keep watch on the opponent's side. So I wrote my own article about the contents of the phone call, the one that I just told you about. Well, and then I also wrote that I don't think they're off the hook yet. And now? This could have been over here. When a single blogger goes up against a giant company, it usually ends in one of three ways once the company shoots back: either the blogger gives in afterwards, or the public sides with the company, or the public loses interest. Every one of you can now think of three stories where it went like this. But none of this happened. You see the giant increase at the bottom. The story was on the front page of Slashdot. And the press, luckily, also had their attention on me. Here, for example, Heise writes that I offered the workaround even before Xerox did. (laughs) (laughter and applause) I'll exceed my time limit a bit. Or also, bone dry, "Spiegel". They wrote: "So, so, Xerox knew about the problem for years?" (laughs) That's really... If you sit in the PR department of a company and this happens to you, I guarantee you won't be taking any vacation for the rest of the year. 
But it gets really funny when the story arrives at internet humour. I won't withhold this from you. I don't know who of you has lived in the US before. In German, we have the vulgar saying: "Now the shit is steaming". And the Americans say "Shit hits the fan". The day after, this story is on the front page of Reddit. The circled comment offers the most eloquent version of "Shit hits the fan" that I have ever seen. (laughter) Yes, but what he says is true. I already said it earlier. When a company depends on document digitization, and if you think about it, who doesn't these days, then we have a problem. They can shut down the company if they are unlucky. For example, I was called by the management of a state archive. They created their archive with Xerox devices, and what did they do then? They threw away the originals. Yeah? (spiteful laughter) Now they stand there with an empty gaze in front of their scanner fleet, and then they can check all their documents for plausibility. But even apart from that, the internet humour is amazing. (laughter) (applause) Even the people involved provide the humour themselves. If you, as the Xerox vice president, give the same interviews all day, mistakes may happen. This one's pretty good. You don't need to read it, I'll read it out real quick. Of all places, it was in front of the BBC that Dastin tried to explain. He said: "You know, all this isn't half as bad, this "Normal" compression mode, it can produce errors, but almost no one uses that, only the military or some oil drilling platform." (laughter and applause) Yeah, what could go wrong? (laughs) So, now we have... (laughter) (laughs) Now we have all noticed that errors on oil drilling platforms in the USA were a bit neglected lately. Now we have all laughed. And I did say - I want to keep my word - laughing is OK, but malice is inappropriate; malice is a form of hating too. And try to imagine yourself in Dastin's shoes. If you were interviewed about the same thing for 14 hours, you'd make a mistake too. 
And of course, that mistake will be talked about. Dastin said to me afterwards that they misquoted him, and I don't have any reason not to believe him. Just to protect him a bit here: he probably didn't have a good day. So, let's continue. This tech portal is glad that cat pics don't seem to be affected. (laughter) Notice the way it's written, as if they want to make sure, yes, as if they don't really know; maybe cat pics are affected after all. (murmur) And here's a new press statement by Xerox. The public pressure was so big that Xerox said: "Ah well, you know what, maybe we should rather do a patch where we remove pattern matching". Legally, however, they never acknowledged the mistake. Not even until now. Since it was in the manual. That's how it is, by the way. If it's in the manual, it's OK. For microwaves, it's written that you can't dry your cat in them. Here is another newspaper article. And when you have waited so long, even a patch won't save you from mockery. Now the newspapers start including misprints in their headlines on purpose. (laughter) Let's go back to Xerox's statement, because they write a clear, important declaration. You will not see letter replacement if you set your compression to at least "Higher", at a minimum of 200 dpi. Xerox published documents in which it is clearly stated that pattern matching is only used in "Normal" compression mode, and not in the two higher ones. But this whole time I've been thinking: I'm sure I also saw it in the higher modes. Various readers told me so as well. But I just can't reproduce it on my two local devices. But one thing is for sure: if letters get replaced in higher modes as well, then absolutely everyone would be affected. And Xerox would have miscommunicated. Then we would have a much bigger problem worldwide. So I don't just publish my worry as a rumour. Decency also dictates that. So, but now one of my friends at a company in Bonn, my former place of residence, looked at his Xerox WorkCentre 7545. I'll look up the numbers later! 
(laughs) And because it was my former place of residence, we went there, took my test numbers and scanned them in "Higher" mode, which is the factory setting, and we even chose 300 dpi as the resolution; for text, you'll agree with me, that's quite generous. Bam - the yellow numbers are wrong. (laughter) That's not all of them, by the way. I just marked a few here that I saw. I won't go through 500,000 numbers and mark all the wrong ones. But you see how common the errors are. I repeat: in compression mode "Higher" at 300 dpi. Now we take the blue rectangle and enlarge it. Here are groups of numbers marked in red - oh, you only see it in light pink now, but you see it - that are identical down to the pixel. Such a thing is very unlikely. If you scan the same number multiple times, it will almost always look slightly different. So, pixel-identical numbers in large quantities mean that numbers get reused; that's a clear sign of pattern matching. So, contrary to Xerox's statement, pattern matching is used here as well. One reader even sent me an interactive visualization that makes identical numbers visible. Yes, let's see if it... - Yes! - there it is. And now I can hover over it here with my mouse pointer, and everything turns red where a number was reused. I won't make it too long, I'm already a bit over time. It's because you always applaud so nicely. Which I enjoy. (laughs) (applause) But here you can see how many numbers can really be wrong. From here on it's clear: hundreds of thousands of devices on factory settings are affected, and the fun is really over. With this you can really hit a company hard. And I didn't want to publish this without seeking a conversation first. And I wanted to make sure that I hadn't made a mistake. I didn't want to risk being sued for millions over the stock price here. So I recorded the whole process of the wrong number generation on video, and put it on YouTube as an unlisted video. 
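The pixel-identity argument from this part of the talk can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reader's visualization tool and not Xerox's code: the page is a toy 1-bit image written as strings, and the glyph bounding boxes in `BOXES` are hypothetical and given by hand, whereas a real scan would first need to be binarized and segmented into glyphs. The idea is simply that two independently scanned glyphs almost never match pixel for pixel, so groups of exactly equal crops hint that one stored pattern was reused.

```python
from collections import defaultdict


def crop(page, box):
    """Return the pixels inside box=(top, left, height, width) as a tuple of row strings."""
    top, left, height, width = box
    return tuple(row[left:left + width] for row in page[top:top + height])


def identical_groups(page, boxes):
    """Group glyph boxes whose pixels are exactly equal; keep only groups of two or more."""
    groups = defaultdict(list)
    for box in boxes:
        groups[crop(page, box)].append(box)
    return [members for members in groups.values() if len(members) > 1]


# Toy 1-bit "scan": '#' is ink, '.' is paper. Three glyphs of 3x3 pixels each.
PAGE = [
    "###.###.###",
    "#.#.#.#.#..",
    "###.###.###",
]
# Hypothetical glyph positions as (top, left, height, width).
BOXES = [(0, 0, 3, 3), (0, 4, 3, 3), (0, 8, 3, 3)]

# The first two glyphs are pixel-identical, the third differs by one pixel.
print(identical_groups(PAGE, BOXES))
```

On a real page, many large groups of pixel-identical glyphs would be the suspicious signal; a single coincidental match would not prove much.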
I sent the link to Francis Tse, one of the chief engineers that I mentioned earlier. And of course they were thunderstruck. From here on the thing really affects everyone. Francis confirmed over the phone that I had indeed done everything right. And Xerox was cooperative, but they also wanted me to wait until they had reproduced the error. But I also remembered that during our last phone call, I had felt a bit screwed over. So I said: folks, it won't be like last time. "I have the blog article done, and the video is already uploaded." (laughter) (laughs) And when you... (applause) "Don't take offense, but I request to be included from now on, because I also treat you fairly." So we agreed on that, and now you see what you gain by not hating in advance. If you had shat on them beforehand on Twitter, it's clear they would say "Come on, screw you!" After that, there were about six hours of calls back and forth. We had calls over and over. They tried to reproduce the error with my help. For me it was evening; I spent the night on the phone in the office and didn't eat anything but the cookies that were lying around. At some point Francis calls again and says, completely dumbfounded: "Yep, we reproduced it." Errors on factory settings. Then there was silence on both sides. We were all just shocked. And you know what was found in parallel? The code for the compression is eight years old. That's how long the bug was out in the wild. Eight years. Yes, they were a bit dumbfounded. And I said: "Here's my blog article, please read it and confirm what legal safety I have for publishing this." (laughter and applause) (gasps of laughter) No, so... this error is extremely dangerous. I didn't want to wait any longer. Here's the article, and that's what they did. And I was even allowed to publish the article before them. That's pretty unique. And you will agree with me on "don't hate": if that's what you achieve with it, then that's good. A conversation between adults. 
Lesson learned: Negotiate at the right moment. This is the next Xerox press statement. I'll increase my speed a bit. Xerox, of course, commented right after this as well. They retract their earlier communication, thank me, and say that first of all they'll now see how big the thing really is. And from there on they were always nice in their statements, and overall the climate was very constructive. This is the next Slashdot article. It's getting surreal, just look at the titles! After all the back and forth, what counts for Slashdot is no longer what Xerox says, but what they confirm to me. (laughter) And here again is our snappy Peter Coy from Business Week. But now... One more, I do have one more. One more compression mode, I mean! (laughter) It doesn't really matter now. But on August 11th, the proof succeeds that the error also occurs in "Highest" mode. Even a quality-conscious user who wanted to produce beautiful PDFs in the last eight years couldn't avoid it. And to be honest, as far as I know the error doesn't occur with TIFFs. I don't want to make it look worse than it is. No one uses TIFFs, of course; they're gigantic. On August 12th Xerox admits publicly that it's a matter of an eight-year-old system error. And announces the patch again. But of course they are deep in the whole thing, legally. And when it's midday in the USA, it's night-time here. And so in the middle of the night, when attendees of this talk are usually awake, Dastin and Tse called me on my phone and wanted to tell me first, which I have to say I found incredibly nice of them, that they had found the bug and would roll out new software. And there you can see that the relationship really got better. This is the patch download page by Xerox. Here you can see how many devices are affected. Note the "X"es; those are whole device families! (laughter) So, the press is reporting again. The computer magazine c't writes an article and calls the whole thing "Scannergate". 
And here is one last kick from our beloved Peter Coy. He sounds so sarcastic, but unfortunately he's completely right. Eight years of scanned, archived documents could contain these errors and cause harm forever. Hundreds of thousands of devices and companies worldwide. We live in a society where now, as we are speaking, the transition from a world of paper into a mix of paper and digital is happening. And the translator between the two worlds is devices like Xerox WorkCentres. It'll be with us for a long time. Now the most important thing: I already said that Xerox has a decentralized supply chain through third parties. Personally, I have no reason to believe that the patch reached a lot of devices. So: spread the word! At the end of this talk there will be URLs where you can get more info and see more. It's almost the end... Besides all the "Lessons learned", there's one lesson that I haven't mentioned yet. I always got disbelieving looks because I didn't take any money for the thing. One manager even said I'm "pretty dumb". About that, two things. First, it's generally hard to make money with something like this, even if you want to. Without proof you won't be taken seriously. And with the proof, they'll mostly just find the bugfix directly, and then you won't get any money either. And second: companies have no friends. If I had taken money, it would somehow have been made public and could have been used against me. And it would have put me in a hard position to negotiate from. But I wanted this error to be fixed. And last but not least, the community helped me, and they didn't get money either. I'd do it like this again, but... (cheering) ...at the end of the day, everyone has to decide that for themselves. If you would do it differently, then that's OK. I just want to say in advance that you put yourself in a weaker negotiating position. Here are all the "Lessons learned" again. I won't reiterate them now. 
They're here so you can download the presentation and still have them. And now we close the circle back to the start, and with that we are done. At the start, there was the prologue with Obama's birth certificate. Here it is, the "long form birth certificate". Shortly after the Xerox saga, journalists from "Reality Check" in the USA wrote to ask me whether the Xerox bug could have been the reason for the "forgery". And they did a whole lot of detective work. For example, the Obamas published their tax documents shortly before the birth certificate. Those were scanned on a Xerox WorkCentre 7655. Well, and further technical attributes pointed to a Xerox scanner. And the "Reality Check" guys asked me if I could ask Xerox about it, since I had such good contacts. And Xerox... (laughter) And Xerox asked for understanding that they really didn't want to deal with this now... (laughs) ...and I left it alone. And now I'm preparing my congress talk, this talk today, yeah, and I look into the PDFs again, and there they are, the exactly copied letters, which back then were the sign of pattern matching with Xerox. And I look at the websites, and there they also write something about letter doubling. Here are two exactly identical boxes. Notice the indents on them. Now, form your own picture here. But I think it could be that this conspiracy is hereby over and done with. And with this, it only remains for me to say thanks for spending a whole hour with me! (applause) If everyone keeps clapping, it'll take even longer! So... (laughs) Up there you'll find another link for the Xerox saga. Pass it on! And down here a link to my page. There I'll publish the presentation online. Maybe tomorrow. I won't go into the WiFi here! (laughs) (laughter) And beware of evil copiers! Herald: Okay, thanks first of all for this amazing talk! I think it was very interesting for everyone. Everyone on the way out, please hurry and close the doors behind you. And be quiet. 
For the questions, I'd like to start with the ones from the internet. From our Signal Angel. Signal Angel: Thanks! And a great round of applause from the internet; you couldn't hear it just now. But there was a lot of positive feedback. And also the plea to publish the presentation; especially the symbol images were well received. David: It will happen, on my page, tomorrow at the latest. Definitely. Signal Angel: Very good, thanks. Two questions from me. The first question is: does Xerox have a technical difference between scanning, printing and copying? Or is it always the same thing? David: So, with scanning, paper goes in, and with printing, it comes out, right? (laughter) No, so, for printing, you just receive the print data. I don't know of anything being compressed again afterwards. Scanning - here there are different modes. The PDF modes, there are three, that I mentioned earlier. And copying - in my view it doesn't happen there, just as it doesn't during printing, because you don't compress there. You see what I mean, yes? I'm sure I would have received some reports if it were like that. And that's why I don't think the process of copying itself is affected. But that wouldn't be so bad anyway, because no documents get archived that way. Signal Angel: Okay, and the second question: Is there any definite harm that happened because of this bug? Did you ever receive any feedback regarding this? David: I have feedback, the cases that I named earlier. And of course a few more. I'm of course not going to name any names. But... So, I can only say this much: you have to imagine yourself in the place of the company that's affected here. Your files might be good for the trash. Will you make this public? No, you will request compensation from Xerox in silence, and not write any of this on your website, because then it will fall back on you that your data is faulty. No one will ask you whether that was a Xerox copier. 
So I don't expect there to be a grand reveal now, if it can be avoided. If some random bridge on a highway collapses now, that would of course be a different matter. Signal Angel: Okay, thanks again! David: You're welcome! Herald: Good, then I'd suggest we continue at microphone 2, with the first person. Question: Just a short question. This is probably a technique that gets used by many. Did you ever try this with devices from other companies? David: I had a great number of reports about other companies. But if you take on a thing of this scale, you'll become a victim of spin doctoring. And all of it turned out to be false. Here, again: stay composed, don't just pump out rumours. Here none of it was true, and in the concrete cases it wasn't the compression method itself, but the fact that there was indeed another bug. Herald: Good, then 3 please! Question: Hello? Thanks for the talk, it was pretty cool. I just wonder about the bug having been there for eight years. Did you look on search engines, did others... I mean, I can't imagine that for eight years no one saw it, because as you say, on a blueprint you can see it pretty quickly, so... or maybe other people messaged you because they had seen it before, or maybe they said, hey, I noticed this before, Xerox said, yes, higher compression, then they were lucky and it worked. David: So, first of all, it was hard to discover. Second of all, it was known for the "Normal" mode. It was on purpose; they even knew about it. And that's why it was hard to recognize the real bug, because Xerox... The support staff who knew - mine didn't know - always blamed it on the "Normal" setting. And then it's plausible, then they tell you: "Yes, you used the "Normal" setting, take another one, then the error will occur less, you'll probably be lucky there." So I think that the bug was indeed discovered for the first time... Question: So, no one contacted you with "Hey, I've seen this before" or so? David: No, no one. 
In the whole storm, no. Herald: Okay, next up again from the 2 please. Question: Hi, thanks for the presentation from me as well. It was very cool. David: Sure. Question: Short question: you said you didn't do it for money... David: Correct. Question: ...and somehow... I find it very noble, very cool. But did they ever offer you something from their side? David: No, they didn't. No one there... Question: Not even a job or anything? David: Well, there I can actually give Xerox some credit. They didn't offer me anything. I couldn't have accepted it anyway, by that logic. That's why it was totally fine. In that long night when we had the phone calls, they were ready to fly me in. But I honestly don't know anything about copiers. Not my main job. I can show them the bug, but I can't repair it. So... Question: OK, but if they would have flown you in, why not work together with them and try to solve the thing? David: Yeah, I could have done that. But I couldn't have contributed anything. Because they have to find the bug in their code themselves. It was clear that something had happened. I can't help with that. I'd just sit around. So I also said it just like that. Question: That makes sense. David: Yes, and flying intercontinental twice for that... I don't know. Question: Yes, but if they paid, I would have done it. David: I admit, I did think it over again. But I also had stuff to do job-wise, and it wouldn't have worked out. Herald: Good, next up 3 again. Question: Well, I have a copier at home, and I have a very intimate relationship with it. Are there any reports that some people tried it with their home copiers and then went "Oh sh...?" David: I don't know of any reports like that. It only affected the things that I just showed. WorkCentre, ColorQube. All big things, basically. Question: Okay. David: This JBIG2 in hardware, I think that's also very expensive to implement. Question: Okay, thanks! David: Sure! Herald: And 3 again please! 
Question: Maybe a cool crowd research task would be to look through those manuals and collect them. Who had access, in which year does it show up in the documentation at all, is it really that old, so eight years, or maybe only four years? Maybe they only noticed four years ago and thought, hm, it's cheaper, we print new manuals and leave the software as it is. Because it's more expensive to roll out new firmware. David: There's a theory that a bug was declared a feature here. I can confirm that the theory exists, but I don't have proof for it. I want to say that very clearly. But seriously, who would design a scanner that swaps numbers around? Only if it was just for the military. (laughs) Herald: Okay, I think one last question. Then 2 again. Question: Not really a question, but more of a suggestion for the presentation, in case you present it again. It's really great. You have this scale with the visits to your website at the bottom. I wondered during the talk if maybe you could also do that with the stock price of Xerox? (David laughs) David: It wasn't that bad. I mean, their PR department handled it pretty well despite the worldwide attention they had. I mean, that's really an error where you could think it is a danger for the whole company. It's their bread-and-butter business. But it didn't turn out that way. We will see; I could have put such a live stock price curve in the presentation. I don't know what's happening on the internet right now. But good suggestion, thanks! Herald: Okay, we also have questions from the internet. Therefore I'd also like to... Signal Angel: I just have one more question from the internet. Are there any statistics or numbers about how likely such an error is? David: Well, you saw the page I told you about. That was the case with font size 7 or 8. I don't remember anymore where I got it reproduced really nicely. But when... Signal Angel: But... numbers, that's not a normal page now, is it? 
David: It was all numbers, but of course it's also possible with similar letters. It can happen there too. I don't have any statistics. Among the numbers, the 6 and the 8 are affected the most. But real error percentages, I don't have. But you can see what's possible. So I have... I didn't try for hours on end until I found the page with many yellow points. I scanned ONE page, and then it was like that. Yeah? So it's not like you have to look for it forever. Question: Yes, thanks! Herald: Alright, I think we are done then. Then please another big round of applause for the speaker! (applause) David: Thanks! (longer applause) 31C3 Credits with no audio Subtitles created on amara.org in the year 2017 - 2022 by multiple collaborators