WEBVTT 00:00:00.000 --> 00:00:12.710 rC3 Opening Music 00:00:12.710 --> 00:00:19.340 Herald: So about our next speaker. He's a security researcher focused on embedded 00:00:19.340 --> 00:00:26.500 systems, secure communications and mobile security. He was nominated by 00:00:26.500 --> 00:00:39.940 Forbes for the 30 under 30 in technology and also has won an OWASP Appsec CTF. 00:00:39.940 --> 00:00:46.790 He has also found and responsibly disclosed multiple vulnerabilities. And especially 00:00:46.790 --> 00:00:52.280 for you Nintendo aficionados I want you to watch out for the next intro, which is 00:00:52.280 --> 00:00:56.270 really amazing and you will all love. Thank you very much. 00:00:56.305 --> 00:01:00.825 shows nintendo cartridge 00:01:00.825 --> 00:01:06.065 plugs cartridge 00:01:09.532 --> 00:01:10.532 nintendo start sound plays 00:01:10.532 --> 00:01:14.850 Thomas: Oh, damn it. retrieves cartridge 00:01:14.850 --> 00:01:19.980 blows into cartridge plugs cartridge again 00:01:22.146 --> 00:01:23.446 nintendo start sound plays 00:01:26.243 --> 00:01:29.423 music plays 00:02:52.810 --> 00:02:56.099 Thomas Roth: Uff, what a trip. Welcome to my talk on 00:02:56.099 --> 00:03:00.810 hacking the new Nintendo Game & Watch Super Mario Brothers. My name is Thomas 00:03:00.810 --> 00:03:05.290 Roth and I'm a security researcher and trainer from Germany. And you can find me 00:03:05.290 --> 00:03:10.719 on Twitter at @ghidraninja and also on YouTube at stacksmashing. Now, this year 00:03:10.719 --> 00:03:16.439 marks the 35th anniversary of our favorite plumber, Super Mario. And to celebrate 00:03:16.439 --> 00:03:20.699 that, Nintendo launched a new game console called the Nintendo Game & Watch Super 00:03:20.699 --> 00:03:26.669 Mario Brothers. The console is lightweight and looks pretty nice, and it comes 00:03:26.669 --> 00:03:31.859 preinstalled with three games and also this nice animated clock.
The three games 00:03:31.859 --> 00:03:36.920 are Super Mario Brothers, the original NES game, Super Mario Brothers 2 The Lost 00:03:36.920 --> 00:03:44.830 Levels and also a reinterpretation of an old Game & Watch game called Ball. Now, as 00:03:44.830 --> 00:03:49.939 you probably know, this is not the first retro console that Nintendo released. In 00:03:49.939 --> 00:03:57.400 2016, they released the NES Classic and in 2017 they released the SNES Classic. Now, 00:03:57.400 --> 00:04:01.729 these devices were super popular in the homebrew community, because they make it 00:04:01.729 --> 00:04:05.779 really easy to add additional ROMs to them. They make it really easy to modify the 00:04:05.779 --> 00:04:10.540 firmware and so on. And you can basically just plug them into your computer, install 00:04:10.540 --> 00:04:14.959 a simple software and you can do whatever you want with them. The reason for that is 00:04:14.959 --> 00:04:21.140 that they run Linux and have a pretty powerful ARM processor on the inside. And 00:04:21.140 --> 00:04:27.360 so it's really a nice device to play with and so on. And so when Nintendo announced 00:04:27.360 --> 00:04:31.650 this new console, a lot of people were hoping for a similar experience of having 00:04:31.650 --> 00:04:38.810 a nice mobile homebrew device. Now, if you were to make a Venn diagram of some of 00:04:38.810 --> 00:04:43.099 my biggest interests, you would have reverse engineering, hardware hacking and 00:04:43.099 --> 00:04:48.539 retro computing. And this new Game & Watch fits right in the middle of that. And so 00:04:48.539 --> 00:04:52.920 when it was announced on the 3rd of September, I knew that I needed to have 00:04:52.920 --> 00:04:58.930 one of those. And given how hard the NES and SNES classic were to buy for a while, 00:04:58.930 --> 00:05:03.389 I preordered it on like four or five different sites, a couple of which got 00:05:03.389 --> 00:05:09.470 canceled.
But I was pretty excited, because I had three preorders and they were supposed to 00:05:09.470 --> 00:05:15.380 ship on the 13th of November. And so I was really looking forward to this. And I was 00:05:15.380 --> 00:05:19.909 having breakfast on the 12th of November, when suddenly the doorbell rang and DHL 00:05:19.909 --> 00:05:25.730 delivered me the new Game & Watch one day before the official release. Now, at that 00:05:25.730 --> 00:05:30.300 point in time, there was no technical information available about the device 00:05:30.300 --> 00:05:35.449 whatsoever. Like, if you searched for Game & Watch on Twitter, you would only find 00:05:35.449 --> 00:05:40.680 announcements or maybe a picture of the box of someone who also received it early. 00:05:40.680 --> 00:05:44.900 But there were no teardowns, no pictures of the insides and most importantly, 00:05:44.900 --> 00:05:50.319 nobody had hacked it yet. And this gave me, as a hardware hacker, the kind of 00:05:50.319 --> 00:05:55.669 unique opportunity to potentially be the first one to hack a new Nintendo console. 00:05:55.669 --> 00:06:00.199 And so I just literally dropped everything else I was doing and started investigating 00:06:00.199 --> 00:06:05.949 the device. Now, I should say that normally I stay pretty far away from any 00:06:05.949 --> 00:06:11.460 new console hacking. Mainly, because of the piracy issues. I don't want to enable 00:06:11.460 --> 00:06:18.930 piracy. I don't want to deal with piracy. And I don't want to build tools that 00:06:18.930 --> 00:06:23.930 enable other people to pirate stuff, basically. But given that on this device, 00:06:23.930 --> 00:06:28.900 you cannot buy any more games and that all the games that are on there were basically 00:06:28.900 --> 00:06:33.840 already released over 30 years ago,
I was not really worried about piracy and felt 00:06:33.840 --> 00:06:39.449 pretty comfortable in sharing all the results of the investigation and also 00:06:39.449 --> 00:06:44.490 the... basically the issues we found that allowed us to customize the device and so 00:06:44.490 --> 00:06:49.389 on. And in this talk, I want to walk you through how we managed to hack the device 00:06:49.389 --> 00:06:54.931 and how you can do it at home using relatively cheap hardware. And, yeah, hope 00:06:54.931 --> 00:07:03.040 you enjoy it. Now, let's start by looking at the device itself. The device is 00:07:03.040 --> 00:07:08.469 pretty lightweight and comes with a nicely sized case. And so it really... for me, it 00:07:08.469 --> 00:07:14.889 sits really well in my hand. And it has a nice 320 by 240 LCD display, a d-pad, A 00:07:14.889 --> 00:07:19.529 and B buttons and also three buttons to switch between the different game modes. 00:07:19.529 --> 00:07:23.940 On the right side we also have the power button and the USB-C port. Now, before you 00:07:23.940 --> 00:07:28.640 get excited about the USB port, I can already tell you that unfortunately, 00:07:28.640 --> 00:07:33.030 Nintendo decided to not connect the data lines of the USB port. And so you can 00:07:33.030 --> 00:07:38.550 really only use it for charging. Also, because we are talking about Nintendo 00:07:38.550 --> 00:07:43.979 here, they use their proprietary tri-point screws on the device. And so to open it 00:07:43.979 --> 00:07:48.730 up, you need one of those special tri-point bits. Luckily, nowadays, most bit 00:07:48.730 --> 00:07:54.120 sets should have them, but it still would suck if you order your unit and then you 00:07:54.120 --> 00:07:59.639 can't open it up, because you're missing a screwdriver. After opening it up, the 00:07:59.639 --> 00:08:03.779 first thing you probably notice is the battery.
And if you've ever opened up a 00:08:03.779 --> 00:08:07.599 Nintendo Switch Joy-Con before, you might recognize the battery, because it's the 00:08:07.599 --> 00:08:12.729 exact same one that's used in the Joy-Cons. This is very cool, because if down the 00:08:12.729 --> 00:08:16.529 line, like, let's say in two or three years, the battery of your Game & Watch 00:08:16.529 --> 00:08:20.650 dies, you can just go and buy a Joy-Con battery, which you can have really 00:08:20.650 --> 00:08:26.520 cheaply, almost anywhere. Next to the battery, on the right side, we have a 00:08:26.520 --> 00:08:32.349 small speaker which is not very good. And underneath we have the main PCB with the 00:08:32.349 --> 00:08:37.510 processor, all the storage and so on and so forth. Let's take a look at those. Now, 00:08:37.510 --> 00:08:44.779 the main processor of the device is an STM32H7B0. This is a Cortex-M7 from 00:08:44.779 --> 00:08:53.201 STMicroelectronics with 1.3 MB of RAM and 128 kB of flash. It runs at 280 MHz and is 00:08:53.201 --> 00:08:59.460 a pretty beefy microcontroller. But it's much less powerful than the processor in 00:08:59.460 --> 00:09:03.860 the NES or SNES classic. Like this processor is really just a microcontroller 00:09:03.860 --> 00:09:09.260 and so it can't run Linux. It can't run, let's say, super complex software. Instead 00:09:09.260 --> 00:09:14.170 it'll be programmed in some bare metal way. And so we will have a bare metal 00:09:14.170 --> 00:09:20.580 firmware on the device. To the right of it, you can also find a 1 MB SPI flash. 00:09:20.580 --> 00:09:26.180 And so overall, we have roughly 1.1 MB of storage on the device. Now, most 00:09:26.180 --> 00:09:31.279 microcontrollers or basically all microcontrollers have a debugging port. 00:09:31.279 --> 00:09:36.370 And if we take a look at the PCB, you can see that there are five unpopulated 00:09:36.370 --> 00:09:40.980 contacts here.
And if you see a couple of contacts that are not populated close to 00:09:40.980 --> 00:09:47.510 your CPU, it's very likely that it's the debugging port. And luckily, the datasheet 00:09:47.510 --> 00:09:54.449 for the STM32 is openly available. And so we can check the pinouts in the datasheet 00:09:54.449 --> 00:09:59.050 and then use a multimeter to see whether these pins are actually the 00:09:59.050 --> 00:10:04.500 debugging interface. And turns out they actually are. And so we can find the SWD 00:10:04.500 --> 00:10:11.630 debugging interface as well as Vcc and ground exposed on these pins. Now this 00:10:11.630 --> 00:10:16.779 means that we can use a debugger. So, for example, a J-Link or ST-Link or whatever 00:10:16.779 --> 00:10:21.980 to connect to the device. And because the contacts are really easy to access, 00:10:21.980 --> 00:10:25.870 you don't even have to solder. You can just hook up a couple of test pins and 00:10:25.870 --> 00:10:32.600 they will allow you to easily hook up your debugger. Now, the problem is, on most 00:10:32.600 --> 00:10:36.900 devices, the debugging interface will be locked during manufacturing. This is done 00:10:36.900 --> 00:10:42.550 to prevent people like us from basically doing whatever with the device and to prevent us 00:10:42.550 --> 00:10:47.450 from being able to dump the firmware, potentially reflash it and so on. And so I 00:10:47.450 --> 00:10:52.190 was very curious to see whether we can actually connect to the debugging port. 00:10:52.190 --> 00:10:56.090 And when starting up J-Link and trying to connect, we can see it can actually 00:10:56.090 --> 00:11:01.230 successfully connect. But, when you take a closer look, there's also a message that 00:11:01.230 --> 00:11:09.269 the device is active read protected. This is because the chip, the STM32 chip, 00:11:09.269 --> 00:11:15.650 features something called RDP protection level or readout protection level.
This is 00:11:15.650 --> 00:11:20.300 basically the security setting for the debugging interface and it has three 00:11:20.300 --> 00:11:26.769 levels. Level zero means no protection is active. Level one means that the flash 00:11:26.769 --> 00:11:31.839 memory is protected and so we can't dump the internal flash of the device. However, 00:11:31.839 --> 00:11:36.939 we can dump the RAM contents and we can also execute code from RAM. And then 00:11:36.939 --> 00:11:42.240 there's also level two, which means that all debugging features are disabled. Now, 00:11:42.240 --> 00:11:46.630 just because a chip is in level two, doesn't mean that you have to give up. 00:11:46.630 --> 00:11:51.589 For example, in our talk wallet.fail a couple of years ago, we showed how to use fault 00:11:51.589 --> 00:11:56.000 injection to bypass the level two protection and downgrade a chip to level 00:11:56.000 --> 00:12:00.820 one. However, on the Game & Watch, we are lucky and the interface is not fully 00:12:00.820 --> 00:12:07.139 disabled. Instead, it's in level one. And so we can still dump the RAM, which is a 00:12:07.139 --> 00:12:11.300 pretty good entry point, even though we can't dump the firmware yet. Now, having 00:12:11.300 --> 00:12:17.010 dumped the RAM of the device, I was pretty curious to see, what's inside of it. And 00:12:17.010 --> 00:12:21.660 one of my suspicions was, that potentially the emulator, that's hopefully running on 00:12:21.660 --> 00:12:29.000 the device, loads the original Super Mario Brothers ROM into RAM. And so, I was 00:12:29.000 --> 00:12:34.830 wondering whether maybe we can find the ROM that the device uses in the RAM-dump. 00:12:34.830 --> 00:12:39.750 And so I opened up the RAM-dump in a hex editor and I also opened up the original 00:12:39.750 --> 00:12:44.450 Super Mario Brothers ROM in a second window in a hex editor and tried to find 00:12:44.450 --> 00:12:49.411 different parts of the original ROM in the RAM-dump. 
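The manual hex-editor comparison described here can also be scripted. The following is a minimal Python sketch, not the tooling used in the talk: it takes fixed-size chunks from a known ROM image and searches for each one in a RAM dump. The function names, the chunk size and the step are illustrative assumptions; an actual ROM file may also carry a header that is not copied into RAM, which is why chunks from several offsets are tried.

```python
def find_rom_in_ram(ram: bytes, rom: bytes, chunk_size: int = 64, step: int = 4096):
    """Return (rom_offset, ram_offset) pairs for ROM chunks found in the RAM dump.

    chunk_size and step are arbitrary illustrative values, not taken from the talk.
    """
    hits = []
    for rom_off in range(0, len(rom) - chunk_size + 1, step):
        chunk = rom[rom_off:rom_off + chunk_size]
        ram_off = ram.find(chunk)  # first occurrence, or -1 if absent
        if ram_off != -1:
            hits.append((rom_off, ram_off))
    return hits


def guess_load_offset(hits):
    """If every hit agrees on (ram_offset - rom_offset), return it, else None."""
    bases = {ram_off - rom_off for rom_off, ram_off in hits}
    return bases.pop() if len(bases) == 1 else None
```

If all chunks are found at a consistent distance from their ROM offsets, that distance is a good guess for where in the dump the ROM was copied.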
And it turns out that, yes, the 00:12:49.411 --> 00:12:55.380 NES ROM is loaded into RAM and it's always at the same address. And so it's probably 00:12:55.380 --> 00:13:00.289 like during boot up, it gets copied into RAM or something along those lines. And so 00:13:00.289 --> 00:13:05.420 this is pretty cool to know, because it tells us a couple of things. First off, we 00:13:05.420 --> 00:13:09.790 know now that the debug port is enabled and working, but that it's unfortunately 00:13:09.790 --> 00:13:16.319 at RDP level one and so we can only dump the RAM. And we also know that the NES ROM 00:13:16.319 --> 00:13:21.259 is loaded into RAM. And this means that the device runs a real NES emulator. And 00:13:21.259 --> 00:13:25.680 so if we get lucky, we can, for example, just replace the ROM that is used by 00:13:25.680 --> 00:13:29.840 the device and play, for example, our own NES game. 00:13:30.600 --> 00:13:33.460 little pause 00:13:33.930 --> 00:13:37.010 Next, it was time to dump the flash chip 00:13:37.010 --> 00:13:41.160 of the device. For this, I'm using a device called Mini Pro and I'm using one 00:13:41.160 --> 00:13:46.959 of these really useful SOIC8 clips. And so these ones you can simply clip onto the 00:13:46.959 --> 00:13:52.240 flash chip and then dump it. Now, one warning though, the flash chip on the device, 00:13:52.240 --> 00:13:56.220 is running at 1.8 volts. And so you want to make sure that your programmer also 00:13:56.220 --> 00:14:01.839 supports 1.8 volt operation. If you accidentally try to read it out at 3.3 volts, 00:14:01.839 --> 00:14:06.770 you will break your flash. Trust me, because it happened to me on one of my 00:14:06.770 --> 00:14:12.940 units. Now, with this flash dump from the device, we can start to analyze it. And 00:14:12.940 --> 00:14:17.319 what I always like to do first, is take a look at the entropy or the randomness of 00:14:17.319 --> 00:14:23.350 the flash dump. 
And so using binwalk with the -E option, we get a nice entropy 00:14:23.350 --> 00:14:27.410 graph. And in this case, you can see we have a very high entropy over almost the 00:14:27.410 --> 00:14:32.899 whole flash contents. And this mostly indicates that the flash contents are 00:14:32.899 --> 00:14:37.240 encrypted. It could also mean compression, but if it's compressed, you would often 00:14:37.240 --> 00:14:43.529 see more like dips in the entropy. And in this case, it's one very high entropy 00:14:43.529 --> 00:14:48.830 stream. We also noticed that there are no repetitions whatsoever, which also tells 00:14:48.830 --> 00:14:53.350 us that it's probably not like a simple XOR based encryption or so and instead 00:14:53.350 --> 00:14:58.340 something like AES or something similar. But, just because the flash is encrypted 00:14:58.340 --> 00:15:02.199 doesn't mean we have to give up. On the contrary, I think now it starts to get 00:15:02.199 --> 00:15:06.829 interesting, because you actually have a challenge and it's not just plug and play, 00:15:06.829 --> 00:15:13.020 so to say. One of the biggest questions I had is, is the flash actually verified? 00:15:13.020 --> 00:15:18.160 Like does the device boot, even though the flash has been modified? Because, if it 00:15:18.160 --> 00:15:24.789 does, this would open up a lot of attack vectors, basically, as you will see. And 00:15:24.789 --> 00:15:30.720 so to verify this, I basically tried to put zeros in random places in the flash 00:15:30.720 --> 00:15:35.760 image. And so, I put some at address zero, some at 0x2000 and so on. And then I 00:15:35.760 --> 00:15:39.910 checked whether the device would still boot up. And with most flash 00:15:39.910 --> 00:15:44.370 modifications, it would still boot just fine. This tells us that even though the 00:15:44.370 --> 00:15:48.599 flash contents are encrypted, they are not validated, they are not checksummed or 00:15:48.599 --> 00:15:54.610 anything.
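The entropy check mentioned here can be approximated without binwalk. The sketch below is an illustrative stand-in, not binwalk's implementation: it computes the Shannon entropy in bits per byte over fixed-size blocks. The function names and the 1024-byte block size are assumptions. Values close to 8 across the whole image suggest encrypted or compressed data, while plain code and data usually sit noticeably lower.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    counts = Counter(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_profile(data: bytes, block_size: int = 1024):
    """Entropy of each consecutive block; block_size is an arbitrary choice."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Plotting the list returned by `entropy_profile` over a flash dump gives roughly the kind of graph described in the talk.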
And so we can potentially trick the device into accepting a modified flash 00:15:54.610 --> 00:15:58.529 image. And this is really important to know, as you will see in a couple of 00:15:58.529 --> 00:16:05.310 minutes. My next suspicion was that maybe the NES ROM we see in RAM is actually 00:16:05.310 --> 00:16:12.839 loaded from the external flash. And so to find out whether that's the case, I again 00:16:12.839 --> 00:16:18.939 took the flash and I inserted zeros at multiple positions in the flash image. 00:16:18.939 --> 00:16:24.550 Flashed that over, booted up the game, dumped the RAM and then compared the NES 00:16:24.550 --> 00:16:29.620 ROM that I'm now dumping from RAM with the one that I dumped initially and checked 00:16:29.620 --> 00:16:35.399 whether they are equal. Because my suspicion was that maybe I can overwrite a 00:16:35.399 --> 00:16:41.519 couple of bytes in the encrypted flash and then I will modify the NES ROM. And after 00:16:41.519 --> 00:16:46.760 doing this for, like, I don't know, half an hour, I got lucky and I modified 4 00:16:46.760 --> 00:16:51.399 bytes in the flash image and 4 bytes in the RAM...sorry...in the ROM that was loaded 00:16:51.399 --> 00:16:56.790 into RAM changed. And this tells us quite a bit. It means that the ROM is loaded 00:16:56.790 --> 00:17:04.450 from flash into RAM and that the flash contents are not validated. And what's 00:17:04.450 --> 00:17:10.280 also important is that we changed 4 bytes in the flash and now 4 bytes in 00:17:10.280 --> 00:17:15.510 the decrypted image changed. And this is very important to know, because if we take 00:17:15.510 --> 00:17:19.740 a look at what we would expect to happen when we change the flash contents, there 00:17:19.740 --> 00:17:23.880 are multiple outcomes. And so, for example, here we have the SPI-flash 00:17:23.880 --> 00:17:29.310 contents on the left and the RAM contents on the right.
And so the RAM contents are 00:17:29.310 --> 00:17:35.410 basically the decrypted version of the SPI-flash contents. Now let's say we 00:17:35.410 --> 00:17:41.750 change 4 bytes in the encrypted flash image to zeros. How would we expect the 00:17:41.750 --> 00:17:47.580 RAM contents to change? For example, if we would see that now 16 bytes in the RAM are 00:17:47.580 --> 00:17:52.960 changing, this means that we are potentially looking at an encryption 00:17:52.960 --> 00:17:57.650 algorithm such as AES in electronic codebook mode. Because it's a block based 00:17:57.650 --> 00:18:03.180 encryption, if we change four bytes in the input data, a whole block, in this 00:18:03.180 --> 00:18:09.730 case 16 bytes, in the output data would change. The next possibility is that we 00:18:09.730 --> 00:18:16.160 change 4 bytes in the SPI-flash and all data afterwards will be changed. And in 00:18:16.160 --> 00:18:21.830 this case, we would look at some kind of chaining cipher such as AES in the CBC 00:18:21.830 --> 00:18:27.600 mode. However, if we change 4 bytes in the SPI-flash and only 4 bytes in the 00:18:27.600 --> 00:18:33.510 RAM changed, we are looking at something such as AES in counter mode. And 00:18:33.510 --> 00:18:40.270 to understand this, let's take a better look at how AES in CTR works. AES-CTR 00:18:40.270 --> 00:18:45.930 works by having your cleartext and xoring it with an AES encryption stream that is 00:18:45.930 --> 00:18:53.211 generated from a key, a Nonce and the counter algorithm. Now, the AES stream 00:18:53.211 --> 00:18:57.370 that will be used to xor your cleartext will always be the same, if key 00:18:57.370 --> 00:19:02.840 and Nonce are the same. This is why it's super important that if you use AES-CTR, 00:19:02.840 --> 00:19:08.780 you always select a unique Nonce for each encryption. If you encrypt similar data 00:19:08.780 --> 00:19:15.060 with the same Nonce twice, large parts of the resulting ciphertext will be the same.
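The comparison of two RAM dumps, taken before and after patching the encrypted flash, can be sketched in Python as follows. This is a rough illustration of the reasoning in the talk, not a cipher-mode detector: it only measures how far the change spreads in the decrypted data. The function names, the returned labels and the 16-byte block size are assumptions made for the example.

```python
def changed_span(before: bytes, after: bytes):
    """Return (first, last) index of differing bytes, or None if identical."""
    diffs = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    if not diffs:
        return None
    return diffs[0], diffs[-1]


def classify_change(before: bytes, after: bytes, patch_len: int, block_size: int = 16) -> str:
    """Classify how a patch of patch_len ciphertext bytes spread in the plaintext.

    "stream":  change is no wider than the patch (consistent with CTR-like modes)
    "block":   change is confined to the blocks the patch could touch
    "chained": change spreads beyond those blocks
    """
    span = changed_span(before, after)
    if span is None:
        return "no change"
    extent = span[1] - span[0] + 1
    if extent <= patch_len:
        return "stream"
    # patch_len contiguous bytes can overlap at most this many blocks
    max_blocks = (patch_len + block_size - 2) // block_size + 1
    if extent <= max_blocks * block_size:
        return "block"
    return "chained"
```

On the Game & Watch, the observation from the talk corresponds to the "stream" case: a 4-byte patch changed exactly 4 bytes.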
00:19:15.060 --> 00:19:19.960 And so the cleartext gets xored with the AES-CTR stream and then we get our 00:19:19.960 --> 00:19:26.570 ciphertext. Now, if we know the cleartext, as we do, because the cleartext is the ROM 00:19:26.570 --> 00:19:32.270 that is loaded into RAM, and we know the ciphertext, which we do, because it's the 00:19:32.270 --> 00:19:38.010 contents of the encrypted flash we just dumped, we can basically reverse the 00:19:38.010 --> 00:19:44.580 operation and as a result, we get the AES-CTR stream that was used to encrypt the 00:19:44.580 --> 00:19:52.050 flash. And now this means that we can take, for example, a custom ROM, xor it 00:19:52.050 --> 00:19:57.830 with the AES-CTR stream we just calculated and then generate our own 00:19:57.830 --> 00:20:02.010 encrypted flash image, for example, with a modified ROM. And so I wrote a couple of 00:20:02.010 --> 00:20:08.340 Python scripts to try this. And after a while, I was running Hacked Super Mario 00:20:08.340 --> 00:20:14.290 Brothers instead of Super Mario Brothers. So, wohoo, we hacked the Nintendo Game & 00:20:14.290 --> 00:20:18.870 Watch one day before the official release. And we can install modified Super Mario 00:20:18.870 --> 00:20:23.990 Brothers ROMs. Now, you can find the scripts that I used for this on my Github. 00:20:23.990 --> 00:20:28.260 So it's in a repository called "Game & Watch Hacking". And I was super excited, 00:20:28.260 --> 00:20:33.570 because it meant that I succeeded and that I basically hacked a Nintendo console one 00:20:33.570 --> 00:20:37.961 day before the official release. Unfortunately, I finished the level, but 00:20:37.961 --> 00:20:43.350 Toad wasn't as excited. He told me that unfortunately, our firmware is still in 00:20:43.350 --> 00:20:50.050 another castle. And so on the Monday after the launch of the device, I teamed up with 00:20:50.050 --> 00:20:54.790 Konrad Beckmann, a hardware hacker from Sweden who I met at the previous Congress.
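The keystream trick described here needs nothing more than XOR. The following Python sketch shows the idea under stated assumptions; it is not the script from the repository mentioned in the talk, and the function names are made up for illustration. It assumes the known plaintext and the ciphertext bytes passed in are aligned, meaning they cover the same region of the image.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keystream(ciphertext: bytes, known_plaintext: bytes) -> bytes:
    """keystream = ciphertext XOR plaintext, valid for CTR-style stream encryption."""
    return xor_bytes(ciphertext, known_plaintext)


def encrypt_with_keystream(keystream: bytes, new_plaintext: bytes) -> bytes:
    """Produce ciphertext the device will decrypt to new_plaintext."""
    if len(new_plaintext) > len(keystream):
        raise ValueError("new plaintext is longer than the recovered keystream")
    return xor_bytes(keystream, new_plaintext)
```

The recovered keystream only covers the region for which the plaintext is known, so a replacement ROM cannot be larger than the original one with this approach. No key is ever recovered; the device simply reuses the same stream on every boot.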
00:20:54.790 --> 00:20:59.850 And we started chatting and throwing ideas back and forth and so on. And eventually 00:20:59.850 --> 00:21:05.620 we noticed that the device has a special RAM area called ITCM-RAM, which is a 00:21:05.620 --> 00:21:10.570 tightly coupled instruction RAM that is normally used for very high performance 00:21:10.570 --> 00:21:15.121 routines such as interrupt handlers and so on. And so it's in a very fast RAM area. 00:21:15.121 --> 00:21:22.160 And we realized that we never actually looked at the contents of that ITCM-RAM. 00:21:22.160 --> 00:21:26.540 And so we dumped it from the device using the debugging port. And it turns out that 00:21:26.540 --> 00:21:33.020 this ITCM-RAM contains ARM code. And so, again, the question is, where does this 00:21:33.020 --> 00:21:37.570 ARM code come from? Does it maybe, just like the NES ROM, come from the external 00:21:37.570 --> 00:21:45.741 flash? And so basically, I repeated the whole thing that we also did with the NES 00:21:45.741 --> 00:21:52.260 ROM and just put zeros at the very beginning of the encrypted flash. Rebooted 00:21:52.260 --> 00:21:57.720 the device and dumped the ITCM-RAM and I got super lucky: on the first try already 00:21:57.720 --> 00:22:03.990 the ITCM contents changed. And because the ITCM contains code, not just data, so 00:22:03.990 --> 00:22:09.300 earlier we only had the NES-ROM, which is just data, but this time the RAM contains 00:22:09.300 --> 00:22:14.850 code. This means that with the same XOR trick we used before, we could inject 00:22:14.850 --> 00:22:21.530 custom ITCM code into the external flash, which would then be loaded into RAM when 00:22:21.530 --> 00:22:27.620 the device boots. And because it's a persistent method, we can then reboot the 00:22:27.620 --> 00:22:32.520 device and let it run without the debugger connected. And so whatever code we load 00:22:32.520 --> 00:22:38.490 into this ITCM area will be able to actually read the flash.
And so we could 00:22:38.490 --> 00:22:43.280 potentially write some code that gets somehow called by the firmware and then 00:22:43.280 --> 00:22:49.540 copies the internal flash into RAM from where we then can retrieve it using the 00:22:49.540 --> 00:22:57.560 debugger. Now, the problem is, let's say we have a custom payload somehow in this 00:22:57.560 --> 00:23:04.750 ITCM area. We don't know which address of this ITCM code gets executed. And so we 00:23:04.750 --> 00:23:09.410 don't know whether the firmware will jump to address zero or address 200 or whatever. 00:23:09.410 --> 00:23:14.270 But there's a really simple trick to still build a successful payload. And it's 00:23:14.270 --> 00:23:19.230 called a NOP slide. A NOP, or no operation, is an instruction that simply 00:23:19.230 --> 00:23:25.100 does nothing. And if we fill most of the ITCM-RAM with NOPs and put our payload at 00:23:25.100 --> 00:23:31.700 the very end, we build something that is basically a NOP-slide. And so when the 00:23:31.700 --> 00:23:37.260 CPU, indicated by Mario here, jumps to a random address in that whole NOP-slide, it 00:23:37.260 --> 00:23:43.500 will start executing NOPs and slide down into our payload and execute it. And so 00:23:43.500 --> 00:23:49.100 even if Mario jumps right in the middle of the NOP-slide, he will always slide down 00:23:49.100 --> 00:23:54.920 the slide and end up in our payload. And Konrad wrote this really, really simple 00:23:54.920 --> 00:23:58.330 payload, which is only like 10 instructions, which basically just copies 00:23:58.330 --> 00:24:03.980 the internal flash into RAM from where we can then retrieve it using the debugger. 00:24:03.980 --> 00:24:08.280 So wohoo, super simple exploit. We have a full firmware backup and a full flash 00:24:08.280 --> 00:24:13.590 backup and now we can really fiddle with everything on the device. And we've 00:24:13.590 --> 00:24:17.700 actually released tools to do this yourself.
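Building the NOP-slide image itself is mostly byte padding. The sketch below is a hedged illustration in Python and not the payload or tooling from the talk: it fills a region with the 16-bit Thumb `NOP` encoding (0xBF00, stored little-endian) and places a payload at the very end. The function name and the region size passed in are assumptions, and the payload bytes in the usage note are a placeholder, not the real flash-copy routine.

```python
THUMB_NOP = bytes.fromhex("00bf")  # Thumb NOP (0xBF00) in little-endian byte order


def build_nop_slide(region_size: int, payload: bytes) -> bytes:
    """Fill region_size bytes with Thumb NOPs and put payload at the end.

    region_size is whatever amount of ITCM the firmware loads from flash;
    the caller has to determine it, it is not hard-coded here.
    """
    if region_size % 2 or len(payload) % 2:
        raise ValueError("Thumb code must be halfword aligned")
    if len(payload) > region_size:
        raise ValueError("payload does not fit into the region")
    nop_count = (region_size - len(payload)) // 2
    return THUMB_NOP * nop_count + payload
```

For example, `build_nop_slide(0x10000, bytes.fromhex("fee7"))` yields a 64 KiB image that ends in a Thumb branch-to-self as a stand-in payload. The resulting plaintext would then still have to be XORed with the recovered keystream before it is written to the external flash.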
And so if you want to back up 00:24:17.700 --> 00:24:23.161 your Nintendo Game & Watch, you can just go onto my GitHub and download the 00:24:23.161 --> 00:24:27.670 game-and-watch-backup repository, which contains a lot of information on how to 00:24:27.670 --> 00:24:33.270 back it up. It does checksumming and so on to ensure that you don't 00:24:33.270 --> 00:24:38.420 accidentally brick your device and you can easily back up the original firmware, 00:24:38.420 --> 00:24:43.610 install homebrew, and then always go back to the original software. We also have an 00:24:43.610 --> 00:24:50.630 awesome support community on Discord. And so if you ever need help, I think you will 00:24:50.630 --> 00:24:55.270 find success there. And so far we haven't had a single bricked Game & Watch and so 00:24:55.270 --> 00:25:02.200 it looks to be pretty stable. And so I was pretty excited because the quest was 00:25:02.200 --> 00:25:11.170 over. Or is it? If you ever claim on the internet that you successfully hacked an 00:25:11.170 --> 00:25:18.180 embedded device, there will be exactly one response and one response only: but does 00:25:18.180 --> 00:25:23.610 it run Doom? Literally my Twitter DMs, my YouTube comments, and even my friends were 00:25:23.610 --> 00:25:28.720 spamming me with the challenge to get Doom running on the device. But to get Doom 00:25:28.720 --> 00:25:34.390 running, we first needed to bring up all the hardware. And so we basically needed 00:25:34.390 --> 00:25:40.070 to create a way to develop and load homebrew onto the device. Now, luckily for 00:25:40.070 --> 00:25:44.880 us, most of the components on the board are very well documented and so there are 00:25:44.880 --> 00:25:50.040 no NDA components. And so, for example, the processor has an open reference manual 00:25:50.040 --> 00:25:56.890 and open source library to use it. The flash is a well-known flash chip. And so 00:25:56.890 --> 00:26:00.440 on and so forth.
And there are only a couple of very proprietary or custom 00:26:00.440 --> 00:26:06.280 components. And so, for example, the LCD on the device is proprietary and we had to 00:26:06.280 --> 00:26:12.690 basically sniff the SPI-bus that goes to the display to basically decode the 00:26:12.690 --> 00:26:19.160 initialization of the display and so on. And after a while, we had the full 00:26:19.160 --> 00:26:24.540 hardware running, we had LCD support, we had audio support, deep sleep support, buttons 00:26:24.540 --> 00:26:29.210 and even flashing tools that allow you to simply use an SWD debugger to dump and 00:26:29.210 --> 00:26:33.820 rewrite the external flash. And you can find all of these things on our GitHub. 00:26:33.820 --> 00:26:38.520 Now, if you want to mod your own Game & Watch, all you need is a simple debugging 00:26:38.520 --> 00:26:46.840 adapter such as a cheap, three dollar ST-Link, a J-Link or a real ST-Link device, 00:26:46.840 --> 00:26:51.140 and then you can get started. We've also published a base project for anyone who 00:26:51.140 --> 00:26:54.911 wants to get started with building their own games for the Game & Watch. And so 00:26:54.911 --> 00:26:58.670 it's really simple. It's just a frame buffer you can draw to, input is really 00:26:58.670 --> 00:27:04.470 simple and so on. And as said, we have a really helpful community. Now with all the 00:27:04.470 --> 00:27:10.000 hardware up and running, I could finally start porting Doom. I started by looking 00:27:10.000 --> 00:27:15.420 around for other ports of Doom to an STM32. And I found this project by floppes 00:27:15.420 --> 00:27:22.010 called stm32doom. Now the issue is, stm32doom is designed for a board with 00:27:22.010 --> 00:27:28.340 eight megabytes of RAM and also the data files for Doom were stored on an external USB 00:27:28.340 --> 00:27:37.630 drive.
On our platform, we only have 1.3 MB of RAM, 128 kB of flash and only 1 MB 00:27:37.630 --> 00:27:42.600 of external flash and we have to fit all the level information, all the code and 00:27:42.600 --> 00:27:50.880 so on in there. Now, the Doom level information is stored in so-called WAD, 00:27:50.880 --> 00:27:57.240 or "Where's All my Data", files. And these data files contain the sprites, the textures, 00:27:57.240 --> 00:28:03.230 the levels and so on. Now the WAD for Doom 1 is roughly four megabytes in size and 00:28:03.230 --> 00:28:11.440 the WAD for Doom 2 is 40 MB in size. But we only have 1.1 MB of storage. Plus we 00:28:11.440 --> 00:28:16.390 have to fit all the code in there. So obviously we needed to find a very, very 00:28:16.390 --> 00:28:22.200 small Doom port. And as it turns out, there's a file called Mini-WAD, which is a 00:28:22.200 --> 00:28:27.680 minimal Doom IWAD, in which basically all the bells and whistles are stripped 00:28:27.680 --> 00:28:34.240 from the WAD file and everything is replaced by simple outlines and so on. And while 00:28:34.240 --> 00:28:38.130 it's not pretty, I was pretty confident that I could get it working as it's only 00:28:38.130 --> 00:28:46.320 250 kB of storage, down from 40 megabytes. Now, in addition to that, a lot of stuff 00:28:46.320 --> 00:28:51.300 on the Chocolate Doom port itself had to be changed. And so, for example, I had to 00:28:51.300 --> 00:28:56.150 rip out all the file handling and add a custom file handler. I had to add support 00:28:56.150 --> 00:29:01.230 for the Game & Watch LCD, button input support. And I also had to get rid of a 00:29:01.230 --> 00:29:05.350 lot of things to get it running somewhat smoothly. And so, for example, the 00:29:05.350 --> 00:29:10.630 infamous Wipe effect had to go and I also had to remove sound support.
Now, the next 00:29:10.630 --> 00:29:16.270 issue was that once it was compiling, it simply would not fit into RAM and crashed 00:29:16.270 --> 00:29:22.820 all the time. Now on the device, we have roughly 1.3 MB of RAM in different RAM 00:29:22.820 --> 00:29:27.510 areas. And for example just the frame buffer, that we obviously need, takes up 00:29:27.510 --> 00:29:36.350 154 kB of that. Then we have 160 kB of initialized data, 320 kB of uninitialized 00:29:36.350 --> 00:29:42.000 data and a ton of dynamic allocations that are done by Chocolate Doom. And these 00:29:42.000 --> 00:29:46.610 dynamic allocations were a huge issue because the Chocolate Doom source code 00:29:46.610 --> 00:29:52.480 does a lot of small allocations, which are only used for temporary data. And so they 00:29:52.480 --> 00:29:58.600 get freed again and so on, and so your dynamic memory gets very, very fragmented 00:29:58.600 --> 00:30:02.710 very quickly, and so eventually there's just not enough space to, for example, 00:30:02.710 --> 00:30:09.791 initialize the level. And so to fix this, I took the Chocolate Doom code and I 00:30:09.791 --> 00:30:15.110 changed a lot of the dynamic allocations to static allocations, which also had the 00:30:15.110 --> 00:30:22.030 big advantage of making the error messages from the compiler much more meaningful. 00:30:22.030 --> 00:30:27.340 Because it would actually tell you: Hey, this and this data does not fit into RAM. 00:30:27.340 --> 00:30:31.990 And eventually, after a lot of trial and error and copying as many of the original 00:30:31.990 --> 00:30:39.400 assets as possible into the minimal IWAD, I got it. I had Doom running on the 00:30:39.400 --> 00:30:45.030 Nintendo Game & Watch Super Mario Brothers and I hopefully calmed the internet gods 00:30:45.030 --> 00:30:49.750 that forced me to do it.
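The fragmentation problem described above can be illustrated with a toy model. The following is a sketch only: `ToyHeap` is a hypothetical first-fit allocator written for this illustration, not the allocator used by Chocolate Doom or by the Game & Watch firmware, and the sizes are arbitrary units. It shows how many small temporary allocations interleaved with long-lived ones can leave plenty of free memory in total, yet no single hole large enough for one big request.

```python
class ToyHeap:
    """Hypothetical first-fit allocator over a flat range (illustration only)."""

    def __init__(self, size):
        self.size = size
        self.blocks = {}  # start address -> block length

    def gaps(self):
        """Return (start, length) for every free hole, in address order."""
        holes, cursor = [], 0
        for start in sorted(self.blocks):
            if start > cursor:
                holes.append((cursor, start - cursor))
            cursor = start + self.blocks[start]
        if cursor < self.size:
            holes.append((cursor, self.size - cursor))
        return holes

    def malloc(self, length):
        """Place the block in the first hole that is big enough, else None."""
        for start, hole in self.gaps():
            if hole >= length:
                self.blocks[start] = length
                return start
        return None

    def free(self, start):
        del self.blocks[start]

    def free_total(self):
        return sum(length for _, length in self.gaps())

    def largest_gap(self):
        return max((length for _, length in self.gaps()), default=0)


# Interleave short-lived and long-lived allocations, then free the short ones.
heap = ToyHeap(100)
temporaries = []
for _ in range(10):
    temporaries.append(heap.malloc(5))  # temporary scratch data
    heap.malloc(5)                      # data that stays alive
for address in temporaries:
    heap.free(address)

# 50 units are free in total, but only in holes of 5 units each.
level_on_fragmented_heap = heap.malloc(40)

# Reserving the big block up front, loosely analogous to a static
# allocation, avoids the problem.
fresh = ToyHeap(100)
level_reserved_first = fresh.malloc(40)

print(heap.free_total(), heap.largest_gap(), level_on_fragmented_heap)  # 50 5 None
print(level_reserved_first)  # 0
```

A static allocation has the same effect as reserving the block first: its space is fixed at build time, so it cannot be squeezed out by fragmentation, and a buffer that does not fit is reported when the program is built rather than as a crash at run time.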
Now, unfortunately, the USB port is physically 00:30:49.750 --> 00:30:55.690 not connected to the processor and so it will not be possible to hack the device 00:30:55.690 --> 00:31:00.390 simply by plugging it into your computer. However, it's relatively simple to do this 00:31:00.390 --> 00:31:06.790 using one of these USB debuggers. Now, the most requested type of homebrew software 00:31:06.790 --> 00:31:12.870 was obviously emulators. And I'm proud to say that by now we actually have kind of a 00:31:12.870 --> 00:31:19.210 large collection of emulators running on the Nintendo Game & Watch. And it all 00:31:19.210 --> 00:31:23.370 started with Konrad Beckmann discovering the Retro Go project, which is an emulator 00:31:23.370 --> 00:31:29.970 collection for a device called the Odroid Go, and the Odroid Go is a small handheld 00:31:29.970 --> 00:31:35.880 with similar input and size constraints as the Nintendo Game & Watch. And so it's 00:31:35.880 --> 00:31:40.630 kind of cool to port this over because it basically already did all of the hard 00:31:40.630 --> 00:31:47.670 work, so to say. And Retro Go comes with emulators for the NES, for the Game Boy and 00:31:47.670 --> 00:31:52.770 the Game Boy Color and even for the Sega Master System and the Sega Game Gear. And 00:31:52.770 --> 00:31:58.290 after a couple of days, Konrad actually was able to show off his NES emulator 00:31:58.290 --> 00:32:02.960 running Zelda and other games such as Contra and so on, on the Nintendo Game & 00:32:02.960 --> 00:32:09.230 Watch. This is super fun and initially we only had a really basic emulator that 00:32:09.230 --> 00:32:13.170 could barely play and we had a lot of frame drops, we didn't have nice scaling, 00:32:13.170 --> 00:32:18.290 VSync and so on. But now after a couple of weeks, it's really a nice device to use 00:32:18.290 --> 00:32:24.090 and to play with.
And so we also have a Game Boy emulator running and so you can 00:32:24.090 --> 00:32:29.440 play your favorite Game Boy games such as Pokémon, Super Mario Land and so on, on the 00:32:29.440 --> 00:32:35.160 Nintendo Game & Watch if you own the corresponding ROM backups. And we also 00:32:35.160 --> 00:32:38.650 experimented with different scaling algorithms to make the most out of the 00:32:38.650 --> 00:32:43.310 screen. And so you can basically change the scaling algorithm that is used for the 00:32:43.310 --> 00:32:48.160 display, depending on what you prefer. And you can even change the palette for the 00:32:48.160 --> 00:32:54.450 different games. We also have a nice game chooser menu which allows you to basically 00:32:54.450 --> 00:32:59.240 have multiple ROMs on the device that you can switch between. We have save state 00:32:59.240 --> 00:33:04.210 support and so if you turn off the device, it will save wherever you left off and you 00:33:04.210 --> 00:33:08.870 can even come back to your save game once the battery runs out. You can find the 00:33:08.870 --> 00:33:14.380 source code for all of that on the Retro Go repository from Konrad. And it's 00:33:14.380 --> 00:33:20.710 really, really awesome. Other people built, for example, emulators for the CHIP-8 00:33:20.710 --> 00:33:25.430 system and so the CHIP-8 emulator comes with a nice collection of small arcade 00:33:25.430 --> 00:33:31.271 games and so on, and it's really fun and really easy to develop for it. And so 00:33:31.271 --> 00:33:37.010 really give this a try if you own a Game & Watch and want to try homebrew on it. Tim 00:33:37.010 --> 00:33:41.590 Schuerwegen is even working on an emulator for the original Game & Watch 00:33:41.590 --> 00:33:45.920 games. And so this is really cool because it basically turns the Nintendo Game & 00:33:45.920 --> 00:33:53.130 Watch into an emulator for all Game & Watch games that were ever released.
And 00:33:53.130 --> 00:33:57.860 what was really amazing to me is how the community came together. And so we were 00:33:57.860 --> 00:34:02.140 pretty open about the progress on Twitter. And also Konrad was Twitch streaming a lot 00:34:02.140 --> 00:34:06.480 of the process. And we opened up a Discord where people could join who were 00:34:06.480 --> 00:34:11.850 interested in hacking on the device. And it was amazing to see what came out of the 00:34:11.850 --> 00:34:16.720 community. And so, for example, we now have a working storage upgrade that works 00:34:16.720 --> 00:34:21.179 both with homebrew but also with the original firmware. And so instead of one 00:34:21.179 --> 00:34:25.320 megabyte of storage, you can have 16 megabytes of flash and you just need to 00:34:25.320 --> 00:34:30.549 replace a single chip, which is pretty easy to do. Then, for understanding the 00:34:30.549 --> 00:34:35.690 full hardware, Daniel Cuthbert and Daniel Padilla provided us with high-resolution X-ray 00:34:35.690 --> 00:34:41.010 images, which allowed us to fully understand every single connection, even 00:34:41.010 --> 00:34:46.379 of the BGA parts, without desoldering anything. Then Jake Little of Upcycle 00:34:46.379 --> 00:34:52.980 Electronics traced, on the X-rays and also using a multimeter, every last trace on the 00:34:52.980 --> 00:34:58.220 PCB, and he even created a schematic of the device, which gives you all the 00:34:58.220 --> 00:35:02.260 details you need when you want to program something. And it was really, really 00:35:02.260 --> 00:35:07.099 fun. Sander van der Wel for example even created a custom backplate and now there 00:35:07.099 --> 00:35:13.220 are even projects that try to replace the original PCB with a custom PCB with an 00:35:13.220 --> 00:35:20.019 FPGA and an ESP32. And so it's really exciting to see what people come up with.
00:35:20.019 --> 00:35:24.819 Now, I hope you enjoyed this talk and I hope to see you on our Discord if you want 00:35:24.819 --> 00:35:35.019 to join the fun. And thank you for coming. 00:35:35.019 --> 00:35:41.329 Herald: Hi. Wow, that was a really amazing talk. Thank you very much Thomas. As 00:35:41.329 --> 00:35:48.140 announced in the beginning we do accept questions from you and we have quite a 00:35:48.140 --> 00:35:54.450 few. Let's see if we manage to make it through all of them. The first one is: 00:35:54.450 --> 00:35:59.650 Q: Did you read the articles about Nintendo observing hackers, like private 00:35:59.650 --> 00:36:04.799 investigators, et cetera and are you somehow worried about this? 00:36:04.799 --> 00:36:08.400 Thomas: Oh, what's going on with my camera? Looks like Luigi messed around 00:36:08.400 --> 00:36:17.539 with my video setup here. Yeah, so I've read those articles, but I believe that 00:36:17.539 --> 00:36:22.210 in this case, there is no piracy issue, right? Like, I'm not allowing anyone to 00:36:22.210 --> 00:36:26.940 play any new games. If you wanted to dump a Super Mario ROM, you would have 00:36:26.940 --> 00:36:32.160 done it 30 years ago or on the NES Classic or on the Switch or on any of the hundred 00:36:32.160 --> 00:36:37.240 consoles Nintendo launched in between. And so I'm really not too worried about it, to 00:36:37.240 --> 00:36:41.480 be honest. Herald: I also think the aspect of the 00:36:41.480 --> 00:36:50.270 target audience is to be seen here. So off to the next question which is: Do you 00:36:50.270 --> 00:36:55.460 think that there is a reason why an external flash chip has been used? 00:36:55.460 --> 00:37:02.849 Thomas: Yeah. So the internal flash of the STM32H7B0 is relatively small. It's only 00:37:02.849 --> 00:37:08.450 128 kB. And so they simply couldn't fit everything in, like basically even 00:37:08.450 --> 00:37:13.240 just the frame buffer.
Even just a frame buffer picture is also larger than the 00:37:13.240 --> 00:37:19.100 internal flash. And so I think that's why they did it and I'm glad they did. 00:37:19.100 --> 00:37:26.730 Herald: Sure. And is the decryption done in software or is it a feature of the 00:37:26.730 --> 00:37:30.460 microcontroller? Thomas: So the microcontroller has an 00:37:30.460 --> 00:37:36.160 integrated feature called OTFDEC and basically the flash is directly mapped 00:37:36.160 --> 00:37:41.109 into memory and they have this chip peripheral called OTFDEC that automatically 00:37:41.109 --> 00:37:45.430 provides the decryption and so on. And so it's done all in hardware and you can even 00:37:45.430 --> 00:37:48.350 retrieve the keys from hardware, basically. 00:37:48.350 --> 00:37:57.910 Herald: OK, very nice. And also, the next question is somehow related to that: Is in 00:37:57.910 --> 00:38:03.520 your opinion the encryption Nintendo has applied even worth the effort for them? 00:38:03.520 --> 00:38:07.430 It feels like it's just there to give shareholders a false sense of security. 00:38:07.430 --> 00:38:12.709 What would you think about that? Thomas: I think from my perspective, they 00:38:12.709 --> 00:38:16.489 chose just the right encryption because it was a ton of fun to reverse engineer 00:38:16.489 --> 00:38:21.910 and try to bypass it and so it was an awesome challenge and so I think they did 00:38:21.910 --> 00:38:26.900 everything right. But I also think in the end, it's such a simple device and it's 00:38:26.900 --> 00:38:31.569 like if you take a look at what people are building on top of it with like games and 00:38:31.569 --> 00:38:36.680 all that kind of stuff. I think they did everything right, but probably it was just 00:38:36.680 --> 00:38:41.569 a tick mark of: "Yeah, we totally locked down JTAG." And yeah, but I think it's fun 00:38:41.569 --> 00:38:44.609 because again, it doesn't open up any piracy issues.
00:38:44.609 --> 00:38:51.140 Herald: Sure. The next one is related to the NOP slide, which you very, very well 00:38:51.140 --> 00:39:01.189 animated. So wouldn't the starts of subroutines be suitable as well for that, 00:39:01.189 --> 00:39:11.460 for that goal? The person asking says that those big "push R4, R5, etc." instructions are 00:39:11.460 --> 00:39:20.640 quite recognizable. How would ... Yeah Thomas: Yeah. So absolutely. The time from 00:39:20.640 --> 00:39:25.019 finding the data in the ITCM RAM and actually exploiting it was less than an 00:39:25.019 --> 00:39:29.950 hour. And so if we had tried to reverse engineer it, it would have been more 00:39:29.950 --> 00:39:33.660 work. Like absolutely possible and also not difficult, but just filling the RAM 00:39:33.660 --> 00:39:38.559 with NOPs took a couple of minutes and so it was really the easiest way and the fastest 00:39:38.559 --> 00:39:45.420 way without fiddling around in Ghidra or so. Herald: OK, cool, thanks. And this is more 00:39:45.420 --> 00:39:54.329 a remark than a question. The person says it's strange that ST's AN5281 does not 00:39:54.329 --> 00:39:59.630 mention a single time that the data is not verified during encryption. I think it's 00:39:59.630 --> 00:40:05.759 more a fault on ST's than Nintendo's side. What would you think about that? 00:40:05.759 --> 00:40:10.690 Thomas: Yeah, I would somewhat agree because in this case, even if you don't 00:40:10.690 --> 00:40:17.670 have JTAG, like an ARM Thumb instruction is 2-4 bytes and so you have a relatively small 00:40:17.670 --> 00:40:21.859 space to brute force to potentially get an interesting branch instruction and so on. 00:40:21.859 --> 00:40:28.009 So I think it's yeah, I mean, it's not perfect, but also doing verification 00:40:28.009 --> 00:40:33.410 is very expensive, computationally, and so I think it should just be the firmware 00:40:33.410 --> 00:40:37.160 that actually verifies the contents of the external flash.
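The idea of filling a RAM region with NOPs, mentioned in the answer above, can be sketched as follows. This is not the tooling used in the talk: the region size and the placeholder payload are hypothetical, and `build_nop_sled` is a name invented for this illustration. The two encodings used are the standard 16-bit Thumb `nop` (0xBF00) and a Thumb branch-to-self (0xE7FE), both stored little-endian.

```python
# Sketch: build a memory image that is all Thumb NOPs with a payload at
# the very end, so that execution entering anywhere on a 2-byte boundary
# slides through the NOPs into the payload.
THUMB_NOP = (0xBF00).to_bytes(2, "little")       # 16-bit Thumb `nop`
BRANCH_TO_SELF = (0xE7FE).to_bytes(2, "little")  # Thumb `b .`, placeholder payload


def build_nop_sled(region_size, payload):
    """Return region_size bytes: NOP padding followed by the payload."""
    if region_size % 2 or len(payload) % 2:
        raise ValueError("Thumb code must stay 2-byte aligned")
    if len(payload) > region_size:
        raise ValueError("payload does not fit into the region")
    nop_count = (region_size - len(payload)) // 2
    return THUMB_NOP * nop_count + payload


# Hypothetical 64-byte region; a real target region would be far larger.
image = build_nop_sled(64, BRANCH_TO_SELF)
print(len(image), image[:4].hex(), image[-2:].hex())
```

The appeal of the approach, as described in the answer, is that it needs no knowledge of where exactly the original code is entered, which is why it was quicker than reverse engineering the routine first.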
00:40:37.160 --> 00:40:44.109 Herald: OK, so I think we should ask 2 questions more and then we can go back to 00:40:44.109 --> 00:40:52.000 the studio. There is a question about the AES encryption keys. Have you managed to 00:40:52.000 --> 00:40:57.349 recover them? Thomas: Yes, we did. But so it's an 00:40:57.349 --> 00:41:01.700 application of AES, and they do some crazy shifting around with the keys but I think 00:41:01.700 --> 00:41:07.400 even just today, like an hour before the talk, a guy, sorry I'm not sure it's a 00:41:07.400 --> 00:41:12.650 guy, a person on our Discord actually managed to rebuild the full encryption. 00:41:12.650 --> 00:41:16.779 But I personally was never interested in that because after you've 00:41:16.779 --> 00:41:22.080 downgraded the device to RDP 0, you can just access the memory mapped flash and 00:41:22.080 --> 00:41:24.740 get the completely decrypted flash contents basically. 00:41:24.740 --> 00:41:32.009 Herald: Sure. Thanks. And a last question about the LCD controller, whether it's 00:41:32.009 --> 00:41:38.180 used by writing pixels over SPI or if it has some extra features, maybe even 00:41:38.180 --> 00:41:40.930 background or sprites or something like that? 00:41:40.930 --> 00:41:46.809 Thomas: So the LCD itself doesn't have any special features. It has one SPI bus 00:41:46.809 --> 00:41:50.930 to configure it and then a parallel interface, so it takes up a lot 00:41:50.930 --> 00:41:56.809 of pins. But the chip itself has a hardware block called LTDC, which is an LCD 00:41:56.809 --> 00:42:00.769 controller, which provides two layers with alpha blending and some basic windowing 00:42:00.769 --> 00:42:06.630 and so on. Herald: OK, cool then thank you very, very 00:42:06.630 --> 00:42:11.799 much for the great talk and the great intro. And with that, back to our main 00:42:11.799 --> 00:42:14.859 studio in the orbit. Thank you very much. Back to orbit.
00:42:14.859 --> 00:42:17.977 rC3 postroll music 00:42:17.977 --> 00:42:56.000 Subtitles created by c3subtitles.de in the year 2020. Join, and help us!