WEBVTT
00:00:00.000 --> 00:00:12.710
rC3 Opening Music
00:00:12.710 --> 00:00:19.340
Herald: So about our next speaker. He's a
security researcher focused on embedded
00:00:19.340 --> 00:00:26.500
systems, secure communications and mobile
security. He was nominated by
00:00:26.500 --> 00:00:39.940
Forbes for the 30 under 30 in technology
and has also won an OWASP AppSec CTF.
00:00:39.940 --> 00:00:46.790
He has also found and disclosed responsibly
multiple vulnerabilities. And especially
00:00:46.790 --> 00:00:52.280
for you Nintendo aficionados I want you to
watch out for the next intro, which is
00:00:52.280 --> 00:00:56.270
really amazing and you will all love.
Thank you very much.
00:00:56.305 --> 00:01:00.825
shows nintendo cartridge
00:01:00.825 --> 00:01:06.065
plugs cartridge
00:01:09.532 --> 00:01:10.532
nintendo start sound plays
00:01:10.532 --> 00:01:14.850
Thomas: Oh, damn it.
retrieves cartridge
00:01:14.850 --> 00:01:19.980
blows into cartridge
plugs cartridge again
00:01:22.146 --> 00:01:23.446
nintendo start sound plays
00:01:26.243 --> 00:01:29.423
music plays
00:02:52.810 --> 00:02:56.099
Thomas Roth: Uff, what a trip.
Welcome to my talk on
00:02:56.099 --> 00:03:00.810
hacking the new Nintendo Game & Watch
Super Mario Brothers. My name is Thomas
00:03:00.810 --> 00:03:05.290
Roth and I'm a security researcher and
trainer from Germany. And you can find me
00:03:05.290 --> 00:03:10.719
on Twitter at @ghidraninja and also on
YouTube at stacksmashing. Now, this year
00:03:10.719 --> 00:03:16.439
marks the 35th anniversary of our favorite
plumber, Super Mario. And to celebrate
00:03:16.439 --> 00:03:20.699
that, Nintendo launched a new game console
called the Nintendo Game & Watch Super
00:03:20.699 --> 00:03:26.669
Mario Brothers. The console is lightweight
and looks pretty nice, and it comes
00:03:26.669 --> 00:03:31.859
preinstalled with three games and also
this nice animated clock. The three games
00:03:31.859 --> 00:03:36.920
are Super Mario Brothers, the original NES
game, Super Mario Brothers 2 The Lost
00:03:36.920 --> 00:03:44.830
Levels and also a reinterpretation of an
old Game & Watch game called Ball. Now, as
00:03:44.830 --> 00:03:49.939
you probably know, this is not the first
retro console that Nintendo released. In
00:03:49.939 --> 00:03:57.400
2016, they released the NES Classic and
in 2017 they released the SNES Classic. Now,
00:03:57.400 --> 00:04:01.729
these devices were super popular in the
homebrew community, because they make it
00:04:01.729 --> 00:04:05.779
really easy to add additional ROMs to it.
They make it really easy to modify the
00:04:05.779 --> 00:04:10.540
firmware and so on. And you can basically
just plug them into your computer, install
00:04:10.540 --> 00:04:14.959
a simple software and you can do whatever
you want with them. The reason for that is
00:04:14.959 --> 00:04:21.140
that they run Linux and have a pretty
powerful ARM processor on the inside. And
00:04:21.140 --> 00:04:27.360
so it's really a nice device to play with
and so on. And so when Nintendo announced
00:04:27.360 --> 00:04:31.650
this new console, a lot of people were
hoping for a similar experience of having
00:04:31.650 --> 00:04:38.810
a nice mobile homebrew device. Now, if
you were to make a Venn diagram of some of
00:04:38.810 --> 00:04:43.099
my biggest interests, you would have
reverse engineering, hardware hacking and
00:04:43.099 --> 00:04:48.539
retro computing. And this new Game & Watch
fits right in the middle of that. And so
00:04:48.539 --> 00:04:52.920
when it was announced on the 3rd of
September, I knew that I needed to have
00:04:52.920 --> 00:04:58.930
one of those. And given how hard the NES
and SNES classic were to buy for a while,
00:04:58.930 --> 00:05:03.389
I preordered it on like four or five
different sites, a couple of which got
00:05:03.389 --> 00:05:09.470
canceled. But I was pretty excited, because
I had three preorders and they were supposed to
00:05:09.470 --> 00:05:15.380
ship on the 13th of November. And so I was
really looking forward to this. And I was
00:05:15.380 --> 00:05:19.909
having breakfast on the 12th of November,
when suddenly the doorbell rang and DHL
00:05:19.909 --> 00:05:25.730
delivered me the new Game & Watch one day
before the official release. Now, at that
00:05:25.730 --> 00:05:30.300
point in time, there was no technical
information available about the device
00:05:30.300 --> 00:05:35.449
whatsoever. Like, if you searched for Game
& Watch on Twitter, you would only find
00:05:35.449 --> 00:05:40.680
announcements or maybe a picture of the
box of someone who also received it early.
00:05:40.680 --> 00:05:44.900
But there were no teardowns, no pictures
of the insides and most importantly,
00:05:44.900 --> 00:05:50.319
nobody had hacked it yet. And this gave
me, as a hardware hacker, the kind of
00:05:50.319 --> 00:05:55.669
unique opportunity to potentially be the
first one to hack a new Nintendo console.
00:05:55.669 --> 00:06:00.199
And so I just literally dropped everything
else I was doing and started investigating
00:06:00.199 --> 00:06:05.949
the device. Now, I should say that
normally I stay pretty far away from any
00:06:05.949 --> 00:06:11.460
new console hacking. Mainly, because of the
piracy issues. I don't want to enable
00:06:11.460 --> 00:06:18.930
piracy. I don't want to deal with piracy.
And I don't want to build tools that
00:06:18.930 --> 00:06:23.930
enable other people to pirate stuff,
basically. But given that on this device,
00:06:23.930 --> 00:06:28.900
you cannot buy any more games and that all
the games, that are on there, were basically
00:06:28.900 --> 00:06:33.840
already released over 30 years ago, I was
not really worried about piracy and felt
00:06:33.840 --> 00:06:39.449
pretty comfortable in sharing all the
results of the investigation and also
00:06:39.449 --> 00:06:44.490
the... basically the issues we found that
allowed us to customize the device and so
00:06:44.490 --> 00:06:49.389
on. And in this talk, I want to walk you
through, how we managed to hack the device
00:06:49.389 --> 00:06:54.931
and how you can do it at home using
relatively cheap hardware. And, yeah, hope
00:06:54.931 --> 00:07:03.040
you enjoy it. Now, let's start by looking
at the device itself. The device is
00:07:03.040 --> 00:07:08.469
pretty lightweight and comes with a nicely
sized case. And so it really... for me, it
00:07:08.469 --> 00:07:14.889
sits really well in my hand. And it has a
nice 320 by 240 LCD display, a d-pad, A
00:07:14.889 --> 00:07:19.529
and B buttons and also three buttons to
switch between the different game modes.
00:07:19.529 --> 00:07:23.940
On the right side we also have the power
button and the USB-C port. Now, before you
00:07:23.940 --> 00:07:28.640
get excited about the USB port, I can
already tell you that unfortunately,
00:07:28.640 --> 00:07:33.030
Nintendo decided to not connect the data
lines of the USB port. And so you can
00:07:33.030 --> 00:07:38.550
really only use it for charging. Also,
because we are talking about Nintendo
00:07:38.550 --> 00:07:43.979
here, they use their proprietary tri-point
screws on the device. And so to open it
00:07:43.979 --> 00:07:48.730
up, you need one of those special
tri-point bits. Luckily, nowadays, most bit
00:07:48.730 --> 00:07:54.120
sets should have them, but it still would
suck, if you order your unit and then you
00:07:54.120 --> 00:07:59.639
can't open it up, because you're missing a
screwdriver. After opening it up, the
00:07:59.639 --> 00:08:03.779
first thing you probably notice is the
battery. And if you've ever opened up a
00:08:03.779 --> 00:08:07.599
Nintendo Switch Joy-Con before, you might
recognize the battery, because it's the
00:08:07.599 --> 00:08:12.729
exact same one that's used in the Joy-Cons.
This is very cool, because if down the
00:08:12.729 --> 00:08:16.529
line, like, let's say in two or three
years, your battery of your Game & Watch
00:08:16.529 --> 00:08:20.650
dies, you can just go and buy a Joy-Con
battery, which you can have really
00:08:20.650 --> 00:08:26.520
cheaply, almost anywhere. Next to the
battery, on the right side, we have a
00:08:26.520 --> 00:08:32.349
small speaker which is not very good. And
underneath we have the main PCB with the
00:08:32.349 --> 00:08:37.510
processor, all the storage and so on and
so forth. Let's take a look at those. Now,
00:08:37.510 --> 00:08:44.779
the main processor of the device is an
STM32H7B0. This is a Cortex-M7 from
00:08:44.779 --> 00:08:53.201
STMicroelectronics with 1.3 MB of RAM and
128 kB of flash. It runs at 280 MHz and is
00:08:53.201 --> 00:08:59.460
a pretty beefy microcontroller. But it's
much less powerful than the processor in
00:08:59.460 --> 00:09:03.860
the NES or SNES classic. Like this
processor is really just a microcontroller
00:09:03.860 --> 00:09:09.260
and so it can't run Linux. It can't run,
let's say, super complex software. Instead
00:09:09.260 --> 00:09:14.170
it'll be programmed in some bare metal
way. And so we will have a bare metal
00:09:14.170 --> 00:09:20.580
firmware on the device. To the right of
it, you can also find a 1 MB SPI flash.
00:09:20.580 --> 00:09:26.180
And so overall, we have roughly 1.1 MB of
storage on the device. Now, most
00:09:26.180 --> 00:09:31.279
microcontrollers or basically all
microcontrollers have a debugging port.
00:09:31.279 --> 00:09:36.370
And if we take a look at the PCB, you can
see that there are five unpopulated
00:09:36.370 --> 00:09:40.980
contacts here. And if you see a couple of
contacts, that are not populated close to
00:09:40.980 --> 00:09:47.510
your CPU, it's very likely, that it's the
debugging port. And luckily, the datasheet
00:09:47.510 --> 00:09:54.449
for the STM32 is openly available. And so
we can check the pinouts in the datasheet
00:09:54.449 --> 00:09:59.050
and then use a multimeter to see
whether these pins are actually the
00:09:59.050 --> 00:10:04.500
debugging interface. And turns out they
actually are. And so we can find the SWD
00:10:04.500 --> 00:10:11.630
debugging interface as well as Vcc and
ground exposed on these pins. Now this
00:10:11.630 --> 00:10:16.779
means that we can use a debugger. So, for
example, a J-link or ST-link or whatever
00:10:16.779 --> 00:10:21.980
to connect to the device. And because the
contacts are really easy to access,
00:10:21.980 --> 00:10:25.870
you don't even have to solder. You can
just hook up a couple of test pins and
00:10:25.870 --> 00:10:32.600
they will allow you to easily hook up
your debugger. Now, the problem is, on most
00:10:32.600 --> 00:10:36.900
devices, the debugging interface will be
locked during manufacturing. This is done
00:10:36.900 --> 00:10:42.550
to prevent people like us from basically doing
whatever we want with the device and to prevent us
00:10:42.550 --> 00:10:47.450
from being able to dump the firmware,
potentially reflash it and so on. And so I
00:10:47.450 --> 00:10:52.190
was very curious to see, whether we can
actually connect to the debugging port.
00:10:52.190 --> 00:10:56.090
And when starting up J-link and trying to
connect, we can see it can actually
00:10:56.090 --> 00:11:01.230
successfully connect. But, when you take a
closer look, there's also a message that
00:11:01.230 --> 00:11:09.269
the device is active read protected. This
is because the chip, the STM32 chip,
00:11:09.269 --> 00:11:15.650
features something called RDP protection
level or readout protection level. This is
00:11:15.650 --> 00:11:20.300
basically the security setting for the
debugging interface and it has three
00:11:20.300 --> 00:11:26.769
levels. Level zero means no protection is
active. Level one means that the flash
00:11:26.769 --> 00:11:31.839
memory is protected and so we can't dump
the internal flash of the device. However,
00:11:31.839 --> 00:11:36.939
we can dump the RAM contents and we can
also execute code from RAM. And then
00:11:36.939 --> 00:11:42.240
there's also level two, which means that
all debugging features are disabled. Now,
00:11:42.240 --> 00:11:46.630
just because a chip is in level two,
doesn't mean that you have to give up.
00:11:46.630 --> 00:11:51.589
For example, in our talk wallet.fail a couple
of years ago, we showed how to use fault
00:11:51.589 --> 00:11:56.000
injection to bypass the level two
protection and downgrade a chip to level
00:11:56.000 --> 00:12:00.820
one. However, on the Game & Watch, we are
lucky and the interface is not fully
00:12:00.820 --> 00:12:07.139
disabled. Instead, it's in level one. And
so we can still dump the RAM, which is a
00:12:07.139 --> 00:12:11.300
pretty good entry point, even though we
can't dump the firmware yet. Now, having
00:12:11.300 --> 00:12:17.010
dumped the RAM of the device, I was pretty
curious to see, what's inside of it. And
00:12:17.010 --> 00:12:21.660
one of my suspicions was, that potentially
the emulator, that's hopefully running on
00:12:21.660 --> 00:12:29.000
the device, loads the original Super Mario
Brothers ROM into RAM. And so, I was
00:12:29.000 --> 00:12:34.830
wondering whether maybe we can find the
ROM that the device uses in the RAM-dump.
00:12:34.830 --> 00:12:39.750
And so I opened up the RAM-dump in a hex
editor and I also opened up the original
00:12:39.750 --> 00:12:44.450
Super Mario Brothers ROM in a second
window in a hex editor and tried to find
00:12:44.450 --> 00:12:49.411
different parts of the original ROM in the
RAM-dump. And it turns out that, yes, the
00:12:49.411 --> 00:12:55.380
NES ROM is loaded into RAM and it's always
at the same address. And so it's probably
00:12:55.380 --> 00:13:00.289
like during boot up, it gets copied into
RAM or something along those lines. And so
00:13:00.289 --> 00:13:05.420
this is pretty cool to know, because it
tells us a couple of things. First off, we
00:13:05.420 --> 00:13:09.790
know now that the debug port is enabled
and working, but that it's unfortunately
00:13:09.790 --> 00:13:16.319
at RDP level one and so we can only dump
the RAM. And we also know that the NES ROM
00:13:16.319 --> 00:13:21.259
is loaded into RAM. And this means that
the device runs a real NES emulator. And
00:13:21.259 --> 00:13:25.680
so if we get lucky, we can, for example,
just replace the ROM that is used by
00:13:25.680 --> 00:13:29.840
the device and play, for example,
our own NES game.
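
NOTE
Editor's sketch, not the speaker's actual tooling: the manual hex-editor comparison described above can be approximated by searching the RAM dump for verbatim chunks of the original ROM. The function names, the 64-byte chunk size and the file names in the usage line are assumptions made for illustration.
```python
def find_rom_in_dump(ram_dump: bytes, rom: bytes, chunk_size: int = 64) -> dict[int, int]:
    """Map ROM offsets to RAM-dump offsets for every ROM chunk found verbatim."""
    hits: dict[int, int] = {}
    for rom_off in range(0, len(rom) - chunk_size + 1, chunk_size):
        chunk = rom[rom_off:rom_off + chunk_size]
        # Skip padding (a chunk made of one repeated byte matches almost anywhere).
        if len(set(chunk)) == 1:
            continue
        ram_off = ram_dump.find(chunk)
        if ram_off != -1:
            hits[rom_off] = ram_off
    return hits
#
def load_offset(hits: dict[int, int]):
    """Return the single RAM offset the ROM was copied to, or None if the hits disagree."""
    deltas = {ram_off - rom_off for rom_off, ram_off in hits.items()}
    return deltas.pop() if len(deltas) == 1 else None
```
Usage (hypothetical file names): hits = find_rom_in_dump(open("ram.bin", "rb").read(), open("smb.nes", "rb").read()). If load_offset(hits) returns the same value across several dumps, the ROM sits at a fixed address. Keep in mind that an NES ROM file usually starts with a 16-byte iNES header that may not be present in RAM.
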
00:13:30.600 --> 00:13:33.460
little pause
00:13:33.930 --> 00:13:37.010
Next, it was time to dump the flash chip
00:13:37.010 --> 00:13:41.160
of the device. For this, I'm using a
device called Mini Pro and I'm using one
00:13:41.160 --> 00:13:46.959
of these really useful SOIC8 clips. And so
these ones you can simply clip onto the
00:13:46.959 --> 00:13:52.240
flash chip and then dump it. Now, one
warning though, the flash chip on the device,
00:13:52.240 --> 00:13:56.220
is running at 1.8 volts. And so you want to
make sure that your programmer also
00:13:56.220 --> 00:14:01.839
supports 1.8 volt operation. If you
accidentally try to read it out at 3.3 volts,
00:14:01.839 --> 00:14:06.770
you will break your flash. Trust
me, because it happened to me on one of my
00:14:06.770 --> 00:14:12.940
units. Now, with this flash dump from the
device, we can start to analyze it. And
00:14:12.940 --> 00:14:17.319
what I always like to do first, is take a
look at the entropy or the randomness of
00:14:17.319 --> 00:14:23.350
the flash dump. And so using binwalk with
the -E option, we get a nice entropy
00:14:23.350 --> 00:14:27.410
graph. And in this case, you can see we
have a very high entropy over almost the
00:14:27.410 --> 00:14:32.899
whole flash contents. And this mostly
indicates, that the flash contents are
00:14:32.899 --> 00:14:37.240
encrypted. It could also mean compression,
but if it's compressed, you would often
00:14:37.240 --> 00:14:43.529
see more like dips in the entropy. And in
this case, it's one very high entropy
00:14:43.529 --> 00:14:48.830
stream. We also noticed, that there are no
repetitions whatsoever, which also tells
00:14:48.830 --> 00:14:53.350
us that it's probably not like a simple
XOR based encryption or so and instead
00:14:53.350 --> 00:14:58.340
something like AES or something similar.
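
NOTE
Editor's sketch: binwalk -E plots the entropy for you, but the underlying measurement is simple enough to reproduce. The code below computes the Shannon entropy (in bits per byte) of each block of a dump. The function names and the 1024-byte block size are assumptions for illustration and are not taken from binwalk.
```python
import math
from collections import Counter
#
def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform data."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
#
def block_entropies(data: bytes, block_size: int = 1024) -> list[float]:
    """Entropy of each consecutive block, suitable for plotting against the offset."""
    return [shannon_entropy(data[i:i + block_size]) for i in range(0, len(data), block_size)]
```
Values close to 8 across the whole image point towards encryption or compression. As the talk says, compressed data often shows dips, while the dump here is one uniformly high-entropy stream.
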
But, just because the flash is encrypted
00:14:58.340 --> 00:15:02.199
doesn't mean we have to give up. On the
contrary, I think now it starts to get
00:15:02.199 --> 00:15:06.829
interesting, because you actually have a
challenge and it's not just plug and play,
00:15:06.829 --> 00:15:13.020
so to say. One of the biggest questions I
had is, is the flash actually verified?
00:15:13.020 --> 00:15:18.160
Like does the device boot, even though the
flash has been modified? Because, if it
00:15:18.160 --> 00:15:24.789
does, this would open up a lot of attack
vectors, basically, as you will see. And
00:15:24.789 --> 00:15:30.720
so to verify this, I basically tried to
put zeros in random places in the flash
00:15:30.720 --> 00:15:35.760
image. And so, I put some at address zero,
some at 0x2000 and so on. And then I
00:15:35.760 --> 00:15:39.910
checked whether the device would still
boot up. And with most flash
00:15:39.910 --> 00:15:44.370
modifications, it would still boot just
fine. This tells us, that even though the
00:15:44.370 --> 00:15:48.599
flash contents are encrypted, they are not
validated, they are not checksummed or
00:15:48.599 --> 00:15:54.610
anything. And so we can potentially trick
the device into accepting a modified flash
00:15:54.610 --> 00:15:58.529
image. And this is really important to
know, as you will see in a couple of
00:15:58.529 --> 00:16:05.310
minutes. My next suspicion was, that maybe
the NES ROM we see in RAM, is actually
00:16:05.310 --> 00:16:12.839
loaded from the external flash. And so to
find out whether that's the case, I again
00:16:12.839 --> 00:16:18.939
took the flash and I inserted zeros at
multiple positions in the flash image.
00:16:18.939 --> 00:16:24.550
Flashed that over, booted-up the game,
dumped the RAM and then compared the NES
00:16:24.550 --> 00:16:29.620
ROM that I'm now dumping from RAM with the
one that I dumped initially and checked
00:16:29.620 --> 00:16:35.399
whether they are equal. Because my
suspicion was that maybe I can overwrite a
00:16:35.399 --> 00:16:41.519
couple of bytes in the encrypted flash and
then I will modify the NES ROM. And after
00:16:41.519 --> 00:16:46.760
doing this for, like, I don't know, half
an hour, I got lucky and I modified 4
00:16:46.760 --> 00:16:51.399
bytes in the flash image and 4 bytes in the
RAM...sorry...in the ROM that was loaded
00:16:51.399 --> 00:16:56.790
into RAM changed. And this tells us quite
a bit. It means that the ROM is loaded
00:16:56.790 --> 00:17:04.450
from flash into RAM and that the flash
contents are not validated. And what's
00:17:04.450 --> 00:17:10.280
also important is, that we changed 4
bytes in the flash and now 4 bytes in
00:17:10.280 --> 00:17:15.510
the decrypted image changed. And this is
very important to know, because if we take
00:17:15.510 --> 00:17:19.740
a look at what we would expect to happen
when we change the flash contents, there
00:17:19.740 --> 00:17:23.880
are multiple outcomes. And so, for
example, here we have the SPI-flash
00:17:23.880 --> 00:17:29.310
contents on the left and the RAM contents
on the right. And so the RAM contents are
00:17:29.310 --> 00:17:35.410
basically the decrypted version of the
SPI-flash contents. Now let's say we
00:17:35.410 --> 00:17:41.750
change 4 bytes in the encrypted flash
image to zeros. How would we expect the
00:17:41.750 --> 00:17:47.580
RAM contents to change? For example, if we
see that now 16 bytes in the RAM are
00:17:47.580 --> 00:17:52.960
changing, this means that we are
potentially looking at an encryption
00:17:52.960 --> 00:17:57.650
algorithm, such as AES in electronic
codebook mode. Because, it's a block based
00:17:57.650 --> 00:18:03.180
encryption and so if we change four bytes
in the input data, a full block, in this
00:18:03.180 --> 00:18:09.730
case 16 bytes, of the output data would
change. The next possibility is, that we
00:18:09.730 --> 00:18:16.160
change 4 bytes in the SPI-flash and all
data afterwards will be changed. And in
00:18:16.160 --> 00:18:21.830
this case, we would look at some kind of
chaining cipher such as AES in the CBC
00:18:21.830 --> 00:18:27.600
mode. However, if we change 4 bytes in
the SPI-flash and only 4 bytes in the
00:18:27.600 --> 00:18:33.510
RAM changed, we are looking at
something such as AES in counter mode. And
00:18:33.510 --> 00:18:40.270
to understand this, let's take a better
look at how AES in CTR works. AES-CTR
00:18:40.270 --> 00:18:45.930
works by having your cleartext and xoring
it with an AES encryption stream, that is
00:18:45.930 --> 00:18:53.211
generated from a key, a Nonce and the
counter algorithm. Now, the AES stream,
00:18:53.211 --> 00:18:57.370
that will be used to xor your
cleartext will always be the same, if key
00:18:57.370 --> 00:19:02.840
and Nonce are the same. This is why it's
super important, that if you use AES-CTR,
00:19:02.840 --> 00:19:08.780
you always select a unique Nonce for each
encryption. If you encrypt similar data
00:19:08.780 --> 00:19:15.060
with the same Nonce twice, large parts of
the resulting ciphertext will be the same.
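
NOTE
Editor's sketch of the experiment described a moment ago: patch a few bytes of the encrypted flash, dump the decrypted RAM before and after, and look at how far the change spreads. The function names and the returned labels are assumptions for illustration. This is a rough heuristic, not a reliable cipher detector.
```python
def changed_offsets(before: bytes, after: bytes) -> list[int]:
    """Offsets at which two equally sized dumps differ."""
    return [i for i, (x, y) in enumerate(zip(before, after)) if x != y]
#
def guess_cipher_behaviour(patched_len: int, offsets: list[int], block_size: int = 16) -> str:
    """Classify how a small ciphertext patch spread into the decrypted data."""
    if not offsets:
        return "no visible change"
    first, last = offsets[0], offsets[-1]
    span = last - first + 1
    if span <= patched_len:
        # Only the patched bytes changed: the data is XORed with a keystream.
        return "stream-like (e.g. AES-CTR)"
    if first // block_size == last // block_size:
        # Exactly one cipher block was scrambled.
        return "block-wise (e.g. AES-ECB)"
    return "chained or unknown (e.g. AES-CBC)"
```
A patch that straddles two cipher blocks would scramble two blocks even in ECB mode and land in the last category, so it helps to keep the patch inside one 16-byte aligned block.
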
00:19:15.060 --> 00:19:19.960
And so the cleartext gets xored with the
AES-CTR stream and then we get our
00:19:19.960 --> 00:19:26.570
ciphertext. Now, if we know the cleartext,
as we do, because the cleartext is the ROM,
00:19:26.570 --> 00:19:32.270
that is loaded into RAM and we know the
ciphertext, which we do, because it's the
00:19:32.270 --> 00:19:38.010
contents of the encrypted flash we just
dumped, we can basically reverse the
00:19:38.010 --> 00:19:44.580
operation and as a result, we get the
AES-CTR stream, that was used to encrypt the
00:19:44.580 --> 00:19:52.050
flash. And now this means, that we can
take, for example, a custom ROM, xor it
00:19:52.050 --> 00:19:57.830
with the AES-CTR stream we just
calculated and then generate our own
00:19:57.830 --> 00:20:02.010
encrypted flash image, for example, with a
modified ROM. And so I wrote a couple of
00:20:02.010 --> 00:20:08.340
Python scripts to try this. And after a
while, I was running Hacked Super Mario
00:20:08.340 --> 00:20:14.290
Brothers instead of Super Mario Brothers.
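
NOTE
Editor's sketch of the XOR trick just described, not the scripts from the speaker's repository: XORing the known cleartext with the ciphertext yields the keystream, and XORing that keystream with a new cleartext of the same or shorter length yields a ciphertext the device will decrypt to the new data. All function names are assumptions for illustration. No key or nonce is ever recovered.
```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))
#
def recover_keystream(ciphertext: bytes, known_plaintext: bytes) -> bytes:
    """keystream = ciphertext XOR plaintext, valid only for this flash region."""
    return xor_bytes(ciphertext, known_plaintext)
#
def encrypt_with_keystream(keystream: bytes, new_plaintext: bytes) -> bytes:
    """Encrypt replacement data for the same region using the recovered keystream."""
    if len(new_plaintext) > len(keystream):
        raise ValueError("replacement data is longer than the recovered keystream")
    return xor_bytes(keystream[:len(new_plaintext)], new_plaintext)
```
The recovered keystream only covers the flash offsets for which the cleartext is known, so a replacement ROM has to fit into the region occupied by the original one.
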
So, wohoo, we hacked the Nintendo Game &
00:20:14.290 --> 00:20:18.870
Watch one day before the official release.
And we can install modified Super Mario
00:20:18.870 --> 00:20:23.990
Brothers ROMs. Now, you can find the
scripts that I used for this on my Github.
00:20:23.990 --> 00:20:28.260
So it's in a repository called "Game &
Watch Hacking". And I was super excited,
00:20:28.260 --> 00:20:33.570
because it meant, that I succeeded and that
I basically hacked a Nintendo console one
00:20:33.570 --> 00:20:37.961
day before the official release.
Unfortunately, I finished the level, but
00:20:37.961 --> 00:20:43.350
Toad wasn't as excited. He told me that
unfortunately, our firmware is still in
00:20:43.350 --> 00:20:50.050
another castle. And so on the Monday after
the launch of the device, I teamed up with
00:20:50.050 --> 00:20:54.790
Konrad Beckmann, a hardware hacker from
Sweden who I met at the previous Congress.
00:20:54.790 --> 00:20:59.850
And we started chatting and throwing ideas
back and forth and so on. And eventually
00:20:59.850 --> 00:21:05.620
we noticed that the device has a special
RAM area called ITCM-RAM, which is a
00:21:05.620 --> 00:21:10.570
tightly coupled instruction RAM that is
normally used for very high performance
00:21:10.570 --> 00:21:15.121
routines such as interrupt handlers and so
on. And so it's in a very fast RAM area.
00:21:15.121 --> 00:21:22.160
And we realized that we never actually
looked at the contents of that ITCM-RAM.
00:21:22.160 --> 00:21:26.540
And so we dumped it from the device using
the debugging port. And it turns out that
00:21:26.540 --> 00:21:33.020
this ITCM-RAM contains ARM code. And so,
again, the question is, where does this
00:21:33.020 --> 00:21:37.570
ARM code come from, does it maybe just
like the NES ROM come from the external
00:21:37.570 --> 00:21:45.741
flash? And so basically, I repeated the
whole thing that we also did with the NES
00:21:45.741 --> 00:21:52.260
ROM and just put zeros at the very
beginning of the encrypted flash. Rebooted
00:21:52.260 --> 00:21:57.720
the device and dumped the ITCM-RAM and I
got super lucky: on the first try already,
00:21:57.720 --> 00:22:03.990
the ITCM contents changed. And because the
ITCM contains code, not just data, so
00:22:03.990 --> 00:22:09.300
earlier we only had the NES-ROM, which is
just data, but this time the RAM contains
00:22:09.300 --> 00:22:14.850
code. This means that with the same XOR
trick we used before, we could inject
00:22:14.850 --> 00:22:21.530
custom ITCM code into the external flash,
which would then be loaded into RAM when
00:22:21.530 --> 00:22:27.620
the device boots. And because it's a
persistent method, we can then reboot the
00:22:27.620 --> 00:22:32.520
device and let it run without the debugger
connected. And so whatever code we load
00:22:32.520 --> 00:22:38.490
into this ITCM area will be able to
actually read the internal flash. And so we could
00:22:38.490 --> 00:22:43.280
potentially write some code that gets
somehow called by the firmware and then
00:22:43.280 --> 00:22:49.540
copies the internal flash into RAM from
where we then can retrieve it using the
00:22:49.540 --> 00:22:57.560
debugger. Now, the problem is, let's say
we have a custom payload somehow in this
00:22:57.560 --> 00:23:04.750
ITCM area. We don't know which address of
this ITCM code gets executed. And so we
00:23:04.750 --> 00:23:09.410
don't know whether the firmware will jump
to address zero or address 200 or whatever.
00:23:09.410 --> 00:23:14.270
But there's a really simple trick to still
build a successful payload. And it's
00:23:14.270 --> 00:23:19.230
called a NOP slide. A NOP, or no
operation, is an instruction that simply
00:23:19.230 --> 00:23:25.100
does nothing. And if we fill most of the
ITCM-RAM with NOPs and put our payload at
00:23:25.100 --> 00:23:31.700
the very end, we build something that is
basically a NOP-slide. And so when the
00:23:31.700 --> 00:23:37.260
CPU, indicated by Mario here, jumps to a
random address in that whole NOP-slide, it
00:23:37.260 --> 00:23:43.500
will start executing NOPs and slide down
into our payload and execute it. And so
00:23:43.500 --> 00:23:49.100
even if Mario jumps right in the middle of
the NOP-slide, he will always slide down
00:23:49.100 --> 00:23:54.920
the slide and end up in our payload. And
Konrad wrote this really, really simple
00:23:54.920 --> 00:23:58.330
payload, which is only like 10
instructions, which basically just copies
00:23:58.330 --> 00:24:03.980
the internal flash into RAM from where we
can then retrieve it using the debugger.
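
NOTE
Editor's sketch of how such a NOP slide image could be assembled, not Konrad's actual payload: the region is filled with 16-bit Thumb NOP instructions (encoding 0xBF00, stored little-endian) and the payload is placed at the very end. The function name, the region size and the placeholder payload in the usage line are assumptions for illustration.
```python
THUMB_NOP = bytes.fromhex("00bf")  # Thumb "nop" (0xBF00) in little-endian byte order
#
def build_nop_slide(region_size: int, payload: bytes) -> bytes:
    """Fill a code region with Thumb NOPs and put the payload at its end."""
    if len(payload) % 2 or region_size % 2:
        raise ValueError("Thumb code must be halfword aligned")
    padding = region_size - len(payload)
    if padding < 0:
        raise ValueError("payload does not fit into the region")
    # Wherever execution enters the padding, it runs NOPs until it reaches the payload.
    return THUMB_NOP * (padding // 2) + payload
```
Usage: image = build_nop_slide(0x400, bytes.fromhex("fee7")) builds a hypothetical 1 KB region that ends in an endless loop (Thumb "b ." is 0xE7FE). On the real device the resulting image would still have to be XORed with the recovered keystream before it is written to the external flash.
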
00:24:03.980 --> 00:24:08.280
So wohoo, super simple exploit. We have a
full firmware backup and a full flash
00:24:08.280 --> 00:24:13.590
backup and now we can really fiddle with
everything on the device. And we've
00:24:13.590 --> 00:24:17.700
actually released tools to do this
yourself. And so if you want to back up
00:24:17.700 --> 00:24:23.161
your Nintendo Game & Watch, you can just
go onto my GitHub and download the
00:24:23.161 --> 00:24:27.670
game-and-watch-backup repository, which
contains a lot of information on how to
00:24:27.670 --> 00:24:33.270
back it up. It does checksumming and
so on to ensure that you don't
00:24:33.270 --> 00:24:38.420
accidentally brick your device and you can
easily back up the original firmware,
00:24:38.420 --> 00:24:43.610
install homebrew, and then always go back
to the original software. We also have an
00:24:43.610 --> 00:24:50.630
awesome support community on Discord. And
so if you ever need help, I think you will
00:24:50.630 --> 00:24:55.270
find success there. And so far we haven't
had a single bricked Game & Watch and so
00:24:55.270 --> 00:25:02.200
looks to be pretty stable. And so I
was pretty excited because the quest was
00:25:02.200 --> 00:25:11.170
over. Or is it? If you ever claim on the
internet that you successfully hacked an
00:25:11.170 --> 00:25:18.180
embedded device, there will be exactly one
response and one response only: but does
00:25:18.180 --> 00:25:23.610
it run Doom? Literally my Twitter DMs, my
YouTube comments, and even my friends were
00:25:23.610 --> 00:25:28.720
spamming me with the challenge to get Doom
running on the device. But to get Doom
00:25:28.720 --> 00:25:34.390
running, we first needed to bring up all
the hardware. And so we basically needed
00:25:34.390 --> 00:25:40.070
to create a way to develop and load
homebrew onto the device. Now, luckily for
00:25:40.070 --> 00:25:44.880
us, most of the components on the board
are very well documented and so there are
00:25:44.880 --> 00:25:50.040
no NDA components. And so, for example,
the processor has an open reference manual
00:25:50.040 --> 00:25:56.890
and open source library to use it. The
flash is a well-known flash chip. And so
00:25:56.890 --> 00:26:00.440
on and so forth. And there are only a
couple of very proprietary or custom
00:26:00.440 --> 00:26:06.280
components. And so, for example, the LCD
on the device is proprietary and we had to
00:26:06.280 --> 00:26:12.690
basically sniff the SPI-bus that goes to
the display to basically decode the
00:26:12.690 --> 00:26:19.160
initialization of the display and so on.
And after a while, we had the full
00:26:19.160 --> 00:26:24.540
hardware running, we had LCD support, we
had audio support, deep sleep support, buttons
00:26:24.540 --> 00:26:29.210
and even flashing tools that allow you to
simply use an SWD debugger to dump and
00:26:29.210 --> 00:26:33.820
rewrite the external flash. And you can
find all of these things on our GitHub.
00:26:33.820 --> 00:26:38.520
Now, if you want to mod your own Game &
Watch, all you need is a simple debugging
00:26:38.520 --> 00:26:46.840
adapter such as a cheap, three dollar
ST-link, a J-link or a real ST-link device,
00:26:46.840 --> 00:26:51.140
and then you can get started. We've also
published a base project for anyone who
00:26:51.140 --> 00:26:54.911
wants to get started with building their
own games for the Game & Watch. And so
00:26:54.911 --> 00:26:58.670
it's really simple. It's just a frame
buffer you can draw to, input is really
00:26:58.670 --> 00:27:04.470
simple and so on. And as said, we have a
really helpful community. Now with all the
00:27:04.470 --> 00:27:10.000
hardware up and running, I could finally
start porting Doom. I started by looking
00:27:10.000 --> 00:27:15.420
around for other ports of Doom to an
STM32. And I found this project by floppes
00:27:15.420 --> 00:27:22.010
called stm32doom. Now the issue is,
stm32doom is designed for a board with
00:27:22.010 --> 00:27:28.340
eight megabytes of RAM and also the data
files for Doom were stored on an external USB
00:27:28.340 --> 00:27:37.630
drive. On our platform, we only have 1.3
MB of RAM, 128 kB of flash and only 1 MB
00:27:37.630 --> 00:27:42.600
of external flash and we have to fit all
the level information, all the code and
00:27:42.600 --> 00:27:50.880
so on in there. Now, the Doom level
information is stored in so-called WAD
00:27:50.880 --> 00:27:57.240
(Where's All the Data) files. And these data
files contain the sprites, the textures,
00:27:57.240 --> 00:28:03.230
the levels and so on. Now the WAD for Doom
1 is roughly four megabytes in size and
00:28:03.230 --> 00:28:11.440
the WAD for Doom 2 is 40 MB in size. But
we only have 1.1 MB of storage. Plus we
00:28:11.440 --> 00:28:16.390
have to fit all the code in there. So
obviously we needed to find a very, very
00:28:16.390 --> 00:28:22.200
small Doom port. And as it turns out,
there's a file called Mini-WAD, which is a
00:28:22.200 --> 00:28:27.680
minimal Doom IWAD, in which basically
all the bells and whistles are stripped
00:28:27.680 --> 00:28:34.240
from the WAD file and everything replaced
by simple outlines and so on. And while
00:28:34.240 --> 00:28:38.130
it's not pretty, I was pretty confident
that I could get it working as it's only
00:28:38.130 --> 00:28:46.320
250 kB of storage, down from 40 megabytes.
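For readers who have not looked inside a WAD before, here is a minimal sketch of how the container is laid out: a 12-byte header followed by lumps and a directory of 16-byte entries. The demo file built below is made up purely for illustration and is not the Mini-WAD from the talk; the helper names are hypothetical.

```python
import struct

def parse_wad_directory(data: bytes):
    """Return (kind, [(name, offset, size), ...]) for a WAD image held in memory."""
    # Header: 4-byte magic, lump count, directory offset (little-endian int32s).
    magic, num_lumps, dir_offset = struct.unpack_from("<4sii", data, 0)
    if magic not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    lumps = []
    for i in range(num_lumps):
        # Each directory entry is 16 bytes: data offset, size, 8-byte padded name.
        offset, size, raw_name = struct.unpack_from("<ii8s", data, dir_offset + 16 * i)
        lumps.append((raw_name.rstrip(b"\0").decode("ascii"), offset, size))
    return magic.decode("ascii"), lumps

def build_demo_wad() -> bytes:
    """Build a tiny made-up WAD with two lumps, purely for demonstration."""
    payloads = [(b"PLAYPAL", b"\x00" * 8), (b"E1M1", b"")]
    body = b"".join(payload for _, payload in payloads)
    dir_offset = 12 + len(body)
    directory = b""
    pos = 12
    for name, payload in payloads:
        directory += struct.pack("<ii8s", pos, len(payload), name)
        pos += len(payload)
    return struct.pack("<4sii", b"IWAD", len(payloads), dir_offset) + body + directory

kind, lumps = parse_wad_directory(build_demo_wad())
for name, offset, size in lumps:
    print(f"{kind} lump {name:8s} offset={offset} size={size}")
```

Summing the `size` column of a real WAD is a quick way to see which lumps dominate the file when trying to fit it into about 1 MB.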
Now, in addition to that, a lot of stuff
00:28:46.320 --> 00:28:51.300
on the Chocolate Doom port itself had to
be changed. And so, for example, I had to
00:28:51.300 --> 00:28:56.150
rip out all the file handling and add a
custom file handler. I had to add support
00:28:56.150 --> 00:29:01.230
for the Game & Watch LCD, button input
support. And I also had to get rid of a
00:29:01.230 --> 00:29:05.350
lot of things to get it running somewhat
smoothly. And so, for example, the
00:29:05.350 --> 00:29:10.630
infamous Wipe effect had to go and I also
had to remove sound support. Now, the next
00:29:10.630 --> 00:29:16.270
issue was that once it was compiling, it
simply would not fit into RAM and crash
00:29:16.270 --> 00:29:22.820
all the time. Now on the device, we have
roughly 1.3 MB of RAM in different RAM
00:29:22.820 --> 00:29:27.510
areas. And for example just the frame
buffer, that we obviously need, takes up
00:29:27.510 --> 00:29:36.350
154 kB of that. Then we have 160 kB of
initialized data, 320 kB of uninitialized
00:29:36.350 --> 00:29:42.000
data and a ton of dynamic allocations that
are done by Chocolate Doom. And these
00:29:42.000 --> 00:29:46.610
dynamic allocations were a huge issue
because the Chocolate Doom source code
00:29:46.610 --> 00:29:52.480
does a lot of small allocations, which are
only used for temporary data. And so they
00:29:52.480 --> 00:29:58.600
get freed again and so on, and so your
dynamic memory gets very, very fragmented
00:29:58.600 --> 00:30:02.710
very quickly, and so eventually there's
just not enough space to, for example,
00:30:02.710 --> 00:30:09.791
initialize the level. And so to fix this,
I took the Chocolate Doom code and I
00:30:09.791 --> 00:30:15.110
changed a lot of the dynamic allocations
to static allocations, which also had the
00:30:15.110 --> 00:30:22.030
big advantage of making the error messages
by the compiler much more meaningful.
00:30:22.030 --> 00:30:27.340
Because it would actually tell you: Hey,
this and this data does not fit into RAM.
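The fragmentation problem described above can be illustrated with a toy first-fit heap. This is only a model under simplifying assumptions (fixed 1 KiB blocks, a 64 KiB heap, no coalescing needed); it is not Chocolate Doom's actual allocator, and all names and sizes are hypothetical.

```python
def first_fit(free_list, size):
    """Allocate `size` bytes from a list of (start, length) holes; return start or None."""
    for i, (start, length) in enumerate(free_list):
        if length >= size:
            if length == size:
                free_list.pop(i)
            else:
                free_list[i] = (start + size, length - size)
            return start
    return None

HEAP_SIZE = 64 * 1024
free_list = [(0, HEAP_SIZE)]

# Fill the heap with 1 KiB blocks, then free every second one, as if
# short-lived temporaries had been interleaved with long-lived data.
blocks = [first_fit(free_list, 1024) for _ in range(64)]
for start in blocks[::2]:
    # The neighbours of each freed block are still in use, so holes cannot merge.
    free_list.append((start, 1024))

total_free = sum(length for _, length in free_list)
largest_hole = max(length for _, length in free_list)
level_buffer = first_fit(free_list, 16 * 1024)

print(total_free, largest_hole, level_buffer)
```

Half of the heap is free, yet a single 16 KiB request fails because no hole is larger than 1 KiB. A statically allocated buffer avoids this: its size is fixed at build time, so an overflow shows up as a linker error instead of a crash at run time.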
00:30:27.340 --> 00:30:31.990
And eventually, after a lot of trial and
error and copying as many of the original
00:30:31.990 --> 00:30:39.400
assets as possible into the minimal IWAD,
I got it. I had Doom running on the
00:30:39.400 --> 00:30:45.030
Nintendo Game & Watch Super Mario Brothers
and I hopefully calmed the internet gods
00:30:45.030 --> 00:30:49.750
that forced me to do it. Now,
unfortunately, the USB port is physically
00:30:49.750 --> 00:30:55.690
not connected to the processor and so it
will not be possible to hack the device
00:30:55.690 --> 00:31:00.390
simply by plugging it into your computer.
However, it's relatively simple to do this
00:31:00.390 --> 00:31:06.790
using one of these USB debuggers. Now, the
most requested type of homebrew software
00:31:06.790 --> 00:31:12.870
was obviously emulators. And I'm proud to
say that by now we actually have kind of a
00:31:12.870 --> 00:31:19.210
large collection of emulators running on
the Nintendo Game & Watch. And it all
00:31:19.210 --> 00:31:23.370
started with Konrad Beckmann discovering
the Retro Go Project, which is an emulator
00:31:23.370 --> 00:31:29.970
collection for a device called the Odroid
Go and the Odroid Go is a small handheld
00:31:29.970 --> 00:31:35.880
with similar input and size constraints as
the Nintendo Game & Watch. And so it's
00:31:35.880 --> 00:31:40.630
kind of cool to port this over because it
basically already did all of the hard
00:31:40.630 --> 00:31:47.670
work, so to say. And Retro Go comes with
emulators for the NES, for the Gameboy and
00:31:47.670 --> 00:31:52.770
the Gameboy Color and even for the Sega
Master System and the Sega Game Gear. And
00:31:52.770 --> 00:31:58.290
after a couple of days, Konrad actually
was able to show off his NES emulator
00:31:58.290 --> 00:32:02.960
running Zelda and other games such as
Contra and so on, on the Nintendo Game &
00:32:02.960 --> 00:32:09.230
Watch. This is super fun and initially we
only had really a basic emulator that
00:32:09.230 --> 00:32:13.170
could barely play and we had a lot of
frame drops, we didn't have nice scaling,
00:32:13.170 --> 00:32:18.290
VSync and so on. But now after a couple of
weeks, it's really a nice device to use
00:32:18.290 --> 00:32:24.090
and to play with. And so we also have a
Gameboy emulator running and so you can
00:32:24.090 --> 00:32:29.440
play your favorite Gameboy games such as
Pokémon, Super Mario Land and so on on the
00:32:29.440 --> 00:32:35.160
Nintendo Game & Watch if you own the
corresponding ROM backups. And we also
00:32:35.160 --> 00:32:38.650
experimented with different scaling
algorithms to make the most out of the
00:32:38.650 --> 00:32:43.310
screen. And so you can basically change
the scaling algorithm that is used for the
00:32:43.310 --> 00:32:48.160
display, depending on what you prefer. And
you could even change the palette for the
00:32:48.160 --> 00:32:54.450
different games. We also have a nice game
chooser menu which allows you to basically
00:32:54.450 --> 00:32:59.240
have multiple ROMs on the device that you
can switch between. We have save state
00:32:59.240 --> 00:33:04.210
support and so if you turn off the device,
it will save wherever you left off and you
00:33:04.210 --> 00:33:08.870
can even come back to your save game once
the battery has run out. You can find the
00:33:08.870 --> 00:33:14.380
source code for all of that on the Retro
Go repository from Konrad. And it's
00:33:14.380 --> 00:33:20.710
really, really awesome. Other people built
for example emulators for the CHIP-8
00:33:20.710 --> 00:33:25.430
system and so the CHIP-8 emulator comes
with a nice collection of small arcade
00:33:25.430 --> 00:33:31.271
games and so on, and it's really fun and
really easy to develop for it. And so
00:33:31.271 --> 00:33:37.010
really give this a try if you own a Game &
Watch and want to try homebrew on it. Tim
00:33:37.010 --> 00:33:41.590
Schuerwegen is even working on an
emulator for the original Game & Watch
00:33:41.590 --> 00:33:45.920
games. And so this is really cool because
it basically turns the Nintendo Game &
00:33:45.920 --> 00:33:53.130
Watch into an emulator for all Game &
Watch games that were ever released. And
00:33:53.130 --> 00:33:57.860
what was really amazing to me is how the
community came together. And so we were
00:33:57.860 --> 00:34:02.140
pretty open about the progress on Twitter.
And also Konrad was Twitch streaming a lot
00:34:02.140 --> 00:34:06.480
of the process. And we opened up a Discord
where people could join who were
00:34:06.480 --> 00:34:11.850
interested in hacking on the device. And
it was amazing to see what came out of the
00:34:11.850 --> 00:34:16.720
community. And so, for example, we now
have a working storage upgrade that works
00:34:16.720 --> 00:34:21.179
both with homebrew but also with the
original firmware. And so instead of one
00:34:21.179 --> 00:34:25.320
megabyte of storage, you can have 16
megabytes of flash and you just need to
00:34:25.320 --> 00:34:30.549
replace a single chip, which is pretty
easy to do. Then for understanding the
00:34:30.549 --> 00:34:35.690
full hardware. Daniel Cuthbert and Daniel
Padilla provided us with high-resolution
00:34:35.690 --> 00:34:41.010
X-ray images, which allowed us to fully
understand every single connection, even
00:34:41.010 --> 00:34:46.379
of the BGA parts, without desoldering
anything. Then Jake Little of Upcycle
00:34:46.379 --> 00:34:52.980
Electronics traced on the X-rays and also
using a multimeter every last trace on the
00:34:52.980 --> 00:34:58.220
PCB, and he even created a schematic of
the device, which gives you all the
00:34:58.220 --> 00:35:02.260
details you need when you want to program
something. Also, it was really, really
00:35:02.260 --> 00:35:07.099
fun. Sander van der Wel for example even
created a custom backplate and now there
00:35:07.099 --> 00:35:13.220
are even projects that try to replace the
original PCB with a custom PCB with an
00:35:13.220 --> 00:35:20.019
FPGA and an ESP32. And so it's really
exciting to see what people come up with.
00:35:20.019 --> 00:35:24.819
Now, I hope you enjoyed this talk and I
hope to see you on our Discord if you want
00:35:24.819 --> 00:35:35.019
to join the fun. And thank you for coming.
00:35:35.019 --> 00:35:41.329
Herald: Hi. Wow, that was a really amazing
talk. Thank you very much Thomas. As
00:35:41.329 --> 00:35:48.140
announced in the beginning we do accept
questions from you and we have quite a
00:35:48.140 --> 00:35:54.450
few. Let's see if we manage to make it
through all of them. The first one is:
00:35:54.450 --> 00:35:59.650
Q: Did you read the articles about
Nintendo observing hackers, like private
00:35:59.650 --> 00:36:04.799
investigators, et cetera and are you
somehow worried about this?
00:36:04.799 --> 00:36:08.400
Thomas: Oh, what's going on with my
camera? Looks like Luigi messed around
00:36:08.400 --> 00:36:17.539
with my video setup here. Yeah, so I've
read those articles, but I believe that
00:36:17.539 --> 00:36:22.210
in this case, there is no piracy issue,
right? Like, I'm not allowing anyone to
00:36:22.210 --> 00:36:26.940
play any new games. If you wanted to
dump a Super Mario ROM, you would have
00:36:26.940 --> 00:36:32.160
done it 30 years ago or on the NES Classic
or on the Switch or on any of the hundred
00:36:32.160 --> 00:36:37.240
consoles Nintendo launched in between. And
so I'm really not too worried about it, to
00:36:37.240 --> 00:36:41.480
be honest.
Herald: I also think the aspect of the
00:36:41.480 --> 00:36:50.270
target audience is to be seen here. So off
to the next question which is: Do you
00:36:50.270 --> 00:36:55.460
think that there is a reason why an
external flash chip has been used?
00:36:55.460 --> 00:37:02.849
Thomas: Yeah. So the internal flash of the
STM32H7B0 is relatively small. It's only
00:37:02.849 --> 00:37:08.450
128 kB. And so they simply couldn't
fit everything in, like basically even
00:37:08.450 --> 00:37:13.240
just the frame buffer. Even just a frame
buffer picture also is larger than the
00:37:13.240 --> 00:37:19.100
internal flash. And so I think that's why
they did it and I'm glad they did.
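The size comparison can be checked with a few lines of arithmetic. This sketch assumes a 320x240 display with 16-bit pixels, which is consistent with the roughly 154 kB frame buffer mentioned in the talk; the pixel format is an assumption, not something stated here.

```python
WIDTH, HEIGHT = 320, 240        # assumed LCD resolution
BYTES_PER_PIXEL = 2             # assumption: 16-bit pixels (e.g. RGB565)

framebuffer_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
internal_flash_bytes = 128 * 1024

print(framebuffer_bytes)                         # size of one full-screen image
print(framebuffer_bytes > internal_flash_bytes)  # does it exceed internal flash?
```

Under these assumptions a single uncompressed full-screen image is already larger than the 128 kB of internal flash, before any code is stored.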
00:37:19.100 --> 00:37:26.730
Herald: Sure. And is the decryption done
in software or is it a feature of the
00:37:26.730 --> 00:37:30.460
microcontroller?
Thomas: So the microcontroller has an
00:37:30.460 --> 00:37:36.160
integrated feature called OTFDEC and
basically the flash is directly mapped
00:37:36.160 --> 00:37:41.109
into memory and they have this chip
peripheral called OTFDEC that automatically
00:37:41.109 --> 00:37:45.430
provides the decryption and so on. And so
it's done all in hardware and you can even
00:37:45.430 --> 00:37:48.350
retrieve the keys from hardware,
basically.
00:37:48.350 --> 00:37:57.910
Herald: OK, very nice. And also, the next
question is somehow related to that: Is in
00:37:57.910 --> 00:38:03.520
your opinion the encryption Nintendo has
applied even worth the effort for them?
00:38:03.520 --> 00:38:07.430
It feels like it's just there to give
shareholders a false sense of security.
00:38:07.430 --> 00:38:12.709
What would you think about that?
Thomas: I think from my perspective, they
00:38:12.709 --> 00:38:16.489
chose just the right encryption because
it was a ton of fun to reverse engineer
00:38:16.489 --> 00:38:21.910
and try to bypass it and so it was an
awesome challenge and so I think they did
00:38:21.910 --> 00:38:26.900
everything right. But I also think in the
end, it's such a simple device and it's
00:38:26.900 --> 00:38:31.569
like if you take a look at what people are
building on top of it with like games and
00:38:31.569 --> 00:38:36.680
all that kind of stuff. I think they did
everything right, but probably it was just
00:38:36.680 --> 00:38:41.569
a tick mark: "Yeah, we totally locked
down JTAG." And yeah, but I think it's fun
00:38:41.569 --> 00:38:44.609
because again, it doesn't open up any
piracy issues.
00:38:44.609 --> 00:38:51.140
Herald: Sure. The one thing is related to
the NOP slide, which you very, very well
00:38:51.140 --> 00:39:01.189
animated. So wouldn't starts of
subroutines be suitable as well for that,
00:39:01.189 --> 00:39:11.460
for that goal. The person asking says that
a big push R4, R5, etc. instructions are
00:39:11.460 --> 00:39:20.640
quite recognizable. How would ... Yeah
Thomas: Yeah. So absolutely. The time from
00:39:20.640 --> 00:39:25.019
finding the data in the ITCM-RAM and
actually exploiting it was less than an
00:39:25.019 --> 00:39:29.950
hour. And so if we would have tried to
reverse engineer it, it would be more
00:39:29.950 --> 00:39:33.660
work. Like absolutely possible and also
not difficult, but just filling the RAM
00:39:33.660 --> 00:39:38.559
with NOP took a couple of minutes and so
was really the easiest way and the fastest
00:39:38.559 --> 00:39:45.420
way without fiddling around in Ghidra or so.
Herald: OK, cool, thanks. And this is more
00:39:45.420 --> 00:39:54.329
a remark than a question. The person says
it's strange that ST's AN5281 does not
00:39:54.329 --> 00:39:59.630
mention a single time that the data is not
verified during encryption. I think it's
00:39:59.630 --> 00:40:05.759
more a fault on ST's than Nintendo's side.
What would you think about that?
00:40:05.759 --> 00:40:10.690
Thomas: Yeah, I would somewhat agree
because in this case, even if you don't
00:40:10.690 --> 00:40:17.670
have JTAG, like an ARM Thumb instruction is
2 or 4 bytes and so you have a relatively small
00:40:17.670 --> 00:40:21.859
space to brute force to potentially get an
interesting branch instruction and so on.
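To get a feel for how small that search space is, here is a toy model. It assumes decryption behaves like XOR with a keystream and has no integrity check; the random `keystream` value merely stands in for the real cipher output, and `decrypt_halfword` is a hypothetical helper, not a device API.

```python
import secrets

TARGET = 0xE7FE  # Thumb encoding of "b ." (branch to self), a 16-bit instruction

# Stand-in for the 16 keystream bits covering one halfword; unknown to the attacker.
keystream = secrets.randbits(16)

def decrypt_halfword(ciphertext: int) -> int:
    """Model of XOR-keystream decryption without any authentication (an assumption)."""
    return ciphertext ^ keystream

# Try every possible 16-bit ciphertext value at one flash location.
hits = [c for c in range(1 << 16) if decrypt_halfword(c) == TARGET]

print(len(hits), "of", 1 << 16, "candidates decrypt to the target instruction")
```

Exactly one of the 65,536 candidates produces the wanted instruction. On real hardware each guess would mean rewriting the flash and observing the device, so this only models the size of the search, not the effort per attempt.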
00:40:21.859 --> 00:40:28.009
So I think it's yeah, I mean, it's
not perfect, but also doing verification
00:40:28.009 --> 00:40:33.410
is very expensive, computationally, and
so I think it should just be the firmware
00:40:33.410 --> 00:40:37.160
that actually verifies the contents of the
external flash.
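A minimal sketch of what such a firmware-side check could look like, assuming a SHA-256 digest of the external flash is stored somewhere the attacker cannot modify. The image contents and the function name are made up for illustration; this is not how the original firmware works.

```python
import hashlib
import hmac

def flash_is_authentic(flash_image: bytes, expected_digest: bytes) -> bool:
    """Compare a SHA-256 digest of the flash contents against a trusted reference."""
    actual = hashlib.sha256(flash_image).digest()
    return hmac.compare_digest(actual, expected_digest)

original = bytes(range(256)) * 16               # made-up 4 KiB flash image
reference = hashlib.sha256(original).digest()   # would live in protected internal storage

tampered = bytearray(original)
tampered[100] ^= 0x01                           # flip a single bit

print(flash_is_authentic(original, reference))
print(flash_is_authentic(bytes(tampered), reference))
```

A check like this runs once at boot instead of on every memory access, which is the cost trade-off mentioned above.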
00:40:37.160 --> 00:40:44.109
Herald: OK, so I think we should ask 2
questions more and then we can go back to
00:40:44.109 --> 00:40:52.000
the studio. There is a question about the
AES encryption keys. Have you managed to
00:40:52.000 --> 00:40:57.349
recover them?
Thomas: Yes, we did. But so it's an
00:40:57.349 --> 00:41:01.700
application of AES, and they do some crazy
shifting around with the keys but I think
00:41:01.700 --> 00:41:07.400
even just today, like an hour before the
talk, a guy, sorry I'm not sure it's a
00:41:07.400 --> 00:41:12.650
guy, a person on our Discord actually
managed to rebuild the full encryption.
00:41:12.650 --> 00:41:16.779
But I personally was never
interested in that because after you've
00:41:16.779 --> 00:41:22.080
downgraded the device to RDP 0, you can
just access the memory mapped flash and
00:41:22.080 --> 00:41:24.740
get the completely decrypted flash
contents basically.
00:41:24.740 --> 00:41:32.009
Herald: Sure. Thanks. And a last question
about the LCD-Controller, whether it's
00:41:32.009 --> 00:41:38.180
used by writing pixels over SPI or if it
has some extra features, maybe even
00:41:38.180 --> 00:41:40.930
background or sprites or something like
that?
00:41:40.930 --> 00:41:46.809
Thomas: So the LCD itself doesn't have
any special features. It has one SPI bus
00:41:46.809 --> 00:41:50.930
to configure it and then a parallel
interface where - so it takes up a lot
00:41:50.930 --> 00:41:56.809
of pins. But the chip itself has a
hardware peripheral called LTDC, which is an LCD
00:41:56.809 --> 00:42:00.769
controller, which provides two layers with
alpha blending and some basic windowing
00:42:00.769 --> 00:42:06.630
and so on.
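As a rough idea of what blending two layers means, here is a generic constant-alpha blend of one foreground pixel over a background pixel. The formula is the textbook one and the colours are arbitrary; the LTDC's exact blending factors and pixel formats are not reproduced here.

```python
def blend_channel(fg: int, bg: int, alpha: int) -> int:
    """Blend one 8-bit colour channel; alpha=255 shows only the foreground."""
    return (fg * alpha + bg * (255 - alpha)) // 255

def blend_pixel(fg, bg, alpha):
    """Blend two (R, G, B) tuples with a constant alpha between 0 and 255."""
    return tuple(blend_channel(f, b, alpha) for f, b in zip(fg, bg))

background = (0, 0, 255)     # blue game layer
overlay = (255, 255, 255)    # white menu layer

print(blend_pixel(overlay, background, 128))
```

Doing this in hardware means, for example, that a menu can be shown over a running game without the CPU touching the game's frame buffer.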
Herald: OK, cool then thank you very, very
00:42:06.630 --> 00:42:11.799
much for the great talk and the great
intro. And with that, back to our main
00:42:11.799 --> 00:42:14.859
studio in the orbit. Thank you very much.
Back to orbit.
00:42:14.859 --> 00:42:17.977
rC3 postroll music
00:42:17.977 --> 00:42:56.000
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!