1
00:00:00,320 --> 00:00:09,500
32C3 preroll music
2
00:00:09,500 --> 00:00:16,240
Herald: Okay, welcome to our
last talk in this hall today!
3
00:00:16,240 --> 00:00:20,420
It’s about Console Hacking and I guess
that’s the reason why you are here.
4
00:00:20,420 --> 00:00:23,509
Console hacking has a long
tradition at our great conference
5
00:00:23,509 --> 00:00:30,109
and we have seen lots of funny things.
People doing stuff with Xboxes,
6
00:00:30,109 --> 00:00:33,900
Playstations and everything.
7
00:00:33,900 --> 00:00:39,010
Okay. Today we got a team which
deals with the Nintendo 3DS,
8
00:00:39,010 --> 00:00:44,260
so give a warm applause
for plutoo, derrek and smea!
9
00:00:44,260 --> 00:00:53,770
applause
10
00:00:53,770 --> 00:00:58,910
smea: Hi! I’m smea,
this is plutoo, this is derrek,
11
00:00:58,910 --> 00:01:02,930
and today we are going to talk to you
about our work on the Nintendo 3DS.
12
00:01:02,930 --> 00:01:05,390
So, the way this talk is going to be
structured, is we are just going to
13
00:01:05,390 --> 00:01:08,850
go over all the hardware,
organisation, software, like…
14
00:01:08,850 --> 00:01:12,240
Just give you a basic overview
about how the system works.
15
00:01:12,240 --> 00:01:15,040
And after that we are going to go into
16
00:01:15,040 --> 00:01:18,330
basically every layer of
security the system has,
17
00:01:18,330 --> 00:01:21,269
and break every one of them.
18
00:01:21,269 --> 00:01:23,219
laughter
19
00:01:23,219 --> 00:01:27,550
applause
20
00:01:27,550 --> 00:01:31,860
Okay. So, as you probably know,
the 3DS, the original Nintendo 3DS
21
00:01:31,860 --> 00:01:36,500
was released in 2011. It’s a system
that is kind of underpowered.
22
00:01:36,500 --> 00:01:41,479
It’s got, like… It’s got an
ARM11 dual core CPU,
23
00:01:41,479 --> 00:01:46,399
268 MHz, it’s got a nice
proprietary GPU, a bit of RAM,
24
00:01:46,399 --> 00:01:49,920
you know, the usual. It’s also backwards
compatible with the DS games,
25
00:01:49,920 --> 00:01:55,299
which is nice. Then the New 3DS
was released in 2014 and 2015,
26
00:01:55,299 --> 00:02:01,060
there was like different regions. And it
was basically just the same console,
27
00:02:01,060 --> 00:02:04,239
just some improvements in the
hardware. You’ve got a better CPU,
28
00:02:04,239 --> 00:02:09,410
it has got more cores. It’s faster, it has
got more RAM. Basically everywhere.
29
00:02:09,410 --> 00:02:12,240
So, it is just the same thing,
it runs the same software, exactly.
30
00:02:12,240 --> 00:02:15,800
It has got some exclusive
software, but not much.
31
00:02:15,800 --> 00:02:19,460
So, in terms of a hardware overview, this
is what what we are going to talk about
32
00:02:19,460 --> 00:02:24,050
looks like; in general. So you
got the top part right here,
33
00:02:24,050 --> 00:02:27,490
which is what we are
going to go into first.
34
00:02:27,490 --> 00:02:31,470
This is like the ARM11 part.
35
00:02:31,470 --> 00:02:35,110
Basically, you’ve got the ARM11,
which is the main CPU. It runs
36
00:02:35,110 --> 00:02:40,740
the main operating system. It has
2 cores as I just said, or 4 cores.
37
00:02:40,740 --> 00:02:42,790
So, it runs the main operating
system, it runs the games,
38
00:02:42,790 --> 00:02:45,340
it runs all the applications.
Basically, it’s just –
39
00:02:45,340 --> 00:02:48,380
if you’re doing something on the 3DS
that you can… you can see it happening,
40
00:02:48,380 --> 00:02:52,220
it’s happening on that CPU. It has got
access to all of the main memory.
41
00:02:52,220 --> 00:02:56,090
So that includes FCRAM,
42
00:02:56,090 --> 00:03:01,040
which is 128MB or 256MB,
43
00:03:01,040 --> 00:03:04,730
depending on which model it is.
And FCRAM is actually divided
44
00:03:04,730 --> 00:03:09,130
into 3 separate regions. So you
first got the Application Region,
45
00:03:09,130 --> 00:03:12,520
which contains the currently
running game or application.
46
00:03:12,520 --> 00:03:17,200
The System Region, which contains applets,
which are basically tiny applications,
47
00:03:17,200 --> 00:03:20,050
which run in the background.
So, that includes the home menu,
48
00:03:20,050 --> 00:03:23,390
which is actually always running
in background, and the web browser,
49
00:03:23,390 --> 00:03:25,890
which you can actually run at
the same time as your game, so
50
00:03:25,890 --> 00:03:28,860
it has to run there. And then you got the
Base Region, which is more interesting.
51
00:03:28,860 --> 00:03:31,050
It contains all the system modules
of the operating system,
52
00:03:31,050 --> 00:03:35,260
as well as some kernel data,
such as handle tables
53
00:03:35,260 --> 00:03:39,840
and MMU tables. So it is kind of sensitive
stuff. And then we got a WRAM,
54
00:03:39,840 --> 00:03:44,330
which is tiny and contains all
the kernel code, and, well,
55
00:03:44,330 --> 00:03:49,550
most of the kernel structures as well.
So it’s also an interesting target.
56
00:03:49,550 --> 00:03:55,160
Then we’ve got the lower part, which
is the ARM9 part of the hardware.
57
00:03:55,160 --> 00:03:58,270
So the ARM9 is basically a separate, well…
58
00:03:58,270 --> 00:04:02,790
it’s an entirely separate CPU,
which has access to…
59
00:04:02,790 --> 00:04:06,760
well… So it runs basically the
same microkernel as the ARM11.
60
00:04:06,760 --> 00:04:11,600
It’s mostly the same code,
it has just got fewer features.
61
00:04:11,600 --> 00:04:14,630
Mostly it runs a single process,
which is called ‘Process9’,
62
00:04:14,630 --> 00:04:19,399
which does everything the ARM9 does.
Beyond that the role of the ARM9 is
63
00:04:19,399 --> 00:04:24,260
to broker access to hardware that
might be sensitive in terms of security.
64
00:04:24,260 --> 00:04:29,320
So one of the things it does is it
brokers access to all storage media,
65
00:04:29,320 --> 00:04:33,590
so that includes the permanent
storage as well as the SD card.
66
00:04:33,590 --> 00:04:38,450
And then it does all sorts of crypto
stuff, which is really important,
67
00:04:38,450 --> 00:04:43,930
and does that by using hardware, actually.
So there is this hardware key scrambler,
68
00:04:43,930 --> 00:04:48,260
which is used to… to store
secrets in hardware basically.
69
00:04:48,260 --> 00:04:51,100
The idea is, you feed
it two separate keys,
70
00:04:51,100 --> 00:04:54,980
and it is going to generate a
normal key and feed that directly
71
00:04:54,980 --> 00:04:59,260
into the hardware implementation
of the AES algorithm.
72
00:04:59,260 --> 00:05:02,340
So that way, we never
actually see the final keys.
73
00:05:02,340 --> 00:05:06,430
So that’s something that
is kind of annoying.
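A minimal sketch of the key-scrambler idea described above, written as a toy Python model. Everything here is an assumption made for illustration: the class name `KeySlot`, its methods, and especially the derivation step, which uses SHA-256 purely as a placeholder because the talk does not describe the real scrambling function. HMAC stands in for the hardware AES engine. The only point is the shape of the interface: two keys go in, results come out, and the derived key is never handed back to software.

```python
import hashlib
import hmac


class KeySlot:
    """Toy model of one hardware key slot (hypothetical, not the real engine)."""

    def __init__(self):
        self._key_x = None       # write-only input
        self._key_y = None       # write-only input
        self._normal_key = None  # derived inside the "hardware", no getter exists

    def set_key_x(self, key_x: bytes) -> None:
        self._key_x = key_x
        self._derive()

    def set_key_y(self, key_y: bytes) -> None:
        self._key_y = key_y
        self._derive()

    def _derive(self) -> None:
        # Placeholder scrambler: the real function is not given in the talk.
        if self._key_x is not None and self._key_y is not None:
            digest = hashlib.sha256(self._key_x + self._key_y).digest()
            self._normal_key = digest[:16]

    def mac(self, data: bytes) -> bytes:
        # Stand-in for the crypto engine: callers get a result, never the key.
        if self._normal_key is None:
            raise RuntimeError("key slot not initialised")
        return hmac.new(self._normal_key, data, hashlib.sha256).digest()


slot = KeySlot()
slot.set_key_x(b"X" * 16)   # made-up key material
slot.set_key_y(b"Y" * 16)   # made-up key material
tag = slot.mac(b"savegame")
```

In this model, code that can call `mac` can use the key without ever learning it, which is why taking over the caller alone does not reveal the final keys.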
74
00:05:06,430 --> 00:05:10,100
And then beyond that what you can see is:
the ARM9 has access to all of main memory
75
00:05:10,100 --> 00:05:13,890
without much of, well, without any
restrictions. But it has also got
76
00:05:13,890 --> 00:05:17,790
its own internal memory which the
ARM11 does not have access to.
77
00:05:17,790 --> 00:05:21,350
So the ARM9 internal memory is
where the ARM9 stores all its code,
78
00:05:21,350 --> 00:05:26,600
all of its data; and this way we
can’t actually take over the ARM9
79
00:05:26,600 --> 00:05:33,340
just from the ARM11 without some kind of
exploit. So it’s basically a security CPU.
80
00:05:33,340 --> 00:05:36,730
So this leads us to having
4 layers of security.
81
00:05:36,730 --> 00:05:39,940
Basically, you’re first going to have
the ARM11 userland, which is what…
82
00:05:39,940 --> 00:05:43,550
well, like your games, your applications,
whatever. On top of that,
83
00:05:43,550 --> 00:05:48,630
you’re going to have, well, below
that, I guess, the ARM11 kernel.
84
00:05:48,630 --> 00:05:51,810
So that is going to have
full privileges on the ARM11.
85
00:05:51,810 --> 00:05:55,300
And then you’re going to have
ARM9 userland, which is ‘Process9’.
86
00:05:55,300 --> 00:05:59,560
Beyond that you’ll have ARM9
kernel mode. So that’s in theory.
87
00:05:59,560 --> 00:06:04,380
In practice, the microkernel
has a system call,
88
00:06:04,380 --> 00:06:09,280
which we call… syscall…
we call it ‘svc backdoor’.
89
00:06:09,280 --> 00:06:13,510
Because essentially you feed it a
function pointer and it just executes
90
00:06:13,510 --> 00:06:16,970
that function in kernel mode.
So you don’t even need an exploit
91
00:06:16,970 --> 00:06:20,889
if you have access to that syscall.
Of course, on the ARM11
92
00:06:20,889 --> 00:06:25,300
no application or title or anything
ever has access to that,
93
00:06:25,300 --> 00:06:29,560
but on the ARM9 ‘Process9’ actually
has access to it. Which means,
94
00:06:29,560 --> 00:06:34,050
that from here we actually…
well, userland and kernel mode
95
00:06:34,050 --> 00:06:37,770
are basically the same thing.
When you got userland on the ARM9,
96
00:06:37,770 --> 00:06:41,020
you got kernel mode.
So that’s nice.
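The 'svc backdoor' behaviour can be sketched as a toy model. This is not the real kernel interface: `ToyKernel`, `Process`, and the string used to name the system call are all made up for illustration, and the access check is reduced to a simple set lookup. It only shows why a process that is allowed to use this call is effectively already in kernel mode.

```python
from typing import Callable, Set

SVC_BACKDOOR = "backdoor"  # illustrative name; the real syscall number is not given


class Process:
    def __init__(self, name: str, allowed_svcs: Set[str]):
        self.name = name
        self.allowed_svcs = allowed_svcs


class ToyKernel:
    def __init__(self):
        self.mode = "user"

    def svc(self, caller: Process, svc_name: str,
            func: Callable[["ToyKernel"], object]):
        # Per-process system call access control.
        if svc_name not in caller.allowed_svcs:
            raise PermissionError(f"{caller.name} may not call {svc_name}")
        # The backdoor itself: run the caller-supplied function as the kernel.
        self.mode = "kernel"
        try:
            return func(self)
        finally:
            self.mode = "user"


kernel = ToyKernel()
game = Process("game", allowed_svcs=set())                    # ARM11 application
process9 = Process("Process9", allowed_svcs={SVC_BACKDOOR})   # ARM9 process

# Process9 hands over a function and it runs with kernel privileges.
observed_mode = kernel.svc(process9, SVC_BACKDOOR, lambda k: k.mode)

# An ordinary application is rejected before anything runs.
try:
    kernel.svc(game, SVC_BACKDOOR, lambda k: k.mode)
    game_blocked = False
except PermissionError:
    game_blocked = True
```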
97
00:06:41,020 --> 00:06:44,950
Beyond that, in terms of
cryptography on the system,
98
00:06:44,950 --> 00:06:49,030
basically, they went all out. So, anything
that can be signed, is signed.
99
00:06:49,030 --> 00:06:51,570
So, that includes the firmware,
that includes every application.
100
00:06:51,570 --> 00:06:55,480
Signatures are checked not only
at install time but also at runtime,
101
00:06:55,480 --> 00:06:58,750
so that’s something to keep in mind.
102
00:06:58,750 --> 00:07:02,889
Same thing: anything that can
be encrypted is encrypted.
103
00:07:02,889 --> 00:07:07,650
And anything that can be made, well,
console-specific through cryptography
104
00:07:07,650 --> 00:07:13,270
or authentication, such as
internal permanent storage
105
00:07:13,270 --> 00:07:17,510
or the data that is stored on
the SD card, or savegames,
106
00:07:17,510 --> 00:07:22,740
or extra data for games, this
is all made console-specific.
107
00:07:22,740 --> 00:07:26,510
And gamecard-specific in
regards of savegames.
108
00:07:26,510 --> 00:07:31,470
So, that’s kind of annoying as well. And,
of course, all this is handled by the ARM9
109
00:07:31,470 --> 00:07:35,590
using the hardware… the crypto
hardware, so we got to get through that
110
00:07:35,590 --> 00:07:38,190
if we want to do interesting things.
111
00:07:38,190 --> 00:07:43,860
So, first we are going to go through the
first layer, which is the ARM11 userland.
112
00:07:43,860 --> 00:07:47,320
Basically, getting a full
hold onto the system.
113
00:07:47,320 --> 00:07:51,370
So, we first need to find
some kind of entry point.
114
00:07:51,370 --> 00:07:55,780
There are problems… well,
there are challenges there.
115
00:07:55,780 --> 00:07:59,760
One of the challenges is
that the system implements
116
00:07:59,760 --> 00:08:05,080
strict Data Execution Prevention. So,
existing pages will never be read…
117
00:08:05,080 --> 00:08:09,290
well, will never be read-write-executable.
It’s all only going to be read-only,
118
00:08:09,290 --> 00:08:13,480
or read-writable or read-executable.
There’s no way from a standard application
119
00:08:13,480 --> 00:08:18,079
to reprotect or map new pages
that are read-write-executable.
120
00:08:18,079 --> 00:08:22,180
Because all of the system
calls are locked out, except for
121
00:08:22,180 --> 00:08:26,400
higher privileged system
modules. Another thing is
122
00:08:26,400 --> 00:08:29,840
that there is no ASLR, so that is not
a challenge, that’s actually kind of nice.
123
00:08:29,840 --> 00:08:34,020
The nice thing here is that we… well,
that makes savegame vulnerabilities
124
00:08:34,020 --> 00:08:37,010
totally fair game because, well, we don’t
need an actual scripting environment
125
00:08:37,010 --> 00:08:40,640
or any kind of exotic
vulnerability to exploit this.
126
00:08:40,640 --> 00:08:44,930
As long as we can get past
DEP somehow. And then,
127
00:08:44,930 --> 00:08:48,990
of course, the fact that all
savegames are both encrypted
128
00:08:48,990 --> 00:08:52,960
and made specific either to the
gamecard or the game console,
129
00:08:52,960 --> 00:08:57,630
in the case of eShop games, is really
annoying for savegame vulnerabilities
130
00:08:57,630 --> 00:09:01,450
because basically you can’t use those
as an initial entry point in most cases,
131
00:09:01,450 --> 00:09:05,460
because, well, you can’t generate
the right, well, AES MAC,
132
00:09:05,460 --> 00:09:12,160
or just… you don’t know the right
cryptography. So, that’s annoying.
133
00:09:12,160 --> 00:09:15,300
Thankfully, the 3DS runs WebKit…
134
00:09:15,300 --> 00:09:18,470
laughter
135
00:09:18,470 --> 00:09:21,780
So, that’s nice.
Can always use that.
136
00:09:21,780 --> 00:09:26,400
applause
137
00:09:26,400 --> 00:09:29,690
So, WebKit is used in a number of places,
obviously it’s used in the main web browser,
138
00:09:29,690 --> 00:09:32,810
which you can access from the home menu.
It’s also used in the YouTube application,
139
00:09:32,810 --> 00:09:37,210
which is available free on the eShop
and doesn’t use any kind of
140
00:09:37,210 --> 00:09:41,180
client side authentication for the server,
so you can just redirect traffic through,
141
00:09:41,180 --> 00:09:46,589
like a DNS server for example. Miiverse
applet, other stuff, that also uses it.
142
00:09:46,589 --> 00:09:50,870
Slightly more secure, but might be
usable at some point, I don’t know.
143
00:09:50,870 --> 00:09:54,900
Anywho, the important part here,
is that it’s not only using WebKit,
144
00:09:54,900 --> 00:09:59,310
it is using a very old version of WebKit.
Basically, they do cherrypick
145
00:09:59,310 --> 00:10:03,290
some patches into the version
of WebKit they use, but only
146
00:10:03,290 --> 00:10:10,040
after we exploit those on release, so it
comes a little too late, most of the time.
147
00:10:10,040 --> 00:10:15,690
So yeah, this has been used by multiple
people, most notably yellows8,
148
00:10:15,690 --> 00:10:21,580
but it has proven to be a very
efficient, reliable entry point.
149
00:10:21,580 --> 00:10:25,690
Beyond that, we got Cubic Ninja as initial
entry point. Cubic Ninja is a game
150
00:10:25,690 --> 00:10:30,020
that was released in 2011 on Nintendo
3DS. It is nice, because it actually
151
00:10:30,020 --> 00:10:34,350
allows users to share levels
that they make themselves
152
00:10:34,350 --> 00:10:40,850
through QR codes; and also it is
really bad at parsing those levels.
153
00:10:40,850 --> 00:10:44,910
So what you can do, is just, well,
manufacture your own QR code
154
00:10:44,910 --> 00:10:47,740
that is going to crash the game
and give you access. So these are
155
00:10:47,740 --> 00:10:52,529
nice initial entry points. So, once we’ve
got this, what we have to remember is
156
00:10:52,529 --> 00:10:56,020
that we might be able to crash the game
and may be able to control registers,
157
00:10:56,020 --> 00:11:00,550
but we don’t actually have our code
running because of that. So,
158
00:11:00,550 --> 00:11:04,200
the obvious solution to
hit this, is to use ROP.
159
00:11:04,200 --> 00:11:07,770
For those of you, who are
not familiar with ROP:
160
00:11:07,770 --> 00:11:11,730
You build your own fake stack
that lets you return into
161
00:11:11,730 --> 00:11:15,899
code snippets that are located right
before return instructions. That way…
162
00:11:15,899 --> 00:11:20,750
so this is an example. You can just
163
00:11:20,750 --> 00:11:24,779
jump to this kind of instruction,
so ‘pop {r0, pc}’ and then
164
00:11:24,779 --> 00:11:29,220
this is going to let you load your own
register value and then it is going to
165
00:11:29,220 --> 00:11:33,870
jump to the next instruction that you give
it. So, this is a way of executing code
166
00:11:33,870 --> 00:11:37,580
without actually executing code,
which is widely used; so this is like
167
00:11:37,580 --> 00:11:42,080
the obvious thing to do. Of course,
ROP is annoying. It is very limiting.
168
00:11:42,080 --> 00:11:47,560
It can be enough to actually execute
an exploit to get higher privileges,
169
00:11:47,560 --> 00:11:53,149
but overall it is just annoying and very
limiting for homebrew, for example.
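The ROP idea can be sketched with a toy interpreter. The addresses, the `record_r0` function, and the dictionary-based CPU are all hypothetical. The only real element is the `pop {r0, pc}` gadget from the talk, modelled as "take the next stack word into r0, take the one after that as the next address to jump to". Nothing new is executed; the fake stack just decides which existing snippets run and with which register values.

```python
HALT = 0  # sentinel address that ends the chain in this toy model


def pop_r0_pc(cpu, stack):
    # Models `pop {r0, pc}`: load r0 from the stack, then "return" to the
    # next address found on the stack.
    cpu["r0"] = stack.pop(0)
    cpu["pc"] = stack.pop(0)


def record_r0(cpu, stack):
    # Hypothetical existing function that takes its argument in r0 and
    # ends with a return, which again pulls the next address off the stack.
    cpu["log"].append(cpu["r0"])
    cpu["pc"] = stack.pop(0)


# Code that already exists in the target, keyed by made-up addresses.
GADGETS = {0x1000: pop_r0_pc, 0x2000: record_r0}


def run_chain(stack):
    cpu = {"r0": 0, "pc": stack.pop(0), "log": []}
    while cpu["pc"] != HALT:
        GADGETS[cpu["pc"]](cpu, stack)
    return cpu


# The attacker only controls data: a fake stack of addresses and values.
fake_stack = [
    0x1000, 0x41, 0x2000,   # r0 = 0x41, then call record_r0
    0x1000, 0x42, 0x2000,   # r0 = 0x42, then call record_r0
    HALT,
]
cpu = run_chain(list(fake_stack))
```

Even this tiny chain shows why ROP is limiting: every step has to be built out of snippets that happen to exist in the target already.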
170
00:11:53,149 --> 00:11:56,000
And of course, as I mentioned earlier, we
don’t have access to any of the system calls
171
00:11:56,000 --> 00:12:01,010
that would let us map
read-writable-executable pages.
172
00:12:01,010 --> 00:12:04,850
Also, the system does support dynamically
linked libraries, so that might be a way,
173
00:12:04,850 --> 00:12:09,560
but these are signed and checked in
places that we can’t access at this point.
174
00:12:09,560 --> 00:12:13,959
So, what we’re going to look
at next is the GPU to see
175
00:12:13,959 --> 00:12:19,070
if we can use that to bypass that.
What you can see here is that
176
00:12:19,070 --> 00:12:23,220
the GPU has access not only to
video RAM, but also to FCRAM,
177
00:12:23,220 --> 00:12:26,420
which is, if you recall it, main
memory. So, if you look at this,
178
00:12:26,420 --> 00:12:30,540
with all the different memory regions,
179
00:12:30,540 --> 00:12:33,480
we have got the Application Region
here, which is entirely contained within
180
00:12:33,480 --> 00:12:38,700
what the GPU can access within FCRAM.
Of course, the GPU can not actually access
181
00:12:38,700 --> 00:12:42,790
all of that FCRAM, so that is kind
of limiting. What we can see here,
182
00:12:42,790 --> 00:12:49,279
is that, of course, application code is
within range of the GPU’s level of access.
183
00:12:49,279 --> 00:12:53,250
The reason the GPU has access to
FCRAM and Video RAM, through DMA,
184
00:12:53,250 --> 00:12:58,209
by the way, is, so that it can access
information such as textures,
185
00:12:58,209 --> 00:13:01,030
vertex buffers, this sort of thing.
186
00:13:01,030 --> 00:13:04,240
So, it’s actually kind of important. And
the reason it can write to it is because
187
00:13:04,240 --> 00:13:08,730
it has to render its data somewhere.
The point is, that we can use this
188
00:13:08,730 --> 00:13:12,050
to render data into main memory.
189
00:13:12,050 --> 00:13:16,490
And main memory contains application
code. And since the physical layout is
190
00:13:16,490 --> 00:13:20,200
actually completely deterministic, and
even if it wasn’t, we could just use the
191
00:13:20,200 --> 00:13:23,580
read capabilities of the GPU to
search for what we are looking for.
192
00:13:23,580 --> 00:13:27,970
Well, we can use this to overwrite our
current application’s text section
193
00:13:27,970 --> 00:13:32,610
and we get code execution
that way, in spite of DEP.
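A toy model of why GPU DMA gets around DEP, under stated assumptions: `ToyMemory`, the addresses, and the cutoff value are invented, and page permissions are reduced to a per-byte flag. The point it illustrates is that the CPU's write goes through permission checks, while the DMA path works on physical memory and is only limited by how far the GPU can reach.

```python
class ToyMemory:
    """Hypothetical physical memory with CPU permissions and a GPU reach limit."""

    def __init__(self, size: int, gpu_cutoff: int):
        self.data = bytearray(size)
        self.cpu_writable = [True] * size  # stand-in for MMU page permissions
        self.gpu_cutoff = gpu_cutoff       # GPU reaches addresses below this

    def cpu_write(self, addr: int, payload: bytes) -> None:
        if not all(self.cpu_writable[addr:addr + len(payload)]):
            raise PermissionError("DEP: not writable from the CPU")
        self.data[addr:addr + len(payload)] = payload

    def gpu_dma_write(self, addr: int, payload: bytes) -> None:
        # DMA ignores the CPU's page permissions entirely.
        if addr + len(payload) > self.gpu_cutoff:
            raise PermissionError("outside the GPU's reachable range")
        self.data[addr:addr + len(payload)] = payload

    def gpu_dma_find(self, needle: bytes) -> int:
        # Read access lets us search for the code if the layout were unknown.
        return self.data.find(needle, 0, self.gpu_cutoff)


mem = ToyMemory(size=0x100, gpu_cutoff=0xC0)
TEXT = 0x40                                  # made-up text section address
mem.data[TEXT:TEXT + 4] = b"CODE"
for i in range(TEXT, TEXT + 4):
    mem.cpu_writable[i] = False              # text is read-execute only

try:
    mem.cpu_write(TEXT, b"HAXX")
    cpu_blocked = False
except PermissionError:
    cpu_blocked = True

target = mem.gpu_dma_find(b"CODE")
mem.gpu_dma_write(target, b"HAXX")           # text section is now ours
```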
194
00:13:32,610 --> 00:13:34,440
Yeah, so this is where
we get code execution…
195
00:13:34,440 --> 00:13:35,280
applause
196
00:13:35,280 --> 00:13:37,779
We execute our own,
unsigned code, which is very…
197
00:13:37,779 --> 00:13:39,830
applause
198
00:13:39,830 --> 00:13:44,520
It’s great, but we are still confined
within the application sandbox.
199
00:13:44,520 --> 00:13:47,450
So, we bypassed DEP,
we are inside the sandbox.
200
00:13:47,450 --> 00:13:53,140
This means we can only access
our current application’s savedata,
201
00:13:53,140 --> 00:13:58,120
so if we want to install some kind of
secondary exploit, this is too limiting.
202
00:13:58,120 --> 00:14:02,190
We can only access certain services and
system calls, which is also limiting
203
00:14:02,190 --> 00:14:06,200
and frustrating. And we can’t alter
memory layout, so we can’t allocate
204
00:14:06,200 --> 00:14:08,769
more executable pages
as I mentioned earlier.
205
00:14:08,769 --> 00:14:10,779
So, we are still kind
of limited at this point.
206
00:14:10,779 --> 00:14:14,680
So, what we are going to do, is look at
what else the GPU can access.
207
00:14:14,680 --> 00:14:18,630
And what you can see is that, of course, there
is this entirely separate memory region
208
00:14:18,630 --> 00:14:21,780
the GPU can modify.
209
00:14:21,780 --> 00:14:24,860
So it can access most of the System
Region. And the System Region contains
210
00:14:24,860 --> 00:14:27,510
a few things. It contains the home menu, as
I mentioned, because that is an applet.
211
00:14:27,510 --> 00:14:31,500
It contains the internet browser, and it
contains actually a single system module,
212
00:14:31,500 --> 00:14:38,020
which is called ‘NS’, which we think stands
for ‘Nintendo Shell’, we don’t really know.
213
00:14:38,020 --> 00:14:42,810
So, let’s look at this. First we got
NS code well beyond the GPU cutoff.
214
00:14:42,810 --> 00:14:46,110
We got menu code, which is
also well beyond GPU cutoff.
215
00:14:46,110 --> 00:14:51,310
But we got the menu’s heap, right here,
well, actually there is separate heaps,
216
00:14:51,310 --> 00:14:55,089
these are well within the
GPU’s range, so that’s good.
217
00:14:55,089 --> 00:14:59,830
NS unfortunately is still well beyond the
cutoff. All of its data, all of its code.
218
00:14:59,830 --> 00:15:03,059
So we apparently can’t get to that.
219
00:15:03,059 --> 00:15:07,830
So, then the idea is, to just,
well, okay, so actually…
220
00:15:07,830 --> 00:15:11,029
What’s interesting here, is that
the cutoff is right before the end of
221
00:15:11,029 --> 00:15:14,200
the System Region, which as we just
saw, has some interesting things, but
222
00:15:14,200 --> 00:15:18,680
also excludes all of Base Region,
which also has very interesting things.
223
00:15:18,680 --> 00:15:23,670
So, it seems likely that Nintendo knew
about the capabilities of GPU DMA,
224
00:15:23,670 --> 00:15:27,480
like the theoretical capabilities, but
they didn’t do anything about it.
225
00:15:27,480 --> 00:15:30,899
So, it seems that they probably didn’t
realize what we could do with it,
226
00:15:30,899 --> 00:15:33,220
which is a lot.
227
00:15:33,220 --> 00:15:37,630
So, basically, we got menu heaps. So
what we do, is… we have a heap, and
228
00:15:37,630 --> 00:15:42,399
this is all C++ code. We are just
going to find objects inside the heap
229
00:15:42,399 --> 00:15:46,790
and overwrite it. So it’s pretty simple.
Just find an object, that is going to be
230
00:15:46,790 --> 00:15:50,300
triggered to some kind of synchronisation
mechanism. In this case, it’s gonna be
231
00:15:50,300 --> 00:15:55,010
just ‘Return to Menu’. And we
create some kind of fake vtable
232
00:15:55,010 --> 00:15:59,560
and get it to run our own
stack pivot. And then we get…
233
00:15:59,560 --> 00:16:03,300
we get ROP execution under
Home menu, which is cool.
234
00:16:03,300 --> 00:16:07,060
We still don’t have code execution
in the Home menu, but that’s okay.
235
00:16:07,060 --> 00:16:10,630
So, we can do a bunch of stuff from ROP.
236
00:16:10,630 --> 00:16:16,180
We can access a new system
service, which is called ‘ns:s’,
237
00:16:16,180 --> 00:16:19,890
which is very helpful, because it can
kill any arbitrary process, as well as
238
00:16:19,890 --> 00:16:24,930
create new ones. Also it gives us access
to SD card, which most applications
239
00:16:24,930 --> 00:16:29,690
actually don’t have. And it lets us
decrypt/dump any title on the system.
240
00:16:29,690 --> 00:16:34,300
So any game, even if it uses new
cryptography that Nintendo introduced,
241
00:16:34,300 --> 00:16:38,230
we can actually dump that, because
for some reason, well, Home menu
242
00:16:38,230 --> 00:16:41,890
apparently needs access to
that. And then we can also
243
00:16:41,890 --> 00:16:47,490
access and overwrite all that extra data
used by any application, which is great.
244
00:16:47,490 --> 00:16:50,380
So we use this as a base
for running homebrew.
245
00:16:50,380 --> 00:16:54,920
Our homebrew launcher is
essentially just a service
246
00:16:54,920 --> 00:16:58,810
that runs in the background under Home
menu process. It is written in ROP,
247
00:16:58,810 --> 00:17:02,370
which is kind of disgusting, but it works.
laughter
248
00:17:02,370 --> 00:17:05,999
The ‘Service’ handles running homebrew,
so the process is very simple. You just
249
00:17:05,999 --> 00:17:09,358
kill off the current application, you
spawn a new one, and then you take it over
250
00:17:09,358 --> 00:17:15,019
using the GPU DMA access.
And then, what we do is
251
00:17:15,019 --> 00:17:19,489
we send all of these new capabilities that
we got through handles to the new process
252
00:17:19,489 --> 00:17:23,558
and that gives us some
higher privilege homebrew.
253
00:17:23,558 --> 00:17:30,190
It also handles events, such as Home
button, Power button, all that good stuff.
254
00:17:30,190 --> 00:17:33,749
Which is nice, because we can actually
run code under any arbitrary application
255
00:17:33,749 --> 00:17:37,929
or game, so we can actually modify
these games. We can run ROM hacks.
256
00:17:37,929 --> 00:17:41,179
So there has been a bunch of translations
that can be run through this, for games
257
00:17:41,179 --> 00:17:44,469
that haven’t come out outside
of Japan, so that’s pretty nice.
258
00:17:44,469 --> 00:17:46,889
It’s the same principle, you just
launch the app, you take it over,
259
00:17:46,889 --> 00:17:50,769
you pass the code, and then
you jump to it, essentially.
260
00:17:50,769 --> 00:17:53,959
All within the confines of
userland, which is nice.
261
00:17:53,959 --> 00:17:59,600
So, the other thing is, we can actually
access any game or application’s data
262
00:17:59,600 --> 00:18:03,460
because we can run code under
it. So, these things include
263
00:18:03,460 --> 00:18:07,970
savegame data for any game. So we
can actually install more convenient
264
00:18:07,970 --> 00:18:11,980
secondary entry points, which do not
rely on the browser, which can be
265
00:18:11,980 --> 00:18:15,749
patched any moment, or on some old game.
266
00:18:15,749 --> 00:18:21,019
So, some examples include ‘Menuhax’
by yellows8, which exploits
267
00:18:21,019 --> 00:18:27,539
faulty theme handling code, which
was introduced in firmware 9.0.
268
00:18:27,539 --> 00:18:30,519
Which is really nice, because this way,
you can actually just run homebrew
269
00:18:30,519 --> 00:18:35,359
right as Home menu is opened,
so right on boot time,
270
00:18:35,359 --> 00:18:38,929
which is great. Then you got other games.
Of course you got a Zelda game
271
00:18:38,929 --> 00:18:41,619
that’s vulnerable.
audience chuckles
272
00:18:41,619 --> 00:18:44,549
This time it wasn’t the
horse’s name, but pretty similar.
273
00:18:44,549 --> 00:18:48,389
And then you got other games. We
got tons of entry points at this point.
274
00:18:48,389 --> 00:18:54,999
We’re really, literally drowning
in them. So, this is nice.
275
00:18:54,999 --> 00:18:58,749
But we forgot about ‘Nintendo Shell’,
right? It’s a very attractive target,
276
00:18:58,749 --> 00:19:03,090
for a couple of reasons. For one thing,
it has access to the ‘am:u’ service,
277
00:19:03,090 --> 00:19:05,929
which can be used to
downgrade any system title.
278
00:19:05,929 --> 00:19:09,600
It’s not actually designed to downgrade
titles, the thing is, you can both
279
00:19:09,600 --> 00:19:13,200
install and uninstall titles.
So, what happens is,
280
00:19:13,200 --> 00:19:16,639
if you uninstall a title, and
then install an older version
281
00:19:16,639 --> 00:19:19,210
of that title, you actually
bypass the version check.
282
00:19:19,210 --> 00:19:22,210
So, you can just do that to
downgrade any system title
283
00:19:22,210 --> 00:19:27,699
and bring back old exploits,
if that is necessary.
284
00:19:27,699 --> 00:19:30,320
Assuming you have
access to the service.
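The downgrade trick can be sketched as a small model. `ToyTitleManager`, the title name, and the version numbers are hypothetical and say nothing about the real service's commands. The sketch only assumes what the talk states: the version check compares against the installed copy, so it has nothing to compare against once that copy is gone.

```python
class ToyTitleManager:
    """Hypothetical title installer with a naive version check."""

    def __init__(self):
        self.installed = {}  # title name -> installed version

    def install(self, title: str, version: int) -> None:
        current = self.installed.get(title)
        # The check only applies when a copy is already installed.
        if current is not None and version < current:
            raise ValueError("refusing to install an older version")
        self.installed[title] = version

    def uninstall(self, title: str) -> None:
        self.installed.pop(title, None)


am = ToyTitleManager()
am.install("browser", 5)

# A direct downgrade is rejected by the version check.
try:
    am.install("browser", 3)
    direct_downgrade_blocked = False
except ValueError:
    direct_downgrade_blocked = True

# Uninstall first, then install the old version: nothing left to compare to.
am.uninstall("browser")
am.install("browser", 3)
```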
285
00:19:30,320 --> 00:19:32,679
And of course it’s in a region
that we can partially modify,
286
00:19:32,679 --> 00:19:35,989
so it’s an interesting target.
287
00:19:35,989 --> 00:19:38,769
Unfortunately, we can’t actually
access its data right now.
288
00:19:38,769 --> 00:19:42,489
But maybe we can actually move
it to somewhere, where we can.
289
00:19:42,489 --> 00:19:47,830
The idea is, if you were to kill NS, and
then allocate something in its place,
290
00:19:47,830 --> 00:19:52,129
then run NS again, you can
move it below the cutoff.
291
00:19:52,129 --> 00:19:54,519
laughter
292
00:19:54,519 --> 00:20:01,809
applause
293
00:20:01,809 --> 00:20:06,369
Thanks. But unfortunately
it’s not that simple. That can’t work.
294
00:20:06,369 --> 00:20:10,790
The reason being, that we actually need
NS to be running to launch NS again.
295
00:20:10,790 --> 00:20:13,369
So that kind of sucks.
296
00:20:13,369 --> 00:20:15,820
But… well, no.
Actually we also can’t run
297
00:20:15,820 --> 00:20:17,960
a second instance of NS at the same time,
298
00:20:17,960 --> 00:20:20,369
so we can’t do that either.
299
00:20:20,369 --> 00:20:23,559
But interestingly…
Well, the 3DS has an interesting feature,
300
00:20:23,559 --> 00:20:28,200
which is called ‘Safe Mode’. Basically
it’s a second firmware, which is
301
00:20:28,200 --> 00:20:32,649
an old version of the
regular one, and that
302
00:20:32,649 --> 00:20:37,070
creates a bunch of
copies of system titles.
303
00:20:37,070 --> 00:20:41,499
Most of them, anyways. So that gives
it a different ID. So, the idea is,
304
00:20:41,499 --> 00:20:44,249
that if it has got a different ID, we
might be able to run it at the same time,
305
00:20:44,249 --> 00:20:48,129
because, well, PM might fail
to notice that. Of course it doesn’t.
306
00:20:48,129 --> 00:20:51,889
It actually does notice that. So we can’t
run the Safe Mode version of a title
307
00:20:51,889 --> 00:20:54,830
at the same time as the regular
version of the title. But,
308
00:20:54,830 --> 00:20:59,960
for some reason, in the case of NS – you
might not be able to see this very well,
309
00:20:59,960 --> 00:21:04,669
but we’ve got NS’s regular title right
here, and then we got Safe Mode NS
310
00:21:04,669 --> 00:21:07,100
right here. And for some reason
they created a New 3DS version
311
00:21:07,100 --> 00:21:12,070
of the Safe Mode version of NS,
though there is no New 3DS version
312
00:21:12,070 --> 00:21:16,440
of the original NS. So that
creates a separate title ID
313
00:21:16,440 --> 00:21:20,340
which we can run at the same time
as regular NS. So then, the exploit
314
00:21:20,340 --> 00:21:25,059
becomes very simple. You keep NS running,
just allocate enough data, that it will be
315
00:21:25,059 --> 00:21:29,440
below the cutoff; and then you
just run New 3DS Safe Mode NS.
316
00:21:29,440 --> 00:21:33,239
And then it’s within range of the GPU
and you can take it over and have
317
00:21:33,239 --> 00:21:36,979
access to everything. So, this is nice.
318
00:21:36,979 --> 00:21:43,509
It’s more of an oversight than
a proper exploit, but whatever.
319
00:21:43,509 --> 00:21:46,399
So this gives us access to a
bunch of system calls. Mostly
320
00:21:46,399 --> 00:21:50,909
service handling system calls,
so we can post our own service,
321
00:21:50,909 --> 00:21:54,639
which can be useful for other
exploits that I won’t get into, for
322
00:21:54,639 --> 00:21:59,190
impersonating other services
to other system modules.
323
00:21:59,190 --> 00:22:02,570
And then we got access to all of
these services, which is great.
324
00:22:02,570 --> 00:22:06,559
So we can downgrade
system titles arbitrarily.
325
00:22:06,559 --> 00:22:10,529
And this runs in background, which
can always be helpful for homebrew.
326
00:22:10,529 --> 00:22:14,210
The only problem is at this point,
it’s still New 3DS only, because
327
00:22:14,210 --> 00:22:20,519
it relies on this New 3DS title. But
there are actually ways around that.
328
00:22:20,519 --> 00:22:24,269
This was just to show that we can actually
get fairly high levels of privilege,
329
00:22:24,269 --> 00:22:28,759
even still just always staying
in userland on the ARM11.
330
00:22:28,759 --> 00:22:32,199
And there are other, similar attacks to
that. If you’re interested you can look up
331
00:22:32,199 --> 00:22:36,489
‘rohax’, which is a similar
attack in the system module.
332
00:22:36,489 --> 00:22:41,229
So, now derrek is going to talk to you
about exploiting the ARM11 kernel.
333
00:22:41,229 --> 00:22:52,279
derrek?
applause
334
00:22:52,279 --> 00:22:55,319
derrek: So, hi everyone!
335
00:22:55,319 --> 00:22:59,530
First, I will give you some
very short inside view
336
00:22:59,530 --> 00:23:05,059
of the kernel, and then I will
explain how you can exploit
337
00:23:05,059 --> 00:23:09,269
the latest version of the ARM11 kernel.
338
00:23:09,269 --> 00:23:12,190
So,
339
00:23:12,190 --> 00:23:16,199
this is actually Nintendo’s very
first gaming console kernel.
340
00:23:16,199 --> 00:23:20,679
On all of the older consoles,
341
00:23:20,679 --> 00:23:26,200
there was no kernel. All games
were just running on bare metal.
342
00:23:26,200 --> 00:23:31,499
Like there was a kernel for the Wii,
343
00:23:31,499 --> 00:23:36,209
like a very small microkernel
running on the security processor,
344
00:23:36,209 --> 00:23:41,039
but that wasn’t written by Nintendo.
345
00:23:41,039 --> 00:23:44,830
So it’s their very first
gaming console kernel.
346
00:23:44,830 --> 00:23:50,789
That kernel is made to be thread safe,
347
00:23:50,789 --> 00:23:54,830
so it can run on multiple cores
348
00:23:54,830 --> 00:23:58,679
at the same time and there are like
349
00:23:58,679 --> 00:24:02,659
130 system calls available.
350
00:24:02,659 --> 00:24:07,349
So that’s quite a lot, in my opinion.
351
00:24:07,349 --> 00:24:12,309
But usually, if you have gained execution
352
00:24:12,309 --> 00:24:16,999
in ARM11 userland, you
only have access to, like,
353
00:24:16,999 --> 00:24:22,049
around 50 system calls.
354
00:24:22,049 --> 00:24:27,019
And there’s a reason for that, but I’m
going to explain that in a second.
355
00:24:27,019 --> 00:24:34,210
So, internally, the kernel
works with C++ objects.
356
00:24:34,210 --> 00:24:38,029
So here are some examples
for system calls. So, we have
357
00:24:38,029 --> 00:24:43,539
‘CreateSemaphore’, for
example. That will just create
358
00:24:43,539 --> 00:24:47,259
a semaphore object in the kernel
359
00:24:47,259 --> 00:24:52,109
and it will return a
handle to the userland.
360
00:24:52,109 --> 00:24:55,940
And when you want to do any operations
361
00:24:55,940 --> 00:24:59,879
on that semaphore, you
have to pass that handle
362
00:24:59,879 --> 00:25:04,720
to the kernel, and it will look up
this handle in a handle table
363
00:25:04,720 --> 00:25:10,919
to find the original C++ object.
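The handle lookup described here can be sketched roughly as follows. This is a minimal Python model and not the kernel's actual code: HandleTable, KSemaphore and the way handles are numbered are illustrative assumptions.

```python
# Hypothetical model of a kernel handle table; all names are illustrative.
class KSemaphore:
    def __init__(self, initial, maximum):
        self.count = initial
        self.maximum = maximum


class HandleTable:
    def __init__(self):
        self._objects = {}      # handle -> kernel object
        self._next_handle = 1

    def create(self, obj):
        # Userland only ever receives the opaque handle, never a pointer.
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = obj
        return handle

    def lookup(self, handle):
        # Unknown handles are rejected instead of being trusted.
        if handle not in self._objects:
            raise KeyError("invalid handle")
        return self._objects[handle]


table = HandleTable()
handle = table.create(KSemaphore(initial=0, maximum=1))
semaphore = table.lookup(handle)
```

The point of the indirection is that userland never holds a raw kernel pointer, so every operation has to pass through the table lookup first.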
364
00:25:10,919 --> 00:25:15,710
Also there are 2 different
kinds of memory allocators.
365
00:25:15,710 --> 00:25:19,299
So, we have a memory allocator
for the main memory, which is
366
00:25:19,299 --> 00:25:25,039
the FCRAM. And there is also a Slab Heap,
367
00:25:25,039 --> 00:25:29,869
where all the C++ objects are stored in.
368
00:25:29,869 --> 00:25:35,239
And this Slab Heap is located in AXI WRAM,
369
00:25:35,239 --> 00:25:39,339
which is the ARM11 memory,
370
00:25:39,339 --> 00:25:43,659
where all the kernel code and data is in.
371
00:25:43,659 --> 00:25:50,450
Also, there’s an IPC system.
372
00:25:50,450 --> 00:25:53,680
IPC is ‘inter process communication’.
373
00:25:53,680 --> 00:26:05,149
And it basically allows you
to talk to other processes
374
00:26:05,149 --> 00:26:08,269
like services,
375
00:26:08,269 --> 00:26:17,270
e.g. the GSP service or FS.
376
00:26:17,270 --> 00:26:21,939
So, let’s look at the security.
377
00:26:21,939 --> 00:26:28,779
So, the kernel is really small.
There are only like 200KB of code,
378
00:26:28,779 --> 00:26:34,649
which is pure ARM code. And
there are only like 1000 functions.
379
00:26:34,649 --> 00:26:39,659
So, they try to keep
the code size very low
380
00:26:39,659 --> 00:26:46,720
and that makes it harder to find bugs.
381
00:26:46,720 --> 00:26:51,999
The code size is really small, and
382
00:26:51,999 --> 00:26:57,349
you don’t really have much to choose from
383
00:26:57,349 --> 00:27:03,690
what to exploit. Also there are no
symbols included in the kernel.
384
00:27:03,690 --> 00:27:11,629
Like when you run strings on it, it will
just give you some names of C++ objects,
385
00:27:11,629 --> 00:27:16,389
but there are no function
names or something like that.
386
00:27:16,389 --> 00:27:21,039
As we have seen earlier
it’s physically isolated
387
00:27:21,039 --> 00:27:26,599
in its own memory. Which turned out
- of course - to be a good idea.
388
00:27:26,599 --> 00:27:33,679
Otherwise it would have been
overwritable by the GPU eventually.
389
00:27:33,679 --> 00:27:38,299
And all objects are reference counted.
390
00:27:38,299 --> 00:27:43,450
So that’s similar to the
C++ shared pointer
391
00:27:43,450 --> 00:27:49,809
where every object has a small field
392
00:27:49,809 --> 00:27:54,450
like a counter field and every time
the kernel wants to use an object
393
00:27:54,450 --> 00:27:59,899
this counter gets increased.
And every time the…
394
00:27:59,899 --> 00:28:04,239
like when the reference is no longer
needed it will decrease the counter
395
00:28:04,239 --> 00:28:11,080
and when the counter reaches Zero it
will automatically delete that object
396
00:28:11,080 --> 00:28:19,010
from the Slab Heap. So they are basically
trying to prevent use-after-frees.
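A minimal sketch of the reference counting just described, assuming a simplified Slab Heap that only tracks which objects are alive. The class and method names are made up for illustration; the real kernel objects are C++ and look different.

```python
# Hypothetical model: an object is freed only when its last reference goes away.
class SlabHeap:
    def __init__(self):
        self.live = set()

    def allocate(self, obj):
        self.live.add(obj)

    def free(self, obj):
        self.live.remove(obj)


class KObject:
    def __init__(self, heap):
        self.heap = heap
        self.refcount = 0
        heap.allocate(self)

    def acquire(self):
        # Called whenever the kernel starts using the object.
        self.refcount += 1

    def release(self):
        # Called when a reference is no longer needed.
        self.refcount -= 1
        if self.refcount == 0:
            self.heap.free(self)


heap = SlabHeap()
obj = KObject(heap)
obj.acquire()
obj.acquire()
obj.release()
still_alive = obj in heap.live   # one reference is still outstanding
obj.release()
freed = obj not in heap.live     # last reference dropped, object deleted
```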
397
00:28:19,010 --> 00:28:24,009
Also I’m not sure if that’s
a security measure
398
00:28:24,009 --> 00:28:29,690
but there are more than 100
panic calls in the kernel
399
00:28:29,690 --> 00:28:35,689
and that’s every 10th function
400
00:28:35,689 --> 00:28:44,019
- on average. And they have
the syscall access restriction.
401
00:28:44,019 --> 00:28:51,909
So you - as I said - you only have
access to like 50 system calls.
402
00:28:51,909 --> 00:28:55,189
All the interesting ones are disabled.
403
00:28:55,189 --> 00:29:01,729
E.g. you can’t map executable pages.
404
00:29:01,729 --> 00:29:06,039
On the other hand there
is no ASLR. But at least
405
00:29:06,039 --> 00:29:11,809
they’re trying to change the
memory mapping every time
406
00:29:11,809 --> 00:29:17,069
during a larger kernel update.
407
00:29:17,069 --> 00:29:22,549
Also there’s no stack protection. And
the Userland is always mapped.
408
00:29:22,549 --> 00:29:29,059
So once you’ve got control
over the program counter
409
00:29:29,059 --> 00:29:33,090
you can just jump to
410
00:29:33,090 --> 00:29:36,769
Userland pages that are
marked as executable.
411
00:29:36,769 --> 00:29:40,899
So you don’t have to do ROP in the kernel.
412
00:29:40,899 --> 00:29:44,659
It’s pretty nice.
413
00:29:44,659 --> 00:29:50,599
But they tried to have
an execution prevention
414
00:29:50,599 --> 00:29:57,810
in the kernel that is: they’re
marking executable kernel pages
415
00:29:57,810 --> 00:30:01,899
– that is the code – they’re
marking them as read-only
416
00:30:01,899 --> 00:30:08,710
in their Page Table. So let’s take a look.
417
00:30:08,710 --> 00:30:14,819
The highlighted parts in orange
are the kernel code sections.
418
00:30:14,819 --> 00:30:20,629
And as you can see like when
looking at the first highlighted line
419
00:30:20,629 --> 00:30:24,909
it says ‘virtual address #FFF00’ etc.
420
00:30:24,909 --> 00:30:32,489
is mapped to the physical
address 1FF80000.
421
00:30:32,489 --> 00:30:40,320
And it is marked as executable
and you only have access to it
422
00:30:40,320 --> 00:30:45,219
in Kernel Mode, of course,
and only Read access. Right?
423
00:30:45,219 --> 00:30:49,979
So this is correct.
424
00:30:49,979 --> 00:30:56,019
But when you look at the second
line of that Page Table dump
425
00:30:56,019 --> 00:31:00,799
you will notice that
there is another section
426
00:31:00,799 --> 00:31:05,960
which covers the entire AXI WRAM
427
00:31:05,960 --> 00:31:09,779
and it’s mapped as Read-Write.
428
00:31:09,779 --> 00:31:15,609
So it doesn’t really make sense. Yeah.
429
00:31:15,609 --> 00:31:23,939
So basically it’s completely useless.
We have Read-Write access to it.
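Why the second mapping defeats the first can be shown with a small model of the page table. Only the physical address 1FF80000 comes from the slide; the virtual address of the read-write mapping and both sizes are placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    virt: int
    phys: int
    size: int
    writable: bool
    executable: bool

    def translate(self, vaddr):
        # Return the physical address behind vaddr, or None if not covered.
        if self.virt <= vaddr < self.virt + self.size:
            return self.phys + (vaddr - self.virt)
        return None


def writable_alias(mappings, target_vaddr):
    """Find a writable virtual address backed by the same physical byte."""
    phys = None
    for m in mappings:
        phys = m.translate(target_vaddr)
        if phys is not None:
            break
    if phys is None:
        return None
    for m in mappings:
        if m.writable and m.phys <= phys < m.phys + m.size:
            return m.virt + (phys - m.phys)
    return None


# Kernel code: read-only and executable (size is a placeholder).
code = Mapping(virt=0xFFF00000, phys=0x1FF80000, size=0x20000,
               writable=False, executable=True)
# Second mapping over the same memory, read-write (virt/size are placeholders).
wram = Mapping(virt=0xE0000000, phys=0x1FF80000, size=0x80000,
               writable=True, executable=False)

alias = writable_alias([code, wram], 0xFFF00100)
```

In this model `alias` is a read-write address for an instruction that the first mapping claims is read-only, which is the situation the page table dump shows.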
430
00:31:23,939 --> 00:31:28,430
So, to summarize everything,
431
00:31:28,430 --> 00:31:32,849
there’s actually no exploitation
protection. Once we found
432
00:31:32,849 --> 00:31:38,700
an exploitable bug it’s
pretty likely that we gain
433
00:31:38,700 --> 00:31:43,219
code execution in kernel mode.
434
00:31:43,219 --> 00:31:47,779
So, let’s find that bug.
435
00:31:47,779 --> 00:31:53,509
And I started by looking at the SVC table.
436
00:31:53,509 --> 00:31:59,809
So this is kind of the interface
between kernel land and userland.
437
00:31:59,809 --> 00:32:05,889
And this shows all system calls
438
00:32:05,889 --> 00:32:11,369
that are available in the kernel. So
you have like normal system calls.
439
00:32:11,369 --> 00:32:18,049
For memory management you can
map read- and writable pages;
440
00:32:18,049 --> 00:32:25,119
you can mirror pages and do
other memory management stuff.
441
00:32:25,119 --> 00:32:30,869
And there’s also some
configuration for threads like
442
00:32:30,869 --> 00:32:37,589
you can choose which
core should be used for
443
00:32:37,589 --> 00:32:41,450
executing the thread and all that stuff.
444
00:32:41,450 --> 00:32:47,219
You have a really large range
of synchronization objects
445
00:32:47,219 --> 00:32:51,119
like kernel mutexes and
all that stuff. And of course
446
00:32:51,119 --> 00:32:56,299
you have IPC requesting, so you can
447
00:32:56,299 --> 00:33:03,099
send messages to services. And
there’s a more advanced section
448
00:33:03,099 --> 00:33:09,270
like this is used by services mostly,
449
00:33:09,270 --> 00:33:14,629
because they have to
respond to your IPC requests.
450
00:33:14,629 --> 00:33:20,769
And there’s also Kernel DMA,
cache control, some things.
451
00:33:20,769 --> 00:33:26,710
And they have a set of debug system calls.
452
00:33:26,710 --> 00:33:31,099
It’s just basic debugging.
You can set breakpoints,
453
00:33:31,099 --> 00:33:36,429
read and write process memory.
But you don’t have access to them.
454
00:33:36,429 --> 00:33:39,919
Like on retail it’s not actually used.
455
00:33:39,919 --> 00:33:47,099
And so one last section
is the Privileged section.
456
00:33:47,099 --> 00:33:53,719
And here are all the
interesting system calls
457
00:33:53,719 --> 00:34:00,260
that allow you to create processes and
458
00:34:00,260 --> 00:34:07,249
map executable memory and all that stuff.
459
00:34:07,249 --> 00:34:13,870
Unfortunately, we can’t use the Advanced,
Debug and Privileged system calls.
460
00:34:13,870 --> 00:34:19,810
I mean that would require
exploiting some service.
461
00:34:19,810 --> 00:34:24,020
And that’s just more work for us.
462
00:34:24,020 --> 00:34:29,130
So this leaves us with
the normal system calls.
463
00:34:29,130 --> 00:34:33,760
But IPC sounds really interesting.
464
00:34:33,760 --> 00:34:41,239
But unfortunately it’s full of panics.
465
00:34:41,239 --> 00:34:49,570
Also there’s not much to attack at
synchronization object system calls.
466
00:34:49,570 --> 00:34:59,470
So you only have like this
more interesting system call
467
00:34:59,470 --> 00:35:06,520
for local memory management. And in
theory there’s a lot that you can mess up.
468
00:35:06,520 --> 00:35:12,290
Right? There’s a lot that can possibly
go wrong. And also we have
469
00:35:12,290 --> 00:35:17,030
unchecked DMA access!
Like through the GPU.
470
00:35:17,030 --> 00:35:22,180
So maybe we can do
something useful with that.
471
00:35:22,180 --> 00:35:26,430
Okay, so let’s have a look
at the memory allocator.
472
00:35:26,430 --> 00:35:30,440
There are 2 types of memory allocators.
473
00:35:30,440 --> 00:35:37,080
First is the regular one. And it’s
just for mapping normal heap
474
00:35:37,080 --> 00:35:43,700
like for malloc in C, e.g. And you
have the linear memory allocator
475
00:35:43,700 --> 00:35:49,250
that is used for GPU textures, like
476
00:35:49,250 --> 00:35:55,080
when memory has to be
physically contiguous
477
00:35:55,080 --> 00:35:58,740
you use the linear memory allocator.
478
00:35:58,740 --> 00:36:03,910
And there’s the FCRAM memory
layout that we saw earlier.
479
00:36:03,910 --> 00:36:09,920
You have these 3 regions
and every region has
480
00:36:09,920 --> 00:36:14,930
its own set of free pages.
481
00:36:14,930 --> 00:36:21,740
So how are they keeping track of them?
482
00:36:21,740 --> 00:36:27,430
So you have a region descriptor
which tells us the dimensions like:
483
00:36:27,430 --> 00:36:32,020
where does it start, the region,
and its size. And you get also
484
00:36:32,020 --> 00:36:39,410
a pointer to the first
free piece of memory
485
00:36:39,410 --> 00:36:47,230
in that region. And each
free piece of memory
486
00:36:47,230 --> 00:36:53,650
which we call a Memchunk
has a Memchunk header
487
00:36:53,650 --> 00:36:58,450
right at the beginning. And
it basically tells the kernel
488
00:36:58,450 --> 00:37:03,850
how large that Memchunk
is. And it’s also linked
489
00:37:03,850 --> 00:37:08,410
in a Doubly Linked List. So you
have a next and previous pointer
490
00:37:08,410 --> 00:37:15,030
pointing to the next and
previous Memchunk headers.
491
00:37:15,030 --> 00:37:20,970
It kind of looks like that.
So you have the red parts
492
00:37:20,970 --> 00:37:29,170
which are the free Memchunks
and the green parts are memory
493
00:37:29,170 --> 00:37:34,760
that is already allocated. So
494
00:37:34,760 --> 00:37:40,240
allocation is pretty straightforward.
It’s not really complicated.
495
00:37:40,240 --> 00:37:45,900
So the first thing that the
allocator function does:
496
00:37:45,900 --> 00:37:52,170
it loads the next free pointer
from the region descriptor.
497
00:37:52,170 --> 00:37:59,230
And for regular memory it
just goes through the list
498
00:37:59,230 --> 00:38:05,380
following the pointers
and it sums up their size
499
00:38:05,380 --> 00:38:10,670
until the requested size is reached.
For linear memory it would just
500
00:38:10,670 --> 00:38:17,120
look for a suitable memory chunk to make
sure that the memory is really contiguous.
501
00:38:17,120 --> 00:38:22,490
So when it found enough memory
it sets the next pointer
502
00:38:22,490 --> 00:38:28,230
of the very last Memchunk
to Zero. It will then
503
00:38:28,230 --> 00:38:33,690
update the list and also
the next free pointer
504
00:38:33,690 --> 00:38:38,550
for the region descriptor
and finally it will return
505
00:38:38,550 --> 00:38:44,780
a pointer to the first
Memchunk. So,
506
00:38:44,780 --> 00:38:48,930
let’s look at this from
a security perspective.
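The allocation walk described above can be sketched like this. It is a simplified Python model under stated assumptions: chunks are never split, the region descriptor only holds the first-free pointer, and all names are illustrative rather than taken from the kernel.

```python
# Hypothetical model of the free-list allocator; names are illustrative.
class MemchunkHeader:
    def __init__(self, pages):
        self.pages = pages   # size of this free chunk, in pages
        self.next = None
        self.prev = None


class Region:
    def __init__(self, chunk_sizes):
        # Build the doubly linked list of free chunks.
        self.first_free = None
        prev = None
        for pages in chunk_sizes:
            chunk = MemchunkHeader(pages)
            chunk.prev = prev
            if prev is None:
                self.first_free = chunk
            else:
                prev.next = chunk
            prev = chunk


def allocate_regular(region, pages_wanted):
    # Walk the list and sum chunk sizes until the request is covered.
    first = region.first_free
    total = 0
    chunk = first
    while chunk is not None:
        total += chunk.pages
        if total >= pages_wanted:
            break
        chunk = chunk.next
    if chunk is None:
        return None                # not enough free memory
    rest = chunk.next
    chunk.next = None              # terminate the allocated list
    if rest is not None:
        rest.prev = None
    region.first_free = rest       # update the region descriptor
    return first


def allocate_linear(region, pages_wanted):
    # Linear memory must be contiguous, so one chunk has to be big enough.
    chunk = region.first_free
    while chunk is not None and chunk.pages < pages_wanted:
        chunk = chunk.next
    if chunk is None:
        return None
    if chunk.prev is not None:
        chunk.prev.next = chunk.next
    else:
        region.first_free = chunk.next
    if chunk.next is not None:
        chunk.next.prev = chunk.prev
    chunk.next = chunk.prev = None
    return chunk
```

Note that every step of the walk reads `pages`, `next` and `prev` straight out of the headers, which is why it matters where those headers are stored.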
507
00:38:48,930 --> 00:38:53,410
And there’s a problem. They
basically have kernel structures
508
00:38:53,410 --> 00:38:59,500
inside the FCRAM!
And that is a problem
509
00:38:59,500 --> 00:39:03,930
because we have DMA access
to it through the GPU.
510
00:39:03,930 --> 00:39:08,740
And there was an attack by yellows8
511
00:39:08,740 --> 00:39:13,180
that is called ‘memchunkhax’.
And what he did
512
00:39:13,180 --> 00:39:17,060
is basically: he overwrote
memchunk headers
513
00:39:17,060 --> 00:39:21,540
with the GPU DMA
flaw. And then
514
00:39:21,540 --> 00:39:27,330
he gained an arbitrary kernel write
515
00:39:27,330 --> 00:39:31,710
when it’s deallocating memory. So because
516
00:39:31,710 --> 00:39:36,790
next/prev pointers have been modified.
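How modified next/prev pointers turn into a write can be illustrated with the generic unlink step of a doubly linked list. This is a sketch of the general technique and not a reconstruction of memchunkhax itself: the header offsets, the addresses and the dictionary standing in for memory are all assumptions.

```python
# Hypothetical header layout: size at +0, next at +4, prev at +8.
NEXT_OFF = 4
PREV_OFF = 8


def unlink(mem, chunk):
    # Classic list removal: prev.next = next; next.prev = prev.
    nxt = mem[chunk + NEXT_OFF]
    prv = mem[chunk + PREV_OFF]
    if prv != 0:
        mem[prv + NEXT_OFF] = nxt
    if nxt != 0:
        mem[nxt + PREV_OFF] = prv


mem = {}                      # address -> 32-bit word, stands in for memory
chunk = 0x1000                # header the attacker can overwrite
target = 0x8000               # address the attacker wants to write to
value = 0x41414141            # value the attacker wants to store there

# Attacker-controlled header contents.
mem[chunk + NEXT_OFF] = value
mem[chunk + PREV_OFF] = target - NEXT_OFF

unlink(mem, chunk)
written = mem[target]         # the unlink stored `value` at `target`
```

The second store of the unlink also writes near `value`, so in a real attack both pointers have to refer to memory that may be written.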
517
00:39:36,790 --> 00:39:42,140
So, unfortunately, this
was fixed by Nintendo
518
00:39:42,140 --> 00:39:47,600
in system update 9.3 last year,
519
00:39:47,600 --> 00:39:54,100
like 1 year ago. And the new kernel will
now verify every memchunk header
520
00:39:54,100 --> 00:40:00,280
during allocation. Like its size
and also next/prev pointers.
521
00:40:00,280 --> 00:40:08,160
So, in theory, everything has been fixed.
Invalid pointers or invalid sizes
522
00:40:08,160 --> 00:40:16,870
will just result in a
kernel panic. In theory.
523
00:40:16,870 --> 00:40:22,260
So when you look at the system
call for ControlMemory…
524
00:40:22,260 --> 00:40:29,140
we have access to it. It’s one
of the normal system calls.
525
00:40:29,140 --> 00:40:33,520
It does basic stuff. You
can map/free RW pages,
526
00:40:33,520 --> 00:40:41,040
but not executable of course. And it
takes an address and size as arguments.
527
00:40:41,040 --> 00:40:46,530
And also an operation code
which tells the kernel what to do:
528
00:40:46,530 --> 00:40:50,670
to map or free pages, whatever.
529
00:40:50,670 --> 00:40:55,590
So first it does some basic
checks on the address
530
00:40:55,590 --> 00:41:01,710
and eventually it will
call a very large function.
531
00:41:01,710 --> 00:41:08,640
And I just call that function
kern::controlmemory.
532
00:41:08,640 --> 00:41:14,980
So what does kern::controlmemory do:
it calls the allocator function
533
00:41:14,980 --> 00:41:20,550
and it will just return a
memchunk header pointer
534
00:41:20,550 --> 00:41:28,460
– as we have seen earlier. Then it goes
through all of the allocated memchunks
535
00:41:28,460 --> 00:41:33,100
and it’s mapping them to user space.
536
00:41:33,100 --> 00:41:40,330
And it’s also updating some block
information for the KProcess object.
537
00:41:40,330 --> 00:41:47,490
So there’s a problem. There’s
obviously a race condition.
538
00:41:47,490 --> 00:41:57,070
Like we can overwrite memchunk
headers after they have been allocated.
539
00:41:57,070 --> 00:42:03,570
So we could try using the GPU
but it’s really slow, actually,
540
00:42:03,570 --> 00:42:11,020
because we would have to ask
the GSP service to read memory
541
00:42:11,020 --> 00:42:19,570
and we have to go through this
very large IPC kernel code.
542
00:42:19,570 --> 00:42:26,730
And that would be probably too
slow. Allocation is really fast.
543
00:42:26,730 --> 00:42:30,930
Let’s dig a little bit deeper.
544
00:42:30,930 --> 00:42:38,060
I tried to reconstruct
the source code in C.
545
00:42:38,060 --> 00:42:44,040
So this is the first step.
It tries to allocate memory.
546
00:42:44,040 --> 00:42:54,070
For this example, it will just
allocate regular memory.
547
00:42:54,070 --> 00:42:58,510
So when it found a memchunk,
548
00:42:58,510 --> 00:43:04,700
which means that
enough memory is available,
549
00:43:04,700 --> 00:43:11,890
it will then execute this
really interesting do-while loop.
550
00:43:11,890 --> 00:43:15,520
I know, it’s a lot of code. I’m not
sure that you can actually read it.
551
00:43:15,520 --> 00:43:21,900
So let’s go quickly through this code.
552
00:43:21,900 --> 00:43:27,990
The pages read from the Memchunk header.
It gets converted to a physical address.
553
00:43:27,990 --> 00:43:31,700
And that physical address
gets mapped to userland
554
00:43:31,700 --> 00:43:38,980
by the mem_map function. And then
it will go to the next memchunk.
555
00:43:38,980 --> 00:43:45,410
Here. And it will also update
the userland virtual address.
556
00:43:45,410 --> 00:43:49,500
And then it will clear that memory. So
557
00:43:49,500 --> 00:43:53,880
what’s wrong here?
558
00:43:53,880 --> 00:44:00,020
The problem is they’re mapping
the Memchunk into userland.
559
00:44:00,020 --> 00:44:05,770
And after it has been mapped
they’re accessing it again.
560
00:44:05,770 --> 00:44:10,040
And what they access is the next pointer.
561
00:44:10,040 --> 00:44:13,250
So we can just overwrite it.
562
00:44:13,250 --> 00:44:19,509
When we have 2 threads running we can
563
00:44:19,509 --> 00:44:25,410
– from another CPU core –
try to overwrite that pointer.
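The race can be modelled deterministically by giving the mapping loop a hook that runs where the second core would. This is only a sketch of the ordering problem: Chunk, map_chunks and the hook are invented for illustration, and the real loop is a do-while in kernel code.

```python
# Hypothetical model: the header is read again after the chunk was mapped.
class Chunk:
    def __init__(self, name):
        self.name = name
        self.next = None


def map_chunks(first, on_mapped):
    mapped = []
    chunk = first
    while chunk is not None:
        mapped.append(chunk.name)   # chunk is now writable from userland
        on_mapped(chunk)            # window in which another core can run
        chunk = chunk.next          # next pointer is read AFTER the mapping
    return mapped


a = Chunk("user chunk A")
b = Chunk("user chunk B")
a.next = b
fake = Chunk("fake header in kernel memory")


def attacker(chunk):
    # Second thread: as soon as A is mapped, redirect its next pointer.
    if chunk is a:
        chunk.next = fake


honest = map_chunks(a, lambda chunk: None)
raced = map_chunks(a, attacker)
```

Without the attacker the loop maps A and B; with the attacker it follows the redirected pointer and maps whatever the fake header describes.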
564
00:44:25,410 --> 00:44:32,320
So our goal would be to map
kernel pages to userspace.
565
00:44:32,320 --> 00:44:37,510
But there are some problems. It
requires really, really perfect timing.
566
00:44:37,510 --> 00:44:45,040
There’s only a very small
time frame to do the overwrite.
567
00:44:45,040 --> 00:44:53,500
Also, we need a Memchunk header
structure at the next pointer address…
568
00:44:53,500 --> 00:45:00,710
…to do this. To make sure
we get a perfect timing
569
00:45:00,710 --> 00:45:06,810
I came up with a kernel
address arbiter oracle.
570
00:45:06,810 --> 00:45:11,650
It is actually used for thread
synchronization, we don’t care about it.
571
00:45:11,650 --> 00:45:15,430
But it tries to read from an address and
returns an error when the address is
572
00:45:15,430 --> 00:45:23,860
not accessible by userland. So
we can use that system call
573
00:45:23,860 --> 00:45:28,600
to make sure that the memory
has been mapped to userland.
574
00:45:28,600 --> 00:45:32,260
And once it has been mapped
we’re trying to overwrite it.
575
00:45:32,260 --> 00:45:38,080
So one last problem: we have to
inject a Memchunk header
576
00:45:38,080 --> 00:45:44,720
in kernel. I did this
by using the Slab Heap.
577
00:45:44,720 --> 00:45:50,720
We can just create some KObjects
and set their member variables
578
00:45:50,720 --> 00:45:56,170
to create a faked memchunk header.
579
00:45:56,170 --> 00:46:00,430
So this is the Slab Heap.
We’ve got C++ objects,
580
00:46:00,430 --> 00:46:04,680
vtable pointer and some attributes.
581
00:46:04,680 --> 00:46:11,200
So the Slab Heap is basically just
a really large area of C++ objects.
582
00:46:11,200 --> 00:46:17,030
And what I did was
I changed the attributes
583
00:46:17,030 --> 00:46:22,170
and used them as Memchunk
header. And I am redirecting
584
00:46:22,170 --> 00:46:29,950
the next-pointer to that
object and it will map
585
00:46:29,950 --> 00:46:34,410
multiple C++ objects to userland.
And that’s really nice because
586
00:46:34,410 --> 00:46:40,180
we have vtable pointers, so
we can just overwrite them.
587
00:46:40,180 --> 00:46:44,440
And that means that
we gain code execution.
588
00:46:44,440 --> 00:46:49,570
So, as a summary, we set
up some kernel objects,
589
00:46:49,570 --> 00:46:52,840
change their attributes, request
memory from the kernel;
590
00:46:52,840 --> 00:46:57,290
and once it becomes available
we patch the next-pointer,
591
00:46:57,290 --> 00:47:02,100
overwrite the mapped
SlabHeap pages and
592
00:47:02,100 --> 00:47:08,060
then we call a system call
which closes the handle
593
00:47:08,060 --> 00:47:11,940
for the kernel objects that
we created in step one.
594
00:47:11,940 --> 00:47:17,470
So it will eventually call
some vtable function
595
00:47:17,470 --> 00:47:23,560
and it will just jump to our
modified vtable function.
596
00:47:23,560 --> 00:47:29,380
And we got ARM11
Level0 Code Execution!!
597
00:47:29,380 --> 00:47:38,750
applause, motivated by smea
598
00:47:38,750 --> 00:47:43,880
So, now plutoo will tell us
what nice things you can do
599
00:47:43,880 --> 00:47:47,310
once you have gained ARM11
Code execution.
600
00:47:47,310 --> 00:47:55,060
plutoo: Hey guys! Okay, so… the ARM9.
601
00:47:55,060 --> 00:47:58,990
Let’s go.
602
00:47:58,990 --> 00:48:05,500
The ARM9 is actually also used
for executing old DS games.
603
00:48:05,500 --> 00:48:10,390
So what they do is, they actually,
you could say, reused the ARM9
604
00:48:10,390 --> 00:48:14,210
which is their backwards compatibility
processor. They use it
605
00:48:14,210 --> 00:48:21,130
as a security processor
when executing 3DS code.
606
00:48:21,130 --> 00:48:24,890
And like smea said it’s running
a stripped-down version
607
00:48:24,890 --> 00:48:30,700
of the ARM11 kernel. It basically
only does threading, synchronization,
608
00:48:30,700 --> 00:48:35,460
things like that. And there’s
no MMU. There’s an MPU,
609
00:48:35,460 --> 00:48:39,560
8 regions you can configure.
610
00:48:39,560 --> 00:48:46,210
You could do no-execute
within those regions etc. but
611
00:48:46,210 --> 00:48:50,280
the granularity is not very
nice. And they only have 8.
612
00:48:50,280 --> 00:48:55,390
So they basically ran out of space.
And .data+stack is executable
613
00:48:55,390 --> 00:49:00,020
as long as you can jump to
it. And .text is writable
614
00:49:00,020 --> 00:49:06,240
so that’s bad. Basically whenever you can
615
00:49:06,240 --> 00:49:11,940
write code into arbitrary memory
you can just overwrite code.
616
00:49:11,940 --> 00:49:16,250
These features – you don’t want
them on a security processor.
617
00:49:16,250 --> 00:49:18,430
laughter
618
00:49:18,430 --> 00:49:23,740
So let’s go. So it turns out that
619
00:49:23,740 --> 00:49:28,040
there have been lots of exploits over
the years and most of them are fixed.
620
00:49:28,040 --> 00:49:33,330
And most of them used the
normal command interface.
621
00:49:33,330 --> 00:49:37,940
But in this case we’re taking
a different approach. So
622
00:49:37,940 --> 00:49:42,730
on the 3DS the memory-mapped
I/O is split up into 3 regions.
623
00:49:42,730 --> 00:49:46,420
There’s the ARM9-only I/O: it does crypto,
624
00:49:46,420 --> 00:49:50,980
it does DMA engine,
625
00:49:50,980 --> 00:49:54,760
things like that. Then there’s
the Shared I/O region.
626
00:49:54,760 --> 00:49:58,030
And then, finally, there’s the
ARM11 I/O region which contains
627
00:49:58,030 --> 00:50:02,760
the GPU, video decoder.
628
00:50:02,760 --> 00:50:06,310
Thanks to derrek and smea
we have full ARM11 control.
629
00:50:06,310 --> 00:50:09,680
We execute in kernel mode.
630
00:50:09,680 --> 00:50:13,280
So the question is: can we use
the shared I/O region, somehow,
631
00:50:13,280 --> 00:50:17,750
to own the ARM9? So it turns out
632
00:50:17,750 --> 00:50:21,550
the interface for reading old
DS cartridges is actually
633
00:50:21,550 --> 00:50:24,940
in the shared I/O region.
634
00:50:24,940 --> 00:50:30,260
We’re not sure why this is, but
635
00:50:30,260 --> 00:50:33,970
they have it there for some
reason. And it’s only the ARM9
636
00:50:33,970 --> 00:50:38,120
which is actually using this region.
But ARM11 still has access to it.
637
00:50:38,120 --> 00:50:43,780
So when you insert the cartridge
it starts by reading the banner.
638
00:50:43,780 --> 00:50:49,100
And it does this by writing this
magic value to CTRL register.
639
00:50:49,100 --> 00:50:53,940
And basically it just asks
for 0x200 [hex] bytes.
640
00:50:53,940 --> 00:50:56,490
And then there’s this loop.
641
00:50:56,490 --> 00:50:59,770
And this Assembler code
is on the right side.
642
00:50:59,770 --> 00:51:04,640
You can see it basically waits
for some bits to clear / to set
643
00:51:04,640 --> 00:51:11,170
and then they read 4 bytes and
then they wait for another bit.
644
00:51:11,170 --> 00:51:15,520
And there’s no range check on the
buffer. But it’s always 0x200 bytes,
645
00:51:15,520 --> 00:51:20,540
so it should be fine.
646
00:51:20,540 --> 00:51:24,510
What if we overwrite the
CTRL register from ARM11
647
00:51:24,510 --> 00:51:27,880
asking for 0x4000 bytes?
648
00:51:27,880 --> 00:51:32,080
Boom!
649
00:51:32,080 --> 00:51:36,490
We have a nice buffer overrun.
It’s in the .bss segment but…
650
00:51:36,490 --> 00:51:40,690
it’s still nice. And we can control the data.
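A minimal model of this overrun, assuming the read loop simply trusts the length encoded in the CTRL register. The function name, the buffer address and the flat bytearray standing in for ARM9 memory are illustrative; only the 0x200 and 0x4000 lengths come from the talk.

```python
BANNER_SIZE = 0x200   # size of the buffer the ARM9 reserved for the banner


def read_cartridge(ram, buf_addr, ctrl_length, cart_data):
    # Copies 4 bytes at a time until the length taken from CTRL is reached.
    # There is no check of ctrl_length against BANNER_SIZE.
    for off in range(0, ctrl_length, 4):
        ram[buf_addr + off:buf_addr + off + 4] = cart_data[off:off + 4]


ram = bytearray(0x8000)
buf_addr = 0x100                       # placeholder buffer location
neighbour = buf_addr + BANNER_SIZE     # whatever lives right after the buffer
ram[neighbour:neighbour + 4] = b"SAFE"

# Normal case: the ARM9 asked for 0x200 bytes, the neighbour is untouched.
read_cartridge(ram, buf_addr, 0x200, bytes(0x200))
intact = bytes(ram[neighbour:neighbour + 4])

# Attack: ARM11 rewrote CTRL to ask for 0x4000 bytes of cartridge data.
read_cartridge(ram, buf_addr, 0x4000, b"\x41" * 0x4000)
clobbered = bytes(ram[neighbour:neighbour + 4])
```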
651
00:51:40,690 --> 00:51:48,110
So the data actually comes
from the cartridge.
652
00:51:48,110 --> 00:51:51,720
We need to make our
own DS cartridge. So,
653
00:51:51,720 --> 00:51:56,030
there’s this old device, called the
PassMe. It’s for the original DS,
654
00:51:56,030 --> 00:51:59,850
where you basically plug
an old DS cartridge in
655
00:51:59,850 --> 00:52:03,960
and it basically modifies
the header as it’s read. So,
656
00:52:03,960 --> 00:52:08,620
these are available online for 5 bucks.
657
00:52:08,620 --> 00:52:15,480
And then you add an FPGA.
658
00:52:15,480 --> 00:52:21,150
I implemented this and it
works, but it’s very gimmicky.
659
00:52:21,150 --> 00:52:26,290
I don’t recommend it.
660
00:52:26,290 --> 00:52:30,790
And here’s my soldering,
it’s not very nice.
661
00:52:30,790 --> 00:52:35,730
This gives us ARM9 code execution
and this works on latest firmware.
662
00:52:35,730 --> 00:52:41,370
But we want something better.
Let’s look at the chain of trust.
663
00:52:41,370 --> 00:52:46,620
The chain of trust: the idea is of course,
you verify all the code that is running.
664
00:52:46,620 --> 00:52:51,560
But you’re basically verifying
everything at load time.
665
00:52:51,560 --> 00:52:55,230
The 3DS has the simplest
chain of trust you can have.
666
00:52:55,230 --> 00:52:58,560
There’s the Boot ROM at
the start. And then it loads
667
00:52:58,560 --> 00:53:04,490
the firmware binary from
NAND and it jumps to it.
668
00:53:04,490 --> 00:53:07,900
On the new 3DS they were a bit clever.
669
00:53:07,900 --> 00:53:12,760
They added an extra crypto
layer on the ARM9 portion.
670
00:53:12,760 --> 00:53:17,520
But it’s actually part
of the firmware binary.
671
00:53:17,520 --> 00:53:20,380
We call this ‘ARM9 loader’.
672
00:53:20,380 --> 00:53:23,530
So the theory that Nintendo had was:
673
00:53:23,530 --> 00:53:27,460
“Let’s add another layer of
crypto, so we change the keys,
674
00:53:27,460 --> 00:53:32,470
we introduce new keys,
and they can’t break it”.
675
00:53:32,470 --> 00:53:35,560
And they don’t have any worked-out
place to put those keys.
676
00:53:35,560 --> 00:53:39,200
So they placed them in NAND!
677
00:53:39,200 --> 00:53:42,760
But they’re encrypted with
the per-Console key that’s
678
00:53:42,760 --> 00:53:48,030
based on a hash of the OTP
that’s unique for each Console.
679
00:53:48,030 --> 00:53:52,120
And then OTP access is
disabled early in the Boot.
680
00:53:52,120 --> 00:53:59,410
So later on you can’t dump the OTP
and you can’t figure out the keys.
681
00:53:59,410 --> 00:54:03,580
This looks safe, in theory.
So here’s the implementation.
682
00:54:03,580 --> 00:54:08,620
So they calculate some hash of the OTP.
They read the key-sector from NAND.
683
00:54:08,620 --> 00:54:12,430
And they decrypt the key.
And they put it in a keyslot.
684
00:54:12,430 --> 00:54:17,180
It’s basically an isolated memory area.
685
00:54:17,180 --> 00:54:21,170
And then they generate
a bunch of sub keys and
686
00:54:21,170 --> 00:54:24,620
they verify that the key they loaded
from NAND is the correct one.
687
00:54:24,620 --> 00:54:30,810
So even if we were to switch the key
they would detect that and just panic.
688
00:54:30,810 --> 00:54:35,300
And then they decrypt the ARM9 binary
and they jump to the entry point.
689
00:54:35,300 --> 00:54:40,420
But… they forgot to clear the 0x11 key!
690
00:54:40,420 --> 00:54:44,190
So we can just get code execution
later on. And we can just regenerate
691
00:54:44,190 --> 00:54:51,460
all those keys! So this
implementation is useless.
692
00:54:51,460 --> 00:54:52,760
Okay.
laughs
693
00:54:52,760 --> 00:54:58,960
applause
694
00:54:58,960 --> 00:55:03,760
And they fixed this because they have
more than 1 key hidden in the NAND.
695
00:55:03,760 --> 00:55:07,780
So they took their next key.
696
00:55:07,780 --> 00:55:10,680
It’s basically the same idea: you
calculate the same hash, you read
697
00:55:10,680 --> 00:55:14,920
the key sector from NAND, you generate
all the previous keys for compatibility,
698
00:55:14,920 --> 00:55:19,900
and then you decrypt a
new key, we call it Key#2.
699
00:55:19,900 --> 00:55:23,920
And then you decrypt ARM9
binary using the second key.
700
00:55:23,920 --> 00:55:27,780
You clear the keyslot, and
you jump to entry point.
701
00:55:27,780 --> 00:55:32,010
But they forgot to verify the second key!
audience laughs
702
00:55:32,010 --> 00:55:40,000
This is epic fail!
applause
703
00:55:40,000 --> 00:55:44,520
So let’s exploit this. ‘ARM9LOADERHAX’.
704
00:55:44,520 --> 00:55:49,510
We can change the second key. ARM9
loader will just decrypt the binary
705
00:55:49,510 --> 00:55:54,820
to garbage and jump to it.
706
00:55:54,820 --> 00:56:00,110
If you look at the encoding
of an ARM Branch instruction:
707
00:56:00,110 --> 00:56:04,310
the probability is pretty high that
there will just be a Branch instruction.
708
00:56:04,310 --> 00:56:08,590
And just any random data will eventually…
like if you try enough keys,
709
00:56:08,590 --> 00:56:14,810
it will eventually become a Branch
instruction to some memory.
710
00:56:14,810 --> 00:56:19,490
So if we try a lot of keys, eventually
we will find some garbage
711
00:56:19,490 --> 00:56:23,990
that is useful.
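The branch encoding argument can be checked with a small decoder. In the ARM instruction set, B and BL have bits 27 to 25 equal to 101 and a 24-bit signed word offset, so a bit more than 11% of all 32-bit values decode as one of them (condition field 1111 is excluded here because on ARMv5 it means BLX). The function name and the addresses are illustrative, and whether a conditional branch is taken still depends on the flags.

```python
def decode_arm_branch(word, address):
    """Return the target if `word` is an ARM B/BL instruction, else None."""
    cond = (word >> 28) & 0xF
    if (word >> 25) & 0b111 != 0b101:
        return None                  # not a branch encoding
    if cond == 0xF:
        return None                  # BLX (immediate) on ARMv5, ignored here
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:
        imm24 -= 0x1000000           # sign-extend the 24-bit offset
    # The offset is in words and relative to the instruction address + 8.
    return (address + 8 + (imm24 << 2)) & 0xFFFFFFFF


self_loop = decode_arm_branch(0xEAFFFFFE, 0x1000)   # "b ." branches to itself
forward = decode_arm_branch(0xEA000000, 0x1000)     # offset 0 lands at +8
not_branch = decode_arm_branch(0xE1A00000, 0x1000)  # mov r0, r0
```

Trying keys then amounts to repeating the decryption until the first word of the garbage decodes to a branch whose target is an address the attacker controls.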
712
00:56:23,990 --> 00:56:29,680
This is the NAND, the flash
memory, of an unmodified 3DS
713
00:56:29,680 --> 00:56:37,349
– a new 3DS. So there’s a small key
section, marked in teal, like, blue.
714
00:56:37,349 --> 00:56:41,660
And it contains those keys
that we’re talking about.
715
00:56:41,660 --> 00:56:44,550
And then there are 2 firmware partitions.
716
00:56:44,550 --> 00:56:47,960
One is used for backup, in
case one gets corrupted;
717
00:56:47,960 --> 00:56:52,119
so it doesn’t brick the device, whatever.
718
00:56:52,119 --> 00:56:57,190
We installed our custom key.
719
00:56:57,190 --> 00:57:00,920
And we installed the largest
firm binary we have
720
00:57:00,920 --> 00:57:06,100
in the firm0 partition. And we keep
the one with the vulnerability
721
00:57:06,100 --> 00:57:11,760
in the firm1 partition. And
then we put our code payload
722
00:57:11,760 --> 00:57:17,250
on top of the firmware0 binary.
723
00:57:17,250 --> 00:57:21,340
And then we reboot.
And so what will happen?
724
00:57:21,340 --> 00:57:24,070
The Bootrom is executed.
725
00:57:24,070 --> 00:57:29,660
It will load the first firmware partition.
726
00:57:29,660 --> 00:57:34,510
And it has our code in the end,
but it doesn’t know about it.
727
00:57:34,510 --> 00:57:38,880
And then it decrypts it.
And, you see, it looks okay.
728
00:57:38,880 --> 00:57:43,800
There’s the ARM9 loader stub in the front;
and then comes the encrypted binary.
729
00:57:43,800 --> 00:57:48,170
And then, finally,
there’s our payload.
730
00:57:48,170 --> 00:57:52,960
But Bootrom checks the
hash, right? And it fails.
731
00:57:52,960 --> 00:57:58,280
So it thinks the partition got corrupted.
732
00:57:58,280 --> 00:58:03,000
So it will load the smaller one on top.
You see we have our payload in memory,
733
00:58:03,000 --> 00:58:09,380
at Boot. And then it decrypts firmware1
734
00:58:09,380 --> 00:58:14,810
which is smaller and it still has ARM9
loader and another encrypted ARM9 binary.
735
00:58:14,810 --> 00:58:18,910
And then it jumps to ARM9 loader
because the hash checks out.
736
00:58:18,910 --> 00:58:24,230
And then the ARM9 loader will
decrypt our corrupted key
737
00:58:24,230 --> 00:58:28,940
from NAND and it will
decrypt this one to garbage
738
00:58:28,940 --> 00:58:37,100
and it will jump to it. And
hopefully it jumps to our code.
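The fallback behaviour described above can be modelled with a toy sketch. Everything in it is an assumption made for illustration: the function name, the image sizes, loading both partitions to the same base address, and SHA-256 as a stand-in for whatever check the Bootrom really performs. It only shows why the tail of the larger, rejected image is still in memory after the smaller one is accepted.

```python
import hashlib

def bootrom_load(memory, base, partitions):
    """Toy model: copy each image to `base`, accept the first whose hash matches.

    Returns the index of the accepted partition, or None if all fail.
    Bytes beyond the end of the accepted image are never cleared.
    """
    for index, (image, expected_hash) in enumerate(partitions):
        memory[base:base + len(image)] = image
        if hashlib.sha256(image).digest() == expected_hash:
            return index
    return None

PAYLOAD = b"PAYLOAD!"                        # stand-in for attacker code
big_firm = bytes(range(256)) * 4             # stand-in for the largest firm binary
firm0 = big_firm[:-len(PAYLOAD)] + PAYLOAD   # payload written over its tail
firm1 = b"\x11" * 256                        # stand-in for the smaller, valid firm

memory = bytearray(2048)
booted = bootrom_load(memory, 0, [
    (firm0, hashlib.sha256(big_firm).digest()),  # hash no longer matches
    (firm1, hashlib.sha256(firm1).digest()),     # hash checks out
])
payload_still_in_memory = bytes(memory[len(firm0) - len(PAYLOAD):len(firm0)]) == PAYLOAD
```

In this model `booted` ends up pointing at the second partition, while the payload sits untouched past the end of it, waiting for a stray jump.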
739
00:58:37,100 --> 00:58:41,770
So this gives us ARM9 code
execution from cold Boot.
740
00:58:41,770 --> 00:58:46,230
Early, very early. So it turns out we
can actually use this to get some keys
741
00:58:46,230 --> 00:58:52,000
that are later not available
because they clear those…
742
00:58:52,000 --> 00:58:56,869
they use a certain memory area for seeding
the encryption engine to generate keys
743
00:58:56,869 --> 00:59:04,440
and the memory is later cleared.
So you can’t regenerate the keys.
744
00:59:04,440 --> 00:59:08,400
But with this we can actually
get those 2 keys.
745
00:59:08,400 --> 00:59:11,850
They’re called the firmware 6.x save-key
746
00:59:11,850 --> 00:59:15,780
and firmware 7.x NCCH-key.
747
00:59:15,780 --> 00:59:20,400
That’s a bonus.
748
00:59:20,400 --> 00:59:25,220
We talked a bit about the AES engine.
It’s used everywhere for the crypto
749
00:59:25,220 --> 00:59:30,200
and it’s used for everything, basically.
750
00:59:30,200 --> 00:59:35,990
It supports all the usual
block cipher modes.
751
00:59:35,990 --> 00:59:40,940
It has 2 security features: it has
write-only keys. Which is really useful.
752
00:59:40,940 --> 00:59:44,750
Like you write a key and then
you can never ever read it back.
753
00:59:44,750 --> 00:59:49,770
This means that they can
fill in the keys by the Bootrom
754
00:59:49,770 --> 00:59:56,150
and we can’t dump them later.
755
00:59:56,150 --> 01:00:01,300
So they can keep the keys secret.
756
01:00:01,300 --> 01:00:08,280
Even if we hacked the ARM9, even if we get
code execution we’ll never get the keys.
757
01:00:08,280 --> 01:00:12,250
And then there’s the key scrambler.
Which is that the key is actually
758
01:00:12,250 --> 01:00:16,320
– it’s an optional thing –
where the actual key is hidden,
759
01:00:16,320 --> 01:00:21,090
calculated by a hardware
function, that is never…
760
01:00:21,090 --> 01:00:26,359
that we don’t know about. So the key
is actually never exposed to the CPU
761
01:00:26,359 --> 01:00:30,580
– the actual key. So we just feed it 2
values, 2 keys and then it generates
762
01:00:30,580 --> 01:00:35,000
a new key based on that. And
we don’t know what that key is.
763
01:00:35,000 --> 01:00:40,500
So this creates a situation similar to
the isolated SPUs on the PS3
764
01:00:40,500 --> 01:00:44,000
where you can ask it to decrypt
stuff, but you don’t get the keys.
765
01:00:44,000 --> 01:00:49,640
And if you don’t get the keys,
then… we want the keys!!
766
01:00:49,640 --> 01:00:53,300
We want to decrypt things on
our PC because we’re lazy.
767
01:00:53,300 --> 01:00:57,720
So there’re 2 keys –
KeyX, KeyY we call them.
768
01:00:57,720 --> 01:01:01,970
They’re 128bits and the
normal key is derived
769
01:01:01,970 --> 01:01:06,250
as a function of those 2;
and that function is unknown.
770
01:01:06,250 --> 01:01:12,040
It’s implemented in hardware, in silicon.
771
01:01:12,040 --> 01:01:15,760
So even if we know X and Y we
can’t figure out the normal key
772
01:01:15,760 --> 01:01:21,960
and we can’t decrypt things
without asking the 3DS first.
773
01:01:21,960 --> 01:01:26,550
But we can poke this hardware engine.
774
01:01:26,550 --> 01:01:30,050
The first thing you notice when you
do this is that if you set the N-th bit
775
01:01:30,050 --> 01:01:37,140
of the X key and the (N+2)-th bit in
the Y key you get the same result.
776
01:01:37,140 --> 01:01:41,080
And in general, you find that
the function that we’re looking for
777
01:01:41,080 --> 01:01:45,280
is actually just a function
of one variable where it’s
778
01:01:45,280 --> 01:01:50,690
the XOR between the X rotated by 2…
779
01:01:50,690 --> 01:01:56,100
so this is rotation, not shift,
and XOR-ed with Y.
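A small sketch of why that observation pins down the structure: if the normal key depends only on (KeyX rotated left by 2) XOR KeyY, then flipping bit N of KeyX and flipping bit N+2 of KeyY must give the same output, whatever the outer function is. The outer function below is a hash used purely as a placeholder. It is not the hardware function, and all names are invented for this example.

```python
import hashlib

MASK = (1 << 128) - 1

def rol128(value, count):
    """Rotate a 128-bit integer left by `count` bits."""
    count %= 128
    return ((value << count) | (value >> (128 - count))) & MASK

def unknown_g(t):
    """Placeholder for the undisclosed hardware function (NOT the real one)."""
    digest = hashlib.sha256(t.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:16], "big")

def scrambler(key_x, key_y):
    """Normal key as a function of the single value (KeyX rol 2) XOR KeyY."""
    return unknown_g(rol128(key_x, 2) ^ key_y)

def flips_are_equivalent(key_x, key_y, n):
    """Flipping bit n of KeyX matches flipping bit n+2 (mod 128) of KeyY."""
    flipped_x = scrambler(key_x ^ (1 << n), key_y)
    flipped_y = scrambler(key_x, key_y ^ (1 << ((n + 2) % 128)))
    return flipped_x == flipped_y
```

The modulo on the bit index is what distinguishes a rotation from a shift: bits 126 and 127 of KeyX pair up with bits 0 and 1 of KeyY.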
780
01:01:56,100 --> 01:01:59,430
But we still don’t know the key.
But we want to know keys. So…
781
01:01:59,430 --> 01:02:08,140
So step back a little bit.
782
01:02:08,140 --> 01:02:12,070
The keyscrambler is used for Mii QR-codes.
783
01:02:12,070 --> 01:02:18,740
It’s used for everything, right? So it’s
used for network protocol, called UDS,
784
01:02:18,740 --> 01:02:23,930
and it’s used for Download Play – which
is when you download games over WiFi,
785
01:02:23,930 --> 01:02:28,000
temporary games. But the
Wii U also supports all of this.
786
01:02:28,000 --> 01:02:31,180
But it doesn’t have the
key scrambler in hardware.
787
01:02:31,180 --> 01:02:33,090
So the Wii U must be using normal keys.
788
01:02:33,090 --> 01:02:36,520
applause
screamed from audience: WHAT?
789
01:02:36,520 --> 01:02:46,360
applause
790
01:02:46,360 --> 01:02:51,210
So we make a table of the shared keys and
791
01:02:51,210 --> 01:02:54,619
these are the 3 keys that
are shared with the Wii U.
792
01:02:54,619 --> 01:03:00,240
Here is where the KeyX
and KeyY on the 3DS…
793
01:03:00,240 --> 01:03:05,920
where they are set. And 2 of them
have KeyY set by firmware.
794
01:03:05,920 --> 01:03:11,510
So we can’t read the keys set by the
Bootrom because it’s locked away
795
01:03:11,510 --> 01:03:17,310
and we don’t have it. But can
we still figure out G? Let’s see.
796
01:03:17,310 --> 01:03:23,390
So I give a shoutout to shuffle2 and
to fail0verflow who hacked the Wii U
797
01:03:23,390 --> 01:03:27,540
and they helped us… or shuffle
helped us extract the Wii U keys.
798
01:03:27,540 --> 01:03:36,670
So thank you! Now we have KeyY and
we know the normal key from the Wii U.
799
01:03:36,670 --> 01:03:39,740
However, KeyX is still unknown.
800
01:03:39,740 --> 01:03:44,560
And if G(t) is ‘bad’ then a
small change in the KeyY
801
01:03:44,560 --> 01:03:48,970
will only lead to a small
change in the normal key.
802
01:03:48,970 --> 01:03:53,369
It’s bad! So let’s look at the data.
803
01:03:53,369 --> 01:03:56,670
So when we flip one bit in the
KeyY we can brute-force all keys
804
01:03:56,670 --> 01:04:01,390
similar to the normal key which
is just within a couple of bit flips
805
01:04:01,390 --> 01:04:06,540
and we find that it always
results in the normal key
806
01:04:06,540 --> 01:04:12,980
with bits flipped at
position either 87 or 88,
807
01:04:12,980 --> 01:04:16,340
sometimes 89, but never 86.
808
01:04:16,340 --> 01:04:22,359
So this reminds me of an adder
where you had a carry bit
809
01:04:22,359 --> 01:04:26,160
being propagated to upper
bits, but never to lower ones.
810
01:04:26,160 --> 01:04:30,980
So let’s guess that this is
an adder and let’s try:
811
01:04:30,980 --> 01:04:37,599
it’s an adder with a rotation so
we guess that G(t) = (t+C)
812
01:04:37,599 --> 01:04:45,140
– some constant C, we don’t know it –
and rotated to the left by 87.
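A quick sketch of how this guess accounts for the data: with G(t) = (t + C) rotated left by 87, flipping bit 0 of the input changes the sum by one, so the changed bits form a carry run that starts at output position 87 and only extends upwards. The constant `C` below is an arbitrary made-up value, not the real silicon constant, and the function names are illustrative.

```python
MASK = (1 << 128) - 1
C = 0x0123456789ABCDEF0123456789ABCDEF  # hypothetical constant, NOT the real one

def rol128(value, count):
    """Rotate a 128-bit integer left by `count` bits."""
    count %= 128
    return ((value << count) | (value >> (128 - count))) & MASK

def guessed_g(t, c=C):
    """The guessed form: 128-bit add of a constant, then rotate left by 87."""
    return rol128((t + c) & MASK, 87)

def flipped_positions(t, bit, c=C):
    """Output bit positions that change when one input bit of t is flipped."""
    diff = guessed_g(t, c) ^ guessed_g(t ^ (1 << bit), c)
    return [i for i in range(128) if (diff >> i) & 1]
```

With this particular made-up constant, `flipped_positions(2, 0)` gives positions 87 and 88, and other inputs give longer runs, but position 86 never shows up. That matches the pattern seen on the real hardware.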
813
01:04:45,140 --> 01:04:50,680
And then we plug it in to our original
formula and we don’t know KeyX, remember,
814
01:04:50,680 --> 01:04:53,640
because it’s set by Bootrom,
we don’t have it.
815
01:04:53,640 --> 01:04:59,440
We don’t know the constant C because
it’s in silicon, it’s in hardware.
816
01:04:59,440 --> 01:05:04,630
But if we look at the formula,
and we consider the inequality,
817
01:05:04,630 --> 01:05:09,440
where we basically rotate right by 87
818
01:05:09,440 --> 01:05:13,500
– we’re basically undoing
the outer rotation.
819
01:05:13,500 --> 01:05:18,810
And then we plug our guess into
our formula. And then we get this.
820
01:05:18,810 --> 01:05:23,300
And then we subtract C from
both sides. We end up with this.
821
01:05:23,300 --> 01:05:28,510
And this is basically… we’re XOR-ing
2 different keys with the same X value
822
01:05:28,510 --> 01:05:34,810
rotated to the left by 2.
823
01:05:34,810 --> 01:05:38,150
Well, if you stare at
this for a bit you’ll see that
824
01:05:38,150 --> 01:05:45,950
if y0 and y1 – which are 2 different
KeyY’s – are equal except for
825
01:05:45,950 --> 01:05:52,240
at one bit position then
the XOR is smallest
826
01:05:52,240 --> 01:05:58,100
for the one which shares
the same bit value
827
01:05:58,100 --> 01:06:03,070
at the position that the
2 Y’s are differing at.
828
01:06:03,070 --> 01:06:07,740
It’s actually pretty simple
but it sounds difficult.
829
01:06:07,740 --> 01:06:12,720
XOR is Zero if they’re the same
input and One if they’re different.
830
01:06:12,720 --> 01:06:16,080
If they’re the same it’s
Zero and it’s smaller.
831
01:06:16,080 --> 01:06:20,550
So we actually look
bit-by-bit on this. And
832
01:06:20,550 --> 01:06:27,910
we repeat this 128 times. And we
recover all 128 bits of the KeyX.
833
01:06:27,910 --> 01:06:32,740
And when we have the KeyX we can
calculate the silicon constant C.
834
01:06:32,740 --> 01:06:38,250
So the end result is: the key
scrambler is figured out
835
01:06:38,250 --> 01:06:45,290
and we have also the secret Bootrom
KeyX for a couple of keyslots, as a bonus.
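The bit-by-bit recovery can be sketched as follows. This is a variation on the comparison described above, not the speakers' exact procedure: instead of checking which value is smaller, it looks at the difference modulo 2^128, which stays correct when the 128-bit addition wraps around. It also assumes an `oracle` that returns the normal key for a chosen KeyY directly, whereas the talk obtains those values by brute-forcing near a known normal key. All names and secret values are hypothetical.

```python
MASK = (1 << 128) - 1

def rol128(value, count):
    """Rotate a 128-bit integer left by `count` bits."""
    count %= 128
    return ((value << count) | (value >> (128 - count))) & MASK

def ror128(value, count):
    """Rotate a 128-bit integer right by `count` bits."""
    return rol128(value, 128 - (count % 128))

def scramble(key_x, key_y, c):
    """Guessed scrambler: ((KeyX rol 2) XOR KeyY) + C, rotated left by 87."""
    return rol128(((rol128(key_x, 2) ^ key_y) + c) & MASK, 87)

def recover(oracle, y0):
    """Recover a (KeyX, C) pair that reproduces the oracle for every KeyY.

    oracle(y) must return the normal key for the hidden KeyX and a chosen y.
    """
    u0 = ror128(oracle(y0), 87)            # equals t0 + C, with t0 = (X rol 2) ^ y0
    t0 = 0
    for i in range(127):
        u1 = ror128(oracle(y0 ^ (1 << i)), 87)
        # Flipping bit i of t0 adds 2^i if that bit was 0, subtracts 2^i if it was 1.
        if (u1 - u0) & MASK != (1 << i):
            t0 |= 1 << i
    # Bit 127 cannot be told apart from the top bit of C; assuming 0 here
    # gives an equivalent (KeyX, C) pair that behaves identically.
    key_x = ror128(t0 ^ y0, 2)
    c = (u0 - t0) & MASK
    return key_x, c
```

A caller would wrap the device in a function, for example `recover(lambda y: scramble(secret_x, y, secret_c), y0)` in a simulation, and then check the result against a few fresh KeyY values.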
836
01:06:45,290 --> 01:07:00,780
applause, motivated by smea
837
01:07:00,780 --> 01:07:04,530
I didn’t put the constants in
the slides because I want this to be
838
01:07:04,530 --> 01:07:11,840
an exercise for the listener.
839
01:07:11,840 --> 01:07:16,400
When the new 3DS was released
they rushed it, we think,
840
01:07:16,400 --> 01:07:22,440
because they left some interesting
commands in the ps:ps service. And
841
01:07:22,440 --> 01:07:31,150
it included an early version of the NFC
crypto used for the Amiibo figurines.
842
01:07:31,150 --> 01:07:36,609
This implementation, the first
one, uses a normal key. And the…
843
01:07:36,609 --> 01:07:40,060
the newer one changed it to KeyY.
844
01:07:40,060 --> 01:07:44,290
So they accidentally gave us one of
these pairs in the firmware images.
845
01:07:44,290 --> 01:07:47,260
We don’t need to use the Wii U at all.
846
01:07:47,260 --> 01:07:52,210
So anyone who can decrypt
3DS firmware binaries
847
01:07:52,210 --> 01:07:58,400
can perform this attack
to get the constants.
848
01:07:58,400 --> 01:08:03,290
So anyone out there: Good luck!
849
01:08:03,290 --> 01:08:06,750
And now: back to smea, for a summary.
850
01:08:06,750 --> 01:08:13,720
applause
851
01:08:13,720 --> 01:08:16,880
smea: Right, I’m just gonna conclude
really quickly. So, some take-aways of
852
01:08:16,880 --> 01:08:20,839
what we talked about
today: first thing is:
853
01:08:20,839 --> 01:08:23,988
it’s all pretty obvious lessons,
but – you know – bear with me.
854
01:08:23,988 --> 01:08:29,049
Giving access to physical memory to
any application, through GPU or whatever,
855
01:08:29,049 --> 01:08:31,849
is dangerous. You should always be
careful about that. Even if you think
856
01:08:31,849 --> 01:08:36,059
you’ve protected stuff, there’s probably
gonna be stuff that you forgot. So just,
857
01:08:36,059 --> 01:08:39,538
like: “don’t do it, or do it right”.
858
01:08:39,538 --> 01:08:42,408
Other thing is: Shared I/O is
dangerous if you don’t know
859
01:08:42,408 --> 01:08:47,908
what can actually control the I/O, then,
well, again, you should be very careful.
860
01:08:47,908 --> 01:08:52,319
Also, only checking your data
before decryption is dangerous,
861
01:08:52,319 --> 01:08:56,429
and - both that and not checking the key
when you know that it could possibly
862
01:08:56,429 --> 01:09:00,609
be modified by an attacker
is a bad idea. And finally,
863
01:09:00,609 --> 01:09:05,099
secrets in hardware are great
unless you give them away, so…
864
01:09:05,099 --> 01:09:07,569
don’t do that! laughs
audience laughs
865
01:09:07,569 --> 01:09:11,309
Beyond that we just wanted to talk about
the state of Homebrew really quickly.
866
01:09:11,309 --> 01:09:15,488
You might recall, during the
Wii U talk around here
867
01:09:15,488 --> 01:09:19,828
2 years ago, fail0verflow said
that they didn’t necessarily think
868
01:09:19,828 --> 01:09:23,599
there was much of a future for console
Homebrew. And there’s definitely
869
01:09:23,599 --> 01:09:28,629
an argument for that with
the rise of phones, mostly.
870
01:09:28,629 --> 01:09:31,908
Anyone can make an app, can make
a game for any number of devices
871
01:09:31,908 --> 01:09:37,189
and sell it to millions of people.
But you know, we disagree.
872
01:09:37,189 --> 01:09:39,059
cheers and applause
873
01:09:39,059 --> 01:09:43,920
It’s been a year since we started
releasing 3DS homebrew. And
874
01:09:43,920 --> 01:09:47,788
– this is supposed to be moving,
but… let’s imagine it’s moving.
875
01:09:47,788 --> 01:09:52,489
Well, there’s, like, a bunch of
3DS Homebrew in there. It’s been awesome!
876
01:09:52,489 --> 01:09:56,200
We’ve been working on this really hard.
A lot of people have been joining us.
877
01:09:56,200 --> 01:10:01,570
It’s a great community effort. And
basically what I want to say is
878
01:10:01,570 --> 01:10:05,860
we want more developers.
So if you’d like to join us
879
01:10:05,860 --> 01:10:10,530
there is a very… well it’s not
very mature, but it’s maturing,
880
01:10:10,530 --> 01:10:15,130
our SDK. And you know what:
reverse-engineering hardware is fun.
881
01:10:15,130 --> 01:10:18,210
When we don’t have any documentation,
reverse-engineering software is fun.
882
01:10:18,210 --> 01:10:22,770
We can always use more reverse-engineers
and just people who want to make cool shit,
883
01:10:22,770 --> 01:10:28,999
so… Yeah, oh… right! Just one more thing.
884
01:10:28,999 --> 01:10:32,769
Lately there has been a wave
of patches by Nintendo,
885
01:10:32,769 --> 01:10:36,170
of known exploits, which
has been really annoying.
886
01:10:36,170 --> 01:10:40,479
So for our Browser Hacks, well,
yellows8’s Browser Hacks,
887
01:10:40,479 --> 01:10:45,150
menu hacks, stuff like that…
Yellows8’s been working pretty hard,
888
01:10:45,150 --> 01:10:49,199
so he actually brought back browser
hacks, it should have been released
889
01:10:49,199 --> 01:11:02,720
about 10 minutes ago.
laughter, applause
890
01:11:02,720 --> 01:11:07,849
But we also had ironhax for an
eShop game, a free eShop game,
891
01:11:07,849 --> 01:11:12,479
so you could just download it. That was
patched. The thing is, there’s actually
892
01:11:12,479 --> 01:11:16,650
a way to download the old version from
the eShop application with some patches.
893
01:11:16,650 --> 01:11:20,269
So we’re also releasing that right now!
So basically if you can get Homebrew
894
01:11:20,269 --> 01:11:23,889
and get on to the eShop
with a modified patch.
895
01:11:23,889 --> 01:11:27,539
That should also be released in about…
well, whenever this is done.
896
01:11:27,539 --> 01:11:31,239
So get it as soon as possible,
this is a free game, it will get you
897
01:11:31,239 --> 01:11:36,590
Homebrew forever. So just do that.
And also, yellows8 just released
898
01:11:36,590 --> 01:11:39,800
a new version of menuhax which
works on the latest firmware version.
899
01:11:39,800 --> 01:11:43,499
This was also patched like a couple of
weeks or months ago. So, this is all out
900
01:11:43,499 --> 01:11:48,099
right now. If you have a 3DS, get it.
If you have friends who have 3DS’s,
901
01:11:48,099 --> 01:11:53,749
well, tell them and tell them to get it.
Because it might not last super long.
902
01:11:53,749 --> 01:11:57,950
Yeah, so we would like to thank yellows8
who unfortunately cannot be here tonight
903
01:11:57,950 --> 01:12:01,800
but has been super helpful, has been
doing a ton of work on the 3DS.
904
01:12:01,800 --> 01:12:05,479
And honestly, a ton of this could
not have been done without him.
905
01:12:05,479 --> 01:12:08,639
And thanks to everyone on the
#3DSDEV Homebrew channel,
906
01:12:08,639 --> 01:12:11,909
everyone who is attending tonight.
Thanks for this.
907
01:12:11,909 --> 01:12:14,999
And if you have any questions,
I don’t think we have a lot of time,
908
01:12:14,999 --> 01:12:28,429
but we’ll accommodate. Thanks!
applause
909
01:12:28,429 --> 01:12:31,740
Herald: Thank you for your patience, if
you got questions, please come up front
910
01:12:31,740 --> 01:12:36,469
to these guys, because we have no more
time for structured Q&A. Thank you!
911
01:12:36,469 --> 01:12:41,400
postroll music
912
01:12:41,400 --> 01:12:47,499
Subtitles created by c3subtitles.de
in the year 2016. Join and help us!