1
00:00:00,000 --> 00:00:11,730
rC3 opening music
2
00:00:12,671 --> 00:00:17,070
Jiska: Hello everyone and welcome to my
talk, Fuzzing the phone in the iPhone. The
3
00:00:17,070 --> 00:00:22,270
phone in the iPhone is the component that
receives SMS, sends SMS, receives phone
4
00:00:22,270 --> 00:00:26,930
calls, makes phone calls and also manages
your Internet connection when you are not
5
00:00:26,930 --> 00:00:33,250
on Wi-Fi. However, you might now wonder,
what is it exactly? So I'm talking about
6
00:00:33,250 --> 00:00:39,510
CommCenter and fuzzing it via the QMI and
ARI interfaces. But this is a bit too
7
00:00:39,510 --> 00:00:44,350
technical for most of you. So I will first
introduce you to the concept of fuzzing in
8
00:00:44,350 --> 00:00:50,420
general and protocol fuzzing before I dive
into further details. For those of you
9
00:00:50,420 --> 00:00:54,890
who have not yet heard about the concept of
fuzzing - you can send a lot of random
10
00:00:54,890 --> 00:01:00,329
messages and then try to test the security
of an interface with this. And in this
11
00:01:00,329 --> 00:01:07,150
video, you can see how I send SMS over a
Frida-based fuzzer with something like 400
12
00:01:07,150 --> 00:01:12,140
fuzz cases per second. And then the IMH
receives them, catches them and sends a
13
00:01:12,140 --> 00:01:18,900
couple of them also to the smartphone.
Let's start with a motivation and an
14
00:01:18,900 --> 00:01:24,080
explanation of the attacker model. So, if
you look into a modern smartphone, you
15
00:01:24,080 --> 00:01:28,680
have two components if you want to show it
in a simple way. So first of all, there's
16
00:01:28,680 --> 00:01:34,030
the hardware part with a lot of chips. And
then on top of this, there is an operating
17
00:01:34,030 --> 00:01:39,590
system and applications. However, it's not
as simple as this because even those chips
18
00:01:39,590 --> 00:01:44,990
are so complex that they run their own
little real-time operating systems to
19
00:01:44,990 --> 00:01:51,190
preprocess data. So this means that you
can even get code execution on such a
20
00:01:51,190 --> 00:01:55,560
chip. And this is usually much easier than
in the operating system itself, because
21
00:01:55,560 --> 00:02:05,939
those chips cannot have that many
mitigations. However, so what do you even
22
00:02:05,939 --> 00:02:11,400
do if you have code execution in such a
chip, so if you are in a baseband chip,
23
00:02:11,400 --> 00:02:16,419
then one escalation strategy from the chip
towards the operating system might be to
24
00:02:16,419 --> 00:02:20,900
manipulate traffic in the browser.
However, I don't think that this is the
25
00:02:20,900 --> 00:02:26,670
case, because if you look at the Zerodium
price list, then actually the browser
26
00:02:26,670 --> 00:02:32,280
exploits are much more expensive. So it's
probably not done like this. And there
27
00:02:32,280 --> 00:02:40,022
must be other ways to escalate from this
chip into the operating system. In
28
00:02:40,022 --> 00:02:44,829
general, the traffic manipulation is
something that you can always do in
29
00:02:44,829 --> 00:02:50,249
wireless transmission or also on the
Internet. So if you look at how those systems
30
00:02:50,249 --> 00:02:54,129
work these days, so you have something
like the Internet in general that serves
31
00:02:54,129 --> 00:02:58,799
websites and so on, and also the core
network of your mobile provider. And there
32
00:02:58,799 --> 00:03:05,370
are many, many ways to manipulate traffic,
either if you are a state level actor who
33
00:03:05,370 --> 00:03:11,799
is able to have something in the core
network or just by sending around websites
34
00:03:11,799 --> 00:03:19,469
or modifying websites. And then there is
the base station subsystem, there might
35
00:03:19,469 --> 00:03:25,019
also be dragons. We don't know exactly.
And of course, there are over-the-air
36
00:03:25,019 --> 00:03:29,519
transmissions and wireless transmissions
are very special because, if there is
37
00:03:29,519 --> 00:03:33,799
something just slightly broken in the
encryption, for example, then it's also
38
00:03:33,799 --> 00:03:39,260
possible to manipulate traffic there, if
you have a software defined radio, for
39
00:03:39,260 --> 00:03:44,540
example. So all of this could be attacked
to manipulate traffic. And I don't think
40
00:03:44,540 --> 00:03:53,049
that for this, one would craft a baseband
exploit. Already in 2014 at the CCC, there
41
00:03:53,049 --> 00:04:00,140
have been two talks about the SS7 protocol
which is run in the core network and is
42
00:04:00,140 --> 00:04:05,370
actually meant to connect different
mobile carriers to each other. And this
43
00:04:05,370 --> 00:04:09,979
can also be used to intercept phone calls,
for example. And this also has been
44
00:04:09,979 --> 00:04:15,589
exploited recently. So even though, there
have been some mitigations, etc. since
45
00:04:15,589 --> 00:04:23,234
then, it's still exploited for the same
purpose to spy on people. So really,
46
00:04:23,234 --> 00:04:28,251
really, really, baseband exploits only
exist to escalate from the chip into the
47
00:04:28,251 --> 00:04:37,790
operating system. But now the question is,
what are the strategies? So if it's not
48
00:04:37,790 --> 00:04:43,460
via the browser, what else could it be? So
the browser really I'm sure it is not,
49
00:04:43,460 --> 00:04:47,410
because also you need to have some
traffic and so on, it doesn't really work
50
00:04:47,410 --> 00:04:51,937
instantly, you need to visit the website
to replace traffic on a website and so on.
51
00:04:51,937 --> 00:04:57,020
There must be something else. So
if you are on the chip with remote code
52
00:04:57,020 --> 00:05:01,550
execution and want to go into the
operating system, there is some interface.
53
00:05:01,550 --> 00:05:06,190
And this means that something in those
interfaces needs to be exploitable, so
54
00:05:06,190 --> 00:05:15,400
that you can escalate the privileges from
the chip into the system. And also, those
55
00:05:15,400 --> 00:05:19,230
interfaces are very interesting from a
reverse engineer's perspective. So even if
56
00:05:19,230 --> 00:05:24,561
you don't want to attack anything, just
understanding how they work, is also a
57
00:05:24,561 --> 00:05:30,000
goal of this work. So, for example, if you
have a baseband debug profile, you can
58
00:05:30,000 --> 00:05:33,600
just download it onto your iPhone and then
you open your iDevice syslog, you can
59
00:05:33,600 --> 00:05:38,850
already see a lot of management messages
that are exchanged between the chip and
60
00:05:38,850 --> 00:05:45,130
the iPhone. And if you have a jailbreak
and Frida, you can even inject packets or
61
00:05:45,130 --> 00:05:52,810
modify packets to change the behaviour of
your modem. But if you want to start to
62
00:05:52,810 --> 00:05:59,340
work on such a thing, the question is
like, how do you even start? Where do you
63
00:05:59,340 --> 00:06:03,370
start? And fuzzing is actually a method
that can be used to understand such an
64
00:06:03,370 --> 00:06:08,390
interface. So initially, if you identified
an interface, just to check if it is the
65
00:06:08,390 --> 00:06:13,110
correct interface, so, if it really
changes behaviour, if you flip some bytes,
66
00:06:13,110 --> 00:06:17,200
but also how powerful this interface is.
So what are the features? What breaks
67
00:06:17,200 --> 00:06:23,420
instantly? And if things break, also you
can check if the whole interface has been
68
00:06:23,420 --> 00:06:29,300
designed with security in mind. Now, let's
start with an introduction to wireless
69
00:06:29,300 --> 00:06:34,230
protocol fuzzing, this will also be a
short rant because the current tooling for
70
00:06:34,230 --> 00:06:39,960
fuzzing is usually not made to fuzz a
protocol. So let's start with a very
71
00:06:39,960 --> 00:06:44,461
simple fuzzer, a fuzzer that just targets an
image parser. So, you browse your
72
00:06:44,461 --> 00:06:49,620
smartphone for unicorn pictures or PNGs or
JPEGs, and then you send them to the image
73
00:06:49,620 --> 00:06:54,370
parser and in the image parser you might
be able to observe which functions are
74
00:06:54,370 --> 00:07:00,790
executed in the form of basic blocks. And
then, during this initialization, the
75
00:07:00,790 --> 00:07:05,400
image parser can even report which parts
were executed and you can just call the image parser
76
00:07:05,400 --> 00:07:12,320
again and again with different images and
get this basic block coverage back. In a
77
00:07:12,320 --> 00:07:18,639
next step, you can then combine existing
images or flip bits in these images and
78
00:07:18,639 --> 00:07:23,520
send them to the image parser and again
observe the coverage, most of the time, it
79
00:07:23,520 --> 00:07:28,450
won't generate any new coverage. So you
just say you are not looking into this
80
00:07:28,450 --> 00:07:33,550
image in particular, but sometimes you
might get new coverage, like here, and
81
00:07:33,550 --> 00:07:38,590
then you add this image to your corpus. So
over time, you can increase your corpus
82
00:07:38,590 --> 00:07:46,940
and increase your coverage. Another method
can be, if you know what exactly an image
83
00:07:46,940 --> 00:07:52,220
format looks like, so you might know the
JPEG specification and because of this,
84
00:07:52,220 --> 00:07:56,980
you could just generate images that are
more or less specification compliant and
85
00:07:56,980 --> 00:08:02,250
they look more artificial like this. So
you just generate images and send them to
86
00:08:02,250 --> 00:08:06,820
the image parser and at some point you
might observe a crash. So that also
87
00:08:06,820 --> 00:08:10,000
depends, again, on your harnessing. Maybe
you can observe basic blocks, maybe you
88
00:08:10,000 --> 00:08:18,620
can just observe crashes and then you know
at which image you had a crash. You might
89
00:08:18,620 --> 00:08:22,419
even be able to combine these two
approaches just depending on what you know
90
00:08:22,419 --> 00:08:29,270
about your input and how you can harness
your target. Now it looks a bit different
91
00:08:29,270 --> 00:08:35,149
for a protocol. So, in a protocol, you can
have a very complex state. Let's say you
92
00:08:35,149 --> 00:08:41,260
are in an active phone call or just
something like, you receive an SMS. You
93
00:08:41,260 --> 00:08:45,620
can actually force the iPhone to receive
SMS, if you have a second iPhone and send
94
00:08:45,620 --> 00:08:54,300
SMS. And then during the fuzzing, you can
replace some bits and bytes, like this and
95
00:08:54,300 --> 00:08:58,930
then you would have a modification. So
this is a very simple approach and it
96
00:08:58,930 --> 00:09:02,970
preserves the state. So no matter how
complex the thing is, that you're
97
00:09:02,970 --> 00:09:07,040
currently doing, it's very simple to flip
a bit here and there in an active
98
00:09:07,040 --> 00:09:12,610
interaction. But it's also a bit annoying,
because you need to have these active
99
00:09:12,610 --> 00:09:20,059
phone calls, etc. So something that's more
efficient is injection. So you would
100
00:09:20,059 --> 00:09:24,980
observe certain messages and then just send
them again - and then you don't even need
101
00:09:24,980 --> 00:09:29,959
the second phone to make calls, etc., -
you can just send a lot, a lot, a lot of
102
00:09:29,959 --> 00:09:34,420
data. And this is the effect, when your
iPhone goes di-di-di-di-dimm or something
103
00:09:34,420 --> 00:09:39,540
because of all the notifications and all
the data that is sent. But the issue here is,
104
00:09:39,540 --> 00:09:44,080
that this does not preserve state. So
there might be actions where the iPhone
105
00:09:44,080 --> 00:09:49,740
requests something that is then answered.
So, the iPhone might request, for example,
106
00:09:49,740 --> 00:09:54,699
a date and only then the chip would reply
with a date and only then the iPhone would
107
00:09:54,699 --> 00:10:00,420
accept a date. But it's still very
interesting to do this. So even though you
108
00:10:00,420 --> 00:10:03,421
cannot reach certain states because you
can do this without a SIM card and you can
109
00:10:03,421 --> 00:10:09,899
do this very, very fast. So, just to
summarize the issues here: if you fuzz the
110
00:10:09,899 --> 00:10:13,670
wireless protocol, you can have very
significant state differences and just
111
00:10:13,670 --> 00:10:21,740
injecting packets cannot reach all
states. The fact, that you cannot reach
112
00:10:21,740 --> 00:10:26,899
all states also shows in very simple stuff
like a trace replay. So a trace of
113
00:10:26,899 --> 00:10:30,500
something that you record. So let's say I
have an active phone call, I record all
114
00:10:30,500 --> 00:10:35,110
the packets, and I can also observe the
coverage. So, with Frida, you can observe
115
00:10:35,110 --> 00:10:42,079
coverage on an iPhone while the phone call
is active. And then, in a second step, you
116
00:10:42,079 --> 00:10:45,999
would do some injection. But the only
thing that you can inject are the packets
117
00:10:45,999 --> 00:10:51,329
sent from the baseband to the smartphone,
not the opposite direction. And this
118
00:10:51,329 --> 00:10:56,009
results usually in much less coverage. So
you are missing a lot of things due to a
119
00:10:56,009 --> 00:11:00,329
missing state. And even worse, if you do
the same thing again, you might be in a
120
00:11:00,329 --> 00:11:04,600
different state, and you might observe a
different coverage. So you do the exact
121
00:11:04,600 --> 00:11:13,910
same thing, but you get different
coverage. So, even replaying recorded
122
00:11:13,910 --> 00:11:22,209
messages results in less or inconsistent
coverage. Anyway, let's take a look into
123
00:11:22,209 --> 00:11:29,149
an injection example. So, in this video,
you can see how I'm in the Unicorn Network
124
00:11:29,149 --> 00:11:35,149
on an iPhone 8, which has obviously 5G,
but also does a lot of fuzzing and in the
125
00:11:35,149 --> 00:11:40,850
fuzzing, what is interesting is, that you
might do a lot of states in a combination
126
00:11:40,850 --> 00:11:45,370
that are not usually possible, like you
have a lost network connection while you
127
00:11:45,370 --> 00:11:51,690
have to confirm a pin or you have a
network connection during this, etc. To
128
00:11:51,690 --> 00:11:56,230
summarize my rant, some states cannot be
reached solely by injecting packets. So,
129
00:11:56,230 --> 00:12:02,249
even if we have a very good corpus and do
very good mutations, we might miss
130
00:12:02,249 --> 00:12:08,059
80% of the code, but we can just fuzz
anyway. But we need to keep in mind, that
131
00:12:08,059 --> 00:12:13,619
some stuff is just not fuzzable. We looked
into a lot of wireless protocols and have
132
00:12:13,619 --> 00:12:18,529
seen more in the past, so, it's worth
also considering which tooling we already
133
00:12:18,529 --> 00:12:24,220
had available for fuzzing protocols. The
most advanced tooling, that we have, is
134
00:12:24,220 --> 00:12:28,929
Frankenstein and it's built by Jan. So,
what Jan did is, he emulated the firmware
135
00:12:28,929 --> 00:12:34,490
and attached it to a virtual modem and
also a Linux host. For this, he first
136
00:12:34,490 --> 00:12:40,019
looked into the firmware, that's here, and
we had some partial symbols for this and
137
00:12:40,019 --> 00:12:47,050
also some information about registers.
Then, Frankenstein is actually taking a
138
00:12:47,050 --> 00:12:53,559
snapshot, that you can see here, including
some of those registers of the modem. And
139
00:12:53,559 --> 00:12:57,470
with this, you can build a virtual modem
and fuzz input as if it would come over
140
00:12:57,470 --> 00:13:02,889
the air. Then Frankenstein also emulates
the whole firmware, including thread
141
00:13:02,889 --> 00:13:08,139
switches. So it gets into very complex
states and it's even attached to a Linux
142
00:13:08,139 --> 00:13:15,140
host. So, it also fuzzes a bit of Linux
while actually fuzzing the firmware
143
00:13:15,140 --> 00:13:21,600
itself. Now, the issue with this is that
baseband firmware is usually 10 times the
144
00:13:21,600 --> 00:13:27,670
size of Bluetooth firmware or even more,
and we don't have any symbols for this, so
145
00:13:27,670 --> 00:13:34,009
it's a lot of work to customize this. And
even if, one would do all those steps and
146
00:13:34,009 --> 00:13:40,579
put all the work into this, it's only, so
to say, code execution in the baseband.
147
00:13:40,579 --> 00:13:47,589
It's not yet a privilege escalation into
the operating system. The next interesting
148
00:13:47,589 --> 00:13:52,439
tooling was built by Steffen and what
Steffen did, he built a fuzzer based on
149
00:13:52,439 --> 00:13:57,720
DTrace and AFL. DTrace is a tool that can
provide function-level coverage in the
150
00:13:57,720 --> 00:14:03,429
macOS kernel and user space. With some
modifications you can even get basic
151
00:14:03,429 --> 00:14:08,519
block coverage in the user space, which is
required for AFL to work. So, in the end,
152
00:14:08,519 --> 00:14:16,069
you have AFL or AFL++ as a fuzzer on any
program on macOS. It's even slightly
153
00:14:16,069 --> 00:14:20,899
faster than Frida, at least the version
that he used. And he gets a couple of
154
00:14:20,899 --> 00:14:27,290
thousand fuzz cases per second, even on a
very old iMac. So, in our lab, we just had
155
00:14:27,290 --> 00:14:33,999
an old iMac 2012 for this and it works on
this. But the issue is, that Wi-Fi and
156
00:14:33,999 --> 00:14:39,869
Bluetooth, which he fuzzed, are very complex
protocols, so he couldn't find any new
157
00:14:39,869 --> 00:14:45,939
bugs with AFL. And also, in the kernel
space, you only get this function level
158
00:14:45,939 --> 00:14:55,180
coverage. He still, despite not finding
any bugs in Wi-Fi or Bluetooth, got a CVE,
159
00:14:55,180 --> 00:15:03,050
because DTrace also has bugs. So, at least
some funding, but on iOS, this is not
160
00:15:03,050 --> 00:15:07,519
supported out of the box. So it might be
possible to get DTrace working with some
161
00:15:07,519 --> 00:15:12,279
tweaks, but it's a lot of work. So
probably it's easier to just use Frida in
162
00:15:12,279 --> 00:15:21,389
the iOS user space. Also during this, so
while Steffen was building all this very
163
00:15:21,389 --> 00:15:28,300
advanced tooling, Wang Yu found issues in
the macOS Bluetooth and Wi-Fi drivers, and
164
00:15:28,300 --> 00:15:35,480
so he was very, very successful in
comparison to us. That's really a pity.
165
00:15:35,480 --> 00:15:41,980
And I think, what he did, is much better
state modelling, so, of how the messages
166
00:15:41,980 --> 00:15:51,720
interact and what is important to reach
certain functions. So what is still left?
167
00:15:51,720 --> 00:15:57,899
So, usually fuzzing the baseband means
that you need to modify firmware or also
168
00:15:57,899 --> 00:16:02,600
emulate firmware, you need to implement
very complex specifications on a software
169
00:16:02,600 --> 00:16:07,796
defined radio if you want to fuzz over the
air or build proof of concepts. And for
170
00:16:07,796 --> 00:16:10,819
everything that's somewhat proprietary,
you need to do protocol reverse
171
00:16:10,819 --> 00:16:17,879
engineering, so you can spend a lot of
time and money just to do very, very basic
172
00:16:17,879 --> 00:16:24,750
research. Or, you can also use Frida, so
you can fuzz with Frida and all you need
173
00:16:24,750 --> 00:16:30,839
to do for this is, write a few lines of
code in JavaScript. So I kid you not. The
174
00:16:30,839 --> 00:16:37,799
option is Frida. Dennis was the first in
our team who was advised as a thesis
175
00:16:37,799 --> 00:16:43,049
student who built a Frida-based fuzzer,
and it's called ToothPicker. It's based on
176
00:16:43,049 --> 00:16:51,129
Frizzer and Radamsa. So what it does is,
well, it hooks into these connections or
177
00:16:51,129 --> 00:16:57,149
into the protocols of the Bluetooth
daemon, you could also think of this upper
178
00:16:57,149 --> 00:17:01,499
part here, as one block. So the protocols
are implemented in the Bluetooth daemon,
179
00:17:01,499 --> 00:17:08,050
but we want to fuzz certain protocol
handlers. And to increase the coverage, he
180
00:17:08,050 --> 00:17:13,430
creates a virtual connection. So a virtual
connection holds a connection and pretends
181
00:17:13,430 --> 00:17:18,360
to the Bluetooth daemon that there would
be an active connection to a device. And
182
00:17:18,360 --> 00:17:21,410
of course, the chip would then say, I
don't know anything about this connection.
183
00:17:21,410 --> 00:17:26,000
So, there are also some abstractions in
here, so that the connection is not
184
00:17:26,000 --> 00:17:34,070
terminated. So, that's a very simple tool,
but it really found a lot of bugs and
185
00:17:34,070 --> 00:17:39,780
issues and even there were some issues in
the protocols themselves that also apply to
186
00:17:39,780 --> 00:17:46,030
macOS. So it's not just iOS bugs, but also
protocol bugs in macOS that Dennis found.
187
00:17:46,030 --> 00:17:50,910
And this really got me thinking,
because ToothPicker runs with only 20
188
00:17:50,910 --> 00:17:56,310
fuzz cases per second, so it's really,
really slow and we were still able to find
189
00:17:56,310 --> 00:18:04,140
Bluetooth vulnerabilities at this speed.
So, why is this? So first of all, if you
190
00:18:04,140 --> 00:18:08,130
try to fuzz Bluetooth over the air, then
the over-the-air connections are
191
00:18:08,130 --> 00:18:13,620
terminated after something like five
invalid packets. So, over-the-air fuzzing
192
00:18:13,620 --> 00:18:18,690
is really, really inefficient. And with
Frida you can actually patch these
193
00:18:18,690 --> 00:18:23,100
functions, so it's gone. Then the
virtual connections are a very important
194
00:18:23,100 --> 00:18:32,120
factor. So they are really, really
important for having coverage. It's still
195
00:18:32,120 --> 00:18:37,030
a lot of coverage that we missed during
replay and fuzzing. But it's
196
00:18:37,030 --> 00:18:41,470
really an advantage compared to the
other fuzzing approaches where you just
197
00:18:41,470 --> 00:18:47,270
inject packets. And in addition, there is
an issue here, because if you have a
198
00:18:47,270 --> 00:18:51,480
virtual connection, it might be that this
virtual connection triggers behaviour,
199
00:18:51,480 --> 00:18:55,910
that you cannot reproduce over the air.
So, that means that everything that you
200
00:18:55,910 --> 00:19:01,851
find, you need also to confirm that it
works over the air. At least the
201
00:19:01,851 --> 00:19:05,760
inconsistent coverage is also fixed in
ToothPicker, because ToothPicker
202
00:19:05,760 --> 00:19:10,860
replays all packets five times in a row.
But the issue here is that it also means
203
00:19:10,860 --> 00:19:17,020
that if you have a sequence of packets,
that is like generating a certain bug -
204
00:19:17,020 --> 00:19:21,550
so you need multiple packets - this is
nothing that the mutator is aware of and
205
00:19:21,550 --> 00:19:29,060
also nothing that's logged properly in
ToothPicker. And because of this, I got a
206
00:19:29,060 --> 00:19:33,820
bit anxious. Maybe we missed a
lot of things? So once I got the
207
00:19:33,820 --> 00:19:38,100
intuition that we are actually missing
certain state information, I had the idea
208
00:19:38,100 --> 00:19:44,060
to replace bytes in active connections.
And this is one part of it that you can see
209
00:19:44,060 --> 00:19:52,000
on a keyboard, so I'm just replacing bytes
on keyboard input and see what happens.
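[Editor's sketch, not part of the talk: replacing bytes in an active connection can be pictured as a small mutation applied to every packet that passes through a hook. The function name `replace_bytes`, the `rate` parameter and the random source are assumptions made for illustration. The actual fuzzer hooks the traffic with Frida, which is not shown here.]

```python
import random
from typing import Optional


def replace_bytes(packet: bytes, rate: float = 0.05,
                  rng: Optional[random.Random] = None) -> bytes:
    """Return a copy of `packet` in which each byte is replaced by a
    random byte with probability `rate`.

    The packet length never changes, so the surrounding connection
    state (an ongoing call, a paired keyboard, ...) is left intact.
    """
    rng = rng or random.Random()
    mutated = bytearray(packet)
    for i in range(len(mutated)):
        if rng.random() < rate:
            mutated[i] = rng.randrange(256)
    return bytes(mutated)


if __name__ == "__main__":
    # Hypothetical payload, only for demonstration purposes.
    original = bytes.fromhex("a1010000040000000000")
    print(replace_bytes(original, rate=0.2).hex())
```

[A hook would call such a function on each intercepted payload and forward the result in place of the original. Passing a seeded `random.Random` makes a run repeatable, which helps when a crash has to be reproduced.]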
210
00:19:52,000 --> 00:19:59,730
And I let this run for a couple of weeks,
also for different protocols and so on to
211
00:19:59,730 --> 00:20:08,680
see, if there are further bugs or not that
we didn't find previously. So here you
212
00:20:08,680 --> 00:20:13,450
can see the same for AirPods with SCO and
then they produce crack-sounds for the
213
00:20:13,450 --> 00:20:18,510
replaced bytes. It's even worse for ACL, so
actual music, because then you can hear
214
00:20:18,510 --> 00:20:25,230
very noisy chirps. I let this fuzzer run
for multiple weeks and it didn't find
215
00:20:25,230 --> 00:20:30,070
any bugs that ToothPicker hadn't
discovered before. So, I think the reason
216
00:20:30,070 --> 00:20:35,280
for this is that I mainly fuzzed in active
connections like the one with the audio
217
00:20:35,280 --> 00:20:40,000
or the keyboard, but I only fuzzed a few
active pairings because this requires me
218
00:20:40,000 --> 00:20:48,310
to actually perform those pairings by
hand, so, nothing really interesting. The
219
00:20:48,310 --> 00:20:52,280
only bad thing that I could produce with
this, but not worth a CVE, is that the
220
00:20:52,280 --> 00:20:59,810
sound quality of my AirPods is now
really, really bad. Well, OK. And also the
221
00:20:59,810 --> 00:21:07,740
Broadcom chips on iOS don't check the UART
lengths, but that's not that bad. So, I
222
00:21:07,740 --> 00:21:13,090
mean, if you consider that they removed
the write-RAM recently, then you might now
223
00:21:13,090 --> 00:21:20,990
still be able to write into the RAM via UART
buffer overflows. But yeah, nothing too
224
00:21:20,990 --> 00:21:27,700
interesting. So after all of this, I asked
myself: "What is still left for fuzzing if
225
00:21:27,700 --> 00:21:33,210
we cannot find any new Bluetooth or Wi-Fi
bugs?" Well, the iPhone baseband - or
226
00:21:33,210 --> 00:21:39,190
actually the iPhone basebands, because
there are two. The first variant of iPhone
227
00:21:39,190 --> 00:21:44,770
baseband, that you can get, are Qualcomm
chips and they are in the US devices. They
228
00:21:44,770 --> 00:21:49,980
use the Qualcomm MSM Interface (QMI). And this
interface comes with some documentation
229
00:21:49,980 --> 00:21:55,460
and there are even open source
implementations for it. So it's something
230
00:21:55,460 --> 00:22:03,390
that's probably easy to understand and
easy to fuzz. On the other hand in almost
231
00:22:03,390 --> 00:22:09,240
all devices that I had on my table, there were
Intel chips. Intel has been recently
232
00:22:09,240 --> 00:22:15,210
bought by Apple, at least the part that
does the baseband chips and these are the
233
00:22:15,210 --> 00:22:18,700
chips in the European devices, that's
the reason why almost all my devices had
234
00:22:18,700 --> 00:22:22,870
Intel chips. And they use a special
protocol. It's called Apple Remote
235
00:22:22,870 --> 00:22:26,900
Invocation. And if you search for this on
the Internet, I even checked it like
236
00:22:26,900 --> 00:22:32,350
just today, there are no Google hits at
all. So it really hasn't been researched
237
00:22:32,350 --> 00:22:36,870
before, at least not publicly. It's
completely undocumented and it's a very
238
00:22:36,870 --> 00:22:41,120
custom interface. So it's not even used
for Android. It's really an interface
239
00:22:41,120 --> 00:22:47,410
just for Apple. The component that we are
going to fuzz in the following is CommCenter.
240
00:22:47,410 --> 00:22:53,040
So CommCenter is the equivalent of, for
example, the Bluetooth or Wi-Fi daemon,
241
00:22:53,040 --> 00:22:58,600
but for telephony. It's sandboxed as the
user "wireless", but it comes with a lot of
242
00:22:58,600 --> 00:23:02,760
XPC interfaces. And this is something
that we will also see later in the
243
00:23:02,760 --> 00:23:11,121
fuzzing results. The next part is that
there are two flavors of libraries, so
244
00:23:11,121 --> 00:23:15,380
depending on if you have a Qualcomm or an
Intel chip, different libraries will be
245
00:23:15,380 --> 00:23:21,680
used before certain actions or data
actually is then processed by the
246
00:23:21,680 --> 00:23:28,770
CommCenter itself. So we have different
code paths here. But all of this runs in
247
00:23:28,770 --> 00:23:34,250
user space, and this means that both
libraries can be hooked with Frida and can
248
00:23:34,250 --> 00:23:38,030
be fuzzed with Frida. So that's very
interesting. There is still a lot of stuff
249
00:23:38,030 --> 00:23:44,970
that goes on in the kernel. So what you
can see here is that QMI and ARI have some
250
00:23:44,970 --> 00:23:49,820
management information that is sent to
CommCenter, but they don't contain the
251
00:23:49,820 --> 00:23:54,740
raw network or audio data. So they don't
contain your phone call, they don't
252
00:23:54,740 --> 00:24:03,310
contain your website that you are opening.
And the next issue is that QMI and ARI
253
00:24:03,310 --> 00:24:07,860
are not directly sent over the air, but
what is sent over the air are normal
254
00:24:07,860 --> 00:24:14,500
baseband interactions and these generate
QMI and ARI messages. So there's still
255
00:24:14,500 --> 00:24:19,550
some section in between, but of course,
there are now two ways: either you have
256
00:24:19,550 --> 00:24:24,580
interaction that you can do over the air,
that is causing ARI and QMI messages
257
00:24:24,580 --> 00:24:32,260
directly, which then cause an
issue in the upper layers. Or you might
258
00:24:32,260 --> 00:24:36,720
have this full exploit chain requirement
that you first need to exploit the chip
259
00:24:36,720 --> 00:24:44,390
over the air, and then from the chip
break the interface into the CommCenter.
260
00:24:44,390 --> 00:24:51,970
Now, QMI, the code has a lot of
assertions. So it's really asserting
261
00:24:51,970 --> 00:25:00,810
everything about a protocol, the length, the
TLV format and so on, and if anything goes
262
00:25:00,810 --> 00:25:06,240
wrong, it really terminates CommCenter.
So if you just send one invalid packet,
263
00:25:06,240 --> 00:25:11,500
CommCenter is terminated. This doesn't
matter a lot because if your protocol is
264
00:25:11,500 --> 00:25:15,510
stable and you usually don't send any
invalid packets, then you know an attack
265
00:25:15,510 --> 00:25:21,400
is ongoing, so it's valid to terminate
the CommCenter. And furthermore, it
266
00:25:21,400 --> 00:25:25,190
doesn't matter a lot to the user. So the
worst thing that happens when CommCenter
267
00:25:25,190 --> 00:25:28,770
crashes, for example, while you have an
active phone call, it's just that the
268
00:25:28,770 --> 00:25:33,700
phone call gets lost or your LTE
connection is re-established. So you don't
269
00:25:33,700 --> 00:25:40,340
really notice it. It just feels like your
Internet connection breaks for a short
270
00:25:40,340 --> 00:25:46,630
moment. In contrast, there is the ARI
protocol, and this is the part that just
271
00:25:46,630 --> 00:25:51,370
works very, very, very differently. So
whatever it's getting, it just parses it,
272
00:25:51,370 --> 00:25:56,680
and it doesn't terminate CommCenter.
So you can send many, many,
273
00:25:56,680 --> 00:25:59,930
many fancy things and it just
continues, continues, continues,
274
00:25:59,930 --> 00:26:04,210
because the developers were probably very,
very happy once they got their special
275
00:26:04,210 --> 00:26:10,970
protocol for Apple working and then they
never touched it again. But what does it
276
00:26:10,970 --> 00:26:18,111
look like? So it has a very basic format,
also with some TLVs, and the first
277
00:26:18,111 --> 00:26:24,250
thing that I noticed when I fuzzed it is
that in the iDevice syslog, it always
278
00:26:24,250 --> 00:26:28,830
complained about this sequence number
being wrong. So it just said I expected
279
00:26:28,830 --> 00:26:35,520
the follow-up sequence number, so and so.
So I started to fix this. And if you open
280
00:26:35,520 --> 00:26:38,970
it in IDA, you can see that the range
that is expected is between zero and
281
00:26:38,970 --> 00:26:47,100
0x7ff hexadecimal. So you know it is
the range and then it gets weird. So the
282
00:26:47,100 --> 00:26:51,630
sequence number is spread over three
different bytes in single bits and
283
00:26:51,630 --> 00:26:56,850
shifted around and so on. And it's not
even continuous. So very weird code.
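To make the idea of an 11-bit sequence number scattered over three bytes more concrete, here is a minimal sketch. The bit positions in FRAGMENTS are purely hypothetical, since the talk does not give the real ARI layout, and the wrap-around in next_seq is also an assumption. Only the value range 0 to 0x7ff comes from the talk.

```python
# Hypothetical fragment layout: (byte index, bit shift, bit width), low bits first.
# These positions are invented for illustration; they are NOT the real ARI layout.
FRAGMENTS = [(0, 5, 3), (1, 0, 4), (2, 2, 4)]  # 3 + 4 + 4 = 11 bits
SEQ_MAX = 0x7FF  # upper bound of the expected range mentioned in the talk


def put_seq(header: bytearray, seq: int) -> None:
    """Scatter an 11-bit sequence number into the header, leaving other bits alone."""
    if not 0 <= seq <= SEQ_MAX:
        raise ValueError("sequence number out of range")
    for idx, shift, width in FRAGMENTS:
        mask = (1 << width) - 1
        cleared = header[idx] & ~(mask << shift) & 0xFF
        header[idx] = cleared | ((seq & mask) << shift)
        seq >>= width


def get_seq(header: bytes) -> int:
    """Gather the scattered fragments back into one integer."""
    seq, pos = 0, 0
    for idx, shift, width in FRAGMENTS:
        mask = (1 << width) - 1
        seq |= ((header[idx] >> shift) & mask) << pos
        pos += width
    return seq


def next_seq(seq: int) -> int:
    """Follow-up sequence number; wrapping at 0x7ff is an assumption."""
    return (seq + 1) & SEQ_MAX
```

A fuzzer that wants to fix the sequence number would read it with get_seq from the last packet and write next_seq of it into the mutated packet with put_seq.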
284
00:26:56,850 --> 00:27:02,000
Probably they just added those
sequence numbers to confirm some race
285
00:27:02,000 --> 00:27:06,750
conditions or something. I really don't
know. Or out-of-order packets? Something
286
00:27:06,750 --> 00:27:12,620
weird going on there. But I wrote the
code, I fixed the sequence number and
287
00:27:12,620 --> 00:27:17,710
then during the replay of packets, I
noticed, well, it doesn't even matter! So
288
00:27:17,710 --> 00:27:22,470
no matter if your sequence number is valid
or invalid, parsing continues and even
289
00:27:22,470 --> 00:27:28,000
worse, even packets with a wrong sequence
number are parsed. Probably because
290
00:27:28,000 --> 00:27:31,350
otherwise there would be too many issues,
because the protocol implementation is too
291
00:27:31,350 --> 00:27:36,350
buggy. And there are also a couple of
other things, so, for example, if you send
292
00:27:36,350 --> 00:27:41,190
the first four magic bytes wrong or a
wrong length or something, then the
293
00:27:41,190 --> 00:27:47,410
packet is potentially ignored. But parsing
continues and CommCenter is not terminated
294
00:27:47,410 --> 00:27:53,530
like in QMI. Since it's a proprietary
protocol, there is currently no tooling
295
00:27:53,530 --> 00:27:57,770
available. But Tobias is working on a
Wireshark dissector and once he finishes
296
00:27:57,770 --> 00:28:02,440
his thesis, it will also be publicly
released. So you need to wait a while, but
297
00:28:02,440 --> 00:28:10,770
then you will have a tool for this.
Anyway, let's also talk about fuzzing
298
00:28:10,770 --> 00:28:16,680
this, so I would not recommend fuzzing
this, because you might brick your device
299
00:28:16,680 --> 00:28:21,010
or at least get into weird states. So
just don't do this on your productive
300
00:28:21,010 --> 00:28:30,560
iPhone. I mean, obviously, I know what
I'm doing, so, yeah, just fuzzing packets,
301
00:28:30,560 --> 00:28:36,620
right? But I'm not so sure about what
exactly I'm doing, so the only direction
302
00:28:36,620 --> 00:28:43,990
that I fuzz is from the baseband to the
iPhone here, not the opposite direction.
303
00:28:43,990 --> 00:28:50,110
So I hopefully do prevent anything weird
on the chip, right? But the iPhone might
304
00:28:50,110 --> 00:28:56,990
still answer with something invalid and
this might confuse the baseband or cause
305
00:28:56,990 --> 00:29:04,271
other crashes. And so I actually had to
call for help, like mimimimimi, I broke my
306
00:29:04,271 --> 00:29:08,340
iPhone - I mean, just one of my research
devices - but still. So it booted into
307
00:29:08,340 --> 00:29:14,640
pongoOS but no longer into iOS and it
didn't tell me any debug message that was
308
00:29:14,640 --> 00:29:19,540
useful. Well, it turns out, at least
under Qualcomm chips, and that's where
309
00:29:19,540 --> 00:29:25,590
this happens, it just boots after a
couple of hours again. But before that, it's
310
00:29:25,590 --> 00:29:31,760
just entering a boot loop. And on the
Intel iPhones I also almost bricked an
311
00:29:31,760 --> 00:29:37,620
iPhone 8, but luckily it didn't
completely break. So the issue there is if
312
00:29:37,620 --> 00:29:42,640
you enable the baseband debug profile,
then it writes a lot of stuff to the ISTP
313
00:29:42,640 --> 00:29:49,280
files, so that is some debug format of
Intel, and every few minutes it just
314
00:29:49,280 --> 00:29:53,470
creates something like 500MB of data, at
least on the iPhone 8. On the newer
315
00:29:53,470 --> 00:29:57,650
iPhones, this debug format is a bit
shorter, so it doesn't create as much
316
00:29:57,650 --> 00:30:02,360
data, but still a lot. And if you don't
delete this regularly, then of course
317
00:30:02,360 --> 00:30:07,590
your disk will be full and an iPhone
behaves quite strangely if it has a full
318
00:30:07,590 --> 00:30:11,860
disk. So you can still interact with the
user interface, but you can no longer
319
00:30:11,860 --> 00:30:18,970
delete photos because deleting a photo, it
seems, it just needs some file
320
00:30:18,970 --> 00:30:25,060
interaction. Also, you can no longer log
in with SSH, which is also an issue
321
00:30:25,060 --> 00:30:29,370
because it somehow seems to create a file
when logging in, so you can no longer
322
00:30:29,370 --> 00:30:36,300
delete any files. And I was just
rebooting the iPhone after trying a couple
323
00:30:36,300 --> 00:30:41,170
of things and luckily it came back and
deleted some files and I was able to log
324
00:30:41,170 --> 00:30:48,460
in and remove the baseband logs. But be
careful when doing this. And of course,
325
00:30:48,460 --> 00:30:52,600
all the iPhones are very confused by
the fuzzing. So they really lose
326
00:30:52,600 --> 00:30:57,280
everything about their identity and
location and they want to be activated
327
00:30:57,280 --> 00:31:02,540
again. So here you can see a smartphone
that lost its location and really wants
328
00:31:02,540 --> 00:31:08,080
to be activated, activated, activated.
During SMS fuzzing, you might even get
329
00:31:08,080 --> 00:31:12,480
Flash messages. And if you click on the
head menu on dark theme, they are
330
00:31:12,480 --> 00:31:18,260
displayed black on gray, so probably
nobody ever tested it. Also great if you
331
00:31:18,260 --> 00:31:22,559
have a locked iPhone, you can still
display SIM menus and SIM messages on top
332
00:31:22,559 --> 00:31:30,770
of the lock. OK, so I guess I have to
revise my first instruction. So fuzz this!
333
00:31:30,770 --> 00:31:36,500
Really, really fuzz this! It's a lot of
fun. Maybe just not on your primary
334
00:31:36,500 --> 00:31:43,640
device, but you will enjoy fuzzing these
interfaces. But first of all, you
335
00:31:43,640 --> 00:31:50,460
obviously need to build a fuzzer, so how
do you build a fuzzer? The first fuzzer
336
00:31:50,460 --> 00:31:54,680
that I used was the one that I also used
for Bluetooth that just uses the
337
00:31:54,680 --> 00:32:00,510
existing bytestream protocol and then
flips single bits and bytes. So it has
338
00:32:00,510 --> 00:32:03,950
this high state-awareness. But it also
means that like some kind of monkey I was
339
00:32:03,950 --> 00:32:09,950
just calling myself, writing SMS to
myself, enabling flight mode, everything
340
00:32:09,950 --> 00:32:15,350
that you could just imagine. And it's a
very boring task. But it also found very
341
00:32:15,350 --> 00:32:19,680
fancy bugs that I couldn't reproduce with
the other fuzzers yet, because it can
342
00:32:19,680 --> 00:32:26,080
reach states that just injection of
packets cannot reach. So at least it was
343
00:32:26,080 --> 00:32:33,740
quite successful. And I fuzzed with
this for something like three days and
344
00:32:33,740 --> 00:32:37,341
already found bugs, that's very
different from the Bluetooth fuzzers, so
345
00:32:37,341 --> 00:32:41,559
there seemed to be more bugs in
CommCenter. And so I just wrote to Apple
346
00:32:41,559 --> 00:32:46,560
PR: "Hey there, I wrote this really,
really ugly 10-lines-of-code fuzzer and
347
00:32:46,560 --> 00:32:52,130
see what it found. Awesome, awesome,
awesome! And crash logs are attached. And
348
00:32:52,130 --> 00:32:56,240
obviously this is simple to reproduce
because I only fuzzed for three days. Got
349
00:32:56,240 --> 00:33:01,840
most of these crashes multiple times.
Yeah. So here you go. Enjoy my fuzzer."
350
00:33:01,840 --> 00:33:07,220
And this was probably quite
stupid because it's not that simple. So
351
00:33:07,220 --> 00:33:12,190
it's really not easy to reproduce the
crashes. First of all, well, of course
352
00:33:12,190 --> 00:33:17,430
this script is so generic that it runs on
all iPhones with an Intel chip, so no
353
00:33:17,430 --> 00:33:24,390
matter if I take an iPhone 7 or an iPhone
11, it will just work. But the crash logs
354
00:33:24,390 --> 00:33:29,070
that you get are very different depending
on if you fuzz on a pre-A12, so iPhone 7
355
00:33:29,070 --> 00:33:34,800
and 8, or on later versions like the iPhone 11
and SE2. So you cannot reproduce the same
356
00:33:34,800 --> 00:33:40,460
crash logs that easily. And also it depends
a lot on the SIM. So even on a passive
357
00:33:40,460 --> 00:33:44,990
iPhone, if you don't do any phone calls
and so on, you would get different
358
00:33:44,990 --> 00:33:51,720
results. So I started my fuzzing actually
with a Singaporean SIM card
359
00:33:51,720 --> 00:33:57,300
without any data contract or phone
contract on top of it and already found a
360
00:33:57,300 --> 00:34:05,120
couple of things. But it might just
behave very differently on just a slightly
361
00:34:05,120 --> 00:34:12,540
different configuration. Anyway, let's
listen to a null pointer that it found. And
362
00:34:12,540 --> 00:34:16,490
this null pointer has been fixed in iOS
14.2 and it's in the audio controller, so
363
00:34:16,490 --> 00:34:26,050
you can hear some loop going on there.
What you can see here is me calling the
364
00:34:26,050 --> 00:34:30,350
Deutsche Telekom and so on. So they have
this very important text.
365
00:34:30,350 --> 00:34:35,369
Announcement: Guten Tag, und herzlich
willkommen beim Kundenservice der Telekom.
366
00:34:35,369 --> 00:34:41,015
Jiska: And then I call again and have a
crash. And now let's listen to the crash.
367
00:34:43,928 --> 00:34:48,141
Telekom jingle starts playing,
final part loops ten times
368
00:34:51,182 --> 00:34:55,520
Jiska: Just for the sound effect, I also recorded
another one, so this one is with ALDI TALK.
369
00:34:57,981 --> 00:35:02,522
Announcement: Guten Tag, ALDI TALK gibt
die Senkung der Mehrwertsteuer vom ersten...
370
00:35:05,352 --> 00:35:08,174
Jiska: And now let's listen to a special
offer by ALDI TALK.
371
00:35:09,186 --> 00:35:10,873
In 3, 2, 1... di-dimm...
372
00:35:10,873 --> 00:35:14,953
Announcement: Guten Tag, ALDI TALK gibt die
Senkung der Mehrwertsteuer vom
373
00:35:14,953 --> 00:35:18,026
loops ten times
erst-erst-erst-erst-erst-erst-erst-erst-erst-er
374
00:35:23,210 --> 00:35:28,500
Jiska: Since these first fuzzing results
were very promising, I decided to use
375
00:35:28,500 --> 00:35:33,170
the latest ToothPicker version and extend
it for fuzzing ARI and I called it
376
00:35:33,170 --> 00:35:38,410
ICEPicker because the Intel chips are also
called ICE. So I just cloned Dennis'
377
00:35:38,410 --> 00:35:43,840
latest ToothPicker alpha, which is very,
very unstable, but this one actually
378
00:35:43,840 --> 00:35:49,530
runs on the iPhone locally without any
interaction with macOS or Linux. So it
379
00:35:49,530 --> 00:35:54,980
doesn't need to exchange any payload
via USB and also it's using AFL++, which
380
00:35:54,980 --> 00:36:02,190
is a much faster mutator than Radamsa.
So from a speed consideration, this is a
381
00:36:02,190 --> 00:36:08,036
much better design. However, AFL++ didn't
turn out to be the best fuzzer for
382
00:36:08,036 --> 00:36:12,860
protocols, so most of the time is actually
spent trying to brute force the first
383
00:36:12,860 --> 00:36:17,250
magic bytes, the first four bytes, because
it tries to shorten inputs. It's also not
384
00:36:17,250 --> 00:36:22,200
aware of something like a packet order, so
it was just brute forcing those first four
385
00:36:22,200 --> 00:36:28,860
bytes. And well, the next issue is, that
for some reason, if the first four bytes
386
00:36:28,860 --> 00:36:33,770
are invalid, the ARI parser slows down a
lot. So I was suddenly down to something
387
00:36:33,770 --> 00:36:40,790
like less than 10 fuzz cases per second.
And also there is no awareness in
388
00:36:40,790 --> 00:36:46,390
ICEPicker in this case, of the ARI host
state. So ARI sometimes shuts down this
389
00:36:46,390 --> 00:36:51,560
interface, if it thinks that something is
very invalid and the fuzzer will just
390
00:36:51,560 --> 00:36:57,230
continue. So I looked into the iDevice
syslog after the fuzzer couldn't find any
391
00:36:57,230 --> 00:37:01,340
new coverage for more than six hours.
And I was wondering: "What is the
392
00:37:01,340 --> 00:37:07,520
issue here? Is the implementation
wrong or is it the fuzzer?" And it really
393
00:37:07,520 --> 00:37:12,990
looks like the fuzzer is producing inputs
that are not good for protocol fuzzing.
394
00:37:12,990 --> 00:37:20,260
Of course, this is stuff that you can
optimize, so AFL++ can do a lot here, so
395
00:37:20,260 --> 00:37:25,690
you can tell it a bit what the protocol
looks like and also get it to not brute
396
00:37:25,690 --> 00:37:30,410
force the first four magic bytes. But for
this I would have to recompile the whole
397
00:37:30,410 --> 00:37:35,530
thing. And it was something that compiled
on Dennis' machine, but it didn't compile
398
00:37:35,530 --> 00:37:40,140
on my machine, because I had my Xcode
beta in a weird state. And well, of
399
00:37:40,140 --> 00:37:44,664
course, some of you now say:
"Just download and install a new Xcode!"
400
00:37:44,664 --> 00:37:49,526
But this takes so long that actually
writing the next fuzzer seemed to be
401
00:37:49,526 --> 00:37:56,500
easier. Still, this variant of ICEPicker
was interesting to me because it was the
402
00:37:56,500 --> 00:37:59,890
first time when I saw that the fuzzer
initialization works, including
403
00:37:59,890 --> 00:38:07,000
coverage and also my replay works across
multiple iPhone versions. So a call that was
405
00:38:07,000 --> 00:38:14,320
collected on an iPhone SE2 was replayable
on an iPhone 7. So it was not useless in
405
00:38:14,320 --> 00:38:22,060
that sense, but I just decided to not
use this configuration. So I just wrote a
406
00:38:22,060 --> 00:38:27,010
very simple fuzzer again and I didn't do
the porting of everything to run locally
407
00:38:27,010 --> 00:38:32,720
on iOS. I just kept the design a bit
simpler or at least easier to code and had
408
00:38:32,720 --> 00:38:39,119
my fuzzer running on Linux and then using
only Frida on iOS. It cannot reproduce all
409
00:38:39,119 --> 00:38:43,450
the states and crashes that I observed
with my very first fuzzer, but most
410
00:38:43,450 --> 00:38:50,630
crashes could be reproduced. I didn't do
any coverage. I didn't do any smart
411
00:38:50,630 --> 00:38:56,720
mutations, just very stupid mutations. And
basically I just did a very blind
412
00:38:56,720 --> 00:39:00,720
injection. But this was super fast, so
instead of the 20 fuzz cases per second, I
413
00:39:00,720 --> 00:39:06,370
already had something like 400 fuzz cases
per second on an iPhone 7, which was about
414
00:39:06,370 --> 00:39:14,280
the same speed or even faster than the
AFL++ variant. And I can at least correct
415
00:39:14,280 --> 00:39:21,340
the length field, sequence number and so
on before injecting the payload. Since it
416
00:39:21,340 --> 00:39:26,340
doesn't do that great mutations, at
least, I need to collect a good corpus
417
00:39:26,340 --> 00:39:31,995
with many SMSs, many calls. And I'm also
logging the packet order with this. So
418
00:39:31,995 --> 00:39:36,250
it's at least aware of a packet sequence
in the sense of, I can reproduce the
419
00:39:36,250 --> 00:39:43,200
sequence later on. I had this fuzzer
running on a couple of iPhones in
420
00:39:43,200 --> 00:39:50,060
parallel for multiple weeks, and it found
a lot of interesting crashes. So that's
421
00:39:50,060 --> 00:39:57,320
my go-to fuzzer. I still wanted to
confirm that not collecting coverage
422
00:39:57,320 --> 00:40:02,160
wasn't an issue, so I also cloned the
publicly released version of ToothPicker, which
423
00:40:02,160 --> 00:40:07,010
definitely finds new coverage, and it's
using the Radamsa-mutator, which is very,
424
00:40:07,010 --> 00:40:14,460
very slow, but it does a bit smarter
mutations, at least in terms of protocol
425
00:40:14,460 --> 00:40:20,220
fuzzing. It's still only aware of
single packets and it's only using the
426
00:40:20,220 --> 00:40:26,330
same packets five times in a row to
confirm coverage, etc. And also an issue
427
00:40:26,330 --> 00:40:30,820
is that it cannot catch a lot of the
crashes of CommCenter. So it happens
428
00:40:30,820 --> 00:40:35,600
quite often that CommCenter crashes. And
then if you cannot catch the crash with
429
00:40:35,600 --> 00:40:40,080
Frida and everything crashes, then you
need to start the fuzzer again. But you
430
00:40:40,080 --> 00:40:42,990
also need to delete the files in the
corpus that led to the crash because
431
00:40:42,990 --> 00:40:47,869
otherwise you would just run into the same
crash very fast. So it needs a lot of
432
00:40:47,869 --> 00:40:55,550
babysitting. I also had it running for a
couple of weeks, but sadly, it didn't find
433
00:40:55,550 --> 00:41:00,070
any crashes. So at least I can be sure
that fuzzing, much slower, but with
434
00:41:00,070 --> 00:41:07,030
coverage, is not any improvement. Still,
the mutations it creates are quite useful,
435
00:41:07,030 --> 00:41:11,550
as you can see in the following. So you
can even see this phone number scrolling
436
00:41:11,550 --> 00:41:18,678
here and so on. So it generated a very
long phone number correctly into some TLV
437
00:41:18,678 --> 00:41:23,006
structure here. And that's quite
interesting to see. So this is something
438
00:41:23,006 --> 00:41:27,900
that you could not reach by just
flipping bits and bytes.
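For contrast, the blind bit and byte flipping mentioned here can be sketched in a few lines. This is only an illustration under assumptions, not the fuzzer from the talk: the function name mutate is made up, and the only protocol knowledge used is that the first four bytes are magic bytes, which are left untouched so that packets are not dropped for that reason alone.

```python
import random

MAGIC_LEN = 4  # the talk mentions four leading magic bytes; they stay untouched here


def mutate(payload: bytes, rng: random.Random, protect: int = MAGIC_LEN) -> bytes:
    """Return a copy of payload with one bit or one byte changed after the prefix."""
    data = bytearray(payload)
    if len(data) <= protect:
        return bytes(data)  # nothing to mutate beyond the protected prefix
    pos = rng.randrange(protect, len(data))
    if rng.random() < 0.5:
        data[pos] ^= 1 << rng.randrange(8)  # flip a single bit
    else:
        data[pos] ^= rng.randrange(1, 256)  # turn the byte into a different value
    return bytes(data)


if __name__ == "__main__":
    rng = random.Random(1234)
    seed = b"MAGC" + bytes(16)  # placeholder packet, not real ARI data
    print(mutate(seed, rng).hex())
```

A mutator like this never grows a field or rewrites a length consistently, which is why a long, well-formed phone number inside a TLV is out of its reach.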
439
00:41:38,657 --> 00:41:43,821
There is one big shortcoming that all of
these fuzzers have, including the initial
440
00:41:43,821 --> 00:41:50,800
ToothPicker, which is that they don't have any kind
of memory sanitization. So the framework
441
00:41:50,800 --> 00:41:56,420
that you would usually use in user space
on iOS is the MallocStackLogging
442
00:41:56,420 --> 00:42:02,180
framework. I even got this running for
CommCenter, so it's a bit of command
443
00:42:02,180 --> 00:42:06,150
line juggling. But in the end you can
enable MallocStackLogging also for
444
00:42:06,150 --> 00:42:13,200
CommCenter. The issue here is that it
increases the memory usage a lot and even
445
00:42:13,200 --> 00:42:19,310
if you configure CommCenter to have a
higher memory allowance, it is so high
446
00:42:19,310 --> 00:42:24,370
that it's just immediately killed by the
out-of-memory killer. So this doesn't
447
00:42:24,370 --> 00:42:31,530
work. Then there is also libgmalloc. It
doesn't exist for iOS, it just exists in
448
00:42:31,530 --> 00:42:36,880
Xcode. I got one of the Xcode libraries
running on one of my iPhones. I have no
449
00:42:36,880 --> 00:42:40,830
idea if this is an expected configuration
or not. At least I could execute smaller
450
00:42:40,830 --> 00:42:47,260
programs. And then when you use this on
CommCenter, it just crashes with a
451
00:42:47,260 --> 00:42:52,700
libgmalloc error on parsing some of the
configuration files very, very early when
452
00:42:52,700 --> 00:42:58,300
starting the CommCenter. So all of this
didn't work. And this also means that the
453
00:42:58,300 --> 00:43:02,990
fuzzer cannot find certain bug types or
crashes much later when encountering
454
00:43:02,990 --> 00:43:11,670
bugs. So all of the fuzzers that I created
are not perfect, but at least they found
455
00:43:11,670 --> 00:43:16,512
a lot of different crashes. Let's look
into this. I mean, the first obvious
456
00:43:16,512 --> 00:43:21,480
number that you see here is the 42. So I
stopped fuzzing after 42 crashes - at
457
00:43:21,480 --> 00:43:25,950
least crashes that I think are individual
crashes and that are not caused by Frida -
458
00:43:25,950 --> 00:43:31,510
so I tried to filter out Frida crashes
and this corresponds to the total amount
459
00:43:31,510 --> 00:43:36,350
of crashes, but only some of them are
replayable by either one or multiple
460
00:43:36,350 --> 00:43:42,250
packets. And for the replayable crashes I
can also check if they were fixed in
461
00:43:42,250 --> 00:43:48,600
recent iOS versions or the most recent iOS
14.3 or not. Then I also marked two
462
00:43:48,600 --> 00:43:51,900
colors here because there are the Intel
libraries, but there are also the
463
00:43:51,900 --> 00:43:57,970
Qualcomm libraries. And for the Qualcomm
libraries, I didn't spend as much time
464
00:43:57,970 --> 00:44:02,150
fuzzing, because I have fewer Qualcomm
phones, but also all the asserts in the
465
00:44:02,150 --> 00:44:07,000
code prevent a lot of issues from being
reached. So the libraries themselves have
466
00:44:07,000 --> 00:44:14,190
fewer issues and also within CommCenter,
less of the code that has improper state
467
00:44:14,190 --> 00:44:22,071
handling is reached. The location daemon is
marked also with a big grey box here,
468
00:44:22,071 --> 00:44:27,180
because the location daemon is, similarly to
CommCenter, using some of the raw
469
00:44:27,180 --> 00:44:32,650
packet inputs and parses them. So it has
special parsers for Qualcomm and Intel.
470
00:44:32,650 --> 00:44:38,650
And it's also an interesting target
because of this. Other than this I got
471
00:44:38,650 --> 00:44:44,270
really a lot, a lot, a lot of different
daemons crashing. Some of them, even with
472
00:44:44,270 --> 00:44:48,970
replayable behaviour. So, for example,
there is the wireless radio manager daemon
473
00:44:48,970 --> 00:44:57,280
that you can just crash via one Intel
packet. But, this has been fixed. And then
474
00:44:57,280 --> 00:45:02,180
there is one interesting crash that I
actually got via Qualcomm and Intel
475
00:45:02,180 --> 00:45:08,330
libraries. So in the mobile Internet
sharing daemon, this also has been fixed
476
00:45:08,330 --> 00:45:12,570
and some of the crashes only happened via
Qualcomm, but I'm not sure if that's like
477
00:45:12,570 --> 00:45:21,470
a Qualcomm-specific thing or it's just
randomness of the fuzzer. So the mobile
478
00:45:21,470 --> 00:45:26,420
Internet sharing daemon has an issue where
it accesses memory at configuration
479
00:45:26,420 --> 00:45:31,619
strings, so there's different strings at
this memory address and I found this quite
480
00:45:31,619 --> 00:45:36,550
early, but I was not aware of the fact
that so many other daemons are actually
481
00:45:36,550 --> 00:45:40,760
crashing when I fuzz CommCenter. So, I
didn't look into this in the very
482
00:45:40,760 --> 00:45:44,330
beginning. And when I reported it to
Apple, they said: "Yeah, yeah, we already
483
00:45:44,330 --> 00:45:49,570
know about this and we fixed it in a
beta prior to your report." So certainly
484
00:45:49,570 --> 00:45:56,550
nothing that I got a CVE for. Another
interesting crash is in the CellMonitor, but
485
00:45:56,550 --> 00:46:01,740
only in the Intel library. The CellMonitor
is something that is running passively in
486
00:46:01,740 --> 00:46:06,650
the background all the time and it parses,
for example, GSM and UMTS cell
487
00:46:06,650 --> 00:46:11,580
information. I already found this on the
Singaporean SIM without any active
488
00:46:11,580 --> 00:46:16,750
data plan in my very first round of
fuzzing and reported it back then to
489
00:46:16,750 --> 00:46:21,460
Apple. I don't know if it's triggerable
over the air or not. So I guess it's
490
00:46:21,460 --> 00:46:25,860
something that you first need to get code
execution for. And it has been fixed in
491
00:46:25,860 --> 00:46:31,359
iOS 14.2. And I wrote a lot of emails with
Apple because I thought that they didn't
492
00:46:31,359 --> 00:46:37,600
fix it. And the reason for this is that
both the GSM cell info and the UMTS cell
493
00:46:37,600 --> 00:46:43,320
info function, when they parse data, they
have two different bugs. So I still got
494
00:46:43,320 --> 00:46:47,290
crashes in the same functions and I
thought: "OK, same function, still a
495
00:46:47,290 --> 00:46:52,410
crash: The bug is not fixed." But actually,
it's very high quality code and it's just
496
00:46:52,410 --> 00:46:57,600
multiple bugs per function. And there is
even one more issue in the CellMonitor,
497
00:46:57,600 --> 00:47:03,140
even though I think the remaining bugs are
very simple crashes or nothing that could
498
00:47:03,140 --> 00:47:11,940
be exploitable at all, but still hints to
the great code quality. And the same story
499
00:47:11,940 --> 00:47:15,670
is that there are even more bugs to be
fixed. So most of them are probably just
500
00:47:15,670 --> 00:47:21,670
stability improvements, but some of them
are still interesting. So, let's see how
501
00:47:21,670 --> 00:47:26,790
this goes. So since I told you that it's a
very simple fuzzer, some of you might have
502
00:47:26,790 --> 00:47:31,869
already started coding those 10 lines of
code for fuzzing, while I continued talking
503
00:47:31,869 --> 00:47:38,200
and grabbed their old iPhones that they are
willing to lose if something goes wrong.
504
00:47:38,200 --> 00:47:44,490
So, how can we actually build a fuzzer
that is performant and replicates some of
505
00:47:44,490 --> 00:47:49,870
the bugs that I found just within a day?
Let's take a look. When you do Frida
506
00:47:49,870 --> 00:47:55,340
fuzzing, a lot of the stuff that you do
is limited by the processing power of the
507
00:47:55,340 --> 00:47:59,280
iPhone. So your iPhone will get very,
very, very hot and it might even drain
508
00:47:59,280 --> 00:48:06,000
more battery than it can get via the USB
port. So it might even discharge while
509
00:48:06,000 --> 00:48:14,540
fuzzing. And performance is really key. So
you need to identify bottlenecks.
510
00:48:14,540 --> 00:48:20,109
I said that ToothPicker or ICEPicker, in the
initial version, is just 20 fuzz cases per
511
00:48:20,109 --> 00:48:26,890
second and you can tune this to something
like 20,000 fuzz cases per second. So, I
512
00:48:26,890 --> 00:48:30,920
already told you that I tuned it to something
like 400 or 500 fuzz cases per second,
513
00:48:30,920 --> 00:48:36,860
but why the 20,000? So, initially, a
student of mine did some fuzzing in a
514
00:48:36,860 --> 00:48:41,710
very different parser and said: "On my
iPhone 6S, it's running with 20,000 fuzz
515
00:48:41,710 --> 00:48:51,369
cases per second." I was like: "No way, no
way!" But actually, you can do this. So,
516
00:48:51,369 --> 00:48:56,890
this depends a lot on the Frida design.
The first variant, how most Frida scripts
517
00:48:56,890 --> 00:49:02,720
are written, is that you have some Python
script that runs on Linux or macOS, and it
518
00:49:02,720 --> 00:49:06,098
has a couple of functions that you can see
here. So first of all, it has this
519
00:49:06,098 --> 00:49:10,040
on_message callback. So, this on_message
callback is something that we need later.
520
00:49:10,040 --> 00:49:14,250
And we just register it to our Frida
script, the Frida script, that I'm going
521
00:49:14,250 --> 00:49:18,720
to show you in a second. And you load the
script and the script can then even call
522
00:49:18,720 --> 00:49:25,322
functions on your iPhone. For this, you
load a second script on your iPhone. So
523
00:49:25,322 --> 00:49:32,150
this is JavaScript injected into the iOS
target process and it can, for example,
524
00:49:32,150 --> 00:49:37,360
use the send function to send something
back to the on_message function. And it
525
00:49:37,360 --> 00:49:46,840
can export functions via RPCs. So, you can
then call them. All this happens via JSON.
526
00:49:46,840 --> 00:49:51,130
And so it needs serialization and
deserialization, which means you cannot
527
00:49:51,130 --> 00:49:56,990
send hex data or binary data directly. So
you have a hex string that you encode into
528
00:49:56,990 --> 00:50:04,210
JSON, which is then parsed as binary data
and also it's all via USB. So you also
529
00:50:04,210 --> 00:50:10,859
have the speed limitation by USB. And, of
course, if you use the Frida C-bindings
530
00:50:10,859 --> 00:50:19,250
locally on the iOS smartphone, it is a bit
faster, but it's still not perfect. So,
531
00:50:19,250 --> 00:50:29,440
the more you can avoid this JSON
part and the USB part, the better. The
532
00:50:29,440 --> 00:50:33,890
actual fuzzer looks a bit like this. So,
you are in the libARIServer, so that's the
533
00:50:33,890 --> 00:50:40,520
lowest library from the diagram before.
And then you define this inbound message
534
00:50:40,520 --> 00:50:44,030
callback function, which has two
arguments, which are the payload and the
535
00:50:44,030 --> 00:50:49,681
length. So, this looks a bit cryptic, but
that's basically it. And then you can, but
536
00:50:49,681 --> 00:50:55,940
you don't have to, add this interceptor
here because you might want to fix your
537
00:50:55,940 --> 00:51:01,540
sequence number or add basic block
coverage to your fuzzer, etc. So, this is also
538
00:51:01,540 --> 00:51:08,500
done there. And then you can just call this
inbound message callback of ARI and send
539
00:51:08,500 --> 00:51:17,000
ARI payloads. So, this already can be very
different. So, if you now call this via
540
00:51:17,000 --> 00:51:22,580
RPC export, via a Python script on your
laptop, you can reach something like 500
541
00:51:22,580 --> 00:51:27,030
fuzz cases per second, if you inject SMS,
which are quite processing-intensive
542
00:51:27,030 --> 00:51:33,140
payloads. Or, if you just do the same
thing and if you just run this inbound
543
00:51:33,140 --> 00:51:36,720
message callback in a loop, locally with
JavaScript, without any external Python
544
00:51:36,720 --> 00:51:42,970
script, then you would get 22,000 fuzz
cases per second on the very same device.
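The hex-string-in-JSON round trip that every remotely injected payload goes through can be sketched in isolation. This is a rough illustration, not Frida's actual wire format: the message shape and the function names are assumptions, and it only measures serialization on one machine, without USB or the target process.

```python
import json
import os
import time


def encode_for_rpc(payload: bytes) -> str:
    """Host side: binary payload -> hex string -> JSON text (hypothetical message shape)."""
    return json.dumps({"type": "inject", "payload": payload.hex()})


def decode_from_rpc(message: str) -> bytes:
    """Target side: JSON text -> hex string -> binary payload."""
    return bytes.fromhex(json.loads(message)["payload"])


def round_trips_per_second(count: int = 10000, size: int = 256) -> float:
    """Time only the encode/decode round trip for random payloads of a given size."""
    payload = os.urandom(size)
    start = time.perf_counter()
    for _ in range(count):
        decode_from_rpc(encode_for_rpc(payload))
    return count / (time.perf_counter() - start)


if __name__ == "__main__":
    print(f"{round_trips_per_second():.0f} round trips per second")
```

The hex encoding alone doubles the payload size before the JSON wrapper is added, and a loop that stays inside the injected JavaScript skips this step entirely.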
545
00:51:42,970 --> 00:51:49,150
So this is the speed difference that the
JSON serialization, deserialization and
546
00:51:49,150 --> 00:51:57,710
the USB in between make. So, I did a few
more measurements, and certainly on the
547
00:51:57,710 --> 00:52:03,220
iPhone 8, there is a bug that prevents me
from collecting coverage. But, what you
548
00:52:03,220 --> 00:52:09,700
can see is, so, the first part here is if
you have just a bit flipper in a loop that
549
00:52:09,700 --> 00:52:14,240
calls the target function, you can get
17,000 fuzz cases per second on an iPhone 7.
550
00:52:14,240 --> 00:52:19,830
As soon as you start collecting basic
block coverage, not processing it, just
551
00:52:19,830 --> 00:52:25,590
collecting, you drop to 250 fuzz cases per
second. So, you need to ask yourself, if
552
00:52:25,590 --> 00:52:32,250
your fuzzer gets really that much better
from collecting coverage. And another
553
00:52:32,250 --> 00:52:37,660
thing is - that's this line above - so, if you
just print the packet, that you fuzzed or
554
00:52:37,660 --> 00:52:43,400
injected and print this via Python to your
laptop, you also have a huge slowdown,
555
00:52:43,400 --> 00:52:47,670
which is not as large as the coverage
slowdown. But still, you can see every
556
00:52:47,670 --> 00:52:52,960
print and every sending of a message in
between the Python script and JavaScript
557
00:52:52,960 --> 00:53:00,690
takes a lot of time. Now, if you have this
remote SMS injection that I had before,
558
00:53:00,690 --> 00:53:04,700
then you drop to 200 fuzz cases per
second. So it is a blind injection without
559
00:53:04,700 --> 00:53:11,650
any coverage. If you collect coverage but
don't process coverage, then you are down
560
00:53:11,650 --> 00:53:15,970
to 100 fuzz cases per second. So, for the
initial ToothPicker design, this would be
561
00:53:15,970 --> 00:53:20,760
the optimum. But, because the Radamsa
mutator is very slow and because you also
562
00:53:20,760 --> 00:53:27,540
need to process the coverage information,
et cetera, that's down to 20 fuzz cases
563
00:53:27,540 --> 00:53:33,500
per second. So, this is the comparison
here. And now you can imagine why
564
00:53:33,500 --> 00:53:39,700
collecting coverage probably isn't always
useful and why also having your laptop
565
00:53:39,700 --> 00:53:45,820
calculating better mutations, because it's
easier to write a mutator there than
566
00:53:45,820 --> 00:53:51,850
directly in JavaScript, is not always the
best idea. So let's watch one last demo
567
00:53:51,850 --> 00:53:55,470
video. What you can see here, is when you
try to delete SMS, after all of the
568
00:53:55,470 --> 00:54:01,420
fuzzing, it really doesn't work, either
via the settings or via the SMS app. So,
569
00:54:01,420 --> 00:54:05,750
you really need to reset your iPhone after
fuzzing it for too long. No other chance
570
00:54:05,750 --> 00:54:12,050
than this to delete the messages. With
this, we are already at the end of this
571
00:54:12,050 --> 00:54:17,560
talk, but of course, there will be a Q&A
session and if you missed the Q&A session,
572
00:54:17,560 --> 00:54:22,616
you can also ask me on Twitter or write me
an email. Thanks for watching!
573
00:54:31,758 --> 00:54:45,060
rC3 music
574
00:54:45,060 --> 00:55:12,000
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!