1
00:00:00,000 --> 00:00:13,047
rC3 preroll music
2
00:00:13,047 --> 00:00:17,730
Herald: Our next speaker, Alisa Esage, is
an independent vulnerability researcher
3
00:00:17,730 --> 00:00:22,640
and has a notable record of security
research achievements such as the Zero Day
4
00:00:22,640 --> 00:00:29,770
Initiative Silver Bounty Hunter Award
2018. Alisa is going to present her latest
5
00:00:29,770 --> 00:00:36,007
research on the Qualcomm DIAG protocol,
which is found abundantly in Qualcomm
6
00:00:36,007 --> 00:00:46,500
Hexagon based cellular modems. Alisa,
we're looking forward to your talk now.
7
00:00:46,500 --> 00:00:49,701
Alisa Esage: This is Alisa Esage, you're
attending my presentation about Advanced
8
00:00:49,701 --> 00:01:01,010
Hexagon DIAG at Chaos Communication
Congress 2020 remote experience. My main
9
00:01:01,010 --> 00:01:06,250
interest as an advanced vulnerability
researcher is complex systems and hardened
10
00:01:06,250 --> 00:01:11,920
systems. For the last 10 years I have been
researching various classes of software
11
00:01:11,920 --> 00:01:16,280
such as Windows kernel, browsers,
JavaScript engines. And for the last three
12
00:01:16,280 --> 00:01:21,880
years I was focusing mostly on
Hypervisors. The project that I'm
13
00:01:21,880 --> 00:01:27,970
presenting today was a little side project
that I made for distraction a couple years
14
00:01:27,970 --> 00:01:37,560
ago. The name of this talk Advanced
Hexagon DIAG is a bit of an understatement
15
00:01:37,560 --> 00:01:45,290
in the attempt to keep this talk a little
bit low key in the general internet,
16
00:01:45,290 --> 00:01:50,840
because a big part of the talk will
actually be devoted to general
17
00:01:50,840 --> 00:01:56,710
vulnerability research in basebands. But
the primary focus of this talk is on the
18
00:01:56,710 --> 00:02:02,899
Hexagon DIAG, also known as QCDM Qualcomm
diagnostic manager. This is a proprietary
19
00:02:02,899 --> 00:02:09,229
protocol developed by Qualcomm for use in
their basebands, and it is included on all
20
00:02:09,229 --> 00:02:18,400
Snapdragon SoCs and modem chips produced
by Qualcomm. Modern Qualcomm chips run
21
00:02:18,400 --> 00:02:24,299
on custom silicon with a custom
instruction set architecture named
22
00:02:24,299 --> 00:02:30,930
QDSP6 Hexagon. This is important because
all the DIAG handlers that we will be
23
00:02:30,930 --> 00:02:41,699
dealing with are written in this
instruction set architecture. As usual
24
00:02:41,699 --> 00:02:47,769
with my talks, I have adjusted the
materials of this presentation for various
25
00:02:47,769 --> 00:02:52,659
audiences, for the full spectrum of
audiences, specifically the first part of
26
00:02:52,659 --> 00:03:00,699
the presentation is mostly specialized for
research directors and high level
27
00:03:00,699 --> 00:03:06,719
technical staff. And the last part is more
deeply technical. And it would be mostly
28
00:03:06,719 --> 00:03:14,510
interesting to specialized vulnerability
researchers and low level programmers that
29
00:03:14,510 --> 00:03:25,400
somehow are related to this particular
area. Let's start from the top level
30
00:03:25,400 --> 00:03:31,540
overview of cellular technology. This mind
map presents a simplified view of various
31
00:03:31,540 --> 00:03:36,739
types of entities that we'd have to deal
with with respect to basebands. It's not a
32
00:03:36,739 --> 00:03:44,659
complete diagram, of course, but it only
presents the classes of entities that
33
00:03:44,659 --> 00:03:51,540
exist in this space. Also, this mind map
is specific to the client-side equipment,
34
00:03:51,540 --> 00:03:57,109
the user equipment and it completely omits
any server side considerations which are a
35
00:03:57,109 --> 00:04:02,290
world of their own. There exists quite a
large number of cellular protocols on the
36
00:04:02,290 --> 00:04:08,199
planet. From the user perspective, this is
simple. This is usually the short name
37
00:04:08,199 --> 00:04:15,469
3G, 4G that you see on the mobile screen.
But in reality, this simple name, that
38
00:04:15,469 --> 00:04:27,409
generation name encodes - may encode
several different distinct technologies.
39
00:04:27,409 --> 00:04:32,620
There are a few key points about cellular
protocols that are crucial to understand
40
00:04:32,620 --> 00:04:38,860
before starting to approach this area. The
first one is the concept of a generation.
41
00:04:38,860 --> 00:04:45,379
This is simple. This is simply 1G, 2G and
so on. The generic name of the family of
42
00:04:45,379 --> 00:04:49,910
protocols that are supported in a
particular generation. Generation is
43
00:04:49,910 --> 00:04:55,539
simply a marketing name, for users. It
doesn't really have any strict technical
44
00:04:55,539 --> 00:05:02,199
meaning. And generations represent the
evolution of cellular protocols in time.
45
00:05:02,199 --> 00:05:06,840
The second most important thing about
cellular protocols is the air interface.
46
00:05:06,840 --> 00:05:13,629
This is the protocol, or rather the
lowest level protocol, which
47
00:05:13,629 --> 00:05:20,270
defines how exactly the cellular
signal is digitized and read from the
48
00:05:20,270 --> 00:05:26,700
electromagnetic wave and how exactly the
different players in this field divide the
49
00:05:26,700 --> 00:05:32,990
space. Historically, there existed two
main implementations of this low level
50
00:05:32,990 --> 00:05:39,330
code called TDMA and CDMA. TDMA means time
division multiple access, which basically
51
00:05:39,330 --> 00:05:43,670
divides the entire electromagnetic
spectrum within the radio band into time
52
00:05:43,670 --> 00:05:51,490
slots that are rotated in a round robin
manner by various mobile phones so that
53
00:05:51,490 --> 00:06:04,319
they speak in turns. TDMA was the base for
the GSM technology. And GSM was the main
54
00:06:04,319 --> 00:06:09,919
protocol used on this planet for a long
time. Another low level implementation is
55
00:06:09,919 --> 00:06:16,689
CDMA. It was a little bit more complex
from the beginning. It stands for code
56
00:06:16,689 --> 00:06:24,300
division multiple access. And instead of
dividing the spectrum in time slots and
57
00:06:24,300 --> 00:06:32,580
dividing the protocol in bursts, CDMA uses
random codes that are assigned to mobile
58
00:06:32,580 --> 00:06:43,060
phones so that this code can be used as an
additional randomizing mask against the
59
00:06:43,060 --> 00:06:48,400
modulation protocol. And multiple user
equipments can talk on the same frequency
60
00:06:48,400 --> 00:06:57,110
without interrupting each other. Note here
that CDMA was developed by Qualcomm and it
61
00:06:57,110 --> 00:07:03,159
was mostly used in the United States. So
at the level of 2G, there were two main
62
00:07:03,159 --> 00:07:11,581
protocols, GSM based on the TDMA and the
cdmaOne based on the CDMA. On the third
63
00:07:11,581 --> 00:07:17,919
generation of mobile protocols these two
branches of development were continued. So
64
00:07:17,919 --> 00:07:24,160
GSM evolved into UMTS, while cdmaOne
evolved into CDMA2000. The important point
65
00:07:24,160 --> 00:07:31,029
here is that UMTS has at this point
already adopted the low level air
66
00:07:31,029 --> 00:07:37,340
interface protocol from the CDMA and
eventually at the fourth generation of
67
00:07:37,340 --> 00:07:41,240
protocols these two branches of
development come together to create the
68
00:07:41,240 --> 00:07:52,680
LTE technology and the same for the 5G.
This is a bit important for us from the
69
00:07:52,680 --> 00:07:57,909
offensive perspective, because first of
all, all of these technologies including
70
00:07:57,909 --> 00:08:04,999
the air interfaces represent separate
bits of code with separate parsing
71
00:08:04,999 --> 00:08:09,900
algorithms within the baseband firmware.
And all of them are usually present in
72
00:08:09,900 --> 00:08:15,099
each baseband, regardless of which one you
actually use or your mobile provider
73
00:08:15,099 --> 00:08:20,919
actually supports. Another important and
not obvious thing from the offensive
74
00:08:20,919 --> 00:08:29,940
security perspective here is that because
of this evolutionary development, the
75
00:08:29,940 --> 00:08:34,669
protocols are not actually completely
distinct. So if you think about LTE, it is
76
00:08:34,669 --> 00:08:39,289
not a completely different protocol from
GSM, but instead it is based largely on
77
00:08:39,289 --> 00:08:47,600
the same internal structures. And in fact,
if you look at the specifications, some of
78
00:08:47,600 --> 00:08:53,560
them are almost directly relevant. The
specifications of the GSM 2G, some of them
79
00:08:53,560 --> 00:08:59,810
are still directly relevant to some extent
to LTE. This is also important when you
80
00:08:59,810 --> 00:09:06,350
start analyzing protocols from the
offensive perspective. The cellular
81
00:09:06,350 --> 00:09:17,460
protocols are structured in a nested
way, in layers. Layers is the official
82
00:09:17,460 --> 00:09:25,120
terminology adopted by the specifications
with the exception of level zero. Here I
83
00:09:25,120 --> 00:09:29,980
just added it for convenience, but
in the specifications layers start from one
84
00:09:29,980 --> 00:09:34,649
and proceed to three. From the offensive
perspective, the most interesting is level
85
00:09:34,649 --> 00:09:39,050
three, as you can see from the screenshot
of the specifications, because it encodes
86
00:09:39,050 --> 00:09:45,260
most of the high level protocol data, such
as handling SMS and GSM. This is the part
87
00:09:45,260 --> 00:09:49,830
of the protocol which actually contains
interesting data structures with TLV
88
00:09:49,830 --> 00:09:58,550
values and so on. When people talk about
attacking basebands, they usually mean
89
00:09:58,550 --> 00:10:06,010
attacking the baseband over the air, the OTA
attack vector, which is definitely one of
90
00:10:06,010 --> 00:10:11,930
the most interesting. But let's take a
step back and consider the entire big
91
00:10:11,930 --> 00:10:21,070
picture of the baseband ecosystem. This
diagram presents a unified view of
92
00:10:21,070 --> 00:10:28,009
generalized architecture of a modern
baseband with attack surfaces. First of
93
00:10:28,009 --> 00:10:34,680
all, there are two separate distinct
processors: the AP, application processor,
94
00:10:34,680 --> 00:10:40,140
and the MP, which is mobile processor. It
may be either a DSP or another CPU.
95
00:10:40,140 --> 00:10:45,290
Usually there are two separate processors
and each one of them runs a separate
96
00:10:45,290 --> 00:10:51,311
operating system. In case of the AP, it
may be Android or iOS and the baseband
97
00:10:51,311 --> 00:10:55,940
processor will run some sort of real-time
operating system provided by the
98
00:10:55,940 --> 00:11:03,400
mobile vendor. An important point here is that
on modern implementations, the baseband
99
00:11:03,400 --> 00:11:08,649
is actually protected by some sort of secure
execution environment, maybe TrustZone on
100
00:11:08,649 --> 00:11:17,100
Androids or SEPOS on Apple devices. Which
means that the privilege boundary which is
101
00:11:17,100 --> 00:11:22,820
depicted here on the left side is dual
sided. So even if you have kernel access
102
00:11:22,820 --> 00:11:29,740
to the Android kernel, you still are not
supposed to be able to read the memory of
103
00:11:29,740 --> 00:11:33,620
the baseband or somehow interfere with its
operation, at least on the modern
104
00:11:33,620 --> 00:11:38,560
production smartphones. And the same goes the
other way around for the baseband, which is not
105
00:11:38,560 --> 00:11:45,540
supposed to be able to access the application
processor directly. So these two are
106
00:11:45,540 --> 00:11:50,191
mutually distrusting entities that are
separated from each other. And so there
107
00:11:50,191 --> 00:12:01,892
exists a privilege boundary,
which represents an attack surface. Within
108
00:12:01,892 --> 00:12:07,389
the real-time operating systems, there are
three large attack surfaces. Starting from
109
00:12:07,389 --> 00:12:14,180
right to left: the rightmost gray box
represents the attack surface of the
110
00:12:14,180 --> 00:12:20,639
cellular stacks. This is the code which
actually parses the cellular protocols.
111
00:12:20,639 --> 00:12:31,699
It usually runs in several distinct real-time
operating system tasks. And this part
112
00:12:31,699 --> 00:12:38,519
of the attack surface handles all the
layers of the protocol. There is a huge
113
00:12:38,519 --> 00:12:44,070
amount of parsing that happens here. The
second box represents the various
114
00:12:44,070 --> 00:12:50,980
management protocols. The simplest one to
think about is the AT command protocol. It
115
00:12:50,980 --> 00:12:56,700
is still widely included in all basebands,
and it's even usually exposed in some way
116
00:12:56,700 --> 00:13:01,279
to the application processor. So you can
actually send some AT commands to the
117
00:13:01,279 --> 00:13:09,000
cellular modem. But a bit more interesting
are the vendor-specific management
118
00:13:09,000 --> 00:13:16,680
protocols, one of them is the DIAG
protocol. Because the modern basebands are
119
00:13:16,680 --> 00:13:22,569
very complex, vendors need some sort of
specialized protocol to enable
120
00:13:22,569 --> 00:13:28,910
configuration and diagnostics for the
OEMs. In the case of Qualcomm, for example,
121
00:13:28,910 --> 00:13:37,170
DIAG is just one of the many diagnostic
protocols involved. The third box is what
122
00:13:37,170 --> 00:13:45,350
I call the RTOS core, it is various
core level functionality, such as the
123
00:13:45,350 --> 00:13:57,770
code which implements the interface to
the application processor. On the side of
124
00:13:57,770 --> 00:14:04,019
the application operating system such as
Android, there are also 2 attack surfaces
125
00:14:04,019 --> 00:14:10,370
that are attackable from the baseband. The
first one is the peripheral drivers,
126
00:14:10,370 --> 00:14:13,579
because the baseband is a separate
peripheral, so it requires some
127
00:14:13,579 --> 00:14:21,110
specialized drivers that handle I/O and
such things. And the second one is the
128
00:14:21,110 --> 00:14:29,002
attack surface represented by various
interface handlers because the baseband
129
00:14:29,002 --> 00:14:34,800
and the main operating system cannot
communicate directly. They use some sort
130
00:14:34,800 --> 00:14:39,839
of a specialized interface to do that. In
case of Qualcomm this is shared memory.
131
00:14:39,839 --> 00:14:44,670
And so these shared memory implementations
are usually quite complex and they
132
00:14:44,670 --> 00:14:51,460
represent an attack surface on both
sides. And finally, the third piece of this
133
00:14:51,460 --> 00:14:57,319
diagram is in the lowest part. I have
depicted two grey boxes which are related
134
00:14:57,319 --> 00:15:03,139
to the trusted execution environment.
Because typically a modem runs as a
135
00:15:03,139 --> 00:15:11,379
Trustlet in a secure environment. So
technically, the attack surfaces that
136
00:15:11,379 --> 00:15:16,550
exist within TrustZone or related to it
also can be useful for baseband offensive
137
00:15:16,550 --> 00:15:22,890
research. Here we can distinguish at least
two large attack surfaces. The first one
138
00:15:22,890 --> 00:15:31,490
is the secure monitor call handlers,
which is the core interface that
139
00:15:31,490 --> 00:15:36,960
handles calls from the application
processor to the TrustZone. And the second
140
00:15:36,960 --> 00:15:44,810
one are the Trustlets. They are separate
pieces of code which are executed and
141
00:15:44,810 --> 00:15:56,790
protected by the TrustZone. On this
diagram, I have also added some
142
00:15:56,790 --> 00:16:02,839
information about data codecs. I'm not sure
if they are supposed to be in the RTOS
143
00:16:02,839 --> 00:16:06,319
core because these things are directly
accessible from the cellular stacks
144
00:16:06,319 --> 00:16:14,959
usually, especially ASN.1, in which I have
seen some bugs reachable from the over the
145
00:16:14,959 --> 00:16:23,009
air interface. On this diagram, I have
shown some examples of vulnerabilities. I
146
00:16:23,009 --> 00:16:26,769
will not discuss them in detail here
since it's not the point of the
147
00:16:26,769 --> 00:16:32,480
presentation, but at least the ones from
Pwn2Own, you can find the writeups on
148
00:16:32,480 --> 00:16:46,589
the Internet. To discuss baseband
offensive tools and approaches, I have
149
00:16:46,589 --> 00:16:50,720
narrowed down the previous diagram to just
one attack surface, the over the air
150
00:16:50,720 --> 00:16:55,620
attack surface. This is the attack
surface, which is represented by parsing
151
00:16:55,620 --> 00:16:59,480
implementations of various cellular
protocols inside the baseband operating
152
00:16:59,480 --> 00:17:06,610
system. And this is the attack surface
that we can reach from the air interface.
153
00:17:06,610 --> 00:17:13,390
In order to accomplish that, we need a
transceiver such as software defined radio
154
00:17:13,390 --> 00:17:21,170
or a mobile tester, which is able to talk
the specific cellular protocol that we're
155
00:17:21,170 --> 00:17:28,780
planning to attack. The simplest way to
accomplish this is to use some sort of a
156
00:17:28,780 --> 00:17:34,730
software defined radio, such as Ettus
Research USRP or bladeRF, and install an open
157
00:17:34,730 --> 00:17:41,240
source implementation of a base station
such as OpenBTS or OpenBSC. The thing to
158
00:17:41,240 --> 00:17:50,050
note here is that the software based
implementations actually lag behind the
159
00:17:50,050 --> 00:17:54,970
development of technologies.
Implementations of GSM base stations are
160
00:17:54,970 --> 00:18:03,630
very well established and popular, such as
OpenBTS. And in fact, when I tried to
161
00:18:03,630 --> 00:18:15,140
establish BTS with my USRP, it was quite
simple. For UMTS and LTE, there exist fewer
162
00:18:15,140 --> 00:18:19,950
software based implementations
and also there are more constraints on the
163
00:18:19,950 --> 00:18:26,310
hardware. For example, my model of the
USRP does not support UMTS due to resource
164
00:18:26,310 --> 00:18:31,690
constraints. And the most interesting
thing here is that there does not exist
165
00:18:31,690 --> 00:18:36,580
any software based implementation of
CDMA that you can use to establish a base
166
00:18:36,580 --> 00:18:53,270
station. This is a pseudorandom diagram of
one of the Snapdragon chips. There exists
167
00:18:53,270 --> 00:18:58,820
a huge amount of various models of
Snapdragons. This one I have chosen
168
00:18:58,820 --> 00:19:05,680
pseudorandomly when I was searching for
some sort of visual diagram. Qualcomm used
169
00:19:05,680 --> 00:19:12,030
to include some high level diagrams of the
architecture in their marketing materials
170
00:19:12,030 --> 00:19:19,400
previously, but they don't do this
anymore. This particular diagram is
171
00:19:19,400 --> 00:19:26,820
from a technical specification of a
particular model 820. Also this particular
172
00:19:26,820 --> 00:19:34,420
model Snapdragon is... a bit interesting
because it is the first one that included
173
00:19:34,420 --> 00:19:44,790
the artificial intelligence engine, which
is also based on Hexagon. For our
174
00:19:44,790 --> 00:19:52,890
purposes, the main interest here are the
processors. The majority of Snapdragons
175
00:19:52,890 --> 00:19:59,630
include quite a long list of processors.
There are at least 4 ARM-based Kryo-CPUs
176
00:19:59,630 --> 00:20:11,480
that actually run the Android operating
system. Then there are the Adreno GPUs and
177
00:20:11,480 --> 00:20:16,380
then there are several Hexagons. On the
most recent models there is not just one
178
00:20:16,380 --> 00:20:23,360
Hexagon processing unit, but several of
them. And they are named according to
179
00:20:23,360 --> 00:20:28,030
their purposes. Each one of them, each one
of these Hexagon cores is responsible for
180
00:20:28,030 --> 00:20:35,770
handling a specific functionality. For
example, the MDSP handles the modem and runs the
181
00:20:35,770 --> 00:20:44,260
real-time operating system. The ADSP
handles media and the CDSP handles
182
00:20:44,260 --> 00:20:52,540
compute. So the Hexagons actually
represent around one half of the
183
00:20:52,540 --> 00:21:08,771
processing power on modern Snapdragons.
There are two key points about the Hexagon
184
00:21:08,771 --> 00:21:17,501
architecture from the hardware
perspective. First of all, it is- Hexagon
185
00:21:17,501 --> 00:21:25,410
is specialized to parallel processing. And
so the first concept is variable size
186
00:21:25,410 --> 00:21:31,000
instruction packets. It means that
several instructions can execute
187
00:21:31,000 --> 00:21:42,330
simultaneously in separate execution
units. It also uses hardware
188
00:21:42,330 --> 00:21:48,990
multithreading for the same purposes. On
the right side of the slide here is some
189
00:21:48,990 --> 00:22:00,630
example of the Hexagon assembly. It is
quite funny at times. These curly brackets
190
00:22:00,630 --> 00:22:07,160
represent the instructions that are
executed simultaneously. And these
191
00:22:07,160 --> 00:22:15,500
instructions must be compatible in order
to be able to use the distinct processing
192
00:22:15,500 --> 00:22:21,040
slots. And then there is the funny .new
notation which actually enables the
193
00:22:21,040 --> 00:22:26,050
instructions to use both the old and the
new value of a particular register within
194
00:22:26,050 --> 00:22:32,850
the same instruction cycle. This provides
quite a bit of optimization on the lower
195
00:22:32,850 --> 00:22:41,200
level. For more information, I can direct
you to the Hexagon Specification and
196
00:22:41,200 --> 00:22:53,830
programmers reference manual, which is
available from the Qualcomm website. The
197
00:22:53,830 --> 00:22:59,270
concept of production fusing is quite
common. As I said previously, it's a
198
00:22:59,270 --> 00:23:05,590
common practice for mobile device vendors
to lock down the devices before they enter
199
00:23:05,590 --> 00:23:11,540
the market to prevent modifications and
tinkering. And for the purposes of this
200
00:23:11,540 --> 00:23:17,300
locking down, there are
several ways how this can be accomplished.
201
00:23:17,300 --> 00:23:24,356
Usually various advanced diagnostic and
debugging functionalities are removed from
202
00:23:24,356 --> 00:23:30,820
either software or hardware or both. It is
quite common that these functionalities are
203
00:23:30,820 --> 00:23:37,180
only removed from software while the
hardware remains there. And in such a case,
204
00:23:37,180 --> 00:23:43,869
eventually researchers will
come up with their own software based
205
00:23:43,869 --> 00:23:50,050
implementation of all this functionality, as
is the case with some custom iOS kernel
206
00:23:50,050 --> 00:23:55,910
debuggers, for example. In case of
Qualcomm, there was at some point a leaked
207
00:23:55,910 --> 00:24:02,416
internal memo which discusses what exactly
they are doing for production fusing the
208
00:24:02,416 --> 00:24:15,730
devices. In addition to production
fusing in case of modern Androids, the
209
00:24:15,730 --> 00:24:22,860
baseband runs within the TrustZone. And
on my implementation, it is already quite
210
00:24:22,860 --> 00:24:28,680
locked down. It uses a separate component.
The baseband uses a separate component
211
00:24:28,680 --> 00:24:36,510
named the MBA, which stands for the modem
boot authenticator. And this entire thing
212
00:24:36,510 --> 00:24:42,210
is run by the subsystem of Android kernel
named PIL, the Peripheral Image Loader.
213
00:24:42,210 --> 00:24:50,820
You can open the source code and
investigate how exactly it looks. And the
214
00:24:50,820 --> 00:24:57,430
purpose of the MBA is to authenticate the
modem firmware so that you would not be
215
00:24:57,430 --> 00:25:04,000
able to inject some arbitrary commands
into the modem firmware and flash it. This
216
00:25:04,000 --> 00:25:09,250
is another side of the hardening, which
makes it very difficult to inject any
217
00:25:09,250 --> 00:25:13,260
arbitrary code into the baseband.
Basically, the only way to do this is
218
00:25:13,260 --> 00:25:23,130
through a software vulnerability. During
this project I have reverse engineered
219
00:25:23,130 --> 00:25:33,360
partially the Hexagon modem firmware from
my implementation, from my Nexus 6P. The
220
00:25:33,360 --> 00:25:38,770
process of reverse engineering is not very
difficult because all you need is to
221
00:25:38,770 --> 00:25:44,950
download the firmware from the website,
Google's website in this case. Then you
222
00:25:44,950 --> 00:25:50,960
need to find the binary which corresponds
to the modem firmware. This binary is
223
00:25:50,960 --> 00:25:57,680
actually a compound binary that must be
divided into separate binaries that
224
00:25:57,680 --> 00:26:04,940
represent specific sections inside the
firmware. And for that purpose we can use
225
00:26:04,940 --> 00:26:11,410
the unify_trustlet script. After you
have split the baseband firmware into separate
226
00:26:11,410 --> 00:26:18,270
sections, you can load them into IDA Pro.
There are several plugins available for
227
00:26:18,270 --> 00:26:26,110
IDA Pro that support Hexagon. I have tried
one of them. I think it was GSMK and it
228
00:26:26,110 --> 00:26:35,650
works quite well for basic reverse
engineering purposes. Notable here is that
229
00:26:35,650 --> 00:26:41,660
some sections of the modem firmware are
compressed and relocated at runtime, so
230
00:26:41,660 --> 00:26:48,350
you would not be able to reverse engineer
them unless you can decompress them,
231
00:26:48,350 --> 00:26:52,270
which is also a bit of a challenge because
Qualcomm uses some internal
232
00:26:52,270 --> 00:27:02,000
compression algorithm for that. For
reverse engineering, the main approach here
233
00:27:02,000 --> 00:27:06,010
is to get started with some root points,
for example, because this is a real time
234
00:27:06,010 --> 00:27:11,290
operating system, we know that it should
have some task structures
235
00:27:11,290 --> 00:27:16,340
that we can locate. And from
there we can locate some interesting code.
236
00:27:16,340 --> 00:27:20,160
In case of Hexagon this is a bit non-
trivial because, as I said, it doesn't
237
00:27:20,160 --> 00:27:24,930
have any log strings. So even though you
may locate something that looks like a
238
00:27:24,930 --> 00:27:30,530
task struct, it's not clear which code
it actually represents. So the first
239
00:27:30,530 --> 00:27:43,360
step here is to apply the log strings that
were removed from the binary by Qshrink. I
240
00:27:43,360 --> 00:27:51,920
think the only way to do it is by using
the msg_hash.txt file from the leaked
241
00:27:51,920 --> 00:27:57,590
sources. This file is not supposed to be
available either on the mobile devices
242
00:27:57,590 --> 00:28:05,470
nor in some open ecosystem. And after you
have applied these log strings, you will
243
00:28:05,470 --> 00:28:10,841
be able to rename some functions
based on these log strings. Because the
244
00:28:10,841 --> 00:28:17,420
log strings often contain the names of the
source file, source module from which the
245
00:28:17,420 --> 00:28:27,090
code was built, this creates an opportunity
to understand what exactly this code is
246
00:28:27,090 --> 00:28:34,920
doing. Debugging was completely
unavailable in my case, and I realized
247
00:28:34,920 --> 00:28:44,820
that it would require a couple of
months more work to make it work, and the
248
00:28:44,820 --> 00:28:49,490
only way I think, and the best way is to
create a software based debugger similar
249
00:28:49,490 --> 00:28:57,100
to modkit, the publication that I will be
referencing in the references, based on
250
00:28:57,100 --> 00:29:05,520
software vulnerability in either the modem
itself or in some authenticator or in the
251
00:29:05,520 --> 00:29:09,700
TrustZone so that we can inject
software debugger callbacks into the
252
00:29:09,700 --> 00:29:20,180
baseband and connect it to the GDB stub.
This is how the part of the firmware looks
253
00:29:20,180 --> 00:29:28,040
that has log strings stripped out. Here it
already has some names applied using IDA
254
00:29:28,040 --> 00:29:32,940
script. So of course there were no such
names initially, only the hashes. Each one
255
00:29:32,940 --> 00:29:38,450
of these hashes represents a log string
that you can take from the message hash
256
00:29:38,450 --> 00:29:48,720
file. And here is what you can get after
you have applied the textual messages and
257
00:29:48,720 --> 00:29:54,120
renamed some functions. In this case, you
would be able to find some hundreds of
258
00:29:54,120 --> 00:29:59,600
procedures that are directly related to
the DIAG subsystem. And in a similar way
259
00:29:59,600 --> 00:30:07,460
you can locate various subsystems related
to over the air vectors as well. But
260
00:30:07,460 --> 00:30:17,650
unfortunately, the majority of the OTA vectors
are located in the segments that are not
261
00:30:17,650 --> 00:30:23,190
immediately available in the firmware, the
ones that are compressed and relocated.
262
00:30:23,190 --> 00:30:31,360
Meanwhile, I have tried many different
things during this project. The things
263
00:30:31,360 --> 00:30:37,360
that definitely worked is building the MSM
kernel. There is nothing special about
264
00:30:37,360 --> 00:30:44,980
this, just a regular cross-build. Another
well-known offensive approach is
265
00:30:44,980 --> 00:30:50,280
firmware downgrades, where you take some
old firmware that contains a well-known
266
00:30:50,280 --> 00:30:56,070
security vulnerability and flash it and
use the bug to create an exploit to
267
00:30:56,070 --> 00:31:06,680
achieve some additional functionality or
introspection into the system. This part
268
00:31:06,680 --> 00:31:13,390
definitely works, downgrades are trivial
both on the entire firmware and the modem as
269
00:31:13,390 --> 00:31:18,870
well as the TrustZone. I did try to build
the Qualcomm firmware from the leaked
270
00:31:18,870 --> 00:31:23,420
source codes. I assigned just a few days
to the task since it's not mission-
271
00:31:23,420 --> 00:31:29,700
critical and I have run out of time,
probably it was a different version of source
272
00:31:29,700 --> 00:31:37,820
codes. But actually, this is not a
critical project because building leaked
273
00:31:37,820 --> 00:31:42,250
firmware is not directly relevant to
finding new bugs in the production
274
00:31:42,250 --> 00:31:53,140
firmware. So I just set it aside for some
later investigation. I have also
275
00:31:53,140 --> 00:31:58,380
investigated the ramdump ecosystem a
little bit on the software side at least.
276
00:31:58,380 --> 00:32:10,640
And it seems that it's also fused quite
reliably. This is when I remembered about
277
00:32:10,640 --> 00:32:16,890
the Qualcomm DIAG. During the initial
reconnaissance I stumbled on some
278
00:32:16,890 --> 00:32:23,720
whitepapers and slides that mentioned the
Qualcomm diagnostic protocol. And it
279
00:32:23,720 --> 00:32:27,960
seemed like quite a powerful protocol,
specifically with respect to reconfiguring
280
00:32:27,960 --> 00:32:33,910
the baseband. So I decided, first of
all, to test it in case it would
281
00:32:33,910 --> 00:32:37,810
actually provide some advanced
introspection functionality and then
282
00:32:37,810 --> 00:32:48,790
probably to use the protocol for
enabling log dumps. Qualcomm DIAG or QCDM
283
00:32:48,790 --> 00:32:53,290
is a proprietary protocol developed by
Qualcomm with the purposes of advanced
284
00:32:53,290 --> 00:32:59,910
baseband software configuration and
diagnostics. It is mostly aimed at OEM
285
00:32:59,910 --> 00:33:07,410
developers, not for users. The Qualcomm
DIAG protocol consists of around 200
286
00:33:07,410 --> 00:33:14,660
commands at least in theory. Some of them
are quite powerful on paper such as
287
00:33:14,660 --> 00:33:25,450
downloader mode and read/write memory.
Initially, DIAG was partially reverse
288
00:33:25,450 --> 00:33:33,580
engineered around 2010 and included in the
open source project named ModemManager.
289
00:33:33,580 --> 00:33:39,680
And then it was also exposed in a
presentation at the Chaos Communication
290
00:33:39,680 --> 00:33:49,840
Congress 2011 by Guillaume Delugré. I
think this presentation popularized it and
291
00:33:49,840 --> 00:33:55,050
this is the one that introduced me to this
protocol. Unfortunately, that presentation
292
00:33:55,050 --> 00:34:01,771
is not really relevant, for the most part,
to modern production phones, but it does
293
00:34:01,771 --> 00:34:08,200
provide a high level overview and a
general expectation of what you will have
294
00:34:08,200 --> 00:34:15,149
to deal with. From the offensive
perspective, the DIAG protocol represents
295
00:34:15,149 --> 00:34:21,240
a local attack vector from the application
processor to the baseband. A common
296
00:34:21,240 --> 00:34:27,319
scenario of how it can be useful is
unlocking mobile phones which are locked
297
00:34:27,319 --> 00:34:33,269
to a particular mobile carrier. If we find
a memory corruption vulnerability in DIAG
298
00:34:33,269 --> 00:34:40,829
protocol, it may be possible to execute a
code directly on the baseband and change
299
00:34:40,829 --> 00:34:45,089
some internal settings. Historically, this is
usually accomplished through the AT
300
00:34:45,089 --> 00:34:51,429
command handlers, but internal proprietary
protocols are also very convenient for
301
00:34:51,429 --> 00:34:59,740
that. The second scenario where attacking
DIAG can be useful is using it for
302
00:34:59,740 --> 00:35:08,750
injecting a software based debugger. If
you can find a bug in DIAG that enables
303
00:35:08,750 --> 00:35:14,440
read/write capability on the baseband, you
can inject some debugging hooks and
304
00:35:14,440 --> 00:35:22,509
eventually connect it to a GDB stub. So it
makes it possible to create a software-based
305
00:35:22,509 --> 00:35:32,450
debugger even when JTAG is not available.
What has changed in DIAG in 10 years, based
306
00:35:32,450 --> 00:35:37,750
on some cursory investigation that I did?
First of all, the original publication
307
00:35:37,750 --> 00:35:46,390
mentioned Qualcomm basebands based on ARM
with the REX operating system. All modern
308
00:35:46,390 --> 00:35:50,770
Qualcomm basebands are based on
Hexagon as opposed to ARM. And the REX
309
00:35:50,770 --> 00:35:57,470
operating system was replaced with QuRT,
which I think still has some bits of
310
00:35:57,470 --> 00:36:05,359
REX, but in general it's a different
operating system. Most of the super
311
00:36:05,359 --> 00:36:09,921
powerful commands of DIAG such as
downloader mode and memory read/write were
312
00:36:09,921 --> 00:36:17,369
removed, at least on my device. And also
it does not expose any immediately
313
00:36:17,369 --> 00:36:25,579
available interfaces such as a USB channel.
I hear that it's possible to enable the
314
00:36:25,579 --> 00:36:37,040
USB DIAG channel by adding some special
boot properties, but usually it
315
00:36:37,040 --> 00:36:42,650
wouldn't be available. It shouldn't be
expected to be available on all devices.
316
00:36:42,650 --> 00:36:48,599
So these observations are based on my test
device, a Nexus 6P. And this should be
317
00:36:48,599 --> 00:36:57,150
around medium level of hardening. More
modern devices such as Google Pixels, the
318
00:36:57,150 --> 00:37:02,799
modern ones should be expected to be even
more hardened than that. Especially on the
319
00:37:02,799 --> 00:37:07,720
Google side, because they take hardening
very seriously. In contrast, on the
320
00:37:07,720 --> 00:37:14,631
other side of the spectrum, if you think
about some no-name modem sticks, these
321
00:37:14,631 --> 00:37:24,329
things can be more open and easier to
investigate. The DIAG implementation
322
00:37:24,329 --> 00:37:29,119
architecture is relatively simple. This
diagram is based roughly on the same
323
00:37:29,119 --> 00:37:34,319
diagram that I presented in the beginning
of the talk. On the left side there is the
324
00:37:34,319 --> 00:37:42,099
Android kernel and on the right side there
is the baseband operating system. DIAG
325
00:37:42,099 --> 00:37:47,160
protocol actually works in both directions.
It's not only commands that can be sent by
326
00:37:47,160 --> 00:37:51,000
the application processor to the baseband,
but it's also the messages that can be
327
00:37:51,000 --> 00:37:55,730
sent by the baseband to the application
processor. So DIAG commands are not really
328
00:37:55,730 --> 00:38:02,150
commands; they're more like tokens that
also can be used to encode messages. The
329
00:38:02,150 --> 00:38:10,269
green arrows on this slide represent an
example of call flow, of the data flow
330
00:38:10,269 --> 00:38:14,609
originating from the baseband and going to
the application processor. So obviously,
331
00:38:14,609 --> 00:38:25,820
in case of commands there would be a
reverse call flow or data flow. The main
332
00:38:25,820 --> 00:38:29,810
entity inside the baseband operating
system that is responsible for
333
00:38:29,810 --> 00:38:37,230
DIAG is the DIAG task. It is a separate
task which specifically handles various
334
00:38:37,230 --> 00:38:47,210
operations related to the DIAG protocol.
The exchange of data between the DIAG task
335
00:38:47,210 --> 00:38:55,390
and other tasks is done through a ring
buffer. So, for example, if some task
336
00:38:55,390 --> 00:39:05,730
needs to log something through the DIAG,
it will use specialized logging APIs that
337
00:39:05,730 --> 00:39:10,930
will in turn put logging data into the
ring buffer. The ring buffer will be
338
00:39:10,930 --> 00:39:20,330
drained either on a timer or on a software
based interrupt from the caller. And at
339
00:39:20,330 --> 00:39:28,480
this point the data will be wrapped into
DIAG protocol and from there it will go to
340
00:39:28,480 --> 00:39:37,119
the SIO task, the serial I/O task, which is
responsible for sending the output to a
341
00:39:37,119 --> 00:39:49,529
specific interface. This is based on the
modem, on the baseband configuration. The
342
00:39:49,529 --> 00:39:56,549
main interface that I was dealing with is
the shared memory, which ends up in the
343
00:39:56,549 --> 00:40:06,130
diagchar driver inside the Android
kernel. So in case of sending the commands
344
00:40:06,130 --> 00:40:11,809
from the Android kernel to the baseband,
it will be the reverse flow. First, you
345
00:40:11,809 --> 00:40:17,420
will need to craft the DIAG
protocol data and send it through the
346
00:40:17,420 --> 00:40:21,920
diagchar driver, which will write to the
shared memory interface. From there, it
347
00:40:21,920 --> 00:40:28,109
will go to the specialized task in the
baseband and eventually end up in the DIAG
348
00:40:28,109 --> 00:40:42,400
task and potentially other responsible
tasks. On the Android side, DIAG is
349
00:40:42,400 --> 00:40:47,970
represented by the /dev/diag device,
which is implemented by the diagchar
350
00:40:47,970 --> 00:40:54,980
and diagfwd kernel drivers in the MSM
kernel. The purpose of the diagchar
351
00:40:54,980 --> 00:41:02,910
driver is to support the DIAG interface.
It is quite complex in code, but
352
00:41:02,910 --> 00:41:09,569
functionally it's quite simple. It
contains some basic minimum of DIAG
353
00:41:09,569 --> 00:41:15,310
commands that enable configuration of the
interface on the baseband side. And then
354
00:41:15,310 --> 00:41:20,609
it would be able to multiplex the DIAG
channel to either USB or a memory device.
355
00:41:20,609 --> 00:41:29,680
It also contains some IOCTLs for
configuration that can be accessed from
356
00:41:29,680 --> 00:41:36,029
the Android user land. And finally, the
driver filters various DIAG commands that
357
00:41:36,029 --> 00:41:43,890
it considers unnecessary. This is a bit
important, because when you start
358
00:41:43,890 --> 00:41:47,970
trying to run some tests and send
arbitrary DIAG commands through the DIAG
359
00:41:47,970 --> 00:41:54,980
interface, you will be required to
rebuild the actual driver to remove this
360
00:41:54,980 --> 00:42:03,249
masking, otherwise your commands will not
make it to the baseband side. At the core,
361
00:42:03,249 --> 00:42:09,299
the diagchar driver is based on the SMD
shared memory device interface, which is a
362
00:42:09,299 --> 00:42:21,470
core interface specific to Qualcomm modems.
So this is where DIAG, that is, diagchar,
363
00:42:21,470 --> 00:42:29,059
is on the diagram. The diagchar
driver itself is located in the
364
00:42:29,059 --> 00:42:39,039
application OS's vendor specific drivers.
And then there is some shared memory
365
00:42:39,039 --> 00:42:43,759
implementation in the baseband that
handles this and the DIAG implementation
366
00:42:43,759 --> 00:42:56,589
itself. The diagchar driver is quite complex
in code, but the functionality is quite
367
00:42:56,589 --> 00:43:06,869
simple. It does implement a handful of
IOCTLs that enable some configuration. I
368
00:43:06,869 --> 00:43:14,529
didn't check what exactly these IOCTLs are
responsible for. It exposes the /dev/diag
369
00:43:14,529 --> 00:43:19,430
device, which is available for reading and
writing. However, by default, you are not
370
00:43:19,430 --> 00:43:25,380
able to access the DIAG channel
through this device, because in order to
371
00:43:25,380 --> 00:43:33,220
access it, there is the diag_switch_logging
function, which switches the channel that
372
00:43:33,220 --> 00:43:41,230
is used for DIAG communications. On the
screen there are several modes listed, but
373
00:43:41,230 --> 00:43:45,009
in practice only two of them are
supported. The USB mode and the memory
374
00:43:45,009 --> 00:43:53,000
device mode. USB mode is the default,
which is why, if you just open the
375
00:43:53,000 --> 00:43:58,269
/dev/diag device and try
to read something from it, it won't work:
376
00:43:58,269 --> 00:44:07,559
it is tied to USB. And in order to
reconfigure it to use the memory device,
377
00:44:07,559 --> 00:44:17,280
you need to send a special IOCTL code.
Notice the procedure named
378
00:44:17,280 --> 00:44:24,950
mask_request_validate, which applies
quite strict filtering to the DIAG commands
379
00:44:24,950 --> 00:44:31,619
that you try to send through this
interface. So it filters out basically
380
00:44:31,619 --> 00:44:40,072
everything with the exception of some
basic configuration requests. At the core,
381
00:44:40,072 --> 00:44:46,990
the diagchar driver uses the shared memory
device to communicate with the baseband.
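The switch from USB mode to memory device mode described above can be sketched in Python. This is only an illustration under stated assumptions: the request number 7 and the mode values follow the DIAG_IOCTL_SWITCH_LOGGING, USB_MODE and MEMORY_DEVICE_MODE definitions found in older MSM kernel headers, and some kernel versions expect a pointer (or a structure) rather than a plain integer. The function names switch_logging and open_diag are hypothetical.

```python
import fcntl
import os
import struct

# Assumed values, as found in older MSM kernel headers (diagchar.h).
DIAG_IOCTL_SWITCH_LOGGING = 7
USB_MODE = 1
MEMORY_DEVICE_MODE = 2


def switch_logging(fd, mode=MEMORY_DEVICE_MODE, by_pointer=False,
                   ioctl_fn=fcntl.ioctl):
    """Ask the diagchar driver to route DIAG traffic to the given mode.

    by_pointer=False passes the mode as a plain integer; by_pointer=True
    passes a buffer holding a 32-bit integer, which some kernels expect.
    """
    arg = struct.pack("<i", mode) if by_pointer else mode
    return ioctl_fn(fd, DIAG_IOCTL_SWITCH_LOGGING, arg)


def open_diag(path="/dev/diag"):
    """Open the DIAG device and switch it to memory device mode (needs root)."""
    fd = os.open(path, os.O_RDWR)
    try:
        switch_logging(fd)
    except OSError:
        os.close(fd)
        raise
    return fd
```

The ioctl_fn parameter exists only so that the call can be exercised without a real /dev/diag device; on a phone the default fcntl.ioctl is used.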
382
00:44:46,990 --> 00:44:55,079
The SMD implementation is quite complex.
It exposes the SMD read API, which is used by
383
00:44:55,079 --> 00:45:02,679
diagchar for reading the data from the
shared memory, one of the APIs. Shared
384
00:45:02,679 --> 00:45:14,309
memory also operates on the abstraction of
channels which are accessed through the
385
00:45:14,309 --> 00:45:19,619
API named smd_named_open_on_edge. So you
can notice here that there are some DIAG
386
00:45:19,619 --> 00:45:25,120
specific channels that can be opened.
Now, let's take a look at the SMD
387
00:45:25,120 --> 00:45:29,730
implementation. This is a bit important
because the shared memory device represents
388
00:45:29,730 --> 00:45:33,420
a part of the attack surface for
escalation from the modem to the
389
00:45:33,420 --> 00:45:37,880
application processor. This is a very
important attack surface because if you
390
00:45:37,880 --> 00:45:42,509
just achieve code execution on the
baseband, it's mostly useless because it
391
00:45:42,509 --> 00:45:49,480
cannot access the main operating system.
And in order to make it useful, you'll
392
00:45:49,480 --> 00:45:59,119
need to create an exploit chain and add
one more exploit based on that bug with
393
00:45:59,119 --> 00:46:04,210
privilege escalation from the modem to the
application processor. So shared memory
394
00:46:04,210 --> 00:46:10,559
device is one of the attack surfaces for
this. The shared memory device is
395
00:46:10,559 --> 00:46:22,160
implemented as a memory region
exposed by the Qualcomm peripheral. The
396
00:46:22,160 --> 00:46:28,619
specialized MSM driver will map it and
here its name is smem_ram_phys, the
397
00:46:28,619 --> 00:46:40,099
base of the shared memory region. The
shared memory region operates on the
398
00:46:40,099 --> 00:46:50,519
concept of entries and channels, so it's
partitioned into distinct parts that can be
399
00:46:50,519 --> 00:47:00,470
accessed through the procedure
smem_get_entry, and one of these entries is
400
00:47:00,470 --> 00:47:08,070
SMEM_CHANNEL_ALLOC_TBL, which contains the
list of available channels that can be
401
00:47:08,070 --> 00:47:13,740
opened. From there, we can actually open
the channels and use the shared memory
402
00:47:13,740 --> 00:47:25,700
interface. During this initial research
project, it wasn't my goal to research the
403
00:47:25,700 --> 00:47:32,460
entire Qualcomm ecosystem, so while I was
preparing for this talk, I noticed
404
00:47:32,460 --> 00:47:37,569
some more interesting things in the source
code, such as, for example, the
405
00:47:37,569 --> 00:47:45,859
specialized driver that handles the JTAG
memory region, which is presumably exposed
406
00:47:45,859 --> 00:47:53,140
by some Qualcomm systems on chip. In the
drivers this is mostly used read only, and
407
00:47:53,140 --> 00:47:58,609
I suppose that it will not really work for
writing, but it's worth checking probably.
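The channel allocation table mentioned a moment ago can be illustrated with a small parser for a raw copy of that table, for example one taken from a memory dump. This is a sketch under assumptions, not the driver's code: the entry layout (a 20-byte name followed by three little-endian 32-bit fields, cid, type and ref_count) and the table size of 64 slots are modeled on struct smd_alloc_elm in older MSM kernels and may differ on other versions. The function name parse_channel_table is hypothetical.

```python
import struct

# Assumed layout of one SMEM_CHANNEL_ALLOC_TBL entry, modeled on
# struct smd_alloc_elm in older MSM kernels: a 20-byte name followed by
# three little-endian 32-bit fields (cid, type, ref_count).
ENTRY = struct.Struct("<20sIII")
SMD_CHANNELS = 64  # assumed number of slots in the table


def parse_channel_table(raw):
    """Return the channels that look allocated in a raw copy of the table."""
    channels = []
    count = min(SMD_CHANNELS, len(raw) // ENTRY.size)
    for index in range(count):
        name, cid, ctype, ref_count = ENTRY.unpack_from(raw, index * ENTRY.size)
        name = name.split(b"\x00", 1)[0].decode("ascii", "replace")
        if not name or ref_count == 0:
            continue  # unused slot
        # Assumption: the low byte of the type field identifies the edge.
        channels.append({"name": name, "cid": cid, "edge": ctype & 0xFF})
    return channels
```

On a dump from a real device, the DIAG-specific channels mentioned earlier would be expected to show up among the returned names.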
408
00:47:58,609 --> 00:48:07,849
And now, finally, let's take a look at the
DIAG protocol itself. One of the first
409
00:48:07,849 --> 00:48:13,119
things that I noticed when researching the
DIAG protocol is that it's actually used
410
00:48:13,119 --> 00:48:21,460
in a few places, not only in libqcdm. A
popular tool named SnoopSnitch can enable
411
00:48:21,460 --> 00:48:27,460
protocol dumps, that is, cellular protocol
dumps, on rooted devices. And in order to
412
00:48:27,460 --> 00:48:33,349
accomplish this, SnoopSnitch sends an
opaque blob of commands to the mobile
413
00:48:33,349 --> 00:48:40,349
device through the DIAG interface. This
blob is not documented. So it got me
414
00:48:40,349 --> 00:48:46,740
curious what exactly these commands are
doing. But before we can look at the dump,
415
00:48:46,740 --> 00:48:53,780
let's understand the protocol. The DIAG
protocol consists of around 200 commands
416
00:48:53,780 --> 00:49:02,365
or tokens. Some of them are documented in
the open source, but not all of them. So
417
00:49:02,365 --> 00:49:07,630
you can notice on the screenshots, some of
the commands are missing. And one of the
418
00:49:07,630 --> 00:49:21,680
missing commands is actually the token 0x92
hexadecimal, which represents an encoded hash log
419
00:49:21,680 --> 00:49:34,069
message. The command format is quite
simple. The basic primitive here is the
420
00:49:34,069 --> 00:49:42,819
DIAG token number 0x7E, it's not really a
delimiter, it's a separate DIAG command
421
00:49:42,819 --> 00:49:49,519
126. It's missing in the open source, as
you can see here. So the DIAG command is
422
00:49:49,519 --> 00:49:57,870
nested. The outer layer consists of this
wrapper of 0x7e hexadecimal bytes. Then
423
00:49:57,870 --> 00:50:02,329
there is the main command and then there
is some variable length data that can
424
00:50:02,329 --> 00:50:10,839
contain even more subcommands. This entire
thing is verified using a CRC, and some
425
00:50:10,839 --> 00:50:16,860
bytes are escaped, specifically as you
can see in the snippet. One interesting
426
00:50:16,860 --> 00:50:24,539
thing about the DIAG protocol is that it
supports subsystem extensions. Basically,
427
00:50:24,539 --> 00:50:29,820
different subsystems in the baseband can
register their own DIAG system handlers,
428
00:50:29,820 --> 00:50:38,119
arbitrary ones. And there is a special DIAG
command number 75, which simply
429
00:50:38,119 --> 00:50:43,419
instructs the DIAG system to forward this
command to the respective subsystem. And
430
00:50:43,419 --> 00:50:56,849
then it will be parsed there. There exists
quite a large number of subsystems. Not
431
00:50:56,849 --> 00:51:01,480
all of them are documented, and when I
started investigating this, I noticed that
432
00:51:01,480 --> 00:51:08,360
there actually exist a DIAG
subsystem and a debugging subsystem. The
433
00:51:08,360 --> 00:51:15,089
latter one immediately interested me
because I was hoping that it would enable
434
00:51:15,089 --> 00:51:19,700
some more advanced introspection through
this debugging subsystem. But it turned
435
00:51:19,700 --> 00:51:25,910
out that the debugging subsystem is quite
simple. It only supports one command:
436
00:51:25,910 --> 00:51:35,470
inject crash. So you can send a special
DIAG command that will inject a crash
437
00:51:35,470 --> 00:51:43,970
into the baseband. I will talk later about
this. Now, let's take a look at specific
438
00:51:43,970 --> 00:51:52,410
examples of the DIAG protocol. This is the
annotated snippet of the blob of commands
439
00:51:52,410 --> 00:52:00,720
from SnoopSnitch. This blob actually
consists of three large logical parts. The
440
00:52:00,720 --> 00:52:04,470
first part is largely irrelevant. It's a
bunch of commands that request various
441
00:52:04,470 --> 00:52:10,249
information from the baseband, such as
timestamp, version info, build id and so
442
00:52:10,249 --> 00:52:16,839
on. The second batch of commands starts
with command number 0x73 hexadecimal.
443
00:52:16,839 --> 00:52:26,529
This is DIAG_CMD_LOG_CONFIG. This is the
command which enables protocol dumps and
444
00:52:26,529 --> 00:52:34,390
configures them. And the third part of this
blob starts with the command number 0x7D
445
00:52:34,390 --> 00:52:38,459
hexadecimal. This is
DIAG_CMD_EXT_MESSAGE_CONFIG. This is actually
446
00:52:38,459 --> 00:52:43,410
the command that is supposed to enable
textual message logging, except that in
447
00:52:43,410 --> 00:52:51,680
case of SnoopSnitch it disables all of the
logging altogether. So how do
448
00:52:51,680 --> 00:52:57,390
cellular protocol dumps actually work? In order to
enable cellular protocol dumps, we need
449
00:52:57,390 --> 00:53:04,210
DIAG_CMD_LOG_CONFIG, number 0x73
hexadecimal. It is partially documented in
450
00:53:04,210 --> 00:53:12,640
libqcdm. The structure of the packet
would contain the code and the subcommand,
451
00:53:12,640 --> 00:53:18,079
that would be set mask in this case. It
also needs an equipment ID, which
452
00:53:18,079 --> 00:53:25,230
corresponds to the specific protocol that
we want to dump. And finally, the masks
453
00:53:25,230 --> 00:53:33,369
that are applied to filter some
parts of the dump. This is relatively
454
00:53:33,369 --> 00:53:41,020
straightforward. And now the second command,
DIAG_CMD_EXT_MESSAGE_CONFIG. This
455
00:53:41,020 --> 00:53:48,359
is the one which is supposed to enable
textual message logs. The command format
456
00:53:48,359 --> 00:54:00,130
is undocumented. So let's take a closer
look at it. The command consists of a
457
00:54:00,130 --> 00:54:06,720
subcommand. In this case, it's subcommand
number 4, the set mask. And then there are
458
00:54:06,720 --> 00:54:15,819
two 16 bit integers. SSID start and end.
SSID is subsystem ID, which is not the
459
00:54:15,819 --> 00:54:26,099
same as DIAG subsystems. And the last one
is the mask, so subsystem IDs are used to
460
00:54:26,099 --> 00:54:31,859
filter the messages based on a specific
subsystem, because there is a huge amount
461
00:54:31,859 --> 00:54:35,970
of subsystems in the baseband. And if all
of them start logging, this is a huge
462
00:54:35,970 --> 00:54:41,720
amount of data. So DIAG provides this
capability to filter a little bit, to a
463
00:54:41,720 --> 00:54:49,569
specific subsystem that you're interested
in. The snippet of Python code here is an
464
00:54:49,569 --> 00:54:58,440
example of how to enable textual message logging
for all subsystems. You need to set the
465
00:54:58,440 --> 00:55:12,680
mask to all 1s. And this is quite a lot of
logging in my experience. Now for parsing
466
00:55:12,680 --> 00:55:18,039
the incoming log messages, there are two
types of DIAG tokens, both of them are
467
00:55:18,039 --> 00:55:26,399
undocumented. The first one is a legacy
message number 0x79 hexadecimal. This is a
468
00:55:26,399 --> 00:55:32,420
simple ASCII based message that arrives
through the DIAG interface so you can
469
00:55:32,420 --> 00:55:38,509
parse it quite straightforwardly. The
second one, which I called
470
00:55:38,509 --> 00:55:43,640
DIAG_CMD_LOG_HASH, is number 0x92
hexadecimal. This is the token which
471
00:55:43,640 --> 00:55:50,650
encodes the log messages that contain only
the hashes. With this one, if you
472
00:55:50,650 --> 00:55:57,579
have the msg_hash.txt file, you can
match the hash that arrived with
473
00:55:57,579 --> 00:56:02,170
this command against the messages provided in
the text file. And you can get the textual
474
00:56:02,170 --> 00:56:08,900
logs. On the lower part of the slide there
are two examples of hexdumps from both
475
00:56:08,900 --> 00:56:16,019
commands. Both of them have a similar
structure. First, there are 4 bytes
476
00:56:16,019 --> 00:56:23,569
that are essential. The first one is the
command itself. And the third byte, which is
477
00:56:23,569 --> 00:56:30,950
quite interesting, is the number of
arguments included. Next there is a 64-bit
478
00:56:30,950 --> 00:56:40,470
timestamp value. Next there is the SSID
value, 16 bits, then a line number, and I'm
479
00:56:40,470 --> 00:56:48,509
not sure what the next field is. And
finally, after that, there is either an ASCII
480
00:56:48,509 --> 00:56:59,380
encoded log string in plain text or a hash
of the log string. Optionally, some
481
00:56:59,380 --> 00:57:06,060
arguments may be included. In the
case of the first, legacy command, the
482
00:57:06,060 --> 00:57:10,400
arguments are included before the log
message, and in case of the second command
483
00:57:10,400 --> 00:57:16,670
they are included after the MD5 hash of
the log message, at least in my version of
484
00:57:16,670 --> 00:57:29,109
this implementation. And this is the DIAG
packet that enables you to inject a crash
485
00:57:29,109 --> 00:57:36,970
into the baseband, at least in theory.
Because in my case it did not work. And by
486
00:57:36,970 --> 00:57:41,410
not working, I mean that it simply did not
enter the baseband. Normally, I would
487
00:57:41,410 --> 00:57:46,470
expect that on a production device it should
just reset the baseband. You will not get
488
00:57:46,470 --> 00:57:53,029
a crash dump or anything like that, just a
reset. So I suppose that it still should
489
00:57:53,029 --> 00:57:58,150
be working on some other devices. So it's
worth checking. There are a few types of
490
00:57:58,150 --> 00:58:09,789
crashes that you can request in this way.
In order to accomplish this, I needed a
491
00:58:09,789 --> 00:58:17,119
very simple tool with basically two
functions. First, direct and easy access to
492
00:58:17,119 --> 00:58:22,839
the DIAG interface, ideally through some
sort of Python shell. And second, the
493
00:58:22,839 --> 00:58:29,779
ability to read and parse data with
advanced log strings. For that purpose, I
494
00:58:29,779 --> 00:58:37,999
wrote a simple framework that I named
diagtalk, which is based directly on the
495
00:58:37,999 --> 00:58:49,349
/dev/diag interface in the Android kernel,
with a Python harness. So on the left
496
00:58:49,349 --> 00:58:56,970
side, here is the example of some advanced
parsing with some leaked values. And on
497
00:58:56,970 --> 00:59:02,014
the right side, here is the example of the
advanced message log, which includes the
498
00:59:02,014 --> 00:59:10,589
log strings that were
stripped out from the firmware. The log is
499
00:59:10,589 --> 00:59:16,791
quite fun, as I expected it to be, it has
a lot of detailed data, such as, for
500
00:59:16,791 --> 00:59:22,800
example, GPS coordinates and various
attempts of the baseband to connect to
501
00:59:22,800 --> 00:59:34,539
different channels. And I think it's quite
useful for offensive research purposes.
502
00:59:34,539 --> 00:59:42,960
It sometimes even contains raw pointers,
as you can notice in the screenshot. So in
503
00:59:42,960 --> 00:59:50,069
this project, my conclusion was that
indeed I was reassured that it was the
504
00:59:50,069 --> 00:59:56,660
right choice and Hexagon seems to be
quite a challenging target, and it would
505
00:59:56,660 --> 01:00:00,940
probably need several more months of work
to even begin to do some serious offensive
506
01:00:00,940 --> 01:00:08,500
work. I also started to think about
writing a software debugger because it
507
01:00:08,500 --> 01:00:15,640
seems to be probably the most
reliable way to achieve debugging
508
01:00:15,640 --> 01:00:22,140
introspection. And also, I noticed some
blank spaces in the field that may require
509
01:00:22,140 --> 01:00:27,839
future work. For Qualcomm Hexagon
specifically, there are a lot of things
510
01:00:27,839 --> 01:00:35,539
that can be done. For example, you can
take a look at other Qualcomm proprietary
511
01:00:35,539 --> 01:00:40,609
diagnostic protocols of which there are a
few, such as QMI for example, I think they
512
01:00:40,609 --> 01:00:49,400
are lesser known than the DIAG protocol. And
then there is a requirement to create a
513
01:00:49,400 --> 01:00:58,569
full system emulation based on QEMU at
least for some chips. And there is a big problem
514
01:00:58,569 --> 01:01:04,140
with the decompiler, which is a major
obstacle to any serious static analysis of
515
01:01:04,140 --> 01:01:14,979
the code. And for offensive research,
there are 3 large directions. The first one is
516
01:01:14,979 --> 01:01:18,920
enabling debugging. There are different
ways for that. For example, software based
517
01:01:18,920 --> 01:01:25,940
debugging or bypassing JTAG fusing, on the
other hand. Next, there are explorations
518
01:01:25,940 --> 01:01:33,000
of the over the air attack vectors. And
the 3rd one is escalation from the baseband
519
01:01:33,000 --> 01:01:39,369
to the application processor. These are
the 3 large offensive research vectors.
520
01:01:39,369 --> 01:01:44,670
And for the basebands in general, there
also exist some interesting directions of
521
01:01:44,670 --> 01:01:54,140
future work. First of all, the OsmocommBB.
It definitely deserves a bit of an
522
01:01:54,140 --> 01:01:59,989
update. It is the only open source
implementation of a baseband. And it is so
523
01:01:59,989 --> 01:02:09,040
outdated. And it is based on
some really obscure hardware. Another
524
01:02:09,040 --> 01:02:17,677
problem here is that there doesn't exist
any software based CDMA implementation.
525
01:02:17,677 --> 01:02:28,660
No sound
526
01:02:28,660 --> 01:02:34,067
Herald: Alisa, thank you very much for
this nice talk. Um, there are some
527
01:02:34,067 --> 01:02:39,030
questions from the audience. So basically
the first one is a little bit of an
528
01:02:39,030 --> 01:02:46,358
icebreaker: Do you use a mobile phone?
And do you trust it?
529
01:02:46,358 --> 01:02:51,769
Alisa: No, I don't. I try to use a mobile
phone only for Twitter. Does anyone still
530
01:02:51,769 --> 01:03:00,065
use mobile phones nowadays?
H: laughs Well, no idea. Another
531
01:03:00,065 --> 01:03:07,979
question concerns the other Qualcomm
chips. Did you have a look at the Qualcomm
532
01:03:07,979 --> 01:03:15,960
Wi-Fi chipsets?
A: As I mentioned during the talk, I had
533
01:03:15,960 --> 01:03:20,509
only one month. It was like a short
reconnaissance project, so I didn't really
534
01:03:20,509 --> 01:03:27,020
have time to investigate everything. I did
notice that Qualcomm SoCs have a Wi-Fi
535
01:03:27,020 --> 01:03:32,369
chip, which is also based on Hexagon. And
more than that, it also shares some of the
536
01:03:32,369 --> 01:03:38,540
same low level technical primitives. So
it's definitely worth looking at, but I didn't
537
01:03:38,540 --> 01:03:45,019
investigate it in detail.
H: OK, OK, thanks. There is also a pretty
538
01:03:45,019 --> 01:03:50,820
technical question here, so instead of
having to go through the rigorous command
539
01:03:50,820 --> 01:03:57,600
checking of the diagchar driver,
wouldn't it be possible to mmap /dev/mem
540
01:03:57,600 --> 01:04:04,604
into a userspace process and send
commands over directly? It depends a little bit
541
01:04:04,604 --> 01:04:11,799
on what the goal is.
A: OK, so it really depends on your
542
01:04:11,799 --> 01:04:16,869
previous background and your goals. The
point here is that by default, the
543
01:04:16,869 --> 01:04:23,420
diagchar ecosystem does not allow sending
arbitrary DIAG commands. So either way,
544
01:04:23,420 --> 01:04:28,749
you will have to hack something. One way
to hack this is to rebuild the actual
545
01:04:28,749 --> 01:04:33,529
driver. So you would be able to send the
commands directly through that DIAG
546
01:04:33,529 --> 01:04:37,859
interface. Another way would be to access
the shared memory directly, for example.
547
01:04:37,859 --> 01:04:42,079
But I think it would be more complex
because the Qualcomm shared memory
548
01:04:42,079 --> 01:04:47,440
implementation is quite complex. So I
think that the easiest way would be
549
01:04:47,440 --> 01:04:52,789
actually to hack the diagchar driver
and use the /dev/diag interface for this.
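For readers who want to experiment with sending commands through /dev/diag once the driver filter is out of the way, here is a minimal Python sketch of the packet construction described in the talk: the set-mask form of command 0x7D (subcommand 4, SSID start and end, mask), protected by a CRC, with reserved bytes escaped and a closing 0x7E byte. The details are assumptions modeled on open-source tools such as libqcdm rather than taken from the slides: CRC-16/X-25 appended in little-endian order, 0x7D as the escape byte with XOR 0x20, a 16-bit padding field after the SSID range, and one 32-bit mask per SSID. All function names are hypothetical.

```python
import struct

DIAG_TERMINATOR = 0x7E  # closes every frame
DIAG_ESCAPE = 0x7D      # assumed escape byte (HDLC-style)
DIAG_CMD_EXT_MESSAGE_CONFIG = 0x7D
SUBCMD_SET_MASK = 4


def crc16_x25(data):
    """Bitwise CRC-16/X-25 (reflected polynomial 0x8408)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def frame(payload):
    """Append the CRC, escape reserved bytes, and add the terminator."""
    body = payload + struct.pack("<H", crc16_x25(payload))
    out = bytearray()
    for byte in body:
        if byte in (DIAG_TERMINATOR, DIAG_ESCAPE):
            out += bytes([DIAG_ESCAPE, byte ^ 0x20])
        else:
            out.append(byte)
    out.append(DIAG_TERMINATOR)
    return bytes(out)


def unframe(packet):
    """Reverse frame(): strip the terminator, unescape, and check the CRC."""
    if not packet or packet[-1] != DIAG_TERMINATOR:
        raise ValueError("missing terminator")
    body = bytearray()
    escaped = False
    for byte in packet[:-1]:
        if escaped:
            body.append(byte ^ 0x20)
            escaped = False
        elif byte == DIAG_ESCAPE:
            escaped = True
        else:
            body.append(byte)
    if len(body) < 2:
        raise ValueError("frame too short")
    payload = bytes(body[:-2])
    (crc,) = struct.unpack("<H", bytes(body[-2:]))
    if crc16_x25(payload) != crc:
        raise ValueError("bad CRC")
    return payload


def ext_message_set_mask(ssid_start, ssid_end, mask=0xFFFFFFFF):
    """Build the set-mask request; the default mask is all ones, as in the talk."""
    header = struct.pack("<BBHHH", DIAG_CMD_EXT_MESSAGE_CONFIG, SUBCMD_SET_MASK,
                         ssid_start, ssid_end, 0)
    masks = struct.pack("<I", mask) * (ssid_end - ssid_start + 1)
    return header + masks


# Example: enable all message levels for SSIDs 0 through 3.
packet = frame(ext_message_set_mask(0, 3))
```

How the resulting bytes are handed to the driver depends on the kernel version and is not covered by this sketch.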
550
01:04:52,789 --> 01:05:00,270
H: OK, thanks. Thanks. There is one
question which I'm going to read out,
551
01:05:00,270 --> 01:05:14,870
maybe you can make sense of it: is this
typical [unclear] security for mobile phones?
552
01:05:14,870 --> 01:05:19,289
A: This level of hardening that I
presented, I think is around medium level.
553
01:05:19,289 --> 01:05:24,270
So usually production phones are even more
hardened. If you take a look at things
554
01:05:24,270 --> 01:05:31,249
like the Google Pixel 5 or the latest iPhones,
they will be even better hardened than
555
01:05:31,249 --> 01:05:38,640
the one that I discussed.
H: Oh, OK. Yeah, thanks. Thanks then. So it
556
01:05:38,640 --> 01:05:42,900
doesn't look like we have any more
questions left. Anyway, so if you want to
557
01:05:42,900 --> 01:05:49,122
get in contact with Alisa, no problem.
There is the feedback tab below your
558
01:05:49,122 --> 01:05:56,888
video now at the moment, just drop your
questions over there. And that's a way to
559
01:05:56,888 --> 01:06:02,736
get in touch with Alisa. Other than that I
would say we're done for today for this
560
01:06:02,736 --> 01:06:07,410
session. Thank you very, very much Alisa
for this really nice presentation once
561
01:06:07,410 --> 01:06:14,160
again. Applause And I'll transfer now over
to the Herald News Show.
562
01:06:14,160 --> 01:06:33,639
postroll music
563
01:06:33,639 --> 01:06:54,000
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!