WEBVTT
00:00:00.000 --> 00:00:13.047
rC3 preroll music
00:00:13.047 --> 00:00:17.730
Herald: Our next speaker, Alisa Esage, is
an independent vulnerability researcher
00:00:17.730 --> 00:00:22.640
and has a notable record of security
research achievements such as the Zero Day
00:00:22.640 --> 00:00:29.770
Initiative Silver Bounty Hunter Award
2018. Alisa is going to present her latest
00:00:29.770 --> 00:00:36.007
research on the Qualcomm DIAG protocol,
which is found abundantly in Qualcomm
00:00:36.007 --> 00:00:46.500
Hexagon based cellular modems. Alisa,
we're looking forward to your talk now.
00:00:46.500 --> 00:00:49.701
Alisa Esage: This is Alisa Esage, you're
attending my presentation about Advanced
00:00:49.701 --> 00:01:01.010
Hexagon DIAG at Chaos Communication
Congress 2020 remote experience. My main
00:01:01.010 --> 00:01:06.250
interest as an advanced vulnerability
researcher is complex systems and hardened
00:01:06.250 --> 00:01:11.920
systems. For the last 10 years I have been
researching various classes of software
00:01:11.920 --> 00:01:16.280
such as Windows kernel, browsers,
JavaScript engines. And for the last three
00:01:16.280 --> 00:01:21.880
years I was focusing mostly on
Hypervisors. The project that I'm
00:01:21.880 --> 00:01:27.970
presenting today was a little side project
that I made as a distraction a couple of years
00:01:27.970 --> 00:01:37.560
ago. The name of this talk, Advanced
Hexagon DIAG, is a bit of an understatement
00:01:37.560 --> 00:01:45.290
in an attempt to keep this talk a little
bit low key in the general internet,
00:01:45.290 --> 00:01:50.840
because a big part of the talk will
actually be devoted to general
00:01:50.840 --> 00:01:56.710
vulnerability research in basebands. But
the primary focus of this talk is on the
00:01:56.710 --> 00:02:02.899
Hexagon DIAG, also known as QCDM, Qualcomm
diagnostic manager. This is a proprietary
00:02:02.899 --> 00:02:09.229
protocol developed by Qualcomm for use in
their basebands, and it is included on all
00:02:09.229 --> 00:02:18.400
Snapdragon SoCs and modem chips produced
by Qualcomm. Modern Qualcomm chips run
00:02:18.400 --> 00:02:24.299
on custom silicon with a custom
instruction set architecture named
00:02:24.299 --> 00:02:30.930
QDSP6 Hexagon. This is important because
all the DIAG handlers that we will be
00:02:30.930 --> 00:02:41.699
dealing with are written in this
instruction set architecture. As usual
00:02:41.699 --> 00:02:47.769
with my talks, I have adjusted the
materials of this presentation for various
00:02:47.769 --> 00:02:52.659
audiences, for the full spectrum of
audiences. Specifically, the first part of
00:02:52.659 --> 00:03:00.699
the presentation is mostly specialized for
research directors and high level
00:03:00.699 --> 00:03:06.719
technical staff. And the last part is more
deeply technical, and it would be mostly
00:03:06.719 --> 00:03:14.510
interesting to specialized vulnerability
researchers and low level programmers that
00:03:14.510 --> 00:03:25.400
somehow are related to this particular
area. Let's start from the top level
00:03:25.400 --> 00:03:31.540
overview of cellular technology. This mind
map presents a simplified view of various
00:03:31.540 --> 00:03:36.739
types of entities that we'd have to deal
with with respect to basebands. It's not a
00:03:36.739 --> 00:03:44.659
complete diagram, of course, but it only
presents the classes of entities that
00:03:44.659 --> 00:03:51.540
exist in this space. Also, this mind map
is specific to the client side equipment,
00:03:51.540 --> 00:03:57.109
the user equipment and it completely omits
any server side considerations which are a
00:03:57.109 --> 00:04:02.290
world in their own. There exists quite a
large number of cellular protocols on the
00:04:02.290 --> 00:04:08.199
planet. From the user perspective, this is
simple. This is usually the short name
00:04:08.199 --> 00:04:15.469
3G, 4G that you see on the mobile screen.
But in reality, this simple
00:04:15.469 --> 00:04:27.409
generation name may encode
several different distinct technologies.
00:04:27.409 --> 00:04:32.620
There are a few key points about cellular
protocols that are crucial to understand
00:04:32.620 --> 00:04:38.860
before starting to approach this area. The
first one is the concept of a generation.
00:04:38.860 --> 00:04:45.379
This is simple. This is simply 1G, 2G and
so on: the generic name of the family of
00:04:45.379 --> 00:04:49.910
protocols that are supported in a
particular generation. Generation is
00:04:49.910 --> 00:04:55.539
simply a marketing name, for users. It
doesn't really have any strict technical
00:04:55.539 --> 00:05:02.199
meaning. And generations represent the
evolution of cellular protocols in time.
00:05:02.199 --> 00:05:06.840
The second most important thing about
cellular protocols is the air interface.
00:05:06.840 --> 00:05:13.629
This is the lowest level protocol,
the one which
00:05:13.629 --> 00:05:20.270
defines how exactly the cellular
signal is digitized and read from the
00:05:20.270 --> 00:05:26.700
electromagnetic wave and how exactly the
different players in this field divide the
00:05:26.700 --> 00:05:32.990
space. Historically, there existed two
main implementations of this low level
00:05:32.990 --> 00:05:39.330
code called TDMA and CDMA. TDMA means time
division multiple access, which basically
00:05:39.330 --> 00:05:43.670
divides the entire electromagnetic
spectrum within the radio band into time
00:05:43.670 --> 00:05:51.490
slots that are rotated in a round robin
manner by various mobile phones so that
00:05:51.490 --> 00:06:04.319
they speak in turns. TDMA was the base for
the GSM technology. And GSM was the main
00:06:04.319 --> 00:06:09.919
protocol used on this planet for a long
time. Another low level implementation is
00:06:09.919 --> 00:06:16.689
CDMA. It was a little bit more complex
from the beginning. It stands for code
00:06:16.689 --> 00:06:24.300
division multiple access. And instead of
dividing the spectrum in time slots and
00:06:24.300 --> 00:06:32.580
dividing the protocol in bursts, CDMA uses
random codes that are assigned to mobile
00:06:32.580 --> 00:06:43.060
phones so that this code can be used as an
additional randomizing mask against the
00:06:43.060 --> 00:06:48.400
modulation protocol. And multiple user
equipments can talk on the same frequency
00:06:48.400 --> 00:06:57.110
without interrupting each other. Note here
that CDMA was developed by Qualcomm and it
00:06:57.110 --> 00:07:03.159
was mostly used in the United States. So
at the level of 2G, there were two main
00:07:03.159 --> 00:07:11.581
protocols, GSM based on TDMA and
cdmaOne based on CDMA. In the third
00:07:11.581 --> 00:07:17.919
generation of mobile protocols these two
branches of development were continued. So
00:07:17.919 --> 00:07:24.160
GSM evolved into UMTS, while cdmaOne
evolved into CDMA2000. The important point
00:07:24.160 --> 00:07:31.029
here is that UMTS has at this point
already adopted the low level air
00:07:31.029 --> 00:07:37.340
interface protocol from the CDMA and
eventually at the fourth generation of
00:07:37.340 --> 00:07:41.240
protocols these two branches of
development come together to create the
00:07:41.240 --> 00:07:52.680
LTE technology and the same for the 5G.
This is important for us from the
00:07:52.680 --> 00:07:57.909
offensive perspective, because first of
all, all of these technologies, including
00:07:57.909 --> 00:08:04.999
the air interfaces, represent separate
bits of code with separate parsing
00:08:04.999 --> 00:08:09.900
algorithms within the baseband firmware.
And all of them are usually present in
00:08:09.900 --> 00:08:15.099
each baseband, regardless of which ones you
actually use or your mobile provider
00:08:15.099 --> 00:08:20.919
actually supports. Another important and
not obvious thing from the offensive
00:08:20.919 --> 00:08:29.940
security perspective here is that because
of this evolutionary development, the
00:08:29.940 --> 00:08:34.669
protocols are not actually completely
distinct. So if you think about LTE, it is
00:08:34.669 --> 00:08:39.289
not a completely different protocol from
GSM, but instead it is based largely on
00:08:39.289 --> 00:08:47.600
the same internal structures. And in fact,
if you look at the specifications, some of
00:08:47.600 --> 00:08:53.560
them are almost directly relevant: some of
the 2G GSM specifications
00:08:53.560 --> 00:08:59.810
are still relevant to some extent
to LTE. This is also important when you
00:08:59.810 --> 00:09:06.350
start analyzing protocols from the
offensive perspective. The cellular
00:09:06.350 --> 00:09:17.460
protocols are structured in a nested
way, in layers. Layers is the official
00:09:17.460 --> 00:09:25.120
terminology adopted by the specifications
with the exception of layer zero. Here I
00:09:25.120 --> 00:09:29.980
just added it for convenience, but
in the specifications layers start from one
00:09:29.980 --> 00:09:34.649
and proceed to three. From the offensive
perspective, the most interesting is level
00:09:34.649 --> 00:09:39.050
three, as you can see from the screenshot
of the specifications, because it encodes
00:09:39.050 --> 00:09:45.260
most of the high level protocol data, such
as handling SMS and GSM. This is the part
00:09:45.260 --> 00:09:49.830
of the protocol which actually contains
interesting data structures with TLV
00:09:49.830 --> 00:09:58.550
values and so on. When people talk about
attacking basebands, they usually mean
00:09:58.550 --> 00:10:06.010
attacking the baseband over the air: the OTA
attack vector, which is definitely one of
00:10:06.010 --> 00:10:11.930
the most interesting. But let's take a
step back and consider the entire big
00:10:11.930 --> 00:10:21.070
picture of the baseband ecosystem. This
diagram presents a unified view of
00:10:21.070 --> 00:10:28.009
the generalized architecture of a modern
baseband with attack surfaces. First of
00:10:28.009 --> 00:10:34.680
all, there are two separate distinct
processors: the AP, application processor,
00:10:34.680 --> 00:10:40.140
and the MP, which is mobile processor. It
may be either a DSP or another CPU.
00:10:40.140 --> 00:10:45.290
Usually there are two separate processors
and each one of them runs a separate
00:10:45.290 --> 00:10:51.311
operating system. In case of the AP, it
may be Android or iOS and the baseband
00:10:51.311 --> 00:10:55.940
processor will run some sort of real-
00:10:55.940 --> 00:11:03.400
time operating system provided by the
mobile vendor. An important point here is that
00:11:03.400 --> 00:11:08.649
on modern implementations, the baseband
is actually protected by some sort of secure
execution environment, maybe TrustZone on
00:11:08.649 --> 00:11:17.100
Androids or SEPOS on Apple devices. Which
means that the privilege boundary which is
00:11:17.100 --> 00:11:22.820
depicted here on the left side is dual
sided. So even if you have kernel access
00:11:22.820 --> 00:11:29.740
to the Android kernel, you still are not
supposed to be able to read the memory of
00:11:29.740 --> 00:11:33.620
the baseband or somehow interfere with its
operation, at least on the modern
00:11:33.620 --> 00:11:38.560
production smartphones. And the same goes
the other way for the baseband, which is not
00:11:38.560 --> 00:11:45.540
supposed to be able to access the application
processor directly. So these two are
00:11:45.540 --> 00:11:50.191
mutually distrusting entities that are
separated from each other. And so there
00:11:50.191 --> 00:12:01.892
exists a privilege boundary, which
represents an attack surface. Within
00:12:01.892 --> 00:12:07.389
the real-time operating systems, there are
three large attack surfaces. Starting from
00:12:07.389 --> 00:12:14.180
right to left: the rightmost gray box
represents the attack surface of the
00:12:14.180 --> 00:12:20.639
cellular stacks. This is the code which
actually parses the cellular protocols.
00:12:20.639 --> 00:12:31.699
It usually runs in several distinct real-
time operating system tasks. And this part
00:12:31.699 --> 00:12:38.519
of the attack surface handles all the
layers of the protocol. There is a huge
00:12:38.519 --> 00:12:44.070
amount of parsing that happens here. The
second box represents the various
00:12:44.070 --> 00:12:50.980
management protocols. The simplest one to
think about is the AT command protocol. It
00:12:50.980 --> 00:12:56.700
is still widely included in all basebands,
and it's even usually exposed in some way
00:12:56.700 --> 00:13:01.279
to the application processor. So you can
actually send some AT commands to the
00:13:01.279 --> 00:13:09.000
cellular modem. A bit more interesting
are the vendor specific management
00:13:09.000 --> 00:13:16.680
protocols, one of which is the DIAG
protocol. Because modern basebands are
00:13:16.680 --> 00:13:22.569
very complex, vendors need some sort of
specialized protocol to enable
00:13:22.569 --> 00:13:28.910
configuration and diagnostics for the
OEMs. In the case of Qualcomm, for example,
00:13:28.910 --> 00:13:37.170
DIAG is just one of the many diagnostic
protocols involved. The third box is what
00:13:37.170 --> 00:13:45.350
I call the RTOS core: various
core level functionality, such as the
00:13:45.350 --> 00:13:57.770
code which implements the interface to
the application processor. On the side of
00:13:57.770 --> 00:14:04.019
the application operating system such as
Android, there are also 2 attack surfaces
00:14:04.019 --> 00:14:10.370
that are attackable from the baseband. The
first one is the peripheral drivers,
00:14:10.370 --> 00:14:13.579
because the baseband is a separate
peripheral. So it requires some
00:14:13.579 --> 00:14:21.110
specialized drivers that handle I/O and
such things. And the second one is the
00:14:21.110 --> 00:14:29.002
attack surface represented by various
interface handlers because the baseband
00:14:29.002 --> 00:14:34.800
and the main operating system cannot
communicate directly. They use some sort
00:14:34.800 --> 00:14:39.839
of a specialized interface to do that. In
case of Qualcomm this is shared memory.
00:14:39.839 --> 00:14:44.670
These shared memory implementations
are usually quite complex and they
00:14:44.670 --> 00:14:51.460
represent an attack surface on both
sides. And finally, the third piece of this
00:14:51.460 --> 00:14:57.319
diagram is in the lowest part. I have
depicted two grey boxes which are related
00:14:57.319 --> 00:15:03.139
to the trusted execution environment,
because typically a modem runs as a
00:15:03.139 --> 00:15:11.379
Trustlet in a secure environment. So
technically, the attack surfaces that
00:15:11.379 --> 00:15:16.550
exist within TrustZone or are related to it
also can be useful for baseband offensive
00:15:16.550 --> 00:15:22.890
research. Here we can distinguish at least
two large attack surfaces. The first one
00:15:22.890 --> 00:15:31.490
is the secure monitor call handlers,
which is the core interface that
00:15:31.490 --> 00:15:36.960
handles calls from the application
processor to the TrustZone. And the second
00:15:36.960 --> 00:15:44.810
one are the Trustlets. They are separate
pieces of code which are executed and
00:15:44.810 --> 00:15:56.790
protected by the TrustZone. On this
diagram, I have also added some
00:15:56.790 --> 00:16:02.839
information about data codecs. I'm not sure
if they are supposed to be in the RTOS
00:16:02.839 --> 00:16:06.319
core because these things are directly
accessible from the cellular stacks
00:16:06.319 --> 00:16:14.959
usually, especially ASN.1, in which I have
seen some bugs reachable from the over the
00:16:14.959 --> 00:16:23.009
air interface. On this diagram, I have
shown some examples of vulnerabilities. I
00:16:23.009 --> 00:16:26.769
will not discuss them in detail here
since it's not the point of the
00:16:26.769 --> 00:16:32.480
presentation, but at least the ones from
Pwn2Own, you can find the writeups on
00:16:32.480 --> 00:16:46.589
the Internet. To discuss baseband
offensive tools and approaches, I have
00:16:46.589 --> 00:16:50.720
narrowed down the previous diagram to just
one attack surface, the over the air
00:16:50.720 --> 00:16:55.620
attack surface. This is the attack
surface, which is represented by parsing
00:16:55.620 --> 00:16:59.480
implementations of various cellular
protocols inside the baseband operating
00:16:59.480 --> 00:17:06.610
system. And this is the attack surface
that we can reach from the air interface.
00:17:06.610 --> 00:17:13.390
In order to accomplish that, we need a
transceiver such as software defined radio
00:17:13.390 --> 00:17:21.170
or a mobile tester, which is able to talk
the specific cellular protocol that we're
00:17:21.170 --> 00:17:28.780
planning to attack. The simplest way to
accomplish this is to use some sort of
00:17:28.780 --> 00:17:34.730
software defined radio, such as Ettus
Research USRP or bladeRF, and install an open
00:17:34.730 --> 00:17:41.240
source implementation of a base station
such as OpenBTS or OpenBSC. The thing to
00:17:41.240 --> 00:17:50.050
note here is that the software based
implementations actually lag behind the
00:17:50.050 --> 00:17:54.970
development of technologies.
Implementations of GSM base stations are
00:17:54.970 --> 00:18:03.630
very well established and popular, such as
OpenBTS. And in fact, when I tried to
00:18:03.630 --> 00:18:15.140
establish BTS with my USRP, it was quite
simple. For UMTS and LTE, there exist fewer
00:18:15.140 --> 00:18:19.950
software based implementations
and also there are more constraints on the
00:18:19.950 --> 00:18:26.310
hardware. For example, my model of the
USRP does not support UMTS due to resource
00:18:26.310 --> 00:18:31.690
constraints. And the most interesting
thing here is that there does not exist
00:18:31.690 --> 00:18:36.580
any software based implementation of
CDMA that you can use to establish a base
00:18:36.580 --> 00:18:53.270
station. This is a pseudorandom diagram of
one of the Snapdragon chips. There exists
00:18:53.270 --> 00:18:58.820
a huge number of various models of
Snapdragons. This one I have chosen
00:18:58.820 --> 00:19:05.680
pseudorandomly when I was searching for
some sort of visual diagram. Qualcomm used
00:19:05.680 --> 00:19:12.030
to include some high level diagrams of the
architecture in their marketing materials
00:19:12.030 --> 00:19:19.400
previously, but they don't do this
anymore. This particular diagram is
00:19:19.400 --> 00:19:26.820
from a technical specification of a
particular model, the 820. Also, this particular
00:19:26.820 --> 00:19:34.420
Snapdragon model is a bit interesting
because it is the first one that included
00:19:34.420 --> 00:19:44.790
the artificial intelligence engine, which
is also based on Hexagon. For our
00:19:44.790 --> 00:19:52.890
purposes, the main interest here is the
processors. The majority of Snapdragons
00:19:52.890 --> 00:19:59.630
include quite a long list of processors.
There are at least 4 ARM-based Kryo CPUs
00:19:59.630 --> 00:20:11.480
that actually run the Android operating
system. Then there are the Adreno GPUs and
00:20:11.480 --> 00:20:16.380
then there are several Hexagons. On the
most recent models there is not just one
00:20:16.380 --> 00:20:23.360
Hexagon processing unit, but several of
them. And they are named according to
00:20:23.360 --> 00:20:28.030
their purposes. Each one
of these Hexagon cores is responsible for
00:20:28.030 --> 00:20:35.770
handling a specific functionality. For
example, the MDSP handles the modem and runs the
00:20:35.770 --> 00:20:44.260
real-time operating system. The ADSP
handles media and the CDSP handles
00:20:44.260 --> 00:20:52.540
compute. So the Hexagons actually
represent around one half of the
00:20:52.540 --> 00:21:08.771
processing power on modern Snapdragons.
There are two key points about the Hexagon
00:21:08.771 --> 00:21:17.501
architecture from the hardware
perspective. First of all, Hexagon
00:21:17.501 --> 00:21:25.410
is specialized for parallel processing. And
so the first concept is variable size
00:21:25.410 --> 00:21:31.000
instruction packets. It means that
several instructions can execute
00:21:31.000 --> 00:21:42.330
simultaneously in separate execution
units. It also uses hardware
00:21:42.330 --> 00:21:48.990
multithreading for the same purposes. On
the right side of the slide here is some
00:21:48.990 --> 00:22:00.630
example of the Hexagon assembly. It is
quite funny at times. These curly brackets
00:22:00.630 --> 00:22:07.160
represent the instructions that are
executed simultaneously. And these
00:22:07.160 --> 00:22:15.500
instructions must be compatible in order
to be able to use the distinct processing
00:22:15.500 --> 00:22:21.040
slots. And then there is the funny .new
notation which actually enables the
00:22:21.040 --> 00:22:26.050
instructions to use both the old and the
new value of a particular register within
00:22:26.050 --> 00:22:32.850
the same instruction cycle. This provides
quite a bit of optimization on the lower
00:22:32.850 --> 00:22:41.200
level. For more information, I can direct
you to the Hexagon Specification and
00:22:41.200 --> 00:22:53.830
Programmer's Reference Manual, which is
available from the Qualcomm website. The
00:22:53.830 --> 00:22:59.270
concept of production fusing is quite
common. As I said previously, it's a
00:22:59.270 --> 00:23:05.590
common practice for mobile device vendors
to lock down the devices before they enter
00:23:05.590 --> 00:23:11.540
the market to prevent modifications and
tinkering. And for the purposes of this
00:23:11.540 --> 00:23:17.300
locking down, there are
several ways how this can be accomplished.
00:23:17.300 --> 00:23:24.356
Usually various advanced diagnostic and
debugging functionalities are removed from
00:23:24.356 --> 00:23:30.820
either software or hardware or both. It is
quite common that these functionalities are
00:23:30.820 --> 00:23:37.180
only removed from software while the
hardware remains there. And in such a case,
00:23:37.180 --> 00:23:43.869
eventually researchers will
come up with their own software based
00:23:43.869 --> 00:23:50.050
implementation of this functionality, as
is the case with some custom iOS kernel
00:23:50.050 --> 00:23:55.910
debuggers, for example. In case of
Qualcomm, there was at some point a leaked
00:23:55.910 --> 00:24:02.416
internal memo which discusses what exactly
they are doing for production fusing the
00:24:02.416 --> 00:24:15.730
devices. In addition to production
fusing, in the case of modern Androids the
00:24:15.730 --> 00:24:22.860
baseband runs within TrustZone. And
on my implementation, it is already quite
00:24:22.860 --> 00:24:28.680
locked down. It uses a separate component.
The baseband uses a separate component
00:24:28.680 --> 00:24:36.510
named the MBA, which stands for the modem
boot authenticator. And this entire thing
00:24:36.510 --> 00:24:42.210
is run by the subsystem of the Android kernel
named PIL, the peripheral image loader.
00:24:42.210 --> 00:24:50.820
You can open the source code and
investigate how exactly it looks. And the
00:24:50.820 --> 00:24:57.430
purpose of the MBA is to authenticate the
modem firmware so that you would not be
00:24:57.430 --> 00:25:04.000
able to inject some arbitrary commands
into the modem firmware and flash it. This
00:25:04.000 --> 00:25:09.250
is another side of the hardening, which
makes it very difficult to inject any
00:25:09.250 --> 00:25:13.260
arbitrary code into the baseband.
Basically, the only way to do this is
00:25:13.260 --> 00:25:23.130
through a software vulnerability. During
this project I have reverse engineered
00:25:23.130 --> 00:25:33.360
partially the Hexagon modem firmware from
my implementation, from my Nexus 6P. The
00:25:33.360 --> 00:25:38.770
process of reverse engineering is not very
difficult because all you need is to
00:25:38.770 --> 00:25:44.950
download the firmware from the website,
Google's website in this case. Then you
00:25:44.950 --> 00:25:50.960
need to find the binary which corresponds
to the modem firmware. This binary is
00:25:50.960 --> 00:25:57.680
actually a compound binary that must be
divided into separate binaries that
00:25:57.680 --> 00:26:04.940
represent specific sections inside the
firmware. And for that purpose we can use
00:26:04.940 --> 00:26:11.410
the unify_trustlet script. After you
have split the baseband firmware into separate
00:26:11.410 --> 00:26:18.270
sections, you can load them into IDA Pro.
There are several plugins available for
00:26:18.270 --> 00:26:26.110
IDA Pro that support Hexagon. I have tried
one of them, I think it was from GSMK, and it
00:26:26.110 --> 00:26:35.650
works quite well for basic reverse
engineering purposes. Notable here is that
00:26:35.650 --> 00:26:41.660
some sections of the modem firmware are
compressed and relocated at runtime, so
00:26:41.660 --> 00:26:48.350
you would not be able to reverse engineer
them unless you can decompress them,
00:26:48.350 --> 00:26:52.270
which is also a bit of a challenge because
Qualcomm uses some internal
00:26:52.270 --> 00:27:02.000
compression algorithm for that. For
reverse engineering, the main approach here
00:27:02.000 --> 00:27:06.010
is to get started with some root points,
for example, because this is a real time
00:27:06.010 --> 00:27:11.290
operating system, we know that it should
have some task structures
00:27:11.290 --> 00:27:16.340
that we can locate. And from
there we can locate some interesting code.
00:27:16.340 --> 00:27:20.160
In case of Hexagon this is a bit non-
trivial because, as I said, it doesn't
00:27:20.160 --> 00:27:24.930
have any log strings. So even though you
may locate something that looks like a
00:27:24.930 --> 00:27:30.530
task struct, it's not clear which code
it actually represents. So the first
00:27:30.530 --> 00:27:43.360
step here is to apply the log strings that
were removed from the binary by Qshrink. I
00:27:43.360 --> 00:27:51.920
think the only way to do it is by using
that msg_hash.txt file from the leaked
00:27:51.920 --> 00:27:57.590
sources. This file is not supposed to be
available either on the mobile devices
00:27:57.590 --> 00:28:05.470
or in any open ecosystem. After you
have applied these log strings, you will
00:28:05.470 --> 00:28:10.841
be able to rename some functions
based on these log strings. And because the
00:28:10.841 --> 00:28:17.420
log strings often contain the names of the
source file or source module from which the
00:28:17.420 --> 00:28:27.090
code was built, it creates an opportunity
to understand what exactly this code is
00:28:27.090 --> 00:28:34.920
doing. Debugging was completely
unavailable in my case, and I realized
00:28:34.920 --> 00:28:44.820
that it would require a couple of
months more work to make it work. The
00:28:44.820 --> 00:28:49.490
only way, I think, and the best way, is to
create a software based debugger similar
00:28:49.490 --> 00:28:57.100
to ModKit, the publication that I will be
listing in the references, based on
00:28:57.100 --> 00:29:05.520
a software vulnerability in either the modem
itself or in some authenticator or in the
00:29:05.520 --> 00:29:09.700
TrustZone, so that we can inject
software debugger callbacks into the
00:29:09.700 --> 00:29:20.180
baseband and connect it to a GDB stub.
This is how the part of the firmware looks
00:29:20.180 --> 00:29:28.040
that has log strings stripped out. Here it
already has some names applied using an IDA
00:29:28.040 --> 00:29:32.940
script. Of course there were no such
names initially, only the hashes. Each one
00:29:32.940 --> 00:29:38.450
of these hashes represents a log string
that you can take from the message hash
00:29:38.450 --> 00:29:48.720
file. And here is what you can get after
you have applied the textual messages and
00:29:48.720 --> 00:29:54.120
renamed some functions. In this case, you
would be able to find some hundreds of
00:29:54.120 --> 00:29:59.600
procedures that are directly related to
the DIAG subsystem. And in a similar way
00:29:59.600 --> 00:30:07.460
you can locate various subsystems related
to over the air vectors as well. But
00:30:07.460 --> 00:30:17.650
unfortunately, the majority of the OTA vectors
are located in the segments that are not
00:30:17.650 --> 00:30:23.190
immediately available in the firmware, the
ones that are compressed and relocated.
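The log string recovery step described a moment ago can be sketched roughly as follows. This is a minimal illustration and not the tooling used in the talk: the `hash:source_file:message` line layout, the sample entries, and the helper names `load_hash_table` and `suggest_name` are all assumptions made for the example, and the real msg_hash.txt layout may differ.

```python
import re

# Made-up sample in an ASSUMED "hash:source_file:message" layout.
SAMPLE = """\
1234567890:diagpkt.c:Invalid subsys cmd code %d
987654321:diagcomm_smd.c:SMD channel %d opened
"""

def load_hash_table(text):
    """Map each numeric hash to its (source_file, message) pair."""
    table = {}
    for line in text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip lines that do not fit the assumed layout
        table[int(parts[0])] = (parts[1], parts[2])
    return table

def suggest_name(table, hash_value, address):
    """Build a function name from the source file a hash resolves to."""
    entry = table.get(hash_value)
    if entry is None:
        return None  # unknown hash: leave the function unnamed
    # Drop the file extension and replace non-identifier characters.
    module = re.sub(r"\W", "_", entry[0].rsplit(".", 1)[0])
    return f"{module}_{address:08x}"

table = load_hash_table(SAMPLE)
print(suggest_name(table, 1234567890, 0xC0DE1000))
```

In a disassembler script, the hash would be read from the argument of each logging call and the suggested name applied to the enclosing function, which is what makes whole subsystems such as DIAG searchable by name.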
00:30:23.190 --> 00:30:31.360
Meanwhile, I have tried many different
things during this project. The things
00:30:31.360 --> 00:30:37.360
that definitely worked is building the MSM
kernel. There is nothing special about
00:30:37.360 --> 00:30:44.980
this, just a regular cross-build. Another
well known offensive approach is
00:30:44.980 --> 00:30:50.280
firmware downgrades, where you take some
old firmware that contains a well-known
00:30:50.280 --> 00:30:56.070
security vulnerability, flash it, and
use the bug to create an exploit to
00:30:56.070 --> 00:31:06.680
achieve some additional functionality or
introspection into the system. This part
00:31:06.680 --> 00:31:13.390
definitely works: downgrades are trivial
for the entire firmware and the modem as
00:31:13.390 --> 00:31:18.870
well as TrustZone. I did try to build
the Qualcomm firmware from the leaked
00:31:18.870 --> 00:31:23.420
source code. I assigned just a few days
to the task since it's not mission-
00:31:23.420 --> 00:31:29.700
critical, and I ran out of time;
it probably was a different version of the source
00:31:29.700 --> 00:31:37.820
code. But actually, this is not a
critical project because building leaked
00:31:37.820 --> 00:31:42.250
firmware is not directly relevant to
finding new bugs in the production
00:31:42.250 --> 00:31:53.140
firmware. So I just set it aside for some
later investigation. I have also
00:31:53.140 --> 00:31:58.380
investigated the ramdump ecosystem a
little bit on the software side at least.
00:31:58.380 --> 00:32:10.640
And it seems that it's also fused quite
reliably. This is when I remembered about
00:32:10.640 --> 00:32:16.890
the Qualcomm DIAG. During the initial
reconnaissance I stumbled on some
00:32:16.890 --> 00:32:23.720
whitepapers and slides that mentioned the
Qualcomm diagnostic protocol. And it
00:32:23.720 --> 00:32:27.960
seemed like quite a powerful protocol,
specifically with respect to reconfiguring
00:32:27.960 --> 00:32:33.910
the baseband. So I decided, first of
all, to test it in case it would
00:32:33.910 --> 00:32:37.810
actually provide some advanced
introspection functionality and then
00:32:37.810 --> 00:32:48.790
probably to use the protocol for
enabling log dumps. Qualcomm DIAG or QCDM
00:32:48.790 --> 00:32:53.290
is a proprietary protocol developed by
Qualcomm for the purpose of advanced
00:32:53.290 --> 00:32:59.910
baseband software configuration and
diagnostics. It is mostly aimed at OEM
00:32:59.910 --> 00:33:07.410
developers, not for users. The Qualcomm
DIAG protocol consists of around 200
00:33:07.410 --> 00:33:14.660
commands at least in theory. Some of them
are quite powerful on paper such as
00:33:14.660 --> 00:33:25.450
downloader mode and read/write memory.
Initially, DIAG was partially reverse
00:33:25.450 --> 00:33:33.580
engineered around 2010 and included in the
open source project named ModemManager.
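For orientation, here is a rough sketch of the frame encoding that open source tooling uses to talk DIAG. It is not taken from the talk: the HDLC-style layout (payload, 16-bit CRC in little-endian order, 0x7D escaping, 0x7E terminator) is written from memory of the open source QCDM code and should be treated as an assumption, and the function names are made up for the example.

```python
ESCAPE = 0x7D      # escape marker (assumed framing)
TERMINATOR = 0x7E  # end-of-frame marker (assumed framing)
ESCAPE_XOR = 0x20  # escaped bytes are XORed with this value

def crc16_x25(data: bytes) -> int:
    """Bitwise CRC-16/X-25 (reflected polynomial 0x8408)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def encode_frame(payload: bytes) -> bytes:
    """Append the CRC, escape reserved bytes, add the terminator."""
    raw = payload + crc16_x25(payload).to_bytes(2, "little")
    out = bytearray()
    for byte in raw:
        if byte in (ESCAPE, TERMINATOR):
            out += bytes([ESCAPE, byte ^ ESCAPE_XOR])
        else:
            out.append(byte)
    out.append(TERMINATOR)
    return bytes(out)

def decode_frame(frame: bytes) -> bytes:
    """Reverse encode_frame and verify the CRC."""
    if not frame or frame[-1] != TERMINATOR:
        raise ValueError("missing terminator")
    raw = bytearray()
    stream = iter(frame[:-1])
    for byte in stream:
        if byte == ESCAPE:
            escaped = next(stream, None)
            if escaped is None:
                raise ValueError("dangling escape byte")
            raw.append(escaped ^ ESCAPE_XOR)
        else:
            raw.append(byte)
    if len(raw) < 2:
        raise ValueError("frame too short")
    payload = bytes(raw[:-2])
    if crc16_x25(payload) != int.from_bytes(raw[-2:], "little"):
        raise ValueError("CRC mismatch")
    return payload

frame = encode_frame(bytes([0x7E, 0x00, 0x7D, 0x41]))
print(frame.hex(), decode_frame(frame).hex())
```

The payload itself would start with a one-byte command code followed by command-specific fields; those are deliberately left out here because their layout is not described in the talk.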
00:33:33.580 --> 00:33:39.680
And then it was also exposed in a
presentation at the Chaos Communication
00:33:39.680 --> 00:33:49.840
Congress 2011 by Guillaume Delugré. I
think this presentation popularized it and
00:33:49.840 --> 00:33:55.050
this is the one that introduced me to this
protocol. Unfortunately, that presentation
00:33:55.050 --> 00:34:01.771
is, for the most part, not really relevant
to modern production phones, but it does
00:34:01.771 --> 00:34:08.200
provide a high level overview and a
general expectation of what you will have
00:34:08.200 --> 00:34:15.149
to deal with. From the offensive
perspective, the DIAG protocol represents
00:34:15.149 --> 00:34:21.240
a local attack vector from the application
processor to the baseband. A common
00:34:21.240 --> 00:34:27.319
scenario of how it can be useful is
unlocking mobile phones which are locked
00:34:27.319 --> 00:34:33.269
to a particular mobile carrier. If we find
a memory corruption vulnerability in DIAG
00:34:33.269 --> 00:34:40.829
protocol, it may be possible to execute
code directly on the baseband and change
00:34:40.829 --> 00:34:45.089
some internal settings. This is usually
accomplished historically through the AT
00:34:45.089 --> 00:34:51.429
command handlers, but internal proprietary
protocols are also very convenient for
00:34:51.429 --> 00:34:59.740
that. The second scenario where DIAG can be
useful offensively is using it for
00:34:59.740 --> 00:35:08.750
injecting a software based debugger. If
you can find a bug in DIAG that enables
00:35:08.750 --> 00:35:14.440
read/write capability on the baseband, you
can inject some debugging hooks and
00:35:14.440 --> 00:35:22.509
eventually connect it to a GDB stub. So it
makes it possible to create a software based
00:35:22.509 --> 00:35:32.450
debugger even when JTAG is not available.
What has changed in DIAG in 10 years, based
00:35:32.450 --> 00:35:37.750
on some cursory investigation that I did?
First of all, the original publication
00:35:37.750 --> 00:35:46.390
mentioned Qualcomm basebands based on ARM
with the REX operating system. All modern
00:35:46.390 --> 00:35:50.770
Qualcomm basebands are based on
Hexagon as opposed to ARM. And the REX
00:35:50.770 --> 00:35:57.470
operating system was replaced with QuRT,
which I think still has some bits of
00:35:57.470 --> 00:36:05.359
REX, but in general it's a different
operating system. Majority of super
00:36:05.359 --> 00:36:09.921
powerful commands of DIAG such as
downloader mode and memory read/write were
00:36:09.921 --> 00:36:17.369
removed, at least on my device. And also
it does not expose any immediately
00:36:17.369 --> 00:36:25.579
available interfaces such as USB channel.
I hear that it's possible to enable the
00:36:25.579 --> 00:36:37.040
USB DIAG channel by adding some special
boot properties, but usually it's not, it
00:36:37.040 --> 00:36:42.650
wouldn't be available. It shouldn't be
expected to be available on all devices.
00:36:42.650 --> 00:36:48.599
So these observations are based on my test
device, Nexus 6P. And this should be
00:36:48.599 --> 00:36:57.150
around medium level of hardening. More
modern devices such as Google Pixels, the
00:36:57.150 --> 00:37:02.799
modern ones should be expected to be even
more hardened than that. Especially on the
00:37:02.799 --> 00:37:07.720
Google side, because they take hardening
very seriously. In contrast, on the
00:37:07.720 --> 00:37:14.631
other side of the spectrum if you think
about some no name modem sticks, these
00:37:14.631 --> 00:37:24.329
things can be more open and more easy to
investigate. The DIAG implementation
00:37:24.329 --> 00:37:29.119
architecture is relatively simple. This
diagram is based roughly on the same
00:37:29.119 --> 00:37:34.319
diagram that I presented in the beginning
of the talk. On the left side there is the
00:37:34.319 --> 00:37:42.099
Android kernel and on the right side there
is the baseband operating system. DIAG
00:37:42.099 --> 00:37:47.160
protocol actually works in both directions.
It's not only commands that can be sent by
00:37:47.160 --> 00:37:51.000
the application processor to the baseband,
but it's also the messages that can be
00:37:51.000 --> 00:37:55.730
sent by the baseband to the application
processor. So DIAG commands are not really
00:37:55.730 --> 00:38:02.150
commands - they're more like tokens that
also can be used to encode messages. The
00:38:02.150 --> 00:38:10.269
green arrows on this slide represent an
example of call flow, of the data flow
00:38:10.269 --> 00:38:14.609
originating from the baseband and going to
the application processor. So obviously,
00:38:14.609 --> 00:38:25.820
in case of commands there would be a
reverse call flow or data flow. The main
00:38:25.820 --> 00:38:29.810
entity inside the operating system,
baseband operating system responsible for
00:38:29.810 --> 00:38:37.230
DIAG is the DIAG task. It has a separate
task which handles specifically various
00:38:37.230 --> 00:38:47.210
operations related to the DIAG protocol.
The exchange of data between the DIAG task
00:38:47.210 --> 00:38:55.390
and other tasks is done through the ring
buffer. So, for example, if some task
00:38:55.390 --> 00:39:05.730
needs to log something through the DIAG,
it will use specialized logging APIs that
00:39:05.730 --> 00:39:10.930
will in turn put logging data into the
ring buffer. The ring buffer will be
00:39:10.930 --> 00:39:20.330
drained either on timer or on a software
based interrupt from the caller. And at
00:39:20.330 --> 00:39:28.480
this point the data will be wrapped into
DIAG protocol and from there it will go to
00:39:28.480 --> 00:39:37.119
the SIO task, the serial I/O task, which is
responsible for sending the output to a
00:39:37.119 --> 00:39:49.529
specific interface. This is based on the
modem, on the baseband configuration. The
00:39:49.529 --> 00:39:56.549
main interface that I was dealing with is
the shared memory, which ends up in the
00:39:56.549 --> 00:40:06.130
diagchar driver inside the Android
kernel. So in case of sending the commands
00:40:06.130 --> 00:40:11.809
from the Android kernel to the baseband,
it will be the reverse flow. First, you
00:40:11.809 --> 00:40:17.420
will need to craft the DIAG
protocol data, send it through the diagchar
00:40:17.420 --> 00:40:21.920
driver that will write to the
shared memory interface. From there, it
00:40:21.920 --> 00:40:28.109
will go to the specialized task in the
baseband and eventually end up in the DIAG
00:40:28.109 --> 00:40:42.400
task and potentially other responsible
tasks. On the Android side, DIAG is
00:40:42.400 --> 00:40:47.970
represented with the /dev/diag device,
which is implemented with the diagchar,
00:40:47.970 --> 00:40:54.980
and diagfwd kernel drivers in the MSM
kernel. The purpose of the diagchar
00:40:54.980 --> 00:41:02.910
driver is to support the DIAG interface.
It is quite complex in code, but
00:41:02.910 --> 00:41:09.569
functionally it's quite simple. It
contains some basic minimum of DIAG
00:41:09.569 --> 00:41:15.310
commands that enable configuration of the
interface on the baseband side. And then
00:41:15.310 --> 00:41:20.609
it would be able to multiplex the DIAG
channel to either USB or a memory device.
00:41:20.609 --> 00:41:29.680
It also contains some IOCTLs for
configuration that can be accessed from
00:41:29.680 --> 00:41:36.029
the Android user land. And finally, the
driver filters various DIAG commands that
00:41:36.029 --> 00:41:43.890
it considers unnecessary. This is a bit
important because when you start,
00:41:43.890 --> 00:41:47.970
when you try to do some tests and send
some arbitrary DIAG commands with the DIAG
00:41:47.970 --> 00:41:54.980
interface, you would be required to
rebuild the actual driver to remove this
00:41:54.980 --> 00:42:03.249
masking, otherwise your commands will not
make it to the baseband side. At the core,
00:42:03.249 --> 00:42:09.299
the diagchar driver is based on the SMD
shared memory device interface, which is a
00:42:09.299 --> 00:42:21.470
core interface specific to Qualcomm modem.
So this is where DIAG is, diagchar
00:42:21.470 --> 00:42:29.059
is on the diagram. The diagchar
driver itself is located in the
00:42:29.059 --> 00:42:39.039
application OS's vendor specific drivers.
And then there is some shared memory
00:42:39.039 --> 00:42:43.759
implementation in the baseband that
handles this and the DIAG implementation
00:42:43.759 --> 00:42:56.589
itself. The diagchar driver is quite complex
in code, but the functionality is quite
00:42:56.589 --> 00:43:06.869
simple. It does implement a handful of
IOCTLs that enable some configuration. I
00:43:06.869 --> 00:43:14.529
didn't check what exactly these IOCTLs are
responsible for. It exposes the /dev/diag
00:43:14.529 --> 00:43:19.430
device which is available for reading and
writing. However, by default, you are not
00:43:19.430 --> 00:43:25.380
able to access the DIAG channel
through this device, because in order to
00:43:25.380 --> 00:43:33.220
access it, there is diag_switch_logging
function, which switches the channel that
00:43:33.220 --> 00:43:41.230
is used for DIAG communications. On the
screen there are several modes listed, but
00:43:41.230 --> 00:43:45.009
in practice only two of them are
supported. The USB mode and the memory
00:43:45.009 --> 00:43:53.000
device mode. USB mode is the default,
which is why if you just open the
00:43:53.000 --> 00:43:58.269
/dev/diag device and try
to read something from it, it won't work,
00:43:58.269 --> 00:44:07.559
as it is tied to USB. And in order to
reconfigure it to use the memory device,
00:44:07.559 --> 00:44:17.280
you need to send a special IOCTL code.
Notice the procedure named
00:44:17.280 --> 00:44:24.950
mask_request_validate, which employs
quite strict filtering on the DIAG commands
00:44:24.950 --> 00:44:31.619
that you try to send through this
interface. So it filters out basically
00:44:31.619 --> 00:44:40.072
everything with the exception of some
basic configuration requests. At the core,
00:44:40.072 --> 00:44:46.990
the diagchar driver uses the shared memory
device to communicate with the baseband.
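Editor's sketch of the mode switch described above, in Python. This is a minimal illustration, not the speaker's tool. The request number and mode values are assumptions taken from how the msm kernel's diagchar.h is commonly understood for devices of that generation; newer kernels may expect a structure instead of a plain integer, so check the kernel tree for your device. The helper names switch_logging and open_diag_memory_device are hypothetical.

```python
import fcntl
import os

# Assumed values (msm kernel diagchar.h, older kernels); verify per device.
DIAG_IOCTL_SWITCH_LOGGING = 7
USB_MODE = 1
MEMORY_DEVICE_MODE = 2


def switch_logging(fd, mode, ioctl=fcntl.ioctl):
    """Ask the diagchar driver to route DIAG traffic to `mode`.

    `ioctl` is injectable so the call can be exercised without a device.
    """
    return ioctl(fd, DIAG_IOCTL_SWITCH_LOGGING, mode)


def open_diag_memory_device(path="/dev/diag"):
    """Open the DIAG device and switch it from USB to memory device mode."""
    fd = os.open(path, os.O_RDWR)
    try:
        switch_logging(fd, MEMORY_DEVICE_MODE)
    except OSError:
        # Do not leak the descriptor if the driver rejects the request.
        os.close(fd)
        raise
    return fd
```

Opening the device normally requires root. Even after the switch, the driver-side command filtering discussed in the talk still applies to whatever is written to the descriptor.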
00:44:46.990 --> 00:44:55.079
The SMD implementation is quite complex.
It exposes the SMD read API, which is used by
00:44:55.079 --> 00:45:02.679
diagchar for reading the data from the
shared memory, one of the APIs. Shared
00:45:02.679 --> 00:45:14.309
memory also operates on the abstraction of
channels which are accessed through the
00:45:14.309 --> 00:45:19.619
API named smd_named_open_on_edge. So you
can notice here that there are some DIAG
00:45:19.619 --> 00:45:25.120
specific channels that can be opened.
Now, let's take a look at the SMD
00:45:25.120 --> 00:45:29.730
implementation. This is a bit important
because a shared memory device represents
00:45:29.730 --> 00:45:33.420
a part of the attack surface for
escalation from the modem to the
00:45:33.420 --> 00:45:37.880
application processor. This is a very
important attack surface because if you
00:45:37.880 --> 00:45:42.509
just achieve code execution on the
baseband, it's mostly useless because it
00:45:42.509 --> 00:45:49.480
cannot access the main operating system.
And in order to make it useful, you'll
00:45:49.480 --> 00:45:59.119
need to create an exploit chain and add
one more exploit based on that bug with
00:45:59.119 --> 00:46:04.210
privilege escalation from the modem to the
application processor. So shared memory
00:46:04.210 --> 00:46:10.559
device is one of the attack surfaces for
this. The shared memory device is
00:46:10.559 --> 00:46:22.160
implemented as a memory region
exposed by the Qualcomm peripheral. The
00:46:22.160 --> 00:46:28.619
specialized MSM driver will map it and
here its name is smem_ram_phys, the
00:46:28.619 --> 00:46:40.099
base of the shared memory region. The
shared memory region operates on the
00:46:40.099 --> 00:46:50.519
concept of entries and channels, so it's
partitioned into distinct parts that can be
00:46:50.519 --> 00:47:00.470
accessed through the procedure,
smem_get_entry and one of these entries is
00:47:00.470 --> 00:47:08.070
SMEM_CHANNEL_ALLOC_TBL, which contains the
list of available channels that can be
00:47:08.070 --> 00:47:13.740
opened. From there, we can actually open
the channels and use the shared memory
00:47:13.740 --> 00:47:25.700
interface. During this initial research
project, it wasn't my goal to research the
00:47:25.700 --> 00:47:32.460
entire Qualcomm ecosystem, so while I was
preparing for this talk, I have noticed
00:47:32.460 --> 00:47:37.569
some more interesting things in the source
code, such as, for example, the
00:47:37.569 --> 00:47:45.859
specialized driver that handles JTAG
memory region, which is presumably exposed
00:47:45.859 --> 00:47:53.140
by some Qualcomm systems on chip. In the
drivers this is mostly used read only, and
00:47:53.140 --> 00:47:58.609
I suppose that will not really work for
writing, but it's worth checking probably.
00:47:58.609 --> 00:48:07.849
And now, finally, let's take a look at the
DIAG protocol itself. One of the first
00:48:07.849 --> 00:48:13.119
things that I noticed when researching the
DIAG protocol is that it's actually used
00:48:13.119 --> 00:48:21.460
in a few places, not only in libqcdm. A
popular tool named SnoopSnitch can enable
00:48:21.460 --> 00:48:27.460
protocol dumps, that is, cellular protocol
dumps on rooted devices. And in order to
00:48:27.460 --> 00:48:33.349
accomplish this, SnoopSnitch sends an
opaque blob of the commands to the mobile
00:48:33.349 --> 00:48:40.349
device through the DIAG interface. This
blob is not documented. So it got me
00:48:40.349 --> 00:48:46.740
curious what exactly these commands are
doing. But before we can look at the dump,
00:48:46.740 --> 00:48:53.780
let's understand the protocol. The DIAG
protocol consists of around 200 commands
00:48:53.780 --> 00:49:02.365
or tokens. Some of them are documented in
the open source, but not all of them. So
00:49:02.365 --> 00:49:07.630
you can notice on the screenshots, some of
the commands are missing. And one of the
00:49:07.630 --> 00:49:21.680
missing commands is actually the token 0x92
hexadecimal, which represents an encoded hash log
00:49:21.680 --> 00:49:34.069
message. The command format is quite
simple. The base primitive here is the
00:49:34.069 --> 00:49:42.819
DIAG token number 0x7E, it's not really a
delimiter, it's a separate DIAG command
00:49:42.819 --> 00:49:49.519
126. It's missing in the open source, as
you can see here. So the DIAG command is
00:49:49.519 --> 00:49:57.870
nested. The outer layer consists of this
wrapper of 0x7e hexadecimal bytes. Then
00:49:57.870 --> 00:50:02.329
there is the main command and then there
is some variable length data that can
00:50:02.329 --> 00:50:10.839
contain even more subcommands. This entire
thing is verified using a CRC and some
00:50:10.839 --> 00:50:16.860
bytes are escaped, specifically as you
can see in the snippet. One interesting
00:50:16.860 --> 00:50:24.539
thing about the DIAG protocol is that it
supports subsystem extensions. Basically,
00:50:24.539 --> 00:50:29.820
different subsystems in the baseband can
register their own DIAG system handlers,
00:50:29.820 --> 00:50:38.119
arbitrary ones. And there is a special DIAG
command number 75, which simply
00:50:38.119 --> 00:50:43.419
instructs the DIAG system to forward this
command to the respective subsystem. And
00:50:43.419 --> 00:50:56.849
then it will be parsed there. There exists
quite a large number of subsystems. Not
00:50:56.849 --> 00:51:01.480
all of them are documented, and when I
started investigating this, I noticed that
00:51:01.480 --> 00:51:08.360
there actually exists a DIAG subsystem
and a debugging subsystem. The
00:51:08.360 --> 00:51:15.089
latter one immediately interested me
because I was hoping that it would enable
00:51:15.089 --> 00:51:19.700
some more advanced introspection through
this debugging subsystem. But it turned
00:51:19.700 --> 00:51:25.910
out that the debugging subsystem is quite
simple. It only supported one command:
00:51:25.910 --> 00:51:35.470
inject crash. So you can send a special
DIAG command that will inject the crash
00:51:35.470 --> 00:51:43.970
into the baseband. I will talk later about
this. Now, let's take a look at specific
00:51:43.970 --> 00:51:52.410
examples of the DIAG protocol. This is the
annotated snippet of the blob of commands
00:51:52.410 --> 00:52:00.720
from SnoopSnitch. This blob actually
consists of three large logical parts. The
00:52:00.720 --> 00:52:04.470
first part is largely irrelevant. It's a
bunch of commands that request various
00:52:04.470 --> 00:52:10.249
information from the baseband, such as
timestamp, version info, build id and so
00:52:10.249 --> 00:52:16.839
on. The second batch of commands starts
with a command Number 0x73 hexadecimal.
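Editor's sketch, in Python, of how each command in such a blob is framed on the wire, following the description earlier in the talk: the payload is followed by a CRC, some bytes are escaped, and a 0x7E byte closes the frame. The concrete details are assumptions based on how the open source implementations are commonly understood, not something stated in the talk: a CRC-16 of the X.25 kind appended in little-endian order, and 0x7D used as the escape byte with the escaped value XORed with 0x20. Some senders also emit a leading 0x7E; this sketch emits only the trailing one.

```python
DIAG_TERMINATOR = 0x7E   # closes every frame
DIAG_ESCAPE = 0x7D       # escape marker
DIAG_ESCAPE_XOR = 0x20   # an escaped byte is stored as byte ^ 0x20


def crc16_x25(data: bytes) -> int:
    """CRC-16/X-25: reflected CCITT polynomial, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def encode_frame(payload: bytes) -> bytes:
    """Build escaped(payload + CRC16 little-endian) followed by 0x7E."""
    body = payload + crc16_x25(payload).to_bytes(2, "little")
    out = bytearray()
    for byte in body:
        if byte in (DIAG_TERMINATOR, DIAG_ESCAPE):
            out += bytes([DIAG_ESCAPE, byte ^ DIAG_ESCAPE_XOR])
        else:
            out.append(byte)
    out.append(DIAG_TERMINATOR)
    return bytes(out)


def decode_frame(frame: bytes) -> bytes:
    """Inverse of encode_frame; raises ValueError on a malformed frame."""
    if not frame or frame[-1] != DIAG_TERMINATOR:
        raise ValueError("missing 0x7E terminator")
    body = bytearray()
    stream = iter(frame[:-1])
    for byte in stream:
        if byte == DIAG_ESCAPE:
            escaped = next(stream, None)
            if escaped is None:
                raise ValueError("dangling escape byte")
            body.append(escaped ^ DIAG_ESCAPE_XOR)
        else:
            body.append(byte)
    if len(body) < 3:
        raise ValueError("frame too short for a command and a CRC")
    payload = bytes(body[:-2])
    if crc16_x25(payload) != int.from_bytes(body[-2:], "little"):
        raise ValueError("CRC mismatch")
    return payload
```

Under these assumptions a payload whose first byte is the command code 0x7D has that byte escaped on the wire, which is worth keeping in mind when reading raw hexdumps of such blobs.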
00:52:16.839 --> 00:52:26.529
This is DIAG_CMD_LOG_CONFIG. This is the
command which enables protocol dumps and
00:52:26.529 --> 00:52:34.390
configures them. And the third part of this
blob starts with the command number 0x7D
00:52:34.390 --> 00:52:38.459
hexadecimal. This is the
CMD_EXT_MESSAGE_CONFIG. This is actually
00:52:38.459 --> 00:52:43.410
the command that is supposed to enable
textual message logging, except that in
00:52:43.410 --> 00:52:51.680
case of SnoopSnitch it disables all of the
logging altogether. So how do
00:52:51.680 --> 00:52:57.390
cellular protocol dumps actually work? In order to
enable the cellular protocol dumps, we need
00:52:57.390 --> 00:53:04.210
DIAG_CMD_LOG_CONFIG, number 0x73
hexadecimal. It is partially documented in
00:53:04.210 --> 00:53:12.640
the libqcdm. The structure of the packet
would contain the code and the subcommand,
00:53:12.640 --> 00:53:18.079
that would be set mask in this case. It
also needs an equipment ID, which
00:53:18.079 --> 00:53:25.230
corresponds to the specific protocol that
we want to dump. And finally, the masks
00:53:25.230 --> 00:53:33.369
that are applied to filter some
parts of the dump. This is relatively
00:53:33.369 --> 00:53:41.020
straightforward. And now the second command, DIAG_CMD_EXT_MESSAGE_CONFIG. This
00:53:41.020 --> 00:53:48.359
is the one which is supposed to enable
textual message logs. The command format
00:53:48.359 --> 00:54:00.130
is undocumented. So let's take a closer
look at it. The command consists of a
00:54:00.130 --> 00:54:06.720
subcommand. In this case, it's subcommand
number 4, the set mask. And then there are
00:54:06.720 --> 00:54:15.819
two 16 bit integers. SSID start and end.
SSID is subsystem ID, which is not the
00:54:15.819 --> 00:54:26.099
same as DIAG subsystems. And the last one
is the mask, so subsystem IDs are used to
00:54:26.099 --> 00:54:31.859
filter the messages based on a specific
subsystem, because there is a huge amount
00:54:31.859 --> 00:54:35.970
of subsystems in the baseband. And if all
of them start logging, this is a huge
00:54:35.970 --> 00:54:41.720
amount of data. So DIAG provides this
capability to filter a little bit, to a
00:54:41.720 --> 00:54:49.569
specific subsystem that you're interested
in. The snippet of Python code here is an
00:54:49.569 --> 00:54:58.440
example of how to enable textual message logging
for all subsystems. You need to set the
00:54:58.440 --> 00:55:12.680
mask to all 1s. And this is quite a lot of
logging in my experience. Now for parsing
00:55:12.680 --> 00:55:18.039
the incoming log messages, there are two
types of DIAG tokens, both of them are
00:55:18.039 --> 00:55:26.399
undocumented. The first one is a legacy
message number 0x79 hexadecimal. This is a
00:55:26.399 --> 00:55:32.420
simple ASCII based message that arrives
through the DIAG interface so you can
00:55:32.420 --> 00:55:38.509
parse it quite straightforwardly. The
second one is I called it
00:55:38.509 --> 00:55:43.640
DIAG_CMD_LOG_HASH, it's number 0x92
hexadecimal. This is the token which
00:55:43.640 --> 00:55:50.650
encodes the log messages that contain only
the hashes. This is the one that if you
00:55:50.650 --> 00:55:57.579
have the msg_hash.txt file, you can
match the hash that arrived with
00:55:57.579 --> 00:56:02.170
this command against the messages provided in
the text file. And you can get the textual
00:56:02.170 --> 00:56:08.900
logs. On the lower part of the slide there
are two examples of hexdumps from both
00:56:08.900 --> 00:56:16.019
commands. Both of them have a similar
structure. First, there are 4 bytes
00:56:16.019 --> 00:56:23.569
that are essential. The first one is the
command itself. And the third byte, which is
00:56:23.569 --> 00:56:30.950
quite interesting, is the number of
arguments included. Next there is a 64 bit
00:56:30.950 --> 00:56:40.470
value of timestamp. Next there is the SSID
value, 16 bit. Some line number, and I'm
00:56:40.470 --> 00:56:48.509
not sure what is the next argument. And
finally, after that, there is either ASCII
00:56:48.509 --> 00:56:59.380
encoded log string in plain text or hash
of the log string. And optionally there
00:56:59.380 --> 00:57:06.060
may be some arguments included, though in
the case of the first, legacy command the
00:57:06.060 --> 00:57:10.400
arguments are included before the log
message, and in case of the second command
00:57:10.400 --> 00:57:16.670
they are included after the MD5 hash of
the log message, at least in my version of
00:57:16.670 --> 00:57:29.109
this implementation. And this is the DIAG
packet that enables you to inject a crash
00:57:29.109 --> 00:57:36.970
into the baseband, at least in theory.
Because in my case it did not work. And by
00:57:36.970 --> 00:57:41.410
not working, I mean that it simply did not
enter the baseband. Normally, I would
00:57:41.410 --> 00:57:46.470
expect that on production device it should
just reset the baseband. You will not get
00:57:46.470 --> 00:57:53.029
a crash dump or anything like that, just a
reset. So I suppose that it still should
00:57:53.029 --> 00:57:58.150
be working on some other devices. So it's
worth checking. There are a few types of
00:57:58.150 --> 00:58:09.789
crashes that you can request in this way.
In order to accomplish this, I needed a
00:58:09.789 --> 00:58:17.119
very simple tool with basically two
functions. First, direct easy access to
00:58:17.119 --> 00:58:22.839
the DIAG interface, ideally through some
sort of Python shell. And second is the
00:58:22.839 --> 00:58:29.779
ability to read and parse data with
advanced log strings. For that purpose, I
00:58:29.779 --> 00:58:37.999
wrote a simple framework that I named
diagtalk, which is based directly on the
00:58:37.999 --> 00:58:49.349
diag interface in the Android kernel
with a Python harness. So on the left
00:58:49.349 --> 00:58:56.970
side, here is the example of some advanced
parsing with some leaked values. And on
00:58:56.970 --> 00:59:02.014
the right side, here is the example of the
advanced message log, which includes the
00:59:02.014 --> 00:59:10.589
log strings that were
stripped out from the firmware. The log is
00:59:10.589 --> 00:59:16.791
quite fun, as I expected it to be, it has
a lot of detailed data, such as, for
00:59:16.791 --> 00:59:22.800
example, GPS coordinates and various
attempts of the baseband to connect to
00:59:22.800 --> 00:59:34.539
different channels. And I think it's quite
useful for offensive research purposes,
00:59:34.539 --> 00:59:42.960
it even sometimes contains raw pointers
as you can notice on the screenshot. So in
00:59:42.960 --> 00:59:50.069
this project, my conclusion was that
indeed I was reassured that it was the
00:59:50.069 --> 00:59:56.660
right choice and Hexagon seems to be
quite a challenging target, and it would
00:59:56.660 --> 01:00:00.940
probably need several more months of work
to even begin to do some serious offensive
01:00:00.940 --> 01:00:08.500
work. I also started to think about
writing a software debugger because it
01:00:08.500 --> 01:00:15.640
seems to be probably the most
reliable way to achieve debugging
01:00:15.640 --> 01:00:22.140
introspection. And also, I noticed some
blank spaces in the field that may require
01:00:22.140 --> 01:00:27.839
future work. For Qualcomm Hexagon
specifically, there are a lot of things
01:00:27.839 --> 01:00:35.539
that can be done. For example, you can
take a look at other Qualcomm proprietary
01:00:35.539 --> 01:00:40.609
diagnostic protocols of which there are a
few, such as QMI for example, I think they
01:00:40.609 --> 01:00:49.400
are lesser known than DIAG protocol. And
then there is a requirement to create a
01:00:49.400 --> 01:00:58.569
full system emulation based on QEMU at
least for some chips. And a big problem
01:00:58.569 --> 01:01:04.140
is the lack of a decompiler, which is a major
obstacle to any serious static analysis of
01:01:04.140 --> 01:01:14.979
the code. And for offensive research,
there are 3 large directions. The first one is
01:01:14.979 --> 01:01:18.920
enabling debugging. There are different
ways for that. For example, software based
01:01:18.920 --> 01:01:25.940
debugging or bypassing JTAG fusing, on the
other hand. Next, there are explorations
01:01:25.940 --> 01:01:33.000
of the over the air attack vectors. And
the 3rd one is escalation from the baseband
01:01:33.000 --> 01:01:39.369
to the application processor. These are
the 3 large offensive research vectors.
01:01:39.369 --> 01:01:44.670
And for the basebands in general, there
also exists some interesting directions of
01:01:44.670 --> 01:01:54.140
future work. First of all, OsmocomBB.
It definitely deserves a bit of an
01:01:54.140 --> 01:01:59.989
update. It is the only open source
implementation of a baseband. And it is so
01:01:59.989 --> 01:02:09.040
outdated. And it is based on
some really obscure hardware. Another
01:02:09.040 --> 01:02:17.677
problem here is that there doesn't exist
any software based CDMA implementation.
01:02:17.677 --> 01:02:28.660
No sound
01:02:28.660 --> 01:02:34.067
Herald: Alisa, thank you very much for
this nice talk. Um, there are some
01:02:34.067 --> 01:02:39.030
questions from the audience. So basically
the first one is a little bit of an
01:02:39.030 --> 01:02:46.358
icebreaker: Do you use a mobile phone?
And do you trust it?
01:02:46.358 --> 01:02:51.769
Alisa: No, I don't. I try to use a mobile
phone only for Twitter. Does anyone still
01:02:51.769 --> 01:03:00.065
use mobile phones nowadays?
H: laughs Well, no idea. Another
01:03:00.065 --> 01:03:07.979
question concerns the other Qualcomm
chips. Did you have a look at the Qualcomm
01:03:07.979 --> 01:03:15.960
Wi-Fi chipsets?
A: As I mentioned during the talk, I had
01:03:15.960 --> 01:03:20.509
only one month. It was like a short
reconnaissance project, so I didn't really
01:03:20.509 --> 01:03:27.020
have time to investigate everything. I did
notice that Qualcomm SoCs have a Wi-Fi
01:03:27.020 --> 01:03:32.369
chip, which is also based on Hexagon. And
more than that, it also shares some of the
01:03:32.369 --> 01:03:38.540
same low level technical primitives. So
it's definitely worth looking at, but I didn't
01:03:38.540 --> 01:03:45.019
investigate it in detail.
H: OK, OK, thanks. There is also a pretty
01:03:45.019 --> 01:03:50.820
technical question here, so instead of
having to go through the rigorous command
01:03:50.820 --> 01:03:57.600
checking for the diagchar driver,
wouldn't it be possible to mmap /dev/mem
01:03:57.600 --> 01:04:04.604
into a userspace process and send over
commands directly? Depends a little bit
01:04:04.604 --> 01:04:11.799
on what the goal is.
A: OK, so it really depends on your
01:04:11.799 --> 01:04:16.869
previous background and your goals. The
point here is that by default, the
01:04:16.869 --> 01:04:23.420
diagchar ecosystem does not allow sending
arbitrary DIAG commands. So either way,
01:04:23.420 --> 01:04:28.749
you will have to hack something. One way
to hack this is to rebuild the actual
01:04:28.749 --> 01:04:33.529
driver. So you would be able to send the
commands directly through that DIAG
01:04:33.529 --> 01:04:37.859
interface. Another way would be to access
the shared memory directly, for example.
01:04:37.859 --> 01:04:42.079
But I think it would be more complex
because the Qualcomm shared memory
01:04:42.079 --> 01:04:47.440
implementation is quite complex. So I
think that the easiest way would be
01:04:47.440 --> 01:04:52.789
actually to hack the diagchar driver
and use the /dev/diag interface for this.
01:04:52.789 --> 01:05:00.270
H: OK, thanks. Thanks. There is one
question which I'm going to read out,
01:05:00.270 --> 01:05:14.870
maybe you can make sense of it: is this
typical [unclear] security for mobile phones?
01:05:14.870 --> 01:05:19.289
A: This level of hardening that I
presented, I think is around medium level.
01:05:19.289 --> 01:05:24.270
So usually production phones are even more
hardened. If you take a look at things
01:05:24.270 --> 01:05:31.249
like Google Pixel 5 or the latest iPhones,
they will be even better hardened than
01:05:31.249 --> 01:05:38.640
the one that I discussed.
H: Oh, OK. Yeah, thanks. Thanks then. So it
01:05:38.640 --> 01:05:42.900
doesn't look like we have any more
questions left. Anyway, so if you want to
01:05:42.900 --> 01:05:49.122
get in contact with Alisa, no problem.
There is the feedback tab below your
01:05:49.122 --> 01:05:56.888
video now at the moment, just drop your
questions over there. And that's a way to
01:05:56.888 --> 01:06:02.736
get in touch with Alisa. Other than that I
would say we're done for today for this
01:06:02.736 --> 01:06:07.410
session. Thank you very, very much Alisa
for this really nice presentation once
01:06:07.410 --> 01:06:14.160
again. Applause And I'll transfer now over
to the Herald News Show.
01:06:14.160 --> 01:06:33.639
postroll music
01:06:33.639 --> 01:06:54.000
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!