rC3 preroll music
Herald: Our next speaker, Alisa Esage, is an independent vulnerability researcher with a notable record of security research achievements, such as the Zero Day Initiative Silver Bounty Hunter Award 2018. Alisa is going to present her latest research on the Qualcomm DIAG protocol, which is found abundantly in Qualcomm Hexagon-based cellular modems. Alisa, we're looking forward to your talk now.
Alisa Esage: This is Alisa Esage, and you're attending my presentation about Advanced Hexagon DIAG at the Chaos Communication Congress 2020 remote experience. My main interest as an advanced vulnerability researcher is complex and hardened systems. For the last 10 years I have been researching various classes of software, such as the Windows kernel, browsers and JavaScript engines, and for the last three years I have been focusing mostly on hypervisors. The project that I'm presenting today was a little side project that I did as a distraction a couple of years ago. The name of this talk, Advanced Hexagon DIAG, is a bit of an understatement, an attempt to keep the talk a little bit low-key on the general internet, because a big part of it will actually be devoted to general vulnerability research in basebands. But the primary focus of this talk is on Hexagon DIAG, also known as QCDM, the Qualcomm diagnostic manager. This is a proprietary protocol developed by Qualcomm for use in their basebands, and it is included on all Snapdragon SoCs and modem chips produced by Qualcomm. Modern Qualcomm chips run on custom silicon with a custom instruction set architecture named QDSP6 Hexagon. This is important because all the DIAG handlers that we will be dealing with are written in this instruction set architecture.
As usual with my talks, I have adjusted the materials of this presentation for the full spectrum of audiences. Specifically, the first part of the presentation is mostly aimed at research directors and high-level technical staff, and the last part is more deeply technical and will mostly be of interest to specialized vulnerability researchers and low-level programmers who are somehow related to this particular area.
Let's start from the top-level overview of cellular technology. This mind map presents a simplified view of the various types of entities that we have to deal with with respect to basebands. It's not a complete diagram, of course; it only presents the classes of entities that exist in this space. Also, this mind map is specific to the client-side equipment, the user equipment, and it completely omits any server-side considerations, which are a world of their own. There exists quite a large number of cellular protocols on the planet. From the user's perspective, this is simple: it is usually the short name, 3G or 4G, that you see on the mobile screen. But in reality, this simple generation name may encode several distinct technologies. There are a few key points about cellular protocols that are crucial to understand before starting to approach this area. The first one is the concept of a generation. This is simply 1G, 2G and so on: the generic name of the family of protocols that are supported in a particular generation. A generation is simply a marketing name for users. It doesn't really have any strict technical meaning, and generations represent the evolution of cellular protocols in time. The second most important thing about cellular protocols is the air interface.
This is the lowest-level protocol, which defines how exactly the cellular signal is digitized and read from the electromagnetic wave, and how exactly the different players in this field divide the space. Historically, there existed two main implementations of this low-level code, called TDMA and CDMA. TDMA means time division multiple access. It basically divides the entire electromagnetic spectrum within the radio band into time slots that are rotated in a round-robin manner between the various mobile phones, so that they speak in turns. TDMA was the base for the GSM technology, and GSM was the main protocol used on this planet for a long time. The other low-level implementation is CDMA, which was a little bit more complex from the beginning. It stands for code division multiple access. Instead of dividing the spectrum into time slots and dividing the protocol into bursts, CDMA uses random codes that are assigned to mobile phones, so that the code can be used as an additional randomizing mask on top of the modulation protocol, and multiple user equipments can talk on the same frequency without interrupting each other. Note here that CDMA was developed by Qualcomm and was mostly used in the United States.
So at the level of 2G there were two main protocols: GSM, based on TDMA, and cdmaOne, based on CDMA. In the third generation of mobile protocols these two branches of development were continued: GSM evolved into UMTS, while cdmaOne evolved into CDMA2000. The important point here is that UMTS had at this point already adopted the low-level air interface protocol from CDMA, and eventually, at the fourth generation of protocols, these two branches of development came together to create the LTE technology, and the same goes for 5G. This is important for us from the offensive perspective because, first of all, all of these technologies, including the air interfaces, represent separate bits of code with separate parsing algorithms within the baseband firmware.
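To make the code division idea described above a bit more concrete, here is a minimal toy sketch in Python. It is not how any real CDMA air interface is implemented: the two 4-chip orthogonal codes, the names `CODES`, `spread` and `despread`, and the noiseless additive channel are all assumptions chosen for illustration.

```python
# Toy illustration of code division: two phones transmit at the same time
# on the same frequency, and each receiver separates its own bits by code.
# Assumption: short orthogonal codes stand in for the real spreading codes.
CODES = {
    "phone_a": [1, 1, 1, 1],
    "phone_b": [1, -1, 1, -1],
}


def spread(bits, code):
    """Map each bit to a +1/-1 symbol and multiply it by every chip of the code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * chip for chip in code)
    return chips


def despread(signal, code):
    """Correlate each code-length window with the code; the sign gives the bit."""
    n = len(code)
    bits = []
    for start in range(0, len(signal), n):
        window = signal[start:start + n]
        correlation = sum(s * c for s, c in zip(window, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


bits_a = [1, 0, 1]
bits_b = [0, 0, 1]

# The shared channel simply adds the overlapping transmissions together.
channel = [
    a + b
    for a, b in zip(spread(bits_a, CODES["phone_a"]), spread(bits_b, CODES["phone_b"]))
]

recovered_a = despread(channel, CODES["phone_a"])
recovered_b = despread(channel, CODES["phone_b"])
```

Because the two codes are orthogonal, correlating the summed signal with one code cancels the other phone's contribution, which is why both bit streams come back intact even though they were sent simultaneously.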
And all of them are usually present in each baseband, regardless of which ones you actually use or which ones your mobile provider actually supports. Another important and not obvious thing from the offensive security perspective is that, because of this evolutionary development, the protocols are not actually completely distinct. If you think about LTE, it is not a completely different protocol from GSM; instead it is based largely on the same internal structures. In fact, if you look at the specifications, some of the GSM 2G specifications are still directly relevant, to some extent, to LTE. This is also important when you start analyzing protocols from the offensive perspective.
The cellular protocols are structured in a nested way, in layers. Layers is the official terminology adopted by the specifications, with the exception of level zero, which I just added here for convenience; in the specifications the layers start from one and proceed to three. From the offensive perspective, the most interesting is layer three, as you can see from the screenshot of the specifications, because it encodes most of the high-level protocol data, such as handling SMS and GSM. This is the part of the protocol which actually contains interesting data structures with TLV values and so on.
When people talk about attacking basebands, they usually mean attacking the baseband over the air, the OTA attack vector, which is definitely one of the most interesting. But let's take a step back and consider the entire big picture of the baseband ecosystem. This diagram presents a unified view of the generalized architecture of a modern baseband with its attack surfaces. First of all, there are two separate, distinct processors: the AP, the application processor, and the MP, the mobile processor, which may be either a DSP or another CPU. Usually there are two separate processors, and each one of them runs a separate operating system.
In the case of the AP it may be Android or iOS, and the baseband processor will run some sort of real-time operating system provided by the mobile vendor. An important point here is that on modern implementations the baseband is actually protected by some sort of secure execution environment, maybe TrustZone on Androids or SEPOS on Apple devices. This means that the privilege boundary which is depicted here on the left side is dual-sided. Even if you have kernel access to the Android kernel, you are still not supposed to be able to read the memory of the baseband or somehow interfere with its operation, at least on modern production smartphones. The same goes for the baseband, which is not supposed to be able to access the application processor directly. So these two are mutually distrusting entities that are separated from each other, and the privilege boundary between them represents an attack surface.
Within the real-time operating system there are three large attack surfaces. Going from right to left, the rightmost gray box represents the attack surface of the cellular stacks. This is the code which actually parses the cellular protocols. It usually runs in several distinct real-time operating system tasks, and this part of the attack surface handles all the layers of the protocol. There is a huge amount of parsing that happens here. The second box represents the various management protocols. The simplest one to think about is the AT command protocol. It is still widely included in all basebands, and it's even usually exposed in some way to the application processor, so you can actually send some AT commands to the cellular modem. A bit more interesting are the vendor-specific management protocols, one of which is the DIAG protocol. Because modern basebands are very complex, vendors need some sort of specialized protocol to enable configuration and diagnostics for the OEMs.
In the case of Qualcomm, for example, DIAG is just one of the many diagnostic protocols involved. The third box is what I call the RTOS core. It is various core-level functionality, such as the code which implements the interface to the application processor.
On the side of the application operating system, such as Android, there are also two attack surfaces that are attackable from the baseband. The first one is the peripheral drivers: because the baseband is a separate peripheral, it requires some specialized drivers that handle I/O and such things. The second one is the attack surface represented by the various interface handlers. Because the baseband and the main operating system cannot communicate directly, they use some sort of specialized interface to do that. In the case of Qualcomm this is shared memory. These shared memory implementations are usually quite complex, and they represent an attack surface on both sides.
And finally, the third piece of this diagram is in the lowest part. I have depicted two grey boxes which are related to the trusted execution environment. Typically a modem runs as a Trustlet in a secure environment, so technically the attack surfaces that exist within TrustZone or are related to it can also be useful for baseband offensive research. Here we can distinguish at least two large attack surfaces. The first one is the secure monitor call handlers, which are the core interface that handles calls from the application processor to TrustZone. The second one is the Trustlets, which are separate pieces of code that are executed and protected by TrustZone. On this diagram I have also added some information about data codecs. I'm not sure if they are supposed to be in the RTOS core, because these things are usually directly accessible from the cellular stacks, especially ASN.1, in which I have seen some bugs reachable from the over-the-air interface.
On this diagram I have shown some examples of vulnerabilities. I will not discuss them in detail here since it's not the point of the presentation, but at least for the ones from Baodong you can find the writeups on the Internet.
To discuss baseband offensive tools and approaches, I have narrowed down the previous diagram to just one attack surface, the over-the-air attack surface. This is the attack surface which is represented by the parsing implementations of various cellular protocols inside the baseband operating system, and this is the attack surface that we can reach from the air interface. In order to accomplish that, we need a transceiver, such as a software-defined radio or a mobile tester, which is able to talk the specific cellular protocol that we're planning to attack. The simplest way to accomplish this is to use some sort of software-defined radio, such as an Ettus Research USRP or a bladeRF, and install an open source implementation of a base station, such as OpenBTS or OpenBSC. The thing to note here is that the software-based implementations actually lag behind the development of the technologies. Implementations of GSM base stations, such as OpenBTS, are very well established and popular, and in fact, when I tried to set up a BTS with my USRP, it was quite simple. For UMTS and LTE there exist fewer software-based implementations, and there are also more constraints on the hardware. For example, my model of the USRP does not support UMTS due to resource constraints. And the most interesting thing here is that there does not exist any software-based implementation of CDMA that you can use to set up a base station.
This is a pseudorandom diagram of one of the Snapdragon chips. There exists a huge number of various Snapdragon models. I chose this one pseudorandomly when I was searching for some sort of visual diagram. Qualcomm used to include some high-level diagrams of the architecture in their marketing materials previously.
They don't do this anymore, so this particular diagram is from the technical specification of one particular model, the 820. This particular Snapdragon model is also a bit interesting because it is the first one that included the artificial intelligence engine, which is also based on Hexagon. For our purposes, the main interest here is the processors. The majority of Snapdragons include quite a long list of processors. There are at least four ARM-based Kryo CPUs that actually run the Android operating system. Then there are the Adreno GPUs, and then there are several Hexagons. On the most recent models there is not just one Hexagon processing unit but several of them, and they are named according to their purposes. Each one of these Hexagon cores is responsible for handling a specific functionality. For example, the MDSP handles the modem and runs the real-time operating system, the ADSP handles media and the CDSP handles compute. So the Hexagons actually represent around one half of the processing power on modern Snapdragons.
There are two key points about the Hexagon architecture from the hardware perspective. First of all, Hexagon is specialized for parallel processing, and so the first concept is variable-size instruction packets. It means that several instructions can execute simultaneously in separate execution units. It also uses hardware multithreading for the same purpose. On the right side of the slide here is an example of Hexagon assembly. It is quite funny at times. The curly brackets delimit the instructions that are executed simultaneously, and these instructions must be compatible in order to be able to use the distinct processing slots. And then there is the funny .new notation, which actually enables instructions to use both the old and the new value of a particular register within the same instruction cycle. This provides quite a bit of optimization at the lower level.
For more information, I can direct you to the Hexagon specification and programmer's reference manual, which is available from the Qualcomm website.
The concept of production fusing is quite common. As I said previously, it's a common practice for mobile device vendors to lock down the devices before they enter the market, to prevent modifications and tinkering. There are several ways in which this locking down can be accomplished. Usually various advanced diagnostic and debugging functionalities are removed from either software or hardware, or both. It is quite common that these functionalities are only removed from software while the hardware remains in place, and in such cases researchers will eventually come up with their own software-based implementation of the functionality, as is the case with some custom iOS kernel debuggers, for example. In the case of Qualcomm, there was at some point a leaked internal memo which discusses what exactly they do for production fusing of the devices.
In addition to production fusing, on modern Androids the baseband runs within TrustZone, and on my device it is already quite locked down. The baseband uses a separate component named the MBA, which stands for modem boot authenticator, and this entire thing is run by the subsystem of the Android kernel named PIL, the peripheral image loader. You can open the source code and investigate how exactly it looks. The purpose of the MBA is to authenticate the modem firmware, so that you would not be able to inject some arbitrary code into the modem firmware and flash it. This is another side of the hardening, which makes it very difficult to inject any arbitrary code into the baseband. Basically, the only way to do this is through a software vulnerability.
During this project I partially reverse engineered the Hexagon modem firmware from my device, a Nexus 6P.
The process of reverse engineering is not very difficult, because all you need is to download the firmware from the website, Google's website in this case. Then you need to find the binary which corresponds to the modem firmware. This binary is actually a compound binary that must be divided into separate binaries that represent specific sections inside the firmware, and for that purpose we can use the unify_trustlet script. After you have split the baseband firmware into separate sections, you can load them into IDA Pro. There are several plugins available for IDA Pro that support Hexagon. I have tried one of them, I think it was the one from GSMK, and it works quite well for basic reverse engineering purposes. Notable here is that some sections of the modem firmware are compressed and relocated at runtime, so you will not be able to reverse engineer them unless you can decompress them, which is also a bit of a challenge because Qualcomm uses an internal compression algorithm for that.
For the reverse engineering, the main approach is to get started from some root points. For example, because this is a real-time operating system, we know that it should have some task structures that we can locate, and from there we can locate some interesting code. In the case of Hexagon this is a bit non-trivial because the firmware doesn't have any log strings. So even though you may locate something that looks like a task struct, it's not clear which code it actually represents. The first step is therefore to apply the log strings that were removed from the binary by Qshrink. I think the only way to do it is by using the msg_hash.txt file from the leaked sources. This file is not supposed to be available either on mobile devices or in any open ecosystem. After you have applied these log strings, you will be able to rename some functions.
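As a rough sketch of what "applying the log strings" can look like, the Python snippet below loads a hash-to-string table and derives a module hint for a hash found in the binary. The file layout (`hash:source_file:message`, with a decimal hash), the sample entries and the helper names `load_msg_hashes` and `module_hint` are all assumptions made for illustration; the actual format of msg_hash.txt should be checked against the file itself.

```python
# Hypothetical sample in an ASSUMED "hash:source_file:message" layout.
SAMPLE_MSG_HASH = """\
1234567:diag.c:Dropped %d bytes
7654321:diagpkt.c:Unknown subsystem %d: ignoring
"""


def load_msg_hashes(text):
    """Build {hash: (source_file, message)} from the assumed text layout."""
    table = {}
    for line in text.splitlines():
        # Split on the first two colons only; messages may contain colons.
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        hash_text, source_file, message = parts
        try:
            table[int(hash_text)] = (source_file, message)
        except ValueError:
            continue  # skip lines whose hash is not a decimal number
    return table


def module_hint(table, msg_hash):
    """Return the source module name for a hash, or None if it is unknown."""
    entry = table.get(msg_hash)
    if entry is None:
        return None
    source_file, _message = entry
    return source_file.rsplit(".", 1)[0]


table = load_msg_hashes(SAMPLE_MSG_HASH)
```

In a disassembler script, a lookup like `module_hint(table, value)` could be run on every constant passed to the logging routine, and the result used as a prefix when renaming the enclosing function.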
Because the log strings often contain the name of the source file or source module from which the code was built, they create an opportunity to understand what exactly this code is doing.
Debugging was completely unavailable in my case, and I realized that it would require a couple more months of work to make it work. I think the only way, and the best way, is to create a software-based debugger similar to modkit, a publication that I will list in the references, based on a software vulnerability in either the modem itself, or in some authenticator, or in TrustZone, so that we can inject software debugger callbacks into the baseband and connect them to a GDB stub.
This is how the part of the firmware that has the log strings stripped out looks. Here it already has some names applied using an IDA script. Of course there were no such names initially, only the hashes. Each one of these hashes represents a log string that you can take from the message hash file. And here is what you get after you have applied the textual messages and renamed some functions. In this case you will be able to find some hundreds of procedures that are directly related to the DIAG subsystem. In a similar way you can locate various subsystems related to the over-the-air vectors as well. Unfortunately, the majority of the OTA vectors are located in the segments that are not immediately available in the firmware, the ones that are compressed and relocated.
Meanwhile, I tried many different things during this project. One thing that definitely worked is building the MSM kernel. There is nothing special about this, just a regular cross-build. Another commonly known offensive approach is firmware downgrades, where you take some old firmware that contains a well-known security vulnerability, flash it, and use the bug to create an exploit to achieve some additional functionality or introspection into the system.
This part definitely works. Downgrades are trivial for the entire firmware, for the modem and for TrustZone. I did try to build the Qualcomm firmware from the leaked source code. I assigned just a few days to the task since it's not mission-critical, and I ran out of time; probably it was a different version of the source code. But this is not a critical project, because building leaked firmware is not directly relevant to finding new bugs in the production firmware, so I just set it aside for some later investigation. I have also investigated the ramdump ecosystem a little bit, on the software side at least, and it seems that it's also fused quite reliably.
This is when I remembered about Qualcomm DIAG. During the initial reconnaissance I stumbled on some whitepapers and slides that mentioned the Qualcomm diagnostic protocol, and it seemed like quite a powerful protocol, specifically with respect to reconfiguring the baseband. So I decided, first of all, to test it in case it would actually provide some advanced introspection functionality, and then probably to use the protocol for enabling log dumps.
Qualcomm DIAG, or QCDM, is a proprietary protocol developed by Qualcomm for the purposes of advanced baseband software configuration and diagnostics. It is mostly aimed at OEM developers, not at users. The Qualcomm DIAG protocol consists of around 200 commands, at least in theory. Some of them are quite powerful on paper, such as downloader mode and read/write memory. Initially DIAG was partially reverse engineered around 2010 and included in the open source project named ModemManager. Then it was also exposed in a presentation at the Chaos Communication Congress 2011 by Guillaume Delugré. I think this presentation popularized it, and this is the one that introduced me to this protocol.
Unfortunately, the majority of that presentation is not really relevant to modern production phones, but it does provide a high-level overview and a general expectation of what you will have to deal with.
From the offensive perspective, the DIAG protocol represents a local attack vector from the application processor to the baseband. A common scenario of how it can be useful is unlocking mobile phones which are locked to a particular mobile carrier. If we find a memory corruption vulnerability in the DIAG protocol, it may be possible to execute code directly on the baseband and change some internal settings. Historically this has usually been accomplished through the AT command handlers, but internal proprietary protocols are also very convenient for that. The second scenario in which DIAG offensive research can be useful is injecting a software-based debugger. If you can find a bug in DIAG that enables a read/write capability on the baseband, you can inject some debugging hooks and eventually connect them to a GDB stub. So it makes it possible to create a software-based debugger even when JTAG is not available.
What has changed in DIAG in 10 years, based on some cursory investigation that I did? First of all, the original publication described a Qualcomm baseband based on ARM and running the REX operating system. All modern Qualcomm basebands are based on Hexagon as opposed to ARM, and the REX operating system was replaced with QuRT, which I think still has some bits of REX, but in general it's a different operating system. The majority of the super powerful DIAG commands, such as downloader mode and memory read/write, were removed, at least on my device. Also, it does not expose any immediately available interfaces such as a USB channel. I hear that it's possible to enable the USB DIAG channel by adding some special boot properties, but it shouldn't be expected to be available on all devices.
These observations are based on my test device, a Nexus 6P, and this should be around a medium level of hardening. More modern devices, such as recent Google Pixels, should be expected to be even more hardened than that, especially on the Google side, because they take hardening very seriously. On the other side of the spectrum, if you think about some no-name modem sticks, these things can be more open and easier to investigate.
The DIAG implementation architecture is relatively simple. This diagram is based roughly on the same diagram that I presented at the beginning of the talk. On the left side there is the Android kernel and on the right side there is the baseband operating system. The DIAG protocol actually works in both directions. It's not only commands that can be sent by the application processor to the baseband, but also messages that can be sent by the baseband to the application processor. So DIAG commands are not really commands; they're more like tokens that can also be used to encode messages. The green arrows on this slide represent an example of the data flow originating from the baseband and going to the application processor. Obviously, in the case of commands there would be a reverse data flow.
The main entity inside the baseband operating system responsible for DIAG is the DIAG task. It is a separate task which specifically handles the various operations related to the DIAG protocol. The exchange of data between the DIAG task and other tasks is done through a ring buffer. So, for example, if some task needs to log something through DIAG, it will use specialized logging APIs that will in turn put the logging data into the ring buffer. The ring buffer is drained either on a timer or on a software-based interrupt from the caller.
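The producer/consumer relationship around the ring buffer can be sketched as follows. This is only a conceptual model in Python, not the baseband's implementation: the class name `LogRing`, the fixed record capacity and the drop-oldest overflow policy are assumptions.

```python
from collections import deque


class LogRing:
    """Conceptual model of a log ring buffer shared between tasks.

    Assumption: when the buffer is full, the oldest record is overwritten.
    """

    def __init__(self, capacity):
        self._records = deque(maxlen=capacity)
        self.dropped = 0

    def put(self, record):
        """Called by any task through a logging API to queue a record."""
        if len(self._records) == self._records.maxlen:
            self.dropped += 1  # the append below evicts the oldest record
        self._records.append(record)

    def drain(self):
        """Called by the consumer task on a timer tick or an explicit signal."""
        records = list(self._records)
        self._records.clear()
        return records


ring = LogRing(capacity=2)
ring.put(b"task A: attach")
ring.put(b"task B: paging")
ring.put(b"task A: detach")  # buffer is full, so the oldest record is lost
drained = ring.drain()
```

The point of the design is that logging tasks never block on the output interface: they only write into the buffer, and a single consumer decides when to package the contents and send them out.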
Once the ring buffer is drained, the data is wrapped into the DIAG protocol, and from there it goes to the SIO task, the serial I/O task, which is responsible for sending the output to a specific interface, depending on the baseband configuration. The main interface that I was dealing with is the shared memory, which ends up in the diagchar driver inside the Android kernel. In the case of sending commands from the Android kernel to the baseband, it is the reverse flow. First you need to craft the DIAG protocol data and send it through the diagchar driver, which will write it to the shared memory interface. From there it goes to the specialized task in the baseband and eventually ends up in the DIAG task and potentially other responsible tasks.
On the Android side, DIAG is represented by the /dev/diag device, which is implemented by the diagchar and diagfwd kernel drivers in the MSM kernel. The purpose of the diagchar driver is to support the DIAG interface. It is quite complex in code, but functionally it's quite simple. It contains a basic minimum of DIAG commands that enable configuration of the interface on the baseband side, and it is able to multiplex the DIAG channel to either USB or a memory device. It also contains some IOCTLs for configuration that can be accessed from the Android userland. And finally, the driver filters out various DIAG commands that it considers unnecessary. This is a bit important, because when you try to do some tests and send arbitrary DIAG commands through the DIAG interface, you will be required to rebuild the driver to remove this masking, otherwise your commands will not make it to the baseband side. At the core, the diagchar driver is based on the SMD shared memory device interface, which is a core interface specific to Qualcomm modems. So this is where diagchar is on the diagram.
The diagchar driver itself is located among the application OS's vendor-specific drivers. Then there is a shared memory implementation in the baseband that handles this, and the DIAG implementation itself. The diagchar driver is quite complex in code, but the functionality is quite simple. It implements a handful of IOCTLs that enable some configuration. I didn't check what exactly these IOCTLs are responsible for. It exposes the /dev/diag device, which is available for reading and writing. However, by default you are not able to access the DIAG channel through this device, because there is the diag_switch_logging function, which switches the channel that is used for DIAG communications. On the screen there are several modes listed, but in practice only two of them are supported: the USB mode and the memory device mode. USB mode is the default, which is why, if you just open the /dev/diag device and try to read something from it, it won't work, because it is tied to USB. In order to reconfigure it to use the memory device, you need to send a special IOCTL code. Notice the procedure named mask_request_validate, which applies quite strict filtering to the DIAG commands that you try to send through this interface. It filters out basically everything with the exception of some basic configuration requests.
At the core, the diagchar driver uses the shared memory device to communicate with the baseband. The SMD implementation is quite complex. One of the APIs it exposes is the SMD read API, which is used by diagchar for reading the data from the shared memory. Shared memory also operates on the abstraction of channels, which are accessed through the API named smd_named_open_on_edge. You can notice here that there are some DIAG-specific channels that can be opened. Now, let's take a look at the SMD implementation.
This is important because the shared memory device represents a part of the attack surface for escalation from the modem to the application processor. This is a very important attack surface, because if you just achieve code execution on the baseband, it's mostly useless, since the baseband cannot access the main operating system. In order to make it useful, you need to create an exploit chain and add one more exploit, based on a bug that gives privilege escalation from the modem to the application processor. The shared memory device is one of the attack surfaces for this.
The shared memory device is implemented as a memory region exposed by the Qualcomm peripheral. The specialized MSM driver maps it, and here the base of the shared memory region is named smem_ram_phys. The shared memory region operates on the concept of entries and channels, so it's partitioned into distinct parts that can be accessed through the procedure smem_get_entry. One of these entries is SMEM_CHANNEL_ALLOC_TBL, which contains the list of available channels that can be opened. From there we can actually open the channels and use the shared memory interface.
During this initial research project, it wasn't my goal to research the entire Qualcomm ecosystem. While I was preparing for this talk, I noticed some more interesting things in the source code, such as, for example, the specialized driver that handles a JTAG memory region, which is presumably exposed by some Qualcomm systems-on-chip. In the drivers this is mostly used read-only, and I suppose that it will not really work for writing, but it's probably worth checking.
And now, finally, let's take a look at the DIAG protocol itself. One of the first things that I noticed when researching the DIAG protocol is that it's actually used in a few places, not only in libqcdm. A popular tool named SnoopSnitch can enable cellular protocol dumps on rooted devices.
And in order to accomplish this, SnoopSnitch sends an opaque blob of commands to the mobile device through the DIAG interface. This blob is not documented. So it got me curious what exactly these commands are doing. But before we can look at the dump, let's understand the protocol. The DIAG protocol consists of around 200 commands or tokens. Some of them are documented in the open source, but not all of them. So you can notice on the screenshots, some of the commands are missing. And one of the missing commands is actually the token 0x92 hexadecimal, which represents an encoded hash log message. The command format is quite simple. The base primitive here is the DIAG token number 0x7E. It's not really a delimiter, it's a separate DIAG command, 126. It's missing in the open source, as you can see here. So the DIAG command is nested. The outer layer consists of this wrapper of 0x7E hexadecimal bytes. Then there is the main command, and then there is some variable length data that can contain even more subcommands. This entire thing is verified using the CRC, and some bytes are escaped, specifically as you can see on the snippet. One interesting thing about the DIAG protocol is that it supports subsystem extensions. Basically, different subsystems in the baseband can register their own DIAG system handlers, arbitrary ones. And there is a special DIAG command number 75, which simply instructs the DIAG system to forward this command to the respective subsystem. And then it will be parsed there. There exists quite a large number of subsystems. Not all of them are documented, and when I started investigating this, I noticed that there actually exists a DIAG subsystem and a debugging subsystem. The latter one immediately interested me because I was hoping that it would enable some more advanced introspection through this debugging subsystem. But it turned out that the debugging subsystem is quite simple. 
It only supported one command: inject crash. So you can send a special DIAG command that will inject the crash into the baseband. I will talk later about this. Now, let's take a look at specific examples of the DIAG protocol. This is the annotated snippet of the blob of commands from SnoopSnitch. This blob actually consists of three large logical parts. The first part is largely irrelevant. It's a bunch of commands that request various information from the baseband, such as timestamp, version info, build id and so on. The second batch of commands starts with the command number 0x73 hexadecimal. This is DIAG_CMD_LOG_CONFIG. This is the command which enables protocol dumps and configures them. And the third part of this blob starts with the command number 0x7D hexadecimal. This is the CMD_EXT_MESSAGE_CONFIG. This is actually the command that is supposed to enable textual message logging, except that in the case of SnoopSnitch it disables all of the logging altogether. So how do cellular protocol dumps actually work? In order to enable the cellular protocol dumps, we need DIAG_CMD_LOG_CONFIG, number 0x73 hexadecimal. It is partially documented in libqcdm. The structure of the packet would contain the code and the subcommand, which would be set mask in this case. It also needs an equipment ID, which corresponds to the specific protocol that we want to dump. And finally, the masks that are applied to filter some parts of the dump. This is relatively straightforward. And now the second command, DIAG_CMD_EXT_MESSAGE_CONFIG. This is the one which is supposed to enable textual message logs. The command format is undocumented. So let's take a closer look at it. The command consists of a subcommand. In this case, it's subcommand number 4, the set mask. And then there are two 16 bit integers, SSID start and end. SSID is subsystem ID, which is not the same as DIAG subsystems. 
And the last one is the mask. So subsystem IDs are used to filter the messages based on a specific subsystem, because there is a huge amount of subsystems in the baseband. And if all of them start logging, this is a huge amount of data. So DIAG provides this capability to filter a little bit, down to a specific subsystem that you're interested in. The snippet of Python code here is an example of how to enable textual message logging for all subsystems. You need to set the mask to all 1s. And this is quite a lot of logging in my experience. Now for parsing the incoming log messages, there are two types of DIAG tokens, and both of them are undocumented. The first one is a legacy message, number 0x79 hexadecimal. This is a simple ASCII based message that arrives through the DIAG interface, so you can parse it quite straightforwardly. The second one I called DIAG_CMD_LOG_HASH, it's number 0x92 hexadecimal. This is the token which encodes the log messages that contain only the hashes. This is the one where, if you have the msg_hash.txt file, you can match the hash that arrived with this command against the messages provided in the text file. And you can get the textual logs. On the lower part of the slide there are two examples of hexdumps from both commands. Both of them have a similar structure. First, there are 4 bytes that are essential. The first one is the command itself. And the third byte, which is quite interesting, is the number of arguments included. Next there is a 64 bit timestamp value. Next there is the SSID value, 16 bit. Then some line number, and I'm not sure what the next argument is. And finally, after that, there is either an ASCII encoded log string in plain text or a hash of the log string. And optionally some arguments may be included, though in the case of the first, legacy command, the arguments are included before the log message, and in the case of the second command they are included after the MD5 hash in the log message, at least in my version of this implementation. 
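The framing and the set mask request described earlier can be put together in a short Python sketch. This is a reconstruction under assumptions, not the snippet from the slide. The talk states only that a packet is wrapped with 0x7E, verified with a CRC, has some bytes escaped, and that the set mask request carries subcommand 4, an SSID start and end, and a mask. The CRC variant (CRC-16/X-25, appended little-endian), the 0x7D escape byte with XOR 0x20, the 16 bit padding field and the choice of one 32 bit mask per SSID follow the HDLC-style conventions I know from open-source DIAG tools and should be checked against real traffic. All function names are hypothetical.

```python
import struct

HDLC_END = 0x7E  # trailing frame marker, as mentioned in the talk
HDLC_ESC = 0x7D  # assumed escape byte (HDLC convention)
HDLC_XOR = 0x20  # assumed escape mask

DIAG_CMD_EXT_MESSAGE_CONFIG = 0x7D  # command number given in the talk
SUBCMD_SET_MASK = 4                 # subcommand number given in the talk


def crc16_x25(data: bytes) -> int:
    """CRC-16/X-25: reflected polynomial 0x8408, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def frame(payload: bytes) -> bytes:
    """Append the CRC, escape reserved bytes, and add the trailing 0x7E."""
    out = bytearray()
    for b in payload + struct.pack("<H", crc16_x25(payload)):
        if b in (HDLC_ESC, HDLC_END):
            out += bytes((HDLC_ESC, b ^ HDLC_XOR))
        else:
            out.append(b)
    out.append(HDLC_END)
    return bytes(out)


def unframe(raw: bytes) -> bytes:
    """Reverse frame(): strip the marker, unescape, and verify the CRC."""
    if not raw or raw[-1] != HDLC_END:
        raise ValueError("missing trailing 0x7E")
    body = bytearray()
    it = iter(raw[:-1])
    for b in it:
        if b == HDLC_ESC:
            b = next(it) ^ HDLC_XOR
        body.append(b)
    payload, crc = bytes(body[:-2]), struct.unpack("<H", body[-2:])[0]
    if crc16_x25(payload) != crc:
        raise ValueError("CRC mismatch")
    return payload


def build_set_mask(ssid_start: int, ssid_end: int, mask: int = 0xFFFFFFFF) -> bytes:
    """Build an unframed set mask request for an inclusive SSID range.

    Assumed layout: command, subcommand, SSID start, SSID end,
    16 bit padding, then one 32 bit mask per SSID (all 1s by default).
    """
    count = ssid_end - ssid_start + 1
    header = struct.pack("<BBHHH", DIAG_CMD_EXT_MESSAGE_CONFIG,
                         SUBCMD_SET_MASK, ssid_start, ssid_end, 0)
    return header + struct.pack("<%dI" % count, *([mask] * count))
```

For example, frame(build_set_mask(0, 10)) would produce the bytes to write to the DIAG channel for the first eleven subsystem IDs. Because the command number 0x7D happens to equal the assumed escape byte, the framed packet starts with the escaped pair 0x7D 0x5D.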
And this is the DIAG packet that enables you to inject a crash into the baseband, at least in theory. Because in my case it did not work. And by not working, I mean that it simply did not enter the baseband. Normally, I would expect that on a production device it should just reset the baseband. You will not get a crash dump or anything like that, just a reset. So I suppose that it still should be working on some other devices. So it's worth checking. There are a few types of crashes that you can request in this way. In order to accomplish this, I needed a very simple tool with basically two functions. First, direct easy access to the DIAG interface, ideally through some sort of Python shell. And second, the ability to read and parse data with advanced log strings. For that purpose, I wrote a simple framework that I named diagtalk, which is based directly on the DIAG interface in the Android kernel, with a Python harness. So on the left side, here is the example of some advanced parsing with some leaked values. And on the right side, here is the example of the advanced message log, which includes the log strings that were stripped out from the firmware. The log is quite fun, as I expected it to be. It has a lot of detailed data, such as, for example, GPS coordinates and various attempts of the baseband to connect to different channels. And I think it's quite useful for offensive research purposes. It even contains raw pointers sometimes, as you can notice on the screenshot. So in this project, my conclusion was that indeed I was reassured that it was the right choice, and Hexagon seems to be quite a challenging target, and it would probably need several more months of work to even begin to do some serious offensive work. I also started to think about writing a software debugger because it seems to be probably the most reliable way to achieve debugging introspection. 
And also, I noticed some blank spaces in the field that may require future work. For Qualcomm Hexagon specifically, there are a lot of things that can be done. For example, you can take a look at other Qualcomm proprietary diagnostic protocols, of which there are a few, such as QMI for example. I think they are lesser known than the DIAG protocol. And then there is a requirement to create a full system emulation based on QEMU, at least for some chips. And there is a big problem with the decompiler, which is a major obstacle to any serious static analysis of the code. And for the offensive research, there are 3 large directions. The first one is enabling debugging. There are different ways for that. For example, software based debugging or, on the other hand, bypassing JTAG fusing. Next, there are explorations of the over the air attack vectors. And the 3rd one is escalation from the baseband to the application processor. These are the 3 large offensive research vectors. And for basebands in general, there also exist some interesting directions of future work. First of all, OsmocomBB. It definitely deserves a bit of an update. It is the only open source implementation of a baseband. And it is so outdated. And it is based on some really obscure hardware. Another problem here is that there doesn't exist any software based CDMA implementation. No sound Herald: Alisa, thank you very much for this nice talk. Um, there are some questions from the audience. So basically the first one is a little bit of an icebreaker: Do you use a mobile phone? And do you trust it? Alisa: No, I don't. I try to use a mobile phone only for Twitter. Does anyone still use mobile phones nowadays? H: laughs Well, no idea. Another question concerns the other Qualcomm chips. Did you have a look at the Qualcomm Wi-Fi chipsets? A: As I mentioned during the talk, I had only one month. It was like a short reconnaissance project, so I didn't really have time to investigate everything. 
I did notice that Qualcomm SoCs have a Wi-Fi chip, which is also based on Hexagon. And more than that, it also shares some of the same low level technical primitives. So it's definitely worth looking at, but I didn't investigate it in detail. H: OK, OK, thanks. There is also a pretty technical question here: instead of having to go through the rigorous command checking of the diagchar driver, wouldn't it be possible to mmap /dev/mem into a userspace process and send over commands directly? It depends a little bit on what the goal is. A: OK, so it really depends on your previous background and your goals. The point here is that by default, the diagchar ecosystem does not allow sending arbitrary DIAG commands. So either way, you will have to hack something. One way to hack this is to rebuild the actual driver. So you would be able to send the commands directly through the DIAG interface. Another way would be to access the shared memory directly, for example. But I think it would be more complex because the Qualcomm shared memory implementation is quite complex. So I think that the easiest way would actually be to hack the diagchar driver and use the /dev/diag interface for this. H: OK, thanks. Thanks. There is one question which I'm going to read out, maybe you can make sense of it: is this typically [unclear] security for all mobile phones? A: This level of hardening that I presented, I think, is around medium level. So usually production phones are even more hardened. If you take a look at things like the Google Pixel 5 or the latest iPhones, they will be even better hardened than the one that I discussed. H: Oh, OK. Yeah, thanks. Thanks then. So it doesn't look like we have any more questions left. Anyway, so if you want to get in contact with Alisa, no problem. There is the feedback tab below your video now at the moment, just drop your questions over there. And that's a way to get in touch with Alisa. 
Other than that I would say we're done for today for this session. Thank you very, very much Alisa for this really nice presentation once again. Applause And I'll transfer now over to the Herald News Show. postroll music Subtitles created by c3subtitles.de in the year 2021. Join, and help us!