[rC3 opening music]

Jiska: Hello everyone and welcome to my talk, Fuzzing the phone in the iPhone. The phone in the iPhone is the component that receives SMS, sends SMS, receives phone calls, makes phone calls and also manages your Internet connection when you are not on Wi-Fi. However, you might now wonder, what is it exactly? So I'm talking about CommCenter and fuzzing it via the QMI and ARI interfaces. But this is a bit too technical for most of you. So I will first introduce you to the concept of fuzzing in general and protocol fuzzing before I dive into further details. For those of you who have not yet heard about the concept of fuzzing - you send a lot of random messages and try to test the security of an interface with this. And in this video, you can see how I send SMS over a Frida-based fuzzer with something like 400 fuzz cases per second. And then the IMH receives them, catches them and sends a couple of them also to the smartphone.

Let's start with a motivation and an explanation of the attacker model. So, if you look into a modern smartphone, you have two components, if you want to show it in a simple way. First of all, there's the hardware part with a lot of chips. And then on top of this, there is an operating system and applications. However, it's not as simple as this, because even those chips are so complex that they run their own little real-time operating systems to preprocess data. So this means that you can even get code execution on such a chip. And this is usually much easier than in the operating system itself, because those chips cannot have that many mitigations. However, what do you even do if you have code execution on such a chip? If you are in a baseband chip, then one escalation strategy from the chip towards the operating system might be to manipulate traffic in the browser.
However, I don't think that this is the case, because if you look at the Zerodium price list, the browser exploits are actually much more expensive. So it's probably not done like this, and there must be other ways to escalate from this chip into the operating system.

In general, traffic manipulation is something that you can always do in wireless transmission or also on the Internet. So if you look at how those systems work these days, you have something like the Internet in general that serves websites and so on, and also the core network of your mobile provider. And there are many, many ways to manipulate traffic, either if you are a state-level actor who is able to have something in the core network, or just by sending around websites or modifying websites. And then there is the base station subsystem, there might also be dragons. We don't know exactly. And of course, there are over-the-air transmissions, and wireless transmissions are very special because, if there is something just slightly broken in the encryption, for example, then it's also possible to manipulate traffic there, if you have a software-defined radio, for example. So all of this could be attacked to manipulate traffic. And I don't think that for this, one would craft a baseband exploit.

Already in 2014 at the CCC, there were two talks about the SS7 protocol, which runs in the core network and is actually meant to connect different mobile carriers to each other. And this can also be used to intercept phone calls, for example. And this has also been exploited recently. So even though there have been some mitigations etc. since then, it's still exploited for the same purpose, to spy on people. So really, really, really, baseband exploits only exist to escalate from the chip into the operating system. But now the question is, what are the strategies? So if it's not via the browser, what else could it be?
So the browser, I'm really sure it is not, because you also need to have some traffic and so on. It doesn't really work instantly, you need to visit a website to replace traffic on a website and so on. There must be something else. So if you are on the chip with remote code execution and want to go into the operating system, there is some interface. And this means that something in those interfaces needs to be exploitable, so that you can escalate the privileges from the chip into the system.

And also, those interfaces are very interesting from a reverse engineer's perspective. So even if you don't want to attack anything, just understanding how they work is also a goal of this work. So, for example, if you have a baseband debug profile, you can just download it onto your iPhone, and then, if you open your iDevice syslog, you can already see a lot of management messages that are exchanged between the chip and the iPhone. And if you have a jailbreak and Frida, you can even inject packets or modify packets to change the behaviour of your modem.

But if you want to start to work on such a thing, the question is, how do you even start? Where do you start? And fuzzing is actually a method that can be used to understand such an interface. So initially, once you have identified an interface, you can check if it is the correct interface, so, if it really changes behaviour when you flip some bytes, but also how powerful this interface is. So what are the features? What breaks instantly? And if things break, you can also check if the whole interface has been designed with security in mind.

Now, let's start with an introduction to wireless protocol fuzzing. This will also be a short rant, because the current tooling for fuzzing is usually not made to fuzz a protocol. So let's start with a very simple fuzzer, a fuzzer that just targets an image parser.
So, you browse your smartphone for unicorn pictures or PNGs or JPEGs, and then you send them to the image parser, and in the image parser you might be able to observe which functions are executed in the form of basic blocks. And then, while it runs, the image parser can even report which parts were executed, and you can just run the image parser again and again with different images and get this basic block coverage back. In a next step, you can then combine existing images or flip bits in these images and send them to the image parser and again observe the coverage. Most of the time, it won't generate any new coverage, so you just say you are not looking into this image in particular. But sometimes you might get new coverage, like here, and then you add this image to your corpus. So over time, you can increase your corpus and increase your coverage.

Another method can be used if you know exactly what an image format looks like. So you might know the JPEG specification, and because of this, you could just generate images that are more or less specification-compliant, and they look more artificial, like this. So you just generate images and send them to the image parser, and at some point you might observe a crash. That also depends, again, on your harnessing. Maybe you can observe basic blocks, maybe you can just observe crashes, and then you know at which image you had a crash. You might even be able to combine these two approaches, just depending on what you know about your input and how you can harness your target.

Now it looks a bit different for a protocol. In a protocol, you can have a very complex state. Let's say you are in an active phone call, or just something like, you receive an SMS. You can actually force the iPhone to receive SMS if you have a second iPhone and send SMS. And then during the fuzzing, you can replace some bits and bytes, like this, and then you would have a modification. So this is a very simple approach and it preserves the state.
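The coverage-guided loop described above for the image parser can be sketched roughly as follows. This is a minimal illustration, not the tooling from the talk: `toy_parser` is a hypothetical stand-in for an instrumented image parser, and its "basic block" names are made up.

```python
import random

def toy_parser(data):
    """Hypothetical stand-in for an instrumented image parser.

    Returns the set of made-up 'basic block' names reached for this input.
    """
    covered = {"entry"}
    if data[:2] == b"\xff\xd8":              # JPEG start-of-image marker
        covered.add("jpeg_header")
        if len(data) > 4 and data[2] == 0xFF:
            covered.add("jpeg_segment")
            if data[3] == 0xE0:
                covered.add("jfif_app0")
    return covered

def mutate(data, rng):
    """Flip one random bit of a non-empty input."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)

def fuzz(seeds, iterations, rng):
    """Keep only mutated inputs that reach coverage not seen before."""
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= toy_parser(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_parser(candidate)
        if not coverage <= seen:             # something new was executed
            corpus.append(candidate)
            seen |= coverage
    return corpus, seen
```

Calling `fuzz([b"\xff\xd8\x00\x00\x00"], 2000, random.Random(1))` grows the corpus only when a bit flip happens to reach a new branch, which is the "most of the time no new coverage, sometimes a new corpus entry" behaviour described in the talk.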
So no matter how complex the thing is that you're currently doing, it's very simple to flip a bit here and there in an active interaction. But it's also a bit annoying, because you need to have these active phone calls, etc.

So something that's more efficient is injection. You would observe certain messages and then just send them again - and then you don't even need the second phone to make calls, etc. - you can just send a lot, a lot, a lot of data. And this is the effect when your iPhone goes di-di-di-di-dimm or something, because of all the notifications and all the data that is sent. But the issue here is that this does not preserve state. So there might be actions where the iPhone requests something that is then answered. The iPhone might request, for example, a date, and only then the chip would reply with a date, and only then the iPhone would accept a date. But it's still very interesting to do this, even though you cannot reach certain states, because you can do this without a SIM card and you can do this very, very fast.

So, just to summarize the issues here: if you fuzz a wireless protocol, you can have very significant state differences, and just injecting packets cannot reach all states. The fact that you cannot reach all states also shows in very simple stuff like a trace replay, so a trace of something that you record. Let's say I have an active phone call, I record all the packets, and I can also observe the coverage. So, with Frida, you can observe coverage on an iPhone while the phone call is active. And then, in a second step, you would do some injection. But the only thing that you can inject are the packets sent from the baseband to the smartphone, not the opposite direction. And this usually results in much less coverage. So you are missing a lot of things due to missing state. And even worse, if you do the same thing again, you might be in a different state, and you might observe a different coverage.
So you do the exact same thing, but you get different coverage. So, even replaying recorded messages results in less or inconsistent coverage.

Anyway, let's take a look at an injection example. In this video, you can see how I'm in the Unicorn Network on an iPhone 8, which obviously has 5G, but it also does a lot of fuzzing. And what is interesting in the fuzzing is that you might reach a lot of states in combinations that are not usually possible, like you have a lost network connection while you have to confirm a PIN, or you have a network connection during this, etc.

To summarize my rant, some states cannot be reached solely by injecting packets. So, even if we have a very good corpus and do very good mutations, we might miss 80% of the code, but we can just fuzz anyway. We just need to keep in mind that some stuff is not fuzzable.

We have looked into a lot of wireless protocols in the past, so it's also worth considering which tooling we already had available for fuzzing protocols. The most advanced tooling that we have is Frankenstein, and it's built by Jan. What Jan did is, he emulated the firmware and attached it to a virtual modem and also a Linux host. For this, he first looked into the firmware, that's here, and we had some partial symbols for this and also some information about registers. Then, Frankenstein actually takes a snapshot, that you can see here, including some of those registers of the modem. And with this, you can build a virtual modem and fuzz input as if it would come over the air. Frankenstein also emulates the whole firmware, including thread switches. So it gets into very complex states, and it's even attached to a Linux host. So it also fuzzes a bit of Linux while actually fuzzing the firmware itself. Now, the issue with this is that baseband firmware is usually 10 times the size of Bluetooth firmware or even more, and we don't have any symbols for this, so it's a lot of work to customize this.
And even if one would do all those steps and put all the work into this, it's only, so to say, code execution in the baseband. It's not yet a privilege escalation into the operating system.

The next interesting tooling was built by Steffen. What Steffen did is, he built a fuzzer based on DTrace and AFL. DTrace is a tool that can provide function-level coverage in the macOS kernel and user space. With some modifications you can even get basic block coverage in user space, which is required for AFL to work. So, in the end, you have AFL or AFL++ as a fuzzer for any program on macOS. It's even slightly faster than Frida, at least the version that he used. And he gets a couple of thousand fuzz cases per second, even on a very old iMac. In our lab, we just had an old 2012 iMac for this, and it works on this. But the issue is that Wi-Fi and Bluetooth, which he fuzzed, are very complex protocols, so he couldn't find any new bugs with AFL. And also, in kernel space, you only get this function-level coverage. Despite not finding any bugs in Wi-Fi or Bluetooth, he still got a CVE, because DTrace also has bugs. So, at least some funding. But on iOS, this is not supported out of the box. It might be possible to get DTrace working with some tweaks, but it's a lot of work. So it's probably easier to just use Frida in the iOS user space.

Also during this, so while Steffen was building all this very advanced tooling, Wang Yu found issues in the macOS Bluetooth and Wi-Fi drivers, and so he was very, very successful in comparison to us. That's really a pity. And I think what he did is much better state modelling, so, of how the messages interact and what is important to reach certain functions.

So what is still left? Usually, fuzzing the baseband means that you need to modify firmware or also emulate firmware, and you need to implement very complex specifications on a software-defined radio if you want to fuzz over the air or build proofs of concept.
And for everything that's somewhat proprietary, you need to do protocol reverse engineering, so you can spend a lot of time and money just to do very, very basic research. Or you can also use Frida. So you can fuzz with Frida, and all you need to do for this is write a few lines of code in JavaScript. I kid you not. The option is Frida.

Dennis, whom we advised as a thesis student, was the first in our team to build a Frida-based fuzzer, and it's called ToothPicker. It's based on Frizzer and Radamsa. What it does is, well, it hooks into these connections, or into the protocols of the Bluetooth daemon. You could also think of this upper part here as one block. So the protocols are implemented in the Bluetooth daemon, but we want to fuzz certain protocol handlers. And to increase the coverage, he creates a virtual connection. A virtual connection holds a connection and pretends to the Bluetooth daemon that there is an active connection to a device. And of course, the chip would then say, I don't know anything about this connection. So there are also some abstractions in here, so that the connection is not terminated. That's a very simple tool, but it really found a lot of bugs and issues, and there were even some issues in the protocols themselves that also apply to macOS. So it's not just iOS bugs, but also protocol bugs in macOS that Dennis found.

And this really got me thinking, because ToothPicker runs with only 20 fuzz cases per second, so it's really, really slow, and we were still able to find Bluetooth vulnerabilities at this speed. So, why is this? First of all, if you try to fuzz Bluetooth over the air, the over-the-air connections are terminated after something like five invalid packets. So over-the-air fuzzing is really, really inefficient. And with Frida you can actually patch these functions, so this is gone. Then the virtual connections are a very important factor. They are really, really important for having coverage.
There is still a lot of coverage that we missed during replay and fuzzing. But it's really an advantage compared to the other fuzzing approaches where you just inject packets. In addition, there is an issue here, because if you have a virtual connection, it might be that this virtual connection triggers behaviour that you cannot reproduce over the air. So that means that for everything that you find, you also need to confirm that it works over the air. At least the inconsistent coverage is also fixed in ToothPicker, because ToothPicker replays all packets five times in a row. But the issue here is that if you have a sequence of packets that generates a certain bug - so you need multiple packets - this is nothing that the mutator is aware of, and also nothing that's logged properly in ToothPicker. And because of this, I got a bit anxious. Maybe we missed a lot of things?

So once I got the intuition that we are actually missing certain state information, I had the idea to replace bytes in active connections. And this is one part of it that you can see here with a keyboard, so I'm just replacing bytes in keyboard input and seeing what happens. And I let this run for a couple of weeks, also for different protocols and so on, to see if there are further bugs that we didn't find previously. Here you can see the same for AirPods with SCO, and they produce crackling sounds for the replaced bytes. It's even worse for ACL, so actual music, because then you can hear very noisy chirps. I let this fuzzer run for multiple weeks and it didn't find any bugs that ToothPicker hadn't discovered before. I think the reason for this is that I mainly fuzzed in active connections like the one with the audio or the keyboard, but I only fuzzed a few active pairings, because this requires me to actually perform those pairings by hand, so, nothing really interesting.
The only bad thing that I could produce with this, but not worth a CVE, is that the sound quality of my AirPods is now really, really bad. Well, OK. And also the Broadcom chips on iOS don't check the UART lengths, but that's not that bad. I mean, if you consider that they removed the write-RAM recently, then you might now still be able to write into the RAM via UART buffer overflows. But yeah, nothing too interesting.

So after all of this, I asked myself: "What is still left for fuzzing if we cannot find any new Bluetooth or Wi-Fi bugs?" Well, the iPhone baseband - or actually the iPhone basebands, because there are two. The first variant of iPhone baseband that you can get are Qualcomm chips. They are in the US devices, and they use the Qualcomm MSM Interface. This interface comes with some documentation, and there are even open source implementations for it. So it's something that's probably easy to understand and easy to fuzz.

On the other hand, almost all devices that I had on my table had Intel chips. Intel has recently been bought by Apple, at least the part that does the baseband chips, and these are the chips in the European devices. That's the reason why almost all my devices had Intel chips. And they use a special protocol. It's called Apple Remote Invocation. And if you search for this on the Internet, I even checked it just today, there are no Google hits at all. So it really hasn't been researched before, at least not publicly. It's completely undocumented and it's a very custom interface. It's not even used for Android. It's really an interface just for Apple.

The component that we are going to fuzz in the following is CommCenter. CommCenter is the equivalent of, for example, the Bluetooth or Wi-Fi daemon, but for telephony. It's sandboxed as the user "wireless", but it comes with a lot of XPC interfaces. And this is something that we will also see later in the fuzzing results.
The next part is that there are two flavors of libraries. So depending on whether you have a Qualcomm or an Intel chip, different libraries will be used before certain actions or data are actually processed by CommCenter itself. So we have different code paths here. But all of this runs in user space, and this means that both libraries can be hooked with Frida and can be fuzzed with Frida. So that's very interesting.

There is still a lot of stuff that goes on in the kernel. What you can see here is that QMI and ARI carry some management information that is sent to CommCenter, but they don't contain the raw network or audio data. So they don't contain your phone call, they don't contain the website that you are opening. And the next issue is that QMI and ARI are not directly sent over the air. What is sent over the air are normal baseband interactions, and these generate QMI and ARI messages. So there's still some section in between. But of course, there are now two ways: either you have an interaction that you can do over the air that directly causes ARI and QMI messages, which then cause an issue in the upper layers. Or you might have this full exploit chain requirement, where you first need to exploit the chip over the air, and then from the chip break the interface into CommCenter.

Now, for QMI, the code has a lot of assertions. So it's really asserting everything about the protocol, the TLV format and so on, and if anything goes wrong, it really terminates CommCenter. So if you just send one invalid packet, CommCenter is terminated. This doesn't matter a lot, because if your protocol is stable and you usually don't send any invalid packets, then you know an attack is ongoing, so it's valid to terminate CommCenter. And furthermore, it doesn't matter a lot to the user.
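To illustrate what "asserting everything and terminating on the first invalid packet" means for a TLV-based protocol, here is a minimal sketch of a strict TLV parser. It assumes the TLV layout known from open-source QMI implementations (1-byte type, 2-byte little-endian length, then the value); the function and exception names are hypothetical, and this is not Apple's code.

```python
import struct

class InvalidMessage(Exception):
    """Raised on the first inconsistency; models the 'terminate' behaviour."""

def parse_tlvs_strict(payload):
    """Parse a sequence of TLVs and reject the whole message on any error.

    Assumed layout per TLV: type (1 byte), length (2 bytes, little endian),
    value (length bytes).
    """
    tlvs = {}
    offset = 0
    while offset < len(payload):
        if len(payload) - offset < 3:
            raise InvalidMessage("truncated TLV header")
        tlv_type, length = struct.unpack_from("<BH", payload, offset)
        offset += 3
        if offset + length > len(payload):
            raise InvalidMessage("TLV length exceeds message size")
        tlvs[tlv_type] = payload[offset:offset + length]
        offset += length
    return tlvs
```

For a fuzzer, such a design means that almost every mutated packet ends at the first check, so little code behind the parser is reached by blind bit flips.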
So the worst thing that happens when CommCenter crashes, for example while you have an active phone call, is just that the phone call gets lost or your LTE connection is re-established. So you don't really notice it. It just feels like your Internet connection breaks for a short moment.

In contrast, there is the ARI protocol, and this is the part that just works very, very, very differently. Whatever it's getting, it just parses it, and it doesn't terminate CommCenter. So you can send many, many, many fancy things and it just continues, continues, continues, because the developers were probably very, very happy once they got their special protocol for Apple working, and then they never touched it again.

But what does it look like? It has a very basic format, also with some TLVs, and the first thing that I noticed when I fuzzed it is that in the iDevice syslog, it always complained about this sequence number being wrong. So it just said, I expected the follow-up sequence number so and so. So I started to fix this. And if you open it in IDA, you can see that the range that is expected is between zero and 0x7ff. So you know the range, and then it gets weird. The sequence number is spread over three different bytes in single bits and shifted around and so on. And it's not even continuous. So, very weird code. Probably they just added those sequence numbers to confirm some race conditions or something. I really don't know. Or out-of-order packets? Something weird is going on there. But I wrote the code, I fixed the sequence number, and then during the replay of packets, I noticed, well, it doesn't even matter! No matter if your sequence number is valid or invalid, parsing continues, and even worse, even packets with a wrong sequence number are parsed. Probably because otherwise there would be too many issues, because the protocol implementation is too buggy.
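As an illustration of a sequence number in the range 0 to 0x7ff that is scattered over three header bytes, here is a minimal sketch. The bit positions below are purely hypothetical (the talk does not give the real ARI layout), and wrapping back to zero after 0x7ff is also an assumption.

```python
SEQ_MAX = 0x7FF  # 11-bit range mentioned in the talk

def next_seq(seq):
    """Follow-up sequence number; wrap-around at 0x7ff is an assumption."""
    return (seq + 1) & SEQ_MAX

def set_seq(header, seq):
    """Write an 11-bit sequence number into three bytes of a bytearray.

    HYPOTHETICAL layout, not the real ARI one:
      bits 0-2  -> header[0], bits 5-7
      bits 3-6  -> header[1], bits 0-3
      bits 7-10 -> header[2], bits 4-7
    All other header bits are left untouched.
    """
    if not 0 <= seq <= SEQ_MAX:
        raise ValueError("sequence number out of range")
    header[0] = (header[0] & 0x1F) | ((seq & 0x07) << 5)
    header[1] = (header[1] & 0xF0) | ((seq >> 3) & 0x0F)
    header[2] = (header[2] & 0x0F) | (((seq >> 7) & 0x0F) << 4)

def get_seq(header):
    """Read the sequence number back using the same hypothetical layout."""
    low = (header[0] >> 5) & 0x07
    mid = header[1] & 0x0F
    high = (header[2] >> 4) & 0x0F
    return low | (mid << 3) | (high << 7)
```

A replay tool would call `set_seq` on each recorded packet with `next_seq` of the previously sent value before injecting it, leaving the remaining header bits as recorded.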
And there are also a couple of other things. For example, if you send the first four magic bytes wrong, or a wrong length or something, then the packet is potentially ignored. But parsing continues and CommCenter is not terminated like in QMI. Since it's a proprietary protocol, there is currently no tooling available. But Tobias is working on a Wireshark dissector, and once he finishes his thesis, it will also be publicly released. So you need to wait a while, but then you will have a tool for this.

Anyway, let's also talk about fuzzing this. I would not recommend fuzzing this, because you might brick your device or at least get into weird states. So just don't do this on your production iPhone. I mean, obviously, I know what I'm doing, so, yeah, just fuzzing packets, right? But I'm not so sure about what exactly I'm doing, so the only direction that I fuzz is from the baseband to the iPhone here, not the opposite direction. So I hopefully prevent anything weird on the chip, right? But the iPhone might still answer with something invalid, and this might confuse the baseband or cause other crashes.

And so I actually had to call for help, like mimimimimi, I broke my iPhone - I mean, just one of my research devices - but still. It booted into pongoOS but no longer into iOS, and it didn't give me any debug message that was useful. Well, it turns out, at least on Qualcomm chips, and that's where this happens, it just boots again after a couple of hours. But before that, it's just stuck in a boot loop. And on the Intel iPhones, I also almost bricked an iPhone 8, but luckily it didn't completely break. The issue there is, if you enable the baseband debug profile, then it writes a lot of stuff to the ISTP files, so that is some debug format of Intel, and every few minutes it just creates something like 500 MB of data, at least on the iPhone 8. On the newer iPhones, this debug format is a bit shorter, so it doesn't create as much data, but still a lot.
And if you don't delete this regularly, then of course your disk will be full, and an iPhone behaves quite strangely if it has a full disk. You can still interact with the user interface, but you can no longer delete photos, because deleting a photo, it seems, needs some file interaction. Also, you can no longer log in with SSH, because it somehow seems to create a file when logging in, which is an issue, because then you can no longer delete any files. I was just rebooting the iPhone after trying a couple of things, and luckily it came back and deleted some files, and I was able to log in and remove the baseband logs. But be careful when doing this.

And of course, all the iPhones are very confused by the fuzzing. They really lose everything about their identity and location, and they want to be activated again. So here you can see a smartphone that lost its location and really wants to be activated, activated, activated. During SMS fuzzing, you might even get Flash messages. And if you click on the head menu on dark theme, they are displayed black on gray, so probably nobody ever tested it. Also great: if you have a locked iPhone, you can still display SIM menus and SIM messages on top of the lock screen.

OK, so I guess I have to revise my first instruction. So fuzz this! Really, really fuzz this! It's a lot of fun. Maybe just not on your primary device, but you will enjoy fuzzing these interfaces.

But first of all, you obviously need to build a fuzzer, so how do you build a fuzzer? The first fuzzer that I used was the one that I also used for Bluetooth, which just uses the existing bytestream of the protocol and then flips single bits and bytes. So it has this high state-awareness. But it also means that, like some kind of monkey, I was just calling myself, writing SMS to myself, enabling flight mode, everything that you could imagine. And it's a very boring task.
But it also found very fancy bugs that I couldn't reproduce with the other fuzzers yet, because it can reach states that just injecting packets cannot reach. So at least it was quite successful. I fuzzed with this for something like three days and already found bugs. That's very different from the Bluetooth fuzzers, so there seemed to be more bugs in CommCenter. And so I just wrote to Apple PR: "Hey there, I wrote this really, really ugly 10-lines-of-code fuzzer and see what it found. Awesome, awesome, awesome! And crash logs are attached. And obviously this is simple to reproduce because I only fuzzed for three days. Got most of these crashes multiple times. Yeah. So here you go. Enjoy my fuzzer."

And this was probably quite stupid, because it's not that simple. It's really not easy to reproduce the crashes. First of all, well, of course this script is so generic that it runs on all iPhones with an Intel chip, so no matter if I take an iPhone 7 or an iPhone 11, it will just work. But the crash logs that you get are very different depending on whether you fuzz on a pre-A12 device, so iPhone 7 and 8, or on later versions like the iPhone 11 and SE2. So you cannot reproduce the same crash logs that easily. And it also depends a lot on the SIM. Even on a passive iPhone, if you don't do any phone calls and so on, you would get different results. I actually started my fuzzing with a Singaporean SIM card without any data contract or phone contract on top of it, and already found a couple of things. But it might just behave very differently on a slightly different configuration.

Anyway, let's listen to a null pointer that it found. This null pointer has been fixed in iOS 14.2, and it's in the audio controller, so you can hear some loop going on there. What you can see here is me calling Deutsche Telekom and so on. So they have this very important text.

Announcement: Guten Tag, und herzlich willkommen beim Kundenservice der Telekom. [Hello, and welcome to Telekom customer service.]
Jiska: And then I call again and have a crash. And now let's listen to the crash.

[Telekom jingle starts playing, final part loops ten times]

Jiska: Just for the sound effect, I also recorded another one, so this one is with ALDI TALK.

Announcement: Guten Tag, ALDI TALK gibt die Senkung der Mehrwertsteuer vom ersten... [Hello, ALDI TALK is passing on the VAT reduction from the first...]

Jiska: And now let's listen to a special offer by ALDI TALK. In 3, 2, 1... di-dimm...

Announcement: Guten Tag, ALDI TALK gibt die Senkung der Mehrwertsteuer vom [loops ten times] erst-erst-erst-erst-erst-erst-erst-erst-erst-er

Jiska: Since these first fuzzing results were very promising, I decided to use the latest ToothPicker version and extend it for fuzzing ARI, and I called it ICEPicker, because the Intel chips are also called ICE. So I just cloned Dennis' latest ToothPicker alpha, which is very, very unstable, but this one actually runs on the iPhone locally without any interaction with macOS or Linux. So it doesn't need to exchange any payload via USB, and it's also using AFL++, which is a much faster mutator than Radamsa. So from a speed consideration, this is a much better design.

However, AFL++ didn't turn out to be the best fuzzer for a protocol. Most of the time is actually spent trying to brute force the first magic bytes, the first four bytes, because it tries to shorten inputs. It's also not aware of something like a packet order, so it was just brute forcing those first four bytes. And well, the next issue is that for some reason, if the first four bytes are invalid, the ARI parser slows down a lot. So I was suddenly down to something like less than 10 fuzz cases per second. And also, ICEPicker in this case is not aware of the ARI host state. ARI sometimes shuts down this interface if it thinks that something is very invalid, and the fuzzer will just continue. So I looked into the iDevice syslog after the fuzzer couldn't find any new coverage for more than six hours. And I was wondering: "What is the issue here?
Is the implementation wrong or is it the fuzzer?" And it really looks like the fuzzer is producing inputs that are not good for protocol fuzzing. Of course, this is stuff that you can optimize. AFL++ can do a lot here, so you can tell it a bit about what the protocol looks like and also get it to not brute force the first four magic bytes. But for this I would have to recompile the whole thing. And it was something that compiled on Dennis' machine, but it didn't compile on my machine, because I had my Xcode beta in a weird state. And well, of course, some of you now say: "Just download and install a new Xcode!" But this takes so long that actually writing the next fuzzer seemed to be easier.

Still, this variant of ICEPicker was interesting to me, because it was the first time that I saw that the fuzzer initialization works, including coverage, and that my replay works across multiple iPhone versions. So what I collected on an iPhone SE2 was replayable on an iPhone 7. So it was not useless in that sense, but I just decided not to use this configuration.

So I just wrote a very simple fuzzer again, and I didn't port everything to run locally on iOS. I just kept the design a bit simpler, or at least easier to code, and had my fuzzer running on Linux and then using only Frida on iOS. It cannot reproduce all the states and crashes that I observed with my very first fuzzer, but most crashes could be reproduced. I didn't do any coverage. I didn't do any smart mutations, just very stupid mutations. Basically, I just did a very blind injection. But this was super fast, so instead of the 20 fuzz cases per second, I already had something like 400 fuzz cases per second on an iPhone 7, which was about the same speed or even faster than the AFL++ variant. And I can at least correct the length field, sequence number and so on before injecting the payload. Since it doesn't do such great mutations, I at least need to collect a good corpus with many SMSs, many calls.
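The "stupid mutation, then correct the length field before injecting" idea can be sketched as follows. The frame layout used here (4 magic bytes, a 2-byte little-endian payload length, then the payload) and the magic value are hypothetical placeholders, since the real ARI header is undocumented; the actual injection via Frida is left out.

```python
import random
import struct

MAGIC = b"\xde\xad\xbe\xef"  # hypothetical magic bytes, not the real ARI value

def split_frame(frame):
    """Split a frame of the assumed layout into (magic, payload)."""
    (length,) = struct.unpack_from("<H", frame, 4)
    return frame[:4], frame[6:6 + length]

def build_frame(magic, payload):
    """Rebuild a frame so that the length field always matches the payload.

    The payload must fit into the assumed 16-bit length field.
    """
    return magic + struct.pack("<H", len(payload)) + payload

def mutate_payload(payload, rng):
    """Blind mutation: flip a bit, insert a random byte, or delete a byte."""
    buf = bytearray(payload)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)

def fuzz_case(frame, rng):
    """Mutate only the payload, keep the magic, and fix up the length."""
    magic, payload = split_frame(frame)
    return build_frame(magic, mutate_payload(payload, rng))
```

Keeping the magic bytes intact and recomputing the length avoids wasting fuzz cases on packets that would be ignored before any interesting handler runs, which is the problem described above for the AFL++ variant.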
And I'm also logging the packet order with this. So it's at least aware of a packet sequence in the sense that I can reproduce the sequence later on. I had this fuzzer running on a couple of iPhones in parallel for multiple weeks, and it found a lot of interesting crashes. So that's my go-to fuzzer. I still wanted to confirm that not collecting coverage wasn't an issue, so I also cloned the publicly released version of ToothPicker, which definitely finds new coverage, and it's using the Radamsa mutator, which is very, very slow, but it does a bit smarter mutations, at least in terms of protocol fuzzing. It's still only aware of single packets and it's only using the same packets five times in a row to confirm coverage, etc. And also an issue is that it cannot catch a lot of the crashes of CommCenter. So it happens quite often that CommCenter crashes. And then if you cannot catch the crash with Frida and everything crashes, then you need to start the fuzzer again. But you also need to delete the files in the corpus that led to the crash because otherwise you would just run into the same crash very fast. So it needs a lot of babysitting. I also had it running for a couple of weeks, but sadly, it didn't find any crashes. So at least I can be sure that fuzzing much slower, but with coverage, is not any improvement. Still, the mutations it creates are quite useful, as you can see in the following. So you can even see this phone number scrolling here and so on. So it generated a very long phone number correctly into some TLV structure here. And that's quite interesting to see. So this is something that you could not reach by just flipping bits and bytes. There is one big shortcoming that all of these fuzzers have, including the initial ToothPicker, which is that they don't have any kind of memory sanitization. So the framework that you would usually use in user space on iOS is the MallocStackLogging framework.
I even got this running for CommCenter, so it's a bit of command line juggling. But in the end you can enable MallocStackLogging also for CommCenter. The issue here is that it increases the memory usage a lot, and even if you configure CommCenter to have a higher memory allowance, the usage is so high that it's just immediately killed by the out-of-memory killer. So this doesn't work. Then there is also libgmalloc. It doesn't exist for iOS, it just exists in Xcode. I got one of the Xcode libraries running on one of my iPhones. I have no idea if this is an expected configuration or not. At least I could execute smaller programs. And then when you use this on CommCenter, it just crashes with a libgmalloc error on parsing some of the configuration files very, very early when starting CommCenter. So all of this didn't work. And this also means that the fuzzer cannot find certain bug types, or crashes much later when encountering bugs. So all of the fuzzers that I created are not perfect, but at least they found a lot of different crashes. Let's look into this. I mean, the first obvious number that you see here is the 42. So I stopped fuzzing after 42 crashes - at least crashes that I think are individual crashes and that are not caused by Frida - so I tried to filter out Frida crashes, and this corresponds to the total amount of crashes, but only some of them are replayable by either one or multiple packets. And for the replayable crashes I can also check if they were fixed in recent iOS versions or the most recent iOS 14.3 or not. Then I also marked two colors here because there are the Intel libraries, but there are also the Qualcomm libraries. And for the Qualcomm libraries, I didn't spend as much time fuzzing, because I have fewer Qualcomm phones, but also all the asserts in the code prevent a lot of issues from being reached. So the libraries themselves have fewer issues, and also within CommCenter, less of the code that has improper state handling is reached.
The location daemon is also marked with a big grey box here, because the location daemon, similarly to CommCenter, takes some of the raw packet inputs and parses them. So it has special parsers for Qualcomm and Intel. And it's also an interesting target because of this. Other than this I got really a lot, a lot, a lot of different daemons crashing. Some of them even with replayable behaviour. So, for example, there is the wireless radio manager daemon that you can just crash via one Intel packet. But this has been fixed. And then there is one interesting crash that I actually got via both the Qualcomm and the Intel libraries, in the mobile Internet sharing daemon. This also has been fixed, and some of the crashes only happened via Qualcomm, but I'm not sure if that's like a Qualcomm-specific thing or if it's just randomness of the fuzzer. So the mobile Internet sharing daemon has an issue where it accesses memory at configuration strings, so there are different strings at this memory address, and I found this quite early, but I was not aware of the fact that so many other daemons are actually crashing when I fuzz CommCenter. So, I didn't look into this in the very beginning. And when I reported it to Apple, they said: "Yeah, yeah, we already know about this and we fixed it in a beta prior to your report." So certainly nothing that I got a CVE for. Another interesting crash is in the CellMonitor, but only in the Intel library. The CellMonitor is something that is running passively in the background all the time and it parses, for example, GSM and UMTS cell information. I already found this on the Singaporean SIM without any active data plan in my very first round of fuzzing and reported it back then to Apple. I don't know if it's triggerable over the air or not. So I guess it's something that you first need to get code execution for. And it has been fixed in iOS 14.2. And I wrote a lot of emails with Apple because I thought that they didn't fix it.
And the reason for this is that both the GSM cell info and the UMTS cell info functions, when they parse data, have two different bugs each. So I still got crashes in the same functions and I thought: "OK, same function, still a crash: The bug is not fixed." But actually, it's very high quality code and it's just multiple bugs per function. And there is even one more issue in the CellMonitor, even though I think the remaining bugs are very simple crashes or nothing that could be exploitable at all, but it still hints at the great code quality. And the same story is that there are even more bugs to be fixed. So most of them are probably just stability improvements, but some of them are still interesting. So, let's see how this goes. Since I said that it's a very simple fuzzer, some of you might have already started coding those 10 lines of code for fuzzing while I continued talking, and grabbed their old iPhones that they are willing to lose if something goes wrong. So, how can we actually build a fuzzer that is performant and replicates some of the bugs that I found just within a day? Let's take a look. When you look at Frida fuzzing, a lot of the stuff that you do is limited by the processing power of the iPhone. So your iPhone will get very, very, very hot and it might even drain more battery than it can get via the USB port. So it might even discharge while fuzzing. And performance is really key. So you need to identify bottlenecks. I said that ToothPicker or ICEPicker, the initial version, is at just 20 fuzz cases per second, and you can tune this to something like 20,000 fuzz cases per second. So, I already said that I tuned it to something like 400 or 500 fuzz cases per second, but why the 20,000? So, initially, a student of mine did some fuzzing in a very different parser and said: "On my iPhone 6S, it's running with 20,000 fuzz cases per second." I was like: "No way, no way!" But actually, you can do this. So, this depends a lot on the Frida design.
The first variant, which is how most Frida scripts are written, is that you have some Python script that runs on Linux or macOS, and it has a couple of functions that you can see here. So first of all, it has this on_message callback. So, this on_message callback is something that we need later. And we just register it to our Frida script, the Frida script that I'm going to show you in a second. And you load the script and the script can then even call functions on your iPhone. For this, you load a second script on your iPhone. So this is JavaScript injected into the iOS target process and it can, for example, use the send function to send something back to the on_message function. And it can export functions via RPC. So, you can then call them. All this happens via JSON. And so it needs serialization and deserialization, which means you cannot send hex data or binary data directly. So you have a hex string that you encode into JSON, which is then parsed as binary data, and also it's all via USB. So you also have the speed limitation of USB. And, of course, if you use the Frida C bindings locally on the iOS smartphone, it is a bit faster, but it's still not perfect. So, the more you can avoid this JSON part and the USB part, the better. The actual fuzzer looks a bit like this. So, you are in the libARIServer, so that's the lowest library from the diagram before. And then you define this inbound message callback function, which has two arguments, which are the payload and the length. So, this looks a bit cryptic, but that's basically it. And then you can, but you don't have to, add this interceptor here, because you might want to fix your sequence number or add basic block coverage to your fuzzer, etc. So, this is also done there. And then you can just call this inbound message callback of ARI and send ARI payloads. So, this already can be very different.
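To illustrate the serialization step that every remotely injected payload goes through, here is a small sketch in Python. It only simulates both ends of the channel locally: in the real setup the decoding side would be JavaScript inside the target process. The message shape with the "type" and "payload" keys is an assumption for illustration, not the format Frida uses internally.

```python
import json


def encode_for_rpc(payload: bytes) -> str:
    """Host side: binary payload -> hex string -> JSON text."""
    # Hypothetical message shape, chosen only for this example.
    return json.dumps({"type": "inject", "payload": payload.hex()})


def decode_from_rpc(message: str) -> bytes:
    """Receiving side: JSON text -> hex string -> binary payload."""
    return bytes.fromhex(json.loads(message)["payload"])


payload = bytes([0xDE, 0xC0, 0x7E, 0xAB, 0x00, 0x10])
wire = encode_for_rpc(payload)
restored = decode_from_rpc(wire)

# Hex encoding alone doubles the size, and the JSON framing adds more.
size_ratio = len(wire) / len(payload)
```

So even before USB comes into play, each fuzz case is expanded to more than twice its size and has to be encoded and parsed once per injection, which is part of why a loop that stays inside the target process can be so much faster.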
So, if you now call this via an RPC export, via a Python script on your laptop, you can reach something like 500 fuzz cases per second if you inject SMS, which are a quite processing-intensive payload. Or, if you just do the same thing and run this inbound message callback in a loop, locally with JavaScript, without any external Python script, then you would get 22,000 fuzz cases per second on the very same device. So this is the speed difference that the JSON serialization, deserialization and the USB in between make. So, I did a few more measurements, and certainly on the iPhone 8, there is a bug that prevents me from collecting coverage. But, what you can see is, so, the first part here is if you have just a bit flipper in a loop that calls the target function, you can get 17,000 fuzz cases per second on an iPhone 7. As soon as you start collecting basic block coverage, not processing it, just collecting, you drop to 250 fuzz cases per second. So, you need to ask yourself if your fuzzer really gets that much better from collecting coverage. And another thing is - that's this line above - so, if you just print the packet that you fuzzed or injected and print this via Python to your laptop, you also have a huge slowdown, which is not as large as the coverage slowdown. But still, you can see that every print and every sending of a message in between the Python script and JavaScript takes a lot of time. Now, if you have this remote SMS injection that I had before, then you drop to 200 fuzz cases per second. So it is a blind injection without any coverage. If you collect coverage but don't process coverage, then you are down to 100 fuzz cases per second. So, for the initial ToothPicker design, this would be the optimum. But, because the Radamsa mutator is very slow and because you also need to process the coverage information, et cetera, that's down to 20 fuzz cases per second. So, this is the comparison here.
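The two headline numbers can be turned into a per-case time budget with a bit of arithmetic. The sketch below uses only the rates mentioned for the same device (22,000 fuzz cases per second for the local JavaScript loop, 500 for the RPC variant driven from Python); treating the whole difference as overhead of JSON and USB is a simplifying assumption, and the variable names are made up for this example.

```python
def per_case_microseconds(cases_per_second: float) -> float:
    """Average time spent on one fuzz case, in microseconds."""
    return 1_000_000 / cases_per_second


# Rates as quoted for the same device.
local_us = per_case_microseconds(22_000)  # loop inside the target process
remote_us = per_case_microseconds(500)    # RPC call from Python via USB

# Assumption: the target function costs the same in both setups, so the
# difference is attributed to serialization and transport.
overhead_us = remote_us - local_us
overhead_share = overhead_us / remote_us
slowdown_factor = remote_us / local_us
```

Under that assumption, roughly 45 microseconds per case go into the actual target call, while the remote variant spends about 2 milliseconds per case, so close to 98 percent of its time would be spent on getting the payload to the phone and about 2 percent on fuzzing.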
And now you can imagine why collecting coverage probably isn't always useful, and why having your laptop calculate better mutations, because it's easier to write a mutator there than directly in JavaScript, is also not always the best idea. So let's watch one last demo video. What you can see here is that when you try to delete SMS after all of the fuzzing, it really doesn't work, neither via the settings nor via the SMS app. So, you really need to reset your iPhone after fuzzing it for too long. There is no other way than this to delete the messages. With this, we are already at the end of this talk, but of course, there will be a Q&A session, and if you missed the Q&A session, you can also ask me on Twitter or write me an email. Thanks for watching! rC3 music Subtitles created by c3subtitles.de in the year 2020. Join, and help us!