34c3 intro
Herald: All right, the next lecture here is
from Artem. Apart from the fact that he is,
how would I put it, probably earning a nice
amount of money at a lab that's quite
renowned in the world, Kaspersky, he's
looking in this lecture at exploit
development for Cisco stuff. We all
suffered from this last year, or we all
heard about it and maybe don't know the
impact, but he's going to explain to us the
work he did in that field. So can I please
ask for a warm welcoming applause for Artem
Kondratenko? Okay, good, please give him a
warm applause and start.
applause
Artem Kondratenko: Hello everyone, so
excited to finally be able to attend Chaos
Communication Congress. Very happy to see
y'all. So without further ado let's jump
into some practical IOS exploitation. So a
few words about myself: my name is Artem,
I do stuff, mostly security-related stuff.
My areas of expertise are
penetration tests, both internal and
external. I also do research in my free
time and get a bug bounty here and there,
and this talk is actually kind of a
continuation of my talk this summer at
DEF CON about Cisco Catalyst exploitation.
So for those of you who are out of
context, let's recap what happened earlier
this year. The year 2017 was rich in
vulnerabilities for Cisco IOS devices. We
had at least three major advisories for
Cisco IOS that represented three remote
code execution vulnerabilities.
So the first one is a vulnerability in the
cluster management protocol, which resulted
in unauthenticated remote code
execution via telnet, the second one is the
SNMP overflow and the third is the DHCP
remote code execution.
In this lecture I'm gonna be talking
about two of those vulnerabilities because
DHCP RCE is yet to be researched. So
hopefully by the end of this talk I'm
going to be able to show you a live demo
of exploiting the SNMP service in Cisco
IOS.
But first, what happened earlier: on
March 26, 2017 we had a major advisory from
Cisco announcing that hundreds of
models of different switches are
vulnerable to a remote code execution
vulnerability. No public code, no public
exploit was available and there was no
exploitation in the wild. So it was
critical, and the main points of the
vulnerability were as follows:
So Cisco switches can be clustered, and
there's a cluster management protocol
built on top of telnet, and this
vulnerability is a result of actually two
errors, a logic error and a binary error.
So the telnet options get parsed
regardless of whether the switch is in
cluster mode or not, and the incorrect
processing of these cluster management
protocol options results in an overflow.
What is interesting about this
vulnerability is that the source of
research for the Cisco guys was not
internal research but the "Vault 7" leak
that happened in March this year. Many
hacking techniques and tools were released
to the public by WikiLeaks, and among the
many vendors that were affected was Cisco
Systems.
So basically, apart from the advisory, you
could go to WikiLeaks and read about this
potential exploitation technique for Cisco
Catalyst. These were notes of
an engineer who was testing the actual
exploit; there was no actual exploit in
the leak. So basically this worked as
follows: there were two modes of
interaction. For example, an attacker
could connect to telnet, overflow the
service and be presented with a privilege
15 shell. The other mode of operation was
the … is to set for all the subsequent
… connections to the telnet, there
will be … without credentials. So I
did rediscover this exploit; the full
research was presented at DEF CON 25. I was
targeting a Cisco Catalyst 2960 as the
target switch. I also described PowerPC
platform exploitation and the way you can
debug it.
You can look at my blog post about
exploiting this service, and also the
proof-of-concept exploit on my GitHub
page. But today I want to talk about
something else, about another
vulnerability that was announced this
year: the SNMP remote code execution.
So the actual motivation behind this
research was that I was conducting an
external pentest, and an nmap scan
revealed a Cisco router where the default
community string was available. So the
goal was to get access to the internal
network.
So the actual advisory said that the
attacker needs a read-only community
string to gain remote code execution on
the device. The target router was a 2800
integrated services router, which is a
very common device on the networks.
As for the technical specs, it has a MIPS
big-endian architecture, you
don't have any client debugging tools for
it available, and it's interesting in the
sense that the firmware is relatively new
for this router, so it might be
interesting to look at the defensive and
exploit prevention mechanisms employed by
Cisco IOS. When I say relatively new, the
interesting thing is that this device
is actually end of support, it's not
supported. The last patch for it
came out in 2016 and, to remind you, the
advisory for the SNMP overflow appeared in
June 2017, but nonetheless this
is still a widely used device.
If you search for the SNMP banner on
Shodan, you will find at least 3000 devices
with the SNMP service available with the
default public string. So these devices are
all supposedly vulnerable to the SNMP
overflow. And the question is whether we
can build a remote code execution exploit
for it. Since we're going to be exploiting
the SNMP protocol, let's make a quick recap
of how it works, just a light touch.
So SNMP comes with several abbreviations
like MIB, which stands for management
information base, and is kind of a
collection of objects that can be
monitored from the SNMP manager. A
management information base actually
consists of object identifiers. As an
example: you all know that printers
usually use SNMP. For example, if there is
a certain level of ink in the cartridge,
you can query the SNMP service on the
device for the percentage of ink left. So
that's an example of how it works.
A management information base looks like a
tree.
So you have your base element at the top
and your leaf elements, and all these
elements represent objects that can be
queried. We're going to be looking at get
requests, and that is why the advisory
states that for the vulnerability to be
triggered you only have to know the
read-only community string. So it's a
relatively simple protocol, you just
supply the object identifier you're
querying and you'll get back the result.
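As a rough illustration of what "supplying the object identifier" means on the wire, the following minimal sketch shows how an OID's value is BER-encoded inside an SNMP request. The function name encode_oid is an illustrative assumption and does not come from the talk; 1.3.6.1.2.1.1.1.0 is the standard sysDescr.0 object, the description field mentioned here.

```python
def encode_oid(dotted: str) -> bytes:
    """BER-encode the value part of an OBJECT IDENTIFIER."""
    arcs = [int(part) for part in dotted.strip(".").split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs share one sub-identifier: 40 * first + second.
    subids = [40 * arcs[0] + arcs[1]] + arcs[2:]
    out = bytearray()
    for sub in subids:
        # Base-128 groups, most significant first; every byte except the
        # last one of a sub-identifier has its high bit set.
        chunk = [sub & 0x7F]
        sub >>= 7
        while sub:
            chunk.append((sub & 0x7F) | 0x80)
            sub >>= 7
        out.extend(reversed(chunk))
    return bytes(out)


SYS_DESCR = "1.3.6.1.2.1.1.1.0"  # sysDescr.0

if __name__ == "__main__":
    print(encode_oid(SYS_DESCR).hex())
```

Each sub-identifier below 128 takes exactly one byte, which is why a long run of small numbers appended to an OID translates directly into a long run of bytes in the request.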
So here for example we get the router
version, the description field. And you
can also do this with readily available
Linux tools like snmpget. So before we
build an exploit, we need a starting
point. How do we look for the
crash? The advisory actually states
that there are nine different vulnerable
management information bases and you only
have to know the read-only community string.
So for the fuzzing I'll be
using Scapy as a toolkit to
work with network protocols, and here you
can see that I'm building an object
identifier, a valid object identifier that
references the description field, and then
I'm appending some letters "a", which is 65
in the ASCII table. Then I build an IP
packet, a UDP packet and an SNMP packet
inside of it with the community string
"public" and the object identifier.
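The slide itself is not part of the transcript, so the following is only a sketch of what such a Scapy-based packet builder might look like. The helper names build_fuzz_oid and build_packet, the number of appended sub-identifiers and the target address 192.0.2.1 (a documentation address) are all assumptions made for illustration. Scapy is imported lazily so that the OID helper runs without it.

```python
SYS_DESCR = "1.3.6.1.2.1.1.1.0"  # valid OID for the description field


def build_fuzz_oid(base: str, extra: int, value: int = 65) -> str:
    """Append `extra` sub-identifiers (65 is ASCII 'a') to a valid OID."""
    return ".".join([base] + [str(value)] * extra)


def build_packet(target: str, community: str, oid: str):
    """Wrap the OID in an SNMP GET inside UDP/IP (requires Scapy)."""
    # Imported here so build_fuzz_oid() works without Scapy installed.
    from scapy.all import IP, UDP, SNMP, SNMPget, SNMPvarbind, ASN1_OID

    varbind = SNMPvarbind(oid=ASN1_OID(oid))
    return IP(dst=target) / UDP(sport=161, dport=161) / SNMP(
        community=community, PDU=SNMPget(varbindlist=[varbind])
    )


if __name__ == "__main__":
    oid = build_fuzz_oid(SYS_DESCR, 8)
    print(oid)
    # Sending is deliberately left out; only test equipment you own, e.g.:
    #   from scapy.all import send
    #   send(build_packet("192.0.2.1", "public", oid))
```

Iterating build_fuzz_oid over a list of candidate base OIDs and increasing lengths gives a very simple fuzzer of the kind described.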
So of course this will not trigger the
overflow, because this object identifier
is completely fine. How do we get all the
object identifiers that our router will
respond to? So basically there are two
ways: you can take the firmware and just
extract all the OIDs from it. It's easy to
grab them, they're stored in plain text.
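A minimal sketch of the first approach, pulling plain-text OIDs out of an unpacked firmware image, could look like this. It assumes the OIDs are stored as dotted-decimal ASCII starting with the common 1.3.6.1 prefix, which the talk does not spell out; the function name and the file path argument are hypothetical.

```python
import re
import sys

# Dotted-decimal runs starting with the 1.3.6.1 (internet) prefix.
OID_RE = re.compile(rb"1\.3\.6\.1(?:\.\d{1,10}){2,}")


def extract_oids(blob: bytes) -> list[str]:
    """Return the unique OID-looking ASCII strings found in a binary blob."""
    found = {m.group().decode("ascii") for m in OID_RE.finditer(blob)}
    return sorted(found)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract_oids.py <unpacked-firmware-file>
    with open(sys.argv[1], "rb") as handle:
        for oid in extract_oids(handle.read()):
            print(oid)
```

The resulting list can then be fed into the fuzzer as the set of base object identifiers to extend.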
Another way is to actually look at the
vulnerable MIBs, visit the OidView website
and get all the object identifiers from
there. As a matter of fact, the
first crash I had was in ciscoAlpsMIB,
which is related to airline protocols,
which does not concern us because it's not
the focus of our exploitation.
So the actual overflow was in one of its
object identifiers. This request
actually crashed the router. When you are
connected to the Cisco router via a
serial cable and there's a
crash, you will be presented with a stack
trace. We see here that we got a
corrupted program counter, and we also see
the state of the registers at the
moment of the crash. So here you can see
that we have control of the program
counter, it's called EPC, and we also
control the contents of registers s0, s1,
s2, s3, s4, s5, s6.
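One common way to work out which part of the input ends up in which register is to send a pattern in which every 32-bit word is unique and then read the register values from the crash dump. The sketch below assumes that each appended OID sub-identifier lands on the stack as one byte, which the talk does not state at this point; the helper names are hypothetical.

```python
def marker_arcs(words: int) -> list[int]:
    """Word i (0-based) consists of four copies of the byte value i + 1."""
    if not 0 < words < 0x7F:
        # Values stay below 128 so each one is a single byte when encoded.
        raise ValueError("words must be between 1 and 126")
    return [i + 1 for i in range(words) for _ in range(4)]


def locate(register_value: int) -> int:
    """Map a 32-bit register value from a crash dump to a byte offset."""
    raw = register_value.to_bytes(4, "big")
    if len(set(raw)) != 1 or raw[0] == 0:
        raise ValueError("value is not one of the marker words")
    return (raw[0] - 1) * 4


if __name__ == "__main__":
    print(".".join(str(a) for a in marker_arcs(4)))
    print(locate(0x03030303))
```

A register showing 0x03030303 after the crash would then correspond to byte offset 8 of the appended data, under the stated assumption.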
Further inspection also showed me that
we have 60 spare bytes on
the stack to work with. But before we
build the exploit we have some problems,
issues to be solved.
And that is: yes, we do control the
program counter, but where do we jump to?
Is ASLR on? Can we execute shellcode
directly on the stack? Is stack
executable? If we can place the shellcode
on it, is data caching a problem for us?
And if we launch our shellcode, can we
just patch the code? Is the code section
writable? Is the code integrity check on?
But the most important question is: how
can we return the code flow back to the
SNMP service?
Because IOS is a single binary running in
memory, and if you have an exception
in any thread of this big binary, the Cisco
device will crash. And if we look at the
advisory, one of the indicators of
compromise Cisco states is a device reload,
so exploitation of the vulnerabilities
will cause an affected device to reload.
We will try to
build an exploit that will not crash the
SNMP service on it. Before we dive deeper
into the firmware I want to reference
previous research on this matter. This
is by no means a complete list, but this
research actually helped me a lot and
seemed interesting and very insightful to
me. You should definitely check it out.
So for example "Router Exploitation" by
Felix "FX" Lindner and "CISCO IOS
SHELLCODE" by George Nosenko are great
resources for IOS internals and a great
reference on how IOS works in terms of
exploitation,
and the third resource, "How to cook
Cisco", has great info on exploiting
PowerPC-based Cisco switches and also
great info on bypassing common mechanisms
and exploit prevention stuff in IOS. So
basically if I were to tell you how IOS
works in one slide it's basically a single
binary running in memory. Everything is
statically linked into a single ELF file
which gets loaded on startup, of course
you have no API whatsoever. Everything has
no symbols whatsoever. Yes, there is a
glibc library at the end of the firmware
but it's also kind of hard to use it
because you have so many different
versions of firmware and the offsets jump
and you don't know the exact location of
those functions. So to start with static
analysis you should probably copy the
firmware from the flash memory of the
router. Use the copy command, it supports
the TFTP and FTP protocols. Once you have
downloaded the firmware, the next thing you
do is unpack it. The firmware itself,
when the router starts loading it, has an
initial stub that does the unpacking, but
you don't have to reverse engineer that,
you just use binwalk, which will do the
unpacking for you. You load the result of
unpacking with binwalk into IDA Pro, you
have to change the processor type to MIPS
32 big-endian, and we know that this is
MIPS because we saw the registers. These
registers tell us that it was indeed the
MIPS architecture. So one thing I want to
note,
the actual firmware gets loaded at
address 800F00 but the program counter is
located at address 4, and this is because
IOS, when it loads the firmware,
transfers, I mean maps, it to memory
2400F00. This is important because to have
correct cross-references in IDA Pro you
have to rebase your program to 4, and after
that you will have all the correct string
cross-references. You will have all the
necessary strings and your static analysis
setup will be complete. But in order to
build an exploit it will not suffice to
only have the, you know, IDA Pro loaded
with the firmware with all the cross
references, you probably want to, you
know, set up a debug environment. It is
well known that IOS can be debugged via a
serial port, and actually there's a "gdb
kernel" command that is used to start the
internal gdb server, or rather there was,
because this functionality was removed in
recent versions of IOS and you can't really
run the gdb. But nonetheless there's a way
to enable gdb, and this way is to reboot
the device and send an escape sequence to
the serial line. This will bring up the rom
monitor shell. The rom monitor is a simple
piece of firmware that gets loaded and run
just before your firmware starts running,
and in this ROMMON you can manually boot
your firmware with a flag which will
launch the whole firmware under gdb.
After your firmware is loaded, gdb
will kick in. Now you can't just use your
favorite gdb debugger on Linux and
connect it to IOS via a serial port,
because IOS uses a slightly different
subset of commands of the gdb protocol. It
has a server-side gdb, but the client side
has to be adapted to this gdb server.
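As a point of reference, packets in the standard GDB remote serial protocol are plain text of the form $payload#xx, where xx is the sum of the payload bytes modulo 256 in hex. The sketch below implements only that standard framing; whether the IOS dialect deviates from it is not stated in the talk, and the function names are assumptions made for illustration.

```python
def frame(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two hex digits of sum mod 256>."""
    # Payloads containing '$' or '#' would need escaping; not handled here.
    checksum = sum(payload) % 256
    return b"$" + payload + b"#" + f"{checksum:02x}".encode("ascii")


def unframe(packet: bytes) -> bytes:
    """Validate a framed packet and return its payload."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload, received = packet[1:-3], int(packet[-2:], 16)
    if sum(payload) % 256 != received:
        raise ValueError("checksum mismatch")
    return payload


if __name__ == "__main__":
    # 'g' reads registers and 'm<addr>,<len>' reads memory in standard gdb.
    print(frame(b"g"))
    print(frame(b"m80001000,4"))
```

A custom client then mostly comes down to writing such frames to the serial port and parsing the framed replies.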
Basically there are no publicly and
officially available client-side debugging
tools for IOS, and that is because this is
intended to be done by Cisco engineers.
There have been some
efforts from the community to build tools
to debug several versions of routers and
switches with IOS, and if you look for ways
to debug Cisco IOS you will
definitely find a tutorial that says
that you can patch an old version
of gdb that still supports IOS, but it
never actually worked for me: I tried it
and all I could do was read memory, the
stepping, the tracing, it just doesn't
work. Another way is to use a cool tool
by NCC Group, it's called IODIDE, it's a
graphical debugger for IOS. It really
works, it's a great tool, but the thing is
it only targets the PowerPC
architecture and it has some problems,
you probably have to patch the debugger to
be able to work with it. And the third
option, the last resort,
is to implement your own debugger for the
router. To do that you have to know
which commands Cisco actually supports,
and it's not a lot: you can basically read
memory and write memory, read and write
registers, and the only program counter
control command is a step instruction.
It's kind of easy to implement
such a debugger because all the
information is just sent as plain text
over a serial cable with a
checksum appended, which is just a CRC. So
this way I was able to, you know, make a
quick Python script using Capstone to be
able to debug IOS. You can inspect
registers, there's basic breakpoint
management, you just write a special
control double word to be able to break.
You can step over and step into, and
another good feature is
to be able to dump memory, which we will
use later. So to find the overflow, the
SNMP overflow in the code, how do you do
it? Since we
have all the string cross-references, you
can follow the strings that reference
SNMP get requests and just step until the
crash, but a more efficient method is just
to crash the device and start inspecting
the stack after the device has already
crashed. You just have to dump some memory
on the stack and look at the values that
reference the code. Some of them will be
return addresses, and this will give you a
hint where the crash actually is. So the
actual program counter corruption happens
in the function epilog, I call this
function snmp_stack_overflow, so you can
see here that at the end of a function we
load the values from the stack to
registers $s0 to $s6 and also we load
value into register $ra and this is an
important register, it's called a return
address register and almost every function
in MIPS uses this register to jump back to
its parent function. So basically we have
some space on the stack, but the question
is: can we place our shellcode on
the stack? And can we execute it? Because,
you know, the stack location is
unpredictable: every time you trigger this
vulnerability a separate space on the
stack is allocated and you cannot really
predict it. There are no valid
jump-to-stack instructions in the firmware
like we have on Intel x86, like jmp esp.
But even if
we could find such an instruction, the
address space layout randomization (ASLR)
is on, which means the code section and
data section are based at different offsets
each time we reboot the device, which
means that we can't reliably jump to the
instruction. And also an unfortunate thing
is that data caching is in place. So,
about ASLR, this is the first time I
encountered randomization in IOS.
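As a general illustration of what this kind of randomization implies, the sketch below computes the per-boot offset (often called the slide) from one known pair of static and runtime addresses and applies it to another static address. All addresses here are made-up placeholders, not values from the router, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def aslr_slide(runtime_addr: int, static_addr: int) -> int:
    """Offset between where code runs and where the disassembler shows it."""
    return (runtime_addr - static_addr) & MASK32


def relocate(static_addr: int, slide: int) -> int:
    """Translate an address from the static image to the running system."""
    return (static_addr + slide) & MASK32


if __name__ == "__main__":
    # Hypothetical: a return address seen on the stack versus the same
    # instruction in the statically analysed image.
    slide = aslr_slide(0x41234568, 0x40034568)
    print(hex(slide), hex(relocate(0x40000100, slide)))
```

Without a leaked runtime address there is nothing to compute the slide from, which is why a fixed-address code region is so valuable to an attacker.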
Previous research that I've been dealing
with said a lot about the diversity of
the firmware. Basically you had so
many different versions of firmware that
when you exploited a Cisco device you
couldn't really reliably jump to any code,
because there's such a vast diversity of
different firmware that was built by
different people. But here we actually have
address space randomization, and the text
section and data section are loaded at
different offsets after each reboot. So,
another thing that really upsets us is
data caching. When we write our
shellcode to the stack, we think that it
will be on the stack, but what actually
happens is that everything gets written
into the data cache, and when we point our
program counter to the stack we end up
executing garbage instructions, which
results in a crash once again. So this is
basically data execution prevention. Well,
it's not, it's a cache, but the solution to
this problem is the same as for data
execution prevention, and it is return
oriented programming. Unfortunately
we still have ASLR, so we can't really jump
to anything because it's at a random
offset, but here the rom monitor that I
was talking about comes to our rescue.
This little piece of software that gets
loaded before the actual firmware might
actually help us. So
the first thing we want to find is where
this bare-bones firmware is
located, and an interesting feature of
this ROMMON shell is that it allows
you to disassemble arbitrary memory,
and if you point the disassembler at an
invalid address you will get a stack trace
revealing the actual address of the rom
monitor. The rom monitor is located at
bfc00000, and you can dump it using the
debugger or you can just search the
internet for the version and download it.
The most interesting part about this piece
of firmware, is that rom monitor is
located at the same address and it's
persistent across reboots and it's really
great because we can use it for building
ROP chains inside of it. So now we have a
theoretical possibility of circumventing
ASLR, defeating the cache problem. So how
do we build an exploit, so the overview is
as follows: we jump to ROMMON, we initiate
a ROP chain, which makes an arbitrary
write using the code reuse technique and
after that we have to recover the stack
frame to allow the SNMP service to restore
the legitimate code flow. This is really
important because we will be writing only
four bytes and that is not enough for a
full fledged shellcode and if we don't
crash SNMP we can exploit this
vulnerability over and over again, thus
building a shellcode in the memory. So
after we build the shellcode we make a
jump to it. So, here's how it works: we
overflow the stack, we overflow the return
address so it points to rom monitor, we
jump to the rom monitor, then what we do
we actually find a gadget that reuses the
data on our stack to make an arbitrary
four byte write just before the text
section. Then we have to find a gadget
that will recover stack for us so we can
restore the legitimate SNMP execution call
flow. So this is basically an overview of
one cycle of how we write a four byte
double word. Now, a little bit on building
ROP chains. What is return
oriented programming? Basically the
idea is not to execute the shellcode
directly but to use existing
code in the binary to execute your
payload. So you use the stack not as a
source of instructions but as data
for the code that you're reusing.
You take snippets of code,
we call them gadgets, and you chain them
together with jump or call instructions.
A candidate gadget has to meet two
requirements: it has to actually execute
our payload, and it also has to
contain instructions that will transfer
execution flow to the next gadget or, if
it's the last gadget, it should transfer
execution back to the SNMP service. The
problem with the return oriented approach
is that there is a limited set of gadgets
available. If you're talking about the
firmware, it's around 200 megabytes of code,
so there are plenty of different gadgets
there; if we're talking about the rom
monitor, it's only 500 kilobytes of code,
so not a lot of code is available. The
second major problem is that gadgets,
because most of them are function
epilogues, modify the stack frame,
because they delete the local variables
when they jump back to the parent
function, and you have to account for that
because this might crash the process you
are exploiting. ROP chains can
basically be made to do anything, but
most of the time we do arbitrary
memory writes, and this might lead
to arbitrary code execution. So the idea
when looking for gadgets is that you
find a gadget that loads data from the
stack into registers and then you find
a second gadget that works with the data
in those registers. For example you
have one register, $v0, which contains the
value you want to write, and another
register, $s0, which has the address you
want to write to. We also want to find
gadgets that load data from the stack into
the return address register so we can jump
to the next gadget. You don't have to look
for these gadgets manually in IDA, there
are a lot of different tools for building
ROP chains. One of those tools is Ropper,
you can find it on GitHub, it's a really
handy tool.
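To give a feel for what such tools do at their core, here is a minimal sketch that scans a big-endian MIPS image for jr $ra instructions, the usual gadget terminator. The 0x03E00008 encoding is the standard MIPS32 one; the base address is the rom monitor address given in the talk, and the function name is an assumption. Real tools also disassemble the instructions around each hit.

```python
import struct

JR_RA = 0x03E00008        # MIPS32 encoding of `jr $ra`
ROMMON_BASE = 0xBFC00000  # rom monitor address mentioned in the talk


def find_returns(blob: bytes, base: int = ROMMON_BASE) -> list[int]:
    """Addresses of every word-aligned `jr $ra` in a big-endian MIPS blob."""
    hits = []
    for offset in range(0, len(blob) - 3, 4):
        (word,) = struct.unpack_from(">I", blob, offset)
        if word == JR_RA:
            hits.append(base + offset)
    return hits


if __name__ == "__main__":
    # nop; jr $ra; addiu $sp, $sp, 0x30 (delay slot); jr $ra
    sample = bytes.fromhex("00000000" "03e00008" "27bd0030" "03e00008")
    print([hex(addr) for addr in find_returns(sample)])
```

On MIPS the instruction right after jr $ra sits in the delay slot and still executes, so a gadget effectively ends one instruction after each hit.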
You just search for necessary
instructions to build
the necessary ROP chain. So now the last
technical part of how the ROP chains in
this particular exploit work and then
we'll get to the demo. So this is what a
perfectly, you know, healthy stack frame
looks like. You basically have local
variables on the stack, you have the return
address, and you also have the stack frames
of parent functions underneath the stack
frame of our vulnerable function. When
we overflow the local variables with our
long object identifier, here's what
happens: we overflow the local variables,
and these values partly get
loaded into the $s0 to $s6 general purpose
registers. We also, of course, overflow the
return address, which will take us to the
rom monitor, and we also have some 60
bytes after that: we overflow the stack
frame of the next function and we use that
data for our ROP chain as well. What we do
here: we take the value of $a0, we control
the value of $a0 as you remember, and we
move it to register $v0, and that's
solely because there are no
other gadgets in rom monitor that use $s0
as a target register to write data, so we
have to use register $v0. After that the
most important part is that we load the
return address from the ROP data too, and
we also load the address we will write to
from the ROP data. So basically,
after this gadget finishes executing,
$s0 points to the memory we want to
write to and $v0 contains the 4 bytes we
will be writing just before the code section.
So the final gadget, the one performing the
arbitrary write, is the gadget that takes
the value of register $v0 and writes it to
the address referenced by
register $s0, and the last thing it does is
transfer control to the
gadget which will recover the stack for
us. This is the most important gadget, it
allows us to run the exploit several times.
You might have noticed that the previous
gadgets moved the stack pointer 30 bytes
in hex down the
stack, and this means
that the process that we return
to will crash if we don't point the stack
pointer right between two stack frames. We
find a gadget that will move the stack
pointer down 228 bytes in hex, which
will result in a perfectly healthy stack.
Also we load the return address to
register $ra and it points to the parent
function that called our own vulnerable
function so this way we perform an
arbitrary four byte write. We can do this
several times until our shellcode is
actually built, just before the text
section and the final thing we do, we
overflow the stack again and jump to the
shellcode. A few words about the
shellcode: The device I was working with had a
telnet service and it had a password, so I
designed a simple shell code that will
just patch the authentication call flow.
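Because each run of the exploit writes only four bytes, any shellcode has to be delivered one word at a time. A minimal sketch of that bookkeeping follows; the function name and the destination address in the example are hypothetical, and padding with zero bytes relies on 0 being a nop in MIPS.

```python
def plan_writes(shellcode: bytes, dest: int) -> list[tuple[int, int]]:
    """Split shellcode into (address, big-endian 32-bit word) pairs."""
    if len(shellcode) % 4:
        # Pad to a whole word; 0x00000000 is a MIPS nop.
        shellcode += b"\x00" * (4 - len(shellcode) % 4)
    return [
        (dest + offset, int.from_bytes(shellcode[offset:offset + 4], "big"))
        for offset in range(0, len(shellcode), 4)
    ]


if __name__ == "__main__":
    # Placeholder bytes and address, purely for illustration.
    for address, word in plan_writes(bytes.fromhex("03e0000800"), 0x1000):
        print(f"write {word:08x} to {address:08x}")
```

Each pair corresponds to one full overflow, write and stack recovery cycle, followed by a final overflow that jumps to the destination address.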
So as you can see here we have a function
"login password check", and its result,
which is in the $v0 register, is checked to
see whether the authentication was
successful or not. We can build a shellcode
which will just patch the instruction which
checks "login password check", and this
will allow us to do credential-less
authentication against the telnet service.
So
what it does: basically the shell code
inspects the stack and the return address
in it to calculate the ASLR offset
because, of course the ASLR is on for the
code section and we want to patch
something in it and after that it writes a
0, which is a nop instruction in MIPS, to
a call that checks for password for telnet
and also for enable password and then it
just jumps back to SNMP service. So now
the long-awaited demo. Let's see if I can
make it a live demo. All righty, so here
we have the serial connection to the
device, you can see we have a shell. So
what we do now, we inspect the password on
the telnet service to make sure it's
working as intended. So we see that bad
passwords. We don't know the valid
password for the device, what we do now is
we launch the actual exploit, as
parameters it takes the host, community
and shell code in hex. So this is the shell
code I was talking about that patches the
code flow in authentication. So let's
write sudo. So here you see that we
initiate writing the four byte sequences
into the text section. Basically this
writes the shell code into the memory. So
after the exploit finishes this, we just
have to jump to the shell code. So let's
see. Please do not crash. So, yes. So back
to the slides. And of course you can build
a shellcode that will unset this behavior
and patch the process back to enable the
password. On a side note: how
reliably can you exploit this
vulnerability? Of course the SNMP
public community will leak you the version
of the particular router but it does not
leak you the version of ROMMON and we're
basically constructing ROP chains in the
rom monitor. So actually you have not that
many versions of rom monitor available.
You have only five if we're talking about
2800 router. So the worst-case scenario is
just you crash it four times. It's not
like you have to crash it four thousand
times to you know beat the ASLR but
there's a second option which is
interesting. ROMMON is designed to be
upgraded, so basically a system
administrator can download a new version
and update it, but the thing is that the
read-only region that contains the stock
ROMMON is always in place and it is always
at the same offset. So even if you updated
the rom monitor, the read-only version of
it, the old version that has always been
there, will always be at bfc00000. So
basically
the assumption is that all the devices
manufactured at the same time and place
will have the same read-only rom
monitor. You can query the serial number
of your router using snmpget. For
example, my lab router was manufactured in
2008 in the Czech Republic,
and it has the following version of rom
monitor. So guys, to
summarise all this: do not leave
default credentials on external networks.
Public communities are not designed to
be placed on external networks
for Shodan to find. Take care of
what you expose on the external networks.
Of course patch your devices and watch
for the end-of-life announcement by
Cisco. Sorry? Sure why not? Alright guys
thank you so much for your attention
applause
thanks for having me.
Herald: I suppose there are some questions
in this audience, please take a microphone
if you can. no one on the internet? They
are flabbergasted there it seems.
Microphone number one.
Mic 1: Yeah, I'm a random network admin
and I know that people tend to use the
same SNMP community on many of their
routers. My view is that basically if you
can get access to read only on those
routers you will be able to hijack that or
like use the same principle. So basically
don't use the same SNMP community on all
your devices that would be also something.
Artem Kondratenko: The main thing is to
update your routers, because it's a patched
vulnerability; the patch was released in
September of 2017. But if you tend to use
end-of-life products like the 2800 router,
you should probably use a strong community
string for it.
Herald: Thank you. Someone else having a
question there? Yes someone on the
internet is alive. It's alive.
Signal Angel: Let's try it. Yeah now I've
actually got a microphone. The Internet is
asking how much time did you put into this
whole project?
Artem Kondratenko: Well, working on this
exploit consumed around, I'd say, four
weeks. Four weeks from discovering the
device on the external network to the
final exploit. Yes. Thank you.
Herald: I have a question for you as
well, maybe. Do you
have lots of volunteers who are
working with you in researching
these exploits, or?
Artem Kondratenko: Volunteers?
Herald: Yeah I don't know.
Artem Kondratenko: No, actually we don't
have any volunteers, this is all part of
my work.
Herald: Okay. Thank you very much for
this really
revealing lecture, if someone wants to...
Artem Kondratenko: Oh, I just forgot to
say, is my mic on? Okay, so the actual
proof of concept and the debugger will be
released in a few days. The Python
script with Capstone and the actual
proof of concept, I'll publish them in a
week or so.
Herald: okay thank you.
34c3 outro
subtitles created by c3subtitles.de
in the year 2018. Join, and help us!