So, I have already been introduced
My name is Stefan Widmann
and maybe I can have my slides?
Hello, my slides?
laughter
I'm on the VGA output.
[hums]
Yeah, um...
[mumbling]
Okay, so while we're waiting until
my slides appear somehow.
Who has seen the incredible talk about
hacking the VoIP phones from Cisco, last year
either live or on video?
Okay some of you.
When we think about this talk
The Cisco VoIP phones have an embedded
Linux operating system, but they did not only
have to deal with the Linux OS, but also with
the firmware of the DSP.
So I want to tell you there's
not only one system,
but several systems. Several
subsystems containing firmware.
Slides would be nice, we can start
without slides, there's no problem.
So what are we going to talk about today?
First we are going to talk about motivation,
why should we do firmware analysis.
Then we need to be able to do it, so we
have some prerequisites to bring with us.
Then we'll dig deep into the topics, We will
try to look at how we obtain a firmware,
how can we analyze it, and how can we modify it.
Hmm.
Angel: We are sorry for the brief hiccup, we're
working on that. It's the second talk. I'm sorry.
[speaks german]
Hmm
Okay, so we can do without slides, it's okay.
Let's start with the motivation, why do we
want to do firmware analysis.
When talking to my lawyer I learned that I
had to clean up 90% of my motivation slide.
What is left is: you can do it if you
want to gain interoperability.
laughter
You can do it if you want to get rid
of errors that the manufacturer
does not want to fix or is unable to fix.
And one interesting point under
discussion is, what about forensics,
taking a look in those thousands of devices
in everyday life. Do they only do
what they are supposed to do?
Herald: Yes, we are still hunting for a Video
Angel to sort this little problem out. We should
have one here within 1 minute.
Again, very sorry.
We now have Nick Farr on stage,
our certified Powerpoint specialist.
applause
It can only take minutes now.
Maybe we can continue. So I will just tell
you something about prerequisites you
should bring when starting to analyse firmware.
You should at least have a good knowledge
of embedded system architecture.
You should have dealt with peripherals,
bus interfaces and so on.
You should be able to read
and write assembler.
Some might say:
I have a very good decompiler,
which is fine.
If it works for you, okay, but don't
rely on the availability of a decompiler
for the architecture you're going to be working on,
especially if you are going to work
low down at the register level.
And in my opinion, a decompiler output
will confuse you more than help you.
You may end up disassembling C runtime
libraries that are optimized to be as small as possible.
That can be really hard to follow in decompiler output.
If you want to practice how embedded systems
are working, then it might be a good idea to fetch
your arduino or whatever. You write some
little C code, handling some hardware stuff.
Then you just compile it and take a look,
what the disassembly looks like.
Very nice to have is a device reader or programmer,
like a GALEP. The problem is they are expensive.
If you are planning to do firmware analysis, it
may be a valuable investment for your hackerspace.
And last but not least, what you need most is time.
Time, time, time. It may take hours, days
without any progress, so please be patient.
laughter
Any volunteers to make up some slides here?
I swear, it worked perfectly okay with
my external monitor.
Yes? [illegible]
So I'll have to fetch my USB stick, wait a moment.
No problem, we're flexible.
[whistling]
So is there anyone who knows their
way around a computer around here?
Herald: While we figure that out, it might
be a good possibility to remind you all
that we are still looking for some Angels.
You could do video angels, which are in
high demand right now. Or you could just
do any other work you'd like to.
You can do one or two shifts, that's fine.
It would be greatly appreciated
because we require volunteer work for this event.
Also if you brought any beverages in here,
it'd be awesome if you could take them out
with you. And put them into the little
storage cases located all around the building.
Trust me when we are finished,
you'll be able to do this announcement.
Are we good? No, not really.
laughter
It looks good on the laptop.
You want to give a quick intro
to the new Ubuntu desktop?
Because I did not get that at all.
Yeah, mirror displays man.
4:3 slides.
Now we are making progress
applause
Enjoy the talk!
Okay, perfect, now with slides.
One small announcement, there will be
5 more minutes of extra talk at the end.
We'll do that. So just please ignore the yellow bars.
laughter
Not really no.
High level devices. Big complexity. YES!
Perfect, thank you, without yellow bars, thank you.
Okay. So we already talked about prerequisites,
so now we're going deep into the topics.
First we need to obtain a firmware. We
will go from non-invasive to invasive.
Because first thing we want to try is getting
the firmware without opening the device.
We will first try to download a plain binary
from the manufacturer.
Or maybe someone else has extracted
a binary and placed it on the internet.
You can try to download a boot disk, USB,
CD-ROM or boot image, whatever the manufacturer provides,
and extract it using, for example, WinRAR
on Windows, or just mount it on Linux.
Search for files named .bin, .hex, .s19, .mot
(as in Motorola), .rom or .raw.
Most of the time, .bin, .rom or
.raw files are already real binary files.
Non-binary files should be converted to
.bin files, e.g. with converters like hex2bin.
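As a rough illustration of what a converter such as hex2bin does for Intel HEX input, here is a minimal Python sketch. The function name ihex_to_bin and the 0xFF gap fill are assumptions made for this example, and only record types 00 (data), 01 (end of file) and 04 (extended linear address) are handled.

```python
def ihex_to_bin(text, fill=0xFF):
    """Convert Intel HEX text to (lowest_address, raw_bytes).

    Hypothetical helper for illustration; gaps between records are
    padded with `fill` (0xFF resembles erased flash).
    """
    memory = {}
    upper = 0  # upper 16 address bits, set by type 04 records
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue  # skip blank lines and anything that is not a record
        record = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, must sum to 0 mod 256.
        if sum(record) & 0xFF:
            raise ValueError("bad checksum in record: " + line)
        count = record[0]
        addr = int.from_bytes(record[1:3], "big")
        rtype = record[3]
        data = record[4:4 + count]
        if rtype == 0x00:      # data record
            for i, value in enumerate(data):
                memory[upper + addr + i] = value
        elif rtype == 0x01:    # end-of-file record
            break
        elif rtype == 0x04:    # extended linear address record
            upper = int.from_bytes(data, "big") << 16
    if not memory:
        return 0, b""
    low, high = min(memory), max(memory)
    return low, bytes(memory.get(a, fill) for a in range(low, high + 1))


if __name__ == "__main__":
    sample = ":0300300002337A1E\n:00000001FF\n"
    base, blob = ihex_to_bin(sample)
    print(hex(base), blob.hex())
```

The returned lowest address is worth keeping: it is a first hint about where the image is meant to live in the address space.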
If this doesn't work out, maybe we get
an updater from the manufacturer.
Normally they're .exe files built for Windows.
There are different updater types.
First the self-extracting archives.
It might be an installer too,
like Installshield or whatever.
It might be an updater, simple .exe file
without any installation, just containing the image.
It might be an updater that is downloading
an image, or it might be some of the others,
but packed with an executable packer
like UPX or PECompact.
Let's go a little bit into detail.
So if it's a self-extracting archive, search
for signatures like RARSFX or PK.
You can unpack them, e.g. if you rename
a file containing PK to .zip you can unzip it.
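The signature search can be sketched in a few lines of Python. This is only an illustration: the function name find_signatures is made up, and instead of the literal strings it looks for the archive magic bytes, the ZIP local file header "PK\x03\x04" and the common prefix of the RAR marker block "Rar!\x1a\x07".

```python
# Magic byte sequences that mark the start of embedded archive data.
SIGNATURES = {
    "zip": b"PK\x03\x04",   # ZIP local file header
    "rar": b"Rar!\x1a\x07",  # common prefix of RAR 4 and RAR 5 markers
}


def find_signatures(data, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every signature hit."""
    hits = []
    for name, magic in signatures.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(magic, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        for offset, name in find_signatures(handle.read()):
            print(f"{name} signature at offset 0x{offset:X}")
```

A hit is only a candidate. Short magics like PK can also appear by chance inside code or compressed data, so each offset still has to be checked by actually trying to unpack from there.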
If it is an installer, like Installshield, there are
special unpackers, but the problem is they are
very hard to use and extremely version specific.
It might work, it might not.
The best way is to just let it install
and search in the installed files
for a plain image or updater. If it's an
updater containing the firmware image,
we can search for the image in the executable
using your favorite hex-editor.
Maybe the updater is writing the data to a file,
a temporary file in most cases, and deleting
it after usage. You can use Process Monitor,
which is like strace but on Windows.
You can take a look at what files
are written to disk and try to capture them
before they are deleted. Maybe the updater is
just a little downloader: it checks your device
type, takes a look at the FTP site of the
manufacturer, and downloads an
image if there's one available.
So if it's downloading the image to a
file, use Process Monitor again.
If it's just downloading to RAM, you have
to go for a debugger, and dump it from memory.
Maybe you have a packed updater, which
of course is only done to save size.
If it's standard UPX, you can download UPX
and use upx -d to unpack it.
Sometimes manufacturers violate the
license of UPX and modify UPX by removing
vital file information so it can no longer be unpacked.
So you would need a special unpacker.
Other executable packers are mostly
designed not to be unpacked.
So you would need special unpackers too.
One challenge that awaits us is, maybe
the updaters contain compressed images.
They are normally unpacked before the image
is written to the device, so we can just watch
the process memory with a debugger and dump it.
What's a bit more challenging is when the
firmware is sent compressed to the device.
So we have to use invasive techniques
we will talk about later.
It's a good idea to get a sniffer ready
when you first connect your device to your PC.
Maybe the favourite bloatware coming with
the device wants to update it instantly.
What can you do to sniff the transfers?
On Windows XP, and I'm sorry it's only XP,
there's TraceSPTI, a fantastic tool for tracing SPTI,
the SCSI Pass Through Interface.
So you might think: SCSI? I do not have any SCSI
devices. But a lot of communication is done using
this protocol on Windows, e.g. to identify S/ATA
and USB devices, especially if they are ATAPI.
On the Linux side you might use Wireshark
to trace the communication, because Wireshark
on Linux can trace and sniff USB. There are
various other tools like Bus Hound and so on
to watch communication on buses. But the
problem is they are normally very expensive.
A problem you'll have if you're trying to sniff
the update transfer and reconstruct the image is
that it's like a puzzle. You don't know how to
build the image, and if you're doing it right or not.
If we do not have a firmware yet,
it might get invasive now.
We'll search for serial interfaces, sometimes
they are accessible without opening the device,
sometimes not. Do we have an embedded Linux
system? Yes, we can search for a serial console.
Maybe we have to use JTAG, there was a very
good talk at 27C3 about JTAG, serial flash and so on,
so I've included a link here.
So, still no firmware? Get your screwdriver,
let's void warranties.
We open the device and we search for
memory devices on the PCB.
If you have a very old device, maybe you'll
encounter EPROMs or even PROMs, 27-somethings.
If it's a little bit newer, you might see EEPROMs
and flash, 28, 29, 39, 49 something, and
the big flash devices with 48 pins, for
example, with various other names.
Very nice to see is that serial flashes,
those 8-pin devices 25..., sometimes 24...
are more and more becoming the standard.
They are easy to de-solder, easy to re-solder
and there are very cheap readers and programmers
available. But please, even if some say we can
do it in system without desoldering the chip,
please don't do it. It can lead to very big problems.
To make it a little bit harder, firmware can be
contained in chip-internal memories.
You can try to use proprietary programming
interfaces to read the firmware, of course JTAG.
Some devices do have bootloaders in a mask ROM.
You can try to use them.
If none of these approaches succeed,
you can try microprobing.
There was a talk at last year's congress
about low-cost chip microprobing, I've
included a link here. Just for the sake of
completeness I've mentioned CPLDs and FPGAs.
You know CPLDs are built up using internal EEPROMs.
FPGAs, Field Programmable Gate Arrays have internal
SRAM and an external serial configuration flash.
Some years ago they were marketed as being
reverse-engineer proof, okay. Yeah, maybe.
There's a talk tomorrow, same time I think, in
Saal 2, about taking a closer look at FPGAs.
Yeah, congratulations, we've done it,
we have our firmware, perfect.
So what's next? Now we have to analyze it.
The problem is: what processor is used?
We don't know which disassembler to use.
So we search the web for any
datasheets, to see if we can get any information.
Can we find out what processor is in use? The
problem is in many cases you won't get the datasheets.
The manufacturer says, you buy 1 million devices
a year, and you sign an NDA, you get the datasheets.
Now you have to be really patient, now it gets
to trial and error, trying different disassemblers.
You can use specific disassemblers, they
are only built for one architecture.
You can use a very good tool, the Interactive
Disassembler, IDA. There's a freeware version.
I've included a link in the link section of this talk,
but the freeware only supports a small set of architectures.
If you want the full set, it gets very expensive.
But there is a new tool that I really like.
It's ODA, the Online Disassembler, supporting
thirty something architectures, and it's free.
You can upload binary files, you can
upload code, and try different architectures,
and find out what might be the correct one,
and we'll do that now.
So I've prepared some binary code.
I know which architecture this has been
written for, because I did it.
I put this code to Online Disassembler,
and I chose different architectures,
and now let's look at what
the disassembly looks like.
Let's first start with former Hitachi, now
Renesas, H8S. I hope you can read it.
Take some time and please raise your hand
if you think this is valid disassembly and
we have found our architecture.
I see one hand.
Okay, I have to disappoint you, I'm sorry, it's not
valid disassembly, we can see it in the second line.
The disassembler was not able to disassemble
the data and it's just an undefined instruction.
There are several .word's in the code. It's not H8S.
Let's try MIPS. Again take some time and raise
your hand if you think that's valid.
laughter again?
It's invalid too. We can see it in the second line,
because there's a dword that's not disassembled.
What about Panasonic MN103 family?
The same hand again? Oh I see another hand.
Ok, several hands now. Yeah, OK, thank you.
So the problem is, it's not valid.
I have to disappoint you.
The problem is in this case, it looks really good
and you have to dig deeper.
You will have to check: are all subroutines
correct? Do they make sense?
Are there subroutine calls at all? And so on.
And you will see something strange. Ok last try.
What about Texas Instruments MSP430?
And again, please raise your hands.
Okay? Yeah, this time it is MSP430!
We have found our architecture, perfect,
eureka, bingo, we have it.
But what's next? The offset in the
firmware file we loaded is often
not the offset in address space. This is no
real problem when the architecture is
using relative addressing. Relative addressing
means that whatever
we want to access is located relative to some
register's content. Location-independent code.
But we have a big problem when absolute
addressing is being used, and even architectures
supporting relative addressing do have some
absolute addressing, somewhere on some accesses.
We would not know where the entry point is.
Where should we start?
Interrupt vectors might be decoded completely
wrong, subroutine calls do not make any sense.
They go to [addresses] outside of our firmware
for example, or in the middle of instructions.
So the load offset has to be found.
I'll now show a method I call "call distance search".
We will select closely located subroutine addresses
and we'll have to decide either to use
preceding return instructions in front of
the subroutines, or the start of the function
entry sequence. We build a search string
containing wildcards, and then we search.
Now we'll do that together, I've prepared an example.
This is 8051 code. The 8051 core is very old, it's
an 8-bit controller, but it's still widely used in the field
because it's cheap as dirt and you can
implement it wherever you want.
In the left column we see the addresses
of our example, from 0x00 to 0x13 hex.
We see four subroutines, with the first being the root
subroutine, calling the other three subroutines.
We can see the first call to 0x100 is outside our
example, we do not have 0x100 in this example.
So what we do is take the three subroutine
addresses and sort them.
So we're getting 0x100, 0x103, and 0x107. We calculate
the difference to figure out the length of the subroutines.
We get 3 bytes and 4 bytes. Now we look at how
subroutines are built in this specific architecture.
On x86 you will mostly find it, not on the 64-bit
platforms, but on the 32-bit and 16-bit platforms:
you will find a stack-frame entry sequence in
every function, like push bp or push ebp, 0x55.
So you can trigger on that one.
On 8051 it's not possible. Take a look at address 0x0A: it's 0xE0.
Take a look at address 0x0D, it's 0x44, and 0x11 is 0x7B.
They are not equal, so that does not help us.
So we look at the preceding returns
and yes there are returns in front
of every subroutine.
So we take the 0x22 [ret] as our anchor.
Our search string will look like this:
We start with the 0x22, we have a
subroutine with a length of 3 bytes,
so we have 0x22 [ret], two wildcards and
again a return. The second part of the
search string encodes the second
subroutine with 4 bytes. So we have wildcard,
wildcard, wildcard and again a return [0x22].
In this simple example we get only one hit,
perfect. We get a hit at address 0x09.
But we do not want the address of the
return, we want the address of the subroutine,
so we are not using the 0x09, we are using the 0x0A.
What we do is we take the original destination
address 0x0100, we subtract 0x0A
and we get the base address of our
code example, which is 0xF6.
If we apply this newly found out load
offset to the code and we adjust the offset
starting now at 0x00F6 in the left column
we see that all three subroutines now match.
The call to 0x0100, the call to 0x0107
and the call to 0x0103.
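The call distance search from this example can be sketched in Python as follows. The function name call_distance_search is made up for illustration, and the 20-byte image is a reconstruction: the bytes named in the talk (0x22 returns, 0xE0 at 0x0A, 0x44 at 0x0D, 0x7B at 0x11) are kept, the three LCALL instructions (opcode 0x12) follow from the call targets, and the remaining filler bytes are assumptions.

```python
import re


def call_distance_search(image, targets, ret=0x22):
    """Return candidate load offsets for `image`.

    `targets` are absolute call destinations taken from the disassembly;
    `ret` is the return opcode used as anchor (0x22 is RET on 8051).
    """
    targets = sorted(set(targets))
    anchor = re.escape(bytes([ret]))
    # Anchor in front of the first subroutine, then for each subroutine
    # (length - 1) wildcard bytes followed by its closing return.
    pattern = anchor
    for current, following in zip(targets, targets[1:]):
        length = following - current
        pattern += b"." * (length - 1) + anchor
    # The lookahead makes overlapping hits visible as well.
    regex = re.compile(b"(?=" + pattern + b")", re.DOTALL)
    # The first subroutine starts one byte after the anchoring return.
    return [targets[0] - (m.start() + 1) for m in regex.finditer(image)]


# Reconstructed example image, file offsets 0x00 to 0x13.
IMAGE = bytes([
    0x12, 0x01, 0x00,        # 0x00: LCALL 0x0100
    0x12, 0x01, 0x07,        # 0x03: LCALL 0x0107
    0x12, 0x01, 0x03,        # 0x06: LCALL 0x0103
    0x22,                    # 0x09: RET
    0xE0, 0x04, 0x22,        # 0x0A: 3-byte subroutine (0x04 is filler)
    0x44, 0x01, 0xF0, 0x22,  # 0x0D: 4-byte subroutine (filler operands)
    0x7B, 0x05, 0x22,        # 0x11: last subroutine (filler operand)
])

if __name__ == "__main__":
    for base in call_distance_search(IMAGE, [0x0100, 0x0107, 0x0103]):
        print(f"candidate load offset: 0x{base:04X}")  # prints 0x00F6
```

On a real firmware image the search usually returns several candidates. Adding more call targets to the list makes the pattern longer and reduces the number of false hits.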
Ok, I think this was hard, so let's
repeat what we have already done.
So we have obtained our image, we have
successfully found the processor architecture,
we have found a disassembler
to disassemble the firmware,
and we have hopefully found the
original load offset. So what's next?
Maybe the question arises, is there
additional firmware in this device?
I see jumps and calls outside of firmware
we already know, although we have adjusted
the load offset. Is it chip internal?
We can see it on the figure, maybe
we have only firmware part A. And maybe
it's using a library or chip internal part B.
So we will have to see what we can do
using a modification of the firmware.
Now having done that, we can start
with normal reverse engineering of the code.
We search for strings, we search
for references to the strings,
but as we are in a very low-end embedded
system, maybe we can search for very specialized
data references and operands. Search for USB
descriptor fields you have extracted with /bin/lsusb.
Take a look for USB magics like USBC and USBS,
you know these two dwords are used in
USB communication. Take a look for IDE,
SATA and ATAPI ID strings, saying
"I'm an OCZ SSD device" for instance. When
you've sniffed the device communication,
you've already found some typical data blocks.
You can try to find them [in the binary].
Last, but not least, maybe the device provides
some error codes, and you can search for strings,
or for operands in the opcodes.
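Searching for USB descriptor fields can be sketched like this in Python. The function name and the vendor and product IDs 0x1234 and 0x5678 are placeholders for whatever lsusb reports for the device. The check relies on the standard USB device descriptor layout: bLength 18 and bDescriptorType 1 at the start, idVendor at offset 8 and idProduct at offset 10, both little-endian.

```python
import struct


def find_device_descriptors(image, vid, pid):
    """Return offsets in `image` that look like a USB device descriptor
    carrying the given vendor and product ID."""
    needle = struct.pack("<HH", vid, pid)  # idVendor, idProduct
    hits = []
    pos = image.find(needle)
    while pos != -1:
        start = pos - 8  # idVendor sits 8 bytes into the descriptor
        # bLength must be 18 and bDescriptorType must be 1 (DEVICE).
        if start >= 0 and image[start] == 18 and image[start + 1] == 1:
            hits.append(start)
        pos = image.find(needle, pos + 1)
    return hits


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        firmware = handle.read()
    # Placeholder IDs; replace with the values lsusb shows.
    for offset in find_device_descriptors(firmware, 0x1234, 0x5678):
        print(f"device descriptor candidate at 0x{offset:X}")
```

A code reference to such an offset is a good entry point into the USB handling routines of the firmware.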
It's very interesting to find hidden
firmware update sequences, because
they would allow non-invasive modifications.
For example search for chip erase and
programming commands, you can take the
appropriate commands from the datasheet
if there's any external memory device available.
We've done it, we have analysed it and we've
learnt a lot about the device.
Now we are going to modify it.
First, we have to be aware: if we are
going to modify the firmware, we have
to be prepared to brick our device.
Manufacturers implement several integrity
checks, and why do they do that?
They do it because firmware is stored to flash,
which is prone to aging, especially if heat is involved.
So they do checksums. There are software-based
checksum calculations, CRC for example.
There are even hardware-based checksums
where some hardware peripheral will do the job for us.
So what you see in the code is maybe the start
offset, the end offset, and if you're lucky the polynomial.
It might be hardcoded in the peripheral too,
so you won't see anything.
It can be a combination of both, being done
only on startup or cyclically in the background.
What we have to do to modify the firmware
is either correct those checksums,
or we have to patch those checksum
algorithms not to trigger.
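Correcting a checksum after a patch could look like the following Python sketch. Everything specific here is an assumption for illustration: a 16-bit CRC with polynomial 0x1021 and initial value 0xFFFF, calculated MSB first over a start/end range and stored big-endian at a known offset. A real device may use a different width, polynomial, bit order or storage location, which has to be read from the disassembly.

```python
def crc16(data, poly=0x1021, init=0xFFFF):
    """Bitwise MSB-first CRC-16; the defaults are assumptions and must be
    replaced by the parameters found in the firmware."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def fix_checksum(image, start, end, crc_offset):
    """Recompute the CRC over image[start:end] and store it big-endian at
    crc_offset. `image` must be a bytearray; returns the new CRC."""
    crc = crc16(bytes(image[start:end]))
    image[crc_offset:crc_offset + 2] = crc.to_bytes(2, "big")
    return crc


if __name__ == "__main__":
    # Hypothetical layout: 16 bytes of code followed by a 2-byte CRC.
    firmware = bytearray(b"\x00" * 16 + b"\xff\xff")
    firmware[3] = 0x42  # the patch
    print(f"new CRC: 0x{fix_checksum(firmware, 0, 16, 16):04X}")
```

A quick sanity check for a guessed parameter set is to run the routine over the unmodified image first: if the result matches the stored value, the start offset, end offset and polynomial are right.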
What are the goals of our modification? Of
course we heard them in our motivation section.
We are about to correct errors, and maybe the errors
are contained in another part of the firmware that we do not
have right now. Maybe we have to dump
additional memory regions.
That's what they did in the Cisco VoIP hack.
They tried to find a memcpy routine and use it.
If you don't find a memcpy routine maybe
you can implement your own. Why not?
You could dump code from other memory
regions to output buffers.
If you have space in an external memory
device, why not program it to the device
and read it from the device. It can be very
interesting to gather more device internal information.
For example doing a RAM dump, because
during static analysis, you always wonder
what may be in RAM at this and that address.
Now as we have modified the firmware,
we can inject it back to the device. For
example using the original updater.
It might contain the next checksum check, who knows.
We can try to re-program it to the external
memory device if available, or to the processor.
This might be done using a serial interface,
either JTAG or proprietary.
That's it. Thank you very much.
applause
Angel: If you have any questions, please line up
at the room microphones, there are four here.
Are there any questions? Microphone 1 please;
Question: Not a question, but a tip. If you need some
binary dump or some leftover files on Windows,
you can deny the delete right, so the installer
or updater program is unable to delete its temp files.
So they are left over after reprogramming the device.
A: Do you have a tip on what to use in that case?
Q: Sorry?
A: Do you have a tip, is there a special tool?
Q: It's not necessary, Windows has
the function already built in.
A: OK.
Q: And I don't know the word.
A: OK
Q: But you are able to revoke rights completely
from a directory, there's a special right for deleting.
A: OK, thank you.
Angel: Are there any more questions?
Angel: Doesn't look like it, please give a warm
round of applause to our speaker Stefan Widmann.
applause
A: On [microphone 2]
Angel: There is one more question... No?
A: OK
Angel: If you're leaving please do take your...
subtitles created by c3subtitles.de