Sylvain Munaut: osmo-gmr: What's up with sat-phones ?
-
0:09 - 0:12This talk will be by Sylvain Munaut
-
0:12 - 0:15who works with the Osmocom project.
-
0:15 - 0:19He will talk today about the
osmo-gmr satellite phones. -
0:19 - 0:23After the talk there will be the
opportunity to ask questions. -
0:23 - 0:25You can approach one of the microphones
-
0:25 - 0:28and then you have the chance
to ask one question. -
0:28 - 0:33Please give a round of applause
to Mr Sylvain Munaut! -
0:33 - 0:36Applause
-
0:36 - 0:39- Hi! Thank you for being here.
-
0:39 - 0:45As he said I'm going to talk about
satellite phones today. -
0:45 - 0:50I'll first talk with a quick recap of the
previous work we've done. -
0:50 - 0:54At 28C3 there was a presentation
on this subject. -
0:54 - 0:56But there were some pieces missing.
-
0:56 - 0:59So, I'm just gonna recap
what we did previously -
0:59 - 1:01and then move on to the new stuff
-
1:01 - 1:03which is the speech codec and cipher
-
1:03 - 1:07reverse engineering and cracking
that's been done. -
1:07 - 1:10First, what is GMR exactly?
-
1:10 - 1:13GMR stands for GEO-Mobile Radio and
-
1:13 - 1:17it's an ETSI standard
for satellite phones. -
1:17 - 1:19It's heavily inspired from GSM and
-
1:19 - 1:21if you read the GMR specs
-
1:21 - 1:24there is plenty of reference
to go see the GSM spec. -
1:24 - 1:27There is actually two standards named GMR:
-
1:27 - 1:29GMR-1 and GMR-2.
-
1:29 - 1:31They are not an evolution of one another.
-
1:31 - 1:33They are competing standards
-
1:33 - 1:35that have been developed
by distinct companies -
1:35 - 1:38and then both have been
standardized to ETSI. -
1:38 - 1:41Today, we're gonna be looking
at GMR-1. -
1:41 - 1:46That comes in 3 evolutions.
It started with plain old GMR-1 -
1:46 - 1:49which is voice calls and SMS.
-
1:49 - 1:53It evolved to include packet data.
That's called GMPRS. -
1:53 - 1:55And then there was a later evolution
-
1:55 - 1:58to interconnect with the 3G core network
-
1:58 - 2:01and improve the air interface
and that's GMR-1 3G. -
2:01 - 2:04But we'll be mostly looking at
the first version -
2:04 - 2:08which is still used.
-
2:09 - 2:13Among the deployments the one
we focused on was Thuraya. -
2:13 - 2:16The reason why is that
it's the most common -
2:16 - 2:19commercial operator
where you can actually buy SIM -
2:19 - 2:21and place phone calls.
-
2:21 - 2:23The satellites are visible from Europe
-
2:23 - 2:27which is kind of a requirement for me
to look at the signal. -
2:27 - 2:31This operator is mostly active in the
middle east, Asia and Africa. -
2:31 - 2:34In Europe you don't see too many people
-
2:34 - 2:37with satellite phones
because we have GSM coverage. -
2:37 - 2:41But there are other operators
and we actually read recently -
2:41 - 2:44that there is a new deployment
planned for Europe -
2:44 - 2:47called Solaris Mobile and they said
-
2:47 - 2:50they're going to use GRM-1 3G
-
2:50 - 2:53but the satellite isn't launched yet.
-
2:53 - 2:56This is the coverage zone for Thuraya.
-
2:56 - 2:58As you can see Europe is very well covered
-
2:58 - 3:01so there is no problem if you want to
receive the signal. -
3:01 - 3:04If you are somewhere in that area
you can see it. -
3:04 - 3:07So, since it's so heavily inspired by GSM
-
3:07 - 3:10it makes sense to compare it to it.
-
3:10 - 3:13The first thing they did is to
rename everything. -
3:13 - 3:16So, instead of a base transceiver station
-
3:16 - 3:18you have a GEO transceiver station.
-
3:18 - 3:20Instead of a base station controller
-
3:20 - 3:22you have a GEO station controller.
-
3:22 - 3:26Instead of a mobile station
you have a mobile earth station. -
3:26 - 3:30Then they did some actual
useful stuff like -
3:30 - 3:32use specialized features for satellites.
-
3:32 - 3:35The first one is
terminal-to-terminal calls. -
3:35 - 3:37When you're usually placing a call
-
3:37 - 3:40your signal goes from your
satellite phone to the satellite -
3:40 - 3:43then it goes back to the
core network on earth -
3:43 - 3:46and then, if you're calling
another satellite phone -
3:46 - 3:49it would have to again go
back up to the satellite -
3:49 - 3:51and back to the other phone
and that means that -
3:51 - 3:54you pay 4 trips from earth to space
-
3:54 - 3:57and since the satellite is in
geosynchronous orbit -
3:57 - 3:59it's pretty far away.
-
3:59 - 4:02So, to reduce the delay in communication
-
4:02 - 4:06they have the so-called
terminal-to-terminal calls where -
4:06 - 4:09the call network will still
handle establishing -
4:09 - 4:12the channel but once
the call is established -
4:12 - 4:15the data is going directly
from one phone to the satellite -
4:15 - 4:18and directly back down to the other phone
-
4:18 - 4:20without any involvement from the network.
-
4:20 - 4:22It never even sees the packets.
-
4:22 - 4:26They also have this call called
“High Penetration Alerting” -
4:26 - 4:29That's because like in the movies
you see people using -
4:29 - 4:32satellite phones in bunkers
and stuff like that -
4:32 - 4:34in practice, as soon as you are inside
-
4:34 - 4:37you can't see the satellite and
you can't place a call at all -
4:37 - 4:40but you still might want to
be able to at least… -
4:40 - 4:42you know, you can't pick up the phone
-
4:42 - 4:45but you want to know that
somebody is calling you -
4:45 - 4:48and so they have this specialized channel
-
4:48 - 4:51which has an incredible amount
of error correction on it -
4:51 - 4:54so that even with very very low signal
-
4:54 - 4:57you can still get an indication
that somebody is trying to call you -
4:57 - 5:01please run outside so
you can take the call. -
5:01 - 5:04There's also a very
tight integration to GPS -
5:04 - 5:07For example, all the Almanac
and Ephemeris data -
5:07 - 5:10are actually broadcasted
by the Thuraya satellite -
5:10 - 5:12so that your phone can get a lock faster
-
5:12 - 5:15and when you place a phone call
the very first thing -
5:15 - 5:19that your phone will do
is send your exact GPS coordinates -
5:19 - 5:22in clear to the operator.
-
5:23 - 5:25They do this for two reasons:
-
5:25 - 5:29The first one is:
If you're not on the right frequency -
5:29 - 5:32for this particular geographic location
they can tell you -
5:32 - 5:34“OK. You shouldn't be connecting
on this frequency -
5:34 - 5:37use this one! It has a much better signal
-
5:37 - 5:39and you get better quality”.
-
5:39 - 5:42That's one reason.
The other reason is purely commercial: -
5:42 - 5:45It's for billing because if you're calling
-
5:45 - 5:47while you are in certain countries
it's gonna cost more -
5:47 - 5:50than if you are on the other side
of the border -
5:50 - 5:52and so they need to know where you are
-
5:52 - 5:53to bill you correctly.
-
5:53 - 5:56And then compared to GSM,
of course, they introduced -
5:56 - 5:58the two things we're gonna look at today:
-
5:58 - 6:01They changed the speech codec
to something called AMBE -
6:01 - 6:04and they changed the cipher.
They are not using A5/1 -
6:04 - 6:06or A5/2 or A5/3 or that kind of stuff.
-
6:06 - 6:10They use something called A5/GMR-1.
-
6:10 - 6:13If you look at the protocol stack
-
6:13 - 6:15anything in the lower layers
-
6:15 - 6:18(radio modulation, TDMA frame structure)
-
6:18 - 6:21it's completely different
because you have to deal with -
6:21 - 6:24the particular of the
satellite communication. -
6:24 - 6:26You find the same kind of concept
-
6:26 - 6:29and you find the control channels
and broadcast channels -
6:29 - 6:32but their implementation is different.
-
6:32 - 6:35If you go up one layer
to the data link layer -
6:35 - 6:39you have something very similar to LAPDm.
-
6:39 - 6:43Again, they needed some minor adaptation
-
6:43 - 6:45because bandwidth costs a lot
on a satellite. -
6:45 - 6:48They reduce the size of the overhead
by reducing the header -
6:48 - 6:50and since you have so much delay
-
6:50 - 6:53going from your phone to the satellite and
-
6:53 - 6:56back down to the earth station
then again to your phone -
6:56 - 7:00you have something like
nearly 500 ms of delay -
7:00 - 7:05because you have basically 4…
no 250 ms, sorry, -
7:05 - 7:10because you have both paths up and down
-
7:10 - 7:14and so, with that delay you need more
time for the acknowledgement -
7:14 - 7:17to come in,
so they increased the window size. -
7:17 - 7:19Anything above that, so, layer 3
-
7:19 - 7:22radio resource, since
it's managed the radio channel -
7:22 - 7:25is still a bit different
but anything above that -
7:25 - 7:28is strictly the same as GSM.
It actually interoperates -
7:28 - 7:31with the GSM core network
which means I can take -
7:31 - 7:34my Belgian operator SIM
put it into a Thuraya phone -
7:34 - 7:38and I can roam onto the satellite
and place calls -
7:38 - 7:40which is really nice
except for the price you pay -
7:40 - 7:43at the end of the month.
-
7:43 - 7:46For packet data it's essentially
the same thing. -
7:46 - 7:47Anything that's close to the lower level
-
7:47 - 7:50like RLC and MAC is gonna be different,
-
7:50 - 7:53anything above interoperates with GSM
-
7:53 - 7:56so you're gonna be speaking to
a GSM core network -
7:56 - 8:00SGSN and GGSN and all that good stuff.
-
8:00 - 8:05So, osmo-gmr, what we presented last time,
-
8:05 - 8:10is essentially everything you needed to go
from RF to wireshark. -
8:10 - 8:13That includes the hardware setup
-
8:13 - 8:20(how you can build an antenna,
what LNA to use, what SDR receiver to use) -
8:20 - 8:23At that time we didn't
actually have RTL-SDR, -
8:23 - 8:26the cheap DVB dongle you can use as SDR.
-
8:26 - 8:29Nowadays we do which means that
for less than 100 bucks -
8:29 - 8:33you can get an antenna,
LNA and SDR receiver and all -
8:33 - 8:36that we need to listen to those signals.
-
8:36 - 8:39Then we did all the
SDR processing which is -
8:39 - 8:41taking that raw data, filtering it,
selecting the channel -
8:41 - 8:45doing the demodulation
getting actual data bits out of it. -
8:45 - 8:48Then, channel coding which is
converting those data bits into -
8:48 - 8:52layer 2 frames. Then,
interpreting those layer 2 frames -
8:52 - 8:56into channel assignment
and follow those assignments -
8:56 - 8:59into a demonstration application.
-
8:59 - 9:02And finally, the wireshark dissector
which takes this -
9:02 - 9:05and presents it nicely in a way
you can read stuff. -
9:05 - 9:09But there is two things we couldn't do:
-
9:09 - 9:11First, we couldn't see past the
ciphering mode command -
9:11 - 9:14which as soon as ciphering was enabled
-
9:14 - 9:16we couldn't see anything because
-
9:16 - 9:18we knew the key because
it was our own calls -
9:18 - 9:20and we can read the key from the SIM
-
9:20 - 9:23but we didn't know the algorithm
that was used to cipher -
9:23 - 9:26so there was no way for us to
decipher that. -
9:26 - 9:29And we also had no way to
get the voice data, -
9:29 - 9:33converting the frame into
actual audio that we can listen to -
9:33 - 9:38We couldn't do that and that's
what we're gonna look into. -
9:38 - 9:44So the first thing is the speech codec.
-
9:46 - 9:51It's called AMBE for
Advanced Multi-Band Excitation. -
9:51 - 9:56It's not a codec in itself.
AMBE is more a family of codecs -
9:56 - 10:00which means you have several codecs
which are named AMBE -
10:00 - 10:02which are different versions
of one another -
10:02 - 10:05slightly different so that
they're not compatible. -
10:05 - 10:08It's not documented in the standard
-
10:08 - 10:12which is really annoying.
When I started working on GMR -
10:12 - 10:15I really thought that
the codec was in there -
10:15 - 10:20and then I discovered it wasn't.
That was a bad day. -
10:20 - 10:23There is a bit of specification in there
-
10:23 - 10:27but it only gives a high level
description like -
10:27 - 10:31“The codec takes audio as input
and produces 80 bits of output -
10:31 - 10:35every 20 ms” but nothing
that can be used to -
10:35 - 10:39realistically implement a decoder.
-
10:41 - 10:44That codec is made by a company DVSI Inc.
-
10:44 - 10:49whose entire business is codecs.
That's all they do -
10:49 - 10:51which is probably why there are
incompatible versions -
10:51 - 10:55of AMBE because they can
sell different versions. -
10:55 - 10:59These guys, they do sell
a small USB stick that -
10:59 - 11:02can decode some of the variants.
For example, you can decode -
11:02 - 11:07DStar audio which is also
another AMBE codec -
11:07 - 11:11or P25 which is used in
law enforcement radio in the US -
11:11 - 11:14(it's also an AMBE variant)
-
11:14 - 11:17and the cheap USB stick can decode that.
-
11:17 - 11:20Unfortunately, it's incompatible
with the GMR-1 variant -
11:20 - 11:22because that's the first thing I did,
I contacted DVSI -
11:22 - 11:25to know what were my options.
-
11:25 - 11:31And short of buying a source code license
which is just too expensive -
11:32 - 11:35is buying the appliance
which is called NET-2000. -
11:35 - 11:39And not only does it cost like 2000 Euros
-
11:39 - 11:43which is way too much for a hobby project
-
11:43 - 11:46but you also have to special-order it
with a GMR-1 firmware -
11:46 - 11:50and you have to sign some
non-reverse engineering agreement -
11:50 - 11:54which… That wasn't an option.
-
11:54 - 11:57We still had some hope.
The first hope we had is that -
11:57 - 12:03as I said P25 uses an AMBE variant as well
-
12:03 - 12:07and this particular variant
is actually documented. -
12:07 - 12:11So you can download the standard
for that particular -
12:11 - 12:15variant of AMBE and you have
all the math and all the specs -
12:15 - 12:18so you can write a decoder
and somebody actually wrote -
12:18 - 12:22a decoder which is open source
which is called ambelib -
12:22 - 12:28and so we can use that code as a base
to modify it. That was a lead. -
12:28 - 12:32The other lead, of course, is that the
codec is somewhere in the phone. -
12:32 - 12:34The phone is obviously capable
of decoding audio -
12:34 - 12:37so maybe we can find it in there.
-
12:37 - 12:41But before we can start
searching for the codec -
12:41 - 12:45it's good to have a basic understanding
of how the codec works -
12:45 - 12:48so that we know what we're looking for.
-
12:48 - 12:50The first thing to understand:
it's a vocoder. -
12:50 - 12:53It's not a general-purpose audio codec
-
12:53 - 12:58which means if you try to feed it music
it's gonna perform horribly -
12:58 - 13:01because a general-purpose
audio codec will try to model -
13:01 - 13:06what your ear can actually hear
and what it can't -
13:06 - 13:10whereas a vocoder is gonna try
to model the speech -
13:10 - 13:13and reconstructs
something that sounds like -
13:13 - 13:16but is not actually the same thing.
-
13:16 - 13:19They will drop a lot of information.
-
13:19 - 13:23To do that the first thing they do
is to split your speech -
13:23 - 13:26into small segments which are
compressed independently. -
13:26 - 13:32In the case of GMR-1
those segments are 10 ms long and -
13:32 - 13:37they are combined by pair
and each pair is compressed -
13:37 - 13:41into a single frame. And then,
for each of those small segments -
13:41 - 13:46they will represent it in 4 parameters.
Those 4 parameters are -
13:46 - 13:52the pitch which is the
fundamental frequency -
13:52 - 13:56of the periodic component of your voice,
-
13:56 - 13:59the gain which is
how loud you are talking, -
13:59 - 14:01something called
voiced/unvoiced decision, -
14:01 - 14:04we're gonna see what this is right after
-
14:04 - 14:07but essentially you have the
fundamental frequency -
14:07 - 14:09and for each harmonic,
two times the frequency, -
14:09 - 14:11three times the frequency,
four times the frequency, -
14:11 - 14:16you have a bit that says
is that component voiced or unvoiced -
14:16 - 14:22and then you have the amplitudes
for the various harmonics. -
14:22 - 14:27If you want to play back the voice
you have to do 3 things: -
14:27 - 14:29The first thing is unpacking.
-
14:29 - 14:33That's taking the 80 bits
of the voice frame -
14:33 - 14:37and reconstruct quantized
parameters from it. -
14:37 - 14:40All the parameters are not just
put in order one after the other -
14:40 - 14:43because some of the bits in a frame
will receive more error correction -
14:43 - 14:47than others and some of the bits
are more sensitive to errors than others -
14:47 - 14:51and so they will be placed at
specific places in the frame -
14:51 - 14:55so that if errors do happen,
hopefully it will affect something -
14:55 - 14:57that is not too important.
-
14:57 - 15:00That's the unpacking step.
-
15:00 - 15:02The second one is dequantization.
-
15:02 - 15:05That's going from the quantized,
compressed parameters -
15:05 - 15:09into floating point value
or integer values -
15:09 - 15:11which will represent the real parameters
-
15:11 - 15:13that are going to be used in the last step
-
15:13 - 15:17which is synthesis. And synthesis
is taking those parameters -
15:17 - 15:23and reconstructing actual audio
that sounds like the original voice. -
15:23 - 15:26To do a quick overview
of the synthesis step -
15:26 - 15:29what you do is you take
for each possible harmonic, -
15:29 - 15:32so, f0, 2 f0, 3 f0,…
-
15:32 - 15:36you have the voiced/unvoiced decision
-
15:36 - 15:40and what voiced means is
that you're gonna reconstruct -
15:40 - 15:43that particular frequency
as a pure sinusoidal tone, -
15:43 - 15:50and if it's unvoiced you're just gonna
fill that small band with noise. -
15:50 - 15:55And with that they can reconstruct
human voice pretty well. -
15:55 - 16:00So, now that we know more or less
how the codec works -
16:00 - 16:03and what we are looking for
-
16:03 - 16:08we can start looking at
where we are gonna find this codec. -
16:08 - 16:13The phone we looked at is
the Thuraya SO-2510. -
16:13 - 16:17We looked at it mostly because
it was the cheapest phone on eBay. -
16:17 - 16:20And that also means that it's the simplest
-
16:20 - 16:25which means you have
less places to look into. -
16:26 - 16:31We knew the codec was in the DSP.
How did we know that is that -
16:31 - 16:33there is simply no other places it could be.
-
16:33 - 16:37There are a bunch of chips on that phone, of course,
-
16:37 - 16:40there is only two where it could possibly be:
-
16:40 - 16:42the TI OMAP main processor,
-
16:42 - 16:44and then they also have like a small ASIC
-
16:44 - 16:46which is named the DALMA ASIC because
-
16:46 - 16:50there is a big DALMA marking on it.
-
16:50 - 16:54At first, we thought that
that chip was the AMBE codec. -
16:54 - 16:58But it turned out that
we found some Korean paper -
16:58 - 17:00from the phone manufacturer
that described -
17:00 - 17:02the phone architecture.
And there is a schematic -
17:02 - 17:05of the internals of that ASIC
and it showed -
17:05 - 17:08it only had satellite radio functions
-
17:08 - 17:10and absolutely nothing
to do with the codec -
17:10 - 17:14which means there was a good chance
it was in the TI OMAP processor. -
17:14 - 17:17Why in the DSP? Well, because
it just makes sense -
17:17 - 17:20to make the codec in the DSP
especially since DVSI -
17:20 - 17:25the company that makes the codec
mostly sells those codecs as a -
17:25 - 17:30DSP precompiled library that you can buy.
-
17:31 - 17:39Pretty standard process: We get the
firmware update from the Internet. -
17:39 - 17:42From there we can extract
the actual DSP image -
17:42 - 17:46which is the code that is
loaded into the DSP. -
17:46 - 17:48Thankfully it's supported by IDA
-
17:48 - 17:51which makes the whole process
easier to reverse engineer. -
17:51 - 17:54But it's still like 250KB of binary data
-
17:54 - 17:57and since it's a DSP it has no I/O
-
17:57 - 18:01which means there is not
a single string in it. -
18:01 - 18:05And also it's a DSP which means
it's written in DSP assembly. -
18:05 - 18:08I don't know if you ever
tried to read that -
18:08 - 18:13but it's not the most
understandable thing in the world -
18:13 - 18:16because it's highly optimized
for speed, of course. -
18:16 - 18:19And I actually looked at that
code and went over it a few times -
18:19 - 18:22and took a few hours and looked at it
-
18:22 - 18:25and I couldn't see anything in it.
-
18:25 - 18:31And one day I get an email
from a colleague in OSMOCOM, -
18:31 - 18:35Dieter Spaar, asking me
if I had ever looked at that codec -
18:35 - 18:37and I told him “yeah I looked
-
18:37 - 18:38but I couldn't really find anything.
-
18:38 - 18:40Do you see anything?”
-
18:40 - 18:42And six hours later he sent me
-
18:42 - 18:44“I think this is the entry point
-
18:44 - 18:46for the encoding and decoding functions”
-
18:46 - 18:49“Oh, OK. Good!”
-
18:49 - 18:53The way he found that
from what I remember is that -
18:53 - 18:57you look at what would a codec use.
-
18:57 - 19:00And of course it's gonna
access the audio path -
19:00 - 19:06so you can look at the audio DMA setup,
and audio DMA interrupts. -
19:06 - 19:09You can also look for constants
that you know are used in the codec, -
19:09 - 19:12for example, I said that
the frames are 80 bits, -
19:12 - 19:15so you can look for the constant 80
somewhere in the code. -
19:15 - 19:17You can look at it.
-
19:17 - 19:20It will produce 160 audio samples
per frame decoded -
19:20 - 19:23so you can look at the constant 160.
-
19:23 - 19:26Once you actually found one function
-
19:26 - 19:29there is something
that kind of stands out, -
19:29 - 19:32it's that since they ship that
as a binary library -
19:32 - 19:37they don't really trust the guy
who implements the rest of the DSP code -
19:37 - 19:39to setup the C runtime properly
-
19:39 - 19:43and so what they do is they switch
everything at each call into the codec. -
19:43 - 19:48They switch the stack, they
re-setup/reconfigure the C runtime, -
19:48 - 19:51so that it's really independent.
-
19:51 - 19:54Once you find one function,
finding the other is pretty easy, -
19:54 - 20:00you can just look for that
particular function prelude -
20:00 - 20:07that happens at each call at
the audio library. -
20:07 - 20:11OK, so, we know where the code is
-
20:11 - 20:13but it's still a massive amount of code.
-
20:13 - 20:18I think it's like one third of
the DSP code is the audio codec. -
20:18 - 20:24So, It's not something you can easily
reverse engineer in a few hours. -
20:24 - 20:29So, before trying reverse
engineering we just tried, -
20:29 - 20:32OK, maybe we can just run it.
-
20:32 - 20:35It turns out that TI has this
really nice program called -
20:35 - 20:39TI Code Composer Studio
which includes a simulator -
20:39 - 20:43and it's a really nice piece
of software because -
20:43 - 20:47not only can you pretty accurately
simulate any of the DSPs -
20:47 - 20:51you can also completely configure
the surrounding environment -
20:51 - 20:56in which it runs so you can setup
arbitrary memory map. -
20:56 - 20:59You can also trace all memory access
to see what part -
20:59 - 21:04of memory is being read or being written
or being executed. -
21:04 - 21:07You can also access a file
on your host system. -
21:07 - 21:09So you can use fread and fwrite
-
21:09 - 21:12and the simulator would
automatically translate those calls -
21:12 - 21:17into actual reading from the audio,
hard drive and write to your hard drive -
21:17 - 21:22so you can read a compressed
file that you saved -
21:22 - 21:28and it will write an audio
wave file at the output. -
21:28 - 21:32And the way to do that is
pretty straightforward. -
21:32 - 21:35You essentially just need to
take the binary dump that -
21:35 - 21:39you have from the firmware update
-
21:39 - 21:45and you need to find a way to
mix in your own custom code. -
21:45 - 21:51The way to do that: You create
a new object file like ELF -
21:51 - 21:55except in this case it's called COFF,
a common object file format. -
21:55 - 21:58But it's the same idea.
-
21:58 - 22:05You can just link to it like you would
link any other piece of binary. -
22:05 - 22:07You write a simple main function
-
22:07 - 22:12that will basically fread your
compressed samples, -
22:12 - 22:15call the decoding function
or at least what you hope -
22:15 - 22:20is the decoding function
and fwrite the audio samples. -
22:20 - 22:25And surprisingly it worked.
It didn't require that much effort. -
22:25 - 22:28In like a couple of days
it was all working. -
22:28 - 22:33It didn't work on the first try.
There was a couple of pitfalls. -
22:33 - 22:37But essentially the IDE is there
and it can probably be applied -
22:37 - 22:40to other things as well.
-
22:40 - 22:42Of course there is some
pretty big downsides. -
22:42 - 22:46At first, it's pretty slow.
Definitely sub-real-time -
22:46 - 22:51even on a modern laptop.
-
22:51 - 22:54It also requires TI Code Composer Studio
-
22:54 - 22:58which is a Windows application
and it's also not free. -
22:58 - 23:02You have a 30 days evaluation period
but that's about it. -
23:02 - 23:05So it's really not that practical.
-
23:05 - 23:08But it was running in the simulator.
-
23:08 - 23:11There is no real reason
it cannot run on real hardware. -
23:11 - 23:14Of course, that's what we try next.
-
23:14 - 23:17Unfortunately, on real hardware
-
23:17 - 23:20you can't just put memory
anywhere you want. -
23:20 - 23:24When the codec runs in the phone,
-
23:24 - 23:29it runs inside a TI-OMAP
which is an ARM plus a DSP -
23:29 - 23:31and that chip has an MMU
which means -
23:31 - 23:35you can create
any virtual memory you want. -
23:35 - 23:39But if you're trying to run
on a cheap DSP board -
23:39 - 23:42that you can get for 50 bucks,
-
23:42 - 23:46those usually don't include an MMU at all
-
23:46 - 23:48which means you need to find a board
-
23:48 - 23:50which has memory
at the right places -
23:50 - 23:55because the binary you have has been
linked to a specific physical addresses -
23:55 - 23:58and it needs to run at
that particular addresses. -
23:58 - 24:06Thankfully, Dieter found one such board
with a compatible memory map -
24:06 - 24:12and it had SDRAM.
Well, we needed it to have some memory. -
24:12 - 24:17It also included Ethernet which means
we can make some networked appliance -
24:17 - 24:20where we submit samples and get them back.
-
24:20 - 24:23That was really nice.
-
24:23 - 24:26The process of migrating the code
from the simulator -
24:26 - 24:29to the board is actually
pretty smooth because -
24:29 - 24:32of the way TI Code Composer Studio works.
-
24:32 - 24:34You just basically select that as a target
-
24:34 - 24:40and since the memory was at the right
place it pretty much worked. -
24:40 - 24:43Now, it wasn't quite as fast as we want it
-
24:43 - 24:48because it was only running
about real-time at the first try -
24:48 - 24:52and that's because SDRAM is slow
especially when you do -
24:52 - 24:57a bunch of random accesses
the SDRAM isn't really fast. -
24:57 - 25:00The codec has unfortunately
a lot of big data tables -
25:00 - 25:05which are accessed constantly
like to compute cosines. -
25:05 - 25:10So, what Dieter did is basically
identify a couple -
25:10 - 25:14of those big tables
that are accessed very often -
25:14 - 25:16and then just migrated them
to internal SRAM. -
25:16 - 25:19You just find the one or two places
-
25:19 - 25:22where the code reads from those tables
-
25:22 - 25:26and just patch the address to another address
-
25:26 - 25:29and redirect it to there.
-
25:29 - 25:33And that got the code running much faster,
about 16x faster than real-time. -
25:33 - 25:35And honestly, that was good.
-
25:35 - 25:39I wasn't really planning on doing
anything else on the hardware -
25:39 - 25:44and so I ordered the same board…
except I didn't. -
25:44 - 25:48I somehow ended up clicking
on the wrong button in the -
25:48 - 25:50TI store or something and
when the board arrived -
25:50 - 25:53I didn't have the right one.
-
25:53 - 25:56At that point I had two choices:
-
25:56 - 25:59I could just order the same board
-
25:59 - 26:01and wait one more week,
-
26:01 - 26:04or I could try to run
the code on it anyway. -
26:04 - 26:07The second board didn't have Ethernet.
It had USB. -
26:07 - 26:10But most importantly it didn't
have any SDRAM at all. -
26:10 - 26:12It only had internal SRAM
-
26:12 - 26:16and unfortunately not at
the right address. -
26:16 - 26:21So I just figured: OK.
I'm just gonna try to relocate the binary -
26:21 - 26:25and make it run on another address.
How hard can it be? -
26:25 - 26:28It turned out to be pretty straightforward
because -
26:28 - 26:31in I think less than a day it was running
-
26:31 - 26:34and that's really thanks to IDAPython
and the simulator. -
26:34 - 26:38Basically what I did is in IDAPython
which is the -
26:38 - 26:42scripting language in
IDA reverse engineering software -
26:42 - 26:46you can iterate through all the opcodes,
then for that -
26:46 - 26:50opcode you can test:
Is there an absolute memory reference -
26:50 - 26:53somewhere in that opcode?
If no, then don't do anything. -
26:53 - 26:56If there is an absolute memory reference,
-
26:56 - 27:00is that reference falling
somewhere in a place that -
27:00 - 27:02I'm trying to relocate or not?
If it is a place -
27:02 - 27:04I'm trying to relocate then
just patch that opcode -
27:04 - 27:07and change the absolute memory reference
to the new place. -
27:07 - 27:10Go through all the code like that
and try to run it. -
27:10 - 27:13Of course, it didn't work
the first time around -
27:13 - 27:20because there was a couple of
pitfalls in that approach -
27:20 - 27:25but using the simulator it was
pretty easy to find out where -
27:25 - 27:30things were going wrong because
you take the original code -
27:30 - 27:33you run everything in the simulator
and you trace every -
27:33 - 27:38memory access, then you take your
potentially relocated code -
27:38 - 27:42you run it again in the simulator
and trace every memory access -
27:42 - 27:45Decoding the same frame
it should be deterministic -
27:45 - 27:50and so you should have the exact same
memory access patterns -
27:50 - 27:53in both cases. And as soon as you see
that they are diverging -
27:53 - 27:56you know that somewhere
a branch wasn't taken or -
27:56 - 27:59something went wrong and
you can go at that place in your code -
27:59 - 28:03look and see why the
relocation didn't work. -
28:03 - 28:07It only took two tries to
have the code running. -
28:07 - 28:11It was running on my board and
I didn't have to order a new one -
28:11 - 28:13which was pretty nice.
-
28:13 - 28:16So we have a hardware USB decoder
-
28:16 - 28:19which is very useful but
you still have to carry it around -
28:19 - 28:23or plug it into a network
server or something -
28:23 - 28:26which isn't necessarily
something you want. -
28:26 - 28:32I wanted something I could
checkout into git, basically. -
28:32 - 28:39So, I started reverse engineering
the entire codec. -
28:39 - 28:43Now, it turned out to be very painful.
-
28:43 - 28:47As I said, to reconstruct audio
you have three main steps. -
28:47 - 28:51The first one is unpacking and
it's just basically bit shuffling. -
28:51 - 28:55It's like the first thing
the codec does which means -
28:55 - 28:58it's really easy to find and
it's really easy to follow. -
28:58 - 29:00So, no problem there.
-
29:00 - 29:04The second step is dequantization
and it took like 95% percent -
29:04 - 29:08of the work because it's lot of math.
-
29:08 - 29:12It's all written in DSP assembly,
in fixed-point -
29:12 - 29:15and sometimes you're looking at
a function to figure out -
29:15 - 29:19what that function does you
have to run it with different -
29:19 - 29:23parameters to see what does
that thing compute. -
29:23 - 29:26Graphing is really useful for that.
You just feed some -
29:26 - 29:29inputs into the simulator see
what output is generated -
29:29 - 29:33and just plot it in PyLab or something.
-
29:33 - 29:37And then you can see, OK,
that actually computes a logarithm. -
29:37 - 29:42There is still a lot of code there
and there was no -
29:42 - 29:46other option for that part
because the quantization step -
29:46 - 29:49is completely specific to GMR.
It's not shared with any -
29:49 - 29:52of the other AMBE variants.
-
29:52 - 29:55The synthesis part is even
more complicated, is even more -
29:55 - 29:58code in the DSP but I really didn't want…
-
29:58 - 30:02after finishing the dequantization
I really didn't want -
30:02 - 30:05to the synthesis, so I just assumed
that the synthesis step -
30:05 - 30:09was gonna be very similar
to the one in P25. -
30:09 - 30:14So I took the mbelib existing
open-source code for P25, -
30:14 - 30:18I ripped out the unpacking
and dequantization out of it. -
30:18 - 30:21That's because it's the one for P25,
of course. -
30:21 - 30:25I just plugged in my own implementation
-
30:25 - 30:28and run it through the synthesis.
-
30:28 - 30:31There's a couple of things
you need to take care of -
30:31 - 30:38mostly because P25 uses 20 ms frames,
GMR uses 10 ms frames. -
30:38 - 30:42So, I basically dropped one frame
every other frame in GMR -
30:42 - 30:46and just used the 10 ms parameter
synthesized 20 ms of speech -
30:46 - 30:49but it produced something
where you could follow -
30:49 - 30:52a conversation and it worked.
-
30:52 - 30:56I was still not really happy about it
because the code quality -
30:56 - 30:59of mbelib was really not that good.
-
30:59 - 31:01Its performance is also not that good.
-
31:01 - 31:05It wasn't actually faster
than the hardware decoder. -
31:05 - 31:10So, in the end I went back to
the actual P25 specifications -
31:10 - 31:14that I had and
rewrote a clean implementation -
31:14 - 31:17of the synthesis step.
I just guessed the difference. -
31:17 - 31:21Every time it says 20 ms
I just put 10 ms. When there is -
31:21 - 31:26a 512-bin FFT I just put 256
-
31:26 - 31:30and adapted stuff as I went.
-
31:30 - 31:34And the resulting code is in the git
-
31:34 - 31:37and you can decode voice.
-
31:37 - 31:40I think it works pretty well.
-
31:40 - 31:43You can hear some subtle differences
-
31:43 - 31:45compared to the original codec
-
31:45 - 31:48because DVSI did do a bunch of stuff to
-
31:48 - 31:51improve the voice quality.
It's their whole business. -
31:51 - 31:56And I didn't go through all of that.
-
31:56 - 32:00But… From my point of view,
I'm done on this and I'm not -
32:00 - 32:03gonna go any further.
We have C code we can run. -
32:03 - 32:06So. That's good.
-
32:06 - 32:13Applause
-
32:14 - 32:17Thank you.
-
32:17 - 32:20Next, we move on to the cipher.
-
32:20 - 32:24We didn't reverse engineer the cipher.
-
32:24 - 32:28That was done by a team
at the Bochum university -
32:28 - 32:30led by Benedikt Driessen.
-
32:30 - 32:34Right after we publ…
After 28C3, basically… -
32:34 - 32:38Actually, at 28C3 we already had
some contacts with them but -
32:38 - 32:41the paper wasn't published yet.
-
32:41 - 32:45And we actually have them validate
their data and their attack -
32:45 - 32:46using osmo-gmr.
-
32:46 - 32:50I definitely encourage you to
read the paper that they published -
32:50 - 32:52because the method they used.
They are actually the ones -
32:52 - 32:57who extracted the DSP image which
we used to reverse engineer the codec. -
32:57 - 33:02They were the first ones
to extract it from the -
33:02 - 33:05file that you can download online.
-
33:05 - 33:09And then, there is some pretty
interesting stuff in there. -
33:09 - 33:11As I said, they also published
an attack on that cipher -
33:11 - 33:14which is a ciphertext-only attack
-
33:14 - 33:18and I'll show some of the results later on.
-
33:18 - 33:23So, what does the cipher look like?
Well, it looks like that. -
33:23 - 33:27If you are familiar with A5/2 or even A5/1
-
33:27 - 33:30that should look very familiar because
-
33:30 - 33:32it's pretty much the same thing.
-
33:32 - 33:36So you have 4 linear feedback registers.
-
33:36 - 33:40The first 3 ones are combined
using a nonlinear output function -
33:40 - 33:44into the actual key stream.
And then you have a 4th register -
33:44 - 33:48which a few of the bits are tapped.
They go to a nonlinear -
33:48 - 33:51clocking function and that function
is gonna determine -
33:51 - 33:55how fast the other registers are clocked.
-
33:55 - 34:00So, not all registers advance
at each output bit. -
34:00 - 34:05And as I said, it is very similar to…
it's actually a copy of A5/2 -
34:05 - 34:08except they changed every number like…
they don't… -
34:08 - 34:11the polynomial for the linear
feedback registers are different, -
34:11 - 34:14which bits are tapped to combine
into the output are different. -
34:14 - 34:18Same thing for R4 but
it's the exact same structure -
34:18 - 34:23which means we can reuse the attack
that works on A5/2. -
34:23 - 34:27This cipher obviously has
an initialization step -
34:27 - 34:30where you basically take the key
and the frame number -
34:30 - 34:32that you want to encrypt
-
34:32 - 34:36and you have some process to initialize
-
34:36 - 34:38the LFSR with those data.
-
34:38 - 34:40But really the only thing to retain
-
34:40 - 34:43is that the initialization
process is entirely linear. -
34:43 - 34:46There is some XOR and LFSR being involved.
-
34:46 - 34:50So that's pretty much…
-
34:50 - 34:54…the thing to remember.
-
34:54 - 35:00The actual irregular clocking
only is applied when you -
35:00 - 35:06generate bitstream
for encrypting the frame. -
35:06 - 35:10Since we are gonna talk about
attacks on ciphers -
35:10 - 35:15in some details there are a few things
that you may not remember -
35:15 - 35:18if you don't do algebra all the time.
-
35:18 - 35:21This is a very very quick reminder.
-
35:21 - 35:26So, you should know is that
when you see GF(2) that's -
35:26 - 35:28just a fancy term for binary.
-
35:28 - 35:32What we call addition in GF(2)
is just the XOR operation. -
35:32 - 35:34Multiplication is just the logic AND
-
35:34 - 35:36and you can do everything with that.
-
35:36 - 35:40If you accept those definitions,
then algebra over binary -
35:40 - 35:44is gonna be pretty much the same as
algebra over real numbers -
35:44 - 35:47and all the nice properties
are gonna be applicable -
35:47 - 35:50to binary math as well.
-
35:50 - 35:54I mentioned previously “linear”
and I will mention it -
35:54 - 35:56several times in the rest of this talk.
-
35:56 - 36:01When I say “linear” it's either a constant,
-
36:01 - 36:04or a variable or unknown
multiplied by a constant -
36:04 - 36:08but you never have two variables
multiplied together -
36:08 - 36:13because then it becomes quadratic
and it's not linear anymore. -
36:14 - 36:18Being linear has some nice properties
like linear systems of equations -
36:18 - 36:23we have very fast algorithms to
solve them -
36:23 - 36:28and every linear operation can be
represented as matrix operation -
36:28 - 36:31which is especially useful…
we're gonna be dealing with -
36:31 - 36:34hundreds of unknowns and
spelling them out is not an option. -
36:34 - 36:38We have to have some way of
representing that and that's -
36:38 - 36:42matrix operations, essentially.
-
36:42 - 36:46One other important thing to know is:
-
36:46 - 36:50When you have an operation
that adds redundancy, for example, -
36:50 - 36:53channel coding where you add the CRC
-
36:53 - 36:55or where you do convolutional coding,
-
36:55 - 36:58adding error correction adds
redundancy to your message -
36:58 - 37:01and when you have redundancy
you can create what's called -
37:01 - 37:05a parity check matrix to check
if that redundancy in that present. -
37:05 - 37:08So, for a CRC you check
the CRC for example. -
37:08 - 37:11But it can be generalized to anything
that adds redundancy. -
37:11 - 37:14It can be checked.
And that becomes very useful to -
37:14 - 37:17make the attack much faster.
-
37:17 - 37:21So, a little more detail about
what the guys at Bochum did. -
37:21 - 37:25They published a ciphertext-only attack
-
37:25 - 37:28that means you only need to
capture data off the air -
37:28 - 37:30and directly with that
you feed it to the attack -
37:30 - 37:32and it can give you the key.
-
37:32 - 37:35Their target was called TCH3 frames
-
37:35 - 37:38and those are traffic frames,
they're what carries the voice -
37:38 - 37:41which means you have a lot of them because
-
37:41 - 37:44as long as the conversation is still going,
-
37:44 - 37:46frames that are being generated constantly,
-
37:46 - 37:49several tens of them per second.
-
37:49 - 37:52They didn't start from nothing. As I said
-
37:52 - 37:55the cipher is heavily based on A5/2, so,
-
37:55 - 37:57it kind of makes sense to read the papers
-
37:57 - 38:01that were published about A5/2 attacks.
-
38:01 - 38:06The most interesting one and one of the
most advanced one is called -
38:06 - 38:12“Instant Ciphertext-Only Cryptanalysis
of GSM encrypted communication” -
38:12 - 38:16It's kind of math-heavy but
-
38:16 - 38:21it provides an attack on A5/2
that recovers the key instantly -
38:21 - 38:24which is really nice.
-
38:24 - 38:26They modified this attack.
They tweaked it for GMR-1 -
38:26 - 38:28because the frames are different.
-
38:28 - 38:30There's different amount of bits.
-
38:30 - 38:33The cipher is slightly different.
-
38:33 - 38:35And they also did something a little weird.
-
38:35 - 38:37They tried to guess more bits
-
38:37 - 38:39some of the bits they tried to brute-force.
-
38:39 - 38:42I'm not exactly sure why.
-
38:42 - 38:44I think it's to reduce
the number of unknowns. -
38:44 - 38:48But I don't know if it actually
had the desired effect. -
38:48 - 38:52And in the RUB paper
they published their result: -
38:52 - 39:00With 350 GB of precomputed data
using 32 TCH3 frames -
39:00 - 39:03they managed to recover the key
in something like 40 minutes -
39:03 - 39:06which is not bad, definitely, and it works.
-
39:06 - 39:09But the original attack is named “instant”.
-
39:09 - 39:1340 minutes isn't instant – to me, at least.
-
39:13 - 39:18So, we started looking for a better attack.
-
39:18 - 39:22We started off the same paper
because it's kind of -
39:22 - 39:25the state of the art of A5/2 attacks
-
39:25 - 39:27but we really didn't do anything fancy.
-
39:27 - 39:33We just used it exactly as they said
in the original paper -
39:33 - 39:37and we just did the necessary tweaks
for A5/GMR-1, -
39:37 - 39:40changing the matrix and
the length of the vector, -
39:40 - 39:42really nothing fancy.
-
39:42 - 39:46We also decided to attack
a different type of channel -
39:46 - 39:49and this is probably what
makes the most difference. -
39:49 - 39:52We decided to target FACCH3 control frames
-
39:52 - 39:55instead of the TCH3 voice frames.
-
39:55 - 39:59There is a very big advantage to
targeting those frames. -
39:59 - 40:03First, they use a different kind of
modulation and training sequence. -
40:03 - 40:06From an RF point of view
they are much easier to receive. -
40:06 - 40:10And that's pretty important if
you're trying to mount an attack because -
40:10 - 40:13as soon as you have a
bit error in the reception -
40:13 - 40:15it will make your equations inconsistent
-
40:15 - 40:19and you won't actually be able
to solve for the key as easily. -
40:19 - 40:22So, this is a very nice property
of those frames. -
40:22 - 40:25The other very nice property is
that we have known -
40:25 - 40:29plaintext on the FACCH3 channel,
the control channel. -
40:29 - 40:32There are some predictable frames
that we know will there -
40:32 - 40:33and we know where they will be.
-
40:33 - 40:36the voice traffic is
completely unpredictable. -
40:36 - 40:38It's someone talking.
-
40:38 - 40:41The control traffic will follow
some very known patterns -
40:41 - 40:46and I'll show an example of
plaintext which is really trivial. -
40:47 - 40:50Some other of the nice properties
of the control channel is: -
40:50 - 40:53There is much more redundancy.
-
40:53 - 40:58In TCH3 not all the bits are
actually error-corrected -
40:58 - 41:01which means you don't have
that much information -
41:01 - 41:05while for the control channel
each layer-2 bit that you transmit -
41:05 - 41:10is almost quadrupled
before you put it on the air -
41:10 - 41:15which means you can build
a lot of equations very fast -
41:15 - 41:20without many bursts.
In the RUB attack they needed 32 frames, -
41:20 - 41:27probably because there is not much
redundancy in the TCH3 frames. -
41:27 - 41:30And also, as a nice bonus that
we discovered later is that -
41:30 - 41:34the FACCH3 control channel
is also used to negotiate -
41:34 - 41:38other types of channels like
the channels that are used -
41:38 - 41:43when you send FAX or
when you establish a data call -
41:43 - 41:45over satellite networks.
-
41:45 - 41:48It's actually negotiated using
that type of control channel -
41:48 - 41:51which means the same kind of attack
can work to crack -
41:51 - 41:55not only voice but also data calls and FAX.
-
41:55 - 41:58This is an example of the plaintext.
-
41:58 - 42:01I don't know if you can really see it.
-
42:01 - 42:03But essentially on the top
you have the cipher mode command -
42:03 - 42:06the first line in blue is
the cipher mode command. -
42:06 - 42:09So everything after that
is in theory ciphered. -
42:09 - 42:11This is, of course, the enciphered view.
-
42:11 - 42:14And the three marked packets
in gray and black -
42:14 - 42:16are the very predictable texts
-
42:16 - 42:19and what they are is acknowledgements.
-
42:19 - 42:21Because of the delay on the GSM channel
-
42:21 - 42:23there is some outstanding
acknowledgement -
42:23 - 42:28that still need to be sent
when the ciphering is turned on. -
42:28 - 42:32You get empty packets with
incrementing sequence number -
42:32 - 42:35and the ACK bit set.
-
42:35 - 42:37Something changes in them,
the sequence number, but -
42:37 - 42:41as soon as you can count,
predicting them is trivial. -
42:41 - 42:45This definitely works in practice
like 100% of the time. -
42:45 - 42:50I never really had any problem with that.
-
42:51 - 42:56So, how do you build a plaintext attack?
-
43:04 - 43:08The general goal is…
what you're trying to achieve -
43:08 - 43:12is build a giant system of equations
that's linear -
43:12 - 43:13and that you can solve.
-
43:13 - 43:17It's gonna have the form
A * x = b -
43:17 - 43:21b is the cipher stream that
is generated by A5/GMR-1 -
43:21 - 43:26x is some representation of
the internal state of the cipher -
43:26 - 43:28that you are trying to find
-
43:28 - 43:31and A is a giant matrix
that you can precompute -
43:31 - 43:35that's only dependent on
the structure of the cipher. -
43:35 - 43:40Each row of A and b are gonna represent
one bit of the output -
43:40 - 43:45and how to combine that internal state
to generate that cipher stream bit. -
43:45 - 43:48As I mentioned,
the initialization process is linear -
43:48 - 43:51and that's very important
for two things. -
43:51 - 43:54First,
from the internal state of the cipher -
43:54 - 43:58you're gonna be able to recover the key
-
43:58 - 44:02which is not entirely true.
There is one bit of the key -
44:02 - 44:05that you can't recover
because that bit of the key -
44:05 - 44:08turns out to have absolutely no effect
on the output whatsoever. -
44:08 - 44:12so, you can't recover it
but doesn't matter either. -
44:12 - 44:15The other thing is that
-
44:15 - 44:18you're gonna need to
build a lot of equations. -
44:18 - 44:21That system is gonna be big.
-
44:21 - 44:24In a single burst of data
you won't have enough equations. -
44:24 - 44:26You're gonna need multiple of them.
-
44:26 - 44:29But, of course, those multiple bursts
are gonna have different -
44:29 - 44:31frame numbers because they're sequential.
-
44:31 - 44:34They're not gonna wrap before a long time.
-
44:34 - 44:37But since the initialization
is entirely linear -
44:37 - 44:41you can actually derive equation
from one frame number -
44:41 - 44:44from an equation built out of
another frame number. -
44:44 - 44:50But, of course, it would be too easy.
The cipher isn't entirely linear. -
44:50 - 44:53so, we have to “linearize” it and
-
44:53 - 44:57when I first started looking at
a crypto attack on A5/2 -
44:57 - 45:00in the paper it was always
one line that said -
45:00 - 45:03oh yeah, we linearize the cipher
-
45:03 - 45:06and they kind of always assumed
the readers know what that means -
45:06 - 45:09which I didn't at the time.
-
45:09 - 45:12So, I decided to be a little more explicit.
-
45:12 - 45:17The two nonlinear elements are
the output function -
45:17 - 45:20which uses a majority function
where you take three bits -
45:20 - 45:24and it outputs whichever
bit is more common -
45:24 - 45:27and you can rewrite that
majority function as being -
45:27 - 45:31the sum of a plus b multiplied by c
-
45:31 - 45:37or if you wish a XOR b AND c.
-
45:37 - 45:41And the problem is the
b multiplied by c because -
45:41 - 45:44that's quadratic, that's not linear.
-
45:44 - 45:46That means that
we can't introduce that. -
45:46 - 45:50And the way to linearize that
is to kind of ignore it. -
45:50 - 45:54For every possible quadratic term
that's gonna be generated -
45:54 - 45:58by that nonlinear output function
-
45:58 - 46:00you generate a new unknown.
-
46:00 - 46:03In reality that unknown
isn't really unknown. -
46:03 - 46:07It's a function of your other variables but
-
46:07 - 46:10you can't represent it in a linear fashion.
-
46:10 - 46:15So, you just ignore it and say
“OK, this is a new unknown, deal with it”. -
46:15 - 46:19Actually, there is a lot of
possible combinations -
46:19 - 46:22which adds 594 new unknowns
to your equation system -
46:22 - 46:25which means that you're gonna need
a bunch more equations -
46:25 - 46:28to be able to actually solve it.
-
46:28 - 46:32But at least it makes it linear.
So, there is hope. -
46:33 - 46:40The other nonlinear thing in that cipher
is the clocking. -
46:40 - 46:44As I said, all the LSFR don't advance
at the same rate. -
46:44 - 46:50Some are clocked once or zero times
at each output bit. -
46:50 - 46:56And this is a function of the value of R4
at that particular time. -
46:56 - 47:00So, how do we linearize that?
-
47:00 - 47:05Well, R4 is a 17-bit
linear feedback register. -
47:05 - 47:09So, we can know
all its future states from any… -
47:09 - 47:13As soon as we have a good state
at one particular time -
47:13 - 47:16you can predict all the future states
-
47:16 - 47:19because it's clocked all the time.
-
47:20 - 47:24And in those 17 bits
the initialization step forces -
47:24 - 47:26one of those bits to one.
-
47:26 - 47:29So, there is only 16 bits
that you don't know -
47:29 - 47:33which means there is only
65536 possible clocking -
47:33 - 47:36patterns for the cipher.
-
47:36 - 47:40So, the way we approach
that is we just brute-force it. -
47:40 - 47:43You just assume that R4 is
gonna be equal to that value -
47:43 - 47:48after the initialization
and you just assume that -
47:48 - 47:53and try to solve and crack the cipher
by using that assumption. -
47:53 - 47:56And if you find the Kc, well,
then maybe it's the right one. -
47:56 - 48:01You just repeat that 65536 times
and most of the time -
48:01 - 48:04if your guess about R4 is incorrect
-
48:04 - 48:07the system will end up being inconsistent
-
48:07 - 48:11and you won't actually have any solution.
-
48:11 - 48:18But this still requires solving
65536 systems -
48:18 - 48:22which is fast but not that fast.
-
48:22 - 48:28There is actually a trick that we can
apply to do better guesses. -
48:28 - 48:32When you build these
giant equation systems -
48:32 - 48:35it turns out that some of the
equations that you construct -
48:35 - 48:38they are not gonna be independent.
Some of them are -
48:38 - 48:40gonna be related to each other.
-
48:40 - 48:45And so some of the rows are redundant.
And what's nice is: -
48:45 - 48:49If they're redundant
we can actually test for that redundancy. -
48:49 - 48:53For the 65536 possible systems
that you need to solve, -
48:53 - 48:55you also have an
An * x = b -
48:55 - 48:59there is a bunch of rows in An
that are gonna be linear dependent. -
48:59 - 49:04So, for each of them you can
build a parity check matrix Hn -
49:04 - 49:08so that you have this nice property
of a parity check matrix -
49:08 - 49:13where you multiply b by Hn
and you get a zero matrix. -
49:13 - 49:16But all those Hn, you can precompute them.
-
49:16 - 49:20That's the offline data of the attack.
-
49:20 - 49:23You precompute those giant matrices,
-
49:23 - 49:29then you take b which is the cipher text
you got from the cipher, -
49:30 - 49:33you multiply it by this matrix
and if it gives you zero, -
49:33 - 49:37you know that this particular R4 value
might be correct. -
49:37 - 49:39You don't know if it's correct or not,
-
49:39 - 49:41but at least it might be.
-
49:41 - 49:44If the result is non-zero
then you know that R4 is definitely -
49:44 - 49:47not the correct R4 and so
instead of solving an entire -
49:47 - 49:50system of equations the only thing
you need to do is -
49:50 - 49:54do one matrix multiplication
and that is pretty fast. -
49:54 - 49:58And in the end you're gonna have
a couple of possible matches -
49:58 - 50:02so you're gonna solve like maybe
3 or 4 systems of equations -
50:02 - 50:05instead of solving 65536 and
that's much much faster. -
50:05 - 50:10Now, how do you go from
a known-plaintext attack -
50:10 - 50:13to a ciphertext-only attack?
-
50:13 - 50:16I'm not gonna detail that much
but essentially -
50:16 - 50:20the channel coding operation
introduces redundancy -
50:20 - 50:24and you can kind of null out that…
-
50:24 - 50:27if you take the last equation you have
-
50:27 - 50:30H multiplied by y plus g.
-
50:30 - 50:34H is a function of the channel coding.
-
50:34 - 50:36You can compute it. It's known.
-
50:36 - 50:40y is the data that
you received off the air. -
50:40 - 50:42So, you know it as well.
-
50:42 - 50:46g is also a function of
the channel coding operation. -
50:46 - 50:50So, you know it.
So everything on the left you know. -
50:50 - 50:55And the last value is H multiplied by A
-
50:55 - 50:58which is what you had in
the known-plaintext attack, -
50:58 - 51:00so you know it as well.
-
51:00 - 51:03And everything you don't know
is x and so, again, -
51:03 - 51:06you have a linear system of equations
that you can solve. -
51:06 - 51:10And the same R4 quick-scan
method can be applied -
51:10 - 51:14to scan very quickly and not
have to solve everything. -
51:14 - 51:17So, this is the result.
-
51:17 - 51:21If you use the fact that
you know some plaintext -
51:21 - 51:24you need about 50 Megabytes
of data precomputed, -
51:24 - 51:27you need to receive 8 bursts
or even 4 if you -
51:27 - 51:31by any chance have the right
frame number alignment, -
51:31 - 51:33and in like 500 ms
you can get the key back. -
51:33 - 51:36And that's on a 6 year old laptop.
-
51:36 - 51:40On any recent machine it's really instant.
-
51:40 - 51:43If you want to use the
ciphertext-only variant, -
51:43 - 51:46and honestly I really don't know
why you would -
51:46 - 51:50because the known-plaintext works,
so you might as well use it. -
51:50 - 51:54But if you want it purely
for academic purposes, -
51:54 - 51:57then you need 8 bursts,
-
51:57 - 52:01about 5 Gigabytes of data,
-
52:01 - 52:04and then it takes a little longer
-
52:04 - 52:09like about 1 second to recover the key.
-
52:09 - 52:13A few things I would like to
look at in the future, -
52:13 - 52:15is C-band.
-
52:15 - 52:20The satellite has to talk back
to the earth station -
52:20 - 52:23and this happens
in a different frequency band -
52:23 - 52:25and this is what we would like to look at,
-
52:25 - 52:27essentially, the feeder link
-
52:27 - 52:30If any of you have a large dish,
satellite dish, -
52:30 - 52:33and want to provide data for us
please contact us, -
52:33 - 52:35we're looking for it.
-
52:35 - 52:38We're also started
looking into packet data -
52:38 - 52:41and some other stuff.
-
52:41 - 52:45A quick note about
other satellite phone systems, -
52:45 - 52:48please don't assume that
just because Thuraya GMR-1 -
52:48 - 52:51is broken that the others aren't
unless you have -
52:51 - 52:53actual proof that they aren't
because the only -
52:53 - 52:56reason we chose Thuraya is
because the phone was -
52:56 - 52:58the cheapest on eBay.
-
52:58 - 53:02So, given that all the satellite
phone systems also… -
53:02 - 53:06you know, you can find commercial
interceptors for them, -
53:06 - 53:11there is a big chance that they're
not much more secure. -
53:11 - 53:14I like to thank a few people:
-
53:14 - 53:19Dimitry Stolnikov,
the main other author in osmo-gmr -
53:19 - 53:23doing a bunch of the RF capture stuff,
-
53:23 - 53:28Dieter Spaar
for reverse engineering of the AMBE codec -
53:28 - 53:30and all the help he provided there
-
53:30 - 53:33and the RUB team, of course,
for their support in -
53:33 - 53:35reverse engineering the cipher,
-
53:35 - 53:40and the organizer for having me
-
53:40 - 53:43and thank you for your attention!
-
53:43 - 53:47Applause
-
53:47 - 53:52- Thank you very much!
We now have time for a few questions. -
53:52 - 53:57First, are there questions
from the Internet? -
53:58 - 54:01Not yet. So,
if you have a question, please -
54:01 - 54:03approach one of the microphones.
-
54:03 - 54:06Yes, please, microphone number 3!
-
54:06 - 54:13- Yes, I have a rather not technical
but political question: -
54:13 - 54:18So, you said in the beginning
there was like a short -
54:18 - 54:21way of communication that
only the satellite transmits -
54:21 - 54:25the voice data, so there is
no ground station involved? -
54:25 - 54:30- Yes. - Do you think that this
might upset some of the people -
54:30 - 54:34who like to intercept calls
because it makes intercepting -
54:34 - 54:36the voice data much more complicated
-
54:36 - 54:39than just going to ground station and say
-
54:39 - 54:41“I want to listen to this”?
-
54:41 - 54:44- First, just because the satellite
will send the data back -
54:44 - 54:46directly to their mobile phone
-
54:46 - 54:49doesn't mean the satellite
can't also provide a copy -
54:49 - 54:53of it to the earth station.
It just doesn't go through. -
54:53 - 54:56But the satellite can obviously
send a copy of it if they want to. -
54:56 - 55:00- And probably there will be a
possibility for the ground station -
55:00 - 55:02to say
“Don't do that direction connection”. -
55:02 - 55:06- On the other hand they can also
just intercept it off the air -
55:06 - 55:08because it's so easy to listen to it.
-
55:08 - 55:11Last year there was a talk with somebody
-
55:11 - 55:14that looked at satellite pictures
-
55:14 - 55:17and they saw that one of the
spy satellites was relocated -
55:17 - 55:22like right next to the
Thuraya-2 satellite. -
55:22 - 55:25You can look that up.
It was a presentation last year. -
55:25 - 55:29It was pretty interesting
to see that… yeah… -
55:29 - 55:31If you want to listen to it you can do it.
-
55:31 - 55:33- Thank you!
-
55:33 - 55:37- Next question from microphone number 2!
-
55:39 - 55:46- I've two questions: Can you hear
both sides of the conversation -
55:46 - 55:50or only one side of the conversation?
The second question is: -
55:50 - 55:53How big of a dish do you need?
-
55:53 - 55:55- If you want to receive L-band which is
-
55:55 - 55:58just listen to the communication
from the satellite to the phone -
55:58 - 56:02you don't need a dish,
you need like a small antenna. -
56:02 - 56:06You can even do it with a modified
GPS antenna and a piece of metal. -
56:06 - 56:10So, it's really easy and
really cheap to do. -
56:10 - 56:13You can only hear the the
part of the conversation -
56:13 - 56:16that comes from the satellite
to the phone. -
56:16 - 56:20So, if by any chance that guy
is speaking to another phone -
56:20 - 56:24you can hear both sides
because you can intercept both -
56:24 - 56:27channels simultaneously.
-
56:27 - 56:30But in general, you either
-
56:30 - 56:34only hear one side or you can
also be close enough to the -
56:34 - 56:39actual satellite phones
to receive the uplink -
56:39 - 56:43that goes from the phone to the satellite.
-
56:43 - 56:46Dimitry showed me like a
commercial intercept system -
56:46 - 56:51where apparently they use like
planes to intercept the uplink. -
56:51 - 56:54They just fly over the area of interest
-
56:54 - 56:59so they can more easily receive
the uplink from the phones. -
56:59 - 57:01- One final question. Please make it short
-
57:01 - 57:03because we are running out of time.
-
57:03 - 57:05- I'll try my best.
-
57:05 - 57:08Launching your own satellite
is certainly not an option -
57:08 - 57:11but do you see any practical applications
-
57:11 - 57:16in terms of what you did with GSM?
-
57:17 - 57:21- Not really. It would be
interesting to try start a net -
57:21 - 57:24because there is nothing that says
that you have to be on a satellite. -
57:24 - 57:28You could technically start a base station on earth
-
57:28 - 57:32using that protocol and it would work.
-
57:32 - 57:35And the phones have much more power
than a GSM phone. -
57:35 - 57:38But other than that,
you could build an IMSI catcher -
57:38 - 57:41for satellite phones if you wanted to
-
57:41 - 57:44and you'd definitely beat the
signal strength of the satellite -
57:44 - 57:47with your own setup.
-
57:47 - 57:48- Thank you.
-
57:48 - 57:54- Please give the speaker
another round of applause! -
57:54 - 58:06subtitles created by c3subtitles.de
Join, and help us!
- Title:
- Sylvain Munaut: osmo-gmr: What's up with sat-phones ?
- Description:
-
http://media.ccc.de/browse/congress/2014/31c3_-_6267_-_en_-_saal_g_-_201412271600_-_osmo-gmr_what_s_up_with_sat-phones_-_sylvain_munaut.html
At 28C3 we introduced the very first steps of the osmo-gmr projects. During this talk, we will present the various advances that have been made in this project on various aspects (voice codec, crypto algorithm, ...)
Sylvain Munaut
- Video Language:
- English
- Duration:
- 58:06
C3Subtitles edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? | ||
s.gesemann edited English subtitles for Sylvain Munaut: osmo-gmr: What's up with sat-phones ? |