1
00:00:00,000 --> 00:00:14,760
34c3 preroll
2
00:00:14,760 --> 00:00:20,360
Herald: The Democratic People's Republic
of Korea—or, as most of you know it,
3
00:00:20,360 --> 00:00:25,269
North Korea, is a topic which has
already been following us at Congress
4
00:00:25,269 --> 00:00:31,450
for four years. It all started
at 31C3 with Will Scott,
5
00:00:31,450 --> 00:00:37,030
one of our speakers today, giving a
talk about teaching computer science in
6
00:00:37,030 --> 00:00:45,120
North Korea. The topic was then continued by
Florian Grunow and Niklaus Schiess, who
7
00:00:45,120 --> 00:00:52,210
talked about the Red Star OS and also the
tablet PC called Woolim. Today, we will
8
00:00:52,210 --> 00:00:56,940
hear the next episode—we will hear about
consumer electronics in North Korea. We
9
00:00:56,940 --> 00:01:02,100
will take a peek behind the curtain, learn
about the Internet, and the current market
10
00:01:02,100 --> 00:01:09,280
situation there. Our speakers today
are Will Scott, a security postdoc, as
11
00:01:09,280 --> 00:01:16,030
well as his friend Gabe Edwards, security
consultant, and they will give us a peek
12
00:01:16,030 --> 00:01:22,710
behind the curtain. So, please, welcome
Will and Gabe with a big round of applause,
13
00:01:22,710 --> 00:01:32,429
thank you for being here already.
[Applause]
14
00:01:32,429 --> 00:01:39,890
Will: Thank you, great. So just to
put this in perspective, right, one of the
15
00:01:39,890 --> 00:01:45,479
disclaimers is that the words that get
used, especially on this topic often have
16
00:01:45,479 --> 00:01:52,460
a lot of meaning. There is a reason
that we'll be calling this DPRK or
17
00:01:52,460 --> 00:01:56,170
Korea throughout. Those are often the words
you'll hear from people who are dealing with
18
00:01:56,170 --> 00:01:59,979
engagement with the country. North Korea
is a term that the country does not call
19
00:01:59,979 --> 00:02:06,119
itself, but rather is what typically more
adversarial countries use to talk about it
20
00:02:06,119 --> 00:02:12,080
as an occupying presence. So that
language is this weird quirk that
21
00:02:12,080 --> 00:02:18,320
exists here. So yeah, we're going to talk
some about what consumer technology looks
22
00:02:18,320 --> 00:02:22,660
like and how it's evolving and what's
going on there. I think we're pretty
23
00:02:22,660 --> 00:02:30,630
excited about this. I want to start by
setting a little bit of context. This is
24
00:02:30,630 --> 00:02:35,570
the Science and Technology Complex that
opened in 2015. It's on an island in a
25
00:02:35,570 --> 00:02:40,390
river to the south side of Pyongyang, it's
still in the main city. There was a pretty
26
00:02:40,390 --> 00:02:44,490
major construction project; it went on for
about a year before they opened this. In
27
00:02:44,490 --> 00:02:48,110
the lobby they've got this nice
diorama of what the building looks like.
28
00:02:48,110 --> 00:02:52,570
It actually … this is the rest of the
lobby—it looks pretty modern.
29
00:02:52,570 --> 00:02:56,870
They have this sort of plain pastel
scheme that you actually see a lot in
30
00:02:56,870 --> 00:03:02,870
modern architectural construction there.
So if you go into the new water park or
31
00:03:02,870 --> 00:03:06,700
the boat restaurant that they've opened in
the last couple of years you see the same
32
00:03:06,700 --> 00:03:14,150
design styling. This building is part
Science Museum—it has a bunch of sort of
33
00:03:14,150 --> 00:03:20,510
interactive exploratory exhibits that you
might have a class of children come
34
00:03:20,510 --> 00:03:26,930
through to learn. It also has lecture
halls, and it also has a library. And
35
00:03:26,930 --> 00:03:31,010
when you look at the parts of it that are
the library you see a ton of computers.
36
00:03:31,010 --> 00:03:36,790
Right, this is a … technically … there
is technology here. And the
37
00:03:36,790 --> 00:03:40,720
thing that is really, I think, fascinating
and revealing about where we are in terms
38
00:03:40,720 --> 00:03:44,350
of our understanding of this country is
you look at these computers and yet again
39
00:03:44,350 --> 00:03:49,900
we see this thing that doesn't look
familiar. This isn't Red Star, it's not
40
00:03:49,900 --> 00:03:53,170
quite anything that looks like the tablets
we've seen. That's a desktop
41
00:03:53,170 --> 00:04:00,840
monitor. And it's not Windows or Mac. It's
yet again something new. And in fact,
42
00:04:00,840 --> 00:04:06,150
playing with this, you find that it's
Android that's been put in this
43
00:04:06,150 --> 00:04:11,500
custom bezel. It has a keyboard and mouse,
but it's got an Android taskbar at the top
44
00:04:11,500 --> 00:04:16,820
to let you know what apps are there and
it's yet another … they have special cased
45
00:04:16,820 --> 00:04:23,140
and customized a distribution that works
for this purpose. And I think we … for
46
00:04:23,140 --> 00:04:28,980
each one of these that maybe we have seen,
there are many more that we haven't.
47
00:04:28,980 --> 00:04:37,590
So, I want to just get us up to speed on
what we do know, to start with. We've seen
48
00:04:37,590 --> 00:04:43,090
Red Star - this is version 3. It was
three years ago that we learned about Red
49
00:04:43,090 --> 00:04:47,001
Star version 3, this thing that's sort
of Mac-like. There have actually been a
50
00:04:47,001 --> 00:04:50,030
couple other versions that have ended up
on the Internet that we know stuff about.
51
00:04:50,030 --> 00:04:54,690
And we have at some level a better
picture of what the desktop technology
52
00:04:54,690 --> 00:04:59,560
looks like. We've seen version 2.5 which
looks somewhat Windows-like. There's been
53
00:04:59,560 --> 00:05:04,250
a release of the server version that runs
some of the web servers from the country.
54
00:05:04,710 --> 00:05:10,180
And then two years ago, Florian and
Niklaus' talk—they actually went in and
55
00:05:10,180 --> 00:05:13,750
did a bunch of analysis of it, along
with on the Internet there's been
56
00:05:13,750 --> 00:05:18,320
blog posts of other people who've posted
CVEs of various bugs that they found in
57
00:05:18,320 --> 00:05:22,540
this, figured out how to make it run on
the external Internet by changing firewall
58
00:05:22,540 --> 00:05:26,540
rules, and really just like learning a lot
about both the environment that this thing
59
00:05:26,540 --> 00:05:32,310
was working in and the properties of it.
We have a bit less on the mobile side - so
60
00:05:32,310 --> 00:05:37,030
this is what a store in Korea, in
Pyongyang sort of looks like: those are
61
00:05:37,030 --> 00:05:43,560
laptops on the left, tablets and phones on
the right for sale. We got a talk last
62
00:05:43,560 --> 00:05:49,090
year, again from Niklaus and Florian, about
the Woolim tablet. I think that's actually
63
00:05:50,440 --> 00:05:56,420
maybe on the second row in this picture.
And we got a sense of some of the
64
00:05:56,420 --> 00:06:02,460
information controls there in particular,
right. So what they talked about was how
65
00:06:02,460 --> 00:06:07,520
this thing prevents some types of file
copies and transferring, and some of the
66
00:06:07,520 --> 00:06:12,540
sort of surveillance things that are built
into it. But again, we didn't get too much
67
00:06:12,540 --> 00:06:17,810
in terms of hardware to sink our teeth
into. Finally, there's this like next
68
00:06:17,810 --> 00:06:23,930
layer up—the software ecosystem. This is
an app store, again in Korea. You go to a
69
00:06:23,930 --> 00:06:27,790
place and they have nice … this is
a nice one where they've got pictures so I
70
00:06:27,790 --> 00:06:33,550
can see which games are for
sale, and they'll then plug my
71
00:06:33,550 --> 00:06:41,280
device into a computer and transfer apps
onto the device. And so we get all of this
72
00:06:41,280 --> 00:06:46,240
and we have mostly anecdotes that
are helping us sort of get small pictures,
73
00:06:46,240 --> 00:06:48,810
and I think the real problem right is
there's all these devices—this is an
74
00:06:48,810 --> 00:06:54,669
example of a few, and we really, I
think, are quite far behind in having that
75
00:06:54,669 --> 00:07:02,230
bar lowered for people to play and
understand what these things are. So, what
76
00:07:02,230 --> 00:07:06,800
I want to do to try and explain
that situation that we're in is talk
77
00:07:06,800 --> 00:07:11,770
about why we're there and the different
sort of general groups of where these
78
00:07:11,770 --> 00:07:16,000
devices end up. I realize that
that's talking about motives and that
79
00:07:16,000 --> 00:07:19,610
is often like the way that you get
people mad at you, if you try and
80
00:07:19,610 --> 00:07:22,770
ascribe some motivation to them that
they disagree with. So realize that these
81
00:07:22,770 --> 00:07:26,550
are broad strokes and not really
indicative of everyone. But this gives you
82
00:07:26,550 --> 00:07:31,590
some sense of why we've still ended up in
this world of not knowing much publicly.
83
00:07:31,590 --> 00:07:36,830
Maybe … there's a quote … this is
from Kim Jong-il, that's relevant,
84
00:07:36,830 --> 00:07:41,980
and says, you know, Koreans are quite an
intelligent people and even in computer
85
00:07:41,980 --> 00:07:45,570
technology we excel. I think this is
something that we maybe don't appreciate
86
00:07:45,570 --> 00:07:50,290
when we're thinking about this. It is
rational for Korea to not want this stuff
87
00:07:50,290 --> 00:07:54,620
to come out, right? They are worried about
adversarial governments trying to
88
00:07:54,620 --> 00:07:58,919
leverage whatever they can. It seems
rational that it's in their best interest
89
00:07:58,919 --> 00:08:03,330
to make it difficult for this stuff to get
out and for people to be able to attack
90
00:08:03,330 --> 00:08:08,900
them with it. That's what we've seen in,
you know, against the threat model well
91
00:08:08,900 --> 00:08:16,710
implemented copy control and other
sort of limitations on the devices.
92
00:08:16,710 --> 00:08:19,630
In terms of foreigners who have access to
these devices, I think there's sort of two
93
00:08:19,630 --> 00:08:24,070
classes. What we saw in the talk last year
was a device that came out through a
94
00:08:24,070 --> 00:08:29,650
defector group. So you've got someone who
left with this device and now he's trying
95
00:08:29,650 --> 00:08:35,360
to figure out what's on it. And that
is this adversarial relationship where the
96
00:08:35,360 --> 00:08:40,299
goal there is to do damage to the country.
And so there's much more value in having
97
00:08:40,299 --> 00:08:45,501
0-days than there is in releasing this
because then the security gets fixed. And
98
00:08:45,501 --> 00:08:48,880
so you'll see that, you know, for any device
that comes out there's really this
99
00:08:48,880 --> 00:08:52,520
sensitivity both in terms of not wanting
to identify people but also in: well, if we
100
00:08:52,520 --> 00:08:57,770
find anything that's buggy, we want to be
able to do something with it. I think in
101
00:08:57,770 --> 00:09:03,040
fact there's many more devices that don't
come out that way but that are held by
102
00:09:03,040 --> 00:09:08,119
foreigners who are working constructively
with the country. And for them, the
103
00:09:08,119 --> 00:09:12,790
reason is somewhat different. And I think
the reason for them is in many cases that
104
00:09:12,790 --> 00:09:17,169
they're worried about sort of the unknown
unknowns of “could someone get in trouble?
105
00:09:17,169 --> 00:09:21,449
Will this result in my connection to the
country getting disrupted? The people
106
00:09:21,449 --> 00:09:25,030
I like and work with getting in trouble
for having given me the device that I've
107
00:09:25,030 --> 00:09:28,640
done something reckless with.”
Right, so we can see from like
108
00:09:28,640 --> 00:09:31,529
a bunch of individual perspectives why
we don't have more of this technology
109
00:09:31,529 --> 00:09:37,120
out there. We can also understand
that, you know, as the public, this
110
00:09:37,120 --> 00:09:40,050
creates this weird thing where
we're all fascinated but don't
111
00:09:40,050 --> 00:09:43,949
have access. And that I think
also in the spirit of, you know,
112
00:09:43,949 --> 00:09:49,690
for Korea, this isn't great. Because the
bugs go unpatched and they don't get
113
00:09:49,690 --> 00:09:56,660
better security. So, this is the
electronic goods store at the airport
114
00:09:56,660 --> 00:10:00,800
which somewhat counter-intuitively doesn't
actually sell the tablets to foreigners
115
00:10:00,800 --> 00:10:07,199
but they do have some. What
we're going to talk about for the rest of
116
00:10:07,199 --> 00:10:14,309
this talk is an effort that I guess we're
sort of putting out on the web called
117
00:10:14,309 --> 00:10:19,540
KoreaComputerCenter.org, where
we're going to try and release a bit more
118
00:10:19,540 --> 00:10:23,699
of this technology. And I'm going to talk
through the three initial things that
119
00:10:23,699 --> 00:10:27,929
we're going to put up there that we hope
people play with. And this is in the
120
00:10:27,929 --> 00:10:34,079
spirit that, we think, this makes life
better both for Korea and for the outside
121
00:10:34,079 --> 00:10:40,009
world. For Korea, the same thing I was
just saying—I think you get better
122
00:10:40,009 --> 00:10:44,500
security in the long run. We, I think, as
a community understand the value of open-
123
00:10:44,500 --> 00:10:48,620
source software, and in having many eyes
audit and find the bugs. We've already
124
00:10:48,620 --> 00:10:53,180
seen that on the artifacts that have
gotten out. For us, I think it's a great
125
00:10:53,180 --> 00:11:00,820
chance to do two things. One,
it spreads our understanding more
126
00:11:00,820 --> 00:11:03,999
consistently so we actually understand
what is going on in the country and can
127
00:11:03,999 --> 00:11:08,769
make rational policy decisions at some
high level. It's also fascinating and we
128
00:11:08,769 --> 00:11:15,230
get to preserve this anthropological
artifact of this really amazing parallel
129
00:11:15,230 --> 00:11:19,130
development that exists
of what technology is
130
00:11:19,130 --> 00:11:25,519
like in Korea. So, in that spirit,
let's talk about what's coming out.
131
00:11:25,519 --> 00:11:29,790
Some of this I think is showing up on
BitTorrent links that are on this site
132
00:11:29,790 --> 00:11:36,009
koreacomputercenter.org as we speak. The
first is a phone image—there's a system
133
00:11:36,009 --> 00:11:43,869
partition and data partition recovery for
this phone, a Pyongyang 2407. This phone
134
00:11:43,869 --> 00:11:51,050
was chosen because it's made by a Chinese
OEM, Gionee, which also creates the same
135
00:11:51,050 --> 00:11:58,059
hardware in an Indian model. So if you've
got a friend in India at least, you can
136
00:11:58,059 --> 00:12:04,249
get the Gionee V5 - it's exactly the same
hardware and so these images can load onto
137
00:12:04,249 --> 00:12:08,330
one of these phones and then you will also
be able to run this operating system. And
138
00:12:08,330 --> 00:12:12,239
so rather than just doing static analysis
of what's there you can actually see how
139
00:12:12,239 --> 00:12:16,949
that fits together and what actually
happens. How it works, that it does shut
140
00:12:16,949 --> 00:12:20,429
down when a SIM card from a different
operator gets plugged in, these sorts of
141
00:12:20,429 --> 00:12:26,730
things. So this is just, I guess
I'll say, the basic phone system - it
142
00:12:26,730 --> 00:12:30,660
doesn't include most apps but it's got a
bunch of the sort of operating system-
143
00:12:30,660 --> 00:12:35,190
level copy controls. You can get your
hands on the Red Star protection
144
00:12:35,190 --> 00:12:42,709
things that were talked about last year.
The second thing for apps we're going to
145
00:12:42,709 --> 00:12:46,300
turn to something a little bit older this
is the Samjiyon tablet which is one of the
146
00:12:46,300 --> 00:12:54,189
first tablets that came out 2011-2012 era.
This was sort of at the beginning of
147
00:12:54,189 --> 00:12:58,040
Korea's sort of introduction of widespread
consumer electronics, so it got circulated
148
00:12:58,040 --> 00:13:03,480
quite a bit. It was a larger run of
devices than many of them. In fact so
149
00:13:03,480 --> 00:13:07,210
widespread that there's one of
these devices in the Stanford library. And
150
00:13:07,210 --> 00:13:10,481
so I guess the other thing I'll stress is
these devices are out there and it's a
151
00:13:10,481 --> 00:13:13,999
matter of making sure that we're releasing
these in a way where it's just like this
152
00:13:13,999 --> 00:13:18,009
is software but we're not necessarily
getting anyone in particular in trouble
153
00:13:18,009 --> 00:13:21,220
because these devices we know are in a
bunch of places and the attribution
154
00:13:21,220 --> 00:13:24,450
becomes hard at that point for
anyone to like, lose
155
00:13:24,450 --> 00:13:27,360
contact or get in trouble. So there's
156
00:13:27,360 --> 00:13:33,809
a basic set of apps that come
there. These are some of the icons there -
157
00:13:33,809 --> 00:13:37,999
there's a nice one that has a bunch of
recipes. The thing I'll say about
158
00:13:37,999 --> 00:13:42,689
these - these were made for this specific
device and this is a thing that you'll see
159
00:13:42,689 --> 00:13:46,819
I think throughout all the software if you
actually take a look at it. And so there's
160
00:13:46,819 --> 00:13:51,929
a lot of hard-coded paths. So as well as
the APKs themselves you'll find that they
161
00:13:51,929 --> 00:13:56,070
reference things that they expect to be in
specific parts of the SD card. Those files
162
00:13:56,070 --> 00:14:00,449
are included, but it's unlikely that if
you just copy the APK onto an Android phone
163
00:14:00,449 --> 00:14:06,369
it will be able to show you much content.
So it would be awesome if someone who
164
00:14:06,369 --> 00:14:09,569
enjoys smali wants to twiddle some paths
so that those can look for internal
165
00:14:09,569 --> 00:14:13,921
resources instead, and lower that bar
further so that more people can play. I
166
00:14:13,921 --> 00:14:17,139
think the other thing that's interesting
here is pretty much all of these apps use
167
00:14:17,139 --> 00:14:21,670
their own specific binary format that's
like yet again this totally new thing
168
00:14:21,670 --> 00:14:29,209
where it's like someone just coded some
totally one-off thing. And that's weird.
169
00:14:29,209 --> 00:14:33,080
And the final thing is we're gonna release
a bunch of educational materials that seem
170
00:14:33,080 --> 00:14:36,519
to sort of end up on these devices.
Education is one of the big purposes,
171
00:14:36,519 --> 00:14:40,610
right? You're giving these to the
children and teenagers who are
172
00:14:40,610 --> 00:14:45,160
especially excited about technology and
one of the useful things that they can do
173
00:14:45,160 --> 00:14:50,489
is use that for their course material.
In getting a set of PDFs that are sort of
174
00:14:50,489 --> 00:14:55,189
like usable, we ended up having to do some
work. I'm gonna turn over to Gabe to
175
00:14:55,189 --> 00:14:58,649
explain sort of the process we went
through in getting this last set of
176
00:14:58,649 --> 00:15:03,280
the textbooks that are
going to come out.
177
00:15:03,280 --> 00:15:08,029
Gabe: Thanks, Will. So basically when I
got involved with this, the situation as
178
00:15:08,029 --> 00:15:13,860
far as these textbooks was that we had
quite a few of these files. And there are
179
00:15:13,860 --> 00:15:18,629
two things you could tell on the surface -
one is that they claim to be PDF files
180
00:15:18,629 --> 00:15:24,379
based on the filename, and some of them
have titles in English or Korean -
181
00:15:24,379 --> 00:15:25,379
that sort of suggests
182
00:15:25,379 --> 00:15:28,480
what's inside. But what you see on the
screen is not what we saw because none of
183
00:15:28,480 --> 00:15:35,319
these files were plain PDFs. So there's a
bit of sort of custom DRM that's been
184
00:15:35,319 --> 00:15:40,959
applied to these files and it's pretty
rudimentary, but it actually does a kind
185
00:15:40,959 --> 00:15:48,161
of remarkably decent job of what we think
it was designed for. Which is that the
186
00:15:48,161 --> 00:15:53,350
textbooks that come with
or that are added to one device are not
187
00:15:53,350 --> 00:15:57,580
supposed to be able to be accessed on a
different device. And as well so if you
188
00:15:57,580 --> 00:16:01,630
pulled these PDF files out of the
device and sent them off outside the
189
00:16:01,630 --> 00:16:07,009
country, they're not readable. Now one
thing I will say is that we know from some
190
00:16:07,009 --> 00:16:13,009
of the previous talks on Red Star that
developers in and for the DPRK have
191
00:16:13,009 --> 00:16:20,259
implemented actual AES-like encryption.
This is not that - it's fairly basic and
192
00:16:20,259 --> 00:16:26,269
we did find some holes in it. So I'll talk
a little bit about what we did. So when we
193
00:16:26,269 --> 00:16:30,949
look at these files, the first thing we
notice is that they don't have a PDF
194
00:16:30,949 --> 00:16:35,029
header. The first eight bytes have this
reference or this potential reference
195
00:16:35,029 --> 00:16:40,459
anyway to what might be a date in
little-endian format. So this might be
196
00:16:40,459 --> 00:16:45,910
either December 1st or January 12th in
1978. If you have any idea what that
197
00:16:45,910 --> 00:16:50,920
means, please let us know because we're
kind of curious. The next thing is that
198
00:16:50,920 --> 00:16:56,300
when we started to look at the devices,
because we also had the applications
199
00:16:56,300 --> 00:17:03,449
that read these files, one of them has a
hard coded reference to those first four
200
00:17:03,449 --> 00:17:08,319
bytes. And so when we look at what that
application was, we find that it's this
201
00:17:08,319 --> 00:17:14,138
app called UDK.Android.Reader, which if
you go to the Google Play Store it's just
202
00:17:14,138 --> 00:17:21,280
a commercially available PDF Reader app
for Android. But it's not really, because
203
00:17:21,280 --> 00:17:27,459
it's been modified to implement the
DRM that we're looking at here. So
204
00:17:27,459 --> 00:17:32,890
basically, we took the copy of the
reader that's available online, and one of
205
00:17:32,890 --> 00:17:37,769
the copies on one of the devices, and
when we compare them we find that the
206
00:17:37,769 --> 00:17:44,070
application calls out to a shared library
when it wants to parse a PDF file. That
207
00:17:44,070 --> 00:17:47,270
library looks kind of like this
- these are the ELF sections in the file
208
00:17:47,270 --> 00:17:53,850
and it's pretty normal. When we look at
the copy that's on the DPRK version of the
209
00:17:53,850 --> 00:17:58,789
app, there's this one section added that
kind of jumps out - like it's literally
210
00:17:58,789 --> 00:18:07,990
called dot-modified. So when you look into
what's in that section, we see something
211
00:18:07,990 --> 00:18:12,230
like this - and this is really not going
to be legible both because of the size of
212
00:18:12,230 --> 00:18:18,370
text and because it's decompiled from ARM.
But we have the original decompiled code
213
00:18:18,370 --> 00:18:23,200
on the left, and the DPRK version on the
right. And the two things I just want to
214
00:18:23,200 --> 00:18:29,380
highlight are - at the top the original
function that would be filling a buffer to
215
00:18:29,380 --> 00:18:34,029
read the file has been replaced by a stub
that calls this sort of custom method in
216
00:18:34,029 --> 00:18:39,620
the modified section. And the version
that's over in the modified section does
217
00:18:39,620 --> 00:18:44,380
basically the exact same thing, except
that in one case it will call another
218
00:18:44,380 --> 00:18:47,740
function that does some decryption. And
there's some other things as well in the
219
00:18:47,740 --> 00:18:54,200
modified section; this is just sort of one
example. Now the reason that this is kind
220
00:18:54,200 --> 00:18:58,639
of interesting to us is that it really
shows us that these modifications were not
221
00:18:58,639 --> 00:19:04,000
made by someone who had source code.
Like this is kind of crazy low-level, not
222
00:19:04,000 --> 00:19:09,639
crazy, but like it's really low-level
modification of the binary itself. So when
223
00:19:09,639 --> 00:19:14,360
we look into those functions and what they
do, what we start finding is that the
224
00:19:14,360 --> 00:19:21,880
shared library, the modified version of
the shared library, has this 512-byte pad
225
00:19:21,880 --> 00:19:25,960
which basically gets used over and over
again as part of the decryption process.
226
00:19:25,960 --> 00:19:29,649
And one of the things about it is that for
different files you will start using it at
227
00:19:29,649 --> 00:19:35,870
a different point. And there's also a four
byte key that's different for every file,
228
00:19:35,870 --> 00:19:41,179
which comes from a combination of a few
bytes in the file header itself, and a
229
00:19:41,179 --> 00:19:50,330
per-device key. So that per-device key is
kind of interesting. So they're taking,
230
00:19:50,330 --> 00:19:54,029
well at the end of the day you want a four
byte key, and they're generating it out of
231
00:19:54,029 --> 00:19:57,690
a six byte MAC address and the code that
they use kind of looks like this.
232
00:19:57,690 --> 00:20:02,669
This is us reimplementing it
in Go. One of
233
00:20:02,669 --> 00:20:06,659
the weird things about it is that some of
these devices may not actually have useful
234
00:20:06,659 --> 00:20:11,419
MAC addresses so in some cases the MAC
address that it's using is actually just some
235
00:20:11,419 --> 00:20:17,460
hard-coded value in a file. All the time
when it reads these MAC addresses it's
236
00:20:17,460 --> 00:20:21,940
really just reading some code or some
text out of that system etc MAC address
237
00:20:21,940 --> 00:20:28,610
file. So if you have that key, the process
to decrypt is really simple. You take that
238
00:20:28,610 --> 00:20:35,080
key, you subtract some of the bytes - the
ones marked with Y, and you get your four
239
00:20:35,080 --> 00:20:41,019
bytes to do a decryption. And the point in
the pad that I mentioned for the
240
00:20:41,019 --> 00:20:47,200
starting offset is just that same value
interpreted as an integer mod 512 because
241
00:20:47,200 --> 00:20:53,720
that's the length of the pad. In all the
examples we looked at, or as far as we
242
00:20:53,720 --> 00:21:00,750
could tell, these headers only had keys
for like one device. But looking at the
243
00:21:00,750 --> 00:21:06,500
compiled code, it looks like it might
be possible to have like one file that can
244
00:21:06,500 --> 00:21:09,820
be decrypted by multiple different
devices. We just haven't actually seen a
245
00:21:09,820 --> 00:21:16,250
file that is like that. So the way that it
actually does decryption is byte by byte
246
00:21:16,250 --> 00:21:22,940
and this is a simplified view of what's
going on. We're releasing a tool that will
247
00:21:22,940 --> 00:21:26,230
do this correctly and has all the details
in it but in a nutshell what you're doing
248
00:21:26,230 --> 00:21:30,090
is you're doing a little bit of math to
figure out where you are starting from for
249
00:21:30,090 --> 00:21:33,980
all these operations. And then for each
byte that you want to decrypt, you take
250
00:21:33,980 --> 00:21:39,710
your encrypted byte, you subtract one of
the per-file bytes, and then you XOR the
251
00:21:39,710 --> 00:21:46,750
whole thing with one of the bytes from
that 512-byte pad. So, the cool thing
252
00:21:46,750 --> 00:21:52,200
about this from my point of view is that
this process is totally reversible. So if
253
00:21:52,200 --> 00:21:57,220
you don't know your per-file key but you
do know what the plaintext should look
254
00:21:57,220 --> 00:22:05,539
like, you can run this backwards. And it
looks like that. So what if you just
255
00:22:05,539 --> 00:22:09,210
get a bunch of these encrypted PDF files
and you have no idea what device they came
256
00:22:09,210 --> 00:22:15,170
from and you just want to look at them?
You can also do it like this. It's really
257
00:22:15,170 --> 00:22:19,169
quick to do: you basically
brute-force all of the potential
258
00:22:19,169 --> 00:22:22,410
positions to be starting from, which
is really not that many because the
259
00:22:22,410 --> 00:22:28,250
pad is not very big. And it's kind of a
known-plaintext attack.
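A minimal Python sketch of the byte-wise decryption and the known-plaintext search, for illustration only. It is not the released tool: the pad contents (`DEMO_PAD` is a made-up stand-in), the way the 4-byte key and the pad index advance (`i % 4` and `(start + i) % 512`), and all function names are assumptions. Only the per-byte step (subtract a key byte, then XOR with a pad byte), the 512-byte pad, the 4-byte key and the `%PDF` header check come from the talk.

```python
import hashlib

PAD_LEN = 512  # pad length stated in the talk
KEY_LEN = 4    # per-file key length stated in the talk

# Stand-in pad (assumption): the real 512 bytes live in the modified library.
DEMO_PAD = b"".join(hashlib.sha512(bytes([i])).digest() for i in range(8))


def decrypt(data: bytes, key: bytes, start: int, pad: bytes) -> bytes:
    """plain = ((cipher - key byte) mod 256) XOR pad byte, byte by byte."""
    out = bytearray()
    for i, c in enumerate(data):
        k = key[i % KEY_LEN]            # assumed: key bytes are used cyclically
        p = pad[(start + i) % PAD_LEN]  # assumed: pad index wraps around
        out.append(((c - k) & 0xFF) ^ p)
    return bytes(out)


def encrypt(data: bytes, key: bytes, start: int, pad: bytes) -> bytes:
    """Inverse of decrypt(); only used here to build demo data."""
    out = bytearray()
    for i, b in enumerate(data):
        k = key[i % KEY_LEN]
        p = pad[(start + i) % PAD_LEN]
        out.append(((b ^ p) + k) & 0xFF)
    return bytes(out)


def recover_candidates(cipher: bytes, pad: bytes) -> list:
    """Try every pad offset; keep (start, key) pairs that yield '%PDF-d.d'."""
    known = b"%PDF"
    hits = []
    if len(cipher) < 8:
        return hits
    for start in range(PAD_LEN):
        # Run the per-byte step backwards to get the key this offset implies.
        key = bytes(
            (cipher[i] - (known[i] ^ pad[(start + i) % PAD_LEN])) & 0xFF
            for i in range(KEY_LEN)
        )
        head = decrypt(cipher[:8], key, start, pad)
        if (head[4:5] == b"-" and head[5:6].isdigit()
                and head[6:7] == b"." and head[7:8].isdigit()):
            hits.append((start, key))
    return hits


# Demo with made-up values.
demo_key = bytes([0x12, 0x34, 0x56, 0x78])
demo_start = 300
demo_plain = b"%PDF-1.4\n% demo body\n"
demo_cipher = encrypt(demo_plain, demo_key, demo_start, DEMO_PAD)
candidates = recover_candidates(demo_cipher, DEMO_PAD)
```

The talk also says the starting offset equals the key value interpreted as an integer mod 512. That could serve as one more consistency check on each candidate, but the byte order is not stated, so the sketch leaves it out.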
260
00:22:28,250 --> 00:22:33,570
The header of a PDF file always looks like
%PDF and then there's a version number. So
261
00:22:33,570 --> 00:22:38,830
you take 4 bytes, you calculate the
per-file key that you would need to make
262
00:22:38,830 --> 00:22:44,100
that decrypt to %PDF and then you take
the same per-file key and you see if it
263
00:22:44,100 --> 00:22:49,160
would be able to decrypt the next section
to a version number, and wind up with a
264
00:22:49,160 --> 00:22:58,781
valid header. And so we've done this for
all of the files that we found, and
265
00:22:58,781 --> 00:23:04,880
basically wound up with plain text for all
these. One of the things that we noticed
266
00:23:04,880 --> 00:23:10,309
after decrypting these files is that many
of them have watermarks at the end - so if
267
00:23:10,309 --> 00:23:17,230
we look back to the talks on Red Star OS
from the past years, Florian and Niklaus
268
00:23:17,230 --> 00:23:21,970
did some work on understanding what the
watermark is. And if you want full details
269
00:23:21,970 --> 00:23:28,860
look at those talks. But to summarize it -
every time that a file passes through a
270
00:23:28,860 --> 00:23:34,500
desktop system, or sometimes when a file gets
modified, the OS adds basically an
271
00:23:34,500 --> 00:23:40,290
encrypted form of the hard drive serial
number. Now when releasing these files we
272
00:23:40,290 --> 00:23:45,460
want to sort of obscure their origins and
not get any particular people into
273
00:23:45,460 --> 00:23:52,200
trouble, so we remove all those watermarks
before releasing these. And that's pretty
274
00:23:52,200 --> 00:23:55,659
simple because the way that this works
with PDF files is just that there's a
275
00:23:55,659 --> 00:23:59,860
known line of text at the end of the file
that represents the end of the PDF, and
276
00:23:59,860 --> 00:24:05,130
Red Star always puts these watermarks
at the end so we just chop off the end. So
277
00:24:05,130 --> 00:24:10,190
once we have this we have like over 300
files of really different kinds of things,
278
00:24:10,190 --> 00:24:14,039
and we've kind of looked at some of them
but we're going to be releasing a torrent
279
00:24:14,039 --> 00:24:19,590
with all of them and we'd really like to
see what people come up with - just you
280
00:24:19,590 --> 00:24:21,940
know, what's in these files that we
haven't noticed.
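The watermark removal described above can be sketched in a few lines of Python. `%%EOF` is the standard end-of-file marker of a PDF; the function name and the choice to keep a single trailing newline are assumptions for illustration, not the authors' tool.

```python
EOF_MARKER = b"%%EOF"


def strip_trailing_data(pdf: bytes) -> bytes:
    """Drop everything that follows the last %%EOF marker."""
    end = pdf.rfind(EOF_MARKER)
    if end == -1:
        return pdf  # no marker found: leave the bytes untouched
    return pdf[: end + len(EOF_MARKER)] + b"\n"


# Demo with made-up bytes standing in for an appended watermark.
sample = b"%PDF-1.4\n...body...\n%%EOF\n" + b"\x13\x37WATERMARK"
cleaned = strip_trailing_data(sample)
```

Searching from the end means a PDF with several `%%EOF` markers (for example after incremental updates) keeps all of its real content.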
281
00:24:21,940 --> 00:24:25,149
Will: Have we looked at all of them?
Gabe: I mean yeah, we've had like a quick
282
00:24:25,149 --> 00:24:30,330
look at some of them. I don't
speak Korean; you know some. There's
283
00:24:30,330 --> 00:24:36,460
probably more to be found in that archive.
So a quick look at just a couple of
284
00:24:36,460 --> 00:24:42,019
examples of things we found. There are many
different kinds of books on these devices;
285
00:24:42,019 --> 00:24:45,659
many of them are like computer science
books, there's general-purpose knowledge
286
00:24:45,659 --> 00:24:50,679
kids' textbooks. But because we want to
understand the state of technology in
287
00:24:50,679 --> 00:24:55,889
the DPRK, the part that's most interesting
to us right now is computer science
288
00:24:55,889 --> 00:25:01,190
textbooks. So like two of the examples we
have are this Java programming book and
289
00:25:01,190 --> 00:25:06,820
this computer science book. They've got
some awesome covers and really neat art in
290
00:25:06,820 --> 00:25:11,970
some of them. But yeah, I'll hand that
back to Will to actually talk about the
291
00:25:11,970 --> 00:25:20,790
analysis of what we found in these
books and sort of where they came from.
292
00:25:20,790 --> 00:25:23,500
Will: Cool. Yeah, so maybe another quote
293
00:25:23,500 --> 00:25:28,331
from Kim Jong-il is appropriate,
saying that we need to be aware of the
294
00:25:28,331 --> 00:25:31,940
information technology industry and we
need to meet the needs of the information
295
00:25:31,940 --> 00:25:37,559
technology industry. And so I think one of
the things that comes out of these
296
00:25:37,559 --> 00:25:42,090
textbooks that I think is sort of
interesting, and this is the first benefit,
297
00:25:42,090 --> 00:25:46,260
is that this can help us understand sort
of where Korea is in terms of how much
298
00:25:46,260 --> 00:25:52,680
emphasis it's placing on this aspect. For a
lot of the educational materials, they
299
00:25:52,680 --> 00:25:57,019
seem to be organically created, they seem
to be about the specific environment;
300
00:25:57,019 --> 00:26:02,990
there's a lot of training kids how to use
Red Star of various versions that you see.
301
00:26:02,990 --> 00:26:10,440
The textbooks, many of them are translated
or follow a curriculum and a layout of
302
00:26:10,440 --> 00:26:13,700
foreign external materials that have been
translated. So for some of the ones where
303
00:26:13,700 --> 00:26:18,350
we could identify what the original source
was, we tried to calculate how long that
304
00:26:18,350 --> 00:26:21,299
had taken, because we were actually
surprised: sometimes this was pretty
305
00:26:21,299 --> 00:26:27,990
quick. So I'll show this waterfall graph -
each of these bars represents one book.
306
00:26:27,990 --> 00:26:32,170
Some of the titles at the bottom, they're
quite small, and the y-axis is the
307
00:26:32,170 --> 00:26:36,760
year. The bottom is when the original
English version that was used seemed to
308
00:26:36,760 --> 00:26:41,730
come out, and the top is when the
translation was released. And so what's
309
00:26:41,730 --> 00:26:45,070
interesting here is you
see on the order of even the
310
00:26:45,070 --> 00:26:50,340
same year, sometimes a couple years,
throughout this whole period of 2000 to
311
00:26:50,340 --> 00:26:55,789
2010 where they're putting a bunch of
effort into taking four-hundred,
312
00:26:55,789 --> 00:27:03,299
five-hundred-page books (the torrent of
these textbooks is four-some gigs) and
313
00:27:03,299 --> 00:27:09,019
doing good translations fairly quickly.
These are like solid translations: the code
314
00:27:09,019 --> 00:27:14,529
examples have often been changed, there's
comments in Korean in there. Like, this is
315
00:27:14,529 --> 00:27:17,899
a solid effort that we should be
understanding and I think maybe partially
316
00:27:17,899 --> 00:27:22,090
sort of fills this gap of like, what is
this disconnect between this very isolated
317
00:27:22,090 --> 00:27:33,509
country and the fact that it has a really
strong computer capability. Cool, to end,
318
00:27:33,509 --> 00:27:38,240
I just want to sort of give an anecdote
that maybe goes to the other side of this
319
00:27:38,240 --> 00:27:42,130
anthropological value that we get out of
this sort of work. So you've heard about
320
00:27:42,130 --> 00:27:48,039
Kwangmyong - this is the internal network
or Internet. And so from these educational
321
00:27:48,039 --> 00:27:51,889
textbooks you start to get I think more
insight into sort of how this thing has
322
00:27:51,889 --> 00:27:57,730
progressed over time. Here are pictures
from 2001 (I apologize for the quality; this
323
00:27:57,730 --> 00:28:03,211
was what was there) of an early version of
Kwangmyong. This is Kwangmyong 5.1 which
324
00:28:03,211 --> 00:28:09,549
looks sort of like AOL. It was a dial-up
application that would get you documents
325
00:28:09,549 --> 00:28:15,120
and information. You also see at that same
time that there was an email sort of
326
00:28:15,120 --> 00:28:22,179
corresponding app called "hey son" - I
think I got that pronunciation not too bad -
327
00:28:22,179 --> 00:28:25,120
that was used for messaging. We'd heard
that there was a messaging system, but we
328
00:28:25,120 --> 00:28:30,529
didn't really have that connected to sort
of where that fit into the puzzle. A
329
00:28:30,529 --> 00:28:34,570
picture that seems to be that same sort of
internal network ended up on the South
330
00:28:34,570 --> 00:28:40,450
Korean internet around 2005. It got reused
by Anonymous in 2013 when they claimed to
331
00:28:40,450 --> 00:28:46,340
attack the Korean government servers, but
then sort of that turned out to
332
00:28:46,340 --> 00:28:50,781
be false in that it was this original 2005
post that someone made. That seems to be a
333
00:28:50,781 --> 00:28:56,450
similar system. And even in that 2005 post
they had sort of also their web
334
00:28:56,450 --> 00:29:00,479
component - that's the same logo
in the upper left as they moved
335
00:29:00,479 --> 00:29:02,120
to sort of a web site
that we've now seen
336
00:29:02,120 --> 00:29:07,110
evolve. It's worth noting here, right:
Kwangmyong is a single site - it's a
337
00:29:07,110 --> 00:29:12,330
service for generally technical document
retrieval. Here's that same site now up to
338
00:29:12,330 --> 00:29:18,740
the 2010 era, looking a little bit nicer, at
least at higher quality in the picture.
339
00:29:18,740 --> 00:29:21,889
And so I think what we're starting to do
is we're getting these insights through
340
00:29:21,889 --> 00:29:24,760
seeing more of these
documents coming out about what this
341
00:29:24,760 --> 00:29:28,840
internal ecosystem actually looks like.
There are these services that we can
342
00:29:28,840 --> 00:29:33,740
start to link over time, understand what
sorts of files are available and the
343
00:29:33,740 --> 00:29:39,100
specialties of these different groups, and
preserve some of this internal network
344
00:29:39,100 --> 00:29:44,929
that, you know, in this fairly unstable
environment, we're in danger of losing.
345
00:29:44,929 --> 00:29:50,100
To bring us up to current time, this is
from 2015 - a sort of blurry picture from
346
00:29:50,100 --> 00:29:55,519
a Koryolink office. Koryolink's the
mobile telephony provider, and to call out
347
00:29:55,519 --> 00:30:00,759
that they now have the same set of services
on a poster advertising mobile service
348
00:30:00,759 --> 00:30:05,830
with internal IPs to them. And so we're
seeing now that this is being introduced
349
00:30:05,830 --> 00:30:09,360
at a wider availability and advertised to
people on their mobile devices. So we're
350
00:30:09,360 --> 00:30:13,700
moving beyond just wired desktop
connections, and this is now a thing that
351
00:30:13,700 --> 00:30:18,980
more people are going to have access to on
personal devices. And so I think you know,
352
00:30:18,980 --> 00:30:25,669
internally, we're in this really exciting
transitionary phase. I'm happy that
353
00:30:25,669 --> 00:30:31,131
more of this ends up in the public. So,
there's this site, koreacomputercenter - it
354
00:30:31,131 --> 00:30:36,320
should already have some links, more will
show up very soon. If you are interested
355
00:30:36,320 --> 00:30:40,860
we encourage you to go grab that stuff, try
and make the bar lower. If you have
356
00:30:40,860 --> 00:30:45,190
DPRK artifacts,
info@koreacomputercenter.org - we'd love to
357
00:30:45,190 --> 00:30:51,081
talk to you, help make stuff safe, and get
more stuff out for public consumption. I
358
00:30:51,081 --> 00:30:57,350
think we are about that time - are you
coming to kick us off? So we will take
359
00:30:57,350 --> 00:31:03,308
questions across the hall in
the tea room. Thank you.
360
00:31:03,308 --> 00:31:07,730
[Applause]
361
00:31:07,730 --> 00:31:13,095
34c3 postroll
362
00:31:13,095 --> 00:31:27,941
subtitles created by c3subtitles.de
in the year 2018. Join, and help us!