1
00:00:00,000 --> 00:00:11,169
32C3 preroll music
2
00:00:11,169 --> 00:00:15,140
M.C.: Hey! So, can you hear me OK? Yeah.
3
00:00:15,140 --> 00:00:19,779
I am M.C. and I work on Transparency
Toolkit along with Brennan Novak
4
00:00:19,779 --> 00:00:25,799
and Kevin Gallagher. Basically, what
we try to do is “Watch the Watchers”.
5
00:00:25,799 --> 00:00:31,460
Back in May we released a database of
over 27,000 people in the Intelligence
6
00:00:31,460 --> 00:00:37,340
Community called ICWATCH. And this is
people who are talking about their work on
7
00:00:37,340 --> 00:00:41,780
classified programs on the public
internet. So we collected it using
8
00:00:41,780 --> 00:00:46,310
search terms like the code words
mentioned in the Snowden documents.
9
00:00:46,310 --> 00:00:50,710
And today we’re releasing
an update to ICWATCH
10
00:00:50,710 --> 00:00:55,970
doubling the data in the database.
11
00:00:55,970 --> 00:01:00,920
applause
12
00:01:00,920 --> 00:01:07,309
And that’s already live, if
anyone wants to look at it.
13
00:01:07,309 --> 00:01:12,159
For the people who aren’t familiar with
this project and the sorts of things
14
00:01:12,159 --> 00:01:16,810
available on the research methods I’d like
to go through an interesting example of
15
00:01:16,810 --> 00:01:20,350
research things that can
be found in this database.
16
00:01:20,350 --> 00:01:26,449
So this is Lauren Russell, and she works
at L-3, a major intelligence contractor.
17
00:01:26,449 --> 00:01:30,679
But she started her career as an army
interrogator in Iraq. She says that
18
00:01:30,679 --> 00:01:36,900
the information that she collected was
used to capture dozens of people.
19
00:01:36,900 --> 00:01:40,190
But part of her job was also to assure
safe and humane treatment of hundreds
20
00:01:40,190 --> 00:01:45,379
of detainees. So that’s good at least. But
then, a few years after that, she went and
21
00:01:45,379 --> 00:01:50,389
worked for a different company called
Exelis in Afghanistan. And this job
22
00:01:50,389 --> 00:01:55,580
was quite different. It involved finding
people to kill. So she says as part
23
00:01:55,580 --> 00:01:59,840
of this work that she “utilized F3EA
methodology to conduct analysis on raw and
24
00:01:59,840 --> 00:02:05,320
fused HUMINT, SIGINT, and COMINT helping
to create 125 Targeting Support Packets
25
00:02:05,320 --> 00:02:09,299
then nominated to the Joint Priority
Effects List (JPEL) for kinetic targeting.”
26
00:02:09,299 --> 00:02:14,280
So there’s a lot of not very obvious terms
and gibberish there. And this is a pretty
27
00:02:14,280 --> 00:02:17,750
common problem when going through these
résumés. So I want to break down how you
28
00:02:17,750 --> 00:02:22,849
would interpret that sentence. “Signals
Intelligence” is what the NSA does.
29
00:02:22,849 --> 00:02:28,129
It’s collecting data from intercepted
communications. COMINT – Communications
30
00:02:28,129 --> 00:02:31,449
Intelligence – is specifically Signals
Intelligence from communication data.
31
00:02:31,449 --> 00:02:35,420
So what the NSA does
when they read your email.
32
00:02:35,420 --> 00:02:38,580
HUMINT, Human Intelligence is
Intelligence from human sources.
33
00:02:38,580 --> 00:02:45,650
So things like data gained
from informers or from torture.
34
00:02:45,650 --> 00:02:50,210
The “Joint Priority Effects List” is a list of
people the US military and its allies are
35
00:02:50,210 --> 00:02:54,720
trying to kill and capture in Afghanistan.
36
00:02:54,720 --> 00:02:58,740
F3EA stands for “Find, Fix, Finish,
Exploit and Analyze”. It’s a rapid
37
00:02:58,740 --> 00:03:02,990
intelligence collection and analysis
methodology used for targeting. And
38
00:03:02,990 --> 00:03:06,670
we recently found out in the Drone
Papers that this is often used for
39
00:03:06,670 --> 00:03:12,869
drone targeting. And “Kinetic Targeting”
simply means attacking a moving target.
40
00:03:12,869 --> 00:03:16,800
So looking at her profile again: she says
that she “utilized F3EA methodology
41
00:03:16,800 --> 00:03:20,819
to conduct analysis on raw and fused
HUMINT, SIGINT and COMINT helping to
42
00:03:20,819 --> 00:03:24,899
create 125 Targeting Support Packets
then nominated to the Joint Priority
43
00:03:24,899 --> 00:03:28,670
Effects List for kinetic targeting.” Basically
what she means is that based on
44
00:03:28,670 --> 00:03:32,759
intercepted communications and information
from human sources, possibly gained under
45
00:03:32,759 --> 00:03:38,560
duress from torture, she is deciding
who should be killed and captured.
46
00:03:42,755 --> 00:03:48,659
The Intelligence Community has long
had an attitude of “Collect It All”.
47
00:03:48,659 --> 00:03:52,670
And General [Keith B.] Alexander
started trying to collect all the data
48
00:03:52,670 --> 00:03:58,400
that they could from every source.
One of the first projects to this end
49
00:03:58,400 --> 00:04:02,700
was something called Real Time Regional
Gateway (RT-RG). It’s a massive project to
50
00:04:02,700 --> 00:04:07,949
store, combine, search and analyze data
from many different sources at once.
51
00:04:07,949 --> 00:04:11,530
Everything from intercepted communications
to data from drones to data from
52
00:04:11,530 --> 00:04:17,930
interrogations to even mundane things like
traffic patterns and the price of potatoes.
53
00:04:17,930 --> 00:04:22,970
They started this program in 2005.
The initial version was built by SAIC
54
00:04:22,970 --> 00:04:27,270
for use in Iraq. And these days it’s
mostly used in Afghanistan.
55
00:04:27,270 --> 00:04:31,520
It isn’t just the US, though, because according
to documents published in “Der SPIEGEL”
56
00:04:31,520 --> 00:04:38,479
last year Germany is the 3rd largest
contributor to RT-RG. This sort
57
00:04:38,479 --> 00:04:41,400
of collection and analysis tool is used
for some programs that you might have
58
00:04:41,400 --> 00:04:47,130
heard of too, like CoTraveller – the
program the NSA has to figure out who is
59
00:04:47,130 --> 00:04:52,380
going places with who else. And there is
a specific analytic tool that’s part of
60
00:04:52,380 --> 00:04:57,579
RT-RG called SIDEKICK that uses relative
velocities to calculate this from many
61
00:04:57,579 --> 00:05:01,590
different data sources, so that they can
calculate that for people across networks.
62
00:05:01,590 --> 00:05:04,030
Unfortunately, this is really
computationally intensive because they
63
00:05:04,030 --> 00:05:09,459
need to pre-compute all of the travel
behaviour for all the pairs of selectors.
64
00:05:09,459 --> 00:05:12,500
But it’s feasible for them to do
computationally intensive things like
65
00:05:12,500 --> 00:05:18,199
that because it’s built on
Hadoop and Accumulo for distributed data
66
00:05:18,199 --> 00:05:27,380
processing and storage. So they’re quite
serious about this. The goals for RT-RG
67
00:05:27,380 --> 00:05:33,150
are quite lofty. One of the creators, in
an interview with “Defense News” described
68
00:05:33,150 --> 00:05:37,240
their aim as being able to use intercepted
communications and integrate it with
69
00:05:37,240 --> 00:05:42,000
signals with geolocation. So that they can
instantly find people and target them.
70
00:05:42,000 --> 00:05:47,200
Another counter-terrorism official told
the Wall Street Journal that RT-RG
71
00:05:47,200 --> 00:05:53,079
literally allows them to predict the
future. Decorrelation means it’s the
72
00:05:53,079 --> 00:05:56,890
strongest correlation tool ever. So their
goals of this seem to be two-fold: First
73
00:05:56,890 --> 00:06:02,990
of all to be able to kill or smite any
potential enemies. And 2nd one to be
74
00:06:02,990 --> 00:06:07,970
omniscient. To know everything that’s
happening at once. And to correlate it and
75
00:06:07,970 --> 00:06:13,300
use that to predict what will happen in the
future. And these goals sound a little bit beyond
76
00:06:13,300 --> 00:06:18,560
what you would expect from someone
who is trying to simply protect people or
77
00:06:18,560 --> 00:06:21,569
stop terrorism. It sounds more like
they’re trying to become some sort
78
00:06:21,569 --> 00:06:26,539
of God. Who by collecting and analyzing
everything know everything that’s
79
00:06:26,539 --> 00:06:32,280
happening everywhere and can just smite
any enemies from above. Instantly.
80
00:06:32,280 --> 00:06:37,330
But the thing is they aren’t a God. There are
people working on these and they're
81
00:06:37,330 --> 00:06:40,289
normal people. And they’ve crazy
resources and they intercept
82
00:06:40,289 --> 00:06:44,460
a lot of data. But they also use data
that’s freely available to anyone for
83
00:06:44,460 --> 00:06:49,860
a lot of their work. Open Source
Intelligence. This is a pamphlet from
84
00:06:49,860 --> 00:06:55,270
a startup called ZeroFox that uses data
from Social Media to track ISIS.
85
00:06:55,270 --> 00:07:00,019
And tools like this are quite common.
There’s another tool called “LM Wisdom”
86
00:07:00,019 --> 00:07:03,620
that’s made by Lockheed Martin. And
they have a wonderful promotional video
87
00:07:03,620 --> 00:07:08,699
on their website explaining exactly how it
works – that I’d like to play.
88
00:07:08,699 --> 00:07:11,960
with lowered voice:
Hopefully this’ll work…
89
00:07:11,960 --> 00:07:15,819
audio/video starts Female Narrator:
Social Media content has the power
90
00:07:15,819 --> 00:07:19,300
to incite organized movements
and sway political outcomes.
91
00:07:19,300 --> 00:07:22,879
Person in Video: “It’s an opposition
terrorist organization in Iran.”
92
00:07:22,879 --> 00:07:26,259
Female Narrator: Monitoring and analyzing
the massive and rapidly changing
93
00:07:26,259 --> 00:07:31,210
open source intelligence data, or OSINT,
and turning it into actionable intelligence
94
00:07:31,210 --> 00:07:37,180
for decision-makers is an imperative.
Lockheed Martin’s Wisdom software suite
95
00:07:37,180 --> 00:07:42,199
offers an advanced capability to collect,
manage and analyze vast amounts
96
00:07:42,199 --> 00:07:47,620
of open source data. Enabling analysts
to understand, measure and anticipate
97
00:07:47,620 --> 00:07:52,039
real-world events through Social Media.
Person in Video: “Think of Wisdom as your
98
00:07:52,039 --> 00:07:58,520
eyes and ears on the web. Wisdom is
that tool that would allow it to do this
99
00:07:58,520 --> 00:08:00,400
at scale!”
Female Narrator: Wisdom’s advanced
100
00:08:00,400 --> 00:08:05,319
Big Data collection capability and data
store automatically identify and harvest
101
00:08:05,319 --> 00:08:09,479
online Social Networking data of
operational interest. As well as
102
00:08:09,479 --> 00:08:14,810
socio-cultural data from standard online
open sources like newspaper feeds and
103
00:08:14,810 --> 00:08:20,110
structured databases. Wisdom’s high-
performance analytic algorithms analyze
104
00:08:20,110 --> 00:08:25,510
the content in near realtime distinguishing
noise from high-value information.
105
00:08:25,510 --> 00:08:30,980
Capturing trends, sentiment and influence;
turning open source data into predictive,
106
00:08:30,980 --> 00:08:36,030
actionable intelligence.
audio/video stops
107
00:08:36,030 --> 00:08:37,210
M.C.: Yeah, so…
applause
108
00:08:37,210 --> 00:08:41,259
…that’s what they’re doing. And they’re
not just using this to target terrorists.
109
00:08:41,259 --> 00:08:46,450
It was recently revealed that they are
helping Walmart use this to find employees
110
00:08:46,450 --> 00:08:50,230
that are organizing for better working
conditions and find the main organizers
111
00:08:50,230 --> 00:08:53,820
and fire them. Using
data from Social Media.
112
00:08:53,820 --> 00:08:59,320
So it’s used for Corporate purposes as
well. And LM Wisdom wasn’t even made
113
00:08:59,320 --> 00:09:02,620
for surveillance in the first place.
I tracked down one of the people
114
00:09:02,620 --> 00:09:09,020
who created it. And at that time he worked
for General Electric and was hoping to
115
00:09:09,020 --> 00:09:14,320
make a… to help NBC make tools so
that they can figure out which sites
116
00:09:14,320 --> 00:09:19,740
to partner with to make their videos go
viral. So it’s not just governments that
117
00:09:19,740 --> 00:09:22,959
are using Open Source Intelligence because
there’s no barriers to access it and
118
00:09:22,959 --> 00:09:27,510
there’s many applications. There’s even
many people-search databases that
119
00:09:27,510 --> 00:09:31,120
have information like people’s address,
and phone number, and relatives,
120
00:09:31,120 --> 00:09:35,320
and how old they are. And these include
many, many people. Probably everyone
121
00:09:35,320 --> 00:09:39,230
in the US. And they’re used by many people
for all sorts of purposes from private
122
00:09:39,230 --> 00:09:47,839
detectives to people that are selling
advertisements. If this data is available
123
00:09:47,839 --> 00:09:53,459
already and it’s used for everything from
figuring out who to kill to stopping unions
124
00:09:53,459 --> 00:09:57,440
from organizing to trying to sell things
to people – why can’t we use it to
125
00:09:57,440 --> 00:10:00,529
understand surveillance programs, too?
Why can’t we use it to understand human
126
00:10:00,529 --> 00:10:05,170
rights abuses? Why not use it for
accountability? So we started to build
127
00:10:05,170 --> 00:10:09,940
tools to do this and in the near future
we’d like to make it possible for anyone
128
00:10:09,940 --> 00:10:14,400
to make something like ICWATCH or other
databases in less than a day and without
129
00:10:14,400 --> 00:10:19,560
programming. The long-term goal is to build
software similar to what the Intelligence
130
00:10:19,560 --> 00:10:24,310
Community has. Things similar to LM-Wisdom,
things similar to Real Time Regional Gateway.
131
00:10:24,310 --> 00:10:29,779
So that people can collect all this
information in one place and analyze it.
132
00:10:29,779 --> 00:10:33,389
I’d like to show a demo of some of the
tools that we’ve been working on. It’s
133
00:10:33,389 --> 00:10:41,110
possible this won’t work at all,
but we’ll see. So this is Harvester. It’s
134
00:10:41,110 --> 00:10:48,660
a tool for collecting data from online
sources in an automated fashion. You can
135
00:10:48,660 --> 00:10:53,200
choose different data sources, say
“Indeed” – this is a résumé website – and
136
00:10:53,200 --> 00:10:58,240
say you want to find anyone who mentioned
XKeyscore and for sake of timing let’s
137
00:10:58,240 --> 00:11:08,160
just get people in Maryland. And “start
collecting”, and it might take a second
138
00:11:08,160 --> 00:11:12,920
because it’s still a bit rough. But it
opens a browser, goes and finds the people
139
00:11:12,920 --> 00:11:19,069
who mention XKeyscore in Maryland and it
goes and downloads all of their résumés
140
00:11:19,069 --> 00:11:24,149
in one place… you can kind of see them
as they download because this is being
141
00:11:24,149 --> 00:11:48,709
slowed down a bit right now. That’s just
XKeyscore, so it’s fairly small.
142
00:11:48,709 --> 00:11:57,699
Something shouted from out of the audience
M.C.: laughs
143
00:11:57,699 --> 00:12:02,060
applause
144
00:12:05,800 --> 00:12:12,350
Takes a second to load,
still kind of rough…
145
00:12:12,350 --> 00:12:18,930
Yeah, so we’re hoping to add many different
data sources, so that people can collect
146
00:12:18,930 --> 00:12:22,690
data from sources online as well as just
take a pile of PDFs on their computer,
147
00:12:22,690 --> 00:12:26,570
point at the directory and it will load
them and OCR them and people will be able
148
00:12:26,570 --> 00:12:31,470
to search through them
in a searchable database.
149
00:12:31,470 --> 00:12:35,549
So while this is loading why don’t I go
and walk through some of the rest of the
150
00:12:35,549 --> 00:12:40,020
pipeline. So our goal is to have tools
for collecting data, loading it into
151
00:12:40,020 --> 00:12:46,770
a database; and then tools for matching
data across various sources on the same
152
00:12:46,770 --> 00:12:50,220
person or the same company. So it should
take someone’s résumés and Social Media
153
00:12:50,220 --> 00:12:54,130
profiles and everything and link it
together and then also link that to the
154
00:12:54,130 --> 00:12:57,180
companies they work(ed) for, the other
people they know, the locations they’ve
155
00:12:57,180 --> 00:13:01,540
lived. As well as tools for extracting
things from data. So to be able to go
156
00:13:01,540 --> 00:13:04,330
through a résumé, extract all the code
words mentioned, to be able to go through
157
00:13:04,330 --> 00:13:08,019
a document and extract all the
companies mentioned and generating
158
00:13:08,019 --> 00:13:13,190
entities that way. And tools for searching
through data in databases where you can
159
00:13:13,190 --> 00:13:17,699
search for search queries and browse by
categories. And for viewing data in
160
00:13:17,699 --> 00:13:23,649
network graphs and maps. Let’s see if this
is done… Right now it just shows the
161
00:13:23,649 --> 00:13:32,540
raw JSON. The connection between tools
is a bit rough. But we should be able to
162
00:13:32,540 --> 00:13:41,240
index the data and load it into a search
tool. Will take a second. Hopefully this
163
00:13:41,240 --> 00:14:05,760
works. Oh, it’s going! Yeah… So it takes
a little bit. Index… And you can see…
164
00:14:05,760 --> 00:14:13,699
The data will be at… It kind of circle
loaded into a subscriptions list…
165
00:14:13,699 --> 00:14:17,310
So there’s a searchable database on all the
people who are working on XKeyscore
166
00:14:17,310 --> 00:14:27,400
in Maryland!
applause, cheers from audience
167
00:14:27,400 --> 00:14:33,100
So I think that in doing this, Free
Software and open data are really key
168
00:14:33,100 --> 00:14:38,070
because we have far, far fewer resources
than the Intelligence Community. And we
169
00:14:38,070 --> 00:14:41,240
don’t even have the resources that a
company like Lockheed Martin has. We can’t
170
00:14:41,240 --> 00:14:45,269
internally build all of this software, or
hope that we will anticipate every future
171
00:14:45,269 --> 00:14:50,609
use to be able to help people adapt to
that. Having people be able to take our
172
00:14:50,609 --> 00:14:54,199
data, take our tools and adapt it to their
own situations is absolutely key to
173
00:14:54,199 --> 00:14:58,380
actually ensuring that they’re useful. And
there are also a lot of open source tools
174
00:14:58,380 --> 00:15:01,269
that the Intelligence Community has
released, like Accumulo, the thing
175
00:15:01,269 --> 00:15:05,399
that’s used in Real Time Regional Gateway.
It was released by the NSA and made open
176
00:15:05,399 --> 00:15:11,029
source. And Gaffer which is a graph
database recently released by GCHQ.
177
00:15:11,029 --> 00:15:15,660
So we can sort of take those and possibly
also build on those in some cases.
178
00:15:15,660 --> 00:15:17,940
As well as using the same tools
chuckles
179
00:15:17,940 --> 00:15:22,050
And it’s appropriate because our goal is
to enable people to collect and use
180
00:15:22,050 --> 00:15:27,529
information in the same way that the
Intelligence Community can.
181
00:15:27,529 --> 00:15:31,880
But while I think that we should aim
to collect it all and collect all the
182
00:15:31,880 --> 00:15:35,009
information that we can, I think we also
need to be careful to avoid a lot of the
183
00:15:35,009 --> 00:15:39,740
mistakes that the Intelligence Community
has made. Because some of the effects are
184
00:15:39,740 --> 00:15:45,550
quite bad and lead to people being killed
for no reason at all. And – it’s quite
185
00:15:45,550 --> 00:15:49,729
absurd. And the main one of these,
I think, is de-humanizing people.
186
00:15:49,729 --> 00:15:53,370
Torture techniques are specifically
designed to de-humanize people.
187
00:15:53,370 --> 00:15:56,100
When people are looking at data that
they’ve intercepted, they’re not looking
188
00:15:56,100 --> 00:15:59,569
at a person, they’re looking at meta-data,
they’re looking at numbers on a screen.
189
00:15:59,569 --> 00:16:05,819
It’s not something that’s easy to find a
way around. When I was working on ICWATCH
190
00:16:05,819 --> 00:16:11,410
I was grappling with this problem quite a
bit. So I decided to try to see who some
191
00:16:11,410 --> 00:16:15,649
of these people are and try to put faces
to these issues. So I started going to
192
00:16:15,649 --> 00:16:19,440
Intelligence conferences. Many of these
conferences are quite open and you can
193
00:16:19,440 --> 00:16:24,490
just go in. And I wasn’t that out of place
either, I just told people that I made
194
00:16:24,490 --> 00:16:27,430
tools to collect and analyze
Open Source Intelligence.
195
00:16:27,430 --> 00:16:29,139
laughter and applause
197
00:16:35,590 --> 00:16:38,080
There’re many people doing similar
things out there, too. Like I met the
198
00:16:38,080 --> 00:16:42,409
ZeroFox people who were one of the examples
I showed earlier at one of these conferences.
199
00:16:42,409 --> 00:16:45,409
They are actually very, very nice. And
200
00:16:45,409 --> 00:16:48,139
there were also some people who were quite
interested in what I was doing. There was
201
00:16:48,139 --> 00:16:50,970
one recruiter from Northrop Grumman who
seemed somewhat interested in hiring me
202
00:16:50,970 --> 00:16:54,300
and I looked her up later and found
a bunch of job listings where she was
203
00:16:54,300 --> 00:16:59,159
trying to hire people who… to work on
programs related to XKeyscore. It wasn't
204
00:16:59,159 --> 00:17:03,639
all good, I got kicked out of one conference.
I got some strange requests like there was
205
00:17:03,639 --> 00:17:09,690
one guy who was trying to figure out how to
use open data to help venture capitalists
206
00:17:09,690 --> 00:17:15,170
figure out what porn the founders of the
startups they funded watched. I’m not sure
207
00:17:15,170 --> 00:17:18,109
that’s even possible. But it was really
weird and he was asking me for help and
208
00:17:18,109 --> 00:17:20,260
I was like “I don’t think I can
help with that, sorry!”
209
00:17:20,260 --> 00:17:27,160
laughter and applause
210
00:17:27,160 --> 00:17:30,940
Of course there were some negative comments
on things like Manning and Snowden
211
00:17:30,940 --> 00:17:33,990
and some confusion like there was someone
who is making insider threat detection
212
00:17:33,990 --> 00:17:39,130
software, who was talking about how it
would stop a situation like when Snowden
213
00:17:39,130 --> 00:17:43,070
leaked documents to Wikileaks and
things like that. So people don’t actually
214
00:17:43,070 --> 00:17:46,280
know what’s going on. But generally most
of them were decent people and some of
215
00:17:46,280 --> 00:17:49,250
them were quite nice, some of them were
quite funny. And some of them really
216
00:17:49,250 --> 00:17:52,570
seemed to think that what they were doing
is saving lives. So they’re not evil people
217
00:17:52,570 --> 00:17:57,540
who want to hurt others but they’re not
infallible either. They’re human beings.
218
00:17:57,540 --> 00:18:02,800
And our strategy – looking at individuals
– scares a lot of people. But what you
219
00:18:02,800 --> 00:18:09,810
have to realize is that institutions are
made up of people. It’s easier to just
220
00:18:09,810 --> 00:18:12,810
look at the institution. It’s easier to
just look at an abstract program. Just
221
00:18:12,810 --> 00:18:15,590
like it’s easier not to think of the
person who you just decided to kill in a
222
00:18:15,590 --> 00:18:21,430
drone strike as a person. That’s why these
things continue to happen. I think that
223
00:18:21,430 --> 00:18:24,520
there’s a lot of benefit to looking at
people as people, both to avoid some of
224
00:18:24,520 --> 00:18:28,970
the problems the Intelligence Community
has as well as because people’s data trails
225
00:18:28,970 --> 00:18:31,780
are part of the data trails of the
institutions. And if we’re only looking at
226
00:18:31,780 --> 00:18:36,490
institutions we’re missing part of the
data trail the people leave.
227
00:18:36,490 --> 00:18:40,690
Though, of course, no one person is
responsible for the wrong-doings of the
228
00:18:40,690 --> 00:18:46,900
Intelligence Community. So we shouldn’t
demonize any one person. But…
229
00:18:46,900 --> 00:18:49,650
these are the people who go to work every
day and perpetuate the actions of the
230
00:18:49,650 --> 00:18:54,810
Intelligence Community. So I think everyone
involved is a little bit at fault.
231
00:18:54,810 --> 00:18:57,950
And the other benefit of looking at people
as people is that we can start to
232
00:18:57,950 --> 00:19:01,220
understand them. Because you have to
understand what their hopes are, what
233
00:19:01,220 --> 00:19:05,330
their fears are. How they see the world.
What upsets them. And what might cause
234
00:19:05,330 --> 00:19:08,920
them to change their behaviour. And from
that we can start to maybe come up with
235
00:19:08,920 --> 00:19:13,150
alternatives. So let’s look at some of
these people and look at some of their
236
00:19:13,150 --> 00:19:21,960
stories. This is Jason Epperson. He works
on Intelligence collection for Special
237
00:19:21,960 --> 00:19:27,420
Operations. In his spare time he enjoys
coaching children’s sports. He currently
238
00:19:27,420 --> 00:19:32,050
works at the US Special Ops Command
(USSOCOM) helping different agencies
239
00:19:32,050 --> 00:19:35,190
collect data, share it, and figure out
what data they need, just generally
240
00:19:35,190 --> 00:19:39,340
helping them integrate it. But he
started his career back in 1998, also
241
00:19:39,340 --> 00:19:43,950
working on collecting data for Special
Operations. Then later, in 2004, he went
242
00:19:43,950 --> 00:19:49,650
to work at the US Central Command in the
NSA cryptologic services group and he was
243
00:19:49,650 --> 00:19:53,330
focused on tracking down high-value
targets and individuals. And he claimed
244
00:19:53,330 --> 00:19:56,710
that as a result of his work, numerous
high-value individuals were captured
245
00:19:56,710 --> 00:20:03,990
or killed. It is especially interesting
because he was working on this in 2007
246
00:20:03,990 --> 00:20:09,330
when PRISM was launched and at the top
of his résumé he lists in his specialties
247
00:20:09,330 --> 00:20:14,620
PRISM as “possible”, so that’s kind of a
dinagra but based on his background it
248
00:20:14,620 --> 00:20:20,640
might not be. So I think it probably is
actually PRISM.
249
00:20:20,640 --> 00:20:27,530
Then after he was working there he went
and started working on counter-radicalization
250
00:20:27,530 --> 00:20:31,030
efforts – things like boosting the
capacity of Muslim Faith Leaders to win
251
00:20:31,030 --> 00:20:33,910
hearts and minds and establishing
competing social networks to counter
252
00:20:33,910 --> 00:20:37,150
Al Qaeda ideology and he’s very clear in
his job description that he’s not killing
253
00:20:37,150 --> 00:20:43,480
people, he’s just helping allies of the US
figure out who to set Interpol notices for.
254
00:20:43,480 --> 00:20:46,790
But the most interesting thing about him
isn’t any of his jobs. It’s this
255
00:20:46,790 --> 00:20:50,940
publication that he has at the bottom of
his résumé called “An Examination of the
256
00:20:50,940 --> 00:20:55,980
Effect of Government Data Mining on US
Citizens”. And this is clearly an area where
257
00:20:55,980 --> 00:21:00,470
he has a lot of expertise. And he
presented this at a conference back in
258
00:21:00,470 --> 00:21:04,810
2010. I still don’t have a copy yet. It’s
not easily available. I think it might be
259
00:21:04,810 --> 00:21:09,630
possible to get either by buying it from
the company directly or by going to the
260
00:21:09,630 --> 00:21:14,820
Library of Congress that seems to have
some copies of the conference proceedings.
261
00:21:14,820 --> 00:21:19,670
That could be quite interesting. Both
because he was relatively high up, he was
262
00:21:19,670 --> 00:21:23,700
in command of nearly 400 people back when
PRISM started and he was working with the
263
00:21:23,700 --> 00:21:27,840
NSA. It’s possible that he had some role
early on in the program and this might
264
00:21:27,840 --> 00:21:33,790
provide some clues. And then also the
little “data mining on US Citizens” bit
265
00:21:33,790 --> 00:21:36,910
in the title is kind of interesting
because that’s supposed to be the last
266
00:21:36,910 --> 00:21:40,500
protection – I think that’s kind of a super
protection because most US citizens
267
00:21:40,500 --> 00:21:43,200
wouldn’t find it very comforting if the
Chinese Government said: “Oh yeah, we have
268
00:21:43,200 --> 00:21:47,420
a mass surveillance program but we only
spy on people who aren’t Chinese citizens.”
269
00:21:47,420 --> 00:21:50,680
That’s not really comforting to them, so I
don’t see why it would be. But it’s been
270
00:21:50,680 --> 00:21:54,800
the one thing that people were repeating.
“We don’t collect it on US citizens”. And
271
00:21:54,800 --> 00:21:59,960
just seeing that on the title of a paper
is like a tiny admission that maybe they
272
00:21:59,960 --> 00:22:08,240
do. So some of these profiles tell other
interesting stories about people’s lives.
273
00:22:08,240 --> 00:22:11,760
If you’ve seen any of my other talks, this
is someone you’ve heard me talk about
274
00:22:11,760 --> 00:22:15,920
a lot. Solomon Varnado. He spent most of
his life in the military intelligence
275
00:22:15,920 --> 00:22:20,190
community, focused on Signals Intelligence
and Geolocation. He took down his résumé
276
00:22:20,190 --> 00:22:25,960
after ICWATCH launched. But I actually
recently found another résumé of his on
277
00:22:25,960 --> 00:22:31,070
another website that has additional
information like on the side in the
278
00:22:31,070 --> 00:22:35,580
military he ran diversity programs and a
sexual assault prevention program and
279
00:22:35,580 --> 00:22:39,070
things like that. I first came across this
profile because he mentions a lot of
280
00:22:39,070 --> 00:22:45,010
interesting code words. This is probably
the first known mention of XKeyscore back
281
00:22:45,010 --> 00:22:54,610
in 2004/2005. But these aren’t the most
interesting part of his résumé. Later on
282
00:22:54,610 --> 00:22:58,230
he… after he works on Intelligence
Collection Management – just Standard
283
00:22:58,230 --> 00:23:05,170
Signals Intelligence Collection – he goes
and he works for L-3 Stratis. And there he
284
00:23:05,170 --> 00:23:08,550
says that he identified, collected, and
performed direction finding
285
00:23:08,550 --> 00:23:13,000
of specified target signals using
PENNANTRACE, DISPLAYVIEW and CEGS.
286
00:23:13,000 --> 00:23:14,450
But I wasn't sure what “PENNANTRACE” was
287
00:23:14,450 --> 00:23:17,200
so I found a definition
very conveniently located in
288
00:23:17,200 --> 00:23:21,800
another résumé. That said it was an
airborne collection platform for PENNANTRACE.
289
00:23:21,800 --> 00:23:27,500
That sounds like some sort of
Signals Intelligence collection platform.
290
00:23:27,500 --> 00:23:31,760
And the other interesting thing about this
job is that he said that he called for
291
00:23:31,760 --> 00:23:35,720
external review of intelligence management
processes which is not something I see
292
00:23:35,720 --> 00:23:39,130
normally. And he was there for a fairly
short time, only a couple of months.
293
00:23:39,130 --> 00:23:43,170
After staying at most of his other jobs
for over a year. And then at his next job
294
00:23:43,170 --> 00:23:44,900
he was also there for
only a couple of months.
295
00:23:44,900 --> 00:23:47,540
He was working at Pluribus International,
also on Drone Intelligence,
296
00:23:47,540 --> 00:23:50,470
this time definitely Drone Intelligence,
on Predator drones because he
297
00:23:50,470 --> 00:23:54,370
mentions Airhandler which we now know
more about thanks to the catalogue
298
00:23:54,370 --> 00:23:58,320
released by The Intercept. It’s a
299
00:23:58,320 --> 00:24:02,290
geo-processing system for geolocation
data from Predator drones.
300
00:24:02,290 --> 00:24:06,330
And the update to ICWATCH
includes all the data on all of the words
301
00:24:06,330 --> 00:24:13,610
mentioned in that catalogue. And then
he leaves the Intelligence Community
302
00:24:13,610 --> 00:24:19,090
entirely after that job. And he goes and
works as a used car salesman at this used
303
00:24:19,090 --> 00:24:23,160
car dealership. And it turns out he is
actually – found him on this other résumé
304
00:24:23,160 --> 00:24:25,580
that I just found – He’s actually quite
a successful used car salesman.
305
00:24:25,580 --> 00:24:27,760
He’s won a bunch of awards.
He’s one of the best
306
00:24:27,760 --> 00:24:30,740
salesmen in the region. So he’s doing quite
well. And he won a bunch of awards
307
00:24:30,740 --> 00:24:32,420
and he's in the military too,
so it seems like
308
00:24:32,420 --> 00:24:35,730
he’s very committed to what he
does. But still that’s quite a huge career
309
00:24:35,730 --> 00:24:39,880
change and it sounds like maybe he was
starting to get upset with some of how
310
00:24:39,880 --> 00:24:42,840
things are really being done and he
couldn’t figure out a way to fix it after
311
00:24:42,840 --> 00:24:46,840
calling for external review
so he just left.
312
00:24:49,010 --> 00:24:54,190
applause
313
00:24:54,190 --> 00:25:02,360
And then, this is Michael Dial. Michael
Dial is a pipe fitter and a plumber. And
314
00:25:02,360 --> 00:25:08,400
this is him with his family. He’s actually
a pipe fitter and a plumber. But he’s not
315
00:25:08,400 --> 00:25:13,780
just any pipe fitter. He has security
clearance. And he goes and he fits pipes
316
00:25:13,780 --> 00:25:17,990
in secure facilities. As you might expect
he does a lot of pipe fitting for naval
317
00:25:17,990 --> 00:25:27,080
ships. He also does things like he goes to
embassies and other secret locations in
318
00:25:27,080 --> 00:25:38,170
Afghanistan and Iraq, Ecuador, Serbia
and sets up their pipes. He also did some
319
00:25:38,170 --> 00:25:43,620
pipe fitting in Djibouti at some sort of
Homeland Security facility which
320
00:25:43,620 --> 00:25:50,170
coincidentally is also where many of the
drone programs are run out of. So there’s
321
00:25:50,170 --> 00:25:54,640
some interesting cases like that, where
there are people like Michael Dial who
322
00:25:54,640 --> 00:25:59,020
aren’t involved in Intelligence at all,
directly. But the information in the
323
00:25:59,020 --> 00:26:04,960
résumés still provides very interesting
useful details about where secret
324
00:26:04,960 --> 00:26:07,880
facilities are located and other aspects
of the Intelligence Community. Because
325
00:26:07,880 --> 00:26:11,090
secret facilities don’t just materialize
out of thin air. They need people to build
326
00:26:11,090 --> 00:26:15,750
them, they need people to operate them.
So from tracking down these people we can
327
00:26:15,750 --> 00:26:18,740
start to map them. And then there’re other
useful things like we can figure out which
328
00:26:18,740 --> 00:26:25,740
companies clean the NSA. I’m sure that
has all sorts of useful applications.
329
00:26:25,740 --> 00:26:33,850
This is Eleana Costa. He lives in D.C. and
he works for the DOD. And this is him at his
330
00:26:33,850 --> 00:26:38,340
High School Graduation back in 1988. He
has been working in Military and
331
00:26:38,340 --> 00:26:45,240
Intelligence for nearly 20 years. And back
in 2003, he worked on PSYOP programs.
332
00:26:45,240 --> 00:26:50,880
Specifically he worked on PSYOP programs
in Paraguay, Colombia and Bolivia. And
333
00:26:50,880 --> 00:26:55,970
these were in support of the DEA, the Drug
Enforcement Agency, and the CIA.
334
00:26:55,970 --> 00:26:59,260
And there are a few other résumés in ICWATCH
that mention involvement in PSYOPs in
335
00:26:59,260 --> 00:27:04,480
Latin America for the DEA. It seems to me
quite an extensive thing, especially since
336
00:27:04,480 --> 00:27:08,900
I didn’t collect any data on this
specifically, and I had just suddenly a bunch
337
00:27:08,900 --> 00:27:13,950
of people in the database on this, so
maybe worth looking into a bit. And then
338
00:27:13,950 --> 00:27:17,320
after that he went and he worked on PSYOP
programs in Iraq. So it’s kind of
339
00:27:17,320 --> 00:27:22,120
interesting. Then he went and worked
at the DOD on Human Intelligence.
340
00:27:22,120 --> 00:27:27,240
The other interesting thing about Kiliana
Costa is that he’s one of the people who
341
00:27:27,240 --> 00:27:34,010
deleted his résumé after ICWATCH
launched and that was how I found him.
342
00:27:34,010 --> 00:27:41,090
laughter and applause
343
00:27:41,090 --> 00:27:46,050
So after ICWATCH launched a lot of people
were positively interested in it, but we
344
00:27:46,050 --> 00:27:49,180
also got a lot of threats because… it’s
really absurd, because all we’re doing is
345
00:27:49,180 --> 00:27:52,670
collecting information that people
explicitly, independently, willingly
346
00:27:52,670 --> 00:27:56,720
posted online about their profession;
we’re not posting addresses or
347
00:27:56,720 --> 00:28:02,930
anything like that. And making it more
searchable. Just like Google does.
348
00:28:02,930 --> 00:28:07,200
But a lot of people in the Intelligence
Community contacted us and for the first
349
00:28:07,200 --> 00:28:11,730
few weeks, we saw a new response
every day. Some of these were kind of
350
00:28:11,730 --> 00:28:17,580
interesting and reveal some sort of
nonsensical mindsets of people in the
351
00:28:17,580 --> 00:28:25,330
Intelligence Community. Like this guy.
This is Alexander Irinovitch. He sent me
352
00:28:25,330 --> 00:28:29,380
a…, actually a nice email, a very nice
email. It was really nice. Saying that he
353
00:28:29,380 --> 00:28:32,740
couldn’t understand why he was in ICWATCH
because he wasn’t involved in surveillance.
354
00:28:32,740 --> 00:28:36,610
He was working at a private company that
had nothing to do with surveillance.
355
00:28:36,610 --> 00:28:42,750
So I looked at his profile and I saw that
he was working at Unit 8200, the Israeli
356
00:28:42,750 --> 00:28:46,930
Intelligence unit which, okay, there is
mandatory military service, so not that
357
00:28:46,930 --> 00:28:50,810
weird, though he was there for several
years, not just the mandatory portion,
358
00:28:50,810 --> 00:28:57,800
and this is the Intelligence unit that
spies on Palestinians. And then I looked
359
00:28:57,800 --> 00:29:02,700
at where he works now. And he works for a
company called Verint. According to their
360
00:29:02,700 --> 00:29:09,160
website they make software for analyzing
data from wiretaps. So I think that has to
361
00:29:09,160 --> 00:29:13,220
do with surveillance. I’m not sure why he
interpreted that as “nothing to do with
362
00:29:13,220 --> 00:29:16,940
surveillance”. But it’s kind of an interesting
interpretation. I think it makes sense for him
363
00:29:16,940 --> 00:29:20,220
to be in the database, but of course,
for any particular profile, there is
364
00:29:20,220 --> 00:29:23,140
some noise. So it’s up to whoever
is looking at it to make the call
365
00:29:23,140 --> 00:29:26,050
and do the research.
366
00:29:26,050 --> 00:29:30,040
And sometimes other people who complained
also helped us find interesting details.
367
00:29:30,040 --> 00:29:34,420
Like this guy, Joshua Lively. He’s one of
the people who reported us to the FBI for
368
00:29:34,420 --> 00:29:43,120
domestic terrorism. He worked as a
linguist at this company. I looked at
369
00:29:43,120 --> 00:29:48,490
his profile and he mentions a lot
of interesting code words in it.
370
00:29:48,490 --> 00:29:51,750
Some of them didn’t make so much sense
at the time. This thing called ZB.
371
00:29:51,750 --> 00:29:55,740
And then a few weeks later The Intercept
released this article on a thing called
372
00:29:55,740 --> 00:30:03,830
SKYNET. It uses machine learning
to analyze travel data from telecom
373
00:30:03,830 --> 00:30:08,130
providers. And ZB is one of the databases
they use and he, coincidentally, has a lot
374
00:30:08,130 --> 00:30:12,130
of the databases that are used in this
listed in his skills. And as a linguist
375
00:30:12,130 --> 00:30:14,860
proficient with the language that’s used
in the region that’s mainly targeted
376
00:30:14,860 --> 00:30:18,510
in this… So I’m not sure if he’s involved
in this particular program. But it seems
377
00:30:18,510 --> 00:30:22,860
like he’s involved in something similar.
378
00:30:22,860 --> 00:30:28,160
So it’s quite interesting. Generally there
are a lot of angry people in the
379
00:30:28,160 --> 00:30:31,750
Intelligence Community. Some are nicer
than others and were just asking questions
380
00:30:31,750 --> 00:30:35,910
being like “Can you please take my profile
down!”, some others were more afraid, some others
381
00:30:35,910 --> 00:30:40,640
were more violent and sending things like
death threats. Our server started getting
382
00:30:40,640 --> 00:30:44,440
hit pretty hard and ICWATCH kept going
down. We wanted to be sure that we weren’t
383
00:30:44,440 --> 00:30:48,090
going to be compelled to take the data
down some way. And the easiest way not
384
00:30:48,090 --> 00:30:52,130
to be compelled to take the data down is
to make it so you can’t really take the
385
00:30:52,130 --> 00:30:55,700
data down yourself. And then people have
much less incentive to go after you.
386
00:30:55,700 --> 00:31:00,970
So we moved ICWATCH to WikiLeaks which has
been great, and they’ve been wonderful
387
00:31:00,970 --> 00:31:03,940
helping with all this. So thank you,
WikiLeaks!
388
00:31:03,940 --> 00:31:09,720
applause
389
00:31:09,720 --> 00:31:11,610
from the audience: You’re welcome!
390
00:31:11,610 --> 00:31:13,760
M.C.: chuckles
laughter
391
00:31:13,760 --> 00:31:17,500
As I mentioned earlier a lot of people are
taking down their résumés in response to
392
00:31:17,500 --> 00:31:24,700
ICWATCH. Specifically 1.030 people have,
out of the original 27.000. And others have
393
00:31:24,700 --> 00:31:29,120
edited them and made them private. So as
part of the update in addition to doubling
394
00:31:29,120 --> 00:31:35,050
the number of résumés available we also
recollected all of the initial résumés
395
00:31:35,050 --> 00:31:39,750
and you can go on the site and see which
ones are removed, which ones are made
396
00:31:39,750 --> 00:31:43,590
private, which ones have been modified and
all of that is flagged (?) so you can easily see
397
00:31:43,590 --> 00:31:50,540
how that’s changed.
applause
398
00:31:50,540 --> 00:31:55,330
And some of these revealed details that
people hadn’t posted… that many wish that
399
00:31:55,330 --> 00:32:00,760
they hadn’t posted in the first place. But
they also provide useful updates on where
400
00:32:00,760 --> 00:32:05,480
people are working. Because they let us
track people as they move from job to job.
401
00:32:05,480 --> 00:32:10,840
E.g. there’s this guy, Michael Acosta,
from the original ICWATCH. From 2011
402
00:32:10,840 --> 00:32:15,750
to 2012 he worked at Guantanamo. He
was primarily trying to find out about
403
00:32:15,750 --> 00:32:21,690
potential attacks on Guantanamo itself.
He monitored various detainees and
404
00:32:21,690 --> 00:32:27,660
collaborated with the Behavioural Science
Team and was trying to figure out if
405
00:32:27,660 --> 00:32:32,790
detainees were planning some sort of coup,
I guess. And then he started working for
406
00:32:32,790 --> 00:32:41,030
the Air Force. And here he was working on
Drone Intelligence and targeting and such
407
00:32:41,030 --> 00:32:44,230
things like how he was responsible for
“the production, maintenance and upgrade of
408
00:32:44,230 --> 00:32:47,960
DGS2 mission critical Intelligence
databases which include high value target
409
00:32:47,960 --> 00:32:52,550
development folders” like the things used
for JPEL targeting, regional fairbriefs,
410
00:32:52,550 --> 00:32:57,980
mission storyboards and mission target
logs with document FMV mission rollups.
411
00:32:57,980 --> 00:33:00,520
But the most interesting thing on this
résumé isn’t any of those things.
412
00:33:00,520 --> 00:33:05,510
It’s the thing that changed between the
original launch of ICWATCH and now.
413
00:33:05,510 --> 00:33:08,980
And that’s that he moved and started
working for a different company.
414
00:33:08,980 --> 00:33:14,160
He started working for this company
called… SOS International
415
00:33:14,160 --> 00:33:20,780
as an All Source Analyst. He unfortunately
had to leave the position that he had
416
00:33:20,780 --> 00:33:24,880
on the side coaching High School Baseball
which he seemed to really like.
417
00:33:24,880 --> 00:33:27,630
And he kind of liked it because right now
he’s looking for Baseball opportunities
418
00:33:27,630 --> 00:33:31,610
in Germany. So he seems to be in Germany
working for this company called SOS
419
00:33:31,610 --> 00:33:34,730
International that I never heard of
before. So I went on the website and they
420
00:33:34,730 --> 00:33:38,040
have a list of the cities that they
operate in, in Germany. These 6 cities,
421
00:33:38,040 --> 00:33:43,870
along with Guantanamo and a number of
other sketchy locations. And based on
422
00:33:43,870 --> 00:33:47,610
Michael Acosta’s past record of working at
Guantanamo and on Drone targeting and
423
00:33:47,610 --> 00:33:50,130
things like that it sounds like this
company is probably doing something quite
424
00:33:50,130 --> 00:33:56,450
sketchy. By tracking changes to where
people work we can start to find things
425
00:33:56,450 --> 00:34:00,360
like this that we might not otherwise think to
look at, that we might not otherwise think about
426
00:34:00,360 --> 00:34:03,070
as interesting.
427
00:34:03,070 --> 00:34:10,219
But it’s not just open data that we
collect. Because the same tools for
428
00:34:10,219 --> 00:34:13,549
collecting and analyzing open data
are also useful for other data sets.
429
00:34:13,549 --> 00:34:18,510
Like we made a search tool in
collaboration with Courage Foundation
430
00:34:18,510 --> 00:34:22,149
for all of the published Snowden documents
that allows you to search the full text of
431
00:34:22,149 --> 00:34:26,280
the documents, browse which code words
are in these documents, see documents that
432
00:34:26,280 --> 00:34:33,139
mention particular countries, see the full
PDFs and articles. And we also made a…
433
00:34:33,139 --> 00:34:37,230
when the Hacking Team data came out this
summer we mirrored the data and became one
434
00:34:37,230 --> 00:34:41,659
of the primary mirrors of the data. We had a
torrent that was almost downing the server
435
00:34:41,659 --> 00:34:44,350
with a lot of space and figured that none
of the other people had that, so we put it
436
00:34:44,350 --> 00:34:51,510
up. And that got a lot of traffic, it got
about 57 M hits in the first 2 days.
437
00:34:51,510 --> 00:34:54,300
And soon we realized there was a problem
where our server charged a lot for
438
00:34:54,300 --> 00:34:59,370
bandwidth and it cost us $48 every time
someone decided to download the 400GB
439
00:34:59,370 --> 00:35:07,480
with wget. So that was interesting but
it’s been resolved now. It hopefully made
440
00:35:07,480 --> 00:35:11,030
the data more accessible to people who
don’t have 400GB of hard drive space
441
00:35:11,030 --> 00:35:15,990
available or enough internet connectivity
to download that. So then we’ve also made
442
00:35:15,990 --> 00:35:21,240
a search tool for all of the Hacking Team
emails; that has a search interface that
443
00:35:21,240 --> 00:35:25,400
lets you browse them like you would in a
normal email client with threading, and a
444
00:35:25,400 --> 00:35:28,870
network graph so that you can see the
connections between senders and
445
00:35:28,870 --> 00:35:39,860
recipients. The Intelligence Community
has a variety of collection disciplines:
446
00:35:39,860 --> 00:35:45,350
SIGINT, OSINT, HUMINT, Measurement
and Signature Intelligence, Symmetry
447
00:35:45,350 --> 00:35:49,080
Intelligence. They have all these
different sources that they’re gathering
448
00:35:49,080 --> 00:35:55,780
data from. I think that we should try to
duplicate this. Because there are a lot
449
00:35:55,780 --> 00:35:58,230
of different sources that we can gather
data from as well, and we need to find
450
00:35:58,230 --> 00:36:01,600
ways to better collect data from all these
sources and to fuse them together.
451
00:36:01,600 --> 00:36:06,300
These are some of the ones that I’ve
been spending a lot of time looking at.
452
00:36:06,300 --> 00:36:10,170
And there’s open source Intelligence
things like ICWATCH where you’re
453
00:36:10,170 --> 00:36:13,060
collecting data from purely public
sources. But this is just part of the wider
454
00:36:13,060 --> 00:36:17,950
ecosystem that we can draw on. This is
mostly information that people and
455
00:36:17,950 --> 00:36:21,230
institutions make available about themselves
publicly, either intentionally or
456
00:36:21,230 --> 00:36:25,840
unintentionally. And it’s really difficult
to use because there’s a lot of it and it
457
00:36:25,840 --> 00:36:29,940
needs to be collected and matched up and
pulled together in a browsable way for
458
00:36:29,940 --> 00:36:33,390
people to be able to use it. So you can’t
really just manually go and use it at scale.
459
00:36:33,390 --> 00:36:39,900
You can do it a little bit but not nearly
enough. And so we’re working on making
460
00:36:39,900 --> 00:36:44,540
this easier to use. The other sort of data
is anonymously leaked documents,
461
00:36:44,540 --> 00:36:47,370
documents that were (?) sent to
journalists, that they think should be
462
00:36:47,370 --> 00:36:51,700
public and these often pretty explicitly
reveal corruption, human rights abuses
463
00:36:51,700 --> 00:36:56,480
or other issues. But this can also be used
to collect more data. Like we used the
464
00:36:56,480 --> 00:37:00,800
published Snowden documents very heavily
to find code words that we could use to
465
00:37:00,800 --> 00:37:05,240
collect the data in ICWATCH. And once we
start to collect data on secret things
466
00:37:05,240 --> 00:37:10,800
that were recently not known at all, but
now are, and we can find data on that, we
467
00:37:10,800 --> 00:37:14,140
can start to find data on unknown code
words and unknown things that we might not
468
00:37:14,140 --> 00:37:20,560
otherwise recognize. And then there’s data
released by governments, from FOIA
469
00:37:20,560 --> 00:37:25,400
requests through open data initiatives.
This, of course, can be spun or things can
470
00:37:25,400 --> 00:37:31,370
be held back. So it’s not ideal to use on
its own. But it can be used like the other
471
00:37:31,370 --> 00:37:34,740
2 types, in combination with each other.
You can use that to provide context, you
472
00:37:34,740 --> 00:37:42,540
can use open source data to frame FOIA
requests and things like that. So the goal
473
00:37:42,540 --> 00:37:46,730
of Transparency Toolkit is to make it
easier to collect all these types of data
474
00:37:46,730 --> 00:37:50,950
in one place and to start to use this data
in the same ways that the Intelligence
475
00:37:50,950 --> 00:37:55,330
Community uses the data collected from
all the various collection disciplines.
476
00:37:55,330 --> 00:38:00,400
Except our goal isn’t to kill people or be
some sort of omniscient, God-like being
477
00:38:00,400 --> 00:38:04,370
but we just want to build some sort of
external structure of accountability.
478
00:38:04,370 --> 00:38:09,690
To make it easier to uncover and understand
things like surveillance programs or human
479
00:38:09,690 --> 00:38:14,520
rights abuses or corruption. And when we
can find the people and companies that are
480
00:38:14,520 --> 00:38:18,290
involved in things like surveillance we
can start to map who’s doing what.
481
00:38:18,290 --> 00:38:21,870
And we can start to request information
about specific contracts. And we know who
482
00:38:21,870 --> 00:38:24,580
we can ask questions about particular
programs. And then we can start to use the
483
00:38:24,580 --> 00:38:30,020
data to start legal cases against specific
companies. And we can start to take more
484
00:38:30,020 --> 00:38:34,850
concrete actions than we would be able to,
otherwise, if we were dealing simply in
485
00:38:34,850 --> 00:38:38,820
theory or in guesses as to
what’s going on.
486
00:38:38,820 --> 00:38:42,310
So open source intelligence lets us
be more proactive and more direct with
487
00:38:42,310 --> 00:38:49,280
our techniques. And it also lets us find
some of this information earlier, because
488
00:38:49,280 --> 00:38:52,490
many of the programs mentioned in the
Snowden documents were mentioned first
489
00:38:52,490 --> 00:38:58,890
in other, open data sources. And if we
can start to figure out where these are
490
00:38:58,890 --> 00:39:02,390
and start to figure out what they are,
then we know what data we’re missing and
491
00:39:02,390 --> 00:39:05,410
we can start to go after it with FOIA
requests or trying to find it by other
492
00:39:05,410 --> 00:39:14,060
means. But all of this is a really, really
big project and we can’t… this is not
493
00:39:14,060 --> 00:39:17,220
going to work if it’s just us working on
it. We need to work with other people.
494
00:39:17,220 --> 00:39:20,650
We need to work with activists who have
ideas of how they want to use the data.
495
00:39:20,650 --> 00:39:23,640
We need to work with journalists that
collect the data and write stories about
496
00:39:23,640 --> 00:39:27,130
it. We need to work with human rights
lawyers to help them with their research
497
00:39:27,130 --> 00:39:30,430
help them build legal cases based on the
findings. We need to work with NGOs and
498
00:39:30,430 --> 00:39:34,800
human rights researchers who want to
collect and use open data in their work.
499
00:39:34,800 --> 00:39:38,330
And we need more people going through
databases like ICWATCH. This doesn’t
500
00:39:38,330 --> 00:39:42,340
require any special expertise. You gain
the knowledge that you need as you’re
501
00:39:42,340 --> 00:39:46,490
going through them looking up terms. It’s
not easy but it can be quite interesting
502
00:39:46,490 --> 00:39:52,040
once you combine all of these obscure
terms and it’s like “Oh, that’s what
503
00:39:52,040 --> 00:39:56,840
they’re doing!” and oftentimes what
they’re doing is something entirely absurd
504
00:39:56,840 --> 00:40:01,300
like reading all your email
or killing people.
505
00:40:01,300 --> 00:40:05,870
And we also need software developers to
help develop software and help us figure
506
00:40:05,870 --> 00:40:11,130
out how all of these tools should fit
together. So if anyone’s interested in
507
00:40:11,130 --> 00:40:14,770
working with us to take on the
Intelligence Agencies of the world and
508
00:40:14,770 --> 00:40:18,430
figure out what they’re doing please let
us know. I think it sounds a bit insane
509
00:40:18,430 --> 00:40:23,130
and I know that (they) have far more
resources and far more experience, but if
510
00:40:23,130 --> 00:40:27,720
we keep ignoring the situation and we
continue as we are now making scattered
511
00:40:27,720 --> 00:40:30,640
attempts to change things that aren’t
coordinated, that are based on limited
512
00:40:30,640 --> 00:40:36,290
information, nothing is going to change
long-term. So I think we need to collect
513
00:40:36,290 --> 00:40:40,800
all the information we can and figure out
how to effectively combine it and use it
514
00:40:40,800 --> 00:40:45,510
for concrete goals. And I think we need
to do this with free software and open
515
00:40:45,510 --> 00:40:49,100
data, because against such powerful
adversaries they’re probably the best
516
00:40:49,100 --> 00:40:51,490
hopes we have.
517
00:40:51,490 --> 00:41:01,940
applause
518
00:41:01,940 --> 00:41:05,960
Herald: Thank you, thank you so much!
Now we have the round of Q&A,
519
00:41:05,960 --> 00:41:11,630
for anyone who’d like to ask a question,
please come forward to the mics on both sides
520
00:41:11,630 --> 00:41:17,070
of this Saal (Hall). Start
taking the question from…
521
00:41:17,070 --> 00:41:18,440
is nodding towards first person asking
…yeah.
522
00:41:18,440 --> 00:41:24,610
Q: So I’d like to ask about documents
which are scans. Which are sometimes
523
00:41:24,610 --> 00:41:30,010
released as official open source
information. What kind of workflow do you
524
00:41:30,010 --> 00:41:35,950
have or even if you have any kind of
workflow for some OCR on these…!?
525
00:41:35,950 --> 00:41:40,870
M.C.: A serious (?) that depends on the
document. There’s some open source
526
00:41:40,870 --> 00:41:46,960
software called Tesseract that’s quite
good, but it doesn’t always work in cases
527
00:41:46,960 --> 00:41:51,260
where there needs to be more specialized
parsing. I like to use something that’s
528
00:41:51,260 --> 00:41:54,830
called ABBYY (FineReader) which is,
unfortunately, not open source and we are
529
00:41:54,830 --> 00:41:59,220
looking for an alternative. For the
published Snowden documents, because we
530
00:41:59,220 --> 00:42:03,560
needed to extract the classification
headers and that wasn’t working so well with
531
00:42:03,560 --> 00:42:07,150
Tesseract. But Tesseract
works for most things.
532
00:42:07,150 --> 00:42:10,030
listens to unrecorded comment
from the audience
533
00:42:10,030 --> 00:42:15,190
Yeah.
534
00:42:15,190 --> 00:42:19,720
Herald: Thank you. Do we have a question
from… [the internet]? Yeah, oui!
535
00:42:19,720 --> 00:42:24,310
Signal Angel: Yes, rooty is asking on IRC:
What would you recommend the NSA to
536
00:42:24,310 --> 00:42:27,540
develop towards a future
of Social Usefulness!??
537
00:42:27,540 --> 00:42:35,780
E.g. what value have databases from
2015, people cell phone sensors in 2115!??
538
00:42:35,780 --> 00:42:40,550
Could you give the NSA, maybe
CEO there, useful work!??
539
00:42:40,550 --> 00:42:42,760
M.C.: Can you rephr..-, sorry !??
540
00:42:42,760 --> 00:42:50,010
Signal Angel: naively repeats first
of the apparent Troll questions
541
00:42:50,010 --> 00:42:52,290
M.C.: laughs
Social Usefulness…
542
00:42:52,290 --> 00:42:56,070
Probably the most useful thing they could
do is stop collecting the data in the
543
00:42:56,070 --> 00:43:01,760
first place, especially the data that’s
being intercepted or illegally collected.
544
00:43:01,760 --> 00:43:07,250
There’s probably some amounts of useful
tracking they could do, but I’m not sure
545
00:43:07,250 --> 00:43:10,300
that’s the best approach, using the tactics
that they use to collect the data at that
546
00:43:10,300 --> 00:43:12,670
time.
547
00:43:12,670 --> 00:43:16,070
Herald: Thank you. So, next
question from you, please!
548
00:43:16,070 --> 00:43:20,490
Question: Hello, thanks for the talk, that
was one of the best ones I’ve seen at this
549
00:43:20,490 --> 00:43:26,740
congress. I was wondering what you think
about the question you’re raising about
550
00:43:26,740 --> 00:43:30,840
“we shouldn’t make the same mistakes”.
Because I’m not totally sure that’s
551
00:43:30,840 --> 00:43:34,780
possible because of things I’ve seen in
other communities. All communities have
552
00:43:34,780 --> 00:43:41,100
their extremists and they will abuse this
data. And then that allows a political
553
00:43:41,100 --> 00:43:46,610
attack on you, because they say you made
that happen, it’s not true. But it will celd
554
00:43:46,610 --> 00:43:50,230
people. So how do you protect
against that?
555
00:43:50,230 --> 00:43:53,660
M.C.: I think it’s hard to entirely
protect against it because we can’t
556
00:43:53,660 --> 00:43:57,330
control the actions of other people. But
people could also go off and use this data
557
00:43:57,330 --> 00:44:01,530
negatively by collecting it on their own,
independently of us. I was actually quite
558
00:44:01,530 --> 00:44:05,280
impressed, after we launched ICWATCH, I
haven’t heard of anyone complaining of
559
00:44:05,280 --> 00:44:07,380
threats that they’ve gotten from people…
560
00:44:07,380 --> 00:44:10,040
People in the Intelligence Community:
I haven’t heard of anyone in the
561
00:44:10,040 --> 00:44:11,980
Intelligence Community complaining about
threats that they’ve gotten as a result
562
00:44:11,980 --> 00:44:16,450
of ICWATCH being launched. All of the
complaints have been theoretical. The only
563
00:44:16,450 --> 00:44:19,340
threats I’ve heard of resulting from
ICWATCH are those from the Intelligence
564
00:44:19,340 --> 00:44:21,940
Community to us. I haven’t heard of
anything, so I’ve been very impressed with
565
00:44:21,940 --> 00:44:27,190
the civility of the internet in that case.
And I think that maybe, by framing it, and
566
00:44:27,190 --> 00:44:30,400
actually bringing it down to the
individual level, and making it clear that
567
00:44:30,400 --> 00:44:35,460
these are people, that makes it a little
bit less likely that people will go after
568
00:44:35,460 --> 00:44:37,610
them in a vicious way.
569
00:44:37,610 --> 00:44:43,260
Q: Have you thought of creating a kind of usage
guidelines? I mean that's not gonna change what
570
00:44:43,260 --> 00:44:48,270
anyone does. But if someone does something
you can then say “That’s against our usage
571
00:44:48,270 --> 00:44:52,170
guidelines” and it’s a political defence
against someone accusing it…
572
00:44:52,170 --> 00:44:56,040
M.C.: Yeah, I don’t think there’s any way
that we can enforce something like that.
573
00:44:56,040 --> 00:44:59,830
But we do try to be very careful with how
we’re framing it and saying – like I’ve
574
00:44:59,830 --> 00:45:02,920
spent a long time in this talk saying these are
people that are not evil people. They’re
575
00:45:02,920 --> 00:45:06,570
normal people that you should look at as
such. So I think being very careful of
576
00:45:06,570 --> 00:45:09,140
framing it and we’ll be developing some
sort of guidelines. That’s definitely a
577
00:45:09,140 --> 00:45:11,230
good idea.
578
00:45:11,230 --> 00:45:13,740
Herald: Thank you. Your question, please!
579
00:45:13,740 --> 00:45:19,590
Troll: Hi! First, thank you very much for
this tool that makes it possible to fight
580
00:45:19,590 --> 00:45:27,750
back against, legally. For people who try
to punish or yeah…
581
00:45:27,750 --> 00:45:34,020
What I have to say, or my question is: I
worked in the last 3 1/2 years, let’s say,
582
00:45:34,020 --> 00:45:39,530
in the field of IT Forensics. And I worked
with Maltego and stuff, and so I know what
583
00:45:39,530 --> 00:45:45,210
a lot of work it is to collect data and
bring it into good conditions, so others
584
00:45:45,210 --> 00:45:57,480
could read it or you can get a goal, or
see a goal. And what I personally think
585
00:45:57,480 --> 00:46:04,700
is very important: this could be very
sensible data to people and my question
586
00:46:04,700 --> 00:46:12,620
is: How do you care that this data
which you will offer to download will keep
587
00:46:12,620 --> 00:46:20,470
safe? That’s the first question, and
the second is: Did you think about
588
00:46:20,470 --> 00:46:27,830
verifications? So you are collecting a lot
of data, and in a few years another person
589
00:46:27,830 --> 00:46:34,650
wants to see if this data was correct. So
do you verify the sources like MD5 sum
590
00:46:34,650 --> 00:46:44,230
or so you can say “This fingerprint taken
at this-day and this-time is correct?”
591
00:46:44,230 --> 00:46:51,220
M.C.: For the first question: I don’t
think there’s really… I’m not sure (?)
592
00:46:51,220 --> 00:46:56,220
protected because this is a version that
people posted publicly themselves. So they
593
00:46:56,220 --> 00:47:00,720
sort of said that they don’t want it to be
protected or secured because they’re
594
00:47:00,720 --> 00:47:07,250
posting it on the public internet. So I’m
not sure there’s really any reason to try
595
00:47:07,250 --> 00:47:11,510
to protect it when it’s something that
they’ve published very publicly.
596
00:47:11,510 --> 00:47:16,050
And on the second one, for verification,
that’s quite tricky with some of the data
597
00:47:16,050 --> 00:47:18,990
especially around the Intelligence
Community because all of these things
598
00:47:18,990 --> 00:47:22,320
are secretive and it’s hard to confirm
them. We can confirm them against each
599
00:47:22,320 --> 00:47:26,760
other like now we have multiple résumé
sites on ICWATCH, so sometimes we can find
600
00:47:26,760 --> 00:47:31,020
the same person’s résumé on another site
and compare over time and we can go
601
00:47:31,020 --> 00:47:34,410
finding other profiles they have and try
to combine as much data on the same
602
00:47:34,410 --> 00:47:36,310
person as is possible and have it over time.
603
00:47:36,310 --> 00:47:41,790
Q: What I did: I made a fingerprint
when I downloaded a website, I made a
604
00:47:41,790 --> 00:47:45,790
fingerprint and then I can say OK, this
is… yeah.
605
00:47:45,790 --> 00:47:48,730
M.C.: For verifying when it was collected,
then. Yeah, I mean that’s a bit harder to
606
00:47:48,730 --> 00:47:54,980
do absolutely, but we have all of
the full text of the web page saved, and
607
00:47:54,980 --> 00:48:01,350
we have it all published on GitHub so you
can verify what was collected then but, yeah.
608
00:48:01,350 --> 00:48:03,980
Herald: We’ll take the questions
from up there.
609
00:48:03,980 --> 00:48:10,390
Jake Appelbaum: Hi, community extremist
here… So I wanted to say something which
610
00:48:10,390 --> 00:48:13,380
is that I think what Julian did for
leaking documents you’re doing for
611
00:48:13,380 --> 00:48:17,800
analysis. Which is really great! Because
transparency is not enough – you need action!
612
00:48:17,800 --> 00:48:21,310
And so I just wanted to say that I hope
that everyone can give M.C. and
613
00:48:21,310 --> 00:48:28,000
Transparency Toolkit a lot of material
support. And maybe a round of applause!
614
00:48:28,000 --> 00:48:33,750
applause
615
00:48:33,750 --> 00:48:37,940
Definitely the best talk at the congress
and I had a couple of suggestions. But
616
00:48:37,940 --> 00:48:41,640
one of them is: I think it would be great
if you could focus on American Domestic
617
00:48:41,640 --> 00:48:43,060
Police Agencies.
M.C.: Hmm-mhm…
618
00:48:43,060 --> 00:48:48,110
Jake: In particular collecting the images
of Police Academy Graduation photographs.
619
00:48:48,110 --> 00:48:53,340
And to be able to move in the direction of
facial recognition, so that we can find
620
00:48:53,340 --> 00:48:56,440
Undercover Police Officers
that are in our midst…
621
00:48:56,440 --> 00:49:01,740
applause
622
00:49:01,740 --> 00:49:06,640
And I think it would be great if you could
create a FOIA wizard, essentially, ’cause
623
00:49:06,640 --> 00:49:10,720
everybody likes wizards, and who doesn’t
like UNIX… So it’d be great if you could
624
00:49:10,720 --> 00:49:14,290
create a FOIA wizard where you could say:
“I wanna know about these terms” and it
625
00:49:14,290 --> 00:49:19,310
would just generate automatically – maybe
by partnering with MuckRock e.g. –
626
00:49:19,310 --> 00:49:22,890
interesting things, where there’s a kind
of “Wait!”. Where you realize there’s a lot
627
00:49:22,890 --> 00:49:26,630
of people working on this classified
program and it’s at this agency and they
628
00:49:26,630 --> 00:49:29,350
have a contract with this company and
these are the people involved and just
629
00:49:29,350 --> 00:49:34,020
automatically generate those FOIAs and
then get people to sort of sign up to put
630
00:49:34,020 --> 00:49:38,440
their name down and sort of sponsor a
little transparency and to say “Oh, that’s
631
00:49:38,440 --> 00:49:41,610
the FOIA I wanna get behind, I’m gonna
check on it, you know, once a week, I’m
632
00:49:41,610 --> 00:49:45,170
gonna do this thing. Through MuckRock.”
I think that would be a way to take this
633
00:49:45,170 --> 00:49:49,410
information in a legal manner and to make
it actionable. And I think there’s lots of
634
00:49:49,410 --> 00:49:53,869
other interesting things you could do that
are not about the law. But I leave that to
635
00:49:53,869 --> 00:49:57,270
the imagination of other people. It should
be legal but it doesn’t need to be through
636
00:49:57,270 --> 00:50:02,090
legal channels like, say, FOIA. So thanks
for the work that you’re doing, M.C. and
637
00:50:02,090 --> 00:50:06,170
I hope that you will expand it to,
basically, all of the pigs of the whole
638
00:50:06,170 --> 00:50:10,190
world. And I would really encourage you
to read Hannah Arendt’s “Eichmann in
639
00:50:10,190 --> 00:50:15,760
Jerusalem”, because you described a
fundamental thing: these people aren’t
640
00:50:15,760 --> 00:50:21,280
evil. But actually, Evil itself doesn’t
exist. These people are the Banality of
641
00:50:21,280 --> 00:50:26,040
Evil. They’re people who have soccer
practice, and they have a dog, and they
642
00:50:26,040 --> 00:50:29,540
like to go home and fuck their wife, and
they’re regular people who do drone
643
00:50:29,540 --> 00:50:31,520
strikes.
644
00:50:31,520 --> 00:50:36,340
applause
645
00:50:36,340 --> 00:50:40,150
Herald: Thank you. We
have a question on mike 1.
646
00:50:40,150 --> 00:50:46,540
Q: How easy is it to add support for new
databases or new sources of information?
647
00:50:46,540 --> 00:50:51,050
M.C.: It depends on the source and how
that site is structured. But generally
648
00:50:51,050 --> 00:50:55,110
it’s not too difficult. Adding
new sources does require
649
00:50:55,110 --> 00:51:00,060
programming at this point. But it’s not
particularly complex programming and we
650
00:51:00,060 --> 00:51:03,350
have some libraries that make some
parts of it easier, as well. And if you’re
651
00:51:03,350 --> 00:51:05,700
interested in adding a data source we’re
more than happy to help with that.
652
00:51:05,700 --> 00:51:10,980
Q: Awesome! My favourite is the list of…
the report of when people were denied
653
00:51:10,980 --> 00:51:16,440
security clearance and why and if their
appeal was then, like, removed.
654
00:51:16,440 --> 00:51:18,280
M.C.: Yeah, that would
be quite interesting!
655
00:51:18,280 --> 00:51:24,490
Q: Okay!
656
00:51:24,490 --> 00:51:29,050
Herald: If there’s no further
questions… moment…
657
00:51:29,050 --> 00:51:34,140
yeah, okay! Please!
658
00:51:34,140 --> 00:51:44,010
Q: Yesterday it was said that we have to
make sure that we watch
659
00:51:44,010 --> 00:51:50,900
them and make sure that they know that we
watch them. Because some day they will get
660
00:51:50,900 --> 00:51:57,680
prosecuted. So, in some way. I think
you are exactly doing this. So this is
661
00:51:57,680 --> 00:52:12,350
brilliant. Are you already in the stage
where you’re thinking you can start
662
00:52:12,350 --> 00:52:18,390
concrete legal actions against some
individuals that you are getting
663
00:52:18,390 --> 00:52:24,590
information with your tools?
M.C.: We’ve been working with some lawyers towards that.
664
00:52:24,590 --> 00:52:29,230
We are looking to do more in this, so if
you know… if you have any ideas for
665
00:52:29,230 --> 00:52:32,080
particular situations where this may be
applicable, or lawyers that we should
666
00:52:32,080 --> 00:52:37,150
work with, let us know! But we’re working
towards that and making some progress.
667
00:52:37,150 --> 00:52:41,730
Q: Thanks!
668
00:52:41,730 --> 00:52:44,690
Herald: Getting a question
from up there, please!
669
00:52:44,690 --> 00:52:49,840
Q: I just wanna say that you are a
visionary who is more passionate than
670
00:52:49,840 --> 00:52:53,420
anybody I have ever collaborated with
and it’s a total honor.
671
00:52:53,420 --> 00:52:54,369
applause
672
00:52:54,369 --> 00:52:57,220
Herald: Thank you.
673
00:52:57,220 --> 00:53:02,780
M.C.: Yeah, and just to everyone, that’s
Brennan who also works on Transparency
674
00:53:02,780 --> 00:53:06,710
Toolkit. He made the awesome UI for
Harvester and Lookingglass that you saw
675
00:53:06,710 --> 00:53:09,470
in the Tabs of all this.
676
00:53:09,470 --> 00:53:14,780
applause
677
00:53:14,780 --> 00:53:17,900
Jake: If no one else is gonna ask a
question, I’d like to ask a question which
678
00:53:17,900 --> 00:53:21,260
I know the answer to but no one else
in the room does. And I think it’s very
679
00:53:21,260 --> 00:53:25,210
fascinating. I wonder if you could talk
about lessons that you’ve learned from
680
00:53:25,210 --> 00:53:28,490
studying about the South African
Resistance to Apartheid.
681
00:53:28,490 --> 00:53:30,020
M.C. is laughing
Jake: And maybe you could talk about the
682
00:53:30,020 --> 00:53:34,880
things that drive you to work on these
things. E.g. what inspires you to justice?
683
00:53:34,880 --> 00:53:39,310
E.g. experiences at MIT and maybe – I mean
if you don’t want to talk about it, I’m
684
00:53:39,310 --> 00:53:42,940
sorry for asking it. But if you do wanna
talk about it I think you can inspire
685
00:53:42,940 --> 00:53:48,930
everyone else here to raise their fist
with you! In solidarity.
686
00:53:48,930 --> 00:53:57,150
M.C.: Yeah… Okay… I guess it’s been
nearly 3 years now, so maybe that’s okay
687
00:53:57,150 --> 00:54:06,480
to talk about. 3 years ago there was this
case at MIT… everyone has probably heard
688
00:54:06,480 --> 00:54:13,930
of Aaron Swartz and he was being
prosecuted for downloading documents from
689
00:54:13,930 --> 00:54:22,480
JSTOR. And I was brought in trying to figure out
MIT’s role in this situation, and if we
690
00:54:22,480 --> 00:54:26,400
might be able to sway public opinion, with
a few people in Boston. I think some of
691
00:54:26,400 --> 00:54:31,110
them are in this room. And we were trying
to help him. And eventually, part way into
692
00:54:31,110 --> 00:54:35,770
the process, he became afraid and decided
that it would be more risky for us to help
693
00:54:35,770 --> 00:54:38,890
him, with the prosecutor who might lash
back, so we stopped. But one of the things
694
00:54:38,890 --> 00:54:45,650
that I did in this process was, I sent out
a survey to all of the professors at MIT
695
00:54:45,650 --> 00:54:54,450
asking their opinion on his case. And
whether they identified with his actions.
696
00:54:54,450 --> 00:54:59,280
And I got a lot of responses to this
survey. Some were quite nice and were
697
00:54:59,280 --> 00:55:03,560
quite supportive. Some were very vicious,
saying that he should go to jail and that
698
00:55:03,560 --> 00:55:09,040
he is a waste of humanity and he works at
this Harvard Center for Ethics, so how is
699
00:55:09,040 --> 00:55:13,390
this ethical? And things like that. They
were quite horrible. And initially he had
700
00:55:13,390 --> 00:55:17,540
access to this database and somehow over
the next year, when we weren’t doing much,
701
00:55:17,540 --> 00:55:21,970
he lost access to this database. And he
emailed me asking for access again. And
702
00:55:21,970 --> 00:55:26,800
back then I was on some stupid kick about
research ethics and redaction and thought
703
00:55:26,800 --> 00:55:30,570
that there’s no reason to… It really seems
that’s like “I cannot give you the answers
704
00:55:30,570 --> 00:55:34,770
about the names”. I was just stupid because
the names are the most useful part of that
705
00:55:34,770 --> 00:55:42,470
data. And I kind of abandoned him, along
with a lot of other people in that. And I
706
00:55:42,470 --> 00:55:50,119
feel like if I had given him the names
that might have been something that could
707
00:55:50,119 --> 00:55:53,490
be used to find supporters within MIT or
people who were rallying against him. And
708
00:55:53,490 --> 00:55:56,050
I don’t think it would have made a huge
difference but it might have made just a
709
00:55:56,050 --> 00:56:02,140
little bit. And that was one of the things
that really showed me the power of data on
710
00:56:02,140 --> 00:56:06,190
individuals and the role of individuals
within institutions. And I feel like I
711
00:56:06,190 --> 00:56:10,780
really failed there. So
I don’t want to do that again.
712
00:56:10,780 --> 00:56:16,270
applause
713
00:56:16,270 --> 00:56:20,540
Herald: Thank you. Unfortunately, we need
to wrap up because we are out of time.
714
00:56:20,540 --> 00:56:26,900
Thank you for attending this very
interesting lecture and, quite touching
715
00:56:26,900 --> 00:56:28,230
in the end.
716
00:56:28,230 --> 00:56:33,780
postroll music
717
00:56:33,780 --> 00:56:38,350
Subtitles created by c3subtitles.de
in 2016. Join and help us do more!