Herald: Vincenzo Iozzo is an entrepreneur and
investor with a focus on cybersecurity. He
has started up, gotten bought, and
repeated this a few times, and now he is
an advisor who helps people start up
companies, get bought, and repeat that. He
is also a director at CrowdStrike and an
associate at the MIT Media Lab.
Just checking the time to make sure that
we start on time, and we can start
now. His talk is on the scale of
cybersecurity. Please give a warm welcome to
Vincenzo.
Applause
Vincenzo Iozzo: So hi, everyone, thanks for
being here. As Karen said, I have made a
few changes to my career, but my
background is originally technical, and
what I wanted to do today is to talk about
a trend that I think we take for
granted, one that is to some extent obvious
but also underappreciated. And that is
cloud scale in security. Specifically,
when I say cloud scale, what I mean is the
ability to process very large amounts of
data, as well as to spawn computing power
with ease, and how that has played a role in
our industry in the past decade or so. But
before I talk about that, I think some
context is important. So I joined the
industry about 15 years ago, and back in
the day, even a place like the Congress
was a much smaller place. It was
to some extent cozier, and the community
was tiny. The industry was fairly niche.
And then something happened around 2010.
People realized that there were more and
more state sponsored attacks being carried
out. From Operation Aurora against Google
to the Mandiant APT1 report, which was the
first public document showing how the Chinese
PLA was hacking, let's call it, the
Western world's infrastructure for IP theft.
And that changed a lot for the
industry. There have been two significant
changes because of all of this attention.
The first one is notoriety. We went from
being, as I said, a relatively unknown
industry to something that everyone talks
about. If you open any kind of
newspaper, there's almost always an
article on cybersecurity, boardrooms talk
about cybersecurity... and in a sense,
again, back when I joined, cybersecurity
wasn't a thing. It used to be called
infosec. And now very few people know what
infosec even means. So notoriety is one
thing, but notoriety is not the only thing
that changed. The other thing that changed
is the amount of money deployed in the
sector. So, back in 2004, depending on the
estimate you trust, the total
spending on cybersecurity was between
three and a half and ten billion dollars.
Today it's over 120 billion dollars. And so
it kind of looks exponential. But the
spending came with a
very significant change in the type of
players there are in the industry today.
So a lot of the traditional vendors that
used to sell security software have kind
of disappeared. And what you have today
are largely two kinds of players. You have
the big tech vendors. So you have
companies like Google, Amazon, Apple and
so on, and so forth, that have sort of
decided to take security more seriously.
Some of them are trying to monetize
security. Others are trying to use it as a
kind of slogan to sell more phones.
The other group of people or entities are
large cloud-based security vendors. And
what both groups have in common is that
they're using more and more
cloud scale and cloud resources to try to
tackle security problems. And so what I
want to discuss today, from a somewhat
technical perspective, is how scale has made
a significant impact on the way we
approach problems, but also on the kind of
people that we have in the industry today.
So what I'm gonna do is to give you a few
examples of the change that we've gone
through. And I think one of
the important things to keep in mind is
that what scale has done, at least in the
past decade, is it has given defense a
significant edge over offense. It's not
necessarily here to stay, but I think it's
an important trend that is somewhat
overlooked. So let me start with endpoint
security. So back in the 80s, a few people
started to toy with this idea of IDS
systems. And the idea behind an IDS system
is pretty straightforward. You want to
create a baseline of benign behavior for a
machine, and then if that machine starts
to exhibit anomalous behavior, you would
flag that as potentially malicious. This
was the first paper published on a
host-based IDS. Now, the problem with
host-based IDS is that it never
actually quite made it as a commercial
product. There
were largely two reasons for this: The
first one is that it was really hard to
interpret results. So it was really hard
to figure out: "Hey, here's an anomaly and
this is why this anomaly might actually be
a security incident." The second problem
was, you had a lot of false positives, and
it was kind of hard to establish a benign
baseline on a single machine, because you
had a lot of variance in how an individual
machine would behave. So what happened is
that commercially we kind of got stuck
with antivirus vendors and
signatures for a very long time. Now, fast
forward to 2013. As I mentioned, the APT1
report came out, and AV companies actually
admitted that they weren't that useful at
detecting stuff like Stuxnet or Flame. And
so there was a new kid on the
block, and the buzzword name for it was
EDR: endpoint detection and response.
But when you strip away the
marketing fluff, what EDR really is, is
effectively a host-based intrusion detection
system at scale. In other words, cloud
scale has made
IDS possible, in two ways. The
first one is that because you now
have this data lake fed by a large
number of machines, you have much larger
datasets to train and test detections on.
What that means is, it's much easier to
establish the benign baseline, and it's
much easier to create proper detections, so
they don't detect just malware, but also
malware-less attacks. The
other thing is that EDR vendors, and also
companies that have internal EDR systems,
have, to a large extent, economies of scale.
And what that means is you can actually
have a team of analysts that can create
explanations, and an ontology, to
explain why a given detection may actually
represent a security incident. On top of
that, because you have that data lake, you
are now able to mine it to
figure out new attack patterns that you
weren't aware of in the past. So this in
itself is a pretty significant
achievement, because we finally managed to
move away from signatures to something
that works much better and is able to
detect a broader range of attacks. But the
other thing that EDR systems solved,
as a side effect, is the data
sharing problem. So, if you've been around
the industry for a long time, there have been
many attempts at sharing threat data
across different entities, and they all
kind of failed because it was really hard
to establish a protocol to
share that data. But implicitly, what EDR
has done, is to force people to share and
collect threat intelligence data and just
in general data from endpoints. And so now
you have the vendors being the sort of
implicitly trusted third party that can
use that data to write detections that can
be applied to all the systems, not just an
individual company or any individual
machine. And the implication of that
is that the meme, that
the attacker only needs to get it right
once while the defender needs to get it
right all the time, is actually not that
true anymore, because in the past you were
in a situation where if you had an
offensive infrastructure, whether it was
servers, whether it was exploit chains,
you could more often than not reuse them
over and over again. Even if you had
malware, all you had to do was to slightly
mutate the sample and you would pass any
kind of detection. But today that is not
true anymore in most cases. If you get
detected on one machine, all of a
sudden, all of your offensive
infrastructure has to be scrapped and you
need to start from scratch. So this is the
first example, and I think in itself it is
quite significant. The second example that
I want to talk about is fuzzing. And
fuzzing is interesting also for another
reason, which is it gives us a glimpse
into what I think the future might look
like. So, as you're probably aware, if
you've done any appsec work in the
past, fuzzing has been a
staple in the appsec arsenal for a very
long time. But in the past, probably five
years or so, fuzzing has gone through some
kind of renaissance, in the sense that two
things have
improved massively. The first one is that
we finally managed to find a better way to
assess the fitness function that we use to
guide fuzzing. So a few years ago,
somebody called Michal Zalewski released a
fuzzer called AFL, and one of the primary
intuitions behind AFL was that,
instead of using plain code coverage to
drive the fuzzer, you could use edge
coverage, and that
turned fuzzing into a
much more effective instrument to find
bugs. But the second intuition that I
think is even more important and that
changed fuzzing significantly is the fact
that as far as fuzzing is concerned, speed
is more important than smarts. You know,
in a way. And what I mean by this is that
when you look at AFL as an example, AFL
is an extremely dumb fuzzer. It does stuff
like byte flipping and bit flipping. It has
very, very simple strategies for mutation.
But what AFL does very well is, it's an
extremely optimized piece of C code and it
scales very well. And so you are in a
situation where, if you have a reasonably
good server where you can run AFL, you
can synthesize very complex file formats
in very few iterations. And what I find
amazing is that this intuition
doesn't apply just to file formats. This
intuition applies to much more complicated
state machines. So the other example that
I want to talk about as far as fuzzing
goes, is ClusterFuzz. ClusterFuzz is the
fuzzing infrastructure used by the Chrome team to
find bugs in Chrome and ClusterFuzz has
been around for about six years. In the
span of six years ClusterFuzz found
sixteen thousand bugs in Chrome alone,
plus another eleven thousand bugs in a
bunch of open source projects. If you
compare ClusterFuzz with the second most
successful fuzzer out there for
JavaScript engines, you'll find that the
second fuzzer called jsfunfuzz found about
six thousand bugs in the span of eight to
nine years. And if you look at the code,
the main difference between the two is not
the mutation engine. The mutation engine
is actually pretty similar.
ClusterFuzz doesn't do anything
particularly fancy, but what ClusterFuzz
does very well is it scales massively. So
ClusterFuzz today runs on about twenty
five thousand cores. And so with fuzzing
we're now at a stage where the bug churn
is so high that defense again has an
advantage compared to offense, because it
becomes much quicker to fix bugs than it is
to keep exploit chains working, which would
have been unthinkable just a few years
ago. The last example that I want to bring
up is a slightly different one. So, a few
months ago, the TAG team at Google found
in the wild a server that was used for a
watering hole attack, and it was thought
that the server was used against Chinese
Muslim dissidents. But what's interesting
is that the way you would detect this kind
of attack in the past was that you would
have a compromised device and you would
sort of like work backwards from there.
You would try to figure out how the device
got compromised. What's interesting is
that the way they found the server was
effectively to mine their local copy of
the Internet. And so, again, this is
another example where scale gives
defense a significant advantage over
offense. So, in all of these examples
that I brought up, I think when you look
deeper into them, you realize that it's
not that the state of security has
improved because we've necessarily got
better at security. It's that it has
improved because we got better at handling
large amounts of data, storing large
amounts of data and spawning computing
power and resources quickly when needed.
So, if that is true, the
other thing to realize is that in many of
these cases, when you look back at the
examples that I brought up, it actually is
the case that the problem at scale looks
very different from the problem at a much
smaller scale, and the solution as a
result is very different. So I'm going to
use a silly example to try to drive the
point home. Let's say that your job is to
audit this function, and so you need to
find bugs in this function. In case
you're not familiar with C code, the
problem here is that you can overflow or
underflow that buffer at your pleasure
just by passing a random value for "pos".
Now, if you were to manually audit
this thing, if your job was to audit
this function, you
would have many tools you could use.
could do manual code auditing. You could
use a symbolic execution engine. You could
use a fuzzer. You could use static
analysis. And a lot of the solutions that
are optimal for this case end up being
completely useless if your task now
becomes to audit this function instead, and this is
because the state machine that this
function implements is so complex that a
lot of those tools don't scale to
it. Now, for a lot of the problems I've
talked about, we kind of face the same
situation, where the problem at scale, and
its solution, look very different.
And so one thing, one realization is that
engineering skills today are actually more
important than security skills in many
ways. So when you think
back at fuzzers like ClusterFuzz, or AFL,
or again EDR tools, what matters there is
not really any kind of security expertise.
What matters there is the ability to
design backend systems that scale arbitrarily
well, and to write code that is very
performant, and none of this has really
much to do with traditional security
skills. The other thing you realize,
when you combine these two things is that
a lot of what we consider research is
happening in a different world to some
extent. So, about six years
ago, I gave a talk at an academic
conference called CCS. And
basically, my message there was
that if academia wanted to do research
that was relevant to the industry, they
had to talk to the industry more. And I
think we have now reached the point where
this is true for industry in the sense
that if we want to still produce
significant research at places like CCC,
we are kind of in a bad spot because a lot
of the innovation that is practical in the
real world is happening in
very large environments that few of us
have access to. And I'm going to talk a
bit more about this in a second. But
before I do, there is a question that I
think is important to digress on a bit.
And this is the question of:
Have we changed
significantly as an industry? Are we
in a new age of the industry?
And I think that if you were to split the
industry into phases, we have left the
artisanal phase, the phase where what
mattered the most was security knowledge.
And we're now in a phase where we have
these large-scale expert systems that
require significantly more
engineering skills than they require
security skills, but that still take input
from security practitioners.
And I think there is a question of: Is
this it? Is this where
the industry is going to stay, or is there
more to come? I know better than to make
predictions in security, 'cause most of
the time they tend to be wrong, but I
want to draw a parallel. And that parallel
is with another industry, and it's Machine
Learning. So, somebody called Rich Sutton,
who is one of the godfathers of machine
learning, wrote an essay called "The
Bitter Lesson". And in that essay, he
reflects on many decades of machine
learning work and what he says in the
essay is that people tried for a very long
time to embed knowledge in machine
learning systems. The rationale was that
if you could embed knowledge, you
could build smarter
systems. But it turns out that what
actually worked were things that scale
arbitrarily well with more computational
power and more storage capabilities. And so,
what he realized was that what actually
worked for machine learning was search and
learning. And when you look at stuff like
AlphaGo today, AlphaGo works not really
because it has a lot of Go knowledge. It
works because it has a lot of computing
power. It has the ability to train itself
faster and faster. And so there is a
question of how much of this can
potentially port to security. Obviously,
security is a bit different, it's more
adversarial in nature, so it's not quite
the same thing. But I think we
have only scratched the surface of what
can be done in terms of reaching a new
level of automation, where security
knowledge will matter less and less. So I
want to go back to the AFL example that I
brought up earlier, because one way to
think about AFL is to think about it as a
reinforcement learning fuzzer. And what I
mean by this is, in this slide: what AFL
was capable of doing was to take one single JPEG
file and, in the span of about twelve
hundred iterations of
completely random, dumb mutations, get to
another well-formed JPEG file. And when
you think about it, this is an amazing
achievement because there was no knowledge
of the file format in AFL. And so we are
now more and more building
systems that do not require any kind of
expert knowledge as far as security is
concerned. The other example that I want
to talk about is the Cyber Grand
Challenge. So DARPA, a few years ago,
started this competition called Cyber
Grand Challenge,
and the idea behind Cyber Grand Challenge
was to try to answer the question of: can
you automatically do exploit generation,
and can you automatically do patch
generation? And obviously they did it on
somewhat toy environments. But if you
talk today to anybody who does automatic
exploit generation research, they'll tell
you that we are probably five years away
from being able to automatically
synthesize non trivial exploits, which is
an amazing achievement because if you
asked anybody five years ago, most people,
myself included, would have told you that
that time would not come anytime soon. The
third example that I want to bring up is
something called Amazon Macie, which is a
new sort of service released by Amazon.
And what it does is basically use machine
learning to try to automatically identify
PII and intellectual property
in the data you store in AWS, and
then try to give you a better sense of
what happens to that data. So in all of
these cases, when you think about them,
again, it's a scenario where there is very
little security expertise needed. What
matters more is engineering skills. So
everything I've said so far makes a reasonably
positive case for
scale. But I think that there is another
side of scale that is worth touching on.
And I think that, especially for this audience,
it is important to think about. And the other
side of scale is that scale breeds
centralization. And so, to the point I was
making earlier about where
research is happening, where real-world
applicable research is happening: it
happens increasingly in places like Amazon
or Google or large security vendors or
some intelligence agencies. And so what
that means is that the barriers to
entry to the field are significantly
higher. So I said earlier that I tried to
join the industry about 15 years ago. Back
then, I was still in high school. And one
of the things that was cool about the
industry for me was that as long as you
had a reasonably decent internet
connection and a laptop, you could
contribute to the top of the industry. You
could see what everyone was up to. You
could do research that was relevant to
what the industry was working on.
But today, the same sort of like 15, 16
year old kid in high school would have a
much harder time contributing to the
industry. And so, because scale breeds
centralization, we are in a situation
where we will likely increase the barrier
to entry to a point where, if you want to
contribute meaningfully to security, you
will have to go through a very
standardized path where you probably do
computer science and then you go work for
a big tech company. And that's not
necessarily a positive. So I think the
same Kranzberg principle applies to scale
in a sense, where it has done a lot of
positive things for the sector, but it
also comes with some consequences. And if
there is one takeaway from this talk
that I would like you to have, it is to think
about how much something pretty
mundane, that we take for granted in our
day to day, has changed the industry, and
how much it will probably contribute to
the next phase of the industry. And not
just from a technical standpoint, in that
the solutions we use today are
much different from what we used to use,
but also in the kind of people that are
part of the industry and the community.
And that's all I had. Thank you for
listening.
Applause
Herald: Thank you very much. We have time
for questions. So if you have any
questions for Vincenzo, please line up
behind the microphones that are marked
with numbers and I will give you a signal
if you can ask a question. We also have
our wonderful signal angels that have been
keeping an eye on the Internet to see if
there are any questions from either
Twitter, Mastodon or IRC. Are there any
questions from the Internet? We'll just
have to wait for microphone number
nine to be turned on, and then we'll have a
question from the Internet for Vincenzo.
And please don't be shy. Line up behind
the microphone. Ask any questions.
Signal Angel: Now it's on. But actually
there are no questions from the Internet
right now.
Herald: There must be people in the room
that have some questions. I cannot see
anybody lining up. Do you have any advice
for people that want to work on
security at scale?
Vincenzo: I mean, as I just said, a
lot of the interesting research is
happening more and more at big tech
companies and similar places. And so, as much as
it pains me, the advice is probably to
think about whether you can find other
ways to get access to large amounts of
data and computational power, or maybe
consider getting into one of those places.
Herald: And we now actually have questions
at microphone number one.
Microphone 1: Can you hear me? Yeah. Thank
you for the great talk. You're making a
very strong case that information at scale
has benefited security, but is there also
statistical evidence for that?
Vincenzo: So I think, well, it's a bit
hard to answer the question because a lot
of the people that have an incentive to
answer that question are also kind of
biased. But I think when you look at
metrics like dwell time, in terms of how
much time attackers spend on victims'
machines, that has decreased
significantly, statistically
significantly. As far as the other
examples I brought up, like fuzzing and
similar, as far as I'm
aware, there hasn't been any sort of
rigorous study on whether we have now
reached the place where defense has
an edge against offense. But
I think when I talk to anybody who has
some offensive security knowledge,
or who did work in offense, the overall
feedback that I hear is that it's becoming
much harder to keep bug chains alive for a
long time. And this is in large part not
really because of countermeasures. It's in
large part because bugs keep churning.
So there isn't a lot of
statistical evidence, but from what I can
gather, it seems to be the case.
Herald: We have one more question from
microphone number one.
Microphone 1: So thank you for the
interesting talk. My question goes in the
direction of the centralization that you
mentioned, that the large
hyperscalers are converging to be the
hotspots for security research. So is
there any guidance you can give for us as
a community, how to retain access to the
field and contribute?
Vincenzo: Yes. So, I think
it's an interesting situation
because more and more there
are open source tools that
allow you to gather the data. But the
problem with these data gathering
exercises is not so much how to gather
the data. The problem is what to gather
and how to keep it. Because when you look
at the cloud bill, for most
players, it's extraordinarily high.
And unfortunately, I don't have an
easy solution to that. I mean, you can use
pretty cheap cloud providers, but
the expenditure is still
an order of magnitude higher than it used
to be. And I don't know, maybe academia
can step up. I'm not sure.
Herald: We have one last question from the
Internet. And you can stay at the
microphone if you have another question
for Vincenzo.
Signal Angel: Yes. The Internet asks: you
talk a lot about fuzzing at scale;
besides OSS-Fuzz, are you aware of any
other large-scale fuzzing infrastructure?
Vincenzo: That is publicly available? No.
But when you
look, for instance, at the participants
in the Cyber Grand Challenge, a lot of them
were effectively using a significant
amount of CPU power for fuzzing. So I'm
not aware of any kind of plug-and-play
fuzzing infrastructure you can use
aside from OSS-Fuzz. But,
as far as I'm aware, everyone out there
that does fuzzing for a living now has
access to significant resources and tries
to scale their fuzzing infrastructure.
Herald: If we don't have any more
questions, this is your last chance to run
to a microphone or write a question on the
Internet. Then I think we should give a
big round of applause to Vincenzo.
Vincenzo: Thank you.
Applause
subtitles created by c3subtitles.de
in the year 2019. Join, and help us!