Hacking Containers, Kubernetes and Clouds

0:04 - 0:09

Thomas Fricke: Thank you very much for the
invitation. So second talk tomorrow –
0:09 - 0:14

Thank you – um, today. So this is my
background. More or less, I do Kubernetes
0:14 - 0:19

security and critical infrastructure,
founded several companies, and now my
0:19 - 0:25

main focus is on Kubernetes security. This
rabbit hole of Kubernetes, if you
0:25 - 0:31

look deeper into it, then you should be a
little bit scared, and I want to explain
0:31 - 0:37

why. The first approach is the
application, and then the application
0:37 - 0:43

normally is run in containers. And the
containers, what is not really well known,
0:43 - 0:48

have access to service accounts in
Kubernetes, which is one of the major
0:48 - 0:54

flaws in Kubernetes at the moment. If you
take over the service account, it might be
0:54 - 1:01

that you can take over a cluster. And if
you can take over a cluster, you might
1:01 - 1:08

take over a node and then your entire
cloud service account, which is the work
1:08 - 1:15

of somebody else, I will mention later on
these slides. So let's look at what happens:
1:15 - 1:22

So, the target is I have an application
exposed to the internet, and I want to own
1:22 - 1:29

the entire cluster from outside.
Application might be vulnerable. Examples?
1:29 - 1:37

Yeah, lots of them. One example I want to
present is ImageTragick. You normally
1:37 - 1:44

should not do eval or exec statements in
any framework – be it PHP, Node.js, or
1:44 - 1:51

any other framework – and execute commands
in the context of your application,
1:51 - 1:56

because something can go wrong and
developers are responsible for this. Let's
1:56 - 2:05

see what it looks like: This is the attack
model based on an attack. I thought it was
2:05 - 2:13

old and has been fixed in 2016, but now
there was a new overview by Emil Lerner,
2:13 - 2:22

who again showed, yes, you can, in current
versions of ImageMagick, exploit this
2:22 - 2:33

attack. So, it works. ImageMagick is for
uploading images, so you convert the image
2:33 - 2:41

in a different format, scale the size, and
then if you do something wrong in this
2:41 - 2:47

image, you can own the entire container.
This also works for non-containerized
2:47 - 2:53

applications if you have a server running
something with ImageMagick on it. Please
2:53 - 3:02

be careful. OK. If we have mastered this
step, the next step is, yes, we want
3:02 - 3:08

access to the service account. And this is
by default enabled in Kubernetes: So, you
3:08 - 3:14

have a Kubernetes design flaw because your
service account is exposed to the
3:14 - 3:20

container where the application runs.
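
To make "enabled by default" concrete, here is a minimal sketch of how the effective setting is resolved. It assumes Pod and ServiceAccount manifests are available as plain Python dicts; the function name `token_is_mounted` is hypothetical. The field `automountServiceAccountToken` and the mount path are the standard Kubernetes ones.

```python
from typing import Any, Mapping, Optional

# Well-known path where Kubernetes mounts the token inside a container.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def token_is_mounted(pod: Mapping[str, Any],
                     service_account: Optional[Mapping[str, Any]] = None) -> bool:
    """Return True if the Pod would get a service account token mounted."""
    pod_setting = pod.get("spec", {}).get("automountServiceAccountToken")
    if pod_setting is not None:
        return bool(pod_setting)      # the Pod spec takes precedence when set
    sa_setting = (service_account or {}).get("automountServiceAccountToken")
    if sa_setting is not None:
        return bool(sa_setting)       # otherwise the ServiceAccount decides
    return True                       # nothing set anywhere: mounted by default
```

A Pod that sets nothing gets the token, which is why any code execution inside the container can read `TOKEN_PATH`.
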
The next step of an attacker is
3:20 - 3:27

installation of additional software. So,
you want to take over. You need a curl or
3:27 - 3:35

kubectl or chmod, and then you are owner
of the service account and can actually do
3:35 - 3:43

commands by uploading pictures in
ImageTragick. So responsible for this flaw
3:43 - 3:51

is the image creator. Let's see what else
can happen. To get total control, you also
3:51 - 3:58

need role-binding to a cluster-admin role.
This is not enabled by default, but the
3:58 - 4:04

internet is always good for bad advice. So
if you copy the installation requirements
4:04 - 4:11

or recommendations from the internet,
somebody else might take over the entire
4:11 - 4:23

cluster. Let's look deeper into it: Worst
practice here is what you can see in the
4:23 - 4:34

elastic installation recommendation: They
just mentioned they have a newer version,
4:34 - 4:43

but they use the cluster admin permissions
here to install ElasticSearch in your
4:43 - 4:54

Kubernetes cluster. So they recommend it
and a lot of other applications also have
4:54 - 5:00

this – which is a little bit outdated, but
it's quite common – in the installation
5:00 - 5:08

requirements. Never, ever do this, please.
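
One way to catch this before applying a copied manifest is to scan it for the pattern. The sketch below assumes the manifests have already been parsed into Python dicts (for example from YAML); the function name is hypothetical, and it only looks for the specific case from the talk: a ClusterRoleBinding that hands `cluster-admin` to a ServiceAccount.

```python
from typing import Any, Iterable, List, Mapping


def risky_cluster_admin_bindings(manifests: Iterable[Mapping[str, Any]]) -> List[str]:
    """Names of ClusterRoleBindings that grant cluster-admin to a ServiceAccount."""
    hits = []
    for m in manifests:
        if m.get("kind") != "ClusterRoleBinding":
            continue
        if m.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        subjects = m.get("subjects") or []
        # Only flag bindings whose subject is a ServiceAccount, i.e. an
        # identity that may be mounted into a container.
        if any(s.get("kind") == "ServiceAccount" for s in subjects):
            hits.append(m.get("metadata", {}).get("name", "<unnamed>"))
    return hits
```

Running it over every document of an installation manifest or a rendered Helm chart gives a quick list of bindings to question before installing.
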
It also can come with Helm Charts, so you
5:08 - 5:17

have Helm Charts where the cluster-admin
role is included. Here you see it, it was
5:17 - 5:23

in Apache Heron, which is an Apache
project, and it uses the cluster-admin
5:23 - 5:40

role, so by a helm install you might be
affected by this flaw too. So with these
5:40 - 5:47

four steps, which effectively are three
steps, you have a cluster application
5:47 - 5:54

exposed, and through that path, you can
take over the entire cluster from the
5:54 - 6:02

outside, and do anything the cluster-admin
role can do. Effectively, this
6:02 - 6:10

cluster-admin role-binding is like a
doormat attack, so you have the best
6:10 - 6:18

cryptography, the most expensive locks on
one side and then you put the key under
6:18 - 6:24

the doormat or under the flower at the
door or something like that. This is
6:24 - 6:34

something which is not really what you
want. I can do an example walkthrough
6:34 - 6:40

which shows how it goes. So, I've
published all my training notebooks on
6:40 - 6:50

GitHub. Here's the way you can build this
outdated ImageTragick version in
6:50 - 6:58

OpenShift. So, I use CRC, which is the
CodeReady Containers version. It's based
6:58 - 7:05

on the ImageTragick proof of concept by
Mike Williams. And here you run and create
7:05 - 7:14

a vulnerable image. A little bit lengthy.
It's compiled inside and so on. So, don't
7:14 - 7:20

get a full version. Which is the reason
why I don't show it here, but effectively
7:20 - 7:30

at the end, you have a vulnerable
application in a container internal and in
7:30 - 7:38

OpenShift. And that's exactly what we need
to run the application. Here is the
7:38 - 7:49

exploit. And the exploit starts with the
deployment of this container, which is
7:49 - 7:54

standard Kubernetes. Here "oc" is like
kubectl. So, you get an overview.
7:55 - 8:00

Additionally, in OpenShift, you have a
very simple version of creating a route,
8:00 - 8:07

which is connected to a hostname, and then
you can upload it by using that hostname.
8:09 - 8:16

You expose the deployment, you expose the
service which is created, you expose the
8:16 - 8:23

route finally, and then you have access.
The next step is you get this route and
8:23 - 8:31

then here you have a URL, which you can
use. And in a full demo, I would just
8:31 - 8:38

simply call this URL and then I can upload
images here. I've created these files,
8:38 - 8:45

which are valid PostScript files, but you
see at the end there is a full command.
8:45 - 8:51

And here, because there's a curl in the
container, I can download a version of
8:51 - 8:58

kubectl. Effectively, the containers,
especially the Red Hat containers, are not as
8:58 - 9:08

vulnerable as others, but you have always
writable temp, which is enough to deploy
9:08 - 9:17

some software. So, we curl kubectl from
the internet, put it into temp, and then
9:17 - 9:26

we use a simple chmod command to activate
kubectl. So now we can call kubectl
9:26 - 9:40

commands from inside an image. It's a
death knell, more or less. So, exactly at
9:40 - 9:46

the right place. We have a working exploit
now and warning, it might also already
9:46 - 9:53

work in older versions of Kubernetes.
Because in newer versions we need some
9:54 - 10:02

pill of poison additionally, and this is
exactly this cluster role binding to the
10:02 - 10:07

cluster admin, which needs to be done,
that we have full access from the outside
10:07 - 10:16

and if we do this, and expose our cluster
admin account to the same account, which
10:16 - 10:22

is already exposed inside the container,
we can execute commands with this kubectl
10:22 - 10:29

so we can create deployments by uploading
pictures. Which is exactly what you never
10:29 - 10:35

want, but an attacker now has full access
to your cluster by simply uploading
10:36 - 10:46

prepared malicious pictures. You can do this.
So this is an example here: just create
10:46 - 10:51

and delete containers and deployments
this way. You can effectively do
10:51 - 11:11

everything. And again, this is the problem
here from the application side. If you
11:11 - 11:21

have a vulnerable version of ImageMagick,
you can include commands, and you can
11:21 - 11:27

definitely install software on the
Kubernetes server side. There are several
11:27 - 11:35

attempts to fix this. For example, you can use
better images like Red Hat does, so this
11:35 - 11:40

is a Red Hat health index, which is quite
good, but effectively these images have
11:40 - 11:48

the only advantage that you do not run
anything as root. But you run the same as
11:49 - 11:55

another user ID, and if that user
is allowed to write to the temp directory,
11:55 - 12:04

effectively, yeah, you don't need root for
installing software. So, the container
12:04 - 12:12

also followed good practice: no root inside, it
has an immutable root file system, but the
12:12 - 12:17

curl which is completely unnecessary, was
also deployed, we had write access to
12:17 - 12:23

temp. We had a chmod. And the first thing
that would prevent all the stuff I'm doing
12:23 - 12:29

here is this, and if you don't
learn anything else from this talk, please go
12:30 - 12:36

look into your service accounts and try to
disable the automountServiceAccountToken
12:37 - 12:42

feature. All of the service accounts
which are not running operators don't need
12:42 - 12:49

this service account token. If you have an
operator, it might be broken now and it
12:49 - 12:58

can be, um, overridden by the Pod
definition, but effectively this entire
12:58 - 13:05

example would not work without this
service account token. So, we have fixed
13:05 - 13:10

that. We cannot fix the application
because this is something, uh, somebody
13:10 - 13:15

else is creating for us, and we might even
have a flaw which is not affected, so
13:15 - 13:20

there might be a zero-day. The next thing
we must prevent is the installation of
13:20 - 13:30

software. Fix the images, so use really
immutable images. Temp only if you need
13:30 - 13:40

it. PID is 1, anyway. Uh, OK, you might
have some variable data, but you should
13:40 - 13:48

use containers from scratch, no curl, no
wget, and this also affects Red Hat UBIs.
13:48 - 13:53

And most of the standard images have this
flaw, so you have a full operating system
13:53 - 14:02

inside with all the tools you like. But
this is not your territory. It's just,
14:02 - 14:08

yeah, it's a tool for the attacker. So
please run only trusted images, build your
14:08 - 14:17

own images and build them from scratch.
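
As a rough check of how close an image is to that goal, the following sketch looks for the tools named in this talk inside an unpacked image root filesystem. It assumes the image has already been exported to a local directory; the function name, the tool list and the directory list are illustrative choices, not a complete audit.

```python
import os
from typing import Iterable, List

# Tools mentioned in the talk as useful to an attacker; extend as needed.
ATTACKER_TOOLS = ("curl", "wget", "chmod", "kubectl")
# Common binary directories, relative to the image root (an assumption).
BIN_DIRS = ("bin", "sbin", "usr/bin", "usr/sbin", "usr/local/bin")


def find_attacker_tools(rootfs: str,
                        tools: Iterable[str] = ATTACKER_TOOLS) -> List[str]:
    """Return relative paths of the listed tools found under rootfs."""
    tools = tuple(tools)
    found = []
    for bin_dir in BIN_DIRS:
        for tool in tools:
            rel = os.path.join(bin_dir, tool)
            # lexists also reports symlinks, e.g. BusyBox-style applets.
            if os.path.lexists(os.path.join(rootfs, rel)):
                found.append(rel)
    return sorted(found)
```

An empty result does not prove the image is safe, but any hit is something an attacker gets for free.
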
This is my example I also have uploaded to
14:17 - 14:23

GitHub, how to harden the container, which
is based on nginx alpine. nginx alpine
14:23 - 14:28

normally is a very small container, but
you can do more. You can use the script,
14:28 - 14:34

which is in this repository, just to get
only the tools you need. So this is not
14:34 - 14:40

statically linked because the original
nginx is not statically linked. But it's
14:40 - 14:56

very close. This means you only positively
install the software you need. This is
14:56 - 15:02

dynamically linked, therefore the -d, so we
use ldd, extract all the dynamically linked
15:02 - 15:10

libraries and then all the configuration
files which are necessary. It is the
15:10 - 15:18

password, registry, group. OK. Some licenses
in share. You need some directories for
15:18 - 15:23

logging and then you can install it from
scratch because this script installs it in
15:23 - 15:32

a directory \temp\harden, and with
this multi-stage build you can install
15:32 - 15:42

all that you need from \temp\harden. And
then the next container is based on
15:42 - 15:49

scratch and you can use nginx the same way
you would use, more or less, an
15:49 - 15:57

application which is statically linked. So
now we have created a hardened image
15:57 - 16:04

without kubectl or curl inside. So, we are
much closer to a secure application. The
16:04 - 16:11

next thing is, yeah, role binding to
cluster admin role. Don't do this. If
16:11 - 16:18

something in your application goes wrong,
you have additional measures, which you
16:18 - 16:25

can take just to prevent the application
from breaking out of the container. So, you
16:25 - 16:30

can separate the internet exposure of
services or ingresses in Kubernetes from
16:30 - 16:37

privileged operations. So you have node
settings. ElasticSearch is doing a lot of
16:37 - 16:44

these things, so a lot is really not true
so, doing a sysctl. Some applications have
16:44 - 16:52

hostPaths on or have connection to the
host inter-process communication, which is
16:52 - 16:58

not necessary if you have exposed it and
then separate the applications which need
16:58 - 17:04

this from the applications which don't
need it. So, cluster admin should be more
17:04 - 17:09

or less restricted to very privileged
operators. And by the way, Argo is also a
17:09 - 17:15

very privileged operator. Don't run an
Argo on a Kubernetes cluster in a security
17:15 - 17:21

critical environment because I've seen
Argo also is binding to cluster admin. It
17:21 - 17:27

doesn't mean that Argo by default is
unsafe, but it's a very complex
17:27 - 17:34

application and I would definitely run it
in a separate cluster, not in the critical
17:34 - 17:41

cluster. And what does an architecture fix
look like, here you have the lifecycle of
17:41 - 17:48

a Pod, so time goes from left
to right. Here you see if the container is
17:48 - 17:56

ready, it can be accessed from the
internet. And if you do something from the
17:56 - 18:04

init system, like a sysctl, please do it
inside a container which is not connected
18:04 - 18:10

to the internet, just to use the pause
container, as a pause container to limit
18:10 - 18:16

it and restrict it and that is not really
connected to the network. So, this is
18:16 - 18:23

something which covers the architecture.
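
A sketch of that separation, written as a Python function that builds a Pod manifest as a dict. Init containers run to completion before the application container starts, so the privileged step is over before the Pod becomes ready. The function name, the container names and the `vm.max_map_count` setting are illustrative assumptions; the manifest fields themselves are standard Kubernetes ones.

```python
from typing import Any, Dict


def pod_with_isolated_sysctl(name: str, app_image: str,
                             init_image: str) -> Dict[str, Any]:
    """Build a Pod manifest in which only the init container is privileged."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # No service account token for either container.
            "automountServiceAccountToken": False,
            "initContainers": [{
                "name": "sysctl",
                "image": init_image,
                # Example node setting only; runs once, then exits.
                "command": ["sysctl", "-w", "vm.max_map_count=262144"],
                "securityContext": {"privileged": True},
            }],
            "containers": [{
                "name": "app",
                "image": app_image,
                # The exposed container keeps no elevated rights.
                "securityContext": {
                    "privileged": False,
                    "allowPrivilegeEscalation": False,
                    "runAsNonRoot": True,
                    "readOnlyRootFilesystem": True,
                },
            }],
        },
    }
```

The resulting dict can be serialized to YAML or JSON and applied like any other manifest.
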
Additionally, I already mentioned here the
18:23 - 18:28

network policy which will come later, so
this is our threat matrix. We have exposed
18:28 - 18:33

and not exposed services. You have
unprivileged and privileged things. The
18:33 - 18:39

dangerous ones are the privileged ones
which are exposed, but normally you only
18:39 - 18:46

have an exposed privileged application if
you have an IDE running in Kubernetes,
18:46 - 18:51

which is not what I would like to see in
critical infrastructure, something like
18:51 - 18:58

rstudio or have a web ui to a gitops
framework. And normally you only have a
18:58 - 19:04

web application. And what should not be
exposed under normal conditions is an
19:04 - 19:12

operator, sysctl, build systems, host
operators and so on. If you do this, it's
19:12 - 19:21

virtually not possible to own the cluster.
You should do all three because if you
19:21 - 19:26

have security in depth, you can make a
mistake on one of these levels and the
19:26 - 19:32

other levels keep you from
being exploited. You can even do more
19:32 - 19:39

isolation on the network side, you have
network policies for egress on the node
19:39 - 19:44

side, you can activate seccomp, gVisor,
and the common frameworks, SELinux,
19:44 - 19:50

AppArmor. You can use PodSecurity
policies, or in the future, the open
19:50 - 19:56

policy agent to prevent the node from
being hacked. For the identity and access
19:56 - 20:03

management, you should use individual
service accounts for all your tasks. So
20:03 - 20:09

you have enough of a lot of roles. You
should use role based access control to
20:09 - 20:18

check this. OK, but I promise, yes, we can
go even deeper, and this needs a little
20:18 - 20:28

help from your cloud administrator and
here, the example from Nico Meisenzahl,
20:28 - 20:35

who does a very similar example on
hijacking Kubernetes, and he's doing it,
20:35 - 20:43

obviously in one of the clouds. And what
he has found out is you can get access to
20:43 - 20:49

the azure.json file, which has user
assigned identities. This is not the
20:49 - 20:55

Kubernetes identities. This is the Azure
identity. You can get a token, you can get
20:55 - 21:02

a subscription, you can get a resource
group and then you can use a curl command,
21:02 - 21:07

with this token, to change things on the
API version of this resource group with
21:07 - 21:12

this subscription. So, you might be able
to hack your node with the privileged
21:12 - 21:17

container and then take over your cloud
account. And he told me that this is also
21:17 - 21:25

true for the other clouds, so it might
even work something similar in AWS and
21:25 - 21:33

GCP. So please, also protect your cloud
account. Understand your identity and
21:33 - 21:38

access management in the cloud. So, at
least, someone in the team should
21:38 - 21:44

understand it. And limit also the
underlying account to the bare minimum. It
21:44 - 21:51

might even be a good idea to block access
to addresses like 169.254.something. And the
21:51 - 21:57

other clouds, as I already mentioned, also
might be affected. And my call to the
21:57 - 22:03

cloud providers, is don't deliver account
data in containers or nodes. This is not
22:03 - 22:08

necessary. It's yes, it's very
comfortable, as the service account
22:08 - 22:13

token is very comfortable for running
operators, but it's a major security flaw
22:13 - 22:23

and it might be that you lose all your
accounts and data. Conclusion: We have a
22:23 - 22:29

full attack chain from the application to
the cloud account. And it's your task to
22:29 - 22:36

prevent it and fix it. This is called
shared responsibility, so the cloud
22:36 - 22:41

providers effectively only care for the
infrastructure, but not really for the
22:41 - 22:46

security in your cloud. This is your
task. OK. Thank you for your attention, I
22:46 - 22:52

hope it was interesting. Please ask your
questions. And now I'm open for the Q&A.
22:52 - 22:58

Applause
22:58 - 23:05

Herald: Thank you for the talk. This is
working? Yeah. So do we have any questions
23:05 - 23:12

from the internet? I don't see any coming
in so far, but we are, I think, usually a bit
23:12 - 23:17

ahead so I'll ask one:
Q: What do you think? So who has the
23:17 - 23:21

responsibility mainly to fix these
insecurities? Do you think this can be
23:21 - 23:27

fixed by better defaults in these
infrastructures and configuration files?
23:27 - 23:32

Is this to be fixed by better tutorials
and better education for the DevOps
23:32 - 23:35

engineers? What was the main point of
responsibility?
23:35 - 23:45

A: I definitely would prefer to have
secure default installations. But then you
23:45 - 23:51

have this shared responsibility in the
contracts: From a certain point, you are
23:51 - 23:57

responsible for the security of the
account, and we have seen this complexity
23:58 - 24:09

because this might be 20 steps. Every step
is very simple and every step is looking
24:09 - 24:16

very harmless, but all the steps together
might create a full exploit of a cloud. So
24:16 - 24:24

this must be overseen, and it's very hard
for developers who are cloud native and
24:24 - 24:30

are focusing on the application to have an
overview of the security. Developers now
24:30 - 24:37

have 10 or 100 times more code on the hard
disk than ten years before. And this means
24:37 - 24:43

developers are not able really to have a
full judgment about what is going on in
24:43 - 24:49

terms of security. When
developers talk about security, either
24:49 - 24:56

they are specialized in it or they have
not seen things like this. What I normally
24:56 - 25:00

notice: The developers are not aware of
these problems.
25:00 - 25:07

Q: OK. And what do you think, what can we
do about the complexity? So do you think
25:07 - 25:10

we need better education for people to
actually understand the systems? Or is
25:10 - 25:14

there a way in cloud infrastructures to
reduce the complexity?
25:14 - 25:24

A: Better education? And do all the simple
fixes. These are five steps, and the fixes
25:24 - 25:30

are also very simple. And you have to
check them and then you need a tool
25:30 - 25:36

because you might have 20 clusters like
this. Every cluster has 20 applications,
25:36 - 25:41

so this might be quite complicated. So you
need tools for an overview and in the
25:41 - 25:47

training material, you see examples of how
you can check your Kubernetes clusters for
25:47 - 25:52

exploits like this.
Herald: OK, thank you very much. Thanks
25:52 - 25:58

for being here. We will continue in about
half an hour with the next talk, then
25:58 - 26:04

again in German. Thanks.
Thomas: Thank you very much. Applause
26:04 - 26:13

Outro: Everything is licensed under CC BY
4.0. And it is all for the community, to
26:13 - 26:14

the unknown and for everyone.
26:14 - 26:15

Subtitles created by c3subtitles.de
in the year 2022. Join, and help us!

Title:: Hacking Containers, Kubernetes and Clouds
Video Language:: English
Duration:: 26:15
