CompTIA Security+ Full Course: Security Network Monitoring & SIEMs

0:00 - 0:17

[Music]
0:17 - 0:20

Now, network monitoring has been around for a long
0:20 - 0:22

time, probably ever since the first
0:22 - 0:24

networks were invented. Just like with
0:24 - 0:27

any system, just like with any electronic
0:27 - 0:31

device, we tend to want to be able to
0:31 - 0:33

monitor if everything is going okay. We
0:33 - 0:35

want to receive warnings, we want to be
0:35 - 0:37

alerted when something goes wrong, when
0:37 - 0:39

something fails. And this type of
0:39 - 0:41

monitoring is tremendously useful
0:41 - 0:44

especially in larger networks. Over time,
0:44 - 0:47

this monitoring has extended to security
0:47 - 0:48

monitoring as well. So we're not just
0:48 - 0:51

concerned about how is the network doing,
0:51 - 0:53

if it's working well, if you don't have
0:53 - 0:55

any failed devices, but we're also
0:55 - 0:57

starting to look at the network traffic.
0:57 - 0:59

How is the network utilized, who uses it,
0:59 - 1:02

who attempts to access it, what type of
1:02 - 1:04

traffic are they generating? And if we
1:04 - 1:06

try to gather all this type of
1:06 - 1:08

information, we try to make sense of it,
1:08 - 1:10

we try to correlate it, with a smart
1:10 - 1:13

enough device, we might be able to detect
1:13 - 1:17

attempts at intrusion or attacks that
1:17 - 1:19

are about to happen or that have
1:19 - 1:21

happened in the past, or proof that
1:21 - 1:23

we've been compromised or somebody in
1:23 - 1:25

the network has been infected. And all
1:25 - 1:27

that information is in there if you know
1:27 - 1:29

where to look and also if you have
1:29 - 1:31

the right tools to look for it.
1:31 - 1:34

In general, the term intrusion detection
1:34 - 1:36

refers to a system that is able to
1:36 - 1:39

monitor whatever can be observed in a
1:39 - 1:42

network, and in most cases, we're talking
1:42 - 1:44

about two things that can be observed.
1:44 - 1:46

First of all, we have network traffic,
1:46 - 1:49

and then we have application events or
1:49 - 1:51

logs generated by the operating systems
1:51 - 1:53

by the applications running on those
1:53 - 1:56

OS's and so on. So coming back here to
1:56 - 1:58

our network focus, we've talked about
1:58 - 2:00

intrusion detection at the network level.
2:00 - 2:02

We're going to call this one a network
2:02 - 2:06

based intrusion detection system or NIDS.
2:06 - 2:08

And we have many commercial solutions as
2:08 - 2:10

well as open source ones that are able
2:10 - 2:12

to perform this type of network-based
2:12 - 2:13

intrusion detection. Of course, all the
2:13 - 2:15

major security vendors are doing it. In
2:15 - 2:18

many examples, you're going to see the
2:18 - 2:22

IDS functionality built into the
2:22 - 2:24

functionality of a larger firewall or a
2:24 - 2:26

larger UTM device, especially for major
2:26 - 2:28

vendors out there. But you also have
2:28 - 2:31

solutions in the open source area
2:31 - 2:34

such as Snort, Suricata, or Zeek (formerly Bro).
2:34 - 2:37

They're all available. And some of them also
2:37 - 2:38

have commercial versions as well, but
2:38 - 2:39

they also provide you with free
2:39 - 2:42

versions that you can freely install and
2:42 - 2:44

try and run in your own environment. Now
2:44 - 2:47

the way these intrusion detection
2:47 - 2:51

systems work by definition is that they
2:51 - 2:53

rely on a database of signatures. And
2:53 - 2:55

those signatures are basically just a
2:55 - 2:57

way to describe what a specific traffic
2:57 - 2:59

pattern is supposed to look like in
2:59 - 3:01

order to detect a specific type of
3:01 - 3:04

attack or attempt at an intrusion. So we
3:04 - 3:05

might be looking at a sequence of
3:05 - 3:07

packets that looks in a certain way.
3:07 - 3:10

We might be looking at a specific type
3:10 - 3:12

of packet that doesn't play by the
3:12 - 3:14

normal protocol rules that it belongs to.
3:14 - 3:17

Or a specific type of payload or simply
3:17 - 3:20

just a signature, a byte sequence that
3:20 - 3:22

can be found in the packet
3:22 - 3:24

payload that indicates the fact that the
3:24 - 3:27

payload is malicious. And this behavior
3:27 - 3:29

is very similar to what you're seeing in
3:29 - 3:31

antivirus scanning or anti-malware
3:31 - 3:33

scanning. We're simply looking for a
3:33 - 3:36

sequence of bytes that indicates that-
3:36 - 3:38

well, if we find the sequence of bytes in
3:38 - 3:40

a specific executable file, it means that
3:40 - 3:43

the file is infected with that specific
3:43 - 3:45

virus that the sequence belongs to.
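
To make the byte-sequence idea concrete, here is a minimal Python sketch of signature matching against a packet payload. The signature names and byte patterns are made up for illustration and are not taken from any real rule set; a real engine would also look at headers, protocol state, and packet sequences.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Signature:
    name: str       # hypothetical label, not from any real signature database
    pattern: bytes  # byte sequence that marks a payload as suspicious


# Illustrative entries only; a real database holds many thousands of signatures.
SIGNATURES = [
    Signature("example-shell-dropper", b"/bin/sh -c wget"),
    Signature("example-nop-sled", b"\x90" * 16),
]


def match_payload(payload: bytes) -> List[str]:
    """Return the names of all signatures whose byte sequence occurs in payload."""
    return [sig.name for sig in SIGNATURES if sig.pattern in payload]
```

Calling `match_payload` on a payload that contains one of the patterns returns that signature's name, and an empty list means nothing in the database matched, which is exactly the blind spot of a purely signature-based approach.
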
3:45 - 3:48

Now, in intrusion detection, again,
3:48 - 3:50

we're kind of doing the same thing, right?
3:50 - 3:52

We're looking for patterns, but we're not
3:52 - 3:53

just scanning individual packets.
3:53 - 3:56

Sometimes we need to collect more
3:56 - 3:58

packets in a sequence in order to
3:58 - 4:01

determine if the behavior of the client
4:01 - 4:04

that is generating those packets is
4:04 - 4:06

abnormal, and if it's abnormal, does
4:06 - 4:09

it indicate an attack pattern or not? So
4:09 - 4:11

long story short, intrusion detection
4:11 - 4:13

systems are strongly dependent on a
4:13 - 4:16

database of signatures. Now, more advanced
4:16 - 4:17

intrusion detection systems could also
4:17 - 4:20

correlate this network information with
4:20 - 4:21

log information. So we're seeing
4:21 - 4:23

something fishy in the network by
4:23 - 4:25

looking at the network traffic, let's
4:25 - 4:27

check the application logs that the
4:27 - 4:30

traffic is going towards, for example.
4:30 - 4:31

Let's see how that application reacts
4:31 - 4:34

and if we can see some abnormal
4:34 - 4:37

logs being generated by the app as well.
4:37 - 4:39

Now, correlating that information, the
4:39 - 4:41

traffic and the logs, might tell us more about
4:41 - 4:44

the actual attack or might increase the
4:44 - 4:46

confidence of the fact that we really
4:46 - 4:49

have identified a valid attack signature.
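
As a rough sketch of that correlation step, the Python below pairs a network alert with application log events from the targeted host that occur close to it in time. The `NetworkAlert` and `LogEvent` types, the field names, and the 30-second window are all assumptions made for this example, not the data model of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class NetworkAlert:
    time: datetime
    dst_ip: str      # host the suspicious traffic was going towards
    signature: str


@dataclass
class LogEvent:
    time: datetime
    host_ip: str     # host that produced the log line
    message: str


def correlate(
    alerts: List[NetworkAlert],
    logs: List[LogEvent],
    window: timedelta = timedelta(seconds=30),  # assumed tolerance
) -> List[Tuple[NetworkAlert, LogEvent]]:
    """Pair each alert with log events from the targeted host within the window."""
    pairs = []
    for alert in alerts:
        for event in logs:
            same_host = event.host_ip == alert.dst_ip
            close_in_time = abs(event.time - alert.time) <= window
            if same_host and close_in_time:
                pairs.append((alert, event))
    return pairs
```

An alert that comes back paired with an unusual application log entry deserves more confidence than an alert with no matching log activity at all.
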
4:49 - 4:51

Not all solutions are able to do this of
4:51 - 4:54

course. Also, a very important distinction
4:54 - 4:57

for intrusion detection system with an
4:57 - 4:59

emphasis on detection is the fact that
4:59 - 5:02

these systems are never able to block
5:02 - 5:04

the malicious traffic once they identify
5:04 - 5:06

it. It's just like the name says, it's just
5:06 - 5:09

detection, it's not prevention, all right?
5:09 - 5:11

So we're not stopping the traffic. We
5:11 - 5:13

might be able to see an attack signature.
5:13 - 5:15

We might be able to raise some alerts,
5:15 - 5:17

generate some syslogs, but we're not
5:17 - 5:19

going to be able to block that specific
5:19 - 5:21

type of traffic. One positive side for
5:21 - 5:23

this is that well, if the device is not
5:23 - 5:26

inside of the traffic path, then the
5:26 - 5:28

attacker might not even be able to
5:28 - 5:29

detect it.
5:29 - 5:31

So most likely, the IDS is going to
5:31 - 5:33

work with a copy of the traffic just to
5:33 - 5:35

analyze it, but it's not going to be able
5:35 - 5:37

to stop the malicious traffic. And the
5:37 - 5:39

attacker is not going to be able to
5:39 - 5:41

detect the IDS device and might not even
5:41 - 5:44

be able to compromise it if they
5:44 - 5:47

intend to. In most situations, the IDS
5:47 - 5:48

device doesn't even have a valid IP
5:48 - 5:50

address within the network that they're
5:50 - 5:52

monitoring, so it cannot be addressed, it
5:52 - 5:55

cannot be compromised by communicating
5:55 - 5:57

with it directly. Alright, so since we
5:57 - 5:58

mentioned the fact that an IDS works
5:58 - 6:00

with just a copy of the traffic, let's
6:00 - 6:02

see how we can generate that copy of
6:02 - 6:04

traffic, right? They're not within the
6:04 - 6:06

traffic path, so we need to make a copy
6:06 - 6:08

of the traffic and just send it in a
6:08 - 6:09

separate channel, on a separate channel
6:09 - 6:12

to the IDS device for analysis. Now, one
6:12 - 6:14

way of doing this is by enabling port
6:14 - 6:18

mirroring or SPAN. In Cisco speak, this
6:18 - 6:19

is Switched Port Analyzer. Just a
6:19 - 6:21

functionality on layer 2 or layer 3
6:21 - 6:24

switches that allows us to configure the
6:24 - 6:26

switch, and we're basically telling it
6:26 - 6:28

well, whatever traffic you're seeing on
6:28 - 6:32

ports let's say one, two, and three make a
6:32 - 6:35

copy of that traffic and forward it out
6:35 - 6:37

of port number eight. And of course, we're
6:37 - 6:38

assuming that on port number eight,
6:38 - 6:40

there's an IDS device connected right
6:40 - 6:42

there. So we're basically telling the
6:42 - 6:43

switch to make a copy of all the
6:43 - 6:45

interesting traffic and send it towards
6:45 - 6:47

the IDS. And of course, you might be
6:47 - 6:50

thinking here well, what if the switch is
6:50 - 6:52

overloaded, what if there's more traffic
6:52 - 6:54

generated on those ports than the
6:54 - 6:57

mirror port can actually support. Well
6:57 - 6:59

that's true, it might happen. So in
6:59 - 7:01

cases when the switch is overloaded and
7:01 - 7:02

there's too much traffic in the network,
7:02 - 7:05

packets might be dropped, and also frames
7:05 - 7:08

with errors might not be forwarded to the
7:08 - 7:10

mirrored port either. So we
7:10 - 7:12

might not be able to see 100% of all the
7:12 - 7:15

traffic, but in most cases, it's going to
7:15 - 7:16

be enough. And it's also one of the
7:16 - 7:18

features that basically doesn't require
7:18 - 7:19

you to install anything else in the
7:19 - 7:21

network, it's just a functionality, just a
7:21 - 7:23

configuration effort, just a couple of
7:23 - 7:25

commands on a switch. Another method for
7:25 - 7:28

duplicating traffic is by using a
7:28 - 7:30

passive or an active device. It's basically
7:30 - 7:32

a layer 1 device called a TAP, a test
7:32 - 7:35

access point. It's nothing more than a
7:35 - 7:37

kind of like a T-connector where the
7:37 - 7:39

main cable goes from one end to the next,
7:39 - 7:41

and there's a third cable that actually
7:41 - 7:43

receives a copy of the entire traffic
7:43 - 7:46

going through that segment of cable.
7:46 - 7:49

The device is not a smart one, so it's
7:49 - 7:51

not like a switch. It's not going to
7:51 - 7:54

look at the destination of frames and
7:54 - 7:56

forward entire packets. It's simply
7:56 - 7:58

going to duplicate the electrical or the
7:58 - 8:00

optical signals that it sees on the wire,
8:00 - 8:02

and it's going to make a complete and
8:02 - 8:04

identical copy of those signals onto the
8:04 - 8:08

third connection which, of course, is
8:08 - 8:10

ideally connected to the IDS device.
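
The difference between the two copy methods described so far can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Frame` type and the byte-based capacity are invented for the example, the mirror port is assumed to skip errored frames and drop whatever does not fit, and the tap is assumed to pass everything through unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    size_bytes: int
    has_error: bool = False  # for example, a failed checksum


def mirror_port_copy(frames: List[Frame], capacity_bytes: int) -> List[Frame]:
    """Copy frames as a mirror port might: skip errored frames, drop the excess."""
    copied, used = [], 0
    for frame in frames:
        if frame.has_error:
            continue  # errored frames are not forwarded to the mirror port
        if used + frame.size_bytes > capacity_bytes:
            continue  # mirror port has no room left, so this frame is dropped
        copied.append(frame)
        used += frame.size_bytes
    return copied


def tap_copy(frames: List[Frame]) -> List[Frame]:
    """A tap repeats the signal, so in this model every frame reaches the monitor."""
    return list(frames)
```

Feeding the same list of frames to both functions shows what the IDS would miss behind a saturated mirror port compared with a tap.
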
8:10 - 8:12

Now, this type of approach is, again,
8:12 - 8:13

completely undetectable.
8:13 - 8:16

SPAN is not detectable either, right? And
8:16 - 8:21

it also copies entire frames regardless if
8:21 - 8:24

those frames contain errors or not. As we
8:24 - 8:25

said with port mirroring, while the
8:25 - 8:27

frames need to be correct in order to be
8:27 - 8:30

copied, well, with a TAP, the TAP doesn't
8:30 - 8:32

care. It's basically just a signal
8:32 - 8:34

repeater, and we can do this for both
8:34 - 8:37

copper cables and so electrical signals
8:37 - 8:39

as well as fiber optic so optical
8:39 - 8:42

signals. The TAP will not care, it will just
8:42 - 8:44

blindly copy all the signals that it
8:44 - 8:46

receives. And finally the third method
8:46 - 8:49

for monitoring traffic is by having the
8:49 - 8:52

IDS device in the traffic path
8:52 - 8:55

but acting as a transparent device. Again,
8:55 - 8:57

without an IP address, we're basically
8:57 - 9:00

becoming a layer 2 device that is part
9:00 - 9:02

of the same VLAN that they're
9:02 - 9:05

bridging, but they cannot be addressed on
9:05 - 9:07

the network, they cannot be detected on
9:07 - 9:09

the network, and they- if it's a true IDS
9:09 - 9:10

device, then it's not going to be able to
9:10 - 9:13

block the actual traffic that goes
9:13 - 9:15

through it. Now, having the device placed
9:15 - 9:18

inside of the traffic path opens us to
9:18 - 9:20

the possibility of actually blocking the
9:20 - 9:22

traffic, and that's going to be a
9:22 - 9:23

different type of solution called
9:23 - 9:24

intrusion prevention system. And we'll
9:24 - 9:26

get there in just a moment. There's one
9:26 - 9:28

more type of intrusion detection device
9:28 - 9:30

or solution and that is a software
9:30 - 9:32

solution that can be installed directly
9:32 - 9:34

on the workstations. So I'm not talking
9:34 - 9:35

about a box that listens to network
9:35 - 9:37

traffic on an entire segment, but we're
9:37 - 9:39

talking here about a software solution,
9:39 - 9:41

basically a program that runs on your
9:41 - 9:43

endpoint machine, on your host machine, be
9:43 - 9:46

it a laptop or a desktop. Now, this one is
9:46 - 9:48

called host-based intrusion detection
9:48 - 9:51

because it runs on the host, and it does
9:51 - 9:52

have pretty much the same benefit
9:52 - 9:55

or the same abilities as a network-based
9:55 - 9:57

intrusion detection, so it's able to look
9:57 - 9:58

at the network traffic going in and out
9:58 - 10:00

of your network interface. It's able to
10:00 - 10:03

look at the logs generated by the
10:03 - 10:05

applications on your system, but since
10:05 - 10:07

they are running as an application on
10:07 - 10:09

your system, they can become even smarter
10:09 - 10:12

because they might have access now to
10:12 - 10:14

the actual process table. They might be
10:14 - 10:16

looking at the kernel, you might be able
10:16 - 10:18

to look at the memory to see what
10:18 - 10:20

processes are running, when did they
10:20 - 10:23

execute, who executed them, with what
10:23 - 10:26

privileges, and they can also openly look
10:26 - 10:30

at encrypted traffic. So if you are
10:30 - 10:33

communicating over SSL/TLS with a website,
10:33 - 10:35

well a network-based intrusion detection
10:35 - 10:37

might not be able to understand anything
10:37 - 10:39

that's going back and forth because it's
10:39 - 10:41

encrypted, but your host-based intrusion
10:41 - 10:43

detection
10:43 - 10:45

is located at the end of that encrypted
10:45 - 10:48

tunnel, so it is able to see that
10:48 - 10:50

unencrypted traffic before it even
10:50 - 10:52

enters the encrypted tunnel and right
10:52 - 10:54

after it leaves the encrypted tunnel. So
10:54 - 10:57

it's able to actually watch the entire
10:57 - 11:00

traffic flow in an unencrypted form. And
11:00 - 11:03

again, since we have pretty much full
11:03 - 11:05

permissions on the monitored host in
11:05 - 11:08

order to be able to properly monitor the,
11:08 - 11:09

you know, the process table and the
11:09 - 11:11

network connections and the network
11:11 - 11:14

traffic, we could also have a look at the
11:14 - 11:16

files on the disk.
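
One common way to watch files on disk is to record a hash of each file and later compare the current hash against that record. The sketch below assumes SHA-256 and an in-memory baseline; the function names are hypothetical, and a real tool would also protect the baseline itself from tampering.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def hash_file(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths: Iterable[Path]) -> Dict[str, str]:
    """Record the known-good hash of every monitored file."""
    return {str(path): hash_file(path) for path in paths}


def find_changes(baseline: Dict[str, str]) -> List[str]:
    """List files that are missing or whose content no longer matches the baseline."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or hash_file(path) != expected:
            changed.append(name)
    return changed
```

Running `find_changes` on a schedule and alerting on any non-empty result is the basic loop; which files to include in the baseline is a policy decision.
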
11:16 - 11:18

Why would you do that? Well that's
11:18 - 11:20

because monitoring the integrity of the
11:20 - 11:22

files on the disk, especially the
11:22 - 11:25

integrity of the operating system files,
11:25 - 11:27

and being able to detect when that
11:27 - 11:31

integrity fails, when a system file is
11:31 - 11:33

being replaced with a malicious one, when
11:33 - 11:36

a system file is becoming encrypted
11:36 - 11:38

or it is replaced with a completely
11:38 - 11:39

different version, that might be an
11:39 - 11:42

indication of compromise, that might be
11:42 - 11:43

an indication of the fact that you have
11:43 - 11:47

been infected with malware. So solutions
11:47 - 11:49

or functionality additional to host
11:49 - 11:51

based intrusion detection that monitor
11:51 - 11:53

files on your system, especially
11:53 - 11:55

operating system files, these are called
11:55 - 11:58

file integrity monitoring solutions. And
11:58 - 11:59

remember that we said that when we place
11:59 - 12:01

the intrusion detection device in the
12:01 - 12:03

traffic path, that
12:03 - 12:05

device actually becomes able to also
12:05 - 12:08

block the traffic that goes through it
12:08 - 12:09

which can make it an intrusion
12:09 - 12:11

prevention system, right? So detection
12:11 - 12:13

just alerts, just generates alerts or
12:13 - 12:16

events. Intrusion prevention is about
12:16 - 12:19

actually taking action or
12:19 - 12:23

acting upon the detected intrusion. So
12:23 - 12:26

what can such a device actually do
12:26 - 12:27

whenever they're seeing
12:27 - 12:29

something fishy going on inside of a
12:29 - 12:31

network? Well, they could do something as
12:31 - 12:33

simple as simply sending a TCP reset
12:33 - 12:36

packet to the originator of the
12:36 - 12:38

malicious connection. They could also
12:38 - 12:40

have some more advanced functionality
12:40 - 12:42

especially if it's the same
12:42 - 12:43

device that acts as a firewall. They
12:43 - 12:46

might be dynamically able to generate a
12:46 - 12:49

firewall rule to block similar traffic
12:49 - 12:51

like the one that was just detected as
12:51 - 12:54

being part of an attempt for an
12:54 - 12:57

attack or for a compromise. We could be
12:57 - 12:59

choosing if we're detecting something
12:59 - 13:01

that looks like a denial of service
13:01 - 13:03

attack, we could be choosing to limit the
13:03 - 13:06

amount of bandwidth that is allocated to
13:06 - 13:08

that specific type of traffic. Kind of
13:08 - 13:10

like policing that we're doing in
13:10 - 13:12

well, quality of service. In any case,
13:12 - 13:15

any type of action that the IPS device
13:15 - 13:17

can take against the malicious traffic,
13:17 - 13:19

we're going to call it active response.
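
A minimal sketch of how active response could be wired up is shown below. The `Detection` categories, the `ActiveResponder` class, and the 64 kbps limit are assumptions for illustration only; the code just records decisions and does not touch real network traffic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Detection:
    src_ip: str
    kind: str  # hypothetical categories: "intrusion" or "dos"


@dataclass
class ActiveResponder:
    blocked_ips: Set[str] = field(default_factory=set)
    rate_limits_kbps: Dict[str, int] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def respond(self, detection: Detection) -> str:
        """Pick an action for a detection and record it."""
        if detection.kind == "dos":
            # Police the source instead of cutting it off entirely.
            self.rate_limits_kbps[detection.src_ip] = 64  # assumed limit
            action = "rate-limit"
        elif detection.kind == "intrusion":
            # Stand-in for sending a TCP reset and adding a dynamic block rule.
            self.blocked_ips.add(detection.src_ip)
            action = "reset-and-block"
        else:
            action = "alert-only"
        self.log.append(f"{action} {detection.src_ip}")
        return action
```

The fallback branch keeps the IDS behavior: anything the responder does not know how to act on is only alerted.
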
13:19 - 13:21

And depending on how complex the device
13:21 - 13:23

is and how powerful the device is, you
13:23 - 13:25

might actually choose to look not just
13:25 - 13:29

at simple IPS or IDS signatures, but also
13:29 - 13:31

look for malware signatures. Yeah, that's
13:31 - 13:33

going to require you to, you know,
13:33 - 13:35

to decrypt encrypted traffic. It's going
13:35 - 13:38

to require you to identify potential
13:38 - 13:40

protocols that might be carrying files,
13:40 - 13:42

gather all those related packets that
13:42 - 13:44

belong to the same TCP stream to the
13:44 - 13:46

same flow, assemble them into an
13:46 - 13:48

executable file, store that in memory,
13:48 - 13:51

attempt to scan it with an antivirus
13:51 - 13:53

engine, and then determine if that flow
13:53 - 13:55

was actually malicious or not. Now, this
13:55 - 13:57

requires a lot of processing power. This
13:57 - 13:59

is going to create some sort of delay in
13:59 - 14:01

the networks of the users. They're going
14:01 - 14:04

to see their download unable to
14:04 - 14:06

finish or the application responding
14:06 - 14:10

slowly until the firewall, the UTM device,
14:10 - 14:12

or the intrusion prevention system is
14:12 - 14:14

actually able to scan those files
14:14 - 14:16

against malware signatures. On a lighter
14:16 - 14:18

approach, we could also just be looking
14:18 - 14:21

at URLs, looking for malicious domains or
14:21 - 14:22

domains that are associated with
14:22 - 14:24

malware or with the command and control
14:24 - 14:27

servers. We might be looking at URLs in
14:27 - 14:30

order to categorize those URLs and
14:30 - 14:32

figure out the reputation of that URL
14:32 - 14:34

and decide whether we want the
14:34 - 14:35

communication to that specific website
14:35 - 14:37

to proceed or not. So regardless if the
14:37 - 14:40

device is an IPS or an IDS, the detection
14:40 - 14:42

methods are pretty much the same. Now, the
14:42 - 14:45

difference is just in what the device is
14:45 - 14:47

actually doing. Is it only alerting or is
14:47 - 14:50

it actually taking an active response
14:50 - 14:51

approach to the traffic? But the
14:51 - 14:53

detection part is pretty much the same,
14:53 - 14:55

right? And when talking about detection,
14:55 - 14:58

we are going to start with the basic
14:58 - 15:00

type of detection that is where we're
15:00 - 15:01

just looking for signatures in the
15:01 - 15:04

database, which, of course, means that we
15:04 - 15:08

need to have an up-to-date database for
15:08 - 15:09

the device to be able to detect the
15:09 - 15:11

latest and the greatest attacks. Now, this
15:11 - 15:13

is basically one of the reasons why
15:13 - 15:14

people choose to pay for commercial
15:14 - 15:17

solutions because databases maintained
15:17 - 15:20

by a dedicated software or security
15:20 - 15:22

vendor that deals with intrusion
15:22 - 15:25

prevention, those databases are going to
15:25 - 15:29

be much more often updated and kept up
15:29 - 15:31

to date in order to mirror as best as
15:31 - 15:33

possible the database of all the known
15:33 - 15:35

attack patterns ever detected in the
15:35 - 15:37

world. Now with open source solutions,
15:37 - 15:39

you're still going to have
15:39 - 15:42

a pretty good level of protection, but
15:42 - 15:44

you might not be able to detect an
15:44 - 15:46

attack that was just identified
15:46 - 15:48

six hours ago. Nevertheless and
15:48 - 15:51

regardless how up-to-date your database
15:51 - 15:54

is, you're still limited by the attack
15:54 - 15:56

patterns listed in that database. If an
15:56 - 15:59

attack emerges and doesn't match
15:59 - 16:00

anything in your database, it's still
16:00 - 16:02

going to go through,
16:02 - 16:04

which leads us to a different approach,
16:04 - 16:06

and that is a behavioral approach. So
16:06 - 16:09

instead of looking at specific streams
16:09 - 16:11

of bytes, specific headers, specific
16:11 - 16:14

sequences of packets, let's look at the
16:14 - 16:17

overall behavior of an application or of
16:17 - 16:19

a protocol.
16:19 - 16:21

Does it look like it's doing what it's
16:21 - 16:24

supposed to do? Is it generating more
16:24 - 16:27

packets than we're used to seeing? Is it
16:27 - 16:29

generating more traffic? Is it
16:29 - 16:32

generating an abnormal amount of control
16:32 - 16:35

information as opposed to a real
16:35 - 16:38

data transfer? And we call this
16:38 - 16:40

behavioral monitoring. Now, in order for
16:40 - 16:41

behavioral monitoring to work, we need to
16:41 - 16:44

have something to compare that behavior
16:44 - 16:47

to, and say well, if it goes outside of
16:47 - 16:49

the known ranges,
16:49 - 16:51

then it looks like something's fishy.
16:51 - 16:55

Well that known range is supposed to be
16:55 - 16:58

your baseline. So such a device or such a
16:58 - 17:00

system is supposed to be trained first.
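
Here is a small Python sketch of the baseline idea: learn a normal range from training samples, then flag values that fall far outside it. Using the mean and standard deviation with a tolerance of three deviations is an assumption chosen for simplicity, not how any specific product models behavior.

```python
import statistics
from typing import List, Tuple


def learn_baseline(samples: List[float]) -> Tuple[float, float]:
    """Summarize training samples (for example, packets per minute) as mean and spread."""
    return statistics.mean(samples), statistics.pstdev(samples)


def is_anomalous(
    value: float,
    baseline: Tuple[float, float],
    tolerance: float = 3.0,  # assumed number of standard deviations
) -> bool:
    """Flag a value that falls outside the learned range."""
    mean, stdev = baseline
    return abs(value - mean) > tolerance * stdev
```

The quality of the result depends entirely on the training samples: if the training period misses normal events such as nightly backups, those events will later be flagged as anomalies.
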
17:00 - 17:02

You're supposed to just leave it inside
17:02 - 17:04

of the network for let's say a week or
17:04 - 17:08

two. Just let it figure out how
17:08 - 17:10

does a normal Monday morning look like
17:10 - 17:12

in your network when everybody comes
17:12 - 17:15

into work and they start logging in and
17:15 - 17:17

start updating their
17:17 - 17:18

machines and perhaps even their mobile
17:18 - 17:21

phones on the company Wi-Fi. But
17:21 - 17:23

nevertheless, you have to let that
17:23 - 17:26

intrusion prevention solution learn what
17:26 - 17:29

does your normal traffic look like when
17:29 - 17:30

people start accessing internal
17:30 - 17:32

applications, when people start
17:32 - 17:34

accessing internet destinations, when
17:34 - 17:37

people start communicating, sharing files
17:37 - 17:40

between each other, when backups
17:40 - 17:43

start to happen at midnight perhaps,
17:43 - 17:46

right? You have to let it learn so that
17:46 - 17:49

in a couple of weeks when something goes
17:49 - 17:51

outside of the known range where an
17:51 - 17:54

application behaves the way it did not
17:54 - 17:57

behave in the first training weeks, then
17:57 - 17:59

it's going to be able to raise an alarm
17:59 - 18:02

and perhaps indicate the fact that the
18:02 - 18:05

application has been compromised or that
18:05 - 18:08

somebody is using it in order to elevate
18:08 - 18:10

their privileges or just compromise your
18:10 - 18:12

network. And as you can probably guess,
18:12 - 18:15

this is one area where machine learning
18:15 - 18:17

is going to provide you a lot of benefit
18:17 - 18:21

provided that you take the time and effort
18:21 - 18:24

to educate, to teach the machine learning
18:24 - 18:26

system what your normal baseline
18:26 - 18:29

looks like. Now, of course, regardless of how
18:29 - 18:32

complex or how well-tuned your solution
18:32 - 18:34

is going to be, there will be false
18:34 - 18:36

positives and there will be false
18:36 - 18:40

negatives, which is why I always tell
18:40 - 18:42

students there's an old saying that
18:42 - 18:45

I heard from someone in Cisco a long,
18:45 - 18:48

long time ago, and they said that IPS
18:48 - 18:49

without eyes
18:49 - 18:53

is useless. So IPS without human eyes is
18:53 - 18:55

useless. There's always going to be
18:55 - 18:58

the need to have a human being right
18:58 - 19:01

there evaluating and analyzing whether
19:01 - 19:03

the alerts generated by the intrusion
19:03 - 19:05

prevention or detection system are valid
19:05 - 19:09

or not. Does it need more fine-tuning or
19:09 - 19:12

do we need to raise an alarm? So what
19:12 - 19:14

devices can we actually find that
19:14 - 19:16

implement this type of advanced
19:16 - 19:18

functionality, be it detection or
19:18 - 19:20

prevention. Well unfortunately, this is
19:20 - 19:23

the place where we're slowly
19:23 - 19:25

stepping into the marketing area. That's
19:25 - 19:27

because the devices that we're going to
19:27 - 19:29

be listing here are not completely
19:29 - 19:33

different devices, but over time,
19:33 - 19:35

different naming conventions have
19:35 - 19:37

emerged, different marketing names have
19:37 - 19:40

been invented to make them sound cool, to
19:40 - 19:42

make them sound different from what the
19:42 - 19:44

other vendors were doing. So we're going
19:44 - 19:46

to start with the next generation
19:46 - 19:48

firewall, and we've had this type of
19:48 - 19:50

next generation
19:50 - 19:53

for about 12 or 15 years already. I've
19:53 - 19:56

been hearing the next generation term in
19:56 - 19:59

IT security for so long that
19:59 - 20:01

I'm starting to wonder
20:01 - 20:05

are we still next generation, are we- have
20:05 - 20:07

we skipped the generation? Are we now in
20:07 - 20:10

the next next generation or where does
20:10 - 20:12

it stop, where does it end, where
20:12 - 20:14

does the next generation begin, right? Now,
20:14 - 20:15

unfortunately marketing people don't
20:15 - 20:17

really ask themselves these questions. So
20:17 - 20:19

we're kind of stuck with this
20:19 - 20:21

terminology for now, and we're gonna keep
20:21 - 20:23

calling it next generation until I
20:23 - 20:27

don't know when, but regardless, a next
20:27 - 20:29

generation firewall is basically just a
20:29 - 20:31

layer 7 firewall. That's an application
20:31 - 20:33

layer firewall which is able to look at
20:33 - 20:36

the application layer payload, so we're
20:36 - 20:38

actually seeing the data being sent,
20:38 - 20:40

we're not just looking at the packet
20:40 - 20:44

headers. And it also has some sort of
20:44 - 20:46

detection or prevention system built in,
20:46 - 20:49

okay? So we have an IPS or an IDS built
20:49 - 20:51

in, which leads us back to the discussion
20:51 - 20:53

that we had before. So we have an
20:53 - 20:55

application layer firewall which can be
20:55 - 20:57

enriched with additional functionality.
20:57 - 20:59

Now that we have access to the actual
20:59 - 21:01

application payload, well, why not
21:01 - 21:04

look for intrusion signatures, why not
21:04 - 21:06

look for malware signatures, why not look
21:06 - 21:09

for spam signatures, right? So depending
21:09 - 21:12

on how complex the device is, if it at
21:12 - 21:14

least has IPS functionality built in,
21:14 - 21:15

we're going to call it a next
21:15 - 21:17

generation firewall. And here's the funny
21:17 - 21:20

part, if the next generation firewall has
21:20 - 21:23

a bunch of other additional features on
21:23 - 21:26

top of the IPS functionality, such as
21:26 - 21:29

malware scanning, antivirus scanning,
21:29 - 21:30

perhaps looking at the files and being
21:30 - 21:32

able to implement some data loss
21:32 - 21:36

prevention policies, it's able to look
21:36 - 21:39

at the URLs and categorize them and
21:39 - 21:41

analyze the reputation of the web pages,
21:41 - 21:43

and pretty much everything that we could
21:43 - 21:45

possibly think of that we could be doing
21:45 - 21:47

just by looking at the application data,
21:47 - 21:49

then we're going to call this a unified
21:49 - 21:51

threat management device, a UTM device.
21:51 - 21:54

Again, I don't think I need to repeat
21:54 - 21:57

this, but the more complex the device
21:57 - 22:00

becomes, the more stuff it needs to do in
22:00 - 22:02

order to decide whether to allow a
22:02 - 22:04

packet or not, the more resources, the
22:04 - 22:07

more CPU intensive it's going to be, the
22:07 - 22:10

more memory it's going to require, and the
22:10 - 22:11

more delay that is going to be introduced in the
22:11 - 22:14

network. So keep this in mind. Even though
22:14 - 22:16

it kind of sounds cool, right, to have all
22:16 - 22:17

that security functionality in a single
22:17 - 22:19

box,
22:19 - 22:21

which by the way, try to make sure
22:21 - 22:23

it's not a single box of failure, single
22:23 - 22:25

point of failure, all right? [Laughs]
22:25 - 22:27

Even though it sounds cool to have all
22:27 - 22:29

this functionality in one place,
22:29 - 22:33

it's going to hit your performance
22:33 - 22:35

pretty badly, right? So keep this in mind.
22:35 - 22:38

Don't just enable everything blindly
22:38 - 22:41

because the end users, the applications,
22:41 - 22:43

and well, God forbid your
22:43 - 22:44

customers, your paying customers,
22:44 - 22:47

they're going to feel the effects of
22:47 - 22:50

your awesome UTM device, and
22:50 - 22:52

their application experience is going to
22:52 - 22:54

suffer. Now, a special type of network
22:54 - 22:56

monitoring device can also be considered,
22:56 - 22:58

a web application firewall. We've briefly
22:58 - 23:00

mentioned web application
23:00 - 23:03

firewalls in a previous video, and we
23:03 - 23:06

said that a WAF, a web application firewall,
23:06 - 23:09

is just a dedicated firewall that is
23:09 - 23:12

specifically trained and educated to
23:12 - 23:16

look at attack signatures aimed at web
23:16 - 23:18

applications. So we're looking for things
23:18 - 23:20

such as cross-site scripting, we're
23:20 - 23:21

looking for,
23:21 - 23:23

you know, directory traversals, we're
23:23 - 23:26

looking at SQL injection attacks. We're
23:26 - 23:27

looking at pretty much anything that
23:27 - 23:31

could be performed by a malicious user
23:31 - 23:35

that is trying to exploit an input
23:35 - 23:39

validation flaw in a web application. So
23:39 - 23:41

it's still an application layer firewall.
23:41 - 23:43

It still looks at the application
23:43 - 23:46

layer payload. It's just that it's a bit
23:46 - 23:49

more let's say, picky about what type of
23:49 - 23:51

traffic it is going to analyze. It's
23:51 - 23:53

only going to look at web traffic, and
23:53 - 23:56

it's only going to look for web
23:56 - 23:58

attacks, web application attacks. It's
23:58 - 24:00

mostly going to rely on signatures.
24:00 - 24:03

That's because we cannot really do much
24:03 - 24:07

when it comes to requests coming in from
24:07 - 24:09

our clients. Behavioral analysis
24:09 - 24:11

doesn't really play well here because
24:11 - 24:13

most attacks, especially web application
24:13 - 24:17

attacks, are just one single request, one
24:17 - 24:21

single query with a malicious payload.
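
As a rough illustration of why a single request is enough to decide, here is a minimal sketch of signature matching on one query string. The three patterns are hypothetical and deliberately tiny. Real WAF rule sets are far larger and more careful, so treat this as a picture of the idea, not as usable protection.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical, deliberately tiny signature set (an assumption for illustration).
SIGNATURES = {
    "cross-site scripting": re.compile(r"<\s*script", re.IGNORECASE),
    "directory traversal": re.compile(r"\.\./|\.\.\\"),
    "sql injection": re.compile(
        r"""['"]\s*or\s+\d+\s*=\s*\d+|union\s+select""", re.IGNORECASE
    ),
}


def inspect_request(raw_query: str) -> list[str]:
    """Return the names of all signatures that match one request's query string."""
    decoded = unquote_plus(raw_query)  # undo URL encoding before matching
    return [name for name, pattern in SIGNATURES.items() if pattern.search(decoded)]


print(inspect_request("q=%3Cscript%3Ealert(1)%3C/script%3E"))
print(inspect_request("file=../../etc/passwd"))
print(inspect_request("id=1' OR 1=1 --"))
print(inspect_request("q=network+monitoring"))
```

Each request is judged on its own: either a signature matches or it doesn't, which is the black-or-white behavior described here.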
24:21 - 24:23

So in many situations, it's going to be
24:23 - 24:25

either black or white, right? We're
24:25 - 24:28

detecting an attempt at an intrusion,
24:28 - 24:29

we're detecting an attack in that
24:29 - 24:32

request or not. There's pretty much not
24:32 - 24:34

going to be much of a gray area with
24:34 - 24:36

web application firewalls. And you could
24:36 - 24:39

deploy a WAF as a separate device. It
24:39 - 24:40

could be a physical box, it could be a
24:40 - 24:42

virtual machine, it could be a
24:42 - 24:45

functionality within a UTM device, again,
24:45 - 24:49

all in one wonders. But it can also be a
24:49 - 24:52

part of the web server itself. So we have
24:52 - 24:55

plugins that install alongside the
24:55 - 24:57

actual web server that is hosting the
24:57 - 24:59

web application, such as plugins for the
24:59 - 25:02

Apache web server, for the IIS web server,
25:02 - 25:05

on Windows Server, or for Nginx. So
25:05 - 25:07

we're installing these plugins right
25:07 - 25:10

there, and their purpose is to scan the
25:10 - 25:11

traffic that's coming in from the
25:11 - 25:14

clients before allowing that request to
25:14 - 25:16

be processed by the web server. Having
25:16 - 25:18

something such as a plugin that runs
25:18 - 25:20

alongside the web server on the same
25:20 - 25:23

machine, on the same box, opens us to the
25:23 - 25:26

risk of either having that machine
25:26 - 25:29

compromised by an attacker who, this time
25:29 - 25:31

doesn't target the web application, but
25:31 - 25:35

targets the scanning engine and can
25:35 - 25:36

intentionally cause, for example, a denial
25:36 - 25:39

of service, give it so much traffic to
25:39 - 25:41

analyze that the web server running on
25:41 - 25:43

the same machine is unable to actually
25:43 - 25:45

respond to valid requests. So there you
25:45 - 25:47

have it, that's the denial of service attack.
25:47 - 25:49

Now, when it comes to actually monitoring
25:49 - 25:51

the network traffic, we said that a
25:51 - 25:53

solution would be to just simply mirror
25:53 - 25:55

all the traffic, and then look for
25:55 - 25:56

specific attack patterns inside of that
25:56 - 25:59

traffic. Now, that might not be always
25:59 - 26:01

feasible because the amount of traffic
26:01 - 26:03

entering a data center or the server
26:03 - 26:06

farm that hosts an application might be
26:06 - 26:09

huge, right? So in some situations, we
26:09 - 26:11

might not be able to analyze the exact
26:11 - 26:14

amount of traffic that goes in, but we
26:14 - 26:17

might be able to generate a summary of
26:17 - 26:18

that traffic and then analyze that
26:18 - 26:21

summary for intrusion attempts. Now, this
26:21 - 26:24

traffic summary is sometimes found under
26:24 - 26:28

the terminology of NetFlow or sFlow or
26:28 - 26:30

J-Flow, which is basically just a
26:30 - 26:32

technology implemented by various
26:32 - 26:34

vendors out there in which instead of
26:34 - 26:37

creating an exact copy of the traffic,
26:37 - 26:39

we're simply summarizing that traffic,
26:39 - 26:42

and then reporting that summary back to
26:42 - 26:44

some analysis software. So we're only
26:44 - 26:47

telling it what type of sources, what
26:47 - 26:49

type of destinations have communicated,
26:49 - 26:51

how many bytes were used, what type of
26:51 - 26:53

protocols have been used,
26:53 - 26:56

what type of flags have been set in that
26:56 - 26:58

specific type of traffic. But we don't
26:58 - 27:01

put the burden of sending the entire
27:01 - 27:03

actual traffic and the entire payload to
27:03 - 27:05

that analysis software. Now, this also
27:05 - 27:07

means that we're losing application
27:07 - 27:09

layer visibility, all right? Since we're
27:09 - 27:11

just summarizing the type of traffic,
27:11 - 27:13

we're only describing the metadata about
27:13 - 27:16

that traffic, we're losing everything that
27:16 - 27:18

pertains to the application layer, but
27:18 - 27:20

we're gaining a lot of performance, and
27:20 - 27:22

we can also store this summary
27:22 - 27:24

information long term for further
27:24 - 27:27

analysis somewhere along the line in the
27:27 - 27:30

future. Sometimes, looking at traffic is
27:30 - 27:33

simply not feasible. Maybe we cannot grab
27:33 - 27:35

all the traffic that's running through
27:35 - 27:37

the network. Maybe we don't have network
27:37 - 27:39

devices smart enough to generate those
27:39 - 27:41

summaries, those flow
27:41 - 27:44

reports for us. So another solution would
27:44 - 27:47

be to simply have a software monitoring
27:47 - 27:49

solution, a so-called network
27:49 - 27:51

performance monitor that queries
27:51 - 27:54

periodically your networking devices,
27:54 - 27:57

queries your routers, your switches,
27:57 - 27:59

your wireless LAN controllers, your
27:59 - 28:02

firewalls perhaps about the status of
28:02 - 28:05

their physical resources, status of their
28:05 - 28:07

interfaces, how much traffic is going
28:07 - 28:10

through their interfaces, what's the CPU
28:10 - 28:12

load, what's the memory usage, what's the
28:12 - 28:15

structure of the routing table, how does
28:15 - 28:18

the ARP table look like, how is the DHCP
28:18 - 28:20

traffic looking like, right? So any type
28:20 - 28:22

of monitoring information that can be
28:22 - 28:23

extracted out of these networking
28:23 - 28:26

devices, which, in turn, can be correlated
28:26 - 28:28

in order to figure out if we can see
28:28 - 28:30

some anomalies in there. One such
28:30 - 28:33

solution is, for example, SolarWinds NPM,
28:33 - 28:35

network performance monitor, which is a
28:35 - 28:37

dedicated solution for monitoring not
28:37 - 28:39

just networking devices, but also servers
28:39 - 28:42

and virtual machines about their
28:42 - 28:45

health, right? How are their network
28:45 - 28:47

interfaces looking, how much load is
28:47 - 28:49

there on their hardware resources or
28:49 - 28:51

their hardware components, are they
28:51 - 28:53

generating any alerts, do we have failed
28:53 - 28:55

interfaces, do we have failed processes, do
28:55 - 28:58

we have something that's- failed links,
28:58 - 29:00

are we detecting errors or overloaded
29:00 - 29:03

devices? Stuff like that.
29:03 - 29:05

Now, this type of performance monitoring
29:05 - 29:08

can be done over a variety of protocols.
29:08 - 29:11

In most cases, the SNMP protocol is going
29:11 - 29:12

to be used because it allows us to
29:12 - 29:15

report a lot of the hardware counters
29:15 - 29:17

and a lot of the interesting information
29:17 - 29:19

that we want to gather and store long
29:19 - 29:22

term. Also, we might be using WMI, that is,
29:22 - 29:24

Windows Management Instrumentation, and a
29:24 - 29:26

couple other protocols as well. And of
29:26 - 29:28

course, we could enrich this collection
29:28 - 29:31

by collecting logs from the monitored
29:31 - 29:33

devices and appliances as well. And we
29:33 - 29:35

could be collecting those logs over
29:35 - 29:37

syslog, so we need to configure the
29:37 - 29:38

device to actually send those syslog
29:38 - 29:41

messages or at least a copy of them to
29:41 - 29:44

the monitoring device. Or we could rely
29:44 - 29:45

on an agent, an additional piece of
29:45 - 29:47

software installed on the server on the
29:47 - 29:49

virtual machine that periodically
29:49 - 29:51

reports back to us everything of
29:51 - 29:54

interest regarding that specific
29:54 - 29:56

host. When talking about dedicated
29:56 - 29:59

software specifically designed to
29:59 - 30:02

analyze a lot of information coming from
30:02 - 30:04

the network be it network traffic, network
30:04 - 30:07

summaries such as NetFlow, logs, and any
30:07 - 30:10

kind of application data, that solution
30:10 - 30:12

is most likely going to be called a SIEM,
30:12 - 30:14

a security information and event
30:14 - 30:16

management system. Now, the keyword in the
30:16 - 30:19

definition of SIEM is correlation. That
30:19 - 30:22

is it's not just a place where you just
30:22 - 30:24

dump all that information in a huge
30:24 - 30:27

database, it's a place that as you dump
30:27 - 30:29

that information, it's going to look for
30:29 - 30:31

patterns inside of it. It's going to try
30:31 - 30:34

to correlate network traffic with logs
30:34 - 30:37

or application data with NetFlow
30:37 - 30:39

data in order to figure out if some
30:39 - 30:42

anomalous behavior is detected in your
30:42 - 30:45

network. So SIEM solutions, and by the way,
30:45 - 30:47

these are pretty expensive solutions out
30:47 - 30:50

there, are never designed to be just log
30:50 - 30:52

storage, right? They're engines, smart
30:52 - 30:55

engines, often based on machine learning, that
30:55 - 30:58

aim to detect patterns of intrusion by
30:58 - 31:00

analyzing and correlating information
31:00 - 31:03

found in multiple log files, and what's
31:03 - 31:05

interesting about the implementation of
31:05 - 31:06

SIEMs is that they're supposed to
31:06 - 31:10

collect logs from your network devices,
31:10 - 31:12

from your security devices, even from
31:12 - 31:14

your workstations, and your mobile
31:14 - 31:16

devices perhaps. And they're able to
31:16 - 31:18

understand and correlate all that
31:18 - 31:19

information and normalize all that
31:19 - 31:22

information even if it comes from tens
31:22 - 31:24

or hundreds of vendors or thousands of
31:24 - 31:25

devices,
31:25 - 31:27

and they're able to normalize that
31:27 - 31:30

information and make it look the same so
31:30 - 31:31

that in the end,
31:31 - 31:34

it can look for patterns inside of it,
31:34 - 31:36

and it also allows you to perform
31:36 - 31:39

queries in a language quite similar to a
31:39 - 31:42

regular SQL language and query all that
31:42 - 31:45

information regardless of the fact that
31:45 - 31:47

it actually came from tens or
31:47 - 31:50

hundreds of different vendors. And since
31:50 - 31:51

a SIEM without machine learning
31:51 - 31:54

functionality is not a very useful SIEM,
31:54 - 31:56

we could use those machine learning
31:56 - 31:59

features to look at user behavior as
31:59 - 32:02

well because in the end, we're trying not
32:02 - 32:04

to detect just, you know, attack patterns,
32:04 - 32:06

we're also trying to identify who is
32:06 - 32:09

conducting them. And a great risk comes
32:09 - 32:11

from insider threats, so if we are able
32:11 - 32:13

to monitor what our users are doing,
32:13 - 32:15

we're not talking here about just
32:15 - 32:17

watching what websites they're
32:17 - 32:20

visiting or taking frequent screenshots
32:20 - 32:22

of their workstations, no we're
32:22 - 32:24

not doing that, but we're looking at the
32:24 - 32:26

behavior that they're exhibiting
32:26 - 32:28

whenever they are interacting with
32:28 - 32:31

specific applications. And if the SIEM
32:31 - 32:33

has such an ability, we call that ability
32:33 - 32:37

user and entity behavior analytics (UEBA). Don't
32:37 - 32:40

think that we're only performing here
32:40 - 32:43

a witch hunt against insider threats.
32:43 - 32:45

Think about the fact that we might be
32:45 - 32:48

able to detect abnormal behavior because
32:48 - 32:50

a user account has been compromised by a
32:50 - 32:52

hacker, and that hacker is now acting on
32:52 - 32:55

behalf of that user. The user might have
32:55 - 32:57

nothing to do with that abnormal
32:57 - 32:59

behavior, might not even know about it,
32:59 - 33:01

might not even be logged in at that
33:01 - 33:03

specific point in time. But the attacker
33:03 - 33:05

might be acting on behalf of that user.
33:05 - 33:07

If we're able to detect that abnormal
33:07 - 33:10

behavior, we might be able to detect the
33:10 - 33:14

attack going on right then. And stepping
33:14 - 33:16

just a bit into the realm of science
33:16 - 33:19

fiction here, I know that some vendors
33:19 - 33:20

will say no, this is not science fiction,
33:20 - 33:23

we're selling this, we've had huge
33:23 - 33:25

success with this. Well, yes and no. I'm
33:25 - 33:28

going to keep being a bit skeptical as to
33:28 - 33:31

how efficient this approach is. What I'm
33:31 - 33:33

talking about here is sentiment analysis
33:33 - 33:36

or emotion AI. That is, analyzing user
33:36 - 33:39

behavior in what content the user is
33:39 - 33:42

actually creating as in blog posts,
33:42 - 33:44

social media postings.
33:44 - 33:46

We're not talking here about actual, you
33:46 - 33:49

know, analyzing the contents of emails
33:49 - 33:51

and chats because that might, you
33:51 - 33:54

know, step into the privacy area which we
33:54 - 33:56

might not want to do. But by
33:56 - 33:58

analyzing publicly available information
33:58 - 34:00

generated by those users, we might be
34:00 - 34:03

able to detect disgruntled employees. We
34:03 - 34:06

might be able to detect unsatisfied
34:06 - 34:08

clients that might create some bad
34:08 - 34:11

reputation for the company, perhaps even
34:11 - 34:14

before they become so upset as to take
34:14 - 34:17

action or malicious action against our
34:17 - 34:19

company. Again, take this with a grain of
34:19 - 34:23

salt, and don't just think that if it
34:23 - 34:27

sounds awesome on paper, it has to be
34:27 - 34:29

awesome in real life. If it sounds too
34:29 - 34:31

good to be true, then it probably is too
34:31 - 34:33

good to be true.
34:33 - 34:35

And finally, the last term here that I
34:35 - 34:37

wanted you to know about is SOAR,
34:37 - 34:39

security orchestration, automation and
34:39 - 34:41

response. That's a mouthful, I know. It's
34:41 - 34:42

usually a functionality built into SIEM
34:42 - 34:44

solutions or it can be just a standalone
34:44 - 34:47

solution. What it basically tries to
34:47 - 34:49

address is the problem of too much
34:49 - 34:52

information, that is, being overwhelmed by
34:52 - 34:55

too many alerts, too many security events,
34:55 - 34:57

too many security incidents, too many
34:57 - 34:58

incidents that we need to determine if
34:58 - 35:00

they're security related or not. [Laughs]
35:00 - 35:04

Basically the hell of any IT Department
35:04 - 35:07

that deals solely with monitoring the
35:07 - 35:09

network and the applications. And the
35:09 - 35:11

idea behind this is that a SOAR
35:11 - 35:13

solution is supposed to use some machine
35:13 - 35:16

learning techniques in order to not just
35:16 - 35:19

figure out which anomalous events are
35:19 - 35:22

occurring in the network, but by
35:22 - 35:24

analyzing those anomalous events, it is
35:24 - 35:28

able to take some action against them. So
35:28 - 35:31

it could, at some point, determine if an
35:31 - 35:33

attack is going on, even if it happens
35:33 - 35:34

in the middle of the night, and take
35:34 - 35:36

action immediately by blocking some
35:36 - 35:39

ports, by creating an access list, by
35:39 - 35:40

temporarily disabling some user
35:40 - 35:41

accounts that might have been
35:41 - 35:43

compromised. So that's security
35:43 - 35:47

orchestration, automation and response.
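
A minimal sketch of the "response" part, assuming a hypothetical playbook that maps an alert category to an action. The alert kinds, the confidence threshold, and the action strings are all invented for illustration, and the function only returns a description of what it would do instead of touching any real device.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    kind: str          # hypothetical alert category assigned by the detection engine
    subject: str       # the user, host, or port the alert is about
    confidence: float  # 0.0 to 1.0, how sure the engine is


# Hypothetical playbook: alert kind -> response action template.
PLAYBOOK = {
    "compromised_account": "temporarily disable account {subject}",
    "port_scan": "add access-list entry blocking {subject}",
    "malicious_service": "block port {subject}",
}


def respond(alert: Alert, threshold: float = 0.9) -> str:
    """Pick an automated action, or escalate to a human when unsure."""
    template = PLAYBOOK.get(alert.kind)
    if template is None or alert.confidence < threshold:
        return f"escalate to analyst: {alert.kind} on {alert.subject}"
    return template.format(subject=alert.subject)


print(respond(Alert("compromised_account", "jsmith", 0.97)))
print(respond(Alert("port_scan", "203.0.113.9", 0.55)))
```

The threshold is the design choice that matters: below it, the alert still goes to a person, so automation only acts on the cases it is fairly sure about.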
35:47 - 35:49

Just to be sure everybody is clear on this,
35:49 - 35:51

especially for the exam, where does the
35:51 - 35:53

SIEM get its information from? Well,
35:53 - 35:55

first of all, it's going to get it from
35:55 - 35:57

logs, right? Syslogs. That's going to be
35:57 - 35:58

the main source of information. How do
35:58 - 36:00

you collect logs? Well you don't really
36:00 - 36:03

collect them. You expect those devices to
36:03 - 36:05

send those to you, so those devices need
36:05 - 36:07

to be configured be it networking
36:07 - 36:09

devices. They might be servers, they
36:09 - 36:10

might be virtual machines, whatever type
36:10 - 36:12

of device you have, just configure them
36:12 - 36:15

to send their logs to a secondary
36:15 - 36:17

destination if the SIEM is not the
36:17 - 36:19

primary one. Just make sure they send a
36:19 - 36:22

copy of those syslogs to the SIEM device
36:22 - 36:24

as well. Next, the SIEM can also collect
36:24 - 36:27

data by installing agents on specific
36:27 - 36:29

systems. Now of course, we might not be
36:29 - 36:31

able to install agents on let's say
36:31 - 36:34

routers or switches apart from some
36:34 - 36:37

recent devices that are running Docker
36:37 - 36:40

containers perhaps. But in most cases, SIEM
36:40 - 36:42

agents are designed to be installed on
36:42 - 36:45

Windows and Linux systems. Then they're
36:45 - 36:47

running as background processes that
36:47 - 36:49

periodically scan the system and
36:49 - 36:51

report back to the SIEM the logs
36:51 - 36:53

generated by the operating system, the
36:53 - 36:55

running applications, the logs generated
36:55 - 36:57

by the applications, actually, running on
36:57 - 36:59

that host, depending on how the agent is
36:59 - 37:02

configured. The built-in listeners or
37:02 - 37:03

collectors that you're seeing here on
37:03 - 37:05

the slide refer to the fact that the
37:05 - 37:09

SIEM is pre-configured or has plugins
37:09 - 37:12

that allow it to understand what
37:12 - 37:14

different vendors are reporting back to
37:14 - 37:15

it. So it's going to have different
37:15 - 37:18

plugins to understand logs coming in
37:18 - 37:20

from, you know, Cisco devices, HP devices,
37:20 - 37:25

Dell, VMware, whatever vendor it is, it
37:25 - 37:26

needs some sort of a plugin to
37:26 - 37:29

understand that specific log format and
37:29 - 37:31

more than that, it needs to
37:31 - 37:34

understand the contents of the payload of
37:34 - 37:38

what the log is saying. SNMP traps, again,
37:38 - 37:40

most monitoring information is going to
37:40 - 37:43

come in through an SNMP query or as an
37:43 - 37:46

SNMP trap generated by the device back
37:46 - 37:50

to the SIEM. And also NetFlow. NetFlow or
37:50 - 37:52

different variants implemented by
37:52 - 37:54

different vendors are basically just
37:54 - 37:58

summaries of the traffic flows detected
37:58 - 38:01

over a certain period of time, collected,
38:01 - 38:03

and then sent over to the SIEM device
38:03 - 38:05

in order for that traffic summary to be
38:05 - 38:09

analyzed. Finally, the SIEM can also
38:09 - 38:12

capture raw packet data if it has
38:12 - 38:15

dedicated sensors that are able to
38:15 - 38:17

generate a copy of the traffic and send
38:17 - 38:19

it back to the SIEM, or we can even have
38:19 - 38:21

sensors installed inside our network that
38:21 - 38:25

are monitoring real traffic, and they're
38:25 - 38:26

only telling back to the SIEM or they're
38:26 - 38:29

reporting back to the SIEM a summary of
38:29 - 38:31

that traffic. This is very useful when
38:31 - 38:35

your devices don't have enough reporting
38:35 - 38:38

or monitoring capabilities to report
38:38 - 38:40

back to the SIEM device, and instead, you need
38:40 - 38:42

to install some specific sensors that
38:42 - 38:45

look at the traffic, and then tell the
38:45 - 38:46

SIEM the necessary information that it
38:46 - 38:48

needs to perform those correlations.
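
To picture what such a sensor reports, here is a minimal sketch that collapses individual packets into a traffic summary. The packet tuples are hypothetical, and the flow key is simplified to source, destination, and protocol. Real flow records such as NetFlow carry more fields, for example ports and TCP flags.

```python
from collections import defaultdict

# Hypothetical packet observations: (source, destination, protocol, bytes).
packets = [
    ("10.0.0.5", "192.0.2.10", "tcp", 1500),
    ("10.0.0.5", "192.0.2.10", "tcp", 900),
    ("10.0.0.7", "192.0.2.53", "udp", 80),
    ("10.0.0.5", "192.0.2.10", "tcp", 600),
]


def summarize(observed):
    """Collapse individual packets into per-flow packet and byte counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, proto, size in observed:
        flow = flows[(src, dst, proto)]
        flow["packets"] += 1
        flow["bytes"] += size
    return dict(flows)


for key, counters in summarize(packets).items():
    print(key, counters)
```

Four packets become two summary records, and the payload is gone entirely, which is the trade-off between performance and application layer visibility.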
38:48 - 38:51

Sometimes a sensor such as this one might be
38:51 - 38:54

an IPS or an IDS device even. Log
38:54 - 38:57

normalization is a feature built into
38:57 - 38:58

most SIEM solutions out there. And
38:58 - 39:00

normalization is required, and it's a
39:00 - 39:03

very important feature because the SIEM
39:03 - 39:04

is designed to collect information from
39:04 - 39:07

hundreds of vendors and thousands of
39:07 - 39:09

different appliances, each of them
39:09 - 39:11

running different operating systems on
39:11 - 39:13

different versions, and they're all
39:13 - 39:16

building syslogs and SNMP traps in
39:16 - 39:18

different formats. Some are
39:18 - 39:21

reporting them as text, some are
39:21 - 39:23

generating logs in binary format, some
39:23 - 39:26

logs are in JSON format, some are in XML
39:26 - 39:30

format or CSV format, depending on how
39:30 - 39:32

the vendor actually designed its logging
39:32 - 39:35

and monitoring abilities. We might even
39:35 - 39:38

find differences as to how the logs are
39:38 - 39:39

actually encoded. Some of them might
39:39 - 39:41

be using UTF-8, some of them might be using
39:41 - 39:44

some regional encoding. We might even run
39:44 - 39:46

into some issues due to the fact that
39:46 - 39:48

the new line character is represented
39:48 - 39:50

differently between Windows and Linux
39:50 - 39:51

systems, and that also might be reflected
39:51 - 39:55

in the payload included in the logs that
39:55 - 39:57

we're receiving as part of the
39:57 - 39:59

monitoring process. Not to mention the
39:59 - 40:02

fact that the SNMP MIBs, basically
40:02 - 40:05

the database schemas that each vendor is
40:05 - 40:08

using for their own software solutions
40:08 - 40:09

or hardware appliances, these are
40:09 - 40:11

completely different not just among
40:11 - 40:13

vendors, but also among different
40:13 - 40:16

products from the same vendor. So in
40:16 - 40:17

order to have all this bunch of
40:17 - 40:19

information collected in some
40:19 - 40:21

centralized location and to be able to
40:21 - 40:23

query all this information and to be
40:23 - 40:26

able to approach it in a consistent
40:26 - 40:28

manner, we need normalization. That is
40:28 - 40:31

taking all this information coming from
40:31 - 40:34

so many vendors in so many formats and
40:34 - 40:37

making that information look exactly the
40:37 - 40:40

same so that it can be stored in a
40:40 - 40:42

single database that can be queried at
40:42 - 40:44

once regardless of the source of that
40:44 - 40:47

information. So what are we using to
40:47 - 40:49

normalize all this information coming
40:49 - 40:50

from all these vendors? Well, you guessed
40:50 - 40:53

it. We're gonna need some plugins. Some of
40:53 - 40:55

these plugins come from the SIEM vendor
40:55 - 40:57

itself. So they're going to be
40:57 - 40:59

pre-packaged with vendor plugins
40:59 - 41:02

from major vendors out there. Some of
41:02 - 41:03

these plugins are going to come from the
41:03 - 41:06

actual vendors. So if a smaller vendor
41:06 - 41:08

creates, let's say, smaller
41:08 - 41:11

firewalls at some point, and they want to
41:11 - 41:12

be able to integrate with the
41:12 - 41:14

large-scale SIEM solutions, they're going
41:14 - 41:16

to provide you with a plugin for their
41:16 - 41:18

own environment as well. And another type
41:18 - 41:20

of normalization that is really, really
41:20 - 41:22

important is timestamp normalization.
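
A minimal sketch of what timestamp normalization could look like, assuming three hypothetical devices that report the same instant as Unix epoch seconds, as 24-hour UTC text, and as 12-hour local time at UTC-5. The function name and the formats are assumptions for illustration. A real SIEM would take them from its vendor plugins.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(raw: str, source_format: str, utc_offset_hours: float = 0.0) -> str:
    """Convert one device timestamp to an ISO 8601 string in UTC.

    source_format is "epoch", or a strptime pattern for local device time.
    utc_offset_hours is the device's offset from UTC (ignored for epoch).
    """
    if source_format == "epoch":
        moment = datetime.fromtimestamp(int(raw), tz=timezone.utc)
    else:
        local = datetime.strptime(raw, source_format)
        device_zone = timezone(timedelta(hours=utc_offset_hours))
        moment = local.replace(tzinfo=device_zone).astimezone(timezone.utc)
    return moment.isoformat()


# The same instant, reported three different ways.
print(to_utc_iso("1700000000", "epoch"))
print(to_utc_iso("2023-11-14 22:13:20", "%Y-%m-%d %H:%M:%S"))
print(to_utc_iso("11/14/2023 05:13:20 PM", "%m/%d/%Y %I:%M:%S %p", utc_offset_hours=-5))
```

Once every event carries the same UTC representation, events from different devices can be ordered and compared directly. Note that this cannot repair a badly configured clock. It only makes correctly kept time comparable.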
41:22 - 41:23

Don't forget that we're looking for
41:23 - 41:26

anomalies in network traffic and in
41:26 - 41:29

network events. And if we don't have
41:29 - 41:32

timestamp normalization, if we don't make
41:32 - 41:35

sure that all the events that we're
41:35 - 41:38

looking at are actually stored with
41:38 - 41:40

their right timestamp, at their right
41:40 - 41:42

moment in time when they actually
41:42 - 41:45

happened, we have no chance of detecting
41:45 - 41:47

anomalies in the network. So we might
41:47 - 41:50

have devices that have a badly
41:50 - 41:52

configured clock. We might have devices
41:52 - 41:54

that have been configured for different
41:54 - 41:57

time zones. We might have devices that
41:57 - 42:00

display timestamps, those time
42:00 - 42:03

values in their logs in one
42:03 - 42:05

format versus another format. Some of
42:05 - 42:07

them might be using 24-hour time, some of them
42:07 - 42:09

might be using 12-hour time. Some of them
42:09 - 42:11

might include the daylight savings time.
42:11 - 42:14

Some of them might be using UTC or
42:14 - 42:18

Unix epoch time. It's up to the vendor, so
42:18 - 42:21

normalizing these timestamps is also a
42:21 - 42:24

very, very important topic here that
42:24 - 42:26

needs to be taken care of by the SIEM
42:26 - 42:30

solution before that event indicated by
42:30 - 42:32

that specific timestamp is stored in the
42:32 - 42:35

database alongside the others. Now,
42:35 - 42:37

how can a SIEM solution look for
42:37 - 42:40

anomalies in that huge database that we
42:40 - 42:42

just talked about? Well, it could be done
42:42 - 42:44

in a number of ways. We could just rely
42:44 - 42:48

on simple if-then-else matches, so we're
42:48 - 42:50

looking for, you know, specific events,
42:50 - 42:53

specific types of logs being generated
42:53 - 42:56

in a specific time range perhaps. This
42:56 - 42:58

type of approach is the fastest one
42:58 - 43:00

because it basically boils down to
43:00 - 43:02

a simple query in that huge database
43:02 - 43:04

stored by the SIEM appliance.
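
To make the "simple query" idea concrete, here is a minimal sketch of an exact-match rule over already normalized events. The event fields, addresses, and the time window are all made up for illustration; no real SIEM stores events exactly like this.

```python
from datetime import datetime

# Hypothetical normalized events; the field names are illustrative only.
events = [
    {"time": datetime(2024, 1, 10, 2, 58), "type": "login_failed", "src": "203.0.113.9"},
    {"time": datetime(2024, 1, 10, 3, 1), "type": "login_failed", "src": "203.0.113.9"},
    {"time": datetime(2024, 1, 10, 3, 2), "type": "login_ok", "src": "10.0.0.5"},
    {"time": datetime(2024, 1, 10, 9, 30), "type": "login_failed", "src": "10.0.0.8"},
]


def match_rule(event, event_type, start, end):
    """Exact-match rule: the event type and the time window must both match."""
    return event["type"] == event_type and start <= event["time"] <= end


# Rule: failed logins between 02:00 and 04:00 on one specific night.
window_start = datetime(2024, 1, 10, 2, 0)
window_end = datetime(2024, 1, 10, 4, 0)
hits = [e for e in events if match_rule(e, "login_failed", window_start, window_end)]

for hit in hits:
    print(hit["time"].isoformat(), hit["src"])
```

The rule is fast because it is only a filter, but it can only find what it was written to find.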
43:04 - 43:07

Unfortunately, if there are unknown
43:07 - 43:08

threats, if there are attacks that we
43:08 - 43:11

know nothing about, that we don't have a
43:11 - 43:12

signature for, we don't know what to
43:12 - 43:14

look for, we're not going to be able to
43:14 - 43:17

detect them. Kind of makes sense, right? So
43:17 - 43:18

another approach would be heuristic rule
43:18 - 43:21

matching. This is a type of rule matching
43:21 - 43:23

where we're not exactly looking for an
43:23 - 43:24

exact match
43:24 - 43:27

for the specific type of event, but we're
43:27 - 43:30

looking for something that's pretty
43:30 - 43:33

close to it, all right? So this type of
43:33 - 43:34

approach
43:34 - 43:38

relies on a more permissive set of rules.
43:38 - 43:41

So if it doesn't 100% match a rule, let's
43:41 - 43:44

say if we have some events that are
43:44 - 43:46

pretty close to it and match it like
43:46 - 43:48

let's say 80% or 90%.
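
One way to picture that 80% figure is a rule made of several conditions, where an event counts as a match if enough of them are satisfied. This is only a sketch under that assumption. The rule, its fields, and the scoring are hypothetical, and real SIEM heuristics are usually more elaborate.

```python
# Hypothetical rule: five conditions that together describe a known attack pattern.
RULE = {
    "type": "login_failed",
    "protocol": "ssh",
    "dst_port": 22,
    "user": "root",
    "src_country": "unknown",
}


def match_score(event: dict, rule: dict) -> float:
    """Fraction of the rule's conditions that the event satisfies."""
    satisfied = sum(1 for field, expected in rule.items() if event.get(field) == expected)
    return satisfied / len(rule)


def heuristic_match(event: dict, rule: dict, threshold: float = 0.8) -> bool:
    """Accept events that satisfy at least `threshold` of the rule."""
    return match_score(event, rule) >= threshold


# Same pattern as the rule, except the attacker tried "admin" instead of "root".
near_miss = {
    "type": "login_failed",
    "protocol": "ssh",
    "dst_port": 22,
    "user": "admin",
    "src_country": "unknown",
}
unrelated = {"type": "login_ok", "protocol": "ssh", "dst_port": 22, "user": "bob"}

print(match_score(near_miss, RULE), heuristic_match(near_miss, RULE))
print(match_score(unrelated, RULE), heuristic_match(unrelated, RULE))
```

An exact-match rule would have missed the first event. The permissive one catches it, at the cost of needing tuning.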
43:48 - 43:51

Now this also requires you to fine-tune
43:51 - 43:54

your rule set, so if at some point by
43:54 - 43:57

doing heuristic rule matching, you're
43:57 - 43:59

detecting some anomalies, but you don't
43:59 - 44:03

have a rule that matches that anomaly
44:03 - 44:06

100%, well, you better create it, right?
44:06 - 44:08

You better fine tune your rule set and
44:08 - 44:10

add some more rules or tweak the
44:10 - 44:13

existing ones to match that newly
44:13 - 44:16

detected anomaly, and just to recap this
44:16 - 44:19

here, a behavioral analysis implemented
44:19 - 44:21

in a SIEM relies on the fact that you need
44:21 - 44:23

to build a baseline. You need to tell the
44:23 - 44:26

SIEM what your normal looks like, what
44:26 - 44:28

your normal traffic looks like, what
44:28 - 44:31

the normal logs generated by all
44:31 - 44:32

the devices and all the applications in
44:32 - 44:35

your network look like. So that, in turn,
44:35 - 44:37

can be used as a starting point in order
44:37 - 44:40

to detect potential, well, mismatches that
44:40 - 44:44

might indicate attacks or attempts at
44:44 - 44:46

compromising your network. Now of course,
44:46 - 44:47

this is going to create a lot of false
44:47 - 44:49

positives. So you might run into a situation
44:49 - 44:52

where an alert is being raised because
44:52 - 44:56

an application starts generating some
44:56 - 44:58

huge backups because some admin has
44:58 - 45:01

modified the backup policy. Now the SIEM
45:01 - 45:03

device sees a lot of traffic in there,
45:03 - 45:06

raises an alert, raises everyone from their
45:06 - 45:09

sleep at 3 a.m., saying
45:09 - 45:11

that, oh my god, this looks like a data
45:11 - 45:13

exfiltration attempt. Somebody is
45:13 - 45:16

dumping all the data from our database,
45:16 - 45:17

and then an admin has to come in and
45:17 - 45:20

intervene and say, my dear SIEM, what's
45:20 - 45:22

happening in there, what you're seeing is
45:22 - 45:24

just a full backup happening at 3am in
45:24 - 45:28

the morning. It's okay, right? Don't freak
45:28 - 45:31

out about it, okay? So it does require
45:31 - 45:33

human intervention for fine tuning these
45:33 - 45:35

rules.
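
The backup story can be sketched as a baseline check plus a human-maintained exception. Everything here is an assumption for illustration: the traffic figures, the three-standard-deviation tolerance, and the allow-list entry an admin might add after confirming the backup is legitimate.

```python
from statistics import mean, stdev

# Hypothetical nightly outbound traffic in gigabytes, observed during a learning period.
baseline_gb = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7]


def is_anomalous(observed_gb: float, history: list[float], tolerance: float = 3.0) -> bool:
    """Flag values more than `tolerance` standard deviations from the baseline mean."""
    return abs(observed_gb - mean(history)) > tolerance * stdev(history)


# Exception added by an admin: (hypothetical source name, hour of day).
known_benign = {("backup-server", 3)}


def should_alert(source: str, hour: int, observed_gb: float) -> bool:
    """Alert on anomalies unless a human has already marked this case as benign."""
    if (source, hour) in known_benign:
        return False
    return is_anomalous(observed_gb, baseline_gb)


print(should_alert("db-server", 3, 40.0))      # far above the baseline
print(should_alert("backup-server", 3, 40.0))  # same volume, but allow-listed
print(should_alert("db-server", 3, 5.1))       # within normal variation
```

Without the allow-list entry, the full backup would look exactly like data exfiltration to this check.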
45:35 - 45:36

On the other hand, we have anomaly
45:36 - 45:39

analysis. And this is, by definition, a
45:39 - 45:40

type of analysis that is performed
45:40 - 45:43

whenever we're comparing observed
45:43 - 45:47

behavior with known standard behavior,
45:47 - 45:49

especially when we're comparing what
45:49 - 45:51

we're seeing as part of a protocol's
45:51 - 45:54

behavior with how the SIEM device
45:54 - 45:56

knows the protocol is supposed to
45:56 - 45:59

behave according to its RFC, according to
45:59 - 46:01

its definition. Finally with trend
46:01 - 46:03

analysis, we're going to be looking at
46:03 - 46:06

historic data and try to extrapolate it.
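
A minimal sketch of that extrapolation, using backup sizes like the ones in the example from this part of the talk. It assumes the simplest possible model, the average week-over-week change, and a hypothetical 25% tolerance. A real product would likely use something more robust.

```python
def expected_next(history: list[float]) -> float:
    """Extrapolate the next value from the average week-over-week change.

    Needs at least two data points.
    """
    steps = [later - earlier for earlier, later in zip(history, history[1:])]
    return history[-1] + sum(steps) / len(steps)


def fits_trend(observed: float, history: list[float], tolerance: float = 0.25) -> bool:
    """True when the observation is within `tolerance` (a fraction) of the projection."""
    projected = expected_next(history)
    return abs(observed - projected) <= tolerance * projected


weekly_backup_gb = [5.0, 8.0]  # two consecutive weekly backup sizes, in gigabytes

print(expected_next(weekly_backup_gb))     # projection for the following week
print(fits_trend(12.0, weekly_backup_gb))  # growth in line with the trend
print(fits_trend(40.0, weekly_backup_gb))  # far beyond what the trend predicts
```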
46:06 - 46:08

For example, if we see that the backups
46:08 - 46:10

are increasing every single week because
46:10 - 46:13

more and more data is generated, the
46:13 - 46:16

SIEM device might be able to generate a
46:16 - 46:19

pattern so that if we see five gigabytes
46:19 - 46:22

in a backup this week, and eight gigabytes
46:22 - 46:25

of backups next week, when it is going to
46:25 - 46:28

see 12 gigabytes two weeks from now, it's
46:28 - 46:30

not going to raise an alert because it
46:30 - 46:32

expected the backup volume to increase
46:32 - 46:34

by that amount. But I don't need to tell
46:34 - 46:36

you that not everything can be safely
46:36 - 46:40

predicted this way. Finally, after all
46:40 - 46:42

that advanced correlation and machine
46:42 - 46:45

learning and AI features, the SIEMs
46:45 - 46:49

actually can be used as a database for
46:49 - 46:53

event storage, and they can be queried by
46:53 - 46:55

human users, by admins if you know what
46:55 - 46:57

to look for. Perhaps you just need to
46:57 - 46:59

investigate some event. Perhaps you need
46:59 - 47:02

to perform some forensic analysis.
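
A sketch of what such an investigative query could look like, using an in-memory SQLite database as a stand-in for the SIEM's event store. The table layout and the sample events are hypothetical, and real SIEM products expose their own query languages, so take this only as the general shape of the idea.

```python
import sqlite3

# In-memory stand-in for the SIEM's normalized event store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, src_ip TEXT, username TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("2024-01-10T02:58:00", "203.0.113.9", "alice", "login failed"),
        ("2024-01-10T03:01:00", "203.0.113.9", "alice", "login succeeded"),
        ("2024-01-10T03:05:00", "10.0.0.5", "bob", "file download"),
        ("2024-01-11T09:00:00", "203.0.113.9", "carol", "login failed"),
    ],
)

# Everything one IP address did within one specific day.
rows = conn.execute(
    """
    SELECT ts, username, message
    FROM events
    WHERE src_ip = ?
      AND ts BETWEEN ? AND ?
    ORDER BY ts
    """,
    ("203.0.113.9", "2024-01-10T00:00:00", "2024-01-10T23:59:59"),
).fetchall()

for row in rows:
    print(row)
```

The time range filter works on plain text here only because the timestamps were normalized to one sortable format before being stored.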
47:02 - 47:04

So those databases become available to
47:04 - 47:08

you, to any admin basically, simply by
47:08 - 47:10

creating specific rules in order to
47:10 - 47:12

match specific types of events stored in
47:12 - 47:14

there. So you could create simple rules
47:14 - 47:16

that match based on
47:16 - 47:18

specific conditions. Look for one
47:18 - 47:20

specific IP address or look for a
47:20 - 47:22

specific time range, look for one
47:22 - 47:26

specific string that might occur in all
47:26 - 47:28

those log payloads. Maybe look for a user
47:28 - 47:32

and see what are the events that are
47:32 - 47:34

generated by the user or that
47:34 - 47:36

implicate that user and so on and so
47:36 - 47:37

forth. So the SIEM appliances are going
47:37 - 47:40

to allow you to create some queries very
47:40 - 47:41

similar to what you might be already
47:41 - 47:44

used to if you ever used SQL in the past
47:44 - 47:46

because all that data is basically
47:46 - 47:48

stored in a relational database which
47:48 - 47:51

can be queried with an SQL-like
47:51 - 47:54

language. And finally, don't forget that
47:54 - 47:56

at the end of the day, not everybody has
47:56 - 47:59

money to invest in a SIEM solution, so you
47:59 - 48:01

might end up having to analyze your logs
48:01 - 48:04

by yourself, just navigating a bunch of
48:04 - 48:07

logs. And this is where a bunch of text
48:07 - 48:10

matching utilities, especially some
48:10 - 48:11

utilities that are built into most Linux
48:11 - 48:14

distributions are going to come in and
48:14 - 48:16

help you tremendously. Now, this is not a
48:16 - 48:18

Linux course, and the exam is not going
48:18 - 48:20

to expect you to know everything about
48:20 - 48:24

all these command-line utilities. But I
48:24 - 48:26

would say that knowing at least the
48:26 - 48:28

commands right here on the slide is
48:28 - 48:30

going to help you figure out a couple of
48:30 - 48:33

the outputs on the exam. Alright, so without
48:33 - 48:35

going into too much detail here, let's
48:35 - 48:38

have a look in one of my folders here
48:38 - 48:40

that stores log files in an Ubuntu
48:40 - 48:43

distribution. This is running on WSL,
48:43 - 48:45

right, Windows Subsystem for Linux. We
48:45 - 48:48

have a log file right here, dpkg.log,
48:48 - 48:50

which is the log that's generated by the
48:50 - 48:52

package manager. So this log is going to
48:52 - 48:54

tell me which package-based operations
48:54 - 48:56

have been conducted on this machine
48:56 - 48:58

from its beginning, from its
48:58 - 49:00

installation, right? What did I install,
49:00 - 49:02

what did I uninstall, what did I upgrade?
49:02 - 49:04

So there might be some useful information
49:04 - 49:07

in here. So let's just see a couple of
49:07 - 49:09

these commands. 'cat' is the concatenate
49:09 - 49:11

command in Linux and can also be used to
49:11 - 49:14

list the contents of
49:14 - 49:18

text files. So cat dpkg.log is going to
49:18 - 49:19

provide you with a long listing right
49:19 - 49:22

here, trying to display all the contents
49:22 - 49:24

of the text file right at the console.
49:24 - 49:26

Now, this file right here, we can also
49:26 - 49:30

pipe it, that is, send the output of this cat
49:30 - 49:32

command to another command, which could
49:32 - 49:35

be word count, wc -l. This is
49:35 - 49:37

going to count the lines in this log file.
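
As a rough sketch of the pipe just described: the snippet below builds a small stand-in log and counts its lines. The `sample_log` variable and every log entry in it are made up for illustration; on a real Ubuntu system the file would typically be /var/log/dpkg.log and the count would be much larger.

```shell
#!/bin/sh
# Build a tiny stand-in for dpkg.log. These entries are invented for
# illustration and only imitate the general shape of dpkg log lines.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2021-03-01 10:00:01 install ansible:all <none> 2.8.1
2021-03-01 10:00:02 status installed ansible:all 2.8.1
2021-09-14 08:30:10 upgrade ansible:all 2.8.1 2.9.19
2021-09-14 08:30:11 status installed ansible:all 2.9.19
2022-01-20 17:45:00 install curl:amd64 <none> 7.68.0
EOF

# Print the whole file to the console.
cat "$sample_log"

# Pipe the output of cat into wc -l, which counts lines.
line_count=$(cat "$sample_log" | wc -l)
echo "lines: $line_count"

# Remove the temporary file.
rm -f "$sample_log"
```

The same count can be obtained without the extra cat process by redirecting the file into wc, as in `wc -l < file`.
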
49:37 - 49:40

So you can see it's over 9000 lines
49:40 - 49:42

long. Pretty tough to search for some
49:42 - 49:46

information in a 9000-line log file. So
49:46 - 49:49

what we can do right here is, for example,
49:49 - 49:51

limit the amount of information that
49:51 - 49:52

we're displaying on the screen. This is
49:52 - 49:54

where the head or tail commands come in.
49:54 - 49:55

The head command, as you can probably
49:55 - 49:57

guess, is going to provide you with a
49:57 - 50:00

listing of the first 10 lines in this
50:00 - 50:04

log file. Similarly, the tail command is
50:04 - 50:06

going to provide you a listing of the
50:06 - 50:08

last 10 lines in a log file. The tail
50:08 - 50:10

command is very useful for log files
50:10 - 50:12

that get appended frequently. So if you just
50:12 - 50:15

want to see the last modifications
50:15 - 50:18

made in this file, use the tail
50:18 - 50:19

command. Of course, the number of lines is
50:19 - 50:21

configurable. We're not going to go into
50:21 - 50:24

all these parameters right now. If you're
50:24 - 50:26

interested in finding out more about any
50:26 - 50:28

Linux command, any Linux utility, just use
50:28 - 50:31

the man pages, man tail,
50:31 - 50:33

and it's going to provide you with the
50:33 - 50:35

manual pages that are going to tell you
50:35 - 50:38

what are all the possible configuration
50:38 - 50:40

flags or settings that can be added to
50:40 - 50:42

this command. Here's the dash n, for
50:42 - 50:44

example, number of lines. Output the
50:44 - 50:46

last number of lines, you can add it as a
50:46 - 50:49

minus n parameter or dash dash lines
50:49 - 50:51

equals how many lines you want to
50:51 - 50:52

display on the screen. Quit with the
50:52 - 50:56

letter Q. Now, the grep utility is a
50:56 - 50:58

regular expression evaluator, which can
50:58 - 51:01

be, of course, used to run some complex
51:01 - 51:03

regular expressions, which are going to
51:03 - 51:05

help you tremendously dig through a lot
51:05 - 51:07

of information and extract what is actually
51:07 - 51:10

useful to you. But you can also do some
51:10 - 51:12

very simple string matching using grep.
51:12 - 51:15

For example, if we are displaying the
51:15 - 51:19

dpkg log here and piping this to the
51:19 - 51:21

grep command and searching for, let's
51:21 - 51:23

say, installation of a specific package,
51:23 - 51:27

such as, let me see, ansible, right? I did
51:27 - 51:28

use this machine for ansible in the past.
51:28 - 51:31

So there you go. These are all the log
51:31 - 51:33

entries in here generated by the ansible
51:33 - 51:36

package. Notice that we've been through a
51:36 - 51:37

number of ansible versions in here.
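
A minimal sketch of this kind of search, assuming a stand-in log file: the `sample_log` variable and the log entries are invented for illustration and only reuse the Ansible version numbers mentioned in the video. It shows a plain string match and then a slightly stricter regular expression that keeps only install and upgrade actions.

```shell
#!/bin/sh
# Stand-in for dpkg.log; the entries are made up for illustration.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2021-03-01 10:00:01 install ansible:all <none> 2.8.1
2021-03-01 10:00:02 status installed ansible:all 2.8.1
2021-09-14 08:30:10 upgrade ansible:all 2.8.1 2.9.19
2021-09-14 08:30:11 status installed ansible:all 2.9.19
2022-01-20 17:45:00 install curl:amd64 <none> 7.68.0
2022-02-02 09:12:40 upgrade ansible:all 2.9.19 2.9.27
EOF

# Simple string match: show every line that mentions ansible.
cat "$sample_log" | grep ansible

# -c prints the number of matching lines instead of the lines themselves.
ansible_lines=$(grep -c ansible "$sample_log")

# -E enables extended regular expressions: keep only the lines where
# ansible was installed or upgraded, skipping the "status" lines.
grep -E ' (install|upgrade) ansible:' "$sample_log"
change_lines=$(grep -cE ' (install|upgrade) ansible:' "$sample_log")

echo "lines mentioning ansible: $ansible_lines"
echo "install/upgrade actions:  $change_lines"

# Remove the temporary file.
rm -f "$sample_log"
```

Reading the install and upgrade lines in order gives the version history of the package on the machine.
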
51:37 - 51:41

Starting from version 2.8.1, we went
51:41 - 51:45

through 2.9.19, 2.9.27, and so on. We can
51:45 - 51:47

even see the evolution of this package
51:47 - 51:49

on this machine. Now, this is just a very,
51:49 - 51:52

very simple example here. I just wanted
51:52 - 51:54

to let you know that you do have a lot
51:54 - 51:56

of utilities available at your disposal
51:56 - 51:59

for manual log searching if you don't
51:59 - 52:02

have a SIEM solution available, all right?
52:02 - 52:03

Now, there's a lot more to say about
52:03 - 52:06

this, but since this is not a Linux
52:06 - 52:07

training, we're gonna stop right here.
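
To recap the head and tail usage from this walkthrough, here is a small sketch. The `sample_log` file and its "entry N" lines are hypothetical placeholders, not real log data.

```shell
#!/bin/sh
# Create a 25-line placeholder log; the contents are invented for illustration.
sample_log=$(mktemp)
i=1
while [ "$i" -le 25 ]; do
    echo "entry $i"
    i=$((i + 1))
done > "$sample_log"

# With no options, head prints the first 10 lines and tail the last 10.
head "$sample_log"
tail "$sample_log"
default_head_count=$(head "$sample_log" | wc -l)

# -n sets the number of lines; GNU tail also accepts the long form --lines=3.
tail -n 3 "$sample_log"

first_line=$(head -n 1 "$sample_log")
last_line=$(tail -n 1 "$sample_log")
echo "first: $first_line, last: $last_line"

# Remove the temporary file.
rm -f "$sample_log"
```

For a log that is still being written to, `tail -f` keeps the file open and prints new lines as they are appended. It is left out of the sketch because it runs until interrupted.
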
52:07 - 52:09

Alright everyone, thanks so much for
52:09 - 52:10

watching. I know there's been a lot of
52:10 - 52:12

information in this video, but I hope you
52:12 - 52:15

found this useful and informative, and I
52:15 - 52:17

hope to see you on the next video as
52:17 - 52:18

well. Don't forget to leave a comment if
52:18 - 52:20

you like this. Support the channel if you
52:20 - 52:22

can, if you wish, if you find this useful
52:22 - 52:24

in your studies, and see you in the next
52:24 - 52:26

video. Bye, bye.
52:26 - 52:30

[Music]
52:30 - 52:40

[Music]

Title:: CompTIA Security+ Full Course: Security Network Monitoring & SIEMs
Video Language:: English
Duration:: 52:39

	OEVIDEOS edited English subtitles for CompTIA Security+ Full Course: Security Network Monitoring & SIEMs
