
CompTIA Security+ Full Course: Security Network Monitoring & SIEMs

  • 0:00 - 0:17
    [Music]
  • 0:17 - 0:20
Now, network monitoring has been around for a long
  • 0:20 - 0:22
    time, probably ever since the first
  • 0:22 - 0:24
    networks were invented. Just like with
  • 0:24 - 0:27
    any system, just like with any electronic
  • 0:27 - 0:31
    device, we tend to want to be able to
  • 0:31 - 0:33
    monitor if everything is going okay. We
  • 0:33 - 0:35
    want to receive warnings, we want to be
  • 0:35 - 0:37
alerted when something goes wrong,
  • 0:37 - 0:39
    when something fails. And this type of
  • 0:39 - 0:41
    monitoring is tremendously useful
  • 0:41 - 0:44
    especially in larger networks. Over time,
  • 0:44 - 0:47
    this monitoring has extended to security
  • 0:47 - 0:48
    monitoring as well. So we're not just
  • 0:48 - 0:51
concerned about how the network is doing,
  • 0:51 - 0:53
if it's working well, if we don't have
  • 0:53 - 0:55
    any failed devices, but we're also
  • 0:55 - 0:57
    starting to look at the network traffic.
  • 0:57 - 0:59
    How is the network utilized, who uses it,
  • 0:59 - 1:02
    who attempts to access it, what type of
  • 1:02 - 1:04
    traffic are they generating? And if we
  • 1:04 - 1:06
    try to gather all this type of
  • 1:06 - 1:08
    information, we try to make sense of it,
  • 1:08 - 1:10
    we try to correlate it, with a smart
  • 1:10 - 1:13
    enough device, we might be able to detect
  • 1:13 - 1:17
    attempts at intrusion or attacks that
  • 1:17 - 1:19
    are about to happen or that have
  • 1:19 - 1:21
happened in the past, or proof that
  • 1:21 - 1:23
    we've been compromised or somebody in
  • 1:23 - 1:25
    the network has been infected. And all
  • 1:25 - 1:27
    that information is in there if you know
  • 1:27 - 1:29
where to look, and also if you have
  • 1:29 - 1:31
    the right tools to look for it.
  • 1:31 - 1:34
    In general, the term intrusion detection
  • 1:34 - 1:36
    refers to a system that is able to
  • 1:36 - 1:39
    monitor whatever can be observed in a
  • 1:39 - 1:42
    network, and in most cases, we're talking
  • 1:42 - 1:44
    about two things that can be observed.
  • 1:44 - 1:46
    First of all, we have Network traffic,
  • 1:46 - 1:49
    and then we have application events or
  • 1:49 - 1:51
    logs generated by the operating systems
  • 1:51 - 1:53
    by the applications running on those
  • 1:53 - 1:56
    OS's and so on. So coming back here to
  • 1:56 - 1:58
    our network focus, we've talked about
  • 1:58 - 2:00
intrusion detection at the network level.
  • 2:00 - 2:02
    We're going to call this one a network
  • 2:02 - 2:06
    based intrusion detection system or NIDS.
  • 2:06 - 2:08
    And we have many commercial solutions as
  • 2:08 - 2:10
    well as open source ones that are able
  • 2:10 - 2:12
    to perform this type of network-based
  • 2:12 - 2:13
    intrusion detection. Of course, all the
  • 2:13 - 2:15
    major security vendors are doing it. In
  • 2:15 - 2:18
    many examples, you're going to see the
  • 2:18 - 2:22
    IDS functionality built into the
  • 2:22 - 2:24
    functionality of a larger firewall or a
  • 2:24 - 2:26
    larger UTM device, especially for major
  • 2:26 - 2:28
    vendors out there. But you also have
  • 2:28 - 2:31
    solutions in the open source area
  • 2:31 - 2:34
such as Snort, Suricata, or Zeek (formerly Bro).
  • 2:34 - 2:37
    They're all available. And some of them also
  • 2:37 - 2:38
    have commercial versions as well, but
  • 2:38 - 2:39
they also provide you with free
  • 2:39 - 2:42
    versions that you can freely install and
  • 2:42 - 2:44
    try and run in your own environment. Now
  • 2:44 - 2:47
    the way these intrusion detection
  • 2:47 - 2:51
    systems work by definition is that they
  • 2:51 - 2:53
    rely on a database of signatures. And
  • 2:53 - 2:55
    those signatures are basically just a
  • 2:55 - 2:57
way to describe what a specific traffic
  • 2:57 - 2:59
    pattern is supposed to look like in
  • 2:59 - 3:01
    order to detect a specific type of
  • 3:01 - 3:04
    attack or attempt at an intrusion. So we
  • 3:04 - 3:05
    might be looking at a sequence of
  • 3:05 - 3:07
packets that looks a certain way.
  • 3:07 - 3:10
    We might be looking at a specific type
  • 3:10 - 3:12
    of packet that doesn't play by the
  • 3:12 - 3:14
    normal protocol rules that it belongs to.
  • 3:14 - 3:17
    Or a specific type of payload or simply
  • 3:17 - 3:20
    just a signature, a byte sequence that
  • 3:20 - 3:22
    can be found in the packet
  • 3:22 - 3:24
    payload that indicates the fact that the
  • 3:24 - 3:27
    payload is malicious. And this behavior
  • 3:27 - 3:29
    is very similar to what you're seeing in
  • 3:29 - 3:31
    antivirus scanning or anti-malware
  • 3:31 - 3:33
    scanning. We're simply looking for a
  • 3:33 - 3:36
    sequence of bytes that indicates that-
  • 3:36 - 3:38
    well, if we find the sequence of bytes in
  • 3:38 - 3:40
    a specific executable file, it means that
  • 3:40 - 3:43
    the file is infected with that specific
  • 3:43 - 3:45
    virus that the sequence belongs to.
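
    As a rough illustration of this kind of byte-sequence matching, here is a
    minimal Python sketch using the scapy library; the capture file name and
    the "signature" bytes are purely hypothetical:

        # Flag any TCP payload containing a known-bad byte sequence.
        from scapy.all import rdpcap, IP, TCP, Raw

        SIGNATURE = b"\x90\x90\x90\x90/bin/sh"   # hypothetical pattern, for illustration only

        for pkt in rdpcap("capture.pcap"):       # hypothetical capture file
            if pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw):
                if SIGNATURE in bytes(pkt[Raw].load):
                    print(f"signature hit: {pkt[IP].src} -> {pkt[IP].dst}:{pkt[TCP].dport}")

    Real IDS engines such as Snort or Suricata express the same idea as rules
    written in their own signature language rather than ad hoc scripts.
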
  • 3:45 - 3:48
    Now, in intrusion detection, again, we
  • 3:48 - 3:50
    we're kind of doing the same thing, right?
  • 3:50 - 3:52
    We're looking for patterns, but we're not
  • 3:52 - 3:53
    just scanning individual packets.
  • 3:53 - 3:56
    Sometimes we need to collect more
  • 3:56 - 3:58
    packets in a sequence in order to
  • 3:58 - 4:01
    determine if the behavior of the client
  • 4:01 - 4:04
    that is generating those packets is
  • 4:04 - 4:06
    abnormal, and if it's abnormal, does
  • 4:06 - 4:09
    it indicate an attack pattern or not? So
  • 4:09 - 4:11
    long story short, intrusion detection
  • 4:11 - 4:13
    systems are strongly dependent on a
  • 4:13 - 4:16
    database of signatures. Now, more advanced
  • 4:16 - 4:17
intrusion detection systems could also
  • 4:17 - 4:20
    correlate this network information with
  • 4:20 - 4:21
    log information. So we're seeing
  • 4:21 - 4:23
    something fishy in the network by
  • 4:23 - 4:25
    looking at the network traffic, let's
  • 4:25 - 4:27
    check the application logs that the
  • 4:27 - 4:30
    traffic is going towards, for example.
  • 4:30 - 4:31
    Let's see how that application reacts
  • 4:31 - 4:34
    and if we can see some abnormal
  • 4:34 - 4:37
    logs being generated by the app as well.
  • 4:37 - 4:39
    Now, correlating that information, the
  • 4:39 - 4:41
    traffic and the logs, might tell us more about
  • 4:41 - 4:44
    the actual attack or might increase the
  • 4:44 - 4:46
    confidence of the fact that we really
  • 4:46 - 4:49
    have identified a valid attack signature.
  • 4:49 - 4:51
    Not all solutions are able to do this of
  • 4:51 - 4:54
    course. Also, a very important distinction
  • 4:54 - 4:57
for intrusion detection systems, with an
  • 4:57 - 4:59
emphasis on detection, is the fact that
  • 4:59 - 5:02
    these systems are never able to block
  • 5:02 - 5:04
    the malicious traffic once they identify
  • 5:04 - 5:06
    it. It's just like the name says, it's just
  • 5:06 - 5:09
    detection, it's not prevention, all right?
  • 5:09 - 5:11
    So we're not stopping the traffic. We
  • 5:11 - 5:13
    might be able to see an attack signature.
  • 5:13 - 5:15
    We might be able to raise some alerts,
  • 5:15 - 5:17
    generate some syslogs, but we're not
  • 5:17 - 5:19
    going to be able to block that specific
  • 5:19 - 5:21
    type of traffic. One positive side for
  • 5:21 - 5:23
    this is that well, if the device is not
  • 5:23 - 5:26
    inside of the traffic path, then the
  • 5:26 - 5:28
    attacker might not even be able to
  • 5:28 - 5:29
    detect it.
  • 5:29 - 5:31
    So most likely, the IDS is going to
  • 5:31 - 5:33
    work with a copy of the traffic just to
  • 5:33 - 5:35
    analyze it, but it's not going to be able
  • 5:35 - 5:37
    to stop the malicious traffic. And the
  • 5:37 - 5:39
    attacker is not going to be able to
  • 5:39 - 5:41
    detect the IDS device and might not even
  • 5:41 - 5:44
    be able to compromise it if they
  • 5:44 - 5:47
    intend to. In most situations, the IDS
  • 5:47 - 5:48
    device doesn't even have a valid IP
  • 5:48 - 5:50
address within the network that it's
  • 5:50 - 5:52
    monitoring, so it cannot be addressed, it
  • 5:52 - 5:55
    cannot be compromised by communicating
  • 5:55 - 5:57
    with it directly. Alright, so since we
  • 5:57 - 5:58
    mentioned the fact that an IDS works
  • 5:58 - 6:00
    with just a copy of the traffic, let's
  • 6:00 - 6:02
see how we can generate that copy of
  • 6:02 - 6:04
    traffic, right? They're not within the
  • 6:04 - 6:06
    traffic path, so we need to make a copy
  • 6:06 - 6:08
of the traffic and just send it on a
  • 6:08 - 6:09
separate channel
  • 6:09 - 6:12
to the IDS device for analysis. Now, one
  • 6:12 - 6:14
way of doing this is by enabling port
  • 6:14 - 6:18
    mirroring or SPAN. In Cisco speak, this
  • 6:18 - 6:19
is Switched Port Analyzer, a
  • 6:19 - 6:21
    functionality on layer 2 or layer 3
  • 6:21 - 6:24
switches that allows us to configure the
  • 6:24 - 6:26
    switch, and we're basically telling it
  • 6:26 - 6:28
    well, whatever traffic you're seeing on
  • 6:28 - 6:32
ports, let's say, one, two, and three, make a
  • 6:32 - 6:35
    copy of that traffic and forward it out
  • 6:35 - 6:37
    of port number eight. And of course, we're
  • 6:37 - 6:38
    assuming that on port number eight,
  • 6:38 - 6:40
    there's an IDS device connected right
  • 6:40 - 6:42
    there. So we're basically telling the
  • 6:42 - 6:43
    switch to make a copy of all the
  • 6:43 - 6:45
    interesting traffic and send it towards
  • 6:45 - 6:47
    the IDS. And of course, you might be
  • 6:47 - 6:50
    thinking here well, what if the switch is
  • 6:50 - 6:52
    overloaded, what if there's more traffic
  • 6:52 - 6:54
    generated on those ports than the
  • 6:54 - 6:57
mirror port can actually support? Well,
  • 6:57 - 6:59
    that's true, it might happen. So in
  • 6:59 - 7:01
    cases when the switch is overloaded and
  • 7:01 - 7:02
    there's too much traffic in the network,
  • 7:02 - 7:05
    packets might be dropped, and also frames
  • 7:05 - 7:08
with errors might not be forwarded
  • 7:08 - 7:10
    to the mirrored port either. So we
  • 7:10 - 7:12
    might not be able to see 100% of all the
  • 7:12 - 7:15
    traffic, but in most cases, it's going to
  • 7:15 - 7:16
    be enough. And it's also one of the
  • 7:16 - 7:18
    features that basically doesn't require
  • 7:18 - 7:19
    you to install anything else in the
  • 7:19 - 7:21
    network, it's just a functionality, just a
  • 7:21 - 7:23
configuration effort, just a couple of
  • 7:23 - 7:25
commands on a switch.
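
    As a sketch of what that configuration might look like when pushed from
    Python with the Netmiko library (the management address, credentials, and
    interface names below are hypothetical), mirroring ports 1-3 to port 8
    where the IDS listens:

        from netmiko import ConnectHandler

        switch = ConnectHandler(
            device_type="cisco_ios",
            host="192.0.2.10",        # hypothetical management address
            username="admin",
            password="changeme",
        )
        span_config = [
            "monitor session 1 source interface GigabitEthernet1/0/1 - 3",
            "monitor session 1 destination interface GigabitEthernet1/0/8",
        ]
        print(switch.send_config_set(span_config))
        switch.disconnect()

    The two "monitor session" lines are the Cisco IOS local SPAN commands; on
    other vendors the syntax differs, but the idea is the same.
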
  • 7:25 - 7:28
Another method for duplicating traffic is by using a
  • 7:28 - 7:30
passive or an active device. It's basically
  • 7:30 - 7:32
    a layer 1 device called a TAP, a test
  • 7:32 - 7:35
access port. It's essentially a
  • 7:35 - 7:37
kind of T-connector where the
  • 7:37 - 7:39
    main cable goes from one end to the next,
  • 7:39 - 7:41
    and there's a third cable that actually
  • 7:41 - 7:43
    receives a copy of the entire traffic
  • 7:43 - 7:46
    going through that segment of cable.
  • 7:46 - 7:49
    The device is not a smart one, so it's
  • 7:49 - 7:51
not like a switch. It's not going to
  • 7:51 - 7:54
look at frame destination addresses and
  • 7:54 - 7:56
forward entire frames. It's simply
  • 7:56 - 7:58
    going to duplicate the electrical or the
  • 7:58 - 8:00
    optical signals that it sees on the wire,
  • 8:00 - 8:02
    and it's going to make a complete and
  • 8:02 - 8:04
    identical copy of those signals onto the
  • 8:04 - 8:08
    third connection which, of course, is
  • 8:08 - 8:10
ideally connected to the IDS device.
  • 8:10 - 8:12
    Now, this type of approach is, again,
  • 8:12 - 8:13
    completely undetectable.
  • 8:13 - 8:16
SPAN is not detectable either, right? And
  • 8:16 - 8:21
    it also copies entire frames regardless if
  • 8:21 - 8:24
    those frames contain errors or not. As we
  • 8:24 - 8:25
said, with port mirroring the
  • 8:25 - 8:27
    frames need to be correct in order to be
  • 8:27 - 8:30
copied; with a TAP, the TAP doesn't
  • 8:30 - 8:32
    care. It's basically just a signal
  • 8:32 - 8:34
    repeater, and we can do this for both
  • 8:34 - 8:37
copper cables, so electrical signals,
  • 8:37 - 8:39
as well as fiber optics, so optical
  • 8:39 - 8:42
    signals. The TAP will not care, it will just
  • 8:42 - 8:44
    blindly copy all the signals that it
  • 8:44 - 8:46
    receives. And finally the third method
  • 8:46 - 8:49
    for monitoring traffic is by having the
  • 8:49 - 8:52
    IDS device in the traffic path
  • 8:52 - 8:55
    but acting as a transparent device. Again,
  • 8:55 - 8:57
without an IP address, it's basically
  • 8:57 - 9:00
    becoming a layer 2 device that is part
  • 9:00 - 9:02
of the same VLAN that it's
  • 9:02 - 9:05
bridging, but it cannot be addressed on
  • 9:05 - 9:07
the network, it cannot be detected on
  • 9:07 - 9:09
the network, and if it's a true IDS
  • 9:09 - 9:10
    device, then it's not going to be able to
  • 9:10 - 9:13
    block the actual traffic that goes
  • 9:13 - 9:15
    through it. Now, having the device placed
  • 9:15 - 9:18
    inside of the traffic path opens us to
  • 9:18 - 9:20
    the possibility of actually blocking the
  • 9:20 - 9:22
    traffic, and that's going to be a
  • 9:22 - 9:23
    different type of solution called
  • 9:23 - 9:24
an intrusion prevention system. And we'll
  • 9:24 - 9:26
    get there in just a moment. There's one
  • 9:26 - 9:28
    more type of intrusion detection device
  • 9:28 - 9:30
    or solution and that is a software
  • 9:30 - 9:32
    solution that can be installed directly
  • 9:32 - 9:34
    on the workstations. So I'm not talking
  • 9:34 - 9:35
    about a box that listens to network
  • 9:35 - 9:37
    traffic on an entire segment, but we're
  • 9:37 - 9:39
    talking here about a software solution,
  • 9:39 - 9:41
    basically a program that runs on your
  • 9:41 - 9:43
    endpoint machine, on your host machine, be
  • 9:43 - 9:46
    it a laptop or a desktop. Now, this one is
  • 9:46 - 9:48
called host-based intrusion detection
  • 9:48 - 9:51
    because it runs on the host, and it does
  • 9:51 - 9:52
    have pretty much the same benefit
  • 9:52 - 9:55
    or the same abilities as a network-based
  • 9:55 - 9:57
intrusion detection, so it's able to look
  • 9:57 - 9:58
    at the network traffic going in and out
  • 9:58 - 10:00
    of your network interface. It's able to
  • 10:00 - 10:03
    look at the logs generated by the
  • 10:03 - 10:05
    applications on your system, but since
  • 10:05 - 10:07
    they are running as an application on
  • 10:07 - 10:09
    your system, they can become even smarter
  • 10:09 - 10:12
    because they might have access now to
  • 10:12 - 10:14
    the actual process table. They might be
  • 10:14 - 10:16
looking at the kernel, they might be able
  • 10:16 - 10:18
    to look at the memory to see what
  • 10:18 - 10:20
    processes are running, when did they
  • 10:20 - 10:23
    execute, who executed them, with what
  • 10:23 - 10:26
    privileges, and they can also openly look
  • 10:26 - 10:30
    at encrypted traffic. So if you are
  • 10:30 - 10:33
    communicating over SSL with a website,
  • 10:33 - 10:35
well, a network-based intrusion detection
  • 10:35 - 10:37
    might not be able to understand anything
  • 10:37 - 10:39
    that's going back and forth because it's
  • 10:39 - 10:41
    encrypted, but your host-based intrusion
  • 10:41 - 10:43
    detection
  • 10:43 - 10:45
    is located at the end of that encrypted
  • 10:45 - 10:48
    tunnel, so it is able to see that
  • 10:48 - 10:50
    unencrypted traffic before it even
  • 10:50 - 10:52
    enters the encrypted tunnel and right
  • 10:52 - 10:54
    after it leaves the encrypted tunnel. So
  • 10:54 - 10:57
    it's able to actually watch the entire
  • 10:57 - 11:00
    traffic flow in an unencrypted form. And
  • 11:00 - 11:03
    again, since we have pretty much full
  • 11:03 - 11:05
    permissions on the monitored host in
  • 11:05 - 11:08
    order to be able to properly monitor the,
  • 11:08 - 11:09
    you know, the process table and the
  • 11:09 - 11:11
    network connections and the network
  • 11:11 - 11:14
    traffic, we could also have a look at the
  • 11:14 - 11:16
    files on the disk.
  • 11:16 - 11:18
    Why would you do that? Well that's
  • 11:18 - 11:20
    because monitoring the integrity of the
  • 11:20 - 11:22
    files on the disk, especially the
  • 11:22 - 11:25
    integrity of the operating system files,
  • 11:25 - 11:27
    and being able to detect when that
  • 11:27 - 11:31
    integrity fails, when a system file is
  • 11:31 - 11:33
    being replaced with a malicious one, when
  • 11:33 - 11:36
    a system file is becoming encrypted
  • 11:36 - 11:38
    or it is replaced with a completely
  • 11:38 - 11:39
    different version, that might be an
  • 11:39 - 11:42
    indication of compromise, that might be
  • 11:42 - 11:43
    an indication of the fact that you have
  • 11:43 - 11:47
    been infected with malware. So solutions
  • 11:47 - 11:49
    or functionality additional to host
  • 11:49 - 11:51
based intrusion detection that monitor
  • 11:51 - 11:53
    files on your system, especially
  • 11:53 - 11:55
    operating system files, these are called
  • 11:55 - 11:58
file integrity monitoring solutions.
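
    A minimal sketch of that file-integrity idea, using only the Python
    standard library (the watched paths and baseline file name are just
    examples): hash the files once to build a baseline, then re-hash later
    and report anything that changed.

        import hashlib, json, os

        WATCHED = ["/etc/passwd", "/etc/ssh/sshd_config"]   # example files to monitor
        BASELINE_FILE = "fim_baseline.json"

        def sha256(path):
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            return h.hexdigest()

        if not os.path.exists(BASELINE_FILE):
            # first run: record the known-good hashes
            with open(BASELINE_FILE, "w") as f:
                json.dump({p: sha256(p) for p in WATCHED}, f)
        else:
            # later runs: compare against the recorded baseline
            with open(BASELINE_FILE) as f:
                baseline = json.load(f)
            for path, known in baseline.items():
                if not os.path.exists(path) or sha256(path) != known:
                    print(f"integrity alert: {path} is missing or has changed")
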
  • 11:58 - 11:59
And remember that we said that when we place
  • 11:59 - 12:01
    the intrusion detection device in the
  • 12:01 - 12:03
    traffic path, that
  • 12:03 - 12:05
    device actually becomes able to also
  • 12:05 - 12:08
    block the traffic that goes through it
  • 12:08 - 12:09
    which can make it an intrusion
  • 12:09 - 12:11
    prevention system, right? So detection
  • 12:11 - 12:13
just alerts, it just generates alerts or
  • 12:13 - 12:16
    events. Intrusion prevention is about
  • 12:16 - 12:19
    actually taking action or
  • 12:19 - 12:23
    acting upon the detected intrusion. So
  • 12:23 - 12:26
    what can such a device actually do
  • 12:26 - 12:27
    whenever they're seeing
  • 12:27 - 12:29
    something fishy going on inside of a
  • 12:29 - 12:31
    network? Well, they could do something as
  • 12:31 - 12:33
simple as sending a TCP reset
  • 12:33 - 12:36
    packet to the originator of the
  • 12:36 - 12:38
    malicious connection. They could also
  • 12:38 - 12:40
    have some more advanced functionality
  • 12:40 - 12:42
    especially if it's the same
  • 12:42 - 12:43
    device that acts as a firewall. They
  • 12:43 - 12:46
    might be dynamically able to generate a
  • 12:46 - 12:49
    firewall rule to block similar traffic
  • 12:49 - 12:51
    like the one that was just detected as
  • 12:51 - 12:54
    being part of an attempt for an
  • 12:54 - 12:57
attack or a compromise. If we're detecting
  • 12:57 - 12:59
something
  • 12:59 - 13:01
    that looks like a denial of service
  • 13:01 - 13:03
attack, we could choose to limit the
  • 13:03 - 13:06
    amount of bandwidth that is allocated to
  • 13:06 - 13:08
    that specific type of traffic. Kind of
  • 13:08 - 13:10
    like policing that we're doing in
  • 13:10 - 13:12
    well, quality of service. In any case,
  • 13:12 - 13:15
    any type of action that the IPS device
  • 13:15 - 13:17
    can take against the malicious traffic,
  • 13:17 - 13:19
    we're going to call it active response.
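
    As a toy sketch of the simplest active response mentioned above, this is
    roughly what forging a TCP reset with scapy looks like; the addresses,
    ports, and sequence number are hypothetical, and a real IPS would take
    them from the live connection it is tracking:

        from scapy.all import IP, TCP, send

        rst = IP(src="203.0.113.5", dst="198.51.100.7") / TCP(
            sport=443, dport=51515, flags="R", seq=1000)
        send(rst, verbose=False)   # needs root privileges to put the packet on the wire
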
  • 13:19 - 13:21
    And depending on how complex the device
  • 13:21 - 13:23
    is and how powerful the device is, you
  • 13:23 - 13:25
    might actually choose to look not just
  • 13:25 - 13:29
    at simple IPS or IDS signatures, but also
  • 13:29 - 13:31
    look for malware signatures. Yeah, that's
  • 13:31 - 13:33
    that's going to require you to, you know,
  • 13:33 - 13:35
to decrypt encrypted traffic. It's going
  • 13:35 - 13:38
    to require you to identify potential
  • 13:38 - 13:40
    protocols that might be carrying files,
  • 13:40 - 13:42
    gather all those related packets that
  • 13:42 - 13:44
    belong to the same TCP stream to the
  • 13:44 - 13:46
    same flow, assemble them into an
  • 13:46 - 13:48
    executable file, store that in memory,
  • 13:48 - 13:51
    attempt to scan it with an antivirus
  • 13:51 - 13:53
    engine, and then determine if that flow
  • 13:53 - 13:55
    was actually malicious or not. Now, this
  • 13:55 - 13:57
    requires a lot of processing power. This
  • 13:57 - 13:59
    is going to create some sort of delay in
  • 13:59 - 14:01
the network for the users. They're going
  • 14:01 - 14:04
    to see their download unable to
  • 14:04 - 14:06
    finish or the application responding
  • 14:06 - 14:10
    slowly until the firewall, the UTM device,
  • 14:10 - 14:12
    or the intrusion prevention system is
  • 14:12 - 14:14
    actually able to scan those files
  • 14:14 - 14:16
    against malware signatures. On a lighter
  • 14:16 - 14:18
    approach, we could also just be looking
  • 14:18 - 14:21
    at URLs, looking for malicious domains or
  • 14:21 - 14:22
domains that are associated with
  • 14:22 - 14:24
    malware or with the command and control
  • 14:24 - 14:27
    servers. We might be looking at URLs in
  • 14:27 - 14:30
    order to categorize those URLs and
  • 14:30 - 14:32
    figure out the reputation of that URL
  • 14:32 - 14:34
    and decide whether we want the
  • 14:34 - 14:35
    communication to that specific website
  • 14:35 - 14:37
    to proceed or not. So regardless if the
  • 14:37 - 14:40
    device is an IPS or an IDS, the detection
  • 14:40 - 14:42
    methods are pretty much the same. Now, the
  • 14:42 - 14:45
    difference is just in what the device is
  • 14:45 - 14:47
    actually doing. Is it only alerting or is
  • 14:47 - 14:50
    it actually taking an active response
  • 14:50 - 14:51
    approach to the traffic? But the
  • 14:51 - 14:53
    detection part is pretty much the same,
  • 14:53 - 14:55
    right? And when talking about detection,
  • 14:55 - 14:58
    we are going to start with the basic
  • 14:58 - 15:00
type of detection, which is where we're
  • 15:00 - 15:01
    just looking for signatures in the
  • 15:01 - 15:04
    database, which, of course, means that we
  • 15:04 - 15:08
    need to have an up-to-date database for
  • 15:08 - 15:09
    the device to be able to detect the
  • 15:09 - 15:11
    latest and the greatest attack. Now, this
  • 15:11 - 15:13
    is basically one of the reasons why
  • 15:13 - 15:14
    people choose to pay for commercial
  • 15:14 - 15:17
    solutions because databases maintained
  • 15:17 - 15:20
    by a dedicated software or security
  • 15:20 - 15:22
    vendor that deals with intrusion
  • 15:22 - 15:25
    prevention, those databases are going to
  • 15:25 - 15:29
    be much more often updated and kept up
  • 15:29 - 15:31
    to date in order to mirror as best as
  • 15:31 - 15:33
    possible the database of all the known
  • 15:33 - 15:35
    attack patterns ever detected in the
  • 15:35 - 15:37
    world. Now with open source solutions,
  • 15:37 - 15:39
    you're still going to have
  • 15:39 - 15:42
    a pretty good level of protection, but
  • 15:42 - 15:44
    you might not be able to detect an
  • 15:44 - 15:46
    attack that was just identified
  • 15:46 - 15:48
six hours ago. Nevertheless, and
  • 15:48 - 15:51
regardless of how up-to-date your database
  • 15:51 - 15:54
    is, you're still limited by the attack
  • 15:54 - 15:56
    patterns listed in that database. If an
  • 15:56 - 15:59
    attack emerges and doesn't match
  • 15:59 - 16:00
    anything in your database, it's still
  • 16:00 - 16:02
    going to go through,
  • 16:02 - 16:04
    which leads us to a different approach,
  • 16:04 - 16:06
    and that is a behavioral approach. So
  • 16:06 - 16:09
    instead of looking at specific streams
  • 16:09 - 16:11
    of bytes, specific headers, specific
  • 16:11 - 16:14
    sequences of packets, let's look at the
  • 16:14 - 16:17
    overall behavior of an application or of
  • 16:17 - 16:19
    a protocol.
  • 16:19 - 16:21
Does it look like it's doing what it's
  • 16:21 - 16:24
    supposed to do? Is it generating more
  • 16:24 - 16:27
    packets than we're used to seeing? Is it
  • 16:27 - 16:29
    generating more traffic? Is it
  • 16:29 - 16:32
    generating an abnormal amount of control
  • 16:32 - 16:35
information as opposed to real
  • 16:35 - 16:38
transferred data? And we call this
  • 16:38 - 16:40
    behavioral monitoring. Now, in order for
  • 16:40 - 16:41
    behavioral monitoring to work, we need to
  • 16:41 - 16:44
    have something to compare that behavior
  • 16:44 - 16:47
    to, and say well, if it goes outside of
  • 16:47 - 16:49
    the known ranges,
  • 16:49 - 16:51
    then it looks like something's fishy.
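
    As a rough sketch of that "known range" check, assuming we already have a
    set of packets-per-minute counts learned during a quiet training period
    (the numbers below are invented):

        import statistics

        training = [1200, 1350, 1280, 1400, 1100, 1320, 1290]   # learned packets/minute
        mean = statistics.mean(training)
        stdev = statistics.stdev(training)

        def is_anomalous(current, threshold=3.0):
            # flag anything more than a few standard deviations from the mean
            return abs(current - mean) > threshold * stdev

        print(is_anomalous(1310))   # False: inside the learned range
        print(is_anomalous(9800))   # True: far outside it
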
  • 16:51 - 16:55
    Well that known range is supposed to be
  • 16:55 - 16:58
    your baseline. So such a device or such a
  • 16:58 - 17:00
    system is supposed to be trained first.
  • 17:00 - 17:02
    You're supposed to just leave it inside
  • 17:02 - 17:04
    of the network for let's say a week or
  • 17:04 - 17:08
two. Just let it figure out what
  • 17:08 - 17:10
a normal Monday morning looks like
  • 17:10 - 17:12
    in your network when everybody comes
  • 17:12 - 17:15
    into work and they start logging in and
  • 17:15 - 17:17
    start updating their
  • 17:17 - 17:18
    machines and perhaps even their mobile
  • 17:18 - 17:21
    phones on the company Wi-Fi. But
  • 17:21 - 17:23
nevertheless, you have to let that
  • 17:23 - 17:26
intrusion prevention solution learn what
  • 17:26 - 17:29
your normal traffic looks like when
  • 17:29 - 17:30
    people start accessing internal
  • 17:30 - 17:32
    applications, when people start
  • 17:32 - 17:34
    accessing internet destinations, when
  • 17:34 - 17:37
    people start communicating, sharing files
  • 17:37 - 17:40
    between each other, when backups
  • 17:40 - 17:43
    start to happen at midnight perhaps,
  • 17:43 - 17:46
    right? You have to let it learn so that
  • 17:46 - 17:49
    in a couple of weeks when something goes
  • 17:49 - 17:51
outside of the known range, when an
  • 17:51 - 17:54
application behaves in a way it did not
  • 17:54 - 17:57
    behave in the first training weeks, then
  • 17:57 - 17:59
    it's going to be able to raise an alarm
  • 17:59 - 18:02
    and perhaps indicate the fact that the
  • 18:02 - 18:05
    application has been compromised or that
  • 18:05 - 18:08
    somebody is using it in order to elevate
  • 18:08 - 18:10
    their privileges or just compromise your
  • 18:10 - 18:12
    network. And as you can probably guess,
  • 18:12 - 18:15
    this is one area where machine learning
  • 18:15 - 18:17
    is going to provide you a lot of benefit
  • 18:17 - 18:21
given that you take the time and effort
  • 18:21 - 18:24
    to educate, to teach the machine learning
  • 18:24 - 18:26
system what your normal baseline
  • 18:26 - 18:29
looks like. Now, of course, regardless of how
  • 18:29 - 18:32
    complex or how well-tuned your solution
  • 18:32 - 18:34
    is going to be, there will be false
  • 18:34 - 18:36
    positives and there will be false
  • 18:36 - 18:40
    negatives, which is why I always tell
  • 18:40 - 18:42
students there's an old saying that
  • 18:42 - 18:45
    I heard from someone in Cisco a long,
  • 18:45 - 18:48
    long time ago, and they said that IPS
  • 18:48 - 18:49
    without eyes
  • 18:49 - 18:53
    is useless. So IPS without human eyes is
  • 18:53 - 18:55
    useless. There's always going to be
  • 18:55 - 18:58
    the need to have a human being right
  • 18:58 - 19:01
    there evaluating and analyzing whether
  • 19:01 - 19:03
    the alerts generated by the intrusion
  • 19:03 - 19:05
    prevention or detection system are valid
  • 19:05 - 19:09
    or not. Does it need more fine-tuning or
  • 19:09 - 19:12
    do we need to raise an alarm? So what
  • 19:12 - 19:14
    devices can we actually find that
  • 19:14 - 19:16
    implement this type of advanced
  • 19:16 - 19:18
    functionality, be it detection or
  • 19:18 - 19:20
prevention? Well, unfortunately, this is
  • 19:20 - 19:23
    the place where we're slowly
  • 19:23 - 19:25
    stepping into the marketing area. That's
  • 19:25 - 19:27
    because the devices that we're going to
  • 19:27 - 19:29
    be listing here are not completely
  • 19:29 - 19:33
    different devices, but over time,
  • 19:33 - 19:35
    different naming conventions have
  • 19:35 - 19:37
    emerged, different marketing names have
  • 19:37 - 19:40
    been invented to make them sound cool, to
  • 19:40 - 19:42
    make them sound different from what the
  • 19:42 - 19:44
    other vendors were doing. So we're going
  • 19:44 - 19:46
    to start with the next generation
  • 19:46 - 19:48
firewall, and we have had this type of
  • 19:48 - 19:50
next generation firewall
  • 19:50 - 19:53
    for about 12 or 15 years already. I've
  • 19:53 - 19:56
    been hearing the next generation term in
  • 19:56 - 19:59
IT security for so long that
  • 19:59 - 20:01
    I'm starting to wonder
  • 20:01 - 20:05
    are we still next generation, are we- have
  • 20:05 - 20:07
    we skipped the generation? Are we now in
  • 20:07 - 20:10
    the next next generation or where does
  • 20:10 - 20:12
    it stop, where does it end, where
  • 20:12 - 20:14
    does the next generation begin, right? Now,
  • 20:14 - 20:15
    unfortunately marketing people don't
  • 20:15 - 20:17
    really ask themselves these questions. So
  • 20:17 - 20:19
    we're kind of stuck with this
  • 20:19 - 20:21
    terminology for now, and we're gonna keep
  • 20:21 - 20:23
calling it next generation until I
  • 20:23 - 20:27
    don't know when, but regardless, a next
  • 20:27 - 20:29
    generation firewall is basically just a
  • 20:29 - 20:31
    layer 7 firewall. That's an application
  • 20:31 - 20:33
    layer firewall which is able to look at
  • 20:33 - 20:36
    the application layer payload, so we're
  • 20:36 - 20:38
    actually seeing the data being sent,
  • 20:38 - 20:40
    we're not just looking at the packet
  • 20:40 - 20:44
    headers. And it also has some sort of
  • 20:44 - 20:46
    detection or prevention system built in,
  • 20:46 - 20:49
    okay? So we have an IPS or an IDS built
  • 20:49 - 20:51
    in, which leads us back to the discussion
  • 20:51 - 20:53
    that we had before. So we have an
  • 20:53 - 20:55
    application layer firewall which can be
  • 20:55 - 20:57
    enriched with additional functionality.
  • 20:57 - 20:59
    Now that we have access to the actual
  • 20:59 - 21:01
    application payload, well, why not
  • 21:01 - 21:04
    look for intrusion signatures, why not
  • 21:04 - 21:06
    look for malware signatures, why not look
  • 21:06 - 21:09
    for spam signatures, right? So depending
  • 21:09 - 21:12
    on how complex the device is, if it at
  • 21:12 - 21:14
    least has IPS functionality built in,
  • 21:14 - 21:15
    we're going to call it a next
  • 21:15 - 21:17
    generation firewall. And here's the funny
  • 21:17 - 21:20
    part, if the next generation firewall has
  • 21:20 - 21:23
    a bunch of other additional features on
  • 21:23 - 21:26
    top of the IPS functionality, such as
  • 21:26 - 21:29
    malware scanning, antivirus scanning,
  • 21:29 - 21:30
    perhaps looking at the files and being
  • 21:30 - 21:32
    able to implement some data loss
  • 21:32 - 21:36
prevention policies, if it's able to look
  • 21:36 - 21:39
    at the URLs and categorize them and
  • 21:39 - 21:41
    analyze the reputation of the web pages,
  • 21:41 - 21:43
    and pretty much everything that we could
  • 21:43 - 21:45
    possibly think of that we could be doing
  • 21:45 - 21:47
    just by looking at the application data,
  • 21:47 - 21:49
    then we're going to call this a unified
  • 21:49 - 21:51
    threat management device, a UTM device.
  • 21:51 - 21:54
    Again, I don't think I need to repeat
  • 21:54 - 21:57
    this, but the more complex the device
  • 21:57 - 22:00
    becomes, the more stuff it needs to do in
  • 22:00 - 22:02
order to decide whether to allow a
  • 22:02 - 22:04
    packet or not, the more resources, the
  • 22:04 - 22:07
    more CPU intensive it's going to be, the
  • 22:07 - 22:10
    more memory it's going to require, and the
  • 22:10 - 22:11
    more delay that is going to be introduced in the
  • 22:11 - 22:14
    network. So keep this in mind. Even though
  • 22:14 - 22:16
    it kind of sounds cool, right, to have all
  • 22:16 - 22:17
    that security functionality in a single
  • 22:17 - 22:19
    box,
  • 22:19 - 22:21
    which by the way, try to make sure
  • 22:21 - 22:23
    it's not a single box of failure, single
  • 22:23 - 22:25
    point of failure, all right? [Laughs]
  • 22:25 - 22:27
    Even though it sounds cool to have all
  • 22:27 - 22:29
    this functionality in one place,
  • 22:29 - 22:33
    it's going to hit your performance
  • 22:33 - 22:35
    pretty badly, right? So keep this in mind.
  • 22:35 - 22:38
    Don't just enable everything blindly
  • 22:38 - 22:41
    because the end users, the applications,
  • 22:41 - 22:43
    and well, God forbid your
  • 22:43 - 22:44
customers, your paying customers,
  • 22:44 - 22:47
    they're going to feel the effects of
  • 22:47 - 22:50
    your awesome UTM device, and
  • 22:50 - 22:52
    their application experience is going to
  • 22:52 - 22:54
    suffer. Now, a special type of network
  • 22:54 - 22:56
monitoring device can also be considered:
  • 22:56 - 22:58
    a web application firewall. We've briefly
  • 22:58 - 23:00
mentioned web application
  • 23:00 - 23:03
    firewalls in a previous video, and we
  • 23:03 - 23:06
    said that a WAF, a web application firewall,
  • 23:06 - 23:09
    is just a dedicated firewall that is
  • 23:09 - 23:12
    specifically trained and educated to
  • 23:12 - 23:16
    look at attack signatures aimed at web
  • 23:16 - 23:18
    applications. So we're looking for things
  • 23:18 - 23:20
    such as cross-site scripting, we're
  • 23:20 - 23:21
    looking for,
  • 23:21 - 23:23
    you know, directory traversals, we're
  • 23:23 - 23:26
    looking at SQL injection attacks. We're
  • 23:26 - 23:27
    looking at pretty much anything that
  • 23:27 - 23:31
could be performed by a malicious user
  • 23:31 - 23:35
that is trying to exploit an input
  • 23:35 - 23:39
    validation flaw in a web application. So
  • 23:39 - 23:41
    it's still an application layer firewall.
  • 23:41 - 23:43
    It still looks at the application
  • 23:43 - 23:46
    layer payload. It's just that it's a bit
  • 23:46 - 23:49
more, let's say, picky about what type of
  • 23:49 - 23:51
traffic it's going to analyze. It's
  • 23:51 - 23:53
    only going to look at web traffic, and
  • 23:53 - 23:56
    it's only going to look for web
  • 23:56 - 23:58
    attacks, web application attacks. It's
  • 23:58 - 24:00
    mostly going to rely on signatures.
  • 24:00 - 24:03
    That's because we cannot really do much
  • 24:03 - 24:07
    when it comes to requests coming in from
  • 24:07 - 24:09
    our clients. Behavioral analysis
  • 24:09 - 24:11
    doesn't really play well here because
  • 24:11 - 24:13
    most attacks, especially web application
  • 24:13 - 24:17
    attacks, are just one single request, one
  • 24:17 - 24:21
    single query with a malicious payload.
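
    A toy illustration of that signature-style checking on a single request
    parameter; the patterns are deliberately simplistic, while real WAF rule
    sets (ModSecurity's, for example) are far larger and more careful:

        import re

        BAD_PATTERNS = [
            re.compile(r"(?i)union\s+select"),     # classic SQL injection
            re.compile(r"(?i)or\s+1\s*=\s*1"),
            re.compile(r"(?i)<script\b"),          # cross-site scripting
            re.compile(r"\.\./"),                  # directory traversal
        ]

        def looks_malicious(value):
            return any(p.search(value) for p in BAD_PATTERNS)

        print(looks_malicious("id=42"))                          # False
        print(looks_malicious("id=42 UNION SELECT password"))    # True
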
  • 24:21 - 24:23
    So in many situations, it's going to be
  • 24:23 - 24:25
    either black or white, right? We're
  • 24:25 - 24:28
    detecting an attempt at an intrusion,
  • 24:28 - 24:29
    we're detecting an attack in that
  • 24:29 - 24:32
request or not. There's pretty much not
  • 24:32 - 24:34
    going to be much of a gray area with
  • 24:34 - 24:36
    web application firewalls. And you could
  • 24:36 - 24:39
    deploy a WAF as a separate device. It
  • 24:39 - 24:40
    could be a physical box, it could be a
  • 24:40 - 24:42
    virtual machine, it could be a
  • 24:42 - 24:45
    functionality within a UTM device, again,
  • 24:45 - 24:49
all-in-one wonders. But it can also be a
  • 24:49 - 24:52
    part of the web server itself. So we have
  • 24:52 - 24:55
    plugins that install alongside the
  • 24:55 - 24:57
    actual web server that is hosting the
  • 24:57 - 24:59
    web application, such as plugins for the
  • 24:59 - 25:02
    Apache web server, for the IIS web server,
  • 25:02 - 25:05
on Windows Server, or for Nginx. So
  • 25:05 - 25:07
    we're installing these plugins right
  • 25:07 - 25:10
    there, and their purpose is to scan the
  • 25:10 - 25:11
    traffic that's coming in from the
  • 25:11 - 25:14
    clients before allowing that request to
  • 25:14 - 25:16
    be processed by the web server. Having
  • 25:16 - 25:18
    something such as a plugin that runs
  • 25:18 - 25:20
    alongside the web server on the same
  • 25:20 - 25:23
    machine, on the same box, opens us to the
  • 25:23 - 25:26
risk of having that machine
  • 25:26 - 25:29
    compromised by an attacker who, this time
  • 25:29 - 25:31
    doesn't target the web application, but
  • 25:31 - 25:35
    targets the scanning engine and can
  • 25:35 - 25:36
    intentionally cause, for example, a denial
  • 25:36 - 25:39
of service by giving it so much traffic to
  • 25:39 - 25:41
    analyze that the web server running on
  • 25:41 - 25:43
    the same machine is unable to actually
  • 25:43 - 25:45
    respond to valid requests. So there you
  • 25:45 - 25:47
have it, that's a denial of service attack.
  • 25:47 - 25:49
    Now, when it comes to actually monitoring
  • 25:49 - 25:51
    the network traffic, we said that a
  • 25:51 - 25:53
    solution would be to just simply mirror
  • 25:53 - 25:55
    all the traffic, and then look for
  • 25:55 - 25:56
    specific attack patterns inside of that
  • 25:56 - 25:59
    traffic. Now, that might not be always
  • 25:59 - 26:01
    feasible because the amount of traffic
  • 26:01 - 26:03
    entering a data center or the server
  • 26:03 - 26:06
farm that hosts an application might be
  • 26:06 - 26:09
    huge, right? So in some situations, we
  • 26:09 - 26:11
    might not be able to analyze the exact
  • 26:11 - 26:14
    amount of traffic that goes in, but we
  • 26:14 - 26:17
    might be able to generate a summary of
  • 26:17 - 26:18
    that traffic and then analyze that
  • 26:18 - 26:21
    summary for intrusion attempts. Now, this
  • 26:21 - 26:24
    traffic summary is sometimes found under
  • 26:24 - 26:28
    the terminology of NetFlow or sFlow or
  • 26:28 - 26:30
    jFlow, which is basically just a
  • 26:30 - 26:32
    technology implemented by various
  • 26:32 - 26:34
    vendors out there in which instead of
  • 26:34 - 26:37
    creating an exact copy of the traffic,
  • 26:37 - 26:39
    we're simply summarizing that traffic,
  • 26:39 - 26:42
    and then reporting that summary back to
  • 26:42 - 26:44
    some analysis software. So we're only
  • 26:44 - 26:47
    telling it what type of sources, what
  • 26:47 - 26:49
    type of destinations have communicated,
  • 26:49 - 26:51
    how many bytes were used, what type of
  • 26:51 - 26:53
    protocols have been used,
  • 26:53 - 26:56
    what type of flags have been set in that
  • 26:56 - 26:58
    specific type of traffic. But we don't
  • 26:58 - 27:01
    put the burden of sending the entire
  • 27:01 - 27:03
    actual traffic in the entire payload to
  • 27:03 - 27:05
    that analysis software. Now, this also
  • 27:05 - 27:07
    means that we're losing application
  • 27:07 - 27:09
    layer visibility, all right? Since we're
  • 27:09 - 27:11
    just summarizing the type of traffic,
  • 27:11 - 27:13
    we're only describing the metadata about
  • 27:13 - 27:16
    that traffic, we're losing everything that
  • 27:16 - 27:18
    pertains to the application layer, but
  • 27:18 - 27:20
    we're gaining a lot of performance, and
  • 27:20 - 27:22
    we can also store this summary
  • 27:22 - 27:24
    information long term for further
  • 27:24 - 27:27
analysis somewhere along the line in the future.
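
    A rough sketch of what such a summary contains: collapsing the packets in
    a capture into per-flow records (source, destination, ports, protocol,
    packet and byte counts) instead of keeping full payloads. Assumes scapy
    and a hypothetical "capture.pcap":

        from collections import defaultdict
        from scapy.all import rdpcap, IP, TCP, UDP

        flows = defaultdict(lambda: {"packets": 0, "bytes": 0})

        for pkt in rdpcap("capture.pcap"):
            if not pkt.haslayer(IP):
                continue
            l4 = TCP if pkt.haslayer(TCP) else UDP if pkt.haslayer(UDP) else None
            if l4 is None:
                continue
            key = (pkt[IP].src, pkt[IP].dst, pkt[l4].sport, pkt[l4].dport, pkt[IP].proto)
            flows[key]["packets"] += 1
            flows[key]["bytes"] += len(pkt)

        for key, stats in flows.items():
            print(key, stats)   # the kind of record a NetFlow exporter would emit
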
  • 27:27 - 27:30
Sometimes, looking at the traffic is
  • 27:30 - 27:33
    simply not feasible. Maybe we cannot grab
  • 27:33 - 27:35
    all the traffic that's running through
  • 27:35 - 27:37
    the network. Maybe we don't have network
  • 27:37 - 27:39
    devices smart enough to generate those
  • 27:39 - 27:41
    summaries, those flow
  • 27:41 - 27:44
    reports for us. So another solution would
  • 27:44 - 27:47
    be to simply have a software monitoring
  • 27:47 - 27:49
solution, a so-called network
  • 27:49 - 27:51
    performance monitor that queries
  • 27:51 - 27:54
    periodically your networking devices,
  • 27:54 - 27:57
    queries your routers, your switches,
  • 27:57 - 27:59
    your wireless LAN controllers, your
  • 27:59 - 28:02
    firewalls perhaps about the status of
  • 28:02 - 28:05
    their physical resources, status of their
  • 28:05 - 28:07
    interfaces, how much traffic is going
  • 28:07 - 28:10
    through their interfaces, what's the CPU
  • 28:10 - 28:12
    load, what's the memory usage, what's the
  • 28:12 - 28:15
structure of the routing table, how
  • 28:15 - 28:18
the ARP table looks, how the DHCP
  • 28:18 - 28:20
traffic is looking, right? So any type
  • 28:20 - 28:22
    of monitoring information that can be
  • 28:22 - 28:23
    extracted out of these networking
  • 28:23 - 28:26
    devices, which, in turn, can be correlated
  • 28:26 - 28:28
    in order to figure out if we can see
  • 28:28 - 28:30
    some anomalies in there. One such
  • 28:30 - 28:33
    solution is, for example, SolarWinds NPM,
  • 28:33 - 28:35
Network Performance Monitor, which is a
  • 28:35 - 28:37
    dedicated solution for monitoring not
  • 28:37 - 28:39
    just networking devices, but also servers
  • 28:39 - 28:42
    and virtual machines about their
  • 28:42 - 28:45
health, right? How are their network
  • 28:45 - 28:47
interfaces looking, how much load is
  • 28:47 - 28:49
there on their hardware resources or
  • 28:49 - 28:51
    their hardware components, are they
  • 28:51 - 28:53
    generating any alerts, do we have failed
  • 28:53 - 28:55
    interfaces, do we have failed processes, do
  • 28:55 - 28:58
we have failed links,
  • 28:58 - 29:00
    are we detecting errors or overloaded
  • 29:00 - 29:03
    devices? Stuff like that.
  • 29:03 - 29:05
    Now, this type of performance monitoring
  • 29:05 - 29:08
    can be done over a variety of protocols.
  • 29:08 - 29:11
    In most cases, the SNMP protocol is going
  • 29:11 - 29:12
    to be used because it allows us to
  • 29:12 - 29:15
    report a lot of the hardware counters
  • 29:15 - 29:17
    and a lot of the interesting information
  • 29:17 - 29:19
    that we want to gather and store long
  • 29:19 - 29:22
term. Also, we might be using WMI, or
  • 29:22 - 29:24
Windows Management Instrumentation, and a
  • 29:24 - 29:26
couple of other protocols as well.
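
    As an illustration, fetching a single interface counter over SNMP with
    the net-snmp command-line tools (which must be installed; the address,
    community string, and interface index are hypothetical) might look like
    this from Python:

        import subprocess

        result = subprocess.run(
            ["snmpget", "-v2c", "-c", "public", "192.0.2.1", "IF-MIB::ifInOctets.1"],
            capture_output=True, text=True, check=True,
        )
        print(result.stdout.strip())   # e.g. "IF-MIB::ifInOctets.1 = Counter32: 123456789"

    A monitoring platform simply does this at scale, for thousands of
    counters, on a schedule, and stores the results.
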
  • 29:26 - 29:28
And of course, we could enrich this collection
  • 29:28 - 29:31
    by collecting logs from the monitored
  • 29:31 - 29:33
    devices and appliances as well. And we
  • 29:33 - 29:35
    could be collecting those logs over
  • 29:35 - 29:37
    syslog, so we need to configure the
  • 29:37 - 29:38
    device to actually send those syslog
  • 29:38 - 29:41
    messages or at least a copy of them to
  • 29:41 - 29:44
    the monitoring device. Or we could rely
  • 29:44 - 29:45
    on an agent, an additional piece of
  • 29:45 - 29:47
software installed on the server or on the
  • 29:47 - 29:49
    virtual machine that periodically
  • 29:49 - 29:51
    reports back to us everything of
  • 29:51 - 29:54
    interest regarding that specific
  • 29:54 - 29:56
    host. When talking about dedicated
  • 29:56 - 29:59
software specifically designed to
  • 29:59 - 30:02
    analyze a lot of information coming from
  • 30:02 - 30:04
the network, be it network traffic, network
  • 30:04 - 30:07
    summaries such as NetFlow, logs, and any
  • 30:07 - 30:10
    kind of application data, that solution
  • 30:10 - 30:12
    is most likely going to be called a SIEM,
  • 30:12 - 30:14
    a security information and event
  • 30:14 - 30:16
management. Now the keyword in the
  • 30:16 - 30:19
    definition of SIEM is correlation. That
  • 30:19 - 30:22
    is it's not just a place where you just
  • 30:22 - 30:24
    dump all that information in a huge
  • 30:24 - 30:27
    database, it's a place that as you dump
  • 30:27 - 30:29
    that information, it's going to look for
  • 30:29 - 30:31
    patterns inside of it. It's going to try
  • 30:31 - 30:34
    to correlate network traffic with logs
  • 30:34 - 30:37
    or application data with NetFlow
  • 30:37 - 30:39
    data in order to figure out if some
  • 30:39 - 30:42
    anomalous behavior is detected in your
  • 30:42 - 30:45
network. So SIEM solutions, and by the way,
  • 30:45 - 30:47
    these are pretty expensive solutions out
  • 30:47 - 30:50
    there, are never designed to be just log
  • 30:50 - 30:52
    storage, right? They're engines, smart
  • 30:52 - 30:55
    engines based on machine learning that
  • 30:55 - 30:58
    aim to detect patterns of intrusion by
  • 30:58 - 31:00
    analyzing and correlating information
  • 31:00 - 31:03
    found in multiple log files, and what's
  • 31:03 - 31:05
    interesting about the implementation of
  • 31:05 - 31:06
    SIEMs is that they're supposed to
  • 31:06 - 31:10
    collect logs from your network devices,
  • 31:10 - 31:12
    from your security devices, even from
  • 31:12 - 31:14
    your workstations, and your mobile
  • 31:14 - 31:16
    devices perhaps. And they're able to
  • 31:16 - 31:18
    understand and correlate all that
  • 31:18 - 31:19
    information and normalize all that
  • 31:19 - 31:22
    information even if it comes from tens
  • 31:22 - 31:24
    or hundreds of vendors or thousands of
  • 31:24 - 31:25
    devices,
  • 31:25 - 31:27
    and they're able to normalize that
  • 31:27 - 31:30
    information and make it look the same so
  • 31:30 - 31:31
    that in the end,
  • 31:31 - 31:34
    it can look for patterns inside of it,
  • 31:34 - 31:36
    and it also allows you to perform
  • 31:36 - 31:39
    queries in a language quite similar to a
  • 31:39 - 31:42
    regular SQL language and query all that
  • 31:42 - 31:45
    information regardless of the fact that
  • 31:45 - 31:47
it actually came from tens or
  • 31:47 - 31:50
hundreds of different vendors.
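
    A toy illustration of that normalization-plus-query idea: two log lines
    in different vendor formats (both invented) are mapped into one schema,
    loaded into SQLite, and then queried uniformly with ordinary SQL:

        import json, sqlite3

        raw_logs = [
            "fw01 2024-01-15T03:12:44Z deny tcp 203.0.113.9 -> 10.0.0.5:22",
            '{"device": "srv02", "time": "2024-01-15T03:12:45Z", '
            '"action": "login_failed", "src": "203.0.113.9"}',
        ]

        def normalize(line):
            if line.lstrip().startswith("{"):            # JSON-style vendor
                d = json.loads(line)
                return (d["device"], d["time"], d["action"], d["src"])
            dev, ts, action, _proto, src, _arrow, _dst = line.split()   # text-style vendor
            return (dev, ts, action, src)

        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE events (device TEXT, time TEXT, action TEXT, src TEXT)")
        db.executemany("INSERT INTO events VALUES (?, ?, ?, ?)",
                       [normalize(l) for l in raw_logs])

        # one query across both vendors, SIEM-style
        for row in db.execute("SELECT src, COUNT(*) FROM events GROUP BY src"):
            print(row)   # ('203.0.113.9', 2)
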
  • 31:50 - 31:51
And since a SIEM without machine learning
  • 31:51 - 31:54
    functionality is not a very useful SIEM,
  • 31:54 - 31:56
we could use those machine learning
  • 31:56 - 31:59
    features to look at user behavior as
  • 31:59 - 32:02
    well because in the end, we're trying not
  • 32:02 - 32:04
    to detect just, you know, attack patterns,
  • 32:04 - 32:06
    we're also trying to identify who is
  • 32:06 - 32:09
    conducting them. And a great risk comes
  • 32:09 - 32:11
    from insider threats, so if we are able
  • 32:11 - 32:13
    to monitor what our users are doing,
  • 32:13 - 32:15
    we're not talking here about just
  • 32:15 - 32:17
    watching what websites they're
  • 32:17 - 32:20
    visiting or taking frequent screenshots
  • 32:20 - 32:22
    of their workstations, no we're
  • 32:22 - 32:24
    not doing that, but we're looking at the
  • 32:24 - 32:26
    behavior that they're exhibiting
  • 32:26 - 32:28
    whenever they are interacting with
  • 32:28 - 32:31
    specific applications. And if the SIEM
  • 32:31 - 32:33
    has such an ability, we call that ability
  • 32:33 - 32:37
    user and entity behavior analysis. Don't
  • 32:37 - 32:40
    think that we're only performing here
  • 32:40 - 32:43
    a witch hunt against insider threats.
  • 32:43 - 32:45
    Think about the fact that we might be
  • 32:45 - 32:48
    able to detect abnormal behavior because
  • 32:48 - 32:50
    a user account has been compromised by a
  • 32:50 - 32:52
    hacker, and that hacker is now acting on
  • 32:52 - 32:55
    behalf of that user. The user might have
  • 32:55 - 32:57
    nothing to do with that abnormal
  • 32:57 - 32:59
    behavior, might not even know about it,
  • 32:59 - 33:01
    might not even be logged in at that
  • 33:01 - 33:03
    specific point in time. But the attacker
  • 33:03 - 33:05
    might be acting on behalf of that user.
  • 33:05 - 33:07
    If we're able to detect that abnormal
  • 33:07 - 33:10
    behavior, we might be able to detect the
  • 33:10 - 33:14
    attack going on right then. And stepping
  • 33:14 - 33:16
    just a bit into the realm of science
  • 33:16 - 33:19
    fiction here, I know that some vendors
  • 33:19 - 33:20
    will say no, this is not science fiction,
  • 33:20 - 33:23
    we're selling this, we've had huge
  • 33:23 - 33:25
    success with this. Well, yes and no. I'm
  • 33:25 - 33:28
    going to keep being a bit skeptical as to
  • 33:28 - 33:31
    how efficient this approach is. What I'm
  • 33:31 - 33:33
    talking here about is sentiment analysis
  • 33:33 - 33:36
or emotion AI. That is, analyzing user
  • 33:36 - 33:39
behavior in the content the user is
  • 33:39 - 33:42
actually creating, as in blog posts,
  • 33:42 - 33:44
    social media postings.
  • 33:44 - 33:46
    We're not talking here about actual, you
  • 33:46 - 33:49
    know, analyzing the contents of emails
  • 33:49 - 33:51
    and chats because that might, you
  • 33:51 - 33:54
know, step into the privacy area, which we
  • 33:54 - 33:56
might not want to do. But by
  • 33:56 - 33:58
    analyzing publicly available information
  • 33:58 - 34:00
    generated by those users, we might be
  • 34:00 - 34:03
    able to detect disgruntled employees. We
  • 34:03 - 34:06
    might be able to detect unsatisfied
  • 34:06 - 34:08
    clients that might create some bad
  • 34:08 - 34:11
    reputation for the company, perhaps even
  • 34:11 - 34:14
    before they become so upset as to take
  • 34:14 - 34:17
    action or malicious action against our
  • 34:17 - 34:19
    company. Again, take this with a grain of
  • 34:19 - 34:23
    salt, and don't just think that if it
  • 34:23 - 34:27
    sounds awesome on paper, it has to be
  • 34:27 - 34:29
    awesome in real life. If it sounds too
  • 34:29 - 34:31
    good to be true, then it probably is too
  • 34:31 - 34:33
    good to be true.
  • 34:33 - 34:35
    And finally, the last term here that I
  • 34:35 - 34:37
    wanted you to know about is SOAR,
  • 34:37 - 34:39
    security orchestration, automation and
  • 34:39 - 34:41
    response. That's a mouthful, I know. It's
  • 34:41 - 34:42
    usually a functionality built into SIEM
  • 34:42 - 34:44
    solutions or it can be just a standalone
  • 34:44 - 34:47
    solution. What it basically tries to
  • 34:47 - 34:49
    address is the problem of too much
  • 34:49 - 34:52
information, of being overwhelmed by
  • 34:52 - 34:55
    too many alerts, too many security events,
  • 34:55 - 34:57
    too many security incidents, too many
  • 34:57 - 34:58
    incidents that we need to determine if
  • 34:58 - 35:00
    they're security related or not. [Laughs]
  • 35:00 - 35:04
    Basically the hell of any IT Department
  • 35:04 - 35:07
    that deals solely with monitoring the
  • 35:07 - 35:09
    network and the applications. And the
  • 35:09 - 35:11
    idea behind this is that a SOAR
  • 35:11 - 35:13
    solution is supposed to use some machine
  • 35:13 - 35:16
    learning techniques in order to not just
  • 35:16 - 35:19
    to figure out which anomalous events are
  • 35:19 - 35:22
    occurring in the network, but by
  • 35:22 - 35:24
    analyzing those anomalous events, it is
  • 35:24 - 35:28
    able to take some action against them. So
  • 35:28 - 35:31
    it could, at some point, determine if an
  • 35:31 - 35:33
    attack is going on, even if it happens
  • 35:33 - 35:34
    in the middle of the night, and take
  • 35:34 - 35:36
    action immediately by blocking some
  • 35:36 - 35:39
    ports, by creating an access list, by
  • 35:39 - 35:40
temporarily disabling some user
  • 35:40 - 35:41
    accounts that might have been
  • 35:41 - 35:43
    compromised. So that's security
  • 35:43 - 35:47
    orchestration, automation and response.
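
    As a sketch of that idea in a few lines: a correlated alert is mapped to
    an automated response. The alert fields and the two helper functions are
    hypothetical placeholders for whatever firewall or directory API a real
    SOAR integration would call:

        def block_ip_on_firewall(ip):           # hypothetical integration
            print(f"[playbook] pushing a firewall rule to block {ip}")

        def disable_user_account(username):     # hypothetical integration
            print(f"[playbook] temporarily disabling account {username}")

        def run_playbook(alert):
            if alert["type"] == "brute_force" and alert["confidence"] > 0.8:
                block_ip_on_firewall(alert["src_ip"])
                disable_user_account(alert["target_user"])

        run_playbook({"type": "brute_force", "confidence": 0.93,
                      "src_ip": "203.0.113.9", "target_user": "jsmith"})
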
  • 35:47 - 35:49
Just to be sure everybody is clear on this,
  • 35:49 - 35:51
    especially for the exam, where does the
  • 35:51 - 35:53
SIEM get its information from? Well,
  • 35:53 - 35:55
    first of all, it's going to get it from
  • 35:55 - 35:57
    logs, right? Syslogs. That's going to be
  • 35:57 - 35:58
    the main source of information. How do
  • 35:58 - 36:00
    you collect logs? Well you don't really
  • 36:00 - 36:03
    collect them. You expect those devices to
  • 36:03 - 36:05
    send those to you, so those devices need
  • 36:05 - 36:07
to be configured, be they networking
  • 36:07 - 36:09
    devices. They might be servers, they
  • 36:09 - 36:10
    might be virtual machines, whatever type
  • 36:10 - 36:12
    of device you have, just configure them
  • 36:12 - 36:15
to send their logs to a secondary
  • 36:15 - 36:17
    destination if the SIEM is not the
  • 36:17 - 36:19
    primary one. Just make sure they send a
  • 36:19 - 36:22
    copy of those syslogs to the same device
  • 36:22 - 36:24
as well.
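
    For an application or agent, sending that copy can be as simple as adding
    a remote syslog handler; a minimal Python sketch, with a hypothetical
    collector address:

        import logging
        import logging.handlers

        logger = logging.getLogger("app")
        logger.setLevel(logging.INFO)
        logger.addHandler(logging.handlers.SysLogHandler(
            address=("siem.example.com", 514)))   # UDP syslog toward the SIEM/collector

        logger.warning("failed login for user jsmith from 203.0.113.9")

    Network devices do the equivalent with a "logging host" style
    configuration command pointed at the collector.
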
  • 36:24 - 36:27
Next, the SIEM can also collect data by installing agents on specific
  • 36:27 - 36:29
    systems. Now of course, we might not be
  • 36:29 - 36:31
    able to install agents on let's say
  • 36:31 - 36:34
    routers or switches apart from some
  • 36:34 - 36:37
    recent devices that are running Docker
  • 36:37 - 36:40
    containers perhaps. But in most cases, SIEM
  • 36:40 - 36:42
    agents are designed to be installed on
  • 36:42 - 36:45
    Windows and Linux systems. Then they're
  • 36:45 - 36:47
    running as background processes that
  • 36:47 - 36:49
    periodically scan the system and
  • 36:49 - 36:51
report back to the SIEM the logs
  • 36:51 - 36:53
generated by the operating system and
  • 36:53 - 36:55
the logs generated
  • 36:55 - 36:57
by the applications actually running on
  • 36:57 - 36:59
    that host, depending on how the agent is
  • 36:59 - 37:02
    configured. The built-in listeners or
  • 37:02 - 37:03
    collectors that you're seeing here on
  • 37:03 - 37:05
the slide refer to the fact that the
  • 37:05 - 37:09
    SIEM is pre-configured or has plugins
  • 37:09 - 37:12
    that allow it to understand what
  • 37:12 - 37:14
    different vendors are reporting back to
  • 37:14 - 37:15
    it. So it's going to have different
  • 37:15 - 37:18
    plugins to understand logs coming in
  • 37:18 - 37:20
    from, you know, Cisco devices, HP devices,
  • 37:20 - 37:25
    Dell, VMware, whatever vendor it is, it
  • 37:25 - 37:26
    needs some sort of a plugin to
  • 37:26 - 37:29
    understand that specific log format and
  • 37:29 - 37:31
    more than that, it needs to
  • 37:31 - 37:34
    understand the contents of the payload of
  • 37:34 - 37:38
    what the log is saying. SNMP traps, again,
  • 37:38 - 37:40
    most monitoring information is going to
  • 37:40 - 37:43
    come in through an SNMP query or as an
  • 37:43 - 37:46
    SNMP trap generated by the device back
  • 37:46 - 37:50
    to the SIEM. And also NetFlow. NetFlow or
  • 37:50 - 37:52
    different variants implemented by
  • 37:52 - 37:54
    different vendors are basically just
  • 37:54 - 37:58
    summaries of the traffic flows detected
  • 37:58 - 38:01
    over a certain period of time, collected,
  • 38:01 - 38:03
    and then sent over to the SIEM device
  • 38:03 - 38:05
    in order for that traffic summary to be
  • 38:05 - 38:09
    analyzed. Finally, the SIEM can also
  • 38:09 - 38:12
    capture raw packet data if it has
  • 38:12 - 38:15
    dedicated sensors that are able to
  • 38:15 - 38:17
    generate a copy of the traffic and send
  • 38:17 - 38:19
    it back to the SIEM, or we can even have
  • 38:19 - 38:21
    sensors installed inside our network that
  • 38:21 - 38:25
    are monitoring real traffic, and they're
  • 38:25 - 38:26
    only reporting back to the SIEM
  • 38:26 - 38:29
    a summary of
  • 38:29 - 38:31
    that traffic. This is very useful when
  • 38:31 - 38:35
    your devices don't have enough reporting
  • 38:35 - 38:38
    or monitoring capabilities to report
  • 38:38 - 38:40
    back to the SIEM device, and instead, you need
  • 38:40 - 38:42
    to install some specific sensors that
  • 38:42 - 38:45
    look at the traffic, and then tell the
  • 38:45 - 38:46
    SIEM the necessary information that it
  • 38:46 - 38:48
    needs to perform those correlations.
  • 38:48 - 38:51
    Sometimes a sensor such as this one might be
  • 38:51 - 38:54
    an IPS or an IDS device even. Log
  • 38:54 - 38:57
    normalization is a feature built into
  • 38:57 - 38:58
    most SIEM solutions out there. And
  • 38:58 - 39:00
    normalization is required, and it's a
  • 39:00 - 39:03
    very important feature because the SIEM
  • 39:03 - 39:04
    is designed to collect information from
  • 39:04 - 39:07
    hundreds of vendors and thousands of
  • 39:07 - 39:09
    different appliances, each of them
  • 39:09 - 39:11
    running different operating systems on
  • 39:11 - 39:13
    different versions, and they're all
  • 39:13 - 39:16
    generating syslogs and SNMP traps in
  • 39:16 - 39:18
    different formats. Some are
  • 39:18 - 39:21
    reporting them as text, some are
  • 39:21 - 39:23
    generating logs in binary format, some
  • 39:23 - 39:26
    logs are in JSON format, some are in XML
  • 39:26 - 39:30
    format or CSV format, depending on how
  • 39:30 - 39:32
    the vendor actually designed its logging
  • 39:32 - 39:35
    and monitoring abilities. We might even
  • 39:35 - 39:38
    find differences as to how the logs are
  • 39:38 - 39:39
    actually encoded. Some of them might
  • 39:39 - 39:41
    be using UTF, some of them might be using
  • 39:41 - 39:44
    some regional encoding. We might even run
  • 39:44 - 39:46
    into some issues due to the fact that
  • 39:46 - 39:48
    the new line character is represented
  • 39:48 - 39:50
    differently between Windows and Linux
  • 39:50 - 39:51
    systems, and that also might be reflected
  • 39:51 - 39:55
    in the payload included in the logs that
  • 39:55 - 39:57
    we're receiving as part of the
  • 39:57 - 39:59
    monitoring process. Not to mention the
  • 39:59 - 40:02
    fact that the SNMP MIBs, basically
  • 40:02 - 40:05
    the database schemas that each vendor is
  • 40:05 - 40:08
    using for their own software solutions
  • 40:08 - 40:09
    or hardware appliances, these are
  • 40:09 - 40:11
    completely different not just among
  • 40:11 - 40:13
    vendors, but also among different
  • 40:13 - 40:16
    products from the same vendor. So in
  • 40:16 - 40:17
    order to have all of this
  • 40:17 - 40:19
    information collected in some
  • 40:19 - 40:21
    centralized location and to be able to
  • 40:21 - 40:23
    query all this information and to be
  • 40:23 - 40:26
    able to approach it in a consistent
  • 40:26 - 40:28
    manner, we need normalization. That is
  • 40:28 - 40:31
    taking all this information coming from
  • 40:31 - 40:34
    so many vendors in so many formats and
  • 40:34 - 40:37
    making that information look exactly the
  • 40:37 - 40:40
    same so that it can be stored in a
  • 40:40 - 40:42
    single database that can be queried at
  • 40:42 - 40:44
    once regardless of the source of that
  • 40:44 - 40:47
    information. So what are we using to
  • 40:47 - 40:49
    normalize all this information coming
  • 40:49 - 40:50
    from all these vendors? Well, you guessed
  • 40:50 - 40:53
    it: we're gonna need some plugins. Some of
  • 40:53 - 40:55
    these plugins come from the SIEM vendor
  • 40:55 - 40:57
    itself. So they're going to be
  • 40:57 - 40:59
    pre-packaged with plugins
  • 40:59 - 41:02
    from major vendors out there. Some of
  • 41:02 - 41:03
    these plugins are going to come from the
  • 41:03 - 41:06
    actual vendors. So if a smaller vendor
  • 41:06 - 41:08
    creates, let's say, smaller
  • 41:08 - 41:11
    firewalls at some point, and they want to
  • 41:11 - 41:12
    be able to integrate with the
  • 41:12 - 41:14
    large-scale SIEM solutions, they're going
  • 41:14 - 41:16
    to provide you with a plugin for their
  • 41:16 - 41:18
    own environment as well.
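    Just to visualize what such a plugin's normalization step does, here is a
    tiny sketch using jq that maps one vendor's made-up JSON field names onto
    a common schema; the field names on both sides are assumptions for
    illustration only.

        # Vendor A calls the fields "devTime", "srcAddr" and "msg";
        # we rename them to a common "timestamp", "source_ip" and "message".
        echo '{"devTime":"2024-01-01T00:00:00Z","srcAddr":"203.0.113.45","msg":"login failed"}' \
          | jq '{timestamp: .devTime, source_ip: .srcAddr, message: .msg}'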
  • 41:18 - 41:20
    And another type of normalization that is really, really
  • 41:20 - 41:22
    important is timestamp normalization.
  • 41:22 - 41:23
    Don't forget that we're looking for
  • 41:23 - 41:26
    anomalies in network traffic and in
  • 41:26 - 41:29
    network events. And if we don't have
  • 41:29 - 41:32
    timestamp normalization, if we don't make
  • 41:32 - 41:35
    sure that all the events that we're
  • 41:35 - 41:38
    looking at are actually stored with
  • 41:38 - 41:40
    their right timestamp, at their right
  • 41:40 - 41:42
    moment in time when they actually
  • 41:42 - 41:45
    happened, we have no chance of detecting
  • 41:45 - 41:47
    anomalies in the network. So we might
  • 41:47 - 41:50
    have devices that have a badly
  • 41:50 - 41:52
    configured clock. We might have devices
  • 41:52 - 41:54
    that have been configured for different
  • 41:54 - 41:57
    time zones. We might have devices that
  • 41:57 - 42:00
    write their timestamps and time
  • 42:00 - 42:03
    values in their logs in one
  • 42:03 - 42:05
    format versus another. Some of
  • 42:05 - 42:07
    them might be using a 24-hour format, some of them
  • 42:07 - 42:09
    might be using a 12-hour format. Some of them
  • 42:09 - 42:11
    might include daylight saving time.
  • 42:11 - 42:14
    Some of them might be using UTC or
  • 42:14 - 42:18
    Unix epoch time. It's up to the vendor, so
  • 42:18 - 42:21
    normalizing these timestamps is also a
  • 42:21 - 42:24
    very, very important topic here that
  • 42:24 - 42:26
    needs to be taken care of by the SIEM
  • 42:26 - 42:30
    solution before that event indicated by
  • 42:30 - 42:32
    that specific timestamp is stored in the
  • 42:32 - 42:35
    database alongside the others.
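    As a small illustration of what timestamp normalization means in
    practice, GNU date can rewrite several of the formats mentioned above
    into a single UTC ISO 8601 form; the sample values are made up.

        # A Unix epoch value...
        date -u -d @1700000000 +"%Y-%m-%dT%H:%M:%SZ"        # 2023-11-14T22:13:20Z

        # ...and a 12-hour local timestamp with a named time zone,
        # both normalized to the same UTC format before storage.
        date -u -d "Nov 14 2023 5:13:20 PM EST" +"%Y-%m-%dT%H:%M:%SZ"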
  • 42:35 - 42:37
    Now, how can a SIEM solution look for
  • 42:37 - 42:40
    anomalies in that huge database that we
  • 42:40 - 42:42
    just talked about? Well, it can be done
  • 42:42 - 42:44
    in a number of ways. We could just rely
  • 42:44 - 42:48
    on simple if-then-else matches, so we're
  • 42:48 - 42:50
    looking for, you know, specific events,
  • 42:50 - 42:53
    specific types of logs being generated
  • 42:53 - 42:56
    in a specific time range perhaps. This
  • 42:56 - 42:58
    type of approach is the fastest one
  • 42:58 - 43:00
    because it basically boils down to
  • 43:00 - 43:02
    a simple query in that huge database
  • 43:02 - 43:04
    stored by the SIEM appliance.
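    To make that idea concrete, a signature-style match really can boil down
    to a single query. This sketch assumes a hypothetical SQLite database
    events.db with an events table and these column names; a real SIEM will
    have its own schema and query language.

        # Hypothetical schema: events(ts_utc, source_ip, event_type, message)
        sqlite3 events.db \
          "SELECT ts_utc, source_ip, message
             FROM events
            WHERE event_type = 'auth_failure'
              AND ts_utc BETWEEN '2024-01-01T00:00:00Z' AND '2024-01-01T23:59:59Z';"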
  • 43:04 - 43:07
    Unfortunately, if there are unknown
  • 43:07 - 43:08
    threats, if there are attacks that we
  • 43:08 - 43:11
    know nothing about, that we don't have a
  • 43:11 - 43:12
    signature for, we don't know what to
  • 43:12 - 43:14
    look for, we're not going to be able to
  • 43:14 - 43:17
    detect them. Kind of makes sense, right? So
  • 43:17 - 43:18
    another approach would be heuristic rule
  • 43:18 - 43:21
    matching. This is a type of rule matching
  • 43:21 - 43:23
    where we're not exactly looking for an
  • 43:23 - 43:24
    exact match
  • 43:24 - 43:27
    for the specific type of event, but we're
  • 43:27 - 43:30
    looking for something that's pretty
  • 43:30 - 43:33
    close to it, all right? So this type of
  • 43:33 - 43:34
    approach
  • 43:34 - 43:38
    relies on a more permissive set of rules.
  • 43:38 - 43:41
    So if an event doesn't 100% match our rule
  • 43:41 - 43:44
    but is
  • 43:44 - 43:46
    pretty close to it and matches it at,
  • 43:46 - 43:48
    let's say, 80% or 90%, it can still be flagged.
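    As a loose analogy for the difference between an exact signature and a
    more permissive, heuristic-style rule, compare a strict grep string match
    with a more forgiving pattern; the file name and message text are made up.

        # Exact signature: only this precise message matches.
        grep -F 'Failed password for root from 203.0.113.45' auth.log

        # Permissive rule: any failed login for any privileged-looking
        # account from any address still gets flagged.
        grep -Ei 'failed (password|login) for (root|admin|administrator) from [0-9.]+' auth.log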
  • 43:48 - 43:51
    Now this also requires you to fine-tune
  • 43:51 - 43:54
    your rule set, so if at some point by
  • 43:54 - 43:57
    doing heuristic rule matching, you're
  • 43:57 - 43:59
    detecting some anomalies, but you don't
  • 43:59 - 44:03
    have a rule that matches that anomaly
  • 44:03 - 44:06
    100%, well, you better create it, right?
  • 44:06 - 44:08
    You better fine tune your rule set and
  • 44:08 - 44:10
    add some more rules or tweak the
  • 44:10 - 44:13
    existing ones to match that newly
  • 44:13 - 44:16
    detected anomaly. And just to recap this
  • 44:16 - 44:19
    here, behavioral analysis implemented
  • 44:19 - 44:21
    in a SIEM relies on the fact that you need
  • 44:21 - 44:23
    to build a baseline. You need to tell the
  • 44:23 - 44:26
    SIEM what your normal looks like, what
  • 44:26 - 44:28
    your normal traffic looks like, what
  • 44:28 - 44:31
    the normal logs generated by all
  • 44:31 - 44:32
    the devices and all the applications in
  • 44:32 - 44:35
    your network look like. So that, in turn,
  • 44:35 - 44:37
    can be used as a starting point in order
  • 44:37 - 44:40
    to detect potential, well, mismatches that
  • 44:40 - 44:44
    might indicate attacks or attempts at
  • 44:44 - 44:46
    compromising your network. Now of course,
  • 44:46 - 44:47
    this is going to create a lot of false
  • 44:47 - 44:49
    positives. So you might run into a situation
  • 44:49 - 44:52
    where an alert is being raised because
  • 44:52 - 44:56
    an application starts generating some
  • 44:56 - 44:58
    huge backups because some admin has
  • 44:58 - 45:01
    modified the backup policy. Now the SIEM
  • 45:01 - 45:03
    device sees a lot of traffic in there,
  • 45:03 - 45:06
    raises an alert, wakes everyone from their
  • 45:06 - 45:09
    sleep at 3 a.m., saying
  • 45:09 - 45:11
    that, oh my god, this looks like a data
  • 45:11 - 45:13
    exfiltration attempt. Somebody is
  • 45:13 - 45:16
    dumping all the data from our database,
  • 45:16 - 45:17
    and then an admin has to come in and
  • 45:17 - 45:20
    intervene and say, my dear SIEM, what's
  • 45:20 - 45:22
    happening in there? What you're seeing is
  • 45:22 - 45:24
    just a full backup happening at 3 a.m.
  • 45:24 - 45:28
    It's okay, right? Don't freak
  • 45:28 - 45:31
    out about it, okay? So it does require
  • 45:31 - 45:33
    human intervention for fine-tuning these
  • 45:33 - 45:35
    rules.
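    Here is a very rough sketch of the baseline idea, assuming a made-up file
    daily_bytes.txt with one outbound-traffic byte count per line: build an
    average from history, then flag anything far above it.

        # Baseline = mean of the historical values; flag anything over 3x the mean.
        awk '{ sum += $1; vals[NR] = $1 }
             END {
               mean = sum / NR
               for (i = 1; i <= NR; i++)
                 if (vals[i] > 3 * mean)
                   printf "line %d: %s bytes looks anomalous (baseline %.0f)\n", i, vals[i], mean
             }' daily_bytes.txt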
  • 45:35 - 45:36
    On the other hand, we have anomaly
  • 45:36 - 45:39
    analysis. And this is, by definition, a
  • 45:39 - 45:40
    type of analysis that is performed
  • 45:40 - 45:43
    whenever we're comparing observed
  • 45:43 - 45:47
    behavior with known standard behavior,
  • 45:47 - 45:49
    especially when we're comparing what
  • 45:49 - 45:51
    we're seeing as part of a protocol's
  • 45:51 - 45:54
    behavior with how the SIEM device
  • 45:54 - 45:56
    knows the protocol is supposed to
  • 45:56 - 45:59
    behave according to its RFC, according to
  • 45:59 - 46:01
    its definition. Finally, with trend
  • 46:01 - 46:03
    analysis, we're going to be looking at
  • 46:03 - 46:06
    historic data and try to extrapolate it.
  • 46:06 - 46:08
    For example, if we see that the backups
  • 46:08 - 46:10
    are increasing every single week because
  • 46:10 - 46:13
    more and more data is generated, the
  • 46:13 - 46:16
    SIEM device might be able to generate a
  • 46:16 - 46:19
    pattern so that if we see five gigabytes
  • 46:19 - 46:22
    in a backup this week, and eight gigabytes
  • 46:22 - 46:25
    of backups next week, when it
  • 46:25 - 46:28
    sees 12 gigabytes two weeks from now, it's
  • 46:28 - 46:30
    not going to raise an alert because it
  • 46:30 - 46:32
    expected the backup volume to increase
  • 46:32 - 46:34
    by that amount. But I don't need to tell
  • 46:34 - 46:36
    you that not everything can be safely
  • 46:36 - 46:40
    predicted this way.
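    A minimal sketch of that kind of extrapolation, assuming a made-up file
    weekly_backup_gb.txt with one backup size in gigabytes per line: project
    next week from the last observed growth, and only alert if the next real
    value lands far away from the projection.

        awk '{ prev = cur; cur = $1 }
             END {
               growth   = cur - prev      # last week-over-week increase
               expected = cur + growth    # naive projection for next week
               printf "last %s GB, growth %s GB, expected next week about %s GB\n", cur, growth, expected
             }' weekly_backup_gb.txt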
  • 46:40 - 46:42
    Finally, after all that advanced correlation and machine
  • 46:42 - 46:45
    learning and AI features, the SIEMs
  • 46:45 - 46:49
    actually can be used as a database for
  • 46:49 - 46:53
    event storage, and they can be queried by
  • 46:53 - 46:55
    human users, by admins if you know what
  • 46:55 - 46:57
    to look for. Perhaps you just need to
  • 46:57 - 46:59
    investigate some event. Perhaps you need
  • 46:59 - 47:02
    to perform some forensic analysis.
  • 47:02 - 47:04
    So those databases become available to
  • 47:04 - 47:08
    you, to any admin basically, simply by
  • 47:08 - 47:10
    creating specific rules in order to
  • 47:10 - 47:12
    match specific types of events stored in
  • 47:12 - 47:14
    there. So you could create simple rules
  • 47:14 - 47:16
    that are matching based on
  • 47:16 - 47:18
    specific conditions. Look for one
  • 47:18 - 47:20
    specific IP address or look for a
  • 47:20 - 47:22
    specific time range, look for one
  • 47:22 - 47:26
    specific string that might occur in all
  • 47:26 - 47:28
    those log payloads. Maybe look for a user
  • 47:28 - 47:32
    and see what events are
  • 47:32 - 47:34
    generated by the user or that
  • 47:34 - 47:36
    implicate that user and so on and so
  • 47:36 - 47:37
    forth. So the SIEM appliances are going
  • 47:37 - 47:40
    to allow you to create some queries very
  • 47:40 - 47:41
    similar to what you might already be
  • 47:41 - 47:44
    used to if you ever used SQL in the past
  • 47:44 - 47:46
    because all that data is basically
  • 47:46 - 47:48
    stored in a relational database which
  • 47:48 - 47:51
    can be queried with an SQL-like
  • 47:51 - 47:54
    language. And finally, don't forget that
  • 47:54 - 47:56
    at the end of the day, not everybody has
  • 47:56 - 47:59
    money to invest in a SIEM solution, so you
  • 47:59 - 48:01
    might end up having to analyze your logs
  • 48:01 - 48:04
    by yourself, just navigating a bunch of
  • 48:04 - 48:07
    logs. And this is where a bunch of text
  • 48:07 - 48:10
    matching utilities, especially some
  • 48:10 - 48:11
    utilities that are built into most Linux
  • 48:11 - 48:14
    distributions, are going to come in and
  • 48:14 - 48:16
    help you tremendously. Now, this is not a
  • 48:16 - 48:18
    Linux course, and the exam is not going
  • 48:18 - 48:20
    to expect you to know everything about
  • 48:20 - 48:24
    all these command-line utilities. But I
  • 48:24 - 48:26
    would say that knowing at least the
  • 48:26 - 48:28
    commands right here on the slide is
  • 48:28 - 48:30
    going to help you figure out a couple of
  • 48:30 - 48:33
    the outputs on the exam. Alright, so without
  • 48:33 - 48:35
    going into too much detail here, let's
  • 48:35 - 48:38
    have a look in one of my folders here
  • 48:38 - 48:40
    that stores log files on an Ubuntu
  • 48:40 - 48:43
    distribution running on WSL,
  • 48:43 - 48:45
    right, Windows Subsystem for Linux. We
  • 48:45 - 48:48
    have a log file right here, dpkg.log,
  • 48:48 - 48:50
    which is the log that's generated by the
  • 48:50 - 48:52
    package manager. So this log is going to
  • 48:52 - 48:54
    tell me which package-based operations
  • 48:54 - 48:56
    have been conducted on this machine
  • 48:56 - 48:58
    from its beginning, from its
  • 48:58 - 49:00
    installation, right? What did I install,
  • 49:00 - 49:02
    what did I uninstall, what did I upgrade?
  • 49:02 - 49:04
    So it might be some useful information
  • 49:04 - 49:07
    in here. So let's just see a couple of
  • 49:07 - 49:09
    these commands. 'cat' is the concatenate
  • 49:09 - 49:11
    command in Linux and can also be used to
  • 49:11 - 49:14
    list the contents of
  • 49:14 - 49:18
    text files. So cat dpkg.log is going to
  • 49:18 - 49:19
    provide you with a long listing right
  • 49:19 - 49:22
    here, trying to display all the contents
  • 49:22 - 49:24
    of the text file right at the console.
  • 49:24 - 49:26
    Now, this file right here, we can also
  • 49:26 - 49:30
    pipe it. So we send the result of this cat
  • 49:30 - 49:32
    command to another command, which could
  • 49:32 - 49:35
    be word count, wc -l. This is
  • 49:35 - 49:37
    going to count the lines in this log file.
  • 49:37 - 49:40
    So you can see it's over 9,000 lines
  • 49:40 - 49:42
    long. Pretty tough to search for some
  • 49:42 - 49:46
    information in a 9,000-line log file. So
  • 49:46 - 49:49
    what we can do right here is, for example,
  • 49:49 - 49:51
    limit the amount of information that
  • 49:51 - 49:52
    we're displaying on the screen. This is
  • 49:52 - 49:54
    where the head or tail commands come in.
  • 49:54 - 49:55
    The head command, as you can probably
  • 49:55 - 49:57
    guess, is going to provide you with a
  • 49:57 - 50:00
    listing of the first 10 lines in this
  • 50:00 - 50:04
    log file. Similarly, the tail command is
  • 50:04 - 50:06
    going to provide you a listing of the
  • 50:06 - 50:08
    last 10 lines in a log file. The tail
  • 50:08 - 50:10
    command is very useful for log files
  • 50:10 - 50:12
    that get appended to frequently. So if you just
  • 50:12 - 50:15
    want to see the last modifications
  • 50:15 - 50:18
    made in this file, use the tail
  • 50:18 - 50:19
    command. Of course, the number of lines is
  • 50:19 - 50:21
    configurable. We're not going to go into
  • 50:21 - 50:24
    all these parameters right now. If you're
  • 50:24 - 50:26
    interested in finding out more about any
  • 50:26 - 50:28
    Linux command, any Linux utility, just use
  • 50:28 - 50:31
    the man pages, man tail,
  • 50:31 - 50:33
    and it's going to provide you with the
  • 50:33 - 50:35
    manual pages that are going to tell you
  • 50:35 - 50:38
    all the possible configuration
  • 50:38 - 50:40
    flags or settings that can be added to
  • 50:40 - 50:42
    this command. Here's the dash n, for
  • 50:42 - 50:44
    example, number of lines. Output the
  • 50:44 - 50:46
    last N lines; you can add it as a
  • 50:46 - 50:49
    minus n parameter or dash dash lines
  • 50:49 - 50:51
    equals how many lines you want to
  • 50:51 - 50:52
    display on the screen. Quit the man page with the
  • 50:52 - 50:56
    letter Q. Now, the grep utility is a
  • 50:56 - 50:58
    regular expression evaluator, which can
  • 50:58 - 51:01
    be, of course, used to run some complex
  • 51:01 - 51:03
    regular expressions, which are going to
  • 51:03 - 51:05
    help you tremendously dig through a lot
  • 51:05 - 51:07
    of information and extract what is actually
  • 51:07 - 51:10
    useful to you. But you can also do some
  • 51:10 - 51:12
    very simple string matching using grep.
  • 51:12 - 51:15
    For example, if we are displaying the
  • 51:15 - 51:19
    dpkg.log here and piping this
  • 51:19 - 51:21
    to the grep command and searching for, let's
  • 51:21 - 51:23
    say, installation of a specific package,
  • 51:23 - 51:27
    such as, let me see, ansible, right? I did
  • 51:27 - 51:28
    use this machine for ansible in the past.
  • 51:28 - 51:31
    So there you go. These are all the log
  • 51:31 - 51:33
    entries in here generated by the ansible
  • 51:33 - 51:36
    package. Notice that we've been through a
  • 51:36 - 51:37
    number of ansible versions in here.
  • 51:37 - 51:41
    Starting from version 2.8.1, we went
  • 51:41 - 51:45
    through 2.9.19, 2.9.27, and so on. We can
  • 51:45 - 51:47
    even see the evolution of this package
  • 51:47 - 51:49
    on this machine. Now, this is just a very,
  • 51:49 - 51:52
    very simple example here. I just wanted
  • 51:52 - 51:54
    to let you know that you do have a lot
  • 51:54 - 51:56
    of utilities available at your disposal
  • 51:56 - 51:59
    for manual log searching if you don't
  • 51:59 - 52:02
    have a SIEM solution available, all right?
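    Putting the commands from this demo together, a typical manual search
    chains them in a single pipeline; the package name and the standard
    /var/log/dpkg.log location are just the ones used in this example.

        # How many entries does the log have?
        wc -l /var/log/dpkg.log

        # Show only the install/upgrade history for one package,
        # then keep just the five most recent entries.
        grep ' ansible' /var/log/dpkg.log | grep -E ' (install|upgrade) ' | tail -n 5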
  • 52:02 - 52:03
    Now, there's a lot more to talk about
  • 52:03 - 52:06
    this, but since this is not a Linux
  • 52:06 - 52:07
    training, we're gonna stop right here.
  • 52:07 - 52:09
    Alright everyone, thanks so much for
  • 52:09 - 52:10
    watching. I know there's been a lot of
  • 52:10 - 52:12
    information in this video, but I hope you
  • 52:12 - 52:15
    found this useful and informative, and I
  • 52:15 - 52:17
    hope to see you on the next video as
  • 52:17 - 52:18
    well. Don't forget to leave a comment if
  • 52:18 - 52:20
    you like this. Support the channel if you
  • 52:20 - 52:22
    can, if you wish, if you find this useful
  • 52:22 - 52:24
    in your studies, and see you in the next
  • 52:24 - 52:26
    video. Bye, bye.
  • 52:26 - 52:30
    [Music]
  • 52:30 - 52:40
    [Music]
Title:
CompTIA Security+ Full Course: Security Network Monitoring & SIEMs
Description:

Video Language:
English
Duration:
52:39
