
Intermediate Searching in Splunk

  • 0:02 - 0:03
    Hello everyone.
  • 0:04 - 0:07
    We are getting started here on
  • 0:07 - 0:10
    our August lunch and learn session
  • 0:10 - 0:13
    presented by Kinney Group's Atlas Customer
  • 0:13 - 0:16
    Experience team. My name is Alice Devaney. I
  • 0:16 - 0:19
    am the engineering manager for the Atlas
  • 0:19 - 0:22
    Customer Experience team, and I'm excited
  • 0:22 - 0:25
    to be presenting this month's session on
  • 0:25 - 0:28
    intermediate-level Splunk searching. So
  • 0:28 - 0:30
    thank you all for attending. I hope you
  • 0:30 - 0:33
    get some good ideas out of this.
  • 0:33 - 0:35
    I certainly encourage engagement through
  • 0:35 - 0:37
    the chat, and I'll have some
  • 0:37 - 0:40
    information at the end on following up
  • 0:40 - 0:42
    and speaking with my team directly on
  • 0:42 - 0:46
    any issues or interests that you have
  • 0:46 - 0:48
    around these types of concepts that
  • 0:48 - 0:52
    we're going to cover today. So jumping
  • 0:52 - 0:55
    into an intermediate-level session.
  • 0:55 - 0:58
    I do want to say that we have previously
  • 0:58 - 1:02
    done a basic level searching
  • 1:02 - 1:05
    session so that we are really
  • 1:05 - 1:07
    progressing from that, picking up right
  • 1:07 - 1:09
    where we left off. We've done that
  • 1:09 - 1:11
    session with quite a few of our
  • 1:11 - 1:13
    customers individually and highly
  • 1:13 - 1:15
    recommend if you're interested in doing
  • 1:15 - 1:18
    that or this session with a larger team,
  • 1:18 - 1:20
    we're happy to discuss and
  • 1:20 - 1:23
    coordinate that. So getting started,
  • 1:23 - 1:26
    we're going to take a look at the final
  • 1:26 - 1:29
    search from our basic search session.
  • 1:29 - 1:31
    And we're going to walk through that,
  • 1:31 - 1:34
    understand some of the concepts, and
  • 1:34 - 1:36
    then we're going to take a step back,
  • 1:36 - 1:39
    look a little more generally at SPL
  • 1:39 - 1:42
    operations and understanding how
  • 1:42 - 1:46
    different commands apply to data, and
  • 1:46 - 1:49
    really that next level of understanding
  • 1:49 - 1:52
    for how you can write more complex
  • 1:52 - 1:54
    searches and understand really when
  • 1:54 - 1:57
    to use certain types of commands. And
  • 1:57 - 2:00
    of course, in the session we're going
  • 2:00 - 2:04
    to have a series of demos using
  • 2:04 - 2:07
    a few specific commands, highlighting the
  • 2:07 - 2:10
    different SPL command types that we
  • 2:10 - 2:13
    discuss in the second portion and get
  • 2:13 - 2:16
    to see that on the tutorial data that
  • 2:16 - 2:18
    you can also use in your environment,
  • 2:18 - 2:21
    in a test environment very
  • 2:21 - 2:24
    simply. So I will always encourage
  • 2:24 - 2:28
    especially with search content that you
  • 2:28 - 2:30
    look into the additional resources that I
  • 2:30 - 2:34
    have listed here. The search reference
  • 2:34 - 2:36
    documentation is one of my favorite
  • 2:36 - 2:39
    bookmarks that I use frequently in my
  • 2:39 - 2:41
    own environments and working in customer
  • 2:41 - 2:44
    environments. It is really the
  • 2:44 - 2:46
    best quick resource to get information
  • 2:46 - 2:50
    on syntax and examples of any search
  • 2:50 - 2:52
    command and is always a great
  • 2:52 - 2:55
    resource to have. The search manual is a
  • 2:55 - 2:57
    little bit more conceptual, but as you're
  • 2:57 - 2:59
    learning more about different types of
  • 2:59 - 3:00
    search operations,
  • 3:00 - 3:02
    it's very helpful to be able to review
  • 3:02 - 3:04
    this documentation
  • 3:04 - 3:06
    and have reference
  • 3:06 - 3:09
    material that you can come back to as
  • 3:09 - 3:11
    you are studying and trying to get
  • 3:11 - 3:13
    better at writing more complex
  • 3:13 - 3:17
    search content. I have also linked here
  • 3:17 - 3:19
    the documentation on how to use the
  • 3:19 - 3:22
    Splunk tutorial data, so if you've not
  • 3:22 - 3:23
    done that before, it's a very simple
  • 3:23 - 3:26
    process, and there are consistently
  • 3:26 - 3:28
    updated download files that Splunk
  • 3:28 - 3:31
    provides that you're able to directly
  • 3:31 - 3:33
    upload into any Splunk environment. So
  • 3:33 - 3:36
    that's what I'm going to be using today,
  • 3:36 - 3:39
    and given that you are searching over
  • 3:39 - 3:41
    appropriate time windows for when you
  • 3:41 - 3:44
    download the tutorial dataset, these
  • 3:44 - 3:47
    searches will work on the tutorial
  • 3:47 - 3:49
    data as well. So I highly encourage, after
  • 3:49 - 3:51
    the fact, if you want to go through
  • 3:51 - 3:54
    and test out some of the content,
  • 3:54 - 3:57
    you'll be able to access a recording, and
  • 3:57 - 3:59
    if you'd like the slides that
  • 3:59 - 4:01
    I'm presenting from today, which I
  • 4:01 - 4:02
    highly encourage because there are a lot
  • 4:02 - 4:05
    of useful links in here: reach out to
  • 4:05 - 4:07
    my team. Again, right at the end of the
  • 4:07 - 4:09
    slides we'll have that info.
  • 4:09 - 4:13
    So looking at our overview of basic
  • 4:13 - 4:16
    search, I just want to cover
  • 4:16 - 4:18
    conceptually the two categories that
  • 4:18 - 4:22
    we discuss in that session. And so those
  • 4:22 - 4:24
    two are the statistical and charting
  • 4:24 - 4:28
    functions, which in those demos consist of
  • 4:28 - 4:31
    aggregate and time functions. So
  • 4:31 - 4:34
    aggregate functions are going to be your
  • 4:34 - 4:37
    commonly used statistical functions
  • 4:37 - 4:40
    meant for summarization, and then time
  • 4:40 - 4:43
    functions actually using the
  • 4:43 - 4:47
    timestamp field _time or any
  • 4:47 - 4:49
    other time that you've extracted from
  • 4:49 - 4:52
    data and looking at earliest, latest
  • 4:52 - 4:55
    relative time values in a
  • 4:55 - 4:58
    summative fashion. And then evaluation
  • 4:58 - 5:02
    functions are the separate type where
  • 5:02 - 5:04
    we discuss comparison and conditional
  • 5:04 - 5:08
    statements, so using your if and your
  • 5:08 - 5:10
    case functions in
  • 5:10 - 5:14
    evals. Also datetime functions that
  • 5:14 - 5:17
    apply operations to events uniquely
  • 5:17 - 5:20
    so not necessarily summarization, but
  • 5:20 - 5:22
    interacting with the time values
  • 5:22 - 5:24
    themselves, maybe changing the time
  • 5:24 - 5:27
    format, and then multivalue eval
  • 5:27 - 5:29
    functions, we touch on that very lightly,
  • 5:29 - 5:32
    and it is more conceptual in basic
  • 5:32 - 5:34
    search. So today we're going to dive in
  • 5:34 - 5:36
    as part of our demo and look at
  • 5:36 - 5:39
    multivalue eval functions later in
  • 5:39 - 5:41
    the presentation.
  • 5:41 - 5:45
    So on this slide here I
  • 5:45 - 5:49
    have highlighted in gray the search
  • 5:49 - 5:52
    that we end basic search with. And so
  • 5:52 - 5:55
    that is broken up into three segments
  • 5:55 - 5:57
    where we have the first line being a
  • 5:57 - 6:00
    filter to a dataset. This is very
  • 6:00 - 6:03
    simply how you are sourcing most of your
  • 6:03 - 6:06
    data in most of your searches in Splunk.
  • 6:06 - 6:08
    And we always want to be as specific
  • 6:08 - 6:11
    as possible. You'll most often see the
  • 6:11 - 6:13
    logical way to do that is by
  • 6:13 - 6:16
    identifying an index and a source type,
  • 6:16 - 6:18
    possibly some specific values of given
  • 6:18 - 6:20
    fields in that data before you start
  • 6:20 - 6:23
    applying other operations. In our case, we
  • 6:23 - 6:25
    want to work with a whole dataset,
  • 6:25 - 6:29
    and then we move into applying our eval
  • 6:29 - 6:30
    statements.
  • 6:30 - 6:33
    So in the evals, the purpose of these is
  • 6:33 - 6:37
    to create some new fields to work with,
  • 6:37 - 6:40
    and so we have two operations here.
  • 6:40 - 6:42
    And you can see that on the first line,
  • 6:42 - 6:46
    we're starting with an error check field.
  • 6:46 - 6:49
    These are web access logs, so we're
  • 6:49 - 6:53
    looking at the HTTP status codes as the
  • 6:53 - 6:56
    status field, and we have a logical
  • 6:56 - 6:58
    condition here for greater than or equal
  • 6:58 - 7:01
    to 400, we want to return errors. And so
  • 7:01 - 7:04
    very simple example, making it as easy
  • 7:04 - 7:06
    as possible. If you want to get specifics
  • 7:06 - 7:09
    on your 200s and your 300s, it's the
  • 7:09 - 7:12
    exact same type of logic to go and apply
  • 7:12 - 7:14
    likely a case statement to get some
  • 7:14 - 7:17
    additional conditions and more unique
  • 7:17 - 7:21
    output in an error check or some sort of
  • 7:21 - 7:24
    field indicating what you want to
  • 7:24 - 7:26
    see out of your status code, so in this case,
  • 7:26 - 7:30
    simple errors, or the value of non-error
  • 7:30 - 7:32
    if we have, say, a 200.
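
    For reference, a minimal sketch of that kind of case statement, assuming the numeric status field from the demo (the category labels here are just illustrative):

        | eval status_class=case(status>=500, "server error",
              status>=400, "client error",
              status>=300, "redirect",
              status>=200, "success",
              true(), "other")
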
  • 7:32 - 7:35
    We're also using a time function to
  • 7:35 - 7:39
    create a second field called day. You
  • 7:39 - 7:42
    may be familiar with some of the
  • 7:42 - 7:46
    fields that you get by default
  • 7:46 - 7:50
    for most any events in Splunk and
  • 7:50 - 7:52
    that they're related to breakdowns of
  • 7:52 - 7:56
    the time stamps. You have day, month,
  • 7:56 - 7:58
    and many others. In this case, I want to
  • 7:58 - 8:01
    get a specific format for day so we use
  • 8:01 - 8:03
    a strftime function, and we have a
  • 8:03 - 8:07
    time format variable here on the actual
  • 8:07 - 8:10
    extracted time stamp for Splunk. So
  • 8:10 - 8:12
    coming out of the second line, we've
  • 8:12 - 8:14
    accessed our data, we have created two
  • 8:14 - 8:17
    new fields to use, and then we are
  • 8:17 - 8:21
    actually performing charting with a
  • 8:21 - 8:24
    statistical function, and so that is
  • 8:24 - 8:26
    using timechart. And we can see here
  • 8:26 - 8:29
    that we are counting our events that
  • 8:29 - 8:33
    actually have the error value for our
  • 8:33 - 8:36
    created error check field. And so I'm
  • 8:36 - 8:39
    going to pivot over to Splunk here,
  • 8:39 - 8:41
    and we're going to look at this search,
  • 8:41 - 8:43
    and I have commented out most of the
  • 8:43 - 8:46
    logic, we'll step back through it. We
  • 8:46 - 8:49
    are looking in our web access log events
  • 8:49 - 8:53
    here, and we want to then apply our
  • 8:53 - 8:58
    eval. And so by applying the eval, we can
  • 8:58 - 9:01
    get our error check field that provides
  • 9:01 - 9:03
    error or non-error. We're seeing that we
  • 9:03 - 9:05
    have mostly non-error
  • 9:05 - 9:10
    events. And then we have the day field,
  • 9:10 - 9:12
    and so day is actually providing the
  • 9:12 - 9:14
    full name of day for the time stamp for
  • 9:14 - 9:18
    all these events. So with our timechart,
  • 9:18 - 9:22
    this is the summarization with a
  • 9:22 - 9:24
    condition actually that we're spanning
  • 9:24 - 9:28
    by default over a single day, so this may
  • 9:28 - 9:32
    not be a very logical use of a split by
  • 9:32 - 9:35
    day when we are already using a timechart
  • 9:35 - 9:37
    command that is dividing our
  • 9:37 - 9:41
    results by the time bin, effectively a
  • 9:41 - 9:46
    span of one day. But what we can do is
  • 9:46 - 9:50
    change our split by field to host and
  • 9:50 - 9:53
    get a little bit more of a reasonable
  • 9:53 - 9:55
    presentation. We were able to see with
  • 9:55 - 9:58
    the counts in the individual days split not
  • 9:58 - 10:00
    only through the timechart, but by
  • 10:00 - 10:02
    the day field, that we only had values
  • 10:02 - 10:05
    where our matrix matched up for the
  • 10:05 - 10:10
    actual day. So here we have our hosts
  • 10:10 - 10:13
    one, two, and three, and then across days
  • 10:13 - 10:16
    counts of the error events that we
  • 10:16 - 10:20
    observe. So that is the search that we
  • 10:20 - 10:22
    end on in basic search. The concepts
  • 10:22 - 10:25
    there being accessing our data,
  • 10:25 - 10:27
    searching in a descriptive manner, using
  • 10:27 - 10:29
    our metadata fields, the index and the
  • 10:29 - 10:32
    source type, the evaluation functions
  • 10:32 - 10:34
    where we're creating new fields,
  • 10:34 - 10:38
    manipulating data, and then we have a
  • 10:38 - 10:40
    timechart function that is providing
  • 10:40 - 10:43
    some summarized statistics here based
  • 10:43 - 10:44
    on a time range.
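
    Putting that recap together, a minimal sketch of the final basic-search query, assuming the tutorial data lives in index=main with the access_combined sourcetype (adjust both to your environment):

        index=main sourcetype=access_combined
        | eval error_check=if(status>=400, "error", "non-error")
        | eval day=strftime(_time, "%A")
        | timechart span=1d count(eval(error_check="error")) AS errors BY host
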
  • 10:44 - 10:49
    So we will pivot back, and we're
  • 10:49 - 10:51
    going to take a step back out of the SPL
  • 10:51 - 10:54
    for a second just to talk about these
  • 10:54 - 10:57
    different kinds of search operations
  • 10:57 - 10:59
    that we just performed. So you'll hear
  • 10:59 - 11:03
    these terms if you are really kind of
  • 11:03 - 11:06
    diving deeper into actual operations of
  • 11:06 - 11:10
    Splunk searching. And you can get very
  • 11:10 - 11:13
    detailed regarding the optimization of
  • 11:13 - 11:16
    searches around these types of
  • 11:16 - 11:18
    commands and the order in which you
  • 11:18 - 11:21
    choose to execute SPL. Today I'm going to
  • 11:21 - 11:24
    focus on how these operations actually
  • 11:24 - 11:27
    apply to the data and helping you to
  • 11:27 - 11:29
    make better decisions about what
  • 11:29 - 11:32
    commands are best for the scenario that
  • 11:32 - 11:34
    you have or the output that you want to
  • 11:34 - 11:38
    see. And in future sessions, we will
  • 11:38 - 11:39
    discuss the actual optimization of
  • 11:39 - 11:42
    searches through this optimal order
  • 11:42 - 11:46
    of functions and some other means.
  • 11:46 - 11:48
    But just a caveat there that we're going
  • 11:48 - 11:50
    to talk pretty specifically today
  • 11:50 - 11:53
    just about these individually, how
  • 11:53 - 11:55
    they work with data, and then how you
  • 11:55 - 11:57
    see them in combination.
  • 11:57 - 12:00
    So our types of SPL commands,
  • 12:00 - 12:03
    the top three in bold we'll focus on in
  • 12:03 - 12:06
    our examples. The first of which is
  • 12:06 - 12:08
    streaming operations
  • 12:08 - 12:11
    which are executed on
  • 12:11 - 12:13
    individual events as they're returned by a
  • 12:13 - 12:15
    search. So you can think of this like
  • 12:15 - 12:17
    your evals
  • 12:17 - 12:19
    that is going to be doing
  • 12:19 - 12:21
    something to every single event,
  • 12:21 - 12:24
    modifying fields when they're available.
  • 12:24 - 12:28
    We do have generating functions. So
  • 12:28 - 12:31
    generating functions are going to be used
  • 12:31 - 12:34
    situationally where you're sourcing data
  • 12:34 - 12:38
    from non-indexed datasets, and so you
  • 12:38 - 12:41
    would see that from either input
  • 12:41 - 12:44
    lookup commands or maybe tstats,
  • 12:44 - 12:46
    pulling information from the tsidx
  • 12:46 - 12:49
    files, and so generating the
  • 12:49 - 12:51
    statistical output based on the data
  • 12:51 - 12:55
    available there. Transforming commands
  • 12:55 - 12:59
    you will see as often as streaming
  • 12:59 - 13:01
    commands, generally speaking, and more
  • 13:01 - 13:03
    often than generating commands where
  • 13:03 - 13:05
    transforming is intended to order
  • 13:05 - 13:09
    results into a data table. And I often
  • 13:09 - 13:11
    think of this much like how we discuss
  • 13:11 - 13:14
    the statistical functions in basic
  • 13:14 - 13:17
    search as summarization functions where
  • 13:17 - 13:20
    you're looking to condense your overall
  • 13:20 - 13:23
    dataset into really manageable
  • 13:23 - 13:25
    consumable results. So these
  • 13:25 - 13:28
    operations that apply that summarization
  • 13:28 - 13:32
    are transforming. We do have two
  • 13:32 - 13:36
    additional types of SPL commands, the
  • 13:36 - 13:39
    first is orchestrating. You can read
  • 13:39 - 13:42
    about these, I will not discuss in great
  • 13:42 - 13:45
    detail. They are used to manipulate
  • 13:45 - 13:49
    how searches are actually processed or
  • 13:49 - 13:51
    how commands are processed. And
  • 13:51 - 13:54
    they don't directly affect the results
  • 13:54 - 13:56
    in a search, how we think about say
  • 13:56 - 14:00
    applying a stats or an eval to a data
  • 14:00 - 14:02
    set. So if you're interested,
  • 14:02 - 14:04
    definitely check it out. Linked
  • 14:04 - 14:07
    documentation has details there.
  • 14:07 - 14:11
    Dataset processing is seen much more often,
  • 14:11 - 14:15
    and you do have some conditional
  • 14:15 - 14:19
    scenarios where commands can act as
  • 14:19 - 14:22
    dataset processing, so the
  • 14:22 - 14:24
    distinction for dataset processing is
  • 14:24 - 14:26
    going to be that you are operating in
  • 14:26 - 14:30
    bulk on a single completed dataset at
  • 14:30 - 14:32
    one time. So we'll look at an
  • 14:32 - 14:34
    example of that.
  • 14:34 - 14:37
    I want to pivot back to our main
  • 14:37 - 14:38
    three that we're going to be focusing on,
  • 14:38 - 14:40
    and I have mentioned some of these
  • 14:40 - 14:44
    examples already. The eval functions
  • 14:44 - 14:46
    that we've been talking about so far are
  • 14:46 - 14:48
    perfect examples of our streaming
  • 14:48 - 14:51
    commands. So where we are creating new
  • 14:51 - 14:56
    fields for each entry or log event,
  • 14:56 - 14:59
    where we are modifying values for all of
  • 14:59 - 15:02
    the results that are available. That
  • 15:02 - 15:05
    is where we are streaming with the
  • 15:05 - 15:09
    search functions. Inputlookup is
  • 15:09 - 15:10
    possibly one of the most common
  • 15:10 - 15:12
    generating commands that I see
  • 15:12 - 15:15
    because someone is intending to
  • 15:15 - 15:19
    source a dataset stored in a CSV file
  • 15:19 - 15:21
    or a KV store collection, and you're
  • 15:21 - 15:24
    able to bring that back as a report and
  • 15:24 - 15:28
    use that logic in your queries.
  • 15:28 - 15:30
    So that is
  • 15:30 - 15:33
    not requiring
  • 15:33 - 15:36
    any indexed data to actually return the
  • 15:36 - 15:38
    results that you want to see.
  • 15:39 - 15:41
    And we've talked about stats, very
  • 15:41 - 15:44
    generally speaking, with a lot of
  • 15:44 - 15:46
    unique functions you can apply there
  • 15:46 - 15:50
    where this is going to provide a tabular
  • 15:50 - 15:54
    output. And it is serving that purpose of
  • 15:54 - 15:55
    summarization, so we're really
  • 15:55 - 15:58
    reformatting the data into that
  • 15:58 - 16:01
    tabular report.
  • 16:02 - 16:07
    So we see in this example search here
  • 16:07 - 16:09
    that we are often combining these
  • 16:09 - 16:12
    different types of search operations. So
  • 16:12 - 16:15
    in this example that we have, I have
  • 16:15 - 16:19
    data that already exists in a CSV file.
  • 16:19 - 16:23
    We are applying a streaming command here,
  • 16:23 - 16:26
    the where command, evaluating each line to see if
  • 16:26 - 16:28
    we match a condition, and then returning
  • 16:28 - 16:30
    the results
  • 16:30 - 16:32
    based on that evaluation. And then we're
  • 16:32 - 16:34
    applying a transforming command at the
  • 16:34 - 16:37
    end which is that stats summarization,
  • 16:37 - 16:40
    getting the maximum values for the
  • 16:40 - 16:44
    count of errors and the host that is
  • 16:44 - 16:48
    associated with that. So let's pivot over
  • 16:48 - 16:52
    to Splunk and we'll take a look at that example.
  • 16:54 - 16:56
    So I'm just going to grab my
  • 16:56 - 16:59
    search here and I precommented out
  • 16:59 - 17:04
    the specific lines following inputlookup
  • 17:04 - 17:06
    just to see that this generating
  • 17:06 - 17:08
    command here is not looking for any
  • 17:08 - 17:10
    specific index data. We're pulling
  • 17:10 - 17:13
    directly the results that I have in a
  • 17:13 - 17:18
    CSV file here into this output, and so
  • 17:18 - 17:21
    we have a count of errors observed
  • 17:21 - 17:25
    across multiple hosts. Our where command
  • 17:25 - 17:29
    you might think is reformatting data
  • 17:29 - 17:31
    in the sense it is transforming the
  • 17:31 - 17:34
    results, but the evaluation of a where
  • 17:34 - 17:37
    function does apply effectively to every
  • 17:37 - 17:42
    event that is returned. So it is a
  • 17:42 - 17:44
    streaming command that is going to
  • 17:44 - 17:47
    filter down our result set based on our
  • 17:47 - 17:49
    condition that the error count is less
  • 17:49 - 17:51
    than 200.
  • 17:51 - 17:55
    So the following line is our
  • 17:55 - 17:57
    transforming command where we have two
  • 17:57 - 18:02
    results left, 187 for host 3. We want
  • 18:02 - 18:06
    to see our maximum values here of 187 on
  • 18:06 - 18:10
    host 3. So our scenario here has really
  • 18:10 - 18:13
    covered where you may have hosts
  • 18:13 - 18:16
    that are trending toward a negative
  • 18:16 - 18:19
    state. You're aware that the second
  • 18:19 - 18:22
    host had already exceeded its
  • 18:22 - 18:25
    threshold value for errors, but host 3
  • 18:25 - 18:27
    also appears to be trending toward this
  • 18:27 - 18:30
    threshold. So being able to combine
  • 18:30 - 18:33
    these types of commands, understand
  • 18:33 - 18:35
    the logical condition that you're
  • 18:35 - 18:38
    searching for, and then also providing
  • 18:38 - 18:41
    that consumable output. So combining
  • 18:41 - 18:44
    all three of our types of commands here.
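
    As a rough sketch of that combination, with the lookup file name and field names here being assumptions rather than the exact ones from the demo:

        | inputlookup error_counts_by_host.csv
        | where error_count < 200
        | stats max(error_count) AS max_errors BY host

    Here inputlookup generates results without touching indexed data, where streams over each returned row, and stats transforms what remains into a summary table.
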
  • 18:46 - 18:49
    So I'm going to jump to an SPL
  • 18:49 - 18:53
    demo, and as I go through these different
  • 18:53 - 18:56
    commands, I'm going to be referencing
  • 18:56 - 18:58
    back to the different command types that
  • 18:58 - 19:00
    we're working with. I'm going to
  • 19:00 - 19:02
    introduce in a lot of these searches
  • 19:02 - 19:05
    a lot of small commands that I won't
  • 19:05 - 19:07
    talk about in great detail and that
  • 19:07 - 19:09
    really is the purpose of using your
  • 19:09 - 19:12
    search manual, using your search
  • 19:12 - 19:15
    reference documentation. So I will
  • 19:15 - 19:17
    glance over the use case, talk about
  • 19:17 - 19:20
    how it's meant to be applied, and then
  • 19:20 - 19:22
    using in your own scenarios where you
  • 19:22 - 19:24
    have a problem you need to solve,
  • 19:24 - 19:27
    referencing the docs to find out where
  • 19:27 - 19:30
    you can apply similar functions to
  • 19:30 - 19:33
    what we observe in the demonstration here.
  • 19:33 - 19:37
    So the first command I'm going to
  • 19:37 - 19:41
    focus on is the rex command. So rex is a
  • 19:41 - 19:43
    streaming command that you often see
  • 19:43 - 19:47
    applied to datasets that do not fully
  • 19:47 - 19:50
    have data extracted in the format that
  • 19:50 - 19:53
    you want to be using in your
  • 19:53 - 19:57
    reporting or in your logic. And so
  • 19:57 - 20:00
    this could very well be handled actually
  • 20:00 - 20:03
    in the configuration of props and
  • 20:03 - 20:06
    transforms and extracting fields at the
  • 20:06 - 20:08
    right times when indexing data, but as
  • 20:08 - 20:10
    you're bringing in new data sources, you need
  • 20:10 - 20:12
    to understand what's available for use
  • 20:12 - 20:14
    in Splunk. A lot of times you'll find
  • 20:14 - 20:17
    yourself needing to extract new fields
  • 20:17 - 20:19
    in line in your searches and be able
  • 20:19 - 20:22
    to use those in your search logic. Rex
  • 20:22 - 20:28
    also has a sed mode that I often see
  • 20:28 - 20:32
    used to test masking of data inline
  • 20:32 - 20:34
    prior to actually putting that into
  • 20:34 - 20:36
    indexing configurations.
  • 20:36 - 20:38
    So rex you would
  • 20:38 - 20:41
    generally see used when you don't
  • 20:41 - 20:43
    have those fields available, you need to
  • 20:43 - 20:46
    use them at that time. And then we're
  • 20:46 - 20:47
    going to take a look at an example of
  • 20:47 - 20:50
    masking data as well to test your
  • 20:50 - 20:53
    syntax for a sed style replace in
  • 20:53 - 21:01
    config files. So we will jump back over.
  • 21:05 - 21:07
    So I'm going to start with a search on
  • 21:07 - 21:10
    an index source type, my tutorial data.
  • 21:10 - 21:13
    And then this is actual Linux secure
  • 21:13 - 21:16
    logging so these are going to be OS
  • 21:16 - 21:19
    security logs, and we're looking at all
  • 21:19 - 21:21
    of our web hosts that we've been
  • 21:21 - 21:23
    focusing on previously.
  • 21:23 - 21:25
    In our events, you can see
  • 21:25 - 21:29
    that we have first here an event that
  • 21:29 - 21:32
    has failed password for invalid user inet.
  • 21:32 - 21:34
    We're provided a source IP, a source
  • 21:34 - 21:37
    port, and we go to see the fields that
  • 21:37 - 21:39
    are extracted and that's not
  • 21:39 - 21:42
    being done for us automatically. So just
  • 21:42 - 21:44
    to start testing our logic to see if we
  • 21:44 - 21:47
    can get the results we want to see,
  • 21:47 - 21:50
    we're going to use the rex command. And
  • 21:50 - 21:53
    in doing so, we are applying this
  • 21:53 - 21:55
    operation across every event, again, a
  • 21:55 - 22:00
    streaming command. We are looking at the
  • 22:00 - 22:01
    _raw field, so we're actually looking at
  • 22:01 - 22:05
    the raw text of each of these log events.
  • 22:05 - 22:07
    And then the rex syntax is simply to
  • 22:07 - 22:12
    provide in double quotes a regex
  • 22:12 - 22:15
    match, and we're using named groups for
  • 22:15 - 22:17
    field extractions. So for every single
  • 22:17 - 22:19
    event that we see failed password for
  • 22:19 - 22:23
    invalid user, we are actually extracting
  • 22:23 - 22:26
    a user field, the source IP field, and the
  • 22:26 - 22:29
    source port field. For the sake of
  • 22:29 - 22:31
    simplicity, I tried to keep the regex simple.
  • 22:31 - 22:34
    You can make this as complex as you need
  • 22:34 - 22:38
    to for your needs, for your data. And
  • 22:38 - 22:41
    so in our extracted fields, I've
  • 22:41 - 22:43
    actually pre-selected these so we can
  • 22:43 - 22:46
    see our user is now available, and this
  • 22:46 - 22:50
    applies to the events where the regex was
  • 22:50 - 22:53
    actually valid and matching on the
  • 22:53 - 22:57
    failed password for invalid user, etc string.
  • 22:57 - 23:00
    So now that we have our fields
  • 23:00 - 23:04
    extracted, we can actually use these. And
  • 23:04 - 23:05
    we want
  • 23:05 - 23:09
    to do a stats count as failed logins, so
  • 23:09 - 23:13
    anytime you see an operation as and
  • 23:13 - 23:17
    then a unique name, it's just a rename
  • 23:17 - 23:19
    through the transformation function,
  • 23:19 - 23:21
    easier way to actually keep
  • 23:21 - 23:23
    consistency with referencing your
  • 23:23 - 23:27
    fields as well as not have to rename
  • 23:27 - 23:30
    later on; otherwise, in this
  • 23:30 - 23:32
    case, you'd have to reference the name
  • 23:32 - 23:35
    distinct count. So it's just a way to keep
  • 23:35 - 23:38
    things clean and easy to use in further
  • 23:38 - 23:42
    lines of SPL. So we are counting our
  • 23:42 - 23:44
    failed logins, we're looking at the
  • 23:44 - 23:48
    distinct count of the source IP values
  • 23:48 - 23:50
    that we have, and then we're splitting
  • 23:50 - 23:53
    that by the host and the user. So you can
  • 23:53 - 23:56
    see here, this tutorial data is
  • 23:56 - 23:58
    actually pretty flat across most of the
  • 23:58 - 24:00
    sources so we're not going to have
  • 24:00 - 24:05
    any outliers or spikes in our stats here,
  • 24:05 - 24:08
    but you can see the resulting presentation.
  • 24:09 - 24:11
    In line four, we do have a
  • 24:11 - 24:15
    sort command, and this is an example of a
  • 24:15 - 24:18
    dataset processing command where we are
  • 24:18 - 24:20
    actually evaluating a full completed
  • 24:20 - 24:24
    dataset and reordering it. Given the
  • 24:24 - 24:26
    logic here, we want to descend on these
  • 24:26 - 24:29
    numeric values. So keep in mind, as you're
  • 24:29 - 24:31
    operating on different fields, it's going
  • 24:31 - 24:34
    to be the same sort of either basic
  • 24:34 - 24:37
    numeric or the lexicographical ordering
  • 24:37 - 24:40
    that you typically see in Splunk.
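
    A minimal sketch of that first rex example, assuming the tutorial's secure-log sourcetype and message format (the index, sourcetype, and regex are assumptions and may need adjusting for your copy of the data):

        index=main sourcetype=secure "Failed password for invalid user"
        | rex field=_raw "Failed password for invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
        | stats count AS failed_logins, dc(src_ip) AS unique_sources BY host, user
        | sort - failed_logins
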
  • 24:41 - 24:46
    So we do have a second example
  • 24:46 - 24:49
    with the sed style replace.
  • 24:54 - 24:59
    So you can see in my events here
  • 24:59 - 25:02
    we are searching the tutorial and
  • 25:02 - 25:05
    vendor sales index and source type. And
  • 25:05 - 25:07
    I've gone ahead and applied one
  • 25:07 - 25:09
    operation, and this is going to be a
  • 25:09 - 25:12
    helpful operation to understand really
  • 25:12 - 25:15
    what we are replacing and how to get
  • 25:15 - 25:18
    consistent operation on these fields.
  • 25:18 - 25:20
    So in this case, we are actually creating
  • 25:20 - 25:24
    an ID length field where we are going to
  • 25:24 - 25:27
    choose to mask the value of account ID
  • 25:27 - 25:29
    in our rex command. We want to know that
  • 25:29 - 25:32
    that's a consistent number of characters
  • 25:32 - 25:34
    through all of our data. It's very
  • 25:34 - 25:37
    simple to spot check, but just to be
  • 25:37 - 25:39
    certain, we want to apply this to all of
  • 25:39 - 25:43
    our data, in this case, streaming command
  • 25:43 - 25:46
    through this eval. We
  • 25:46 - 25:49
    are changing the type of the data
  • 25:49 - 25:52
    because account ID is actually numeric.
  • 25:52 - 25:54
    We're making that a string value so that
  • 25:54 - 25:57
    we can look at the length. These are
  • 25:57 - 25:59
    common functions in any programming
  • 25:59 - 26:02
    languages, and so the syntax here in
  • 26:02 - 26:04
    SPL is quite simple. Just to be able
  • 26:04 - 26:07
    to get that contextual feel, we
  • 26:07 - 26:09
    understand we have 16 characters for
  • 26:09 - 26:12
    100% of our events in the account IDs.
  • 26:12 - 26:17
    So actually applying our rex command,
  • 26:17 - 26:21
    we are going to now specify a unique
  • 26:21 - 26:24
    field, not just _raw. We are
  • 26:24 - 26:27
    applying the sed mode, and this is a
  • 26:27 - 26:31
    sed syntax replacement looking
  • 26:31 - 26:34
    for a capture group matching the
  • 26:34 - 26:36
    first 12 digits. And then we're
  • 26:36 - 26:39
    replacing that with a series of 12 X's.
  • 26:39 - 26:42
    So you can see in our first event, the
  • 26:42 - 26:45
    account ID is now masked, we only have
  • 26:45 - 26:49
    the remaining four digits to be able to
  • 26:49 - 26:52
    identify that. And so if our data is
  • 26:52 - 26:55
    indexed appropriately
  • 26:55 - 26:58
    in Splunk with the full account IDs, but
  • 26:58 - 27:00
    for the sake of reporting we want to
  • 27:00 - 27:05
    be able to mask that for the audience,
  • 27:05 - 27:08
    then we're able to use the sed
  • 27:08 - 27:12
    replace. And then to finalize a report,
  • 27:12 - 27:14
    this is just an example of the top
  • 27:14 - 27:16
    command which does a few operations
  • 27:16 - 27:18
    together and makes for a good
  • 27:18 - 27:21
    shorthand report, taking all the
  • 27:21 - 27:24
    unique values of the provided field,
  • 27:24 - 27:26
    giving you a count of those values, and
  • 27:26 - 27:29
    then showing the percentage
  • 27:29 - 27:32
    of the makeup for the total dataset
  • 27:32 - 27:35
    that that unique value accounts for. So
  • 27:35 - 27:37
    again, pretty flat in this tutorial data
  • 27:37 - 27:40
    in seeing a very consistent
  • 27:40 - 27:45
    .03% across these different account IDs.
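
    A minimal sketch of that masking example, assuming the tutorial's vendor_sales sourcetype and its AcctID field (verify the exact field name in your environment; the replacement pattern simply masks the first 12 digits):

        index=main sourcetype=vendor_sales
        | eval id_length=len(tostring(AcctID))
        | rex field=AcctID mode=sed "s/^[0-9]{12}/XXXXXXXXXXXX/"
        | top AcctID

    The eval line is only the spot check on field length; the sed-style replace does the actual masking before top summarizes counts and percentages.
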
  • 27:47 - 27:51
    So we have looked at a few examples
  • 27:51 - 27:55
    with the rex command, and that is
  • 27:55 - 27:57
    again, streaming. We're going to look at
  • 27:57 - 27:59
    another streaming command
  • 27:59 - 28:02
    which is going to be a set of
  • 28:02 - 28:07
    multivalue eval functions. And so again,
  • 28:07 - 28:10
    if you're going to have a bookmark for search
  • 28:10 - 28:12
    documentation, multivalue eval functions
  • 28:12 - 28:15
    are a great one to have because when
  • 28:15 - 28:17
    you encounter these, it really takes
  • 28:17 - 28:20
    some time to figure out how to actually
  • 28:20 - 28:26
    operate on data. And so the
  • 28:26 - 28:30
    multivalue functions are really just
  • 28:30 - 28:32
    a collection that depending on your use
  • 28:32 - 28:35
    case, you're able to determine the
  • 28:35 - 28:39
    best to apply. You see it often used
  • 28:39 - 28:43
    with JSON and XML so data formats
  • 28:43 - 28:45
    that are actually naturally going to
  • 28:45 - 28:47
    provide a multivalue field where you
  • 28:47 - 28:50
    have repeated tags or keys across
  • 28:50 - 28:54
    unique events as they're extracted.
  • 28:54 - 28:56
    And you see this often in
  • 28:56 - 28:58
    Windows event logs, where you actually have
  • 28:58 - 29:01
    repeated key values where your values
  • 29:01 - 29:03
    are different and the position in the
  • 29:03 - 29:05
    event is actually specific to a
  • 29:05 - 29:09
    condition, so you may have a need
  • 29:09 - 29:11
    for extraction or interaction with one
  • 29:11 - 29:14
    of those unique values to actually
  • 29:14 - 29:19
    get a reasonable outcome from your data.
  • 29:19 - 29:23
    And so we're going to use
  • 29:23 - 29:26
    multivalue eval functions when we
  • 29:26 - 29:29
    have a change we want to make to the
  • 29:29 - 29:32
    presentation of data and we're able
  • 29:32 - 29:35
    to do so with multivalue fields. This I
  • 29:35 - 29:37
    would say often occurs when you have
  • 29:37 - 29:40
    multivalue data and then you want to
  • 29:40 - 29:43
    be able to change the format of the
  • 29:43 - 29:46
    multivalue fields there. And then
  • 29:46 - 29:47
    we're also going to look at a quick
  • 29:47 - 29:51
    example of actually using multivalue
  • 29:51 - 29:55
    evaluation as a logical condition.
  • 29:55 - 30:00
    So the first example.
  • 30:03 - 30:06
    We're going to start with a
  • 30:06 - 30:09
    simple table looking at our web access
  • 30:09 - 30:11
    logs, and so we're just going to pull
  • 30:11 - 30:15
    in our status and referer domain fields.
  • 30:15 - 30:18
    And so you can see we've got an
  • 30:18 - 30:23
    HTTP status code, and we've got the
  • 30:23 - 30:26
    format of a protocol subdomain
  • 30:26 - 30:30
    TLD. And our scenario here is that, for
  • 30:30 - 30:32
    simplicity of reporting, we just want
  • 30:32 - 30:34
    to work with this referer domain field
  • 30:34 - 30:38
    and be able to simplify that. So in
  • 30:38 - 30:42
    actually splitting out the field in this
  • 30:42 - 30:45
    case, split referer domain, and then
  • 30:45 - 30:48
    choosing the period character as our
  • 30:48 - 30:50
    point to split the data. We're creating a
  • 30:50 - 30:53
    multivalue from what was previously
  • 30:53 - 30:57
    just a single value field. And using
  • 30:57 - 31:02
    this, we can actually create a new field
  • 31:02 - 31:06
    by using the index of a multivalue field,
  • 31:06 - 31:08
    and in this case, we're looking at
  • 31:08 - 31:11
    indexes 0, 1, and 2.
  • 31:11 - 31:13
    The multivalue index function allows
  • 31:13 - 31:16
    us to target a specific field and then
  • 31:16 - 31:19
    choose a starting and ending index to
  • 31:19 - 31:21
    extract given values. There are a number
  • 31:21 - 31:23
    of ways to do this. In our case here
  • 31:23 - 31:25
    where we have three entries, it's quite
  • 31:25 - 31:27
    simple just to give that start and end
  • 31:27 - 31:29
    of the range spanning the
  • 31:29 - 31:30
    two entries we want
  • 31:30 - 31:35
    to keep. So as we are working to recreate
  • 31:35 - 31:39
    our domain, and so that is just applying
  • 31:39 - 31:42
    for this new domain field, we have
  • 31:42 - 31:44
    buttercupgames.com in what was
  • 31:44 - 31:48
    previously http://www.buttercupgames.com.
  • 31:48 - 31:51
    We can now use those fields
  • 31:51 - 31:55
    in a transformation function. In this
  • 31:55 - 31:58
    case, simple stats count by status and
  • 31:58 - 32:00
    the domain.
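
    One way to sketch that split-and-reassemble step, assuming the tutorial's referer_domain field splits into three pieces and that the last two are joined back together (the mvjoin step is an assumption; which indexes you keep depends on how your values split):

        index=main sourcetype=access_combined
        | eval domain_parts=split(referer_domain, ".")
        | eval domain=mvjoin(mvindex(domain_parts, 1, 2), ".")
        | stats count BY status, domain
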
  • 32:03 - 32:07
    So I do want to look at another
  • 32:07 - 32:10
    example here that is similar, but
  • 32:10 - 32:14
    we're going to use a multivalue function
  • 32:14 - 32:17
    to actually test a condition. And so I'm
  • 32:17 - 32:18
    going to,
  • 32:18 - 32:22
    in this case, be searching the same
  • 32:22 - 32:24
    data. We're going to start with a stats
  • 32:24 - 32:29
    command, and so a stats count as well as
  • 32:29 - 32:32
    a values of status. And so the values
  • 32:32 - 32:33
    function is going to provide all the
  • 32:33 - 32:37
    unique values of a given field based
  • 32:37 - 32:42
    on the split by. And so that produces
  • 32:42 - 32:45
    a multivalue field here in the case of
  • 32:45 - 32:47
    status. We have quite a few events
  • 32:47 - 32:51
    that have multiple status codes, and as
  • 32:51 - 32:53
    we're interested in pulling those events
  • 32:53 - 32:57
    out, we can use an mvcount function to
  • 32:57 - 33:01
    evaluate and filter our dataset to
  • 33:01 - 33:04
    those specific events. So a very simple
  • 33:04 - 33:07
    operation here, you're just looking at what has
  • 33:07 - 33:10
    more than a single value
  • 33:10 - 33:13
    for status, but very useful as you're
  • 33:13 - 33:16
    applying this in reporting especially in
  • 33:16 - 33:19
    combination with others and with more
  • 33:19 - 33:23
    complex conditions.
  • 33:23 - 33:28
    So that is our set of multivalue
  • 33:28 - 33:33
    eval functions there as streaming commands.
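
    A minimal sketch of that last condition, with the split-by field (clientip) being an assumption since the demo doesn't name it explicitly:

        index=main sourcetype=access_combined
        | stats count, values(status) AS statuses BY clientip
        | where mvcount(statuses) > 1
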
  • 33:34 - 33:38
    So for a final section of
  • 33:38 - 33:42
    the demo, I want to talk about a concept
  • 33:42 - 33:45
    that is not so much a set of functions,
  • 33:45 - 33:48
    but really enables more complex
  • 33:48 - 33:50
    and interesting searching and can allow
  • 33:50 - 33:53
    us to use a few different types of
  • 33:53 - 33:57
    commands in our SPL. And so the concept of
  • 33:57 - 34:00
    subsearching for both filtering and
  • 34:00 - 34:04
    enrichment is taking secondary search
  • 34:04 - 34:07
    results, and we're using that to
  • 34:07 - 34:11
    affect a primary search. So a subsearch
  • 34:11 - 34:12
    will be executed, the results
  • 34:12 - 34:15
    returned, and depending on how it's used,
  • 34:15 - 34:18
    this is going to be processed in the
  • 34:18 - 34:22
    original search.
  • 34:22 - 34:24
    We'll look at an example where it is
  • 34:24 - 34:27
    filtering. So based on the results, we get
  • 34:27 - 34:31
    effectively a value equals X or value
  • 34:31 - 34:34
    equals Y for one of our fields that
  • 34:34 - 34:37
    we're looking at in the subsearch.
  • 34:37 - 34:39
    And then we're also going to look at an
  • 34:39 - 34:42
    enrichment example, so you see this often
  • 34:42 - 34:46
    when you have a dataset maybe saved
  • 34:46 - 34:48
    in a lookup table or you just have a
  • 34:48 - 34:50
    simple reference where you want to bring
  • 34:50 - 34:53
    in more context, maybe descriptions of
  • 34:53 - 34:55
    event codes, things like
  • 34:55 - 35:00
    that. So in that case,
  • 35:02 - 35:05
    we'll look at the first command here. Now,
  • 35:05 - 35:08
    I'm going to run my search, and we're
  • 35:08 - 35:12
    going to pivot over to a subsearch
  • 35:12 - 35:15
    tab here. And so you can see our subsearch
  • 35:15 - 35:20
    looking at the secure logs.
  • 35:20 - 35:22
    We are actually just pulling out the
  • 35:22 - 35:24
    search to see what the results are or
  • 35:24 - 35:26
    what's going to be returned from that
  • 35:26 - 35:29
    subsearch. So we're applying the same
  • 35:29 - 35:31
    rex that we had before to extract our
  • 35:31 - 35:34
    fields. We're applying a where, a streaming
  • 35:34 - 35:36
    command looking for anything that's not
  • 35:36 - 35:39
    null for user. We observed that we had
  • 35:39 - 35:41
    about 60% of our events that were going
  • 35:41 - 35:43
    to be null based on not having a user
  • 35:43 - 35:47
    field, and so looking at that total dataset,
  • 35:47 - 35:50
    we're just going to count by our
  • 35:50 - 35:54
    source IP. And this is often a quick way
  • 35:54 - 35:57
    to really just get a list of unique
  • 35:57 - 36:00
    values of any given field. And then
  • 36:00 - 36:03
    operating on that to return just the
  • 36:03 - 36:05
    list of values. There are a few different ways to
  • 36:05 - 36:09
    do that; I see stats count pretty often.
  • 36:09 - 36:11
    And in this case, we're actually tabling
  • 36:11 - 36:14
    out just keeping our source IP field and
  • 36:14 - 36:17
    renaming it to client IP, so the resulting
  • 36:17 - 36:21
    dataset is a single column table
  • 36:21 - 36:21
    with
  • 36:21 - 36:26
    182 results, and the field name is client
  • 36:26 - 36:30
    IP. So when returned to the original
  • 36:30 - 36:32
    search, we're running this as a sub
  • 36:32 - 36:36
    search, the effective result of this is
  • 36:36 - 36:40
    actually client IP equals my first value
  • 36:40 - 36:44
    here or client IP equals my second value
  • 36:44 - 36:47
    and so on through the full dataset. And
  • 36:47 - 36:49
    so looking at our search here, we're
  • 36:49 - 36:52
    applying this to the access logs. You can
  • 36:52 - 36:55
    see that we had a field named source IP
  • 36:55 - 36:59
    in the secure logs and we renamed to
  • 36:59 - 37:02
    client IP so that we could apply this to
  • 37:02 - 37:06
    the access logs where client IP is the
  • 37:06 - 37:09
    actual field name for the source IP
  • 37:09 - 37:14
    data. And in this case, we are filtering
  • 37:14 - 37:16
    to the client IPs relevant in the secure
  • 37:16 - 37:20
    logs for our web access logs.
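
    A minimal sketch of that subsearch filter, with the index and sourcetype names assumed to match the tutorial data and the rex reused from the earlier example:

        index=main sourcetype=access_combined
            [ search index=main sourcetype=secure "Failed password"
              | rex "Failed password for invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
              | where isnotnull(user)
              | stats count BY src_ip
              | table src_ip
              | rename src_ip AS clientip ]

    The subsearch runs first and effectively expands into clientip=value OR clientip=value ... as a filter on the outer search.
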
  • 37:20 - 37:24
    So uncommenting here, we have a
  • 37:24 - 37:27
    series of operations that we're doing,
  • 37:27 - 37:29
    and I'm just going to run them all at
  • 37:29 - 37:33
    once and talk through that we are
  • 37:33 - 37:37
    counting
  • 37:37 - 37:40
    the events by status and client IP
  • 37:40 - 37:43
    for the client IPs that were relevant to
  • 37:43 - 37:45
    authentication failures in the secure
  • 37:45 - 37:49
    logs. We are then creating a status count
  • 37:49 - 37:52
    field just by combining our status
  • 37:52 - 37:55
    and count fields, adding a colon
  • 37:55 - 37:59
    between them. And then we are doing a
  • 37:59 - 38:02
    second stats statement here to
  • 38:02 - 38:04
    actually combine all of our newly
  • 38:04 - 38:06
    created fields together in a more
  • 38:06 - 38:11
    condensed report. So a transforming command,
  • 38:11 - 38:13
    then streaming for creating our new
  • 38:13 - 38:15
    field, another transforming command, and
  • 38:15 - 38:18
    then our sort for dataset processing
  • 38:18 - 38:21
    actually gives us the results here for a
  • 38:21 - 38:25
    given client IP. And so we are, in this
  • 38:25 - 38:28
    case, looking for the scenario that
  • 38:28 - 38:31
    these client IPs are involved in
  • 38:31 - 38:34
    authentication failures to the web
  • 38:34 - 38:37
    servers. In this case, these were all over
  • 38:37 - 38:40
    SSH. We want to see if there are
  • 38:40 - 38:43
    interactions by these same source IPs
  • 38:43 - 38:46
    actually on the website that we're
  • 38:46 - 38:50
    hosting. So seeing a high number of
  • 38:50 - 38:53
    failed values, looking at actions also is
  • 38:53 - 38:56
    a use case here for just bringing in
  • 38:56 - 38:58
    that context and seeing if there's any
  • 38:58 - 39:01
    sort of relationship between the data.
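
    Continuing from the subsearch-filtered access events above, a sketch of that summarization pipeline; the field names and the final sort key are assumptions based on the description:

        | stats count BY status, clientip
        | eval status_count=status.":".count
        | stats values(status_count) AS status_counts BY clientip
        | sort clientip
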
  • 39:01 - 39:04
    This is discussed often as correlation
  • 39:04 - 39:08
    of logs. I'm usually careful about using
  • 39:08 - 39:09
    the term correlation when talking about
  • 39:09 - 39:11
    Splunk queries, especially in Enterprise
  • 39:11 - 39:13
    Security when talking about correlation
  • 39:13 - 39:16
    searches where I typically think of
  • 39:16 - 39:18
    correlation searches as being
  • 39:18 - 39:21
    overarching concepts that cover data
  • 39:21 - 39:24
    from multiple data sources, and in this
  • 39:24 - 39:26
    case, correlating events would be looking
  • 39:26 - 39:28
    at unique data types that are
  • 39:28 - 39:31
    potentially related in finding that
  • 39:31 - 39:34
    logical connection for the condition.
  • 39:34 - 39:36
    That's a little bit more up to the user.
  • 39:36 - 39:38
    It's not quite as easy as say,
  • 39:38 - 39:42
    pointing to a specific data
  • 39:42 - 39:45
    model. So we are going to look at one
  • 39:45 - 39:48
    more subsearch here, and this case is
  • 39:48 - 39:52
    going to apply the join command. And
  • 39:52 - 39:56
    so I talk about using lookup files or
  • 39:56 - 39:59
    other data returned by subsearches
  • 39:59 - 40:02
    to enrich, to bring more data in
  • 40:02 - 40:06
    rather than filter. We are going to
  • 40:06 - 40:09
    look at our first part of the command
  • 40:09 - 40:11
    here, and this is actually just a
  • 40:11 - 40:16
    simple stats report based on this rex
  • 40:16 - 40:18
    that keeps coming through the SPL to
  • 40:18 - 40:21
    give us those user and source IP fields.
  • 40:21 - 40:24
    So our result here is authentication
  • 40:24 - 40:26
    failures for all these web hosts so
  • 40:26 - 40:29
    similar to what we had previously
  • 40:29 - 40:31
    returned. And then we're going to take a
  • 40:31 - 40:33
    look at the results of the subsearch
  • 40:33 - 40:35
    here. I'm going to actually split this up so that we
  • 40:35 - 40:39
    can see the first two lines. We're
  • 40:39 - 40:42
    looking at our web access logs for
  • 40:42 - 40:46
    purchase actions, and then we are
  • 40:46 - 40:51
    looking at our stats count for errors
  • 40:51 - 40:53
    and stats count for successes. We have
  • 40:53 - 40:55
    a pretty limited set of status codes returned in
  • 40:55 - 40:59
    this data, so this is viable for
  • 40:59 - 41:02
    the data present to observe our
  • 41:02 - 41:04
    errors and successes.
  • 41:04 - 41:06
    And then we are actually
  • 41:06 - 41:08
    creating a new field based on the
  • 41:08 - 41:11
    statistics that we're generating,
  • 41:11 - 41:14
    looking at our transaction errors so
  • 41:14 - 41:18
    where we have high or low numbers
  • 41:18 - 41:22
    of failed purchase actions, and then
  • 41:22 - 41:26
    summarizing that. So in the case of our
  • 41:26 - 41:28
    final command here, another transforming
  • 41:28 - 41:31
    command of table just to reduce this to
  • 41:31 - 41:35
    a small dataset to use in the subsearch.
  • 41:35 - 41:37
    And so in this case, we have our host
  • 41:37 - 41:39
    value and then our transaction error
  • 41:39 - 41:41
    rate that we observe from the web access
  • 41:41 - 41:45
    logs. And then over in our other search
  • 41:45 - 41:49
    here, we are going to perform a left
  • 41:49 - 41:51
    join based on this host field. So you see
  • 41:51 - 41:53
    in our secure logs, we still have the
  • 41:53 - 41:56
    same host value, and this is going to be
  • 41:56 - 42:00
    used to actually add our
  • 42:00 - 42:03
    transaction error rates in for each
  • 42:03 - 42:06
    host. So as we observe increased
  • 42:06 - 42:09
    authentication failures, if there's a
  • 42:09 - 42:12
    scenario for a breach and some sort of
  • 42:12 - 42:15
    interruption to the ability to serve out
  • 42:15 - 42:18
    or perform these purchase actions that
  • 42:18 - 42:21
    are affecting the intended
  • 42:21 - 42:23
    operations of the web servers, we can
  • 42:23 - 42:25
    see that here. Of course in our tutorial
  • 42:25 - 42:27
    data, there's not really much that's
  • 42:27 - 42:30
    jumping out or showing that there is
  • 42:30 - 42:33
    any correlation between the two, but the
  • 42:33 - 42:35
    purpose of the join is to bring in that
  • 42:35 - 42:37
    extra dataset to give the context to
  • 42:37 - 42:40
    further investigate.
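
    A minimal sketch of that join enrichment, assuming the tutorial field names (host, action, status); how the presenter actually computed the transaction error rate isn't shown, so the round() expression here is illustrative:

        index=main sourcetype=secure "Failed password"
        | rex "Failed password for invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
        | stats count AS auth_failures BY host
        | join type=left host
            [ search index=main sourcetype=access_combined action=purchase
              | stats count(eval(status>=400)) AS errors, count(eval(status<400)) AS successes BY host
              | eval transaction_error_rate=round(errors/(errors+successes), 3)
              | table host, transaction_error_rate ]
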
  • 42:41 - 42:47
    So that is the final
  • 42:47 - 42:52
    portion of the SPL demo. And I do want
  • 42:52 - 42:55
    to say for any questions, I'm going to
  • 42:55 - 42:57
    take a look at the chat, I'll do my best
  • 42:57 - 43:00
    to answer any questions, and then if
  • 43:00 - 43:03
    you have any other questions, please
  • 43:03 - 43:06
    feel free to reach out to my team at
  • 43:06 - 43:09
    support@kinneygroup.com, and we'll be
  • 43:09 - 43:12
    happy to get back to you and help. I
  • 43:12 - 43:15
    am taking a look through.
  • 43:32 - 43:34
    Okay, seeing some questions on
  • 43:34 - 43:38
    performance of the rex, sed, regex
  • 43:38 - 43:42
    commands. So off the top of my head,
  • 43:42 - 43:44
    I'm not sure about a direct performance
  • 43:44 - 43:46
    comparison of the individual commands.
  • 43:46 - 43:49
    Definitely want to look into that, and
  • 43:49 - 43:52
    definitely follow up if you'd like to
  • 43:52 - 43:54
    explain a more detailed scenario or
  • 43:54 - 43:57
    look at some SPL that we can apply and
  • 43:57 - 44:00
    observe those changes.
  • 44:00 - 44:02
    The question on getting the
  • 44:02 - 44:05
    dataset, that is what I mentioned at
  • 44:05 - 44:08
    the beginning. Reach out to us for the
  • 44:08 - 44:10
    slides or just reach out about the
  • 44:10 - 44:15
    link. And the Splunk tutorial data, you
  • 44:15 - 44:18
    can actually search that as well. And
  • 44:18 - 44:20
    there's documentation on how to use the
  • 44:20 - 44:22
    tutorial data, one of the first links
  • 44:22 - 44:26
    there, takes you to a page that has
  • 44:26 - 44:29
    a tutorial data zip file, and
  • 44:29 - 44:31
    instructions on how to [inaudible] that, it's
  • 44:31 - 44:34
    just an upload for your specific
  • 44:34 - 44:38
    environment. So in add data and then
  • 44:38 - 44:40
    upload data, two clicks, and upload
  • 44:40 - 44:43
    your file. So that is freely available
  • 44:43 - 44:46
    for anyone, and again, that package is
  • 44:46 - 44:47
    dynamically updated as well so your time
  • 44:47 - 44:51
    stamps are pretty close to normal
  • 44:51 - 44:53
    as you download the app, kind of depends
  • 44:53 - 44:56
    on the time of the cycle for the
  • 44:56 - 44:59
    update, but searching over all time, you
  • 44:59 - 45:02
    won't have any issues there. And then
  • 45:02 - 45:05
    yeah, again on receiving slides, reach
  • 45:05 - 45:08
    out to my team, and we're happy to
  • 45:08 - 45:10
    provide those, discuss further, and we'll
  • 45:10 - 45:16
    have the recording available
  • 45:16 - 45:18
    for this session. You should be able to,
  • 45:18 - 45:21
    after the recording processes when
  • 45:21 - 45:23
    the session ends, actually use the
  • 45:23 - 45:25
    same link, and you can watch this
  • 45:25 - 45:26
    recording and post without having to
  • 45:26 - 45:32
    sign up or transfer that file so-
  • 45:34 - 45:38
    So okay, Chris, seeing your
  • 45:38 - 45:41
    comment there, let me know if you want
  • 45:41 - 45:44
    to reach out to me directly, anyone as
  • 45:44 - 45:49
    well. We can discuss what slides and
  • 45:49 - 45:52
    presentation you had attended, I'm not
  • 45:52 - 45:55
    sure I have the attendance report
  • 45:55 - 45:57
    for what you've seen previously, so
  • 45:57 - 46:00
    happy to get those for you.
  • 46:07 - 46:10
    All right and seeing- thanks Brett.
  • 46:10 - 46:13
    So you see Brett Woodruff in the chat
  • 46:13 - 46:17
    commenting, a systems engineer on the
  • 46:17 - 46:19
    Expertise on Demand team, so a very
  • 46:19 - 46:20
    knowledgeable guy, and he's going to be
  • 46:20 - 46:24
    presenting next month's session. That
  • 46:24 - 46:25
    is going to take this concept that we
  • 46:25 - 46:29
    talked about with subsearching as just a
  • 46:29 - 46:31
    general search topic; he's going to go
  • 46:31 - 46:34
    specifically into data enrichment using
  • 46:34 - 46:38
    joins, lookup commands, and how we see
  • 46:38 - 46:41
    that used in the wild. So definitely
  • 46:41 - 46:43
    excited for that one, encourage you to
  • 46:43 - 46:46
    register for that event.
  • 46:47 - 46:52
    All right, I'm not seeing any more questions.
  • 46:58 - 47:02
    All right, with that I am stopping my
  • 47:02 - 47:05
    share. I'm going to hang around for a few
  • 47:05 - 47:07
    minutes, but thank you all for
  • 47:07 - 47:11
    attending, and we'll see you at the next session.