Hello everyone. We are getting started here on our August lunch and learn session, presented by Kinney Group's Atlas Customer Experience team. My name is Alice Devaney. I am the engineering manager for the Atlas Customer Experience team, and I'm excited to be presenting this month's session on intermediate-level Splunk searching. So thank you all for attending. I hope you get some good ideas out of this. I certainly encourage engagement through the chat, and I'll have some information at the end on following up and speaking with my team directly about any issues or interests you have around the types of concepts we're going to cover today.

Since this is an intermediate-level session, I do want to say that we have previously done a basic-level searching session, and we are progressing from that, picking up right where we left off. We've done that session with quite a few of our customers individually, and if you're interested in doing that session, or this one, with a larger team, we're happy to discuss and coordinate that.

So getting started, we're going to take a look at the final search from our basic search session. We'll walk through that and understand the concepts, then take a step back and look a little more generally at SPL operations: how different commands apply to data, and that next level of understanding for when to use certain types of commands as you write more complex searches. And of course, the session includes a series of demos using a few specific commands, highlighting the different SPL command types we discuss in the second portion, all run on the tutorial data that you can load into any test environment very simply.

I will always encourage, especially with search content, that you look into the additional resources I have listed here. The search reference documentation is one of my favorite bookmarks; I use it frequently in my own environments and when working in customer environments. It is really the best quick resource for syntax and examples of any search command. The search manual is a little more conceptual, but as you learn more about different types of search operations, it's very helpful reference material to come back to as you study and work on writing more complex search content. I have also linked the documentation on how to use the Splunk tutorial data. If you've not done that before, it's a very simple process, and Splunk provides consistently updated download files that you can upload directly into any Splunk environment. That's what I'm going to be using today, and as long as you search over appropriate time windows for when you downloaded the tutorial dataset, these searches will work on the tutorial data as well. So I highly encourage you, after the fact, to go through and test out some of the content. You'll be able to access a recording, and if you'd like the slides I'm presenting from today, which I recommend because there are a lot of useful links in here, reach out to my team; right at the end of the slides we'll have that info.

So looking at our overview of basic search, I just want to cover conceptually the two categories that we discuss in that session.
And so those two are the statistical and charting functions, which in those demos consist of aggregate and time functions. Aggregate functions are your commonly used statistical functions meant for summarization, and time functions actually use the timestamp field _time, or any other time you've extracted from the data, looking at earliest and latest relative time values in a summative fashion. Evaluation functions are the separate type, where we discuss comparison and conditional statements, using your if and case functions inside evals; datetime functions that apply operations to events individually, so not summarization but interacting with the time values themselves, maybe changing the time format; and multivalue eval functions, which we touch on very lightly and more conceptually in basic search. Today, as part of our demo, we're going to dive in and look at multivalue eval functions later in the presentation.

On this slide I have highlighted in gray the search that we end basic search with. It is broken up into three segments, with the first line being a filter to a dataset. This is very simply how you source most of your data in most of your searches in Splunk, and we always want to be as specific as possible. Most often, the logical way to do that is by identifying an index and a source type, and possibly specific values of given fields in that data, before you start applying other operations. In our case, we want to work with the whole dataset, and then we move into applying our eval statements.

The purpose of these evals is to create some new fields to work with, and we have two operations here. You can see that on the first line we're starting with an error check field. These are web access logs, so we're looking at the HTTP status codes in the status field, and we have a logical condition: for anything greater than or equal to 400, we want to return "error", or the value "non-error" if we have, say, a 200. It's a very simple example, kept as easy as possible. If you want specifics on your 200s and your 300s, it's the exact same type of logic, likely a case statement, to add more conditions and get more granular output in an error check field, or whatever field you want built from your status codes; in this case, simply errors.

We're also using a time function to create a second field called day. You may be familiar with some of the fields you get by default for most events in Splunk that are breakdowns of the timestamp; you have day, month, and many others. In this case I want a specific format for day, so we use a strftime function with a time format variable applied to _time, the timestamp Splunk actually extracted. So coming out of the second line, we've accessed our data and created two new fields to use, and then we perform charting with a statistical function, which is timechart. You can see that we are counting the events that actually have the error value for our created error check field.

So I'm going to pivot over to Splunk here, and we're going to look at this search. I have commented out most of the logic, and we'll step back through it. We are looking at our web access log events here, and we then apply our eval. By applying the eval, we get our error check field that provides error or non-error.
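For reference, the full search we're rebuilding looks roughly like this; treat the index and source type names as placeholders for wherever your tutorial data landed:

    index=tutorial sourcetype=access_combined_wcookie
    | eval error_check=if(status>=400, "error", "non-error")
    | eval day=strftime(_time, "%A")
    | timechart span=1d count(eval(error_check="error")) AS errors BY day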
We're seeing that we have mostly non-error events. And then we have the day field, and day is providing the full name of the day for the timestamp on all of these events. With our timechart, the summarization is spanning, by default, over a single day, so splitting by day may not be a very logical choice when the timechart command is already dividing our results into time bins of effectively one day. We could see in the counts for the individual days that we only had values where the timechart bin and the day field actually lined up. What we can do instead is change our split-by field to host and get a little more reasonable presentation. So here we have our hosts one, two, and three, and then, across days, counts of the error events that we observe.

So that is the search that we end on in basic search. The concepts there are accessing our data and searching in a descriptive manner using our metadata fields, the index and the source type; the evaluation functions, where we're creating new fields and manipulating data; and then a timechart function that provides summarized statistics over a time range.

So we will pivot back, and we're going to step out of the SPL for a second to talk about these different kinds of search operations that we just performed. You'll hear these terms if you are diving deeper into the actual operations of Splunk searching, and you can get very detailed about optimizing searches around these types of commands and the order in which you choose to execute SPL. Today I'm going to focus on how these operations actually apply to the data, helping you make better decisions about which commands are best for the scenario you have or the output you want to see. In future sessions we will discuss the actual optimization of searches through the optimal order of functions and other means. So just a caveat that today we're going to talk pretty specifically about these types individually, how they work with data, and then how you see them in combination.

So, our types of SPL commands: the top three in bold are the ones we'll focus on in our examples. The first is streaming operations, which are executed on individual events as they're returned by a search. You can think of this like your evals, which do something to every single event, modifying fields where they're available. We also have generating commands. Generating commands are used situationally where you're sourcing data from non-indexed datasets, so you would see that with inputlookup commands, or maybe tstats pulling information from the tsidx files and generating statistical output based on the data available there. Transforming commands you will see about as often as streaming commands, and generally more often than generating commands; transforming is intended to order results into a data table. I often think of this much like how we describe the statistical functions in basic search: summarization functions, where you're looking to condense your overall dataset into manageable, consumable results. Operations that apply that summarization are transforming. We do have two additional types of SPL commands, the first of which is orchestrating. You can read about these; I will not discuss them in great detail.
They are used to manipulate how searches are actually processed, or how commands are processed, and they don't directly affect the results of a search in the way we think about applying, say, a stats or an eval to a dataset. So if you're interested, definitely check them out; the linked documentation has details. Dataset processing is seen much more often, and you do have some conditional scenarios where commands can act as dataset processing. The distinction for dataset processing is that you are operating in bulk on a single, completed dataset at one time. We'll look at an example of that.

I want to pivot back to the main three that we're going to be focusing on, and I have mentioned some of these examples already. The eval functions that we've been talking about so far are perfect examples of streaming commands: we are creating new fields for each entry or log event, and modifying values for all of the results that are available. That is streaming. Inputlookup is possibly one of the most common generating commands that I see, because someone intends to source a dataset stored in a CSV file or a KV store collection, and you're able to bring that back as a report and use that logic in your queries; it does not require any indexed data to return the results you want to see. And we've talked about stats, very generally speaking, with a lot of unique functions you can apply there, where it provides a tabular output. It serves that purpose of summarization, so we're really reformatting the data into a tabular report.

You can see in this example search that we are often combining these different types of search operations. In this example, I have data that already exists in a CSV file. We are applying a streaming command, where, which evaluates each event to see if it matches a condition and returns the results based on that evaluation. And then we're applying a transforming command at the end, which is the stats summarization, getting the maximum value for the count of errors and the host associated with it.

So let's pivot over to Splunk and take a look at that example. I'm just going to grab my search here; I've commented out the lines following inputlookup so you can see that this generating command is not looking for any indexed data. We're pulling the results I have in a CSV file directly into this output, so we have a count of errors observed across multiple hosts. The where command you might think is reformatting data in the sense that it is transforming the results, but the evaluation of a where function applies to every event that is returned, so it is a streaming command that filters down our result set based on the condition that the error count is less than 200. The following line is our transforming command: with two results left, we see our maximum value of 187 on host 3. So the scenario here covers where you may have hosts trending toward a negative state. You're aware that the second host had already exceeded its threshold value for errors, but host 3 also appears to be trending toward that threshold. So we're combining these types of commands, understanding the logical condition we're searching for, and then providing a consumable output.
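Roughly, the example looks like this; the lookup file and field names here are placeholders I'm using for illustration, and the final stats line is just one way to condense whatever survives the where into a single-row report:

    | inputlookup host_error_counts.csv
    | where error_count < 200
    | stats max(error_count) AS max_errors, values(host) AS hosts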
So we're combining all three of our types of commands here. I'm now going to jump to an SPL demo, and as I go through these different commands, I'm going to reference back to the different command types that we're working with. I'm going to introduce in a lot of these searches a lot of small commands that I won't talk about in great detail, and that really is the purpose of using your search manual and your search reference documentation. I will glance over the use case and talk about how it's meant to be applied; then, in your own scenarios where you have a problem you need to solve, reference the docs to find where you can apply functions similar to what we observe in the demonstration here.

The first command I'm going to focus on is the rex command. Rex is a streaming command that you often see applied to datasets that do not have fields extracted in the format you want to use in your reporting or in your logic. This could very well be handled in the configuration of props and transforms, extracting fields at the right times as data is indexed, but as you're bringing in new data sources and working out what's available for use in Splunk, a lot of times you'll find yourself needing to extract new fields inline in your searches and use those in your search logic. Rex also has a sed mode, which I often see used to test masking of data inline before actually putting that into indexing configurations. So you would generally use rex when you don't have those fields available and you need to use them right then, and we're also going to look at an example of masking data to test your syntax for a sed-style replace before it goes into config files.

So we will jump back over. I'm going to start with a search on an index and source type from my tutorial data. This is actual Linux secure logging, so these are OS security logs, and we're looking at all of the web hosts we've been focusing on previously. In our events, you can see first an event that has "failed password for invalid user inet", and we're provided a source IP and a source port; but when we go to see the fields that are extracted, that's not being done for us automatically. So, to start testing our logic and see if we can get the results we want, we're going to use the rex command. In doing so, we are applying this operation across every event, again a streaming command. We are looking at the _raw field, so we're actually working on the raw text of each of these log events. The rex syntax is simply to provide, in double quotes, a regex match, and we're using named groups for the field extractions. So for every event that matches "failed password for invalid user," we are extracting a user field, a source IP field, and a source port field. For the sake of simplicity, I tried to keep the regex simple; you can make this as complex as you need for your data. In our extracted fields, which I've pre-selected, we can see our user is now available, and this applies to the events where the regex actually matched on the "failed password for invalid user ..." string. So now that we have our fields extracted, we can actually use them.
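What I've run so far is along these lines; the index, source type, and the exact regex are simplified placeholders rather than the only way to write this:

    index=tutorial sourcetype=secure host=www*
    | rex field=_raw "Failed password for invalid user (?<user>\w+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"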
And we want to do a stats count as failed_logins. Any time you see an operation, then AS, and then a new name, that's just a rename within the transforming command. It's an easier way to keep your field references consistent and avoid renaming later on; otherwise, in this case, you'd have to reference the distinct count function name itself in later lines. Just a way to keep things clean and easy to use in further lines of SPL. So we are counting our failed logins, we're looking at the distinct count of the source IP values that we have, and then we're splitting that by the host and the user. This tutorial data is actually pretty flat across most of the sources, so we're not going to have any outliers or spikes in our stats here, but you can see the resulting presentation. On line four we do have a sort command, and this is an example of a dataset processing command, where we are evaluating a full, completed dataset and reordering it. Given the logic here, we want to descend on these numeric values. Keep in mind, as you sort on different fields, it's going to be either the basic numeric ordering or the lexicographical ordering that you typically see in Splunk.

We do have a second example, with the sed-style replace. You can see in my events here that we are searching the tutorial vendor sales index and source type. I've gone ahead and applied one operation, and this is going to be helpful for understanding exactly what we are replacing and how to get consistent behavior on these fields. In this case, we are creating an ID length field, because we are going to choose to mask the value of account ID in our rex command, and we want to know that it is a consistent number of characters through all of our data. It's very simple to spot check, but just to be certain, we apply this to all of our data, again a streaming command through this eval. We are changing the type of the data, because account ID is actually numeric; we make it a string value so that we can look at the length. These are common functions in most programming languages, and the syntax here in SPL is quite simple. Just to get that contextual feel: we have 16 characters for 100% of our events in the account IDs.

So, actually applying our rex command, we are now going to specify a unique field, not just _raw. We are applying the sed mode, and this is a sed-syntax replacement with a capture group for the first 12 digits, which we replace with a series of 12 X's. You can see in our first event the account ID is now masked, and we only have the remaining four digits to identify it. So if our data is indexed in Splunk with the full account IDs, appropriately so, but for the sake of reporting we want to mask that for the audience, we're able to use the sed replace. And then to finalize a report, this is just an example of the top command, which does a few operations together and makes for a good shorthand report: taking all the unique values of the provided field, giving you a count of those values, and then showing the percentage of the total dataset that each unique value accounts for. Again, pretty flat in this tutorial data, with a very consistent 0.03% across these different account IDs. So we have looked at a few examples with the rex command, which is, again, streaming.
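For reference, the two demo searches look roughly like this; the account ID field name, index, and source type values are assumptions on my part, so adjust them for your own data:

    index=tutorial sourcetype=secure host=www*
    | rex field=_raw "Failed password for invalid user (?<user>\w+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
    | stats count AS failed_logins, dc(src_ip) AS unique_sources BY host, user
    | sort - failed_logins

    index=tutorial sourcetype=vendor_sales
    | eval id_length=len(tostring(AcctID))
    | rex field=AcctID mode=sed "s/^(\d{12})/XXXXXXXXXXXX/"
    | top AcctID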
We're going to look at another streaming command, which is a set of multivalue eval functions. Again, if you're going to keep a bookmark for search documentation, the multivalue eval functions page is a great one to have, because when you encounter these it really takes some time to figure out how to operate on the data. The multivalue functions are really just a collection, and depending on your use case, you're able to determine the best one to apply. You see them used often with JSON and XML, data formats that naturally provide multivalue fields where you have repeated tags or keys across unique events as they're extracted. You also often see this in Windows event logs, where you have repeated keys whose values differ and whose position in the event is specific to a condition, so you may need to extract or interact with one of those unique values to get a reasonable outcome from your data. So we use multivalue eval functions when we have a change we want to make to the presentation of data and we're able to do so with multivalue fields. That most often occurs when you already have multivalue data and you want to change the format of those multivalue fields. And then we're also going to look at a quick example of using a multivalue evaluation as a logical condition.

So, the first example. We're going to start with a simple table on our web access logs, pulling in the status and referer domain fields. You can see we've got an HTTP status code, and the referer domain has the format of a protocol, subdomain, and top-level domain. Our scenario is that, for simplicity of reporting, we just want to work with this referer domain field in a simplified form. So we split out the field, in this case splitting referer_domain and choosing the period character as the point to split the data, and we've created a multivalue field from what was previously a single-value field. Using this, we can create a new field by indexing into the multivalue field; the mvindex function allows us to target a specific field and choose a starting and ending index to extract given values. There are a number of ways to do this; in our case, where we have three entries, it's quite simple to give the start and end of the range as the two entries we want. As we work to recreate our domain, applying that to the new domain field, we get buttercupgames.com from what was previously http://www.buttercupgames.com. We can now use those fields in a transforming command, in this case a simple stats count by status and domain.

I do want to look at another example here that is similar, but we're going to use a multivalue function to actually test a condition. In this case I'm searching the same data, and we're going to start with a stats command: a stats count as well as a values of status. The values function provides all the unique values of a given field based on the split-by, and that produces a multivalue field, in this case for status. We have quite a few results that have multiple status codes, and since we're interested in pulling those out, we can use an mvcount function to evaluate and filter our dataset down to those specific results.
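Sketched out, the two multivalue examples look something like this; the split-by field in the second search and the mvjoin to stitch the domain back together are my own choices for illustration rather than the exact SPL from the demo:

    index=tutorial sourcetype=access_combined_wcookie
    | table status, referer_domain
    | eval domain_parts=split(referer_domain, ".")
    | eval domain=mvjoin(mvindex(domain_parts, 1, 2), ".")
    | stats count BY status, domain

    index=tutorial sourcetype=access_combined_wcookie
    | stats count, values(status) AS status BY clientip
    | where mvcount(status) > 1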
So it's a very simple operation, just looking at what has more than a single value for status, but it's very useful as you apply this in reporting, especially in combination with other functions and more complex conditions. So that is our set of multivalue eval functions, again as streaming commands.

For the final section of the demo, I want to talk about a concept that is not so much a set of functions, but really enables more complex and interesting searching and lets us use a few different types of commands in our SPL. The concept of subsearching, for both filtering and enrichment, is taking secondary search results and using them to affect a primary search. So a subsearch is executed, its results are returned, and depending on how it's used, those results are processed in the original search. We'll look at an example where it is filtering: based on the results, we effectively get value=X OR value=Y for one of the fields we're looking at in the subsearch. And then we're also going to look at an enrichment example; you see this often when you have a dataset saved in a lookup table, or you just have a simple reference where you want to bring in more context, maybe descriptions of event codes, things like that.

So let's look at the first command. I'm going to run my search, and we're going to pivot over to a subsearch tab here, so you can see the subsearch looking at the secure logs; I've pulled the subsearch out on its own just to see what's going to be returned from it. We're applying the same rex we had before to extract our fields. We're applying a where, a streaming command, looking for anything that's not null for user. We observed that about 60% of our events were going to be null, based on not having a user field. Then, looking at that total dataset, we're just going to count by our source IP; this is often a quick way to get a list of the unique values of any given field and then operate on that to return just the list of values. There are a few different ways to do that; I see stats count pretty often. In this case, we're then tabling out just our source IP field and renaming it to client IP, so the resulting dataset is a single-column table with 182 results, and the field name is clientip. When this runs as a subsearch and is returned to the original search, the effective result is clientip equals my first value, or clientip equals my second value, and so on through the full dataset.

So looking at our outer search here, we're applying this to the access logs. You can see that we had a field named source IP in the secure logs, and we renamed it to clientip so that we could apply this to the access logs, where clientip is the actual field name for the source IP data. In this case, we are filtering our web access logs to the client IPs relevant in the secure logs. Uncommenting here, we have a series of operations, and I'm going to run them all at once and talk through them: we are counting the events by status and clientip for the client IPs that were relevant to authentication failures in the secure logs, and we are then creating a status_count field just by combining our status and count fields with a colon between them.
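Put together, the filtering example is along these lines, with placeholder index and source type names and the same simplified regex as before:

    index=tutorial sourcetype=access_combined_wcookie
        [ search index=tutorial sourcetype=secure
          | rex field=_raw "Failed password for invalid user (?<user>\w+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
          | where isnotnull(user)
          | stats count BY src_ip
          | table src_ip
          | rename src_ip AS clientip ]
    | stats count BY status, clientip
    | eval status_count=status.":".count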
And then we are doing a second stats statement to combine all of our newly created fields into a more condensed report. So: a transforming command, then streaming to create our new field, another transforming command, and then our sort for dataset processing, which gives us the results here for a given client IP. In this case we're looking at the scenario where these client IPs are involved in authentication failures against the web servers, in this case all over SSH, and we want to see whether there are interactions by these same source IPs on the website that we're hosting. So seeing a high number of failed values, and looking at actions as well, is a use case for bringing in that context and seeing whether there's any relationship between the data. This is often discussed as correlation of logs. I'm usually careful about using the term correlation when talking about Splunk queries, especially with Enterprise Security and its correlation searches, where I typically think of correlation searches as overarching concepts that cover data from multiple data sources. In this case, correlating events means looking at unique data types that are potentially related and finding the logical connection for the condition; that's a little more up to the user, and not quite as easy as, say, pointing to a specific data model.

So we are going to look at one more subsearch, and this one is going to apply the join command. I talked about using lookup files or other data returned by subsearches to enrich, to bring more data in rather than filter. Looking at the first part of the command here, it is just a simple stats report based on this rex that keeps coming up through the SPL, to give us those user and source IP fields. So our result is authentication failures for all of these web hosts, similar to what we had previously returned. Then we'll take a look at the results of the subsearch; I'm going to split this up so we can see the first two lines. We're looking at our web access logs for purchase actions, and then we're doing a stats count for errors and a stats count for successes. We have pretty limited status codes in this data, so this is viable for observing our errors and successes in the data present. We are then creating a new field based on the statistics we're generating, looking at our transaction errors, where we have high or low numbers of failed purchase actions, and summarizing that. The final command there is another transforming command, table, just to reduce this to a small dataset to use in the subsearch. So in this case we have our host value and then the transaction error rate that we observe from the web access logs.

Over in our outer search, we are going to perform a left join based on this host field. You can see that in our secure logs we still have the same host values, and that's going to be used to add our transaction error rates in for each host. So as we observe increased authentication failures, if there's a scenario where a breach or some sort of interruption is affecting the ability to serve or complete these purchase actions, the intended operation of the web servers, we can see that here.
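A sketch of the enrichment example, where the error and success counts and the error-rate calculation are reasonable stand-ins I'm using for illustration rather than the exact SPL from the demo:

    index=tutorial sourcetype=secure host=www*
    | rex field=_raw "Failed password for invalid user (?<user>\w+) from (?<src_ip>\d+\.\d+\.\d+\.\d+)"
    | stats count AS auth_failures BY host
    | join type=left host
        [ search index=tutorial sourcetype=access_combined_wcookie action=purchase
          | stats count(eval(status>=400)) AS errors, count(eval(status<400)) AS successes BY host
          | eval transaction_error_rate=round(errors / (errors + successes), 3)
          | table host, transaction_error_rate ]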
Of course, in our tutorial data there's not really much jumping out or showing any correlation between the two, but the purpose of the join is to bring in that extra dataset to give the context to investigate further. So that is the final portion of the SPL demo. For any questions, I'm going to take a look at the chat and do my best to answer; if you have any other questions afterward, please feel free to reach out to my team at support@kinneygroup.com, and we'll be happy to get back to you and help.

I am taking a look through. Okay, seeing some questions on performance of the rex, sed, and regex commands. Off the top of my head, I'm not sure about a direct performance comparison of the individual commands. I definitely want to look into that, and definitely follow up if you'd like to explain a more detailed scenario or look at some SPL where we can apply and observe those changes. On the question of getting the dataset, that is what I mentioned at the beginning: reach out to us for the slides, or just reach out about the link. You can search the Splunk tutorial data yourself as well; the documentation on how to use the tutorial data, one of the first links there, takes you to a page that has the tutorial data zip file and instructions on how to upload it into your specific environment. It's in Add Data and then Upload, two clicks, and upload your file. That is freely available for anyone, and that package is dynamically updated as well, so your timestamps will be pretty close to current when you download it; it depends a bit on where you are in the update cycle, but if you search over all time you won't have any issues there. And again, on receiving slides, reach out to my team and we're happy to provide those and discuss further, and we'll have the recording available for this session. After the recording processes when the session ends, you should be able to use the same link and watch this recording without having to sign up again or have the file transferred.

So okay, Chris, seeing your comment there; let me know if you want to reach out to me directly, anyone as well. We can discuss which slides and presentations you had attended; I'm not sure I have the attendance report for what you've seen previously, so I'm happy to get those for you. All right, and seeing... thanks, Brett. You see Brett Woodruff in the chat commenting; he's a systems engineer on the Expertise on Demand team, a very knowledgeable guy, and he's going to be presenting next month's session. That session will take the concept we talked about with subsearching as a general search topic and go specifically into data enrichment using joins, lookup commands, and how we see that used in the wild. So I'm definitely excited for that one, and I encourage you to register for that event. All right, I'm not seeing any more questions. With that, I am stopping my share. I'm going to hang around for a few minutes, but thank you all for attending, and we'll see you at the next session.