Hello everyone. We are getting started here on our August lunch and learn session, presented by Kinney Group's Atlas Customer Experience team. My name is Alice Devaney. I am the engineering manager for the Atlas Customer Experience team, and I'm excited to be presenting this month's session on intermediate-level Splunk searching. So thank you all for attending. I hope you get some good ideas out of this. I certainly encourage engagement through the chat, and I'll have some information at the end on following up and speaking with my team directly about any issues or interests you have around the types of concepts we're going to cover today.

So, jumping into an intermediate-level session: I do want to say that we have previously done a basic-level searching session, so we are really progressing from that, picking up right where we left off. We've done that session with quite a few of our customers individually, and if you're interested in doing that session, or this one, with a larger team, we're happy to discuss and coordinate that.

So getting started, we're going to take a look at the final search from our basic search session. We're going to walk through that and understand some of the concepts, and then we're going to take a step back and look a little more generally at SPL operations, understanding how different commands apply to data. That is really the next level of understanding for how you can write more complex searches and know when to use certain types of commands. And of course, in the session we're going to have a series of demos using a few specific commands, highlighting the different SPL command types that we discuss in the second portion, and we'll get to see that on the tutorial data that you can also use in your environment, or in a test environment, very simply.

I will always encourage, especially with search content, that you look into the additional resources I have listed here. The search reference documentation is one of my favorite bookmarks; I use it frequently in my own environments and when working in customer environments. It is really the best quick resource for syntax and examples of any search command, and it is always a great resource to have. The search manual is a little bit more conceptual, but as you're learning more about different types of search operations, it's very helpful to be able to review that documentation and have reference material you can come back to as you study and work toward writing more complex search content. I have also linked the documentation on how to use the Splunk tutorial data. If you've not done that before, it's a very simple process, and Splunk provides consistently updated download files that you can upload directly into any Splunk environment. That's what I'm going to be using today, and provided you search over appropriate time windows for when you downloaded the tutorial dataset, these searches will work on the tutorial data as well. So I highly encourage, after the fact, that you go through and test out some of the content. You'll be able to access a recording, and if you'd like the slides I'm presenting from today, which I highly encourage because there are a lot of useful links in here, reach out to my team. Again, right at the end of the slides we'll have that info.
So looking at our overview of basic search, I just want to cover conceptually the two categories that we discuss in that session. Those two are the statistical and charting functions, which in those demos consist of aggregate and time functions. Aggregate functions are going to be your commonly used statistical functions meant for summarization, and time functions actually use the timestamp field, underscore time, or any other time you've extracted from the data, looking at earliest and latest relative time values in a summative fashion. Evaluation functions are the separate type, where we discuss comparison and conditional statements, using your if and your case functions in evals; also datetime functions, which apply operations to events uniquely, so not necessarily summarization, but interacting with the time values themselves, maybe changing the time format; and then multivalue eval functions, which we touch on very lightly and which are more conceptual in basic search. So today we're going to dive in as part of our demo and look at multivalue eval functions later in the presentation.
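For reference, the kinds of eval functions described above look roughly like the following sketch in SPL; the index and sourcetype are assumptions for wherever the tutorial data is loaded, and the field names are illustrative rather than what was on the slide:

    index=main sourcetype=access_combined_wcookie
    | eval status_class=case(status>=500, "server error", status>=400, "client error", status>=300, "redirect", true(), "success")
    | eval day=strftime(_time, "%A")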
So on this slide I have highlighted in gray the search that we end basic search with. It is broken up into three segments, with the first line being a filter to a dataset. This is very simply how you source most of your data in most of your searches in Splunk, and we always want to be as specific as possible. The most common and logical way to do that is by identifying an index and a source type, and possibly some specific values of given fields in that data, before you start applying other operations. In our case, we want to work with the whole dataset, and then we move into applying our eval statements.

So in the evals, the purpose is to create some new fields to work with, and we have two operations here. You can see that on the first line we're starting with an error check field. These are web access logs, so we're looking at the HTTP status codes in the status field, and we have a logical condition here: for status greater than or equal to 400, we want to return errors. It's a very simple example, keeping it as easy as possible. If you want to get specific about your 200s and your 300s, it's the exact same type of logic; you would likely apply a case statement to add some additional conditions and more unique output in an error check field, or whatever field indicates what you want to see out of your status code. In this case it's simply "error", or the value "non-error" if we have, say, a 200.

We're also using a time function to create a second field called day. You may be familiar with some of the fields you get by default for most any event in Splunk that are breakdowns of the timestamp: you have day, month, and many others. In this case I want a specific format for day, so we use a strftime function with a time format variable applied to the actual extracted timestamp for Splunk. So coming out of the second line, we've accessed our data and created two new fields to use, and then we are actually performing charting with a statistical function, using timechart. We can see here that we are counting the events that have the error value for our created error check field.

So I'm going to pivot over to Splunk, and we're going to look at this search. I have commented out most of the logic, and we'll step back through it. We are looking at our web access log events here, and we then apply our eval. By applying the eval, we get our error check field, which provides error or non-error, and we're seeing that we have mostly non-error events. Then we have the day field, and day is providing the full name of the day for the timestamp on all of these events. With our timechart, this is the summarization, with the caveat that we're spanning by default over a single day, so this may not be a very logical use of a split by day when we are already using a timechart command that is dividing our results by the time bin, effectively a span of one day. But what we can do is change our split-by field to host and get a little bit more of a reasonable presentation. We were able to see, with the counts in the individual days split not only through the timechart but by the day field, that we only had values where the time bucket and the day field actually matched up. So here we have our hosts one, two, and three, and then, across days, counts of the error events that we observe.

So that is the search that we end on in basic search. The concepts there are accessing our data and searching in a descriptive manner using our metadata fields, the index and the source type; the evaluation functions, where we're creating new fields and manipulating data; and then a timechart function that is providing some summarized statistics based on a time range.
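For reference, that final basic-search SPL was roughly the following sketch; the index, sourcetype, and the error_check field name are assumptions rather than the literal slide text:

    index=main sourcetype=access_combined_wcookie
    | eval error_check=if(status>=400, "error", "non-error")
    | eval day=strftime(_time, "%A")
    | timechart count(eval(error_check="error")) AS errors BY host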
So we will pivot back, and we're going to take a step back out of the SPL for a second to talk about the different kinds of search operations that we just performed. You'll hear these terms if you are really diving deeper into the actual operations of Splunk searching, and you can get very detailed regarding the optimization of searches around these types of commands and the order in which you choose to execute SPL. Today I'm going to focus on how these operations actually apply to the data, helping you make better decisions about which commands are best for the scenario you have or the output you want to see. In future sessions, we will discuss the actual optimization of searches through the optimal ordering of functions and some other means. But just a caveat there: we're going to talk pretty specifically today about these individually, how they work with data, and then how you see them in combination.
So, our types of SPL commands. The top three in bold are the ones we'll focus on in our examples. The first is streaming operations, which are executed on individual events as they're returned by a search. You can think of this like your evals: something is done to every single event, modifying fields when they're available. We do have generating functions, which are used situationally when you're sourcing data from non-indexed datasets; you would see that with inputlookup commands, or maybe tstats pulling information from the tsidx files and generating statistical output based on the data available there. Transforming commands you will see as often as streaming commands, generally speaking, and more often than generating commands. Transforming is intended to order results into a data table, and I often think of this much like how we discuss the statistical functions in basic search as summarization functions, where you're looking to condense your overall dataset into really manageable, consumable results. So the operations that apply that summarization are transforming.

We do have two additional types of SPL commands. The first is orchestrating. You can read about these; I will not discuss them in great detail. They are used to manipulate how searches, or how individual commands, are actually processed, and they don't directly affect the results of a search in the way we think about applying, say, a stats or an eval to a dataset. So if you're interested, definitely check it out; the linked documentation has details there. Dataset processing commands are seen much more often, and you do have some conditional scenarios where commands can act as dataset processing. The distinction for dataset processing is that you are operating in bulk on a single, completed dataset at one time. We'll look at an example of that.
I want to pivot back to the main three that we're going to be focusing on, and I have mentioned some of these examples already. The eval functions that we've been talking about so far are perfect examples of streaming commands: we are creating new fields for each entry or log event, and we are modifying values for all of the results that are available. That is where we are streaming with the search functions. Inputlookup is possibly one of the most common generating commands that I see, because someone is intending to source a dataset stored in a CSV file or a KV store collection, and you're able to bring that back as a report and use that logic in your queries. That does not require the indexed data, or any indexed data, to actually return the results that you want to see. And we've talked about stats, very generally speaking, with a lot of unique functions you can apply there; it provides a tabular output, and it serves that purpose of summarization, so we're really reformatting the data into that tabular report.
So we see in this example search that we are often combining these different types of search operations. In this example, I have data that already exists in a CSV file. We are applying a streaming command, the where, evaluating each line to see if it matches a condition and returning the results based on that evaluation. And then we're applying a transforming command at the end, which is the stats summarization, getting the maximum value for the count of errors and the host that is associated with it. So let's pivot over to Splunk and take a look at that example.

I'm just going to grab my search here. I pre-commented out the specific lines following inputlookup, just to show that this generating command is not looking for any specific indexed data; we're pulling the results I have in a CSV file directly into this output, so we have a count of errors observed across multiple hosts. Our where command you might think is reformatting data, in the sense that it is transforming the results, but the evaluation of a where function applies effectively to every event that is returned. So it is a streaming command that is going to filter down our result set based on our condition that the error count is less than 200.

The following line is our transforming command, where we have two results left, 187 for host 3, and we want to see our maximum value here of 187 on host 3. So our scenario has really covered a case where you may have hosts that are trending toward a negative state. You're aware that the second host had already exceeded its threshold value for errors, but host 3 also appears to be trending toward this threshold. So we're able to combine these types of commands, understand the logical condition that we're searching for, and then also provide that consumable output, combining all three of our types of commands here.
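For reference, the shape of that search was roughly as follows; the lookup file and field names (error_counts.csv, error_count) are hypothetical stand-ins for what was shown in the demo:

    | inputlookup error_counts.csv
    | where error_count < 200
    | stats max(error_count) AS max_errors BY host
    | sort - max_errors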
So I'm going to jump to an SPL demo, and as I go through these different commands, I'm going to be referencing back to the different command types that we're working with. I'm going to introduce, in a lot of these searches, a lot of small commands that I won't talk about in great detail, and that really is the purpose of using your search manual and your search reference documentation. So I will glance over the use case, talk about how it's meant to be applied, and then, when you're using this in your own scenarios where you have a problem you need to solve, reference the docs to find out where you can apply functions similar to what we observe in the demonstration here.
So the first command I'm going to focus on is the rex command. Rex is a streaming command that you often see applied to datasets that do not fully have data extracted in the format that you want to use in your reporting or your logic. This could very well be handled in the configuration of props and transforms, extracting fields at the right times when indexing data, but as you're bringing in new data sources, you need to understand what's available for use in Splunk, and a lot of times you'll find yourself needing to extract new fields inline in your searches and use those in your search logic. Rex also has a sed mode, and I also see that used to test the masking of data inline prior to actually putting that into indexing configurations. So rex you would generally see used when you don't have those fields available and you need to use them at that time. And then we're going to take a look at an example of masking data as well, to test your syntax for a sed-style replace before it goes into config files. So we will jump back over.
So I'm going to start with a search on an index and source type, my tutorial data. This is actual Linux secure logging, so these are going to be OS security logs, and we're looking at all of the web hosts that we've been focusing on previously. In our events, you can see that we have first an event that says failed password for invalid user inet; we're provided a source IP and a source port, but when we go to see the fields that are extracted, that's not being done for us automatically. So, to start testing our logic and see if we can get the results we want, we're going to use the rex command. In doing so, we are applying this operation across every event, again a streaming command. We are looking at the raw field, so we're actually looking at the raw text of each of these log events. The rex syntax is simply to provide, in double quotes, a regex match, and we're using named groups for the field extractions. So for every event that says failed password for invalid user, we are extracting a user field, a source IP field, and a source port field. For the sake of simplicity, I tried to keep the regex simple; you can make it as complex as you need for your data. In our extracted fields, I've actually pre-selected these so we can see that our user field is now available, and this applies to the events where the regex was actually valid, matching on the "failed password for invalid user" string.

So now that we have our fields extracted, we can actually use them. We want to do a stats count as failed logins. Anytime you see an operation followed by "as" and then a unique name, that is just a rename within the transforming command. It's an easier way to keep consistency when referencing your fields, and it means you don't have to rename later on; otherwise, in this case, you'd have to reference the default distinct count name. So it's just a way to keep things clean and easy to use in further lines of SPL. We are counting our failed logins, we're looking at the distinct count of the source IP values that we have, and then we're splitting that by the host and the user. You can see here that this tutorial data is actually pretty flat across most of the sources, so we're not going to have any outliers or spikes in our stats, but you can see the resulting presentation. In line four, we do have a sort command, and this is an example of a dataset processing command, where we are evaluating a full, completed dataset and reordering it. Given the logic here, we want to descend on these numeric values. Keep in mind, as you're operating on different fields, it's going to be either the basic numeric ordering or the lexicographical ordering that you typically see in Splunk.
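For reference, that search was roughly the following sketch; the index, sourcetype, field names, and the regex itself are assumptions based on the standard tutorial secure logs rather than the exact demo SPL:

    index=main sourcetype=secure "Failed password for invalid user"
    | rex field=_raw "invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
    | stats count AS failed_logins, dc(src_ip) AS unique_sources BY host, user
    | sort - failed_logins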
So we do have a second example, with the sed-style replace. You can see in my events here that we are searching the tutorial vendor sales index and source type, and I've gone ahead and applied one operation. This is going to be a helpful operation for understanding really what we are replacing and how to get a consistent operation on these fields. In this case, we are creating an ID length field, because we are going to choose to mask the value of the account ID in our rex command, and we want to know that it is a consistent number of characters through all of our data. It's very simple to spot check, but just to be certain, we want to apply this to all of our data, in this case a streaming command through this eval. We are changing the type of the data, because account ID is actually numeric; we're making it a string value so that we can look at the length. These are common functions in any programming language, and the syntax here in SPL is quite simple. Just to get that contextual feel, we see that we have 16 characters for 100% of our events in the account IDs.

So, actually applying our rex command, we are now going to specify a unique field, not just underscore raw. We are applying the sed mode, and this is a sed-syntax replacement looking for a capture group of the first 12 digits, and then we're replacing that with a series of 12 X's. You can see in our first event that the account ID is now masked; we only have the remaining four digits to be able to identify it. So if our data is indexed, and appropriately so, in Splunk with the full account IDs, but for the sake of reporting we want to mask that for the audience, then we're able to use the sed replace. And then, to finalize a report, this is just an example of the top command, which does a few operations together and makes for a good shorthand report: taking all the unique values of the provided field, giving you a count of those values, and then showing the percentage of the total dataset that each unique value accounts for. So again, it's pretty flat in this tutorial data, and we see a very consistent 0.03% across these different account IDs.
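For reference, those two steps looked roughly like this, first the length check and then the masking plus the top report; the sourcetype and the AcctID field name are assumptions about the vendor sales tutorial data:

    index=main sourcetype=vendor_sales
    | eval id_length=len(tostring(AcctID))
    | stats count BY id_length

    index=main sourcetype=vendor_sales
    | rex field=AcctID mode=sed "s/^(\d{12})/XXXXXXXXXXXX/"
    | top AcctID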
So we have looked at a few examples with the rex command, which is, again, streaming. We're going to look at another streaming command, which is going to be a set of multivalue eval functions. Again, if you're going to have a bookmark for search documentation, the multivalue eval functions are a great one to have, because when you encounter these, it really takes some time to figure out how to actually operate on the data. The multivalue functions are really just a collection, and depending on your use case, you're able to determine the best one to apply. You see them often used with JSON and XML, data formats that are naturally going to provide a multivalue field where you have repeated tags or keys across unique events as they're extracted. And you often see, a lot of times in Windows event logs, repeated keys where the values are different and the position in the event is actually specific to a condition, so you may have a need to extract or interact with one of those unique values to get a reasonable outcome from your data.

So we're going to use multivalue eval functions when we have a change we want to make to the presentation of the data and we're able to do so with multivalue fields. I would say this often occurs when you have multivalue data and you want to change the format of the multivalue fields there. And then we're also going to look at a quick example of actually using multivalue evaluation as a logical condition.
So, the first example. We're going to start with a simple table looking at our web access logs, pulling in our status and referer domain fields. You can see we've got an HTTP status code, and we've got the referer in the format of a protocol, subdomain, and TLD. Our scenario here is that, for simplicity of reporting, we just want to work with this referer domain field and be able to simplify it. So we are actually splitting out the field, in this case splitting referer domain, choosing the period character as our point to split the data. We're creating a multivalue field from what was previously just a single-value field. Using this, we can create a new field by indexing into the multivalue field, and in this case we're looking at specific index positions. The mvindex function allows us to target a specific field and then choose a starting and ending index to extract the given values. There are a number of ways to do this; in our case, where we have three entries, it's quite simple to give the start and end of the range covering the two entries we want. So we are working to recreate our domain, and applying that for this new domain field, we have buttercupgames.com from what was previously the http www.buttercupgames.com value. We can now use those fields in a transforming command, in this case a simple stats count by status and the domain.
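For reference, that search was roughly the following; the mvjoin step is my assumption about how the two pieces were put back together, and the index and sourcetype are again wherever your tutorial data lives:

    index=main sourcetype=access_combined_wcookie
    | table status, referer_domain
    | eval domain_parts=split(referer_domain, ".")
    | eval domain=mvjoin(mvindex(domain_parts, 1, 2), ".")
    | stats count BY status, domain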
I do want to look at another example here that is similar, but we're going to use a multivalue function to actually test a condition. In this case I'm searching the same data. We're going to start with a stats command, a stats count as well as a values of status. The values function is going to provide all the unique values of a given field based on the split-by, and that produces a multivalue field, here in the case of status. We have quite a few results that have multiple status codes, and as we're interested in pulling those out, we can use an mvcount function to evaluate and filter our dataset to those specific results. So it's a very simple operation here, just looking at what has more than a single value for status, but it's very useful as you're applying this in reporting, especially in combination with other functions and with more complex conditions.
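For reference, a minimal sketch of that pattern, assuming the split-by field was clientip (the demo doesn't make the exact field explicit):

    index=main sourcetype=access_combined_wcookie
    | stats count, values(status) AS status BY clientip
    | where mvcount(status) > 1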
So that is our set of multivalue eval functions as streaming commands.
So for the final section of the demo, I want to talk about a concept that is not so much a set of functions, but really enables more complex and interesting searching, and it can allow us to use a few different types of commands in our SPL. The concept of subsearching, for both filtering and enrichment, is taking secondary search results and using them to affect a primary search. A subsearch will be executed, the results returned, and, depending on how it's used, processed in the original search. We'll look at an example where it is filtering: based on the results, we effectively get a "value equals X OR value equals Y" condition for one of the fields that we're looking at in the subsearch. And then we're also going to look at an enrichment example. You see this often when you have a dataset saved in a lookup table, or you just have a simple reference where you want to bring in more context, maybe descriptions of event codes, things like that.
So, in that case, we'll look at the first command here. I'm going to run my search, and we're going to pivot over to a subsearch tab, so you can see our subsearch looking at the secure logs. We are actually just pulling out the subsearch to see what the results are, or what's going to be returned from it. We're applying the same rex that we had before to extract our fields. We're applying a where, a streaming command, looking for anything that's not null for user; we observed that about 60% of our events were going to be null based on not having a user field. Then, looking at that total dataset, we're just going to count by our source IP. This is often a quick way to get a list of the unique values of any given field, and then you operate on that to return just the list of values; there are a few different ways to do that, and I see stats count pretty often. In this case, we're actually tabling out, keeping just our source IP field and renaming it to client IP, so the resulting dataset is a single-column table with 182 results, and the field name is client IP. When this is returned to the original search, running as a subsearch, the effective result is "client IP equals my first value OR client IP equals my second value" and so on through the full dataset. Looking at our search here, we're applying this to the access logs. You can see that we had a field named source IP in the secure logs and we renamed it to client IP so that we could apply this to the access logs, where client IP is the actual field name for the source IP data. So in this case, we are filtering our web access logs down to the client IPs relevant in the secure logs.
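For reference, the filtering subsearch looked roughly like this; the rex pattern and field names carry over the assumptions from the earlier secure-log example:

    index=main sourcetype=access_combined_wcookie
        [ search index=main sourcetype=secure
          | rex field=_raw "invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+) port (?<src_port>\d+)"
          | where isnotnull(user)
          | stats count BY src_ip
          | table src_ip
          | rename src_ip AS clientip ]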
So, uncommenting here, we have a series of operations, and I'm just going to run them all at once and talk through them. We are counting the events by status and client IP, for the client IPs that were relevant to authentication failures in the secure logs. We are then creating a status count field, just by combining our status and count fields with a colon between them. And then we are doing a second stats statement to combine all of our newly created fields together into a more condensed report. So a transforming command, then streaming for creating our new field, another transforming command, and then our sort for dataset processing actually gives us the results here for a given client IP. In this case, we are looking for the scenario where these client IPs are involved in authentication failures to the web servers, in this case all over SSH, and we want to see if there are interactions by these same source IPs on the website that we're hosting. So seeing a high number of failed values, and looking at actions as well, is a use case here for bringing in that context and seeing if there's any sort of relationship between the data.
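For reference, the outer pipeline was along these lines; the status count construction and the second stats are a reconstruction of what was described rather than the literal demo SPL:

    index=main sourcetype=access_combined_wcookie
        [ search index=main sourcetype=secure
          | rex "invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+)"
          | where isnotnull(user)
          | stats count BY src_ip
          | table src_ip
          | rename src_ip AS clientip ]
    | stats count BY status, clientip
    | eval status_count=status.":".count
    | stats values(status_count) AS status_counts, sum(count) AS total_events BY clientip
    | sort - total_events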
This is often discussed as correlation of logs. I'm usually careful about using the term correlation when talking about Splunk queries, especially with Enterprise Security and its correlation searches, where I typically think of correlation searches as overarching concepts that cover data from multiple data sources. In this case, correlating events means looking at unique data types that are potentially related and finding the logical connection for the condition. That's a little bit more up to the user; it's not quite as easy as, say, pointing to a specific data model.
So we are going to look at one more subsearch here, and this case is going to apply the join command. I talked about using lookup files or other data returned by subsearches to enrich, to bring more data in rather than filter. We are going to look at our first part of the command here, and this is actually just a simple stats report based on the rex that keeps coming through the SPL to give us those user and source IP fields. So our result here is authentication failures for all these web hosts, similar to what we had previously returned. Then we're going to take a look at the results of the subsearch. I'm going to split this up so that we can see the first two lines: we're looking at our web access logs for purchase actions, and then we are looking at a stats count for errors and a stats count for successes. We have a pretty limited set of status codes returned in this data, so this is viable for the data present to observe our errors and successes. Then we are actually creating a new field based on the statistics that we're generating, looking at our transaction errors, so where we have high or low numbers of failed purchase actions, and then summarizing that. So the final command here is another transforming command, table, just to reduce this to a small dataset to use in the subsearch. In this case, we have our host value and then the transaction error rate that we observe from the web access logs.

Then, over in our other search, we are going to perform a left join based on this host field. You see in our secure logs we still have the same host value, and this is going to be used to actually add our transaction error rate in for each host. So as we observe increased authentication failures, if there's a scenario for a breach and some sort of interruption to the ability to serve out or perform these purchase actions, affecting the intended operations of the web servers, we can see that here. Of course, in our tutorial data there's not really much jumping out or showing that there is any correlation between the two, but the purpose of the join is to bring in that extra dataset to give the context to further investigate.
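For reference, the join example looked roughly like this; the transaction error rate calculation is my guess at how that rate was produced, and the field names are assumptions carried over from the earlier sketches:

    index=main sourcetype=secure
    | rex "invalid user (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+)"
    | stats count AS failed_logins BY host, user
    | join type=left host
        [ search index=main sourcetype=access_combined_wcookie action=purchase
          | stats count(eval(status>=400)) AS errors, count(eval(status<400)) AS successes BY host
          | eval transaction_error_rate=round(errors/(errors+successes)*100, 2)
          | table host, transaction_error_rate ]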
So that is the final portion of the SPL demo. I do want to say, for any questions, I'm going to take a look at the chat and do my best to answer, and if you have any other questions, please feel free to reach out to my team at support@kinneygroup.com, and we'll be happy to get back to you and help. I am taking a look through.
Okay, seeing some questions on performance of the rex, sed, and regex commands. Off the top of my head, I'm not sure about a direct performance comparison of the individual commands. I definitely want to look into that, and definitely follow up if you'd like to explain a more detailed scenario or look at some SPL that we can apply and observe those changes with.
For the question on getting the dataset, that is what I mentioned at the beginning: reach out to us for the slides, or just reach out about the link. You can also search for the Splunk tutorial data; there's documentation on how to use the tutorial data, and one of the first links there takes you to a page that has the tutorial data zip file and instructions on how to [inaudible] that. It's just an upload for your specific environment: go to Add Data, then upload data, two clicks, and upload your file. So that is freely available for anyone, and again, that package is dynamically updated as well, so your timestamps are pretty close to normal as of when you download the app. It kind of depends on the timing of the update cycle, but if you search over all time, you won't have any issues there. And then, yeah, again, on receiving slides, reach out to my team and we're happy to provide those and discuss further, and we'll have the recording available for this session. After the recording processes when the session ends, you should be able to use the same link and watch this recording afterward without having to sign up or transfer the file.
So, okay, Chris, seeing your comment there. Let me know if you want to reach out to me directly, anyone as well. We can discuss which slides and presentation you had attended; I'm not sure I have the attendance report for what you've seen previously, so I'm happy to get those for you.
All right, and seeing... thanks, Brett. So you see Brett Woodruff in the chat commenting; he's a systems engineer on the Expertise on Demand team, a very knowledgeable guy, and he's going to be presenting next month's session. That is going to take this concept we talked about, subsearching as a general search topic, and go specifically into data enrichment using joins and lookup commands, and how we see that used in the wild. So I'm definitely excited for that one, and I encourage you to register for that event.
All right, I'm not seeing any more questions. All right, with that, I am stopping my share. I'm going to hang around for a few minutes, but thank you all for attending, and we'll see you at the next session.