LOOKUPS – LOOKUP TABLE FILES ( PART – 1 )
A lookup table or file is one of the most important portions in Splunk, which is mainly use for mapping of fields and field-values. Splunk Lookup helps us in adding a complete new field, from an external source based on the value that matches your field in the event data. Basically it enriches our data by adding some externals data.
Let’s take an example, Suppose you have “emp_name” and “age” in your index but you don’t have “country” name of all employees, but you have a csv file where all the “country” and “emp_name” are stored, so in that case you can use that csv file as lookup file to add “country” in your Splunk result where we will use “emp_name” field as a mapping field because that is the common field we have in both the places.
Creating a lookup file in Splunk:
At first we will start with creating a lookup file in Splunk. Suppose we have a csv file which consist two field, “status_code” and “status_information” it’s basically giving us all HTTP response status codes like 402 is for “Payment Required”, 403 is for “Forbidden” and 404 is for “Page Not Found” etc.
Log in to your Splunk instance with your credentials.
Then go to the Settings and click on Lookups
Then click on Lookup table files and New Lookup Table file.
Then it will open the dialog box to upload the lookup file. Fill the all mandatory fields as shown.
Destination app : <app name> Upload a lookup file : <select the file from your system which you want to upload> Destination filename : <name of the lookup file which will be saved as by that name in Splunk>
And Save it.
In this way you can upload any lookup file in Splunk.
Important Commands and functions to interact with lookups in Splunk:
- Inputlookup – To read a lookup file or to see the contents of a lookup file.
Syntax: | inputlookup [append=<bool>] [start=<int>] [max=<int>] [<filename> | <tablename>] [WHERE <search-query>]
NOTE: BOLDS are only required arguments
<filename> or <tablename> - Name of the lookup file which you want to read. This is the only required argument in this command.
| inputlookup status_code.csv
As you know in the previous step we uploaded a lookup file name “status_code.csv”, by using the “inputlookup” command we are viewing the content of that lookup file as simply as you see.
- Lookup: Use to add fields from the lookup file file into your search result.
Syntax: ……| lookup [local=<bool>] [update=<bool>] <lookup-table-name> ( <lookup-field> [AS <event-field>] )… [ OUTPUT | OUTPUTNEW (<lookup-destfield> [AS <event-destfield>] )… ]
NOTE: BOLDS are only required arguments
<lookup-table-name> - Name of the lookup file which you want to read. This is the only required argument in this command. <lookup-field> - Field name of lookup file which will be used to map with event data. <event-field> - Field name of event data which will be used to map with lookup file. <lookup-destfield> - Field names of the lookup file which will add with event search result. One can specify multiple <lookup-destfield>. <event-destfield> - A field in the events. One can specify multiple <event-destfield>.
index=_internal sourcetype=splunkd_ui_access | stats count by status method | lookup status_code.csv status_code as status OUTPUT status_information
In the above query “_internal” is the index and “splunkd_ui_access” is the sourcetype. With the “stats” command we have used two fields “method” and “status” .
Then using lookup command we fetched data from lookup file “status_code.csv” and we used “status_code” (which is same as “status” field in our event data) as a mapping field with “status” then using “OUTPUT” clause we added an extra field from lookup file to our event search result i.e “status_information”.
Now if you check the lookup file then we will notice that status_code 200 is for “OK”, 401 is for “Unauthorized” and 404 is for “Not Found” etc. which is the same as reflected in the above figure.
OUTPUT and OUTPUTNEW clauses:
If you look carefully at the syntax of the lookup command then you will find clauses as OUTPUT and OUTPUTNEW, now what are these for?
These are basically for mentioning the output lookup fields which you want to add with your event search result. If an OUTPUT or OUTPUTNEW clause is not specified, then all of the fields in the lookup table that are not the match field are used as output fields.
Now what are the differences between these two clauses?
|Overwrite the event data with lookup file data||Put the lookup file data if event data is missing. Doesn’t overwrite any event data|
Let’s take an example then.
index=_internal sourcetype=splunkd_ui_access |eval status_information=case(status=200,"FINE",status=204,"NOPE",0=0,NULL) |table status method status_information |dedup status
In the above query “_internal” is the index and “splunkd_ui_access” is the sourcetype. Now here we have created a field called “status_information” using the eval command, which only contains information of status 200 and 204, rest are NULL. Then using table command we provided a tabular view of status, method and status_information fields. Now after that if we use lookup to fetch all the “status_information” from the lookup then what will happen?
With OUTPUT clause:
index=_internal sourcetype=splunkd_ui_access |eval status_information=case(status=200,"FINE",status=204,"NOPE",0=0,NULL) |table status method status_information |dedup status | lookup status_code.csv status_code as status OUTPUT status_information
Because we have used the OUTPUT clause here that’s why indexed data are overwritten by lookup file data. That’s why we are getting the same data as stored in the lookup file.
With OUTPUTNEW clause:
index=_internal sourcetype=splunkd_ui_access | eval status_information=case(status=200,"FINE",status=204,"NOPE",0=0,NULL) | table status method status_information | dedup status | lookup status_code.csv status_code as status OUTPUTNEW status_information
Now you can see the differences, here we have used OUTPUTNEW clause that’s why lookup file data are overwritten by indexed data. For “status_code “ 200 and 204 information was already there in our event (we earlier created using eval command), that’s why by using OUTPUTNEW only missing data will be reflected in the result.
- Outputlookup: With this command we can save any of our search results as a lookup file.
Syntax: …..| outputlookup [append=<bool>] [create_empty=<bool>] [override_if_empty=<bool>] [max=<int>] [key_field=<field>] [createinapp=<bool>] [output_format=<string>] <filename> | <tablename>
NOTE: BOLDS are only required arguments
<filename> - The name of the lookup file by which name you want to save in splunk. <tablename> - The name of the lookup table by which name you want to save in splunk.
index="test_index" sourcetype="csv" | table Rank,Peak,Title,"Worldwide gross",Year | sort - "Worldwide gross" | outputlookup film_grossing.csv
We took a sample data from the index “test_index” and “csv” sourcetype and using table command we created a table with some fields, and we have also used sort command for sorting the value in proper order. Now I want to save that result as a lookup file for that only we can use the outputlookup command.
If you want to cross check it, saved or not either you can type |inputlookup film_grossing.csv or you can go to the Settings > Lookups > Lookup table files, then we will find that in the list.
Hope you have understood this topic on Lookups – Lookup Table Files ( Part – 1 )
Lookups – Lookup Definition and Automatic Lookup ( Part – 2 )
[…] is the continuation of our previous Lookup blog Lookup – Lookup Table Files ( Part – 1) . I will request you to go through that blog before starting this […]
Good one ..keep doing
[…] we will use as a reference field later while comparing with our indexed data. After that using “outputlookup” command we saved that result in a lookup file named […]