Data Model In Splunk (Part-I)
A data model is one of the knowledge objects available in Splunk. It applies an information structure to raw data. Its main purpose is to let end users build pivot tables: many people are not comfortable writing SPL queries but still want to create dashboards and reports from Splunk data, and that is where data models and Pivot come in. Pivot is not the only use case, though. As a Splunk engineer you will find other uses for data models as well.
In this series we will discuss data models in detail and configure a new data model for demonstration, spread across several parts.
Two kinds of root dataset are covered in this series:
- Root Event – the constraint is a simple search without a pipe.
- Root Search – the search query can contain pipes.
NOTE: A data model must have at least one dataset, and a dataset can have multiple child datasets.
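As a rough illustration of the difference, a root event constraint might look like the search below. The index and sourcetype names are assumptions for this demo, not values taken from the sample data.

```spl
index=zomato sourcetype=csv
```

A root search, by contrast, may contain pipes and transforming commands. The sketch below assumes the same hypothetical index and a field named Cuisines:

```spl
index=zomato sourcetype=csv
| stats count by Cuisines
```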
We have some sample data from Zomato, and we will build a data model based on it.
The schematic diagram of our data model looks like this:
We will create a data model named “Zomato” and, under it, a couple of root datasets:
- Continent – This will consist of data based on different countries.
- Cuisines – This one will consist of data based on different cuisines available.
Under Continent we will create a child dataset named “Asia”, which will contain data only for restaurants in Asia.
Create Data Model:
First, we will create a data model.
Go to Settings and click on Data models.
Then click on “New Data Model”, enter the name of the data model, and click on Create.
As soon as you click on Create, you will be redirected to the data model, where we need to add datasets. In this blog we demonstrate only “Root Event”. Click on Add Dataset and then “Root Event”.
Dataset Name: <enter the dataset name>
Constraints: <enter the search query that fetches the data to populate the data model. As mentioned earlier, this is a plain, simple query without a pipe>
Click on Save
And it will look like this.
By default, only the _time, host, source, and sourcetype fields are inherited in a root event dataset.
Our next task is to add fields to the data model. End users will use these fields when they build a pivot table from it.
To do that, click on “Add Field”.
All the usual field extraction methods in Splunk are available here:
Auto-Extracted, Eval Expression, Lookup, Regular Expression, and Geo IP.
1. Auto-Extracted:
We will go through them one by one. Auto-Extracted is the simplest: clicking it opens a dialog listing all the fields already extracted by your search query.
We will add a “City” field from there.
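To check from the search bar that the field made it into the dataset, something like the following sketch can be used. It assumes the data model ID is Zomato, the dataset ID is Continent, and the field was added under the name City. Fields returned by the datamodel command are prefixed with the dataset name.

```spl
| datamodel Zomato Continent search
| stats count by Continent.City
```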
2. Eval expression:
Now we will add a new field using eval. Click on Eval Expression and a dialog box like this opens.
Enter your eval expression, a name for the new field, and the type of the field, then click on Save.
Here we are creating a new field “countrycode” that holds the same value as the existing field “Country Code”.
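For reference, the same thing written as an ordinary search would look roughly like this. The index name is an assumption. Because “Country Code” contains a space, it has to be wrapped in single quotes inside eval.

```spl
index=zomato
| eval countrycode='Country Code'
| table "Country Code" countrycode
```

In the data model dialog, only the right-hand side (`'Country Code'`) goes into the Eval Expression box; the new field name is entered separately.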
3. Lookup:
This root dataset is all about continents, so it needs a field called “Country”. Our index has no such field, but it does have “countrycode”, which holds the country code. End users need a readable country name, so we use a lookup file that maps each country code to its country name.
Lookup Table: <Select the lookup name we just created, it has to be a lookup definition>
Input: <select the common fields between lookup and dataset>
Output: <select the name of the field which we want to add in the data set from the lookup>
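The lookup field configured above behaves roughly like the search-time lookup sketched here. The lookup definition name countrycode_lookup, its column names countrycode and Country, and the index name are all assumptions; substitute whatever your lookup definition actually uses.

```spl
index=zomato
| eval countrycode='Country Code'
| lookup countrycode_lookup countrycode OUTPUT Country
| table countrycode Country
```

Running this first is a quick way to confirm that every country code resolves to a name before wiring the lookup into the data model.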
Next we add another field with an eval expression to segregate the data by continent; it will be used in a later part of the demo. The query is given below:
| eval continent=case(
    Country IN ("India","Indonesia","Phillipines","Singapore","Qatar","Sri Lanka","Turkey","UAE"), "Asia",
    Country IN ("Australia","New Zealand"), "Oceania",
    Country IN ("Brazil"), "South America",
    Country IN ("Canada","United States"), "North America",
    Country IN ("South Africa"), "Africa",
    Country IN ("United Kingdom"), "Europe")
And finally, it will look like this,
We are now done creating the root event dataset. If you want to test whether it works, you can open it in Pivot.
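If you prefer a quick check from the search bar, the pivot command can query the dataset directly. This is a minimal sketch that assumes the data model ID is Zomato, the dataset ID is Continent, and the eval field is named continent; the column label "Restaurants" is just an illustrative name.

```spl
| pivot Zomato Continent count(Continent) AS "Restaurants" SPLITROW continent
```

It should return one row per continent with the number of matching events, which is the same result you would get by building the pivot table in the UI.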
If you don’t know how to use Pivot, don’t worry; we will cover it in later blogs. Stay tuned.
In the next blog we will create a child dataset under this root dataset and also configure a root search dataset (the Cuisines dataset).
The sample data and lookup file used in this blog are given below for practice.
[paste countrycode.csv and zomatodata.xls here]
Hope you all enjoyed this blog, “Data Model In Splunk (Part-I)”. We will be back with the next part of the series; till then, stay tuned.
Happy Splunking!!