Impact of fields.conf on Search Performance in Splunk
In our earlier post, Index-time field extraction, we showed how to extract fields from data coming into Splunk. In this post we look at how that affects Splunk's search performance. We recommend reading our post “index-time field extraction” before going further.
Below is what we did:
On the Indexer:
Step-1: We created an index “demo” to store the data used for testing.
If you are on a non-clustered indexer, you can simply create the index via the Splunk GUI.
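If you would rather define the index in a configuration file than in the GUI, a minimal indexes.conf stanza might look like the sketch below. The paths follow the conventional $SPLUNK_DB layout and are an assumption for illustration, not necessarily the settings used in this test; adjust them for your environment.

```ini
# indexes.conf (sketch) on the indexer
# Paths are assumed defaults, not taken from the original setup.
[demo]
homePath   = $SPLUNK_DB/demo/db
coldPath   = $SPLUNK_DB/demo/colddb
thawedPath = $SPLUNK_DB/demo/thaweddb
```

A restart of the indexer is normally needed before a new index defined this way becomes available.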
On the Heavy Forwarder:
Step-2: We have a file named demo_file.txt under the /tmp directory. It is about 1.4 GB in size, since we need enough data to see a significant performance impact.
Below are the contents of the file (basically, Linux secure logs):
Step-3: We created an inputs.conf (under $SPLUNK_HOME/etc/system/local) to monitor the “demo_file.txt” file.
[monitor:///tmp/demo_file.txt]
index = demo
sourcetype = demo_file
The above stanza tells the Splunk input processor to monitor the file “demo_file.txt” located under the /tmp directory. The attributes “index” and “sourcetype” assign values to the default fields “index” and “sourcetype” that Splunk requires.
Step-4: We configured props.conf (under $SPLUNK_HOME/etc/system/local), as shown in the screenshot below.
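For an index-time extraction, props.conf usually just links the sourcetype to a transform. A minimal sketch is shown here; the class name “ip_address” and the transform name “extract_ip_address” are hypothetical names chosen for illustration, not necessarily the ones we used.

```ini
# props.conf (sketch) on the heavy forwarder
# The stanza name matches the sourcetype assigned in inputs.conf.
[demo_file]
# TRANSFORMS-<class> = <stanza name in transforms.conf>
# Both "ip_address" and "extract_ip_address" are assumed names.
TRANSFORMS-ip_address = extract_ip_address
```

The TRANSFORMS- prefix (as opposed to REPORT- or EXTRACT-) is what makes the extraction happen at index time rather than at search time.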
Step-5: Let’s configure transforms.conf (under $SPLUNK_HOME/etc/system/local). This is where we set the index-time extraction rules.
We are extracting only one field, “IP_Address”, at index time from the data ingested into Splunk.
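A transforms.conf stanza for this kind of index-time extraction typically looks like the sketch below. The stanza name and the regular expression are assumptions for illustration (the stanza name must match whatever props.conf references in its TRANSFORMS-<class> setting); REGEX, FORMAT and WRITE_META are standard transforms.conf attributes.

```ini
# transforms.conf (sketch) on the heavy forwarder
# "extract_ip_address" is a hypothetical stanza name.
[extract_ip_address]
# Assumed pattern: capture the first dotted-quad IPv4 address in the event.
REGEX = (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
# Write the captured value as an indexed field named IP_Address.
FORMAT = IP_Address::$1
# Required so the field is written to the index-time metadata.
WRITE_META = true
```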
To know more about these attributes, click here.
Step-6: On the search head, let’s try using the field “IP_Address” in a basic SPL query (Verbose mode).
Now it’s time for a little job inspection: click the Job button located just below the search bar, towards the right.
As you can clearly see in the screenshot above, the “runDuration” of the query was about 514 seconds.
Now let’s see whether we can get better search performance on this field.
On the Search head:
Step-7: We configured fields.conf (under $SPLUNK_HOME/etc/system/local) to check how it affects the performance of the same query.
[IP_Address]
INDEXED = true
The stanza name, [IP_Address], holds the name of the indexed field. The attribute-value pair INDEXED = true tells Splunk (the search head) that the field is indexed.
Below are the job inspection results:
The above screenshot shows that the “runDuration” of the query was about 478 seconds, roughly 36 seconds (about 7%) faster than the 514 seconds measured without fields.conf. That is a worthwhile improvement, given that our data set is not very large.
Our conclusion: it’s always better to include a fields.conf entry for index-time extracted fields (wherever possible), and the gain can be considerably larger for searches that use those indexed fields on bigger data sets.
That’s all; we hope you enjoyed the post.