In general, Splunk extracts fields at search time. Sometimes, though, the data arrives unstructured from its source, or we have a limit on indexing capacity and only want to work with the extracted fields. In these cases, extracting fields at index time makes the job easier, and it can also speed up searches that use those fields.
In this article we will learn how to extract fields at index time.
Splunk can extract the following fields at index time:
- Default Fields
- Custom Fields
- File header Fields
Let's start with custom fields at index time.
As an example, we will use a log file named machinelog.log stored in the /tmp directory.
Step 1:
First, log in to the Universal Forwarder server and go to the $SPLUNK_HOME/etc/system/local directory.
# cd /opt/splunkforwarder/etc/system/local
Now configure inputs.conf
# vi inputs.conf
Add the following lines:
[monitor:///tmp/machinelog.log]
index=test
sourcetype=machinelog
Then save the file: press Esc, type :wq, and press Enter.
Step 2:
Next, deploy the configuration files on the indexer or heavy forwarder.
The configuration files are:
- props.conf
- transforms.conf
Let's start by configuring props.conf
# vi props.conf
Write the following lines:
[<sourcetype_mentioned_in_inputs.conf>]
TRANSFORMS-<class> = <unique_stanza_name_given_in_transforms.conf>
Example:
[machinelog]
TRANSFORMS-machine = machine-error
Step 3:
To extract fields from the data, we need to configure transforms.conf and write a regular expression in it.
# vi transforms.conf
Follow this format when configuring transforms.conf:
[<unique_transform_stanza_name>]
REGEX = <regular_expression>
FORMAT = <your_custom_field_name>=$1 <field_name2>=$2
DEST_KEY = <KEY>
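Example:
In our case, transforms.conf looks like this. Because DEST_KEY is _raw, the FORMAT output is written to _raw, so the indexed event text becomes these key=value pairs.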
[machine-error]
REGEX = (.*?)\s+(\w+)\s-\s\[(.*?)\:(.*?)\]\s\"(.*?)\s+(.*?)\s(.*?)\"\s(.*?)\s(.*?)\s+\"(.*?)\"\s\"(.*?)\"
FORMAT = IP=$1 PATH=$2 DATE=$3 TIME=$4 METHOD=$5 site=$6 HTTP=$7 status=$8 statuscode=$9 refer=$10 browser=$11
DEST_KEY = _raw
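Before deploying the stanza, it can help to try the regular expression on a sample event. The following is a minimal sketch in Python, not something Splunk itself runs: it assumes Python's re module behaves like Splunk's regex engine for this particular pattern, the SAMPLE line is invented for illustration (replace it with a real line from machinelog.log), and apply_transform is a hypothetical helper that only approximates how $1 to $11 in FORMAT are filled in.

```python
import re

# REGEX and FORMAT copied from the [machine-error] stanza above.
REGEX = r'(.*?)\s+(\w+)\s-\s\[(.*?)\:(.*?)\]\s\"(.*?)\s+(.*?)\s(.*?)\"\s(.*?)\s(.*?)\s+\"(.*?)\"\s\"(.*?)\"'
FORMAT = ("IP=$1 PATH=$2 DATE=$3 TIME=$4 METHOD=$5 site=$6 HTTP=$7 "
          "status=$8 statuscode=$9 refer=$10 browser=$11")

# Hypothetical event, invented for this sketch only.
SAMPLE = ('10.0.0.5 web01 - [12/Mar/2021:10:15:32] "GET /index.html HTTP/1.1" '
          '200 512 "http://example.com/" "Mozilla/5.0"')


def apply_transform(event, regex=REGEX, fmt=FORMAT):
    """Return the rewritten event text, or None when the regex does not match."""
    match = re.search(regex, event)
    if match is None:
        return None
    # Replace each $N with capture group N; \d+ reads $10 and $11 as whole numbers.
    return re.sub(r"\$(\d+)", lambda ref: match.group(int(ref.group(1))), fmt)


if __name__ == "__main__":
    print(apply_transform(SAMPLE))
```

If the function returns None for a real log line, the regular expression needs adjusting before it goes into transforms.conf.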
Step 4:
Now open the Splunk web interface of the indexer and go to Settings => Indexes.
Create a new index with the same name as the one given in inputs.conf on the Universal Forwarder (test in our example).
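Since the setup only works when the names in the three files and the index line up, a quick consistency check before restarting can catch typos. This is a rough sketch under stated assumptions: it uses Python's configparser, which is not a full Splunk .conf parser but handles these small examples; the file contents are pasted in as strings rather than read from disk; and find_problems and existing_indexes are hypothetical names, with the set of indexes supplied by hand instead of queried from the indexer.

```python
import configparser

# Contents mirror the examples in this article; in practice, read the real files.
INPUTS_CONF = """
[monitor:///tmp/machinelog.log]
index=test
sourcetype=machinelog
"""

PROPS_CONF = """
[machinelog]
TRANSFORMS-machine = machine-error
"""

# Trimmed to the stanza name, which is all this check reads.
TRANSFORMS_CONF = """
[machine-error]
DEST_KEY = _raw
"""


def load(text):
    """Parse .conf text without interpolation and without lowercasing keys."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case, e.g. TRANSFORMS-machine
    parser.read_string(text)
    return parser


def find_problems(inputs_text, props_text, transforms_text, existing_indexes):
    """Return a list of human-readable mismatches between the three files."""
    inputs, props, transforms = load(inputs_text), load(props_text), load(transforms_text)
    problems = []
    for stanza in inputs.sections():
        if not stanza.startswith("monitor://"):
            continue
        index = inputs[stanza].get("index")
        sourcetype = inputs[stanza].get("sourcetype")
        if index not in existing_indexes:
            problems.append(f"index '{index}' has not been created on the indexer")
        if sourcetype not in props:
            problems.append(f"props.conf has no [{sourcetype}] stanza")
            continue
        for key, value in props[sourcetype].items():
            if not key.startswith("TRANSFORMS-"):
                continue
            # A TRANSFORMS-<class> value may list several stanza names.
            for name in (part.strip() for part in value.split(",")):
                if name not in transforms:
                    problems.append(f"transforms.conf has no [{name}] stanza")
    return problems


if __name__ == "__main__":
    print(find_problems(INPUTS_CONF, PROPS_CONF, TRANSFORMS_CONF, {"test"}))
```

An empty list means the sourcetype, transform stanza name, and index name agree across the files.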
Step 5:
Now we have to restart the Splunk components in this order: indexer (IDX), then heavy forwarder (HF), and finally Universal Forwarder (UF).
# cd /opt/splunk/bin
# ./splunk restart
After Splunk restarts successfully, we can see the extracted fields on the Search Head.
Hope you have learned how to do index-time field extraction.
Happy Splunking !!