Index Time Field Extraction in SPLUNK

In general, we extract fields at search time. Sometimes, however, the data arrives unstructured, or indexing capacity is limited and we only want to work with the extracted fields. In these cases, extracting fields at index time makes the job easier, and it can improve search performance as well.

In this article, we will learn how to extract fields at index time.

Splunk can extract the following fields at index time:

  • Default Fields
  • Custom Fields
  • File header Fields

Let's start with custom fields at index time.

For example, suppose machinelog.log is stored in the /tmp directory.

Step 1:

First, log in to the Universal Forwarder server and go to the $SPLUNK_HOME/etc/system/local directory.

# cd /opt/splunkforwarder/etc/system/local

Now configure inputs.conf 

# vi inputs.conf

Add the following lines:

[monitor:///tmp/machinelog.log]
index=test
sourcetype=machinelog

Then save the file: press Esc, type :wq, and press Enter.

Step 2:

Next, deploy the configuration files on the indexer or heavy forwarder.

The configuration files are:

  • props.conf
  • transforms.conf

So let's start by configuring props.conf.

# vi props.conf

Write the following lines:

[<sourcetype_mentioned_in_inputs.conf>]
TRANSFORMS-<class> = <unique_stanza_name_given_in_transforms.conf>

Example:

[machinelog]
TRANSFORMS-machine = machine-error
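
The name on the right-hand side of TRANSFORMS-<class> has to match a stanza name in transforms.conf (configured in Step 3), and a typo there is easy to miss. As a rough sanity check, the following Python sketch parses both files and reports any transform that props.conf references but transforms.conf does not define. It is not part of Splunk: the helper names load_conf and missing_transforms are made up for illustration, the file contents are inlined as assumed copies of the real files, and Python's configparser only approximates Splunk's own .conf parsing.

```python
import configparser

# Assumed copies of the two files, inlined so the sketch runs anywhere.
PROPS_CONF = """
[machinelog]
TRANSFORMS-machine = machine-error
"""

TRANSFORMS_CONF = """
[machine-error]
DEST_KEY = _raw
"""


def load_conf(text):
    """Parse .conf-style text (hypothetical helper, not a Splunk API)."""
    parser = configparser.ConfigParser(
        interpolation=None,   # leave $1, %, etc. untouched
        delimiters=("=",),    # only "=" separates key and value
        strict=False,
    )
    parser.optionxform = str  # keep the original case of keys
    parser.read_string(text)
    return parser


def missing_transforms(props_text, transforms_text):
    """Return (sourcetype, key, name) for every undefined transform."""
    props = load_conf(props_text)
    transforms = load_conf(transforms_text)
    missing = []
    for sourcetype in props.sections():
        for key, value in props[sourcetype].items():
            if not key.startswith("TRANSFORMS-"):
                continue
            # A TRANSFORMS- line may list several comma-separated names.
            for name in value.split(","):
                name = name.strip()
                if name and name not in transforms:
                    missing.append((sourcetype, key, name))
    return missing


print(missing_transforms(PROPS_CONF, TRANSFORMS_CONF))  # prints []
```

An empty list means every referenced transform has a matching stanza. If the stanza were misspelled, for example as [machine_error], the list would contain the tuple ("machinelog", "TRANSFORMS-machine", "machine-error").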

Step 3:

To extract fields from the data, we need to configure transforms.conf and write a regular expression inside it.

# vi transforms.conf

Always follow this format when configuring transforms.conf:

[<unique_transform_stanza_name>]
REGEX = <regular_expression>
FORMAT = <your_custom_field_name>=$1 <field_name2>=$2
DEST_KEY = <KEY>

Example:

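In our case, transforms.conf looks like the following. Note that with DEST_KEY = _raw, Splunk overwrites the raw event text with the FORMAT string, so each indexed event becomes a line of key=value pairs.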

[machine-error]
REGEX =  (.*?)\s+(\w+)\s-\s\[(.*?)\:(.*?)\]\s\"(.*?)\s+(.*?)\s(.*?)\"\s(.*?)\s(.*?)\s+\"(.*?)\"\s\"(.*?)\"
FORMAT = IP=$1 PATH=$2 DATE=$3 TIME=$4 METHOD=$5 site=$6 HTTP=$7 status=$8 statuscode=$9 refer=$10 browser=$11
DEST_KEY = _raw
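
Before deploying a long regular expression like this one, it can help to try it on a sample event outside Splunk. The Python sketch below applies the same REGEX and FORMAT to one log line and prints the text that would replace the event. Treat it as an approximation under stated assumptions: the sample event is invented (the article does not show the contents of machinelog.log), the apply_transform helper is hypothetical, the leading space after "REGEX =" is dropped, and Python's re module is not the same engine as Splunk's PCRE, although this particular pattern uses only syntax common to both.

```python
import re

# REGEX and FORMAT copied from the [machine-error] stanza above.
REGEX = (
    r'(.*?)\s+(\w+)\s-\s\[(.*?)\:(.*?)\]\s\"(.*?)\s+(.*?)\s(.*?)\"'
    r'\s(.*?)\s(.*?)\s+\"(.*?)\"\s\"(.*?)\"'
)
FORMAT = (
    "IP=$1 PATH=$2 DATE=$3 TIME=$4 METHOD=$5 site=$6 HTTP=$7 "
    "status=$8 statuscode=$9 refer=$10 browser=$11"
)

# Invented sample event; replace it with a real line from your log.
SAMPLE_EVENT = (
    '10.0.0.1 admin - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/" "Mozilla/5.0"'
)


def apply_transform(event, regex=REGEX, fmt=FORMAT):
    """Return fmt with each $N replaced by capture group N, or None."""
    match = re.search(regex, event)
    if match is None:
        return None  # the transform would leave this event unchanged
    # \d+ is greedy, so $10 and $11 are read as groups 10 and 11.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), fmt)


print(apply_transform(SAMPLE_EVENT))
```

For the sample line, the output begins with IP=10.0.0.1 PATH=admin DATE=10/Oct/2023. A result of None means the regular expression did not match and should be adjusted before it goes into transforms.conf.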

Step 4:

Now open the Splunk Web interface on the indexer and go to Settings => Indexes.

Create a new index with the same name as the one given in inputs.conf on the Universal Forwarder (test in our example).

Step 5:

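Now restart the Splunk components in this order: indexer (IDX), heavy forwarder (HF), and finally Universal Forwarder (UF). Run the restart from each component's own bin directory; on the UF in this example that is /opt/splunkforwarder/bin rather than /opt/splunk/bin.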

# cd /opt/splunk/bin
# ./splunk restart

After Splunk restarts successfully, we can see the extracted fields on the search head.

Hope you have learned how to do index-time field extraction.

Happy Splunking !! 
