
In general, we extract fields at search time. But sometimes the data arrives unstructured, or we are constrained by an indexing capacity limit and only want to work with the extracted fields. In these cases, extracting fields at index time makes the job easier, and it can improve search performance as well.

In this article, we will learn how to extract fields at index time.

Splunk can extract the following fields at index time:

  • Default Fields
  • Custom Fields
  • File header Fields

Let's start with custom fields at index time.

For this example, assume a log file named machinelog.log stored in the /tmp directory.

Step 1:

First, log in to the Universal Forwarder server and go to the $SPLUNK_HOME/etc/system/local directory.

# cd /opt/splunkforwarder/etc/system/local

Now configure inputs.conf 

# vi inputs.conf

Add the following lines:

[monitor:///tmp/machinelog.log]
index=test
sourcetype=machinelog

Then save the file: press Esc, type :wq, and press Enter.
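If you would rather not open an editor, the same stanza can be appended from the shell. This is only a minimal sketch: CONF_DIR is a hypothetical variable that defaults to a scratch directory so the commands can be tried safely, and on a real forwarder it would point at $SPLUNK_HOME/etc/system/local.

```shell
#!/bin/sh
set -eu

# Assumption: CONF_DIR is a scratch directory for a safe trial run.
# On a real forwarder it would be $SPLUNK_HOME/etc/system/local.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Append the monitor stanza from Step 1 without opening an editor.
cat >> "$CONF_DIR/inputs.conf" <<'EOF'
[monitor:///tmp/machinelog.log]
index=test
sourcetype=machinelog
EOF

# Show what was written so it can be reviewed before restarting.
cat "$CONF_DIR/inputs.conf"
```

Because the stanza is appended, running this twice would duplicate it, so check the file before re-running.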

Step 2:

Next, deploy the configuration files on the indexer or heavy forwarder.

The configuration files are:

  • props.conf
  • transforms.conf

Let's start by configuring props.conf.

# vi props.conf

Write the following lines:

[<sourcetype_mentioned_in_inputs.conf>]
TRANSFORMS-<class> = <unique_stanza_name_given_in_transforms.conf>

Example:

[machinelog]
TRANSFORMS-machine = machine-error
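A common slip is a transform name in props.conf that does not match any stanza in transforms.conf (the file configured in Step 3). The check below is a rough sketch, not a Splunk tool: it writes minimal copies of both files into a hypothetical scratch directory (CONF_DIR) and confirms that each name referenced by a TRANSFORMS-<class> line has a matching stanza.

```shell
#!/bin/sh
set -eu

# Assumption: a scratch directory with minimal copies of both files.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
printf '[machinelog]\nTRANSFORMS-machine = machine-error\n' > "$CONF_DIR/props.conf"
printf '[machine-error]\n' > "$CONF_DIR/transforms.conf"

# Collect every transform name referenced by a TRANSFORMS-<class> line.
# Names may be comma-separated, so split on commas and trim spaces.
names=$(sed -n 's/^TRANSFORMS-[^=]*=[[:space:]]*//p' "$CONF_DIR/props.conf" \
  | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')

# Report whether each name has a matching [stanza] in transforms.conf.
missing=0
for name in $names; do
  if grep -qxF "[$name]" "$CONF_DIR/transforms.conf"; then
    echo "ok: [$name] found"
  else
    echo "missing: [$name]"
    missing=$((missing + 1))
  fi
done
```

On a real instance, `splunk cmd btool props list machinelog --debug` shows the merged props.conf settings for the sourcetype and which file each one comes from.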

Step 3:

To extract fields from the data, we need to configure transforms.conf and write a regular expression inside it.

# vi transforms.conf

A transforms.conf stanza follows this format:

[<unique_transform_stanza_name>]
REGEX = <regular_expression>
FORMAT = <your_custom_field_name>=$1 <field_name2>=$2
DEST_KEY = <KEY>

Example:

In our case, transforms.conf looks like this. Because DEST_KEY is _raw, Splunk replaces the original event text with the FORMAT output.

[machine-error]
REGEX =  (.*?)\s+(\w+)\s-\s\[(.*?)\:(.*?)\]\s\"(.*?)\s+(.*?)\s(.*?)\"\s(.*?)\s(.*?)\s+\"(.*?)\"\s\"(.*?)\"
FORMAT = IP=$1 PATH=$2 DATE=$3 TIME=$4 METHOD=$5 site=$6 HTTP=$7 status=$8 statuscode=$9 refer=$10 browser=$11
DEST_KEY = _raw

Step 4:

Now open the Splunk Web interface of the indexer and go to Settings > Indexes.

Create a new index and give it the same name as the index set in inputs.conf on the Universal Forwarder (test in our example).
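The index can also be created from the command line on the indexer with the splunk add index command. The sketch below assumes Splunk is installed under /opt/splunk (the path used in Step 5). SPLUNK_BIN and INDEX_NAME are illustrative variable names, and the script only prints the command when no Splunk binary is found.

```shell
#!/bin/sh
# Assumption: the indexer lives under /opt/splunk, as in Step 5.
SPLUNK_BIN="${SPLUNK_HOME:-/opt/splunk}/bin/splunk"
INDEX_NAME="test"   # must match index= in inputs.conf

if [ -x "$SPLUNK_BIN" ]; then
  # Creates the index; Splunk asks for admin credentials if needed.
  "$SPLUNK_BIN" add index "$INDEX_NAME"
else
  # No Splunk binary here, so only show the command that would run.
  echo "would run: $SPLUNK_BIN add index $INDEX_NAME"
fi
```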


Step 5:

Now we have to restart the Splunk components in this order: indexer (IDX), heavy forwarder (HF), and finally Universal Forwarder (UF).

# cd /opt/splunk/bin
# ./splunk restart

After Splunk restarts successfully, we can see the extracted fields on the search head.

Hope you have learned how to do index-time field extraction.
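One thing to keep in mind: because the transform in Step 3 rewrites _raw, the fields seen on the search head most likely come from Splunk's automatic key=value extraction over the rewritten event. Splunk's documentation also describes a different route that leaves _raw untouched and writes true indexed fields, using WRITE_META = true with a field::value FORMAT, plus a fields.conf entry on the search head. The sketch below is that variant, not the method used in this article. The stanza name machine-error-indexed and the shortened two-field regex are hypothetical, and the files are written to a scratch directory.

```shell
#!/bin/sh
set -eu

# Assumption: scratch directory; nothing here touches a live Splunk instance.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Variant transform that leaves _raw untouched and writes indexed fields.
# Hypothetical stanza name and a shortened regex capturing only two fields.
cat > "$CONF_DIR/transforms.conf" <<'EOF'
[machine-error-indexed]
REGEX = ^(\S+)\s+(\w+)\s-\s
FORMAT = IP::$1 PATH::$2
WRITE_META = true
EOF

# On the search head, each indexed field is declared in fields.conf.
cat > "$CONF_DIR/fields.conf" <<'EOF'
[IP]
INDEXED = true

[PATH]
INDEXED = true
EOF

# Print both fragments for review.
cat "$CONF_DIR/transforms.conf" "$CONF_DIR/fields.conf"
```

To use it, props.conf would reference the new stanza name in its TRANSFORMS-<class> line. Indexed fields increase index size, so this variant suits a small number of frequently searched fields.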

Happy Splunking !! 
