Usage of Splunk Commands: DBINSPECT

Usage of Splunk Commands: DBINSPECT

Hi Guys,
We have come with a new and very much useful command of Splunk that is DBINSPECT. Before going to explanation of this blog lets discuss about the buckets briefly. We all know that Splunk stores the data inside the buckets based on some criteria. Basically, whenever data comes in to Splunk it creates two types of file one is rawdata and another is tsidx file. Both of these files reside under the buckets. Also, Splunk has the capability to compress the data. That’s why it took less space in the indexer rather than the actual file size.

So, in this blog we are going to discuss about those things. As you know that Splunk Administrators will have access to the CLI to the indexers. So they can easily login to the indexer’s CLI to check the buckets information and all. But if you are a Splunk Developer probably you will have the access to the Search Heads GUI only. So from the Search Head GUI how you can see all those information about the Size, Bucket state (hot, warm etc. ) , raw data size etc.

Using the help of “dbinspect” command you can find all these information from the search head GUI. So that you can see the space utilization of the indexers.

Syntax :

| dbinspect index=<index-name>

Example 1:
Use the below command to know all the information about the non-internal index.

| dbinspect index=*

Explanation:
It will give you the information about the indexes and the buckets. It will give the information about below mentioned fields.

1. eventCount
2. hostCount
3. sourceCount
4. sourceTypeCount
5. rawSize
6. path
7. state
8. sizeOnDiskMB
9. etc

From these information you will be able to know about the number of events, number of hosts, actual data size and how much disk space it is consuming in the indexer, etc.

Example 2:

| dbinspect index=wineventlog
| table path,eventCount,rawSize,sizeOnDiskMB,state,index
| rex field=path ".*\\\(?<BucketID>\S+)$"
| fields – path

Explanation :
Here we have used index=wineventlog with the “dbinspect” command to see the information about the “wineventlog” index.  Also it is showing event count per bucket, rawSize and sizeDiskOnMB

As you can see that we are having 3 warm buckets and 1 hot bucket. Also it is showing how much disk space ( sizeOnDiskMB) it is consuming in the indexer. “rawSize” field indicates what is the actual size of the log before data get indexed in the indexer.

Calculate Data Compression Percentage In Splunk
At the starting of this blog we told that Splunk Compress the data before writing on the Disk. By default Splunk can compress data up to approximately 50% ( 15% rawdata + 35 % .tsidx file ) of the actual log file. Now using a query, we can find out how much Splunk is compressing before storing the data into the Indexer. Please find the below query to see the compression percentage :

| dbinspect index=wineventlog
| fields rawSize,sizeOnDiskMB,splunk_server,index
| eval sizeOnDiskMB=round(sizeOnDiskMB,0), rawSize=round(rawSize,0)
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by splunk_server,index
| eval rawTotalinMB=round(((rawTotal/1024)/1024),0)
| fields - rawTotal
| eval compression=round((100 - ((diskTotalinMB / rawTotalinMB) * 100)),2)." %"

Explanation :
Here using dbinspect command we are taking the information about the “wineventlog” index. We have taken rawSize, SizeOnDiskMB, splunk_server, index fields .

rawSize >>Size of the actual log file before data get indexed in the indexer. 
It shows the value in bytes
.
SizeOnDiskMB >> Space taking in indexer after compression.
splunk_server >> Indexer Name
index >> Index name

Then using stats command we have taken the summation of rawSize and SizeOnDiskMB by splunk_server and index. After that we have converted the summation result of the rawSize in MB because it was giving the summation of the actual data size in bytes. Now “diskTotalinMB” field is containing the value of total disk space occupying in the indexer and “rawTotalinMB” field is containing the value of total data size before the data get indexed in the indexer. After that we have used eval command to calculate the compression percentage [ 100 – {(Space taken in the indexer / Actual data size ) * 100 }]. As you can see that actual data size is 55 MB and it is taking 19 MB in the indexer. So the data is compressed at the rate of 65 %.

Hope you have understood the topic : Usage of Splunk Commands: DBINSPECT

Happy Splunking !!!

You can also know about :  Usage of REX attribute : max_match

2 comments

    1. Yes this is valid for clustered indexers also. But there you will get only the information about the primary buckets.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.