Interview Question

  • Which features are not available in the free version of Splunk?

The following features are not available in Splunk Free:

  1. Distributed searching
  2. Forwarding in TCP/HTTP formats (forwarding to other Splunk instances is still possible)
  3. Scheduled searches
  4. Access controls
  • What are the three versions of Splunk?

Splunk is available in three versions: 1) Splunk Light, 2) Splunk Enterprise, and 3) Splunk Cloud.

  1. Splunk Light: a lightweight version that lets you search, report on, and edit your log data. It has limited functionality compared with the other versions.
  2. Splunk Enterprise: the edition used by many IT organizations. It gives you all the features of Splunk.
  3. Splunk Cloud: a SaaS (Software as a Service) offering. It provides almost the same features as the Enterprise version, including APIs, SDKs, and apps.
  • Distinguish between Splunk apps and add-ons

Splunk apps and add-ons differ in one main respect:

Splunk add-ons contain only built-in configurations for getting data in from different sources.

Splunk apps contain built-in reports, configurations, and dashboards for visualizing your data.

  • What are the components of Splunk?

The components of Splunk are listed below:

  1. Splunk Forwarders
  2. Indexer
  3. Search Heads
  4. Deployment Server
  5. License Master
  • What is Splunk Forwarder? What are the types of Splunk Forwarders?

A forwarder is a Splunk instance that sends data to another Splunk instance, typically an indexer. There are two types of Splunk forwarders:

Universal Forwarder (UF): a lightweight Splunk instance that is installed on the application server to collect data. Its only function is to forward that data to another Splunk instance.

Heavyweight Forwarder (HWF): a full Splunk instance that also forwards data from one end to the other, but with some advanced functionality such as parsing and filtering.

  • Define deployment server.

A deployment server acts as a centralised configuration manager for all deployment clients (forwarders, indexers, search heads). It is used to update or deploy any configuration on those clients.

  • What is the use of License Master in Splunk?

The license master in Splunk is the instance responsible for monitoring that the right amount of data is indexed (according to the license volume you have bought). A Splunk license is based on the volume of data that comes into the platform within a day (a 24-hour window).

  • What do Splunk Licenses specify?

Splunk licenses specify how much data we can index per day.

  • How does Splunk determine 1 day, from a licensing perspective?

In terms of licensing, for Splunk, 1 day is from midnight to midnight on the clock of the license master.

  • How are Forwarder Licenses purchased?

They are included with Splunk, so there is no need to purchase them separately.

  • What are the types of Splunk Licenses?
  1. Enterprise license
  2. Free license
  3. Forwarder license
  4. Beta license
  5. Sales Trial license
  • Explain ‘license violation’ from Splunk perspective.

As discussed in the previous questions, the license master monitors how much data is indexed per day. If you exceed the daily limit, you get a license warning.

The license warning persists for 14 days. With a commercial license (Enterprise license) you can have 5 license warnings within a rolling 30-day window, after which you are in license violation and search results and reports stop triggering on your indexers. With the free version, only 3 warnings are allowed.
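
The rolling-window rule can be illustrated with a small Python sketch. It is only a model of the counting logic, not Splunk code: the function names are made up for illustration, and it assumes that reaching the warning limit inside the window counts as a violation.

```python
from datetime import date, timedelta

def warnings_in_window(warning_days, today, window_days=30):
    """Count distinct warning days in the rolling window that ends on `today`."""
    start = today - timedelta(days=window_days - 1)
    return sum(1 for day in set(warning_days) if start <= day <= today)

def in_violation(warning_days, today, max_warnings=5):
    """Assumption: hitting `max_warnings` inside the window is a violation."""
    return warnings_in_window(warning_days, today) >= max_warnings

# Hypothetical warning dates for an Enterprise license (limit of 5).
today = date(2024, 1, 31)
warnings = [date(2024, 1, d) for d in (2, 5, 9, 20, 31)]
violated = in_violation(warnings, today)
```

For the free license, the same check would be called with `max_warnings=3`.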

  • What if the License Master is unreachable?

If the license master is unreachable, the data coming in to the indexer is not affected. Data continues to flow into your Splunk deployment and the indexers continue to index it as usual. Search, however, is blocked once the license master has been unreachable for 72 hours.

You will also see a warning message at the top of your search head or web UI saying that you have exceeded the indexing volume and that you need either to reduce the amount of incoming data or to buy a higher-capacity license.

  • How would you handle/troubleshoot Splunk License Violation Warning?

A license violation warning means that Splunk has indexed more data than the purchased license allows. In that case, take the following steps:

  1. Identify which index or source type has recently received more data than its usual daily volume. To do that, check the available quota of each pool on the Splunk license master and identify the pool in which the license violation occurred.
  2. Once you have identified the pool that is receiving more data, identify the source types that are receiving more data than usual.
  3. Once the source type is identified, find the source machine that is sending the large number of logs and troubleshoot it accordingly.
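
The drill-down from pool to source type to host can be sketched in Python as below. This is an illustration of the reasoning only: the records and their field names (`pool`, `sourcetype`, `host`, `bytes`) are hypothetical and do not reproduce the format of any Splunk log.

```python
from collections import defaultdict

def top_consumers(records, key, n=3):
    """Sum indexed bytes per value of `key` and return the n largest."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]

# Hypothetical usage records; the field names are illustrative only.
usage = [
    {"pool": "prod", "sourcetype": "access_log", "host": "web01", "bytes": 700},
    {"pool": "prod", "sourcetype": "access_log", "host": "web02", "bytes": 100},
    {"pool": "prod", "sourcetype": "syslog", "host": "db01", "bytes": 50},
    {"pool": "test", "sourcetype": "syslog", "host": "qa01", "bytes": 20},
]

# Step 1: busiest pool. Step 2: busiest source type in it. Step 3: busiest host.
busiest_pool = top_consumers(usage, "pool", n=1)[0][0]
in_pool = [r for r in usage if r["pool"] == busiest_pool]
busiest_sourcetype = top_consumers(in_pool, "sourcetype", n=1)[0][0]
in_sourcetype = [r for r in in_pool if r["sourcetype"] == busiest_sourcetype]
busiest_host = top_consumers(in_sourcetype, "host", n=1)[0][0]
```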
  • What are the unique benefits of getting data into a Splunk instance via Forwarders?

The main benefits of getting data into Splunk via forwarders are bandwidth throttling, a TCP connection, and an encrypted SSL connection for transferring data from a forwarder to an indexer.

The data forwarded to the indexers is also load balanced by default, so even if one indexer is down because of a network outage or maintenance, the data can be routed to another indexer instance in a very short time. The forwarder also caches events locally before forwarding them, which creates a temporary backup of the data.

  • Explain file precedence in Splunk.

File precedence is an important aspect of troubleshooting in Splunk for administrators, developers, and architects alike. All configurations in Splunk are written in plain-text .conf files. Multiple copies of each of these files can exist in different directories, so it is important to know which copy takes effect when a Splunk instance is running or restarted.

File precedence is an important concept to understand for a number of reasons:

  1. To be able to plan Splunk upgrades
  2. To be able to plan app upgrades
  3. To be able to provide different data inputs
  4. To distribute configurations to your Splunk deployments

To determine the priority among copies of a configuration file, Splunk software first determines the directory scheme. The directory scheme is either a) global or b) app/user.

When the context is global (that is, where there’s no app/user context), directory priority descends in this order:

  • System local directory — highest priority
  • App local directories
  • App default directories
  • System default directory — lowest priority

When the context is app/user, directory priority descends from user to app to system:

  • User directories for current user — highest priority
  • App directories for currently running app (local, followed by default)
  • App directories for all other apps (local, followed by default) — for exported settings only
  • System directories (local, followed by default) — lowest priority
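
As a rough illustration of the global-context order, the Python sketch below merges copies of a configuration file so that a setting from a higher-priority directory overrides the same setting from a lower-priority one. It is a simplified model, not Splunk code: the layer names and the `resolve` function are made up for the example, and ordering among multiple apps is ignored.

```python
# Highest priority first, mirroring the global-context list.
GLOBAL_ORDER = ["system/local", "app/local", "app/default", "system/default"]

def resolve(layers, order=GLOBAL_ORDER):
    """Merge stanzas setting by setting; higher-priority layers win."""
    merged = {}
    for name in reversed(order):  # apply lowest priority first
        for stanza, settings in layers.get(name, {}).items():
            merged.setdefault(stanza, {}).update(settings)
    return merged

# Hypothetical copies of the same stanza in three directories.
layers = {
    "system/default": {"monitor:///var/log": {"index": "main", "disabled": "false"}},
    "app/local": {"monitor:///var/log": {"index": "web"}},
    "system/local": {"monitor:///var/log": {"disabled": "true"}},
}
effective = resolve(layers)
```

Here the effective stanza takes `index` from the app local copy and `disabled` from the system local copy, because each setting is resolved independently.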
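  • What is the dispatch directory in Splunk?

The dispatch directory ($SPLUNK_HOME/var/run/splunk/dispatch) contains a subdirectory for each search that is running or has completed, which stores the status and results of that search.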

  • Explain how data ages in Splunk?

Data coming in to the indexer is stored in directories called buckets. A bucket moves through several stages as data ages: hot, warm, cold, frozen and thawed. Over time, buckets ‘roll’ from one stage to the next stage.

When data is first indexed, it goes into a hot bucket. Hot buckets are searchable and are actively being written to. An index can have several hot buckets open at a time.

When certain conditions occur (for example, the hot bucket reaches a certain size or splunkd gets restarted), the hot bucket becomes a warm bucket (“rolls to warm”), and a new hot bucket is created in its place. Warm buckets are searchable, but are not actively written to. There can be many warm buckets.

Once further conditions are met (for example, the index reaches some maximum number of warm buckets), the indexer begins to roll the warm buckets to cold based on their age. It always selects the oldest warm bucket to roll to cold. Buckets continue to roll to cold as they age in this manner.

After a set period of time, cold buckets roll to frozen, at which point they are either archived or deleted.

The bucket aging policy, which determines when a bucket moves from one stage to the next, can be modified by editing the attributes in indexes.conf.
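
The rolling behaviour described above can be modelled with a toy Python class. This is a sketch under stated assumptions, not Splunk's implementation: `max_warm` and `frozen_after` are hypothetical stand-ins for the kind of limits configured in indexes.conf, buckets are assumed to arrive in creation order, and freezing is reduced to moving the bucket into a list.

```python
from collections import deque

class ToyIndex:
    """Toy model of bucket aging; not Splunk's actual implementation."""

    def __init__(self, max_warm, frozen_after):
        self.max_warm = max_warm          # cap on warm buckets (hypothetical)
        self.frozen_after = frozen_after  # age at which cold buckets freeze
        self.warm = deque()               # oldest bucket on the left
        self.cold = deque()
        self.frozen = []

    def roll_hot_to_warm(self, bucket_id, created):
        """A hot bucket rolls to warm; the oldest warm rolls to cold if over the cap."""
        self.warm.append((bucket_id, created))
        while len(self.warm) > self.max_warm:
            self.cold.append(self.warm.popleft())

    def age(self, now):
        """Roll cold buckets that are old enough to frozen."""
        while self.cold and now - self.cold[0][1] >= self.frozen_after:
            self.frozen.append(self.cold.popleft())

index = ToyIndex(max_warm=2, frozen_after=100)
for bucket_id, created in [("b1", 0), ("b2", 10), ("b3", 20)]:
    index.roll_hot_to_warm(bucket_id, created)
index.age(now=100)  # b1 is now 100 time units old and freezes
```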

  • What is a null queue?

A null queue is a mechanism for filtering out unwanted incoming data sent by Splunk forwarders. Events routed to the null queue are discarded rather than indexed.

  • What is the difference between Search time and Index time field extractions?

As the name suggests, search time field extraction refers to fields extracted while performing searches, whereas fields extracted when the data comes to the indexer are referred to as index time field extractions. You can set up index time field extraction either at the forwarder level or at the indexer level.

Another difference is that fields extracted at search time are not part of the metadata, so they do not consume disk space, whereas fields extracted at index time are part of the metadata and therefore consume disk space.

  • What is Search Factor (SF) & Replication Factor (RF)?

SF and RF are terms related to Splunk clustered environments, that is, search head clustering and indexer clustering.

In indexer clustering, the replication factor is the number of copies of raw data that the cluster maintains. In search head clustering, it is the minimum number of copies of each search artifact that the cluster maintains.

The search factor determines the number of searchable copies of data maintained by the indexer cluster. The default value of the search factor is 2. A search head cluster has only a replication factor, whereas an indexer cluster has both a search factor and a replication factor.

[ Note that the search factor must be less than or equal to the replication factor. ]

  • How to exclude some events from being indexed by Splunk?

Most of the time we do not want to index all the data that reaches a Splunk instance. In that case, we can exclude some events.

For example, suppose you only want to index the error messages during your application development cycle. You can send events from all the other categories (such as INFO and DEBUG) to the null queue. The null queue routing is defined in transforms.conf at the forwarder level (on heavy forwarders).
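
The routing idea can be simulated in a few lines of Python. This is not Splunk configuration, only a sketch of the logic: it assumes an ordered list of regex rules in which the last matching rule decides the destination queue, and the queue names are used here purely as labels.

```python
import re

# Ordered rules: first discard everything, then keep events containing ERROR.
RULES = [
    (re.compile(r"."), "nullQueue"),
    (re.compile(r"\bERROR\b"), "indexQueue"),
]

def route(event, rules=RULES, default="indexQueue"):
    """Return the queue chosen by the last rule whose pattern matches."""
    queue = default
    for pattern, destination in rules:
        if pattern.search(event):
            queue = destination
    return queue

events = [
    "2024-01-01 10:00:00 INFO service started",
    "2024-01-01 10:00:05 ERROR disk full",
    "2024-01-01 10:00:07 DEBUG cache miss",
]
indexed = [e for e in events if route(e) == "indexQueue"]
```

Only the ERROR event ends up in `indexed`; the INFO and DEBUG events are routed to the null queue and dropped.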

  • What are the common port numbers used by Splunk?

Below are the default port numbers used by Splunk. They can be changed if required.

  1. Splunk Web port: 8000
  2. Splunk Management port: 8089
  3. Splunk Indexing port: 9997
  4. Splunk Index Replication port: 8080
  5. Splunk Network port: 514 (Used to get data from the Network port, i.e., UDP data)
  6. Splunk KV Store: 8191
  • What are the most common and important configuration files in Splunk?
  1. inputs.conf
  2. outputs.conf
  3. props.conf
  4. transforms.conf
  5. indexes.conf
  6. server.conf

Other configuration files also exist; which ones matter depends on the component you are configuring and what you need.

  • Where is Splunk Default Configuration stored?

$SPLUNK_HOME/etc/system/default

  • What is the command for restarting Splunk web server?

The following command restarts Splunk Web:

$SPLUNK_HOME/bin/splunk restart splunkweb

  • What is the command for restarting Splunk Daemon?

The following command restarts the Splunk daemon:

$SPLUNK_HOME/bin/splunk restart splunkd

  • What is the command used for enabling Splunk to boot start?

To boot start Splunk, the following command is used:

$SPLUNK_HOME/bin/splunk enable boot-start

  • How to disable Splunk boot-start?

In order to disable Splunk boot-start, we can use the following:

$SPLUNK_HOME/bin/splunk disable boot-start

  • What is the command used to check the running Splunk processes on Unix/Linux?

If we want to check the running Splunk Enterprise processes on Unix/Linux system, we can make use of the following command:

ps aux | grep -i splunk

  • How to clear Splunk Search History?

We can clear Splunk search history by deleting the following file from Splunk server:

$SPLUNK_HOME/var/log/splunk/searches.log

  • How to disable Splunk Launch Message?

You can do so by adding the following setting to splunk-launch.conf:

OFFENSIVE=Less

  • What is fishbucket in Splunk?

The Splunk fishbucket is a directory or index at the default location:

/opt/splunk/var/lib/splunk

It contains seek pointers and CRCs for the files being indexed, so splunkd can tell whether it has already read them, which prevents data duplication. We can access it through the GUI by searching for:

index="_thefishbucket"

  • How does Splunk avoid the duplicate indexing of logs?

The answer follows from the previous question.

At the indexer, Splunk keeps track of what it has already indexed in an index called the fishbucket, at the default location:

/opt/splunk/var/lib/splunk

It contains seek pointers and CRCs for the files being indexed, so splunkd can tell whether it has already read them.
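
A minimal Python sketch of the seek-pointer and CRC idea follows. It is a simplified model, not how splunkd is implemented: the class name, the `crc_length` value, and the use of `zlib.crc32` are assumptions for illustration, and the sketch assumes every file is at least `crc_length` bytes long.

```python
import zlib

class SeenFiles:
    """Remember how far each file has been read, keyed by a CRC of its head."""

    def __init__(self, crc_length=8):
        self.crc_length = crc_length  # bytes of the file head used for the CRC
        self.seek = {}                # head CRC -> number of bytes already read

    def new_data(self, content: bytes) -> bytes:
        """Return only the bytes that have not been read before."""
        key = zlib.crc32(content[: self.crc_length])
        offset = self.seek.get(key, 0)
        self.seek[key] = len(content)
        return content[offset:]

tracker = SeenFiles()
log = b"2024-01-01 ERROR a\n"
first = tracker.new_data(log)                           # whole file is new
second = tracker.new_data(log)                          # nothing new
third = tracker.new_data(log + b"2024-01-01 INFO b\n")  # only the appended line
```

Because the file is identified by the CRC of its first bytes, two different files that begin with the same bytes would be confused in this sketch.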

  • How to troubleshoot Splunk performance issues?

There are several possible answers, but a common one is the following:

You can check splunkd.log in the “_internal” index for errors.

You can check server performance issues, i.e., CPU, memory usage, disk I/O, etc.