Whether to use Universal Forwarder or the Heavy Forwarder?

A common question among Splunkers is when to use the Universal Forwarder and when to use the Heavy Forwarder.

Splunk provides two different packages/binaries, the full version of Splunk (Splunk Enterprise) and the Universal Forwarder. A Splunk Enterprise instance can be configured as a Heavy Forwarder. The Universal Forwarder is a lightweight version of Splunk, with limited features.

When should you use the Universal Forwarder and why?

Keep in mind that the Universal Forwarder was designed to collect data from servers and forward it to other Splunk instances, so it is ideal for collecting files from disk and for use as an intermediate forwarder. Note, however, that some data collection add-ons (e.g. DB Connect) can’t be installed on the Universal Forwarder.

Heavy Forwarders are often used instead of Universal Forwarders to filter data before indexing, on the assumption that this is the most efficient use of resources. In practice, this usually increases the complexity of the environment and the amount of network IO that the indexers have to handle. In some scenarios it has also been reported to increase CPU and memory usage, compromising the efficiency gain from a distributed environment. The increase in network traffic occurs because the Heavy Forwarder sends parsed (cooked) data over the network, including all the index-time fields, the raw event, and related metadata, rather than just the raw event.

Doing all the parsing and filtering on the indexers, when possible, keeps the network IO down and, through the use of Universal Forwarders, makes the configuration simpler to manage.

The following tests were conducted by Splunk:
The table below shows the results of sample tests sending a dataset from a Heavy Forwarder to an indexer. The test was run with and without indexer acknowledgement enabled, and both runs were then repeated using a Universal Forwarder as the data source. The test file contained 367,463,625 events.

Forwarder   Indexer Acknowledgement   Network GB Transferred   Network Avg (KBps)   Indexing Avg (KBps)   Duration (Secs)
Heavy       Yes                       39.1                     1941                 5092                  21151
Heavy       No                        38.4                     1922                 5139                  20998
Universal   Yes                       6.5                      863                  14344                 7923
Universal   No                        6.4                      1015                 17466                 6662

The amount of data sent over the network was about 6 times smaller with the Universal Forwarder, and the test finished in roughly a third of the time (about 2.7 to 3.2 times faster, going by the durations in the table).

The amount of data indexed per second was approximately 3 times higher when collected by the Universal Forwarder.
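
The ratios can be recomputed directly from the table. The short Python sketch below does this; the figures are copied from the table, while the dictionary layout and the names `results` and `ratios` are chosen here purely for illustration.

```python
# Figures copied from the results table; the structure and names are illustrative.
results = {
    ("heavy", "ack"): {"network_gb": 39.1, "indexing_kbps": 5092, "duration_s": 21151},
    ("heavy", "no_ack"): {"network_gb": 38.4, "indexing_kbps": 5139, "duration_s": 20998},
    ("universal", "ack"): {"network_gb": 6.5, "indexing_kbps": 14344, "duration_s": 7923},
    ("universal", "no_ack"): {"network_gb": 6.4, "indexing_kbps": 17466, "duration_s": 6662},
}


def ratios(ack):
    """Compare the Heavy and Universal Forwarder runs for one acknowledgement setting."""
    heavy = results[("heavy", ack)]
    universal = results[("universal", ack)]
    return {
        # How many times less data crossed the network with the Universal Forwarder.
        "network_reduction": heavy["network_gb"] / universal["network_gb"],
        # How many times higher the indexing throughput was.
        "indexing_speedup": universal["indexing_kbps"] / heavy["indexing_kbps"],
        # How many times shorter the whole test ran.
        "duration_reduction": heavy["duration_s"] / universal["duration_s"],
    }


for ack in ("ack", "no_ack"):
    print(ack, {name: round(value, 2) for name, value in ratios(ack).items()})
```

In both acknowledgement settings the network ratio comes out close to 6, while the indexing throughput and duration ratios come out close to 3.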

NOTE: The use of aggregation layers between collection and indexing tiers should be the exception rather than the rule, as this can have unintended consequences when it comes to your data.

Points to Remember:

• A Universal Forwarder cannot filter based on regular expressions. Filter on the indexers instead, or use the Heavy Forwarder if you need to drop the majority of the data at the source. Filtering on the indexers also makes it easier to manage your environment on a large scale.

• A Universal Forwarder can filter Windows events at the source by Event ID.

• Simple routing and cloning of data can be performed with the Universal Forwarder; consider using the Heavy Forwarder if you need to route different events to different destinations. As with filtering, do this on the indexer if possible.

• Always try to use the Universal Forwarder unless you need some feature or functionality that it can’t offer.

• Perform your data filtering on the indexer; it makes your data index more quickly.
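
The points above boil down to a simple rule of thumb, which can be sketched in Python as follows. This is only an illustration of the reasoning in this post, not a Splunk API: the `Requirements` class, its field names, and `choose_forwarder` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical checklist of needs, taken from the points listed above."""
    needs_unsupported_addon: bool = False        # e.g. DB Connect, which can't run on a Universal Forwarder
    must_drop_most_data_at_source: bool = False  # regex-based filtering before data leaves the host
    needs_per_event_routing: bool = False        # different events sent to different destinations


def choose_forwarder(req: Requirements) -> str:
    """Default to the Universal Forwarder unless a listed need rules it out."""
    if (
        req.needs_unsupported_addon
        or req.must_drop_most_data_at_source
        or req.needs_per_event_routing
    ):
        return "heavy"
    return "universal"


print(choose_forwarder(Requirements()))
print(choose_forwarder(Requirements(needs_unsupported_addon=True)))
```

The default case returns "universal", reflecting the advice to use the Universal Forwarder and do filtering and routing on the indexers wherever possible.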

Thanks for going through this post,

Happy Splunking!!

