Usage of Splunk commands: PREDICT

The predict command is used as follows:

  • The predict command predicts the values of time series data.
  • The predict command fills in missing values in time series data and can also predict values for future time steps.

Below is the skeleton syntax of the predict command in Splunk:

…| predict <field-name> [AS <new-field-name>] [algorithm=<algorithm-name>] [future_timespan=<number>] [holdback=<number>]
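
To make the skeleton concrete, here is a minimal sketch that fills in every argument at once. It reuses the index, sourcetype and method field from the examples later in this post. The output name predicted_count and the particular values (LLT, 3, 2) are only illustrative choices, not recommendations.

```spl
index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) AS method_count
| predict method_count AS predicted_count algorithm=LLT future_timespan=3 holdback=2
```

With AS, the predicted values should appear in a field called predicted_count instead of the default field name.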

The predict command accepts several arguments. The most important ones are described below.

  • algorithm – This argument specifies the name of the prediction algorithm. The available algorithms are LL, LLT, LLP, LLP5, LLB and BiLL. By default, the predict command uses LLP5. Each algorithm needs a minimum number of data points before it can predict values.
Algorithm name (algorithm type): Description
LL (Local level): A univariate model with no trend and no seasonality. At least 2 data points are required for prediction. LL is the simplest algorithm and computes the levels of the time series.
LLT (Local level trend): A univariate model with trend, but no seasonality. At least 3 data points are required for prediction.
LLP (Seasonal local level): A univariate model with seasonality. If you use the period argument, the number of data points must be at least twice the period. The data must be periodic.
LLP5 (Combines the LLT and LLP models): A combination of LLT and LLP. One prediction is made by LLT and another by LLP. The algorithm then takes a weighted average of the two predictions and returns that as the overall output.
LLB (Bivariate local level): A bivariate model with no trend and no seasonality. At least 2 data points are required for prediction. This algorithm uses one data set to make predictions for another data set.
BiLL (Bivariate local level): A bivariate model that predicts both time series simultaneously.
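
As a sketch of how a seasonal algorithm might be used, the query below runs LLP on hourly counts with a daily cycle. It assumes the period argument mentioned in the LLP description takes the cycle length in data points (24 hourly points per day). It also assumes a 7-day search window so that there are well over twice as many points as the period. The earliest and latest values are illustrative choices.

```spl
index=_internal sourcetype=splunkd_ui_access earliest=-7d@h latest=@h
| timechart span=1h count(method) AS method_count
| predict method_count algorithm=LLP period=24 future_timespan=24
```

Seven days of hourly data gives 168 points, which satisfies the "at least twice the period" requirement for period=24.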

 

  • future_timespan – This argument specifies how many future values the predict command computes. The number must not be negative. Do not use this argument with algorithm=LLB. By default, the predict command uses future_timespan=5.
  • holdback – This argument specifies the number of data points at the end of the series that the predict command does not use for prediction. By default, the predict command uses holdback=0.
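
Since future_timespan is not used with LLB, a bivariate prediction is usually paired with holdback instead. The sketch below is only an illustration. It assumes the correlate argument of the predict command (not covered above), which names the second series. It also assumes that a clientip field is extracted for this sourcetype. The name client_count and the value holdback=5 are hypothetical.

```spl
index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) AS method_count dc(clientip) AS client_count
| predict method_count algorithm=LLB correlate=client_count holdback=5
```

Here the last 5 values of method_count are held back and predicted with the help of client_count, so they can be compared with the actual values.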

NOTE: The predict command needs time series data, so its input should be the output of the timechart command.

Example 1:

index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) as method_count
| predict method_count

Result:

[Image: p1]

Visualization:

[Image: p2]

Explanation:

In the above query, method is an existing field in the _internal index, and the sourcetype is splunkd_ui_access. With the timechart command we counted the values of method and named the result method_count. We ran the query over the last 24 hours with span=1h, so the data is shown on an hourly basis. Finally, we used the predict command to predict the value of method_count. We didn't specify future_timespan, so the predict command uses the default of 5. It also uses the default algorithm, LLP5. As you can see in the result, the last data from the index arrived at 6 am, and we get predicted values up to 11 am, which is 5 hourly steps later.
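
The result of a query like this normally contains the predicted values along with a lower and upper 95% confidence bound. The sketch below renames those columns to shorter names. It assumes the default output field names prediction(method_count), lower95(prediction(method_count)) and upper95(prediction(method_count)), so check them against your own result table. The new names predicted, lower_bound and upper_bound are hypothetical.

```spl
index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) as method_count
| predict method_count
| rename "prediction(method_count)" AS predicted, "lower95(prediction(method_count))" AS lower_bound, "upper95(prediction(method_count))" AS upper_bound
```

Shorter names are easier to reference in later eval or where commands and make the chart legend more readable.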

************************************************************************

Example 2:

index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) as method_count
| predict method_count algorithm=LL

Result:

[Image: p3]

Visualization:

[Image: p4]

Explanation:

In the above query, method is an existing field in the _internal index, and the sourcetype is splunkd_ui_access. With the timechart command we counted the values of method and named the result method_count. We ran the query over the last 24 hours with span=1h, so the data is shown on an hourly basis. Finally, we used the predict command to predict the value of method_count. We didn't specify future_timespan, so the predict command uses the default of 5. This time we specified algorithm=LL for the prediction. As you can see in the result, the last data from the index arrived at 6 am, and we get predicted values up to 11 am.

****************************************************************************

Example 3:

index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) as method_count
| predict method_count algorithm=LL future_timespan=10

Result:

[Image: p5]

Visualization:

[Image: p6]

Explanation:

In the above query, method is an existing field in the _internal index, and the sourcetype is splunkd_ui_access. With the timechart command we counted the values of method and named the result method_count. We ran the query over the last 24 hours with span=1h, so the data is shown on an hourly basis. Finally, we used the predict command to predict the value of method_count. We specified algorithm=LL and future_timespan=10 to predict 10 future values in the time series. As you can see in the result, the last data from the index arrived at 7 am, and we get predicted values up to 5 pm because we specified future_timespan=10.

*******************************************************************************

Example 4:

index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) as method_count
| predict method_count algorithm=LL future_timespan=10 holdback=1

Result:

[Image: p7]

Visualization:

[Image: p8]

Explanation:

In the above query, method is an existing field in the _internal index, and the sourcetype is splunkd_ui_access. With the timechart command we counted the values of method and named the result method_count. We ran the query over the last 24 hours with span=1h, so the data is shown on an hourly basis. Finally, we used the predict command to predict the value of method_count. We specified algorithm=LL and future_timespan=10 to predict 10 values in the time series. We also specified holdback=1, which tells the predict command not to use the last data point for prediction. As you can see in the result, the last data from the index arrived at 7 am, yet the predicted values only reach 4 pm even though we specified future_timespan=10. This is because holdback=1 makes the command ignore the 7 am data point and use the data only up to 6 am. The 10 predicted values therefore start at 7 am and end at 4 pm.
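
A common use of holdback is to check how good the predictions are: hold back the last few points, predict the same number of steps, and compare the predictions with the actual values. The following is a minimal sketch of that idea. It assumes the held-back rows still carry their actual method_count values in the output. The names predicted and error and the value 5 are hypothetical.

```spl
index=_internal sourcetype=splunkd_ui_access
| timechart span=1h count(method) AS method_count
| predict method_count AS predicted algorithm=LL future_timespan=5 holdback=5
| eval error = method_count - predicted
| tail 5
| sort _time
```

Because holdback and future_timespan are equal, the 5 predictions line up with the 5 held-back hours, and error shows how far each prediction was from the real count.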

Now you can use the predict command effectively in your daily work!

We hope you are now comfortable with the usage of the Splunk command predict.

Happy Splunking!!
