Simple Splunk Scripted Input Example

Overview

In this article, I will walk you through the process of creating a scripted input in Splunk. With a scripted input, you configure your Splunk server or Universal Forwarder (UF) to run a script and capture the output of that script as events to be indexed by Splunk. This article assumes that you have some understanding of Splunk, that you are comfortable running Python and shell scripts on your system, and that you understand the difference between a Universal Forwarder and a Splunk Indexer. This article has been tested on Ubuntu 14, running Splunk 6.5. With minor modifications it should work on most Linux and Unix-based systems.

Getting Started

We will assume that our initial goal is to have Splunk run a python script, capturing the output as events. We will configure Splunk to run this python script as a scripted input by creating a new add-on on the Splunk system. To do this, we will create our add-on folder in the apps directory of the Splunk system. The apps directory is located under the etc folder in the $SPLUNK_HOME directory. $SPLUNK_HOME is the location where Splunk was installed. On an indexer, this will often be /opt/splunk, while on a Universal Forwarder, this will often be /opt/splunkforwarder. This guide will assume that you are working on a Universal Forwarder, but all steps can be easily modified for a Splunk Indexer.

Following Splunk’s naming conventions for applications, we will name this application TA-SimpleApp. The TA stands for “Technology Add-on” (which is different from a Splunk App, which has a GUI). First we will create all the necessary folders and files:

# The path below is the apps folder for a Universal Forwarder
# Change if you are on an indexer (/opt/splunk) or used a different install location

cd /opt/splunkforwarder/etc/apps/

sudo mkdir TA-SimpleApp
sudo mkdir TA-SimpleApp/bin
sudo mkdir TA-SimpleApp/default

sudo touch TA-SimpleApp/bin/TA-SimpleApp.py
sudo touch TA-SimpleApp/default/inputs.conf

Next we need to adjust permissions:

cd /opt/splunkforwarder/etc/apps/
sudo chown -R splunk:splunk TA-SimpleApp

When you install Splunk, a splunk user is created. We want these folders to be owned by the splunk user, to ensure that Splunk can access these files.

You should now have the following files and folders:

noah@thor:/opt/splunkforwarder/etc/apps$ tree TA-SimpleApp/
TA-SimpleApp/
├── bin
│   └── TA-SimpleApp.py
└── default
    └── inputs.conf
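
If you would rather check the layout from a script than by eye, the following is a minimal Python sketch. The function name missing_files and the EXPECTED_FILES list are my own illustrative names, not part of Splunk; the expected paths are simply the two files created above.

```python
import os

# Relative paths this article's add-on is expected to contain.
EXPECTED_FILES = [
    os.path.join("bin", "TA-SimpleApp.py"),
    os.path.join("default", "inputs.conf"),
]


def missing_files(app_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under app_root."""
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(app_root, rel))]
```

Calling missing_files("/opt/splunkforwarder/etc/apps/TA-SimpleApp") should return an empty list once both files exist (adjust the path for an indexer or a different install location).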

Now we need to add the content of these two files. First we’ll set up our default inputs.conf (located in the default folder):

[script://./bin/TA-SimpleApp.py]
interval = 10
sourcetype = my_sourcetype
disabled = False
index = main

A breakdown of this file:

  1. The first line tells Splunk that we are creating a scripted input, and gives the path to the file we want executed (note that we cannot pass parameters to the script here; to do that you need a wrapper script, shown below). Here the path is relative to the add-on's folder. You can also give the full path to the script, starting from $SPLUNK_HOME, for example: [script://$SPLUNK_HOME/etc/apps/TA-SimpleApp/bin/TA-SimpleApp.py]. I prefer the relative path. If you want to reference a location outside of the $SPLUNK_HOME folder hierarchy, you need to use the .path option.
  2. The second line tells Splunk to run this script every 10 seconds. You can also set the script to run on a schedule, by giving interval a cron-style schedule instead of a number of seconds.
  3. The third line is optional, and tells Splunk the sourcetype of the events (Why this matters).
  4. The fourth line enables the input, so Splunk will run this script. Set it to True or 1 to prevent Splunk from running this script.
  5. The fifth line is optional, and tells Splunk which index events from this script should be written to. The index must exist on the Indexer, and the default index is main.
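
As a quick sanity check of the stanza described above, here is a rough Python 3 sketch that reads it with the standard library's configparser. This is only an approximation: Splunk's .conf format is not strictly INI, and the helper names read_scripted_inputs and is_enabled are my own. The is_enabled rule just mirrors the description in item 4 (True or 1 disables the input).

```python
import configparser  # Python 3 standard library

# The same stanza as in default/inputs.conf above.
SAMPLE = """\
[script://./bin/TA-SimpleApp.py]
interval = 10
sourcetype = my_sourcetype
disabled = False
index = main
"""


def read_scripted_inputs(text):
    """Return {stanza name: settings dict} for every script:// stanza."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name])
            for name in parser.sections()
            if name.startswith("script://")}


def is_enabled(settings):
    """Treat the input as enabled unless 'disabled' is True or 1."""
    value = settings.get("disabled", "false").strip().lower()
    return value not in ("true", "1")
```

For the sample stanza, read_scripted_inputs(SAMPLE) returns a single entry whose settings include interval, sourcetype, disabled and index, and is_enabled() reports it as enabled.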

Next we add the following content to our python script (TA-SimpleApp.py):


# So we can run this script under python 2 or 3
from __future__ import print_function

import sys                      # for sys.stderr.write()
import time                     # for time.time()
from datetime import datetime   # for datetime.utcnow()
import random                   # to provide random data for this example

sys.stderr.write("TA-SimpleApp python script is starting up\n")                  

# output a single event
print (str(time.time()) + ", username=\"agent smith\", status=\"mediocre\", admin=noah, money=" + str(random.randint(1, 1000)))

# output three events, each one separated by a newline (each line will be a unique event)
for x in range(0, 3):
	strEvent = str(time.time()) + ", "
	strEvent += "username=\"" + random.choice(["Stryker", "Valkerie", "Disco Stu"]) + "\", "
	strEvent += "status=\"" + random.choice(["groovy", "hungry", "rage quit"]) + "\", "
	strEvent += "admin=" + random.choice(["lenny", "carl", "moe"]) + ", "
	strEvent += "money=" + str(random.randint(1, 1000))
	print (strEvent)

Line 9 is an example of how we write information to the Splunk event log: $SPLUNK_HOME/var/log/splunk/splunkd.log. See the section below on logging.

Line 12 is where we generate a single event for Splunk to consume. We start by printing the UNIX epoch time, followed by a comma-separated list of keys and values, ending with a newline (the print function ends each line with a newline). Epoch time is preferred because it is easy for Splunk to identify, and it is highly accurate. Splunk will automatically identify the key-value pairs in this instance. If your output is less structured, you would need to configure a props.conf and transforms.conf (a simple example of these files).

Beginning with line 15, we loop three times to create three random events. Each event is written to stdout (using the print function at line 21, like above), and terminates with a newline, which Splunk interprets as the end of the event. You can have Splunk ingest multi-line events by configuring Line Breaking in props.conf.
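
The string concatenation above gets tedious as the number of fields grows. One possible way to tidy it up is a small helper that builds the same "epoch time, key=value, ..." line. This is only a sketch: format_event is a hypothetical name of my own, and it assumes that values do not contain double quotes.

```python
import time


def format_event(fields, timestamp=None):
    """Build one event line: epoch time, then comma-separated key=value pairs.

    fields is a list of (key, value) tuples so the output order is predictable.
    """
    if timestamp is None:
        timestamp = time.time()
    parts = ["{:.2f}".format(timestamp)]
    for key, value in fields:
        text = str(value)
        # Quote values that contain whitespace so the pair stays together.
        # Assumption: values never contain a double quote themselves.
        if any(ch.isspace() for ch in text):
            text = '"{}"'.format(text)
        parts.append("{}={}".format(key, text))
    return ", ".join(parts)
```

Calling print(format_event([("username", "agent smith"), ("admin", "noah"), ("money", 627)])) from the script would write one line in the same shape as the events generated above.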

Now we want to test that the python script works correctly. We can do this by simply running it, and making sure we are seeing the stderr and stdout being written correctly to the screen (this is how Splunk will ingest this information). Run the script manually, and look for similar output:

noah@thor:/opt/splunkforwarder/etc/apps/TA-SimpleApp/bin$ python ./TA-SimpleApp.py
TA-SimpleApp python script is starting up
1484997424.92, username="agent smith", status="mediocre", admin=noah, money=627
1484997424.92, username="Valkerie", status="hungry", admin=carl, money=800
1484997424.92, username="Disco Stu", status="rage quit", admin=lenny, money=663
1484997424.92, username="Stryker", status="groovy", admin=moe, money=483

If you see output like this, then your script is working. Now you need to restart Splunk so that it loads your Technology Add-on and runs your script:

noah@thor:~$ cd /opt/splunkforwarder/bin/
noah@thor:/opt/splunkforwarder/bin$ sudo ./splunk restart

Check your log files (see the next section on error logging) and search for these events in SplunkWeb. In the SplunkWeb search app, a search such as index=main sourcetype=my_sourcetype should return events similar to the script output shown above.

Congratulations: if you see similar events, you now have a simple scripted input for Splunk. Note that Python must be available to run the script. An indexer uses Splunk’s own bundled version of Python (2.7.5), while a UF uses the system’s version of Python, so on a UF you have to make sure one is installed.

Error Logging

Anything your script writes to stderr ends up in splunkd.log ($SPLUNK_HOME/var/log/splunk/splunkd.log). One challenge is that you have to be an administrator to view this log file, or even browse the log folder; I usually run sudo bash in a separate terminal window to work with log files. Because Splunk captures all output from your script, we need to differentiate between events and log information. We do this by writing events to stdout, which Splunk ingests and indexes as events. Anything written to stderr is captured by Splunk and written to the splunkd.log file, where it will look like the following entry:

01-21-2017 09:43:09.216 +0200 ERROR ExecProcessor - message from "/opt/splunkforwarder/etc/apps/TA-SimpleApp/bin/myAppLauncher.sh" TA-SimpleApp myAppLauncher.sh is starting

You will notice that Splunk marks this event as an ERROR in the log. There doesn’t seem to be a way to change the severity of these entries.

A good way to follow these events on a Universal Forwarder is as follows:

root@thor:/opt/splunkforwarder/var/log/splunk# sudo tail -f splunkd.log | grep ExecProcessor
01-21-2017 09:49:32.684 +0200 INFO  ExecProcessor - New scheduled exec process: python /opt/splunkforwarder/etc/apps/TA-SimpleApp/bin/TA-SimpleApp.py
01-21-2017 09:49:32.684 +0200 INFO  ExecProcessor - 	interval: 10 ms
01-21-2017 09:49:35.195 +0200 ERROR ExecProcessor - message from "python /opt/splunkforwarder/etc/apps/TA-SimpleApp/bin/TA-SimpleApp.py" TA-SimpleApp python script is starting up

On an indexer (rather than a UF), this command generates too much information (Splunk starts a number of apps on an indexer, whereas on a UF there is only the one we created). On an indexer, you may want to grep for the name of the app, TA-SimpleApp, instead.
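
If grep gets unwieldy, the same filtering can be sketched in Python. The regular expression below is inferred purely from the sample log lines shown above (timestamp, severity, component, message); treat that format as an assumption rather than a specification. The function name exec_processor_entries is my own.

```python
import re

# Pattern inferred from the sample splunkd.log lines in this article.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)


def exec_processor_entries(lines, app_name=None):
    """Return (level, message) for ExecProcessor lines, optionally per app."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match or match.group("component") != "ExecProcessor":
            continue
        message = match.group("message").strip()
        # On a busy indexer, keep only lines that mention our add-on.
        if app_name and app_name not in message:
            continue
        entries.append((match.group("level"), message))
    return entries
```

You could feed it the lines of splunkd.log (opened as root) and pass app_name="TA-SimpleApp" on an indexer to cut down the noise.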

Wrapper Script

If you need to pass parameters to your script, or you want to execute an application, you can have Splunk call a shell script, where you have more options in launching your script. Let’s create this script:

cd /opt/splunkforwarder/etc/apps
sudo touch TA-SimpleApp/bin/myAppLauncher.sh
sudo chmod a+x TA-SimpleApp/bin/myAppLauncher.sh
sudo chown splunk:splunk TA-SimpleApp/bin/myAppLauncher.sh

We need to modify our inputs.conf to call this new shell script, rather than the python script directly. Modify the first line of inputs.conf to look like this (everything else is the same):

[script://./bin/myAppLauncher.sh]

And enter the following content for this shell script (myAppLauncher.sh):

#!/bin/bash

# Write a line to the splunk log file
echo TA-SimpleApp myAppLauncher.sh is starting >&2

# Check if we can run a python interpreter, exit otherwise
command -v python >/dev/null 2>&1 || { echo No python interpreter found. >&2; exit 1; }

# execute our python script, located in the same folder as this script
cd "$( dirname "${BASH_SOURCE[0]}" )"
python ./TA-SimpleApp.py

On line 4, we are writing to stderr, which adds an entry to the splunkd.log log file (this would probably be removed in a final release, but it is good for testing).
On line 7, we are checking that we can find a Python interpreter to run our script. If we can’t find one, we write an error to the splunkd.log log file and quit.
On line 10, we are setting the current directory to the directory of this shell script.
On line 11, we are calling our Python script, which will execute and generate output just like the example above.

This example is very similar to launching a Python script directly; however, we now have the ability to pass parameters, do additional setup, change environment variables (if needed), test for a Python interpreter, and do other housekeeping. You could also call a binary executable, write to your own log files, or do any other setup or testing your script requires. Basically, if you can do it from a shell script, and it generates output on stdout and stderr, you can use it as a scripted input. For any scripts or applications you execute here (including all child processes), all data written to stdout will become events on your Splunk indexer, and everything written to stderr will become entries in the splunkd.log log file.
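
To see the stdout/stderr split for yourself without involving Splunk, here is a small Python sketch that runs a command and captures the two streams separately. It only imitates the separation described above; it is not how Splunk itself launches scripts, and run_and_split is a name I made up for illustration.

```python
import subprocess
import sys


def run_and_split(cmd):
    """Run cmd and return (exit code, stdout lines, stderr lines)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    out, err = proc.communicate()
    return proc.returncode, out.splitlines(), err.splitlines()


def describe(cmd):
    """Print which lines would be events and which would be log entries."""
    code, events, log_lines = run_and_split(cmd)
    for line in events:
        print("event (stdout): " + line)
    for line in log_lines:
        print("log (stderr):   " + line)
    return code
```

For example, describe(["./myAppLauncher.sh"]) run from the add-on's bin folder should label the two "is starting" messages as log lines and the four generated lines as events.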

Testing your script as Splunk will run it

When Splunk runs your script, it does so within its own environment, with its own environment variables. To test your script within that environment (without setting it up as an App), you can run ./splunk cmd from the bin folder:

noah@thor:/opt/splunkforwarder/bin$ ./splunk cmd ../etc/apps/TA-SimpleApp/bin/myAppLauncher.sh 

You’ll see the following output:

TA-SimpleApp myAppLauncher.sh is starting
TA-SimpleApp python script is starting up
1484994520.21, username="agent smith", status="mediocre", admin=noah, money=736
1484994520.21, username="Valkerie", status="hungry", admin=carl, money=141
1484994520.21, username="Stryker", status="rage quit", admin=carl, money=941
1484994520.21, username="Disco Stu", status="groovy", admin=moe, money=168

All data written to stdout and stderr will show on the screen as your application writes it. Splunk is not running your app and processing the output; this tool merely lets you test whether your script would run in the environment that Splunk would run it in. If you want to see the environment variables for Splunk’s environment, just add the printenv command to the end of your shell script and run the ./splunk cmd command again. A simple example of the difference in environment variables:

noah@thor:/opt/splunkforwarder/bin$ printenv | grep "splunk"
PWD=/opt/splunkforwarder/bin

noah@thor:/opt/splunkforwarder/bin$ ./splunk cmd /usr/bin/printenv | grep "splunk"
PATH=/opt/splunkforwarder/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin
PWD=/opt/splunkforwarder/bin
_=./splunk
SPLUNK_HOME=/opt/splunkforwarder
SPLUNK_DB=/opt/splunkforwarder/var/lib/splunk
SPLUNK_ETC=/opt/splunkforwarder/etc
SPLUNK_WEB_NAME=splunkweb
LD_LIBRARY_PATH=/opt/splunkforwarder/lib
OPENSSL_CONF=/opt/splunkforwarder/openssl/openssl.cnf
LDAPCONF=/opt/splunkforwarder/etc/openldap/ldap.conf
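
The same comparison can be made from inside a Python script. Below is a minimal sketch that collects the SPLUNK_* variables (such as SPLUNK_HOME in the output above) and reports them on stderr, so the report ends up in the log rather than becoming an event. The function names are hypothetical, and treating the presence of SPLUNK_* variables as a sign of running under Splunk is my own assumption.

```python
import os
import sys


def splunk_variables(environ=None):
    """Return the SPLUNK_* variables found in the given environment mapping."""
    if environ is None:
        environ = os.environ
    return {k: v for k, v in environ.items() if k.startswith("SPLUNK_")}


def report_environment(environ=None, stream=None):
    """Write a one-line summary to stderr so it does not become an event."""
    if stream is None:
        stream = sys.stderr
    found = splunk_variables(environ)
    if found:
        summary = ", ".join("{}={}".format(k, found[k]) for k in sorted(found))
        stream.write("Splunk environment: " + summary + "\n")
    else:
        # Assumption: no SPLUNK_* variables means we were started by hand.
        stream.write("No SPLUNK_* variables set; probably a plain shell\n")
    return found
```

Calling report_environment() at the top of TA-SimpleApp.py should print the "No SPLUNK_* variables" message when you run the script by hand, and list SPLUNK_HOME and friends when it is run through ./splunk cmd.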

Where To Go From Here

If you want to see some examples of Splunk Technology Add-ons, just download them from Splunkbase and browse through their files, or look at the links below.

Feedback is welcomed, especially if there are errors in this guide or recommendations you have from your own experience. Please contact me here.

Resources

Setting up a scripted input
Add a scripted input with inputs.conf
Writing reliable scripts
Anatomy of an app
Advanced Python Script Testing
A good Scripted Inputs Tutorial
Building Splunk Technology Add-ons (from the Splunk blog)
Package and publish a Splunk app
Apps and add-ons: an introduction