My experience building a Splunk application

I joined Splunk a couple of weeks ago, and my first challenge was to learn everything I could about building Splunk applications. The best way to do that is to write your own application, and this is exactly what I did.

The application I wrote has two parts. The first part is a very simple scripted input for Firebase. The second part is built with the Splunk Web Framework and shows objects and their routes on Google Maps, using either real-time data or playback of historical data.

I hope my experience gives you some ideas about how you can extend Splunk for your own needs.

Prepare environment

Installing Splunk

The first thing you need to do is download Splunk to your local machine. You can find instructions on the Step-by-step installation instructions page. In my case, I was using Mac OS X. The easiest way is probably to download the tar file and extract it somewhere (I’ll assume that the $SPLUNK_HOME variable points to this folder). The following command launches Splunk:

$SPLUNK_HOME/bin$ ./splunk start

This command launches both the server and UI components of Splunk. On first launch it asks you to read and accept the license agreement. If you run into permission problems when launching it, give the splunk binary executable permissions:

$SPLUNK_HOME/bin$ chmod +x splunk

After Splunk starts, open http://localhost:8000 in a browser. You will be asked for a user name and password (the page shows the default user name and password).

To stop Splunk, use the following command:

$SPLUNK_HOME/bin$ ./splunk stop

Installing application

The next step is to install the application I built. Splunk keeps all applications under $SPLUNK_HOME/etc/apps. To install an application manually (without using http://apps.splunk.com/), you just need to copy it into the $SPLUNK_HOME/etc/apps folder.

If you are familiar with git, you can clone my application repository somewhere, for example to ~/Desktop:

~/Desktop$ git clone https://github.com/splunk/splunk-demo-app-firebase

After that, copy the routemap folder to $SPLUNK_HOME/etc/apps. If you don’t know how to use git or don’t have a git client installed, that is fine too: download the splunk-demo-app-firebase zip archive, unzip it somewhere, and copy the routemap folder to $SPLUNK_HOME/etc/apps.

To load the application, restart Splunk:

$SPLUNK_HOME/bin$ ./splunk restart

After these steps you should see the Route Map application on the Splunk Dashboard:

Splunk Dashboard

If everything is installed and you have an Internet connection, navigating to the Route Map application should show buses and their routes on the map:

Route Map demo application

Building Application

Now let’s talk about the application and what I used to build it.

Firebase scripted input

Node.js and bash part

The first part of the application I wrote uses a scripted input. Scripted inputs are one of the simplest ways to import custom data into Splunk.

I needed a way to import data from one of the Firebase datasets, which provides real-time information about bus locations in San Francisco (sf-muni). In the Firebase documentation I found that there is an SDK for Node.js. I started with a simple Node.js application that writes all the data it receives to the console; you can find it under bin/app. If you have Node.js installed, you can launch it with the following command:

.../routemap/bin/app$ node app.js

If you don’t have Node.js on your machine, that is not a problem. Splunk ships with it, so you can run the application through the splunk cmd command:

$SPLUNK_HOME/bin$ ./splunk cmd node $SPLUNK_HOME/etc/apps/routemap/bin/app/app.js

If everything is installed properly, you should see serialized JSON in the output, like this:

{"dirTag":"08X__IB","heading":134,"id":6272,"lat":37.7180899,"lon":-122.44271,"predictable":"true","routeTag":"8X","secsSinceReport":25,"speedKmHr":0,"ts":1383004678.15}
{"dirTag":"28_IB2","heading":357,"id":8231,"lat":37.7480959,"lon":-122.47592,"predictable":"true","routeTag":28,"secsSinceReport":2,"speedKmHr":42,"ts":1383004701.141}
{"dirTag":"49_IB2","heading":1,"id":7011,"lat":37.79403,"lon":-122.42289,"predictable":"true","routeTag":49,"secsSinceReport":16,"speedKmHr":24,"ts":1383004846.601}
{"dirTag":"08AX_OB","heading":168,"id":6247,"lat":37.785683,"lon":-122.40955,"predictable":"true","routeTag":"8AX","secsSinceReport":15,"speedKmHr":16,"ts":1383004487.437}
{"dirTag":"14_OB2","heading":210,"id":7013,"lat":37.7162999,"lon":-122.44122,"predictable":"true","routeTag":14,"secsSinceReport":34,"speedKmHr":22,"ts":1383004809.27}
{"dirTag":"38_IB1","heading":267,"id":6414,"lat":37.777813,"lon":-122.492805,"predictable":"true","routeTag":38,"secsSinceReport":2,"speedKmHr":13,"ts":1383004860.681}
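
What matters for a scripted input is the output format: the script only has to write events to standard output. The sketch below is not the code from bin/app. It replaces the Firebase SDK with a plain EventEmitter (the `source` object, the `update` event name and the `formatEvent` helper are all hypothetical) and shows only the one-JSON-object-per-line output seen above.

```javascript
// Sketch only: a stand-in for the real Firebase client, which is not shown here.
const { EventEmitter } = require('events');

// Serialize one vehicle record as a single line of JSON.
function formatEvent(record) {
  return JSON.stringify(record) + '\n';
}

// Hypothetical data source; the real app receives its updates from Firebase.
const source = new EventEmitter();

// Every update becomes exactly one line on stdout,
// so each line can be indexed as a separate event.
source.on('update', function (record) {
  process.stdout.write(formatEvent(record));
});

// Simulate a single update with a trimmed-down sample record.
source.emit('update', { id: 6272, lat: 37.7180899, lon: -122.44271, routeTag: '8X', ts: 1383004678.15 });
```

Keeping the serialization in its own small function makes it easy to check the output format without a network connection.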

If you go one level up from this folder, you will also find a simple bash script, launch_app.sh, which launches this Node.js application using Splunk:

#!/bin/bash

current_dir=$(dirname "$0")
"$SPLUNK_HOME/bin/splunk" cmd node "$current_dir/app/app.js"

For Windows I wrote a similar script, launch_app.cmd.

Splunk integration

Ok, we have an application that writes all the data it receives to the console. The next step is to configure it to work with Splunk. In the ./default folder you can find two configuration files, inputs.conf:

# Linux Bash script
[script://./bin/launch_app.sh]
disabled = 0
sourcetype = firebase
source = sf-muni-data
host = publicdata-transit.firebaseio.com

# Windows Batch script
[script://.\bin\launch_app.cmd]
disabled = 0
sourcetype = firebase
source = sf-muni-data
host = publicdata-transit.firebaseio.com

and props.conf:

[firebase]
NO_BINARY_CHECK = 1
TIME_PREFIX="ts":

In inputs.conf I specified how to launch my scripted input and the default values for the source, sourcetype and host fields. You can find detailed documentation for inputs.conf in Splunk’s documentation. I also highly recommend reading About default fields (host, source, sourcetype, and more).

The props.conf file helps Splunk recognize the timestamp values in my events. With the Splunk data preview page you can easily find the right set of properties for your input. This was very helpful for me, so I’d like to explain in detail how to do it.

For example, you can run the following command to generate preview.log for the firebase app:

.../firebase/bin/app$ node app.js >> preview.log

Now you can use the preview.log file to find out which properties you need. On the Splunk Dashboard page click Add Data:

Add data

After that, choose the From files and directories link:

Add data from file

The next step is Preview data before indexing; just set the path to the preview.log file:

Specify file for preview

On the next page you can choose Start new source type. We are not going to import anything; we just want to find the right properties for our events. As you can see, Splunk failed to parse the timestamp:

Failed to parse timestamp

Click the adjust timestamp and event break settings link at the top of the page, go to the Timestamps tab, enter “ts”: in the Timestamp is always prefaced by a pattern field, and click the Apply button:

Fixed timestamp

Ok, this looks like exactly what we want. The next step is to open the Advanced mode (props.conf) tab and copy the Currently applied settings to the props.conf file.

Advanced mode
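
Conceptually, the TIME_PREFIX setting tells Splunk where in the raw event to start looking for the timestamp. The sketch below imitates that idea in JavaScript for events shaped like the ones from the scripted input. It is an illustration rather than Splunk's actual algorithm, and `extractTimestamp` is a hypothetical helper.

```javascript
// Illustration only: find the number that follows a prefix pattern
// and read it as epoch seconds.
function extractTimestamp(rawEvent, prefixPattern) {
  const match = new RegExp(prefixPattern + '\\s*([0-9]+(?:\\.[0-9]+)?)').exec(rawEvent);
  return match ? parseFloat(match[1]) : null;
}

const raw = '{"id":6272,"lat":37.7180899,"lon":-122.44271,"ts":1383004678.15}';
const seconds = extractTimestamp(raw, '"ts":');

console.log(seconds);                                 // epoch seconds, or null if not found
console.log(new Date(seconds * 1000).toISOString());  // the same instant as a UTC date
```

Without the prefix, a parser has no reason to prefer the value after "ts" over the other numbers in the event, which is why the preview page could not find the timestamp on its own.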

Route map demo view

The second part of my application is a Django Bindings based extension for Splunk with one custom view. To get a template for your first Web Framework app, you only need to run one command:

$SPLUNK_HOME/etc/apps/framework$ ./splunkdj createapp my_app_name

After creating the application folders and template, I spent most of my time working in JavaScript files under django/routemap/static/routemap and HTML layout in file map.html.

Search Macros

Before starting work on your application, think about the data you are going to use and how you are going to use it. In my case, I knew that I was going to use the events I imported with my Firebase scripted input. I also wanted my Route Map application to work with any type of event, with only one requirement: the events need to have geo data (latitude and longitude) and timestamps (all events in Splunk have timestamps, right?). So I looked for an easy way for users to prepare their data for my application, and this is how I met Search Macros. In my case I built the normalize macros (macros.conf):

# macro prepares data for showing on map with grouping by one field
[normalize(4)]
args = ts, lat, lon, field1
definition = sort $ts$ | eval data=$ts$+";"+$lat$+";"+$lon$ | table data, $field1$

In macros.conf you can find more macros with the same name, normalize. If you take a closer look, you will see that they are all the same; the only difference is that they return more fields in the final table command. These are the arguments the macros expect:

ts – event timestamp, I use this field for sorting events on server side to make sure that I have all points in order on client side.
lat – Latitude.
lon – Longitude.
field1, field2, field3, field4 – the fields by which you want to group the events; specify at least one and at most four. Basically, these fields help me identify objects.

So, for example, suppose you have events in Splunk from a source cars-positions, which stores information about cars and their positions over time, and each event has timestamp, latitude, longitude and plate-number fields. You can then write the following search to use my Route Map application with your data:

source="cars-positions" | `normalize(ts=timestamp, lat=latitude, lon=longitude, field1=plate-number)`
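
To make the output of the macro concrete, here is a sketch in JavaScript of the same transformation on an in-memory array: sort by timestamp, join timestamp, latitude and longitude into one `data` string, and keep the grouping field. The `normalize` function and the sample events are made up for illustration; the real work is done by the search commands in macros.conf.

```javascript
// Mirrors: sort $ts$ | eval data=$ts$+";"+$lat$+";"+$lon$ | table data, $field1$
function normalize(events, ts, lat, lon, field1) {
  return events
    .slice()                                          // do not mutate the caller's array
    .sort(function (a, b) { return a[ts] - b[ts]; })  // oldest event first
    .map(function (e) {
      const row = { data: e[ts] + ';' + e[lat] + ';' + e[lon] };
      row[field1] = e[field1];
      return row;
    });
}

// Hypothetical events from the cars-positions example.
const events = [
  { timestamp: 20, latitude: 37.2, longitude: -122.2, 'plate-number': 'ABC123' },
  { timestamp: 10, latitude: 37.1, longitude: -122.1, 'plate-number': 'ABC123' }
];

console.log(normalize(events, 'timestamp', 'latitude', 'longitude', 'plate-number'));
```

Packing the three values into a single string keeps every result row to one fixed field plus the grouping fields, whatever the original field names were.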

Splunk Web Framework

Before building your own application with the Web Framework, it’s helpful to learn how the JavaScript libraries included with the Web Framework work:

Underscore.js – helps with many kinds of manipulation of arrays and objects.
Backbone.js – helps you to keep your application more MVC structured.
RequireJS – helps to manage dependencies in Splunk.
jQuery – helps with DOM manipulation.

You don’t really need to know all of these libraries. If you have never used them and don’t want to go too deep, you can look at the examples in the documentation for common implementations.

SplunkJS

If you want to build a Splunk app with the Web Framework, it is also good to learn about the SplunkJS Stack. In my application I used only three SplunkJS components (you can see how in mapObjectsPageController.js):

SearchManager – this is the SplunkJS component that helps you get data from Splunk. Make sure you read How to create a search manager using SplunkJS Stack.

One thing worth mentioning: to get real-time data you need to subscribe to the preview event, and for non-real-time data you need to subscribe to the results event (you can keep using the preview event, but to get all events it is better to use results). The documentation has a note about this:

Note that real-time searches don’t finish and they never fire a search:done event.

This is why I subscribe to both the preview and results events. I handle preview when the application is in real-time mode:

// When we are in real-time we get only events on preview
var previewData = this.searchManager.data("preview", {count: 0, output_mode: 'json'});
previewData.on('data', function() {
  if (previewData.hasData() && this.mapObjectsView.viewModel.realtime()) {
    dataHandler(previewData.data().results);
  }
}.bind(this));

And I handle results when it is not in real-time mode:

// When we are not in real-time we get events on results
var resultsData = this.searchManager.data('results', {count: 0, output_mode: 'json'});
resultsData.on('data', function() {
  if (resultsData.hasData() && !this.mapObjectsView.viewModel.realtime()) {
    dataHandler(resultsData.data().results);
  }
}.bind(this));

The data handler function is the same for both. It parses the latitude, longitude and timestamp fields and calls the renderPoints method:

var dataHandler = function(results) {
  var dataPoints = [];

  for (var rIndex = 0; rIndex < results.length; rIndex++) {
    var result = results[rIndex];

    if (result.data) {
      var data = result.data.split(';');
      var point = { ts: parseFloat(data[0]), lat: parseFloat(data[1]), lon: parseFloat(data[2]) };
      delete result['data'];
      dataPoints.push({obj: result, point: point});
    }
  }

  this.mapObjectsView.renderPoints(dataPoints);
}.bind(this);
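
The handler above depends on the page controller, so it is hard to try in isolation. A standalone sketch of the same parsing step, with a hypothetical `parseResults` function and made-up sample rows, shows what goes in and what comes out. Unlike the original, it copies each row instead of deleting the `data` field in place.

```javascript
// Standalone variant of the parsing step: no view, no search manager.
function parseResults(results) {
  const dataPoints = [];
  results.forEach(function (result) {
    if (!result.data) { return; }              // skip rows without the "data" field
    const parts = result.data.split(';');      // "ts;lat;lon", as built by the normalize macro
    const point = {
      ts: parseFloat(parts[0]),
      lat: parseFloat(parts[1]),
      lon: parseFloat(parts[2])
    };
    const obj = Object.assign({}, result);     // copy, so the input row is left untouched
    delete obj.data;                           // the remaining fields identify the object
    dataPoints.push({ obj: obj, point: point });
  });
  return dataPoints;
}

const sample = [
  { data: '1383004678.15;37.7180899;-122.44271', routeTag: '8X', id: '6272' },
  { routeTag: '28' }                           // no data field, ignored
];

console.log(JSON.stringify(parseResults(sample)));
```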

The other two components, SearchBarView and SearchControlsView, provide the basic search bar and search controls views:

Search controls

I have a small trick in how I use SearchManager and SearchBarView. When a user wants to see real-time data, I set the search interval to [now-30s, now], which only gets new events for the last 30 seconds, and at the same time I use the selected real-time interval as the time window for how long the application keeps points in history. For example, if the user chooses to see real-time data in a 30 minute window, I keep all points in memory for 30 minutes but ask the server only for events from the last 30 seconds. I don’t compute any statistics on the server, so I don’t expect anything in the past to change. This is my handler for time range events:

// Update the search manager when the timerange in the searchbar changes
this.searchBarView.timerange.on('change', function(timerange) {
  this.mapObjectsView.viewModel.realtime((/^rt(now)?$/).test(timerange.latest_time));
  if (this.mapObjectsView.viewModel.realtime()) {
    var timeWindow = parseTimeWindow(timerange.earliest_time);
    // In case of real-time we use Search time range as a time window 
    // for how long we want to keep data on client. But we always ask 
    // server only for new events with range -30 seconds.
    this.mapObjectsView.viewModel.timeWindow(timeWindow);
    this.searchManager.search.set({ latest_time: 'rt', earliest_time: 'rt-30' });
  } else {
    this.mapObjectsView.viewModel.timeWindow(null);
    this.searchManager.search.set(this.searchBarView.timerange.val());
  }
}.bind(this));
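
The handler calls `parseTimeWindow`, which is not shown in this post. One possible implementation is sketched below under two assumptions: the earliest time looks like `rt-30m` (a real-time modifier with a number and an optional unit of s, m, h or d), and the window is wanted in milliseconds. The version in the repository may differ on both points.

```javascript
// Assumption: input such as "rt-30m"; output is the window length in milliseconds.
const UNIT_MS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000, d: 24 * 60 * 60 * 1000 };

function parseTimeWindow(earliestTime) {
  const match = /^rt-(\d+)([smhd])?$/.exec(earliestTime || '');
  if (!match) { return null; }                 // not a relative real-time modifier
  const unit = match[2] || 's';                // a bare number means seconds, as in 'rt-30'
  return parseInt(match[1], 10) * UNIT_MS[unit];
}

// The same test the handler uses to decide whether the search is real-time.
function isRealtime(latestTime) {
  return (/^rt(now)?$/).test(latestTime);
}

console.log(parseTimeWindow('rt-30m'), isRealtime('rt'), isRealtime('now'));
```

Returning null for anything that is not a real-time modifier matches the way the handler clears the time window in the non-real-time branch.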

Application architecture

Ok, now that I’ve told you everything I learned about building Splunk Applications, let’s talk about the Route Map application architecture and how to customize it.

mapObjectsDictionary.js – this file has two types, MapObject and MapObjectsDictionary. MapObject represents a single object on the map; it keeps all of the object’s points in time and, through the calculatePos method, calculates the object’s current position when somebody changes currentTime. MapObjectsDictionary is a container that stores all instances of MapObject.
mapObjectsViewModel.js – this file contains one type, MapObjectsViewModel, which keeps all the properties my view requires and can replay how objects moved in time from beginTime to endTime.
mapObjectsView.js – this file contains two types, RoutesMapView and MapObjectListViewItem. The first represents the whole view you see on the page, including the map, the special controls and the list of all objects on the map, which you can find on the left. MapObjectListViewItem represents each item in that list.
mapObjectsPageController.js – this file has all the logic that connects the views with Splunk.
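
As an illustration of what a method like calculatePos has to do, the sketch below finds the position of one object at an arbitrary time by linear interpolation between the two surrounding points. Linear interpolation and the `positionAt` name are assumptions made for this example; the real MapObject may compute the position differently.

```javascript
// points: array of {ts, lat, lon} sorted by ts in ascending order.
function positionAt(points, currentTime) {
  if (points.length === 0) { return null; }

  // Before the first point or after the last one, clamp to the nearest known position.
  const first = points[0];
  const last = points[points.length - 1];
  if (currentTime <= first.ts) { return { lat: first.lat, lon: first.lon }; }
  if (currentTime >= last.ts) { return { lat: last.lat, lon: last.lon }; }

  // Find the segment [a, b] that contains currentTime and interpolate inside it.
  for (let i = 1; i < points.length; i++) {
    const a = points[i - 1];
    const b = points[i];
    if (currentTime <= b.ts) {
      const t = (currentTime - a.ts) / (b.ts - a.ts);  // 0 at a, 1 at b
      return { lat: a.lat + t * (b.lat - a.lat), lon: a.lon + t * (b.lon - a.lon) };
    }
  }
  return null; // not reached when the input is sorted
}

const track = [
  { ts: 0, lat: 10, lon: 20 },
  { ts: 10, lat: 20, lon: 40 }
];
console.log(positionAt(track, 5));
```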

Two modes

The Route Map application can work with real-time data as well as replay historical data, depending on what you choose in the search controls.

In real-time mode you can only track the locations of objects at the current time. The application also stores the routes of these objects for the time window you choose in the search controls:

Real time
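
Keeping routes only for the selected window amounts to dropping points older than the window each time new data arrives. A minimal sketch, assuming points carry a `ts` in epoch seconds and the window is given in seconds (`prunePoints` is a hypothetical helper, not a function from the application):

```javascript
// Keep only the points that fall inside [now - windowSeconds, now].
function prunePoints(points, nowSeconds, windowSeconds) {
  const cutoff = nowSeconds - windowSeconds;
  return points.filter(function (p) { return p.ts >= cutoff; });
}

const now = 1383004800;
const route = [
  { ts: now - 3600, lat: 37.70, lon: -122.40 },  // one hour old, outside a 30 minute window
  { ts: now - 600, lat: 37.71, lon: -122.41 },
  { ts: now - 5, lat: 37.72, lon: -122.42 }
];

console.log(prunePoints(route, now, 30 * 60).length);
```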

If you choose a specific time range in the past, like “Last 15 minutes”, you can replay object movements on the map. You can also specify Speed and Refresh Rate (how often you want to redraw points on the map):

Playback mode
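
One way to think about Speed and Refresh Rate: on every redraw, the playback clock moves forward by the refresh interval multiplied by the speed. The sketch below models only that arithmetic. The function name, the units (seconds) and the clamping at the end of the range are assumptions, not the application's code.

```javascript
// Advance a playback clock by one redraw.
// speed: seconds of recorded time that pass per second of real time.
// refreshSeconds: real time between two redraws.
function nextPlaybackTime(currentTime, endTime, speed, refreshSeconds) {
  return Math.min(currentTime + speed * refreshSeconds, endTime);
}

// Replay 15 minutes of history at 60x speed with a redraw every 0.5 seconds.
const end = 15 * 60;
let t = 0;
let redraws = 0;
while (t < end) {
  t = nextPlaybackTime(t, end, 60, 0.5);
  redraws++;
}
console.log(redraws);
```

A higher speed shortens the replay, while a shorter refresh interval makes the movement smoother at the cost of more redraws.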

In both modes you can filter which objects are shown on the map, as well as turn routes on and off. To find an object on the map, click the corresponding color in the list.