Write Test Cases for Your Logstash Filters

#logstash #rspec #docker #ruby

About a year ago, I started working on an ELK (Elasticsearch, Logstash, Kibana) setup for a BI platform. While processing events in Logstash, I found it frustrating to iterate on filter changes, especially when they involved complex regular expressions, nested structures, or conditions.

Having extensive experience in Ruby on Rails development, I decided to cover my Logstash filters with RSpec test cases. This approach made sense because Logstash runs on JRuby and its plugins are packaged as Ruby gems.

I also wanted to containerize everything with Docker to avoid installing Logstash and gems on my local machine.

Let's start with the Dockerfile and then build out the project structure step by step.

FROM logstash:2.4

# Install RSpec-related development dependencies
RUN logstash-plugin install --development

# Install prod dependencies
RUN logstash-plugin install logstash-filter-prune

ARG ES_PLUGIN=logstash-output-elasticsearch-6.2.4-java.gem
ARG KAFKA_PLUGIN=logstash-input-kafka-7.0.0.gem

COPY gems/${ES_PLUGIN} /tmp/${ES_PLUGIN}
RUN logstash-plugin install /tmp/${ES_PLUGIN}

COPY gems/${KAFKA_PLUGIN} /tmp/${KAFKA_PLUGIN}
RUN logstash-plugin install /tmp/${KAFKA_PLUGIN}

The gems/ directory contains pinned versions of the plugin gems for our Logstash setup, so every image build installs exactly the same plugins.

For a more efficient way to run test cases, let's define a Makefile:

NAME = your_logstash

build:
	docker build -t $(NAME) .
.PHONY: build

clean:
	docker rmi --force $(NAME)
.PHONY: clean

test:
	@docker run --rm -t -i \
		-v `pwd`/../:/app \
		-w /app \
		$(NAME) \
		/bin/bash -c "rspec /app/logstash/spec/$(TEST_CASE)"
.PHONY: test

console:
	@docker run --rm -t -i \
		-v `pwd`/../:/app \
		-w /app \
		$(NAME) \
		/bin/bash
.PHONY: console
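With the image built (make build), two targets do the day-to-day work:
  • make test runs RSpec inside the container. Pass TEST_CASE to pick a single spec, relative to logstash/spec/, for example make test TEST_CASE=filters/filter_spec.rb. With TEST_CASE left empty, RSpec runs every spec in that directory.
  • make console opens an interactive shell inside the container, which is handy for running specs by hand and debugging them with binding.pry.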

Now, let's create our spec/spec_helper.rb, which forms the foundation of our RSpec setup:

require "logstash/devutils/rspec/spec_helper"
require 'rspec'
require 'rspec/expectations'

require 'ostruct'
require 'erb'
require 'yaml'
require 'json'

# Running the grok code outside a logstash package means
# LOGSTASH_HOME will not be defined, so let's set it here
# before requiring the grok filter
# (coming from the original examples for logstash specs)
unless LogStash::Environment.const_defined?(:LOGSTASH_HOME)
  LogStash::Environment::LOGSTASH_HOME = File.expand_path("../", __FILE__)
end

module Helpers
  ROOT_PATH = File.dirname(File.expand_path(__FILE__))
  TEMPLATES_PATH = File.join(ROOT_PATH, '..', 'conf.d/')

  def load_fixture(filename, settings = {})
    message = File.read(File.join(ROOT_PATH, 'fixtures', filename))
    settings.merge('message' => message)
  end

  def load_filter(filename, render_vars = {})
    content = File.read(File.join(TEMPLATES_PATH, filename))

    render_vars = OpenStruct.new(render_vars)

    # This isn't the most elegant solution, but it's the simplest way to handle
    # Jinja2-style variable replacement
    template = ERB.new(content.gsub('{{', '<%=').gsub('}}', '%>'))
    template.result(render_vars.instance_eval { binding })
  end
end

require "logstash/filters/grok"

In my Logstash filters, I use Jinja2-style syntax ({{ }}) for variable replacement, which gets handled by Ansible during deployment.
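
To see what load_filter does to a template, the substitution can be tried outside Logstash. The following is a minimal sketch in plain Ruby, not part of the project: render_template and SAMPLE_TEMPLATE are hypothetical names, and the method takes the template text directly instead of reading a file from conf.d/.

```ruby
require 'erb'
require 'ostruct'

# Hypothetical stand-in for load_filter: same two steps, but the template
# text is passed in directly rather than read from conf.d/.
def render_template(content, render_vars = {})
  vars = OpenStruct.new(render_vars)
  # Rewrite Jinja2-style {{ name }} into ERB's <%= name %>.
  template = ERB.new(content.gsub('{{', '<%=').gsub('}}', '%>'))
  # Evaluate with the OpenStruct as self, so each bare name in the
  # template resolves to the matching field.
  template.result(vars.instance_eval { binding })
end

# Made-up one-line template for illustration.
SAMPLE_TEMPLATE = 'if [type] == "{{ type }}" {'

puts render_template(SAMPLE_TEMPLATE, 'type' => 'your-source-data')
puts render_template(SAMPLE_TEMPLATE)
```

The first call fills in the type; the second shows that a name missing from render_vars renders as an empty string instead of raising, so it is worth passing every variable the template uses.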

Now we're ready to define an actual spec to test our Logstash filter. Let's assume we want to parse a line like username=<username>. We'll create a filter_spec.rb file inside the spec/filters folder. The configuration file under test should contain only the filter block, with no input or output sections, so the spec exercises the processing logic on its own.

require_relative '../spec_helper'

describe 'elb' do
  extend Helpers

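  # Load the filter from `conf.d/01-filter.conf`, filling in {{ type }}
  # so the conditional matches the sample event below
  config load_filter('01-filter.conf', 'type' => 'your-source-data')

  # The fixture lives in spec/fixtures/sample1.txt and contains, for example:
  #   username=oivoodoo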
  sample(load_fixture('sample1.txt', 'type' => 'your-source-data')) do
    insist { subject.get('username') } == 'oivoodoo'
  end
end
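
The value handed to sample in the spec above is a plain hash: the fixture text under 'message', merged with any extra fields such as 'type'. Here is a small sketch of that step in plain Ruby, assuming a temporary directory in place of spec/fixtures; load_fixture_from and demo_event are hypothetical names used only for this illustration.

```ruby
require 'tmpdir'

# Hypothetical variant of load_fixture that takes the fixtures directory as
# an argument, so it can be tried without the spec_helper's ROOT_PATH.
def load_fixture_from(dir, filename, settings = {})
  message = File.read(File.join(dir, filename))
  # merge returns a new hash; the caller's settings hash is not modified.
  settings.merge('message' => message)
end

# Writes a throwaway fixture, loads it, and returns the resulting event hash.
def demo_event
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'sample1.txt'), "username=oivoodoo\n")
    load_fixture_from(dir, 'sample1.txt', 'type' => 'your-source-data')
  end
end

p demo_event
```

The printed hash has two keys, 'type' and 'message'. File.read keeps the file's trailing newline, so it is part of the message the filter sees.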

And here's our filter configuration:

filter {
  if [type] == "{{ type }}" {
    grok {
      match => {
        "message" => [
          "username=%{WORD:username}"
        ]
      }
    }

    prune {
      blacklist_names => [
        "@version",
        "message"
      ]
    }
  }
}
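
To check how the WORD pattern behaves against the sample line without starting Logstash, it can be approximated with a plain Ruby regex. This is only a sketch: grok's stock WORD pattern is defined as \b\w+\b, and capture_username, WORD, and LINE are hypothetical names for this illustration.

```ruby
# Approximation of grok's stock WORD pattern as a Ruby regex source string.
WORD = '\b\w+\b'

# Sample line in the same shape as spec/fixtures/sample1.txt.
LINE = 'username=oivoodoo'

# Returns the named capture "username", or nil when the pattern does not match.
def capture_username(pattern, line)
  match = Regexp.new(pattern).match(line)
  match && match[:username]
end

# An unanchored WORD matches the first word on the line, which is the key.
puts capture_username("(?<username>#{WORD})", LINE)           # prints username

# Prefixing the literal "username=" makes the capture land on the value.
puts capture_username("username=(?<username>#{WORD})", LINE)  # prints oivoodoo
```

Like this regex, grok is not anchored to the start of the line unless the pattern says so, which is why the literal prefix decides what ends up in the username field.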

This approach has saved me a lot of time: a filter change can be verified locally in seconds, instead of deploying it and waiting for new data to arrive. It's also easy to add a ruby filter to your configuration and use binding.pry to inspect the event object while debugging.


© Copyright 2023 Bitscorp