Operation

Basic Operation

Generally, you create a config with three sections:

  • input
  • filter
  • output

This example uses the grok filter to parse the message.

sudo vi /etc/logstash/conf.d/logstash.conf
input {
  file {
    path => "/var/pdweb/www-default/log/request.log"
  }
}
filter {
  grok {
    match => [ "message", "%{COMBINEDAPACHELOG}" ]
  }
}
output {
  stdout { }
}
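
If you want the pretty-printed event layout shown in the examples below, you can also set the stdout codec to rubydebug (optional, but it makes testing easier):

output {
  stdout { codec => rubydebug }
}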

Then you test it at the command line:

# Test the config file itself
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf --configtest

# Test the parsing of data
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf

You should get some nicely parsed lines. If that's the case, you can edit your config to add a sincedb path and an actual destination.

input {
  file {
    path => "/var/pdweb/www-default/log/request.log"
    sincedb_path => "/opt/logstash/sincedb"
  }
}
filter {
  grok {
    match => [ "message", "%{COMBINEDAPACHELOG}" ]
  }
}
output {
  elasticsearch {
    host => "some.server.private"
    protocol => "http"
  }
}

If you instead see output tagged with _grokparsefailure like the example below, you need to change the filter. Take a look at the common gotchas, then the parse failures section below it.

{
       "message" => "test message",
      "@version" => "1",
    "@timestamp" => "2014-10-31T17:39:28.925Z",
          "host" => "some.app.private",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

Common Gotchas

No New Data

Logstash reads new lines by default. If you don't have anyone actually hitting your webserver, but you do have some log entries in the file itself, you can tell logstash to process the existing entries and not save its place in the file.

file {
  path => "/var/log/httpd/request.log"
  start_position => "beginning"   # read from the top, not just new lines
  sincedb_path => "/dev/null"     # don't save our place between runs
}

Multiple Conf Files

Logstash uses all the files in the conf.d directory - even if they don’t end in .conf. Make sure to remove any you don’t want as they can conflict.
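
To see exactly what will be loaded, list the directory and move anything extra out of the way (the file name here is just an example):

# Everything in this directory gets loaded, .conf or not
ls -l /etc/logstash/conf.d/

# A .bak or .old suffix isn't enough - move unwanted files elsewhere
mv /etc/logstash/conf.d/logstash.conf.old /root/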

Default Index

Logstash creates Elasticsearch indexes that look like:

logstash-%{+YYYY.MM.dd}
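
If you want a different scheme, you can set the index option on the elasticsearch output yourself. A minimal sketch, reusing the server from the earlier example (the weblogs- prefix is hypothetical):

output {
  elasticsearch {
    host => "some.server.private"
    protocol => "http"
    # Hypothetical custom name; the default is logstash-%{+YYYY.MM.dd}
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}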

The logstash folks have some great material on how to get started. Really top notch.

http://logstash.net/docs/1.4.2/configuration#fieldreferences

Parse Failures

The Greedy Method

The best way to start is to change your match to a simple pattern and work out from there. Try the ‘GREEDYDATA’ pattern and assign it to a field named ‘Test’. This takes the form of:

%{GREEDYDATA:Test}

And it looks like:

filter {
  grok {
    match => [ "message", "%{GREEDYDATA:Test}" ]
  }
}

{
       "message" => "test message",
      "@version" => "1",
    "@timestamp" => "2014-10-31T17:39:28.925Z",
          "host" => "some.app.private",
          "Test" => "The rest of your message"
}

That should give you some output. You can then start cutting it up with the patterns (also called macros) from the list of default patterns, testing your work as you go in the online grok debugger.
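
For instance, a hypothetical next step that peels a client IP off the front of the message and keeps the rest greedy (the Client field name and the leading-IP format are assumptions about your log):

filter {
  grok {
    # Capture an IP at the start of the line, keep everything else in Test
    match => [ "message", "%{IP:Client} %{GREEDYDATA:Test}" ]
  }
}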

Combining Patterns

There may not be a standard pattern for what you want, but it’s easy to pull together several existing ones. Here’s an example that pulls in a custom timestamp.

Example:
Sun Oct 26 22:20:55 2014 File does not exist: /var/www/html/favicon.ico

Pattern:
match => { "message" => "(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR})"}

Notice the '?' at the beginning of the parenthetical enclosure. Combined with the angle brackets, (?<timestamp>...) is a named capture: rather than being referred to by number later (like a ( ) and \1 in sed), the matched text is stored directly in a field named 'timestamp'.
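
Here's that pattern in a complete filter. The trailing GREEDYDATA capture into a 'logmessage' field is an assumption, added so the rest of the line is kept as well:

filter {
  grok {
    # Named capture for the date, then everything after it
    match => { "message" => "(?<timestamp>%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}) %{GREEDYDATA:logmessage}" }
  }
}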

Optional Fields

Some log formats simply skip columns when they don’t have data. This will cause your parse to fail unless you make some fields optional with a ‘?’, like this:

match => [ "message", "%{HOSTNAME:VHost}? %{COMBINEDAPACHELOG} %{IP:XForwardedFor}?"]
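
Here's a sketch of that match in context; the trailing '?' after VHost and XForwardedFor is what lets lines without those columns still parse:

filter {
  grok {
    # VHost and XForwardedFor may be missing from some lines
    match => [ "message", "%{HOSTNAME:VHost}? %{COMBINEDAPACHELOG} %{IP:XForwardedFor}?" ]
  }
}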

Date Formats

The format strings the date filter takes are Joda-Time patterns, documented here:

http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html
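
For example, the custom timestamp grok'd out in Combining Patterns above could be matched with a format string like this (a sketch; double-check the letters against the Joda docs):

date {
  # Matches e.g. "Sun Oct 26 22:20:55 2014"
  match => [ "timestamp", "EEE MMM dd HH:mm:ss yyyy" ]
}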

Dropping Events

Oftentimes, you’ll have messages that you don’t care about and you’ll want to drop those. Best practice is to do coarse actions first, so you’ll want to compare and drop with a general conditional like:

filter {
  if [message] =~ /File does not exist/ {
    drop { }
  }
  grok {
    ...
  }
}

You can also directly reference fields once you have grok'd the message:

filter {
  grok {
    match => { "message" => "%{HOSTNAME:VHost}? %{COMBINEDAPACHELOG} %{IP:XForwardedFor}?" }
  }
  if [request] == "/status" {
    drop { }
  }
}

http://logstash.net/docs/1.4.2/configuration#conditionals

Dating Messages

By default, logstash stamps each message with the time it sees it. However, there can be a delay between when an action happens and when it gets logged to a file. To remedy this - and to allow you to suck in old files without the date on every event being the same - you add a date filter.

Note - you actually have to grok the date out into its own field; you can't just attempt to match on the whole message. The combined apache macro below does this for us.

filter {
  grok {
    match => { "message" => "%{HOSTNAME:VHost}? %{COMBINEDAPACHELOG} %{IP:XForwardedFor}?" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

In the above case, 'timestamp' is a field parsed out by the grok match, and the date format string tells the filter what its component parts are.

http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html

Sending to Multiple Servers

In addition to an elasticsearch server, you may want to send events to a syslog server at the same time.

input {
  file {
    path => "/var/pdweb/www-default/log/request.log"
    sincedb_path => "/opt/logstash/sincedb"
  }
}

filter {
  grok {
    match => [ "message", "%{HOSTNAME:VHost}? %{COMBINEDAPACHELOG} %{IP:XForwardedFor}?" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    host => "some.server.private"
    protocol => "http"
  }
  syslog {
    host => "some.syslog.server"
    port => "514"
    severity => "notice"
    facility => "daemon"
  }
}

Deleting Sent Events

Sometimes you'll accidentally send a bunch of events to the server and need to delete them and resend corrected versions.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-delete-mapping.html

curl -XDELETE 'http://localhost:9200/_all/SOMEINDEX'
curl -XDELETE 'http://localhost:9200/_all/SOMEINDEX?q=path:"/var/log/httpd/ssl_request_log"'
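
If you'd rather throw away a whole day and let logstash resend it, you can also delete that day's entire index (assuming the default index naming; the date here is an example):

# Destructive - removes the full index for that day
curl -XDELETE 'http://localhost:9200/logstash-2014.10.31'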
