Friday, January 2, 2015

Logstash

  1. Introduction
    1. Logstash is a tool for managing events and logs
    2. You can use it to collect logs, parse them, and store them for later use (for example, for searching)
    3. Speaking of searching, Logstash comes with a web interface for searching and drilling into all of your logs
    4. It is fully free and fully open source
    5. The license is Apache 2.0, meaning you are free to use it however you want
    6. Logstash is now part of the Elasticsearch family, which allows the project to build better software faster and to offer production support
  2. Installation
    1. curl -O https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
    2. tar zxvf logstash-1.4.2.tar.gz
    3. cd logstash-1.4.2
  3. Sample
    1. Sample 1
      1. bin/logstash -e 'input { stdin { } } output { stdout {} }'
      2. hello world
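      3. After you type "hello world" and press Enter, Logstash should echo the event back with a timestamp and host prepended, roughly like "2015-01-02T12:00:00.000+0000 my-laptop hello world" (the exact timestamp, hostname, and formatting may vary by version)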
    2. Sample 2
      1. bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
        goodnight moon
        {
               "message" => "goodnight moon",
            "@timestamp" => "2013-11-20T23:48:05.335Z",
              "@version" => "1",
                  "host" => "my-laptop"
        }
  4. Configuration
    1. Inputs
      1. file: reads from a file on the filesystem, much like the UNIX command "tail -0f"
      2. redis: reads from a redis server, using both redis channels and redis lists. Redis is often used as a "broker" in a centralized Logstash installation, where it queues Logstash events from remote Logstash "shippers"
      3. lumberjack: processes events sent in the lumberjack protocol, used by the project now called logstash-forwarder
    2. Filters
      1. grok: parses arbitrary text and structures it. Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable. With 120 patterns built into Logstash, it's more than likely you'll find one that meets your needs (see the combined pipeline example after this list)
      2. mutate: The mutate filter allows you to do general mutations to fields. You can rename, remove, replace, and modify fields in your events
      3. drop: drops an event completely, for example debug events
      4. clone: makes a copy of an event, possibly adding or removing fields
      5. geoip: adds information about the geographical location of IP addresses (and displays amazing charts in Kibana)
    3. Outputs
      1. elasticsearch: the best choice if you're planning to save your data in an efficient, convenient, and easily queryable format
      2. file: writes event data to a file on disk
    4. Codecs
      1. json: encodes/decodes data in JSON format
      2. multiline: merges multiple-line text events into a single event, e.g. Java exception and stack-trace messages
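    5. Putting the plugin types together, a minimal end-to-end pipeline might look like this (a sketch; the access-log path and the COMBINEDAPACHELOG pattern are illustrative assumptions)
      # Tail an Apache access log, parse each line, and index into Elasticsearch
      input {
        file {
          path => "/var/log/httpd/access_log"
        }
      }
      filter {
        grok {
          match => [ "message", "%{COMBINEDAPACHELOG}" ]
        }
      }
      output {
        elasticsearch {
          host => "localhost"
          protocol => "http"
        }
      }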
  5. Inputs
    1. file
      input {
        file {
          add_field => ... # hash (optional), default: {}
          codec => ... # codec (optional), default: "plain"
          discover_interval => ... # number (optional), default: 15
          exclude => ... # array (optional)
          path => ... # array (required)
          sincedb_path => ... # string (optional)
          sincedb_write_interval => ... # number (optional), default: 15
          start_position => ... # string, one of ["beginning", "end"] (optional), default: "end"
          stat_interval => ... # number (optional), default: 1
          tags => ... # array (optional)
          type => ... # string (optional)
        }
      }
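      A minimal working sketch (the log paths and the "syslog" type are assumptions for illustration):
      input {
        file {
          # read these files from the start instead of only new lines
          path => [ "/var/log/messages", "/var/log/*.log" ]
          start_position => "beginning"
          type => "syslog"
        }
      }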
  6. Filters
    1. grok
      filter {
        grok {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          break_on_match => ... # boolean (optional), default: true
          drop_if_match => ... # boolean (optional), default: false
          keep_empty_captures => ... # boolean (optional), default: false
          match => ... # hash (optional), default: {}
          named_captures_only => ... # boolean (optional), default: true
          overwrite => ... # array (optional), default: []
          patterns_dir => ... # array (optional), default: []
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
          tag_on_failure => ... # array (optional), default: ["_grokparsefailure"]
        }
      }
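      For example, given a log line like "55.3.244.1 GET /index.html 15824 0.043", a sketch using built-in patterns extracts client, method, request, bytes, and duration fields:
      filter {
        grok {
          match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
        }
      }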
    2. geoip
      filter {
        geoip {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          database => ... # a valid filesystem path (optional)
          fields => ... # array (optional)
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
          source => ... # string (required)
          target => ... # string (optional), default: "geoip"
        }
      }
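      A minimal sketch; the "clientip" source field assumes an earlier grok filter (e.g. %{COMBINEDAPACHELOG}) has already extracted the client address:
      filter {
        geoip {
          # look up the IP in this field and write results to the "geoip" field
          source => "clientip"
        }
      }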
    3. multiline
      filter {
        multiline {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          enable_flush => ... # boolean (optional), default: false
          negate => ... # boolean (optional), default: false
          pattern => ... # string (required)
          patterns_dir => ... # array (optional), default: []
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
          stream_identity => ... # string (optional), default: "%{host}.%{path}.%{type}"
          what => ... # string, one of ["previous", "next"] (required)
        }
      }
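      A common sketch for Java stack traces: any line beginning with whitespace is merged into the previous line's event:
      filter {
        multiline {
          pattern => "^\s"
          what => "previous"
        }
      }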
    4. drop
      filter {
        drop {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
        }
      }
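      drop is typically wrapped in a conditional; in this sketch the "loglevel" field is an assumption about how your events were parsed:
      filter {
        # discard debug-level events entirely
        if [loglevel] == "debug" {
          drop { }
        }
      }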
    5. date
      filter {
        date {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          locale => ... # string (optional)
          match => ... # array (optional), default: []
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
          target => ... # string (optional), default: "@timestamp"
          timezone => ... # string (optional)
        }
      }
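      A minimal sketch that parses an Apache-style timestamp (assuming a prior grok put it in a "timestamp" field) and writes the result to @timestamp:
      filter {
        date {
          match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
        }
      }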
    6. mutate
      filter {
        mutate {
          add_field => ... # hash (optional), default: {}
          add_tag => ... # array (optional), default: []
          convert => ... # hash (optional)
          gsub => ... # array (optional)
          join => ... # hash (optional)
          lowercase => ... # array (optional)
          merge => ... # hash (optional)
          remove_field => ... # array (optional), default: []
          remove_tag => ... # array (optional), default: []
          rename => ... # hash (optional)
          replace => ... # hash (optional)
          split => ... # hash (optional)
          strip => ... # array (optional)
          update => ... # hash (optional)
          uppercase => ... # array (optional)
        }
      }
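      A sketch combining a few mutations (all field names are assumptions):
      filter {
        mutate {
          rename => [ "clientip", "client_ip" ]      # rename a field
          convert => [ "bytes", "integer" ]          # cast string to integer
          lowercase => [ "method" ]                  # normalize case
        }
      }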
  7. Outputs
    1. elasticsearch
      output {
        elasticsearch {
          action => ... # string (optional), default: "index"
          bind_host => ... # string (optional)
          bind_port => ... # number (optional)
          cluster => ... # string (optional)
          codec => ... # codec (optional), default: "plain"
          document_id => ... # string (optional), default: nil
          embedded => ... # boolean (optional), default: false
          embedded_http_port => ... # string (optional), default: "9200-9300"
          flush_size => ... # number (optional), default: 5000
          host => ... # string (optional)
          idle_flush_time => ... # number (optional), default: 1
          index => ... # string (optional), default: "logstash-%{+YYYY.MM.dd}"
          index_type => ... # string (optional)
          manage_template => ... # boolean (optional), default: true
          node_name => ... # string (optional)
          port => ... # string (optional)
          protocol => ... # string, one of ["node", "transport", "http"] (optional)
          template => ... # a valid filesystem path (optional)
          template_name => ... # string (optional), default: "logstash"
          template_overwrite => ... # boolean (optional), default: false
          workers => ... # number (optional), default: 1
        }
      }
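      A minimal sketch that ships events to a local Elasticsearch over HTTP (the host and protocol values are assumptions; index shows the default):
      output {
        elasticsearch {
          host => "localhost"
          protocol => "http"
          index => "logstash-%{+YYYY.MM.dd}"
        }
      }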
    2. elasticsearch_river
      output {
        elasticsearch_river {
          codec => ... # codec (optional), default: "plain"
          document_id => ... # string (optional), default: nil
          durable => ... # boolean (optional), default: true
          es_bulk_size => ... # number (optional), default: 1000
          es_bulk_timeout_ms => ... # number (optional), default: 100
          es_host => ... # string (required)
          es_ordered => ... # boolean (optional), default: false
          es_port => ... # number (optional), default: 9200
          exchange => ... # string (optional), default: "elasticsearch"
          exchange_type => ... # string, one of ["fanout", "direct", "topic"] (optional), default: "direct"
          index => ... # string (optional), default: "logstash-%{+YYYY.MM.dd}"
          index_type => ... # string (optional), default: "%{type}"
          key => ... # string (optional), default: "elasticsearch"
          password => ... # string (optional), default: "guest"
          persistent => ... # boolean (optional), default: true
          queue => ... # string (optional), default: "elasticsearch"
          rabbitmq_host => ... # string (required)
          rabbitmq_port => ... # number (optional), default: 5672
          user => ... # string (optional), default: "guest"
          vhost => ... # string (optional), default: "/"
          workers => ... # number (optional), default: 1
        }
      }
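      A minimal sketch with only the required options (both hosts assumed local); this output indexes events via RabbitMQ and Elasticsearch's river feature:
      output {
        elasticsearch_river {
          es_host => "localhost"
          rabbitmq_host => "localhost"
        }
      }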
    3. stdout
      output {
        stdout {
          codec => ... # codec (optional), default: "plain"
          workers => ... # number (optional), default: 1
        }
      }
  8. Flags
    1. -w, --filterworkers COUNT: run COUNT filter worker threads to utilize multiple cores (the default is 1)
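    2. For example, a sketch that runs a config file with four filter workers (the file name logstash.conf is an assumption): bin/logstash -f logstash.conf -w 4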
  9. Reference
    1. http://logstash.net/docs/1.4.2
    2. http://grokdebug.herokuapp.com/
