Elasticsearch

1 - Installation (Linux)

This is circa 2014 - use with a grain of salt.

This is generally the first step, as you need a place to collect your logs. Elasticsearch itself is a NoSQL database and well suited for pure-web style integrations.

Java is required, and you may wish to deploy Oracle’s java per the Elasticsearch team’s recommendation. You may also want to dedicate a data partition. By default, data is stored in /var/lib/elasticsearch and that can fill up. We will also install the ‘kopf’ plugin that makes it easier to manage your data.

Install Java and Elasticsearch

# (add a java repo)
sudo yum install java

# (add the elasticsearch repo)
sudo yum install elasticsearch

# Change the storage location
sudo mkdir /opt/elasticsearch
sudo chown elasticsearch:elasticsearch /opt/elasticsearch

sudo vim /etc/elasticsearch/elasticsearch.yml

    ...
    path.data: /opt/elasticsearch/data
    ...
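
# If the node has already written data under the default path, a rough
# migration sketch (assumes the RPM layout above; the service gets
# restarted at the end of this section anyway)
sudo service elasticsearch stop
sudo mkdir -p /opt/elasticsearch/data
sudo mv /var/lib/elasticsearch/* /opt/elasticsearch/data/
sudo chown -R elasticsearch:elasticsearch /opt/elasticsearch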

# Allow connections on ports 9200, 9300-9400 and set the cluster IP

# By design, Elasticsearch is open so control access with care
sudo iptables --insert INPUT --protocol tcp --source 10.18.0.0/16 --dport 9200 --jump ACCEPT

sudo iptables --insert INPUT --protocol tcp --source 10.18.0.0/16 --dport 9300:9400 --jump ACCEPT

sudo vim /etc/elasticsearch/elasticsearch.yml
    ...
    # Failing to set 'publish_host' can result in the cluster auto-detecting an interface that clients or
    # other nodes can't reach. If you only have one interface, you can leave this commented out.
    network.publish_host: 10.18.3.1
    ...
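
# After the restart at the end of this section, you can confirm which
# address the node actually published
curl 'localhost:9200/_nodes?pretty' | grep publish_address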


# Increase the heap size
sudo vim  /etc/sysconfig/elasticsearch

    # Heap size defaults to 256m min, 1g max
    # Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
    ES_HEAP_SIZE=2g

# Install the kopf plugin and access it via your browser

sudo /usr/share/elasticsearch/bin/plugin -install lmenezes/elasticsearch-kopf
sudo service elasticsearch restart

In your browser, navigate to

http://10.18.3.1:9200/_plugin/kopf/

If everything is working correctly you should see a web page with KOPF at the top.
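
You can also sanity-check the node from a shell; a quick probe of the REST port (substitute your own host IP):

# Returns a small JSON blob with the node name and version
curl http://10.18.3.1:9200/

# Cluster health; green or yellow means the node is up
curl 'http://10.18.3.1:9200/_cluster/health?pretty'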

2 - Installation (Windows)

You may need to install on Windows to ensure the ‘maximum amount of serviceability with existing support staff’. I’ve used it on both Windows and Linux and it’s fine either way. Windows just requires a few more steps.

Requirements and Versions

The current version of Elasticsearch at the time of writing these notes is 7.6. It requires an OS and Java. The most recent supported versions of those are:

  • Windows Server 2016
  • OpenJDK 13

Installation

The installation instructions are at https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-elastic-stack.html

Note: Elasticsearch has both a zip and an MSI. The former comes with a Java distro, but the MSI includes a service installer.

Java

The OpenJDK 13 GA releases at https://jdk.java.net/13/ no longer include installers or the JRE, but you can install via an MSI from https://github.com/ojdkbuild/ojdkbuild

Download the latest java-13-openjdk-jre-13.X and execute it. Use the advanced settings to configure JAVA_HOME and other useful variables.

To test the install, open a command prompt and check the version

C:\Users\allen>java --version
openjdk 13.0.2 2020-01-14
OpenJDK Runtime Environment 19.9 (build 13.0.2+8)
OpenJDK 64-Bit Server VM 19.9 (build 13.0.2+8, mixed mode, sharing)

Elasticsearch

Download the MSI installer from https://www.elastic.co/downloads/elasticsearch. It may be tagged as beta, but it installs the GA product well. Importantly, it also installs a Windows service for Elasticsearch.

Verify the installation by checking your services for ‘Elasticsearch’, which should be running.
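
To check from a command prompt instead (the MSI registers the service name as Elasticsearch; adjust if yours differs):

sc query Elasticsearch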

Troubleshooting

Elasticsearch only listening on localhost

This is the default behavior. To accept outside connections, you must edit the config file.

# In an elevated command prompt
notepad C:\ProgramData\Elastic\Elasticsearch\config\elasticsearch.yml

# add
discovery.type: single-node
network.host: 0.0.0.0

https://stackoverflow.com/questions/59350069/elasticsearch-start-up-error-the-default-discovery-settings-are-unsuitable-for
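
After saving the file, restart the service and confirm it is no longer bound only to localhost. A minimal check, assuming the MSI's default service name:

# In an elevated command prompt
net stop Elasticsearch
net start Elasticsearch

# Should now show 0.0.0.0:9200 rather than only 127.0.0.1:9200
netstat -an | findstr :9200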

failure while checking if template exists: 405 Method Not Allowed

You can’t run newer versions of Filebeat against older versions of Elasticsearch. Download the matching older deb and install it with sudo apt install ./some.deb

https://discuss.elastic.co/t/filebeat-receives-http-405-from-elasticsearch-after-7-x-8-1-upgrade/303821
https://discuss.elastic.co/t/cant-start-filebeat/181050
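
A sketch for pinning Filebeat to the cluster's version; the 7.6.2 below is only an example, match whatever version your cluster actually reports:

# Check the Elasticsearch version you are shipping to
curl -s 'localhost:9200/' | grep number

# Download and install the matching older Filebeat (version is an example)
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.2-amd64.deb
sudo apt install ./filebeat-7.6.2-amd64.deb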

3 - Common Tasks

This is circa 2014 - use with a grain of salt.

Configuration of Elasticsearch itself is seldom needed. You will, however, have to maintain the data in your indexes. This is done either with the kopf tool or at the command line.

After you have some data in Elasticsearch, you’ll see that your ‘documents’ are organized into ‘indexes’. An index is simply a container for your data; it was specified when logstash originally sent the data, and the naming is arbitrarily defined by the client.
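
To see what an index holds, pull back a single document and inspect its fields; the index name below follows the logstash-YYYY.MM.DD convention used in these notes:

curl 'localhost:9200/logstash-2014.10.29/_search?size=1&pretty'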

Deleting Data

The first thing you’re likely to need is to delete some badly-parsed data from your testing.

Delete all indexes with the name test*

curl -XDELETE http://localhost:9200/test*

Delete from all indexes documents of type ‘WindowsEvent’

curl -XDELETE http://localhost:9200/_all/WindowsEvent

Delete from all indexes documents having the attribute ‘path’ equal to ‘/var/log/httpd/ssl_request.log’

curl -XDELETE 'http://localhost:9200/_all/_query?q=path:/var/log/httpd/ssl_request.log'

Delete from the index ’logstash-2014.10.29’ documents of type ‘shib-access’

curl -XDELETE http://localhost:9200/logstash-2014.10.29/shib-access

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html

Curator

Maintaining everything by hand has to stop at some point, and Curator is a good tool to automate some of it. It’s essentially a script that will do some curls for you, so to speak.

Install

wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install elasticsearch-curator
sudo pip install argparse

Use

curator --help
curator delete --help

And in your crontab

# Note: you must escape % characters with a \ in crontabs
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-bb-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-adfsv2-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-20.*'

Sometimes you’ll need to do an inverse match.

0 20 * * * curator delete indices --regex '^((?!logstash).)*$'

A good way to test your regex is by using the show indices method

curator show indices --regex '^((?!logstash).)*$'

Here are some OLD posts and links, but be aware that the syntax has changed over the several versions since these were written.

http://www.ragingcomputer.com/2014/02/removing-old-records-for-logstash-elasticsearch-kibana
http://www.elasticsearch.org/blog/curator-tending-your-time-series-indices/
http://stackoverflow.com/questions/406230/regular-expression-to-match-line-that-doesnt-contain-a-word

Replication and Yellow Cluster Status

By default, Elasticsearch assumes you want two nodes and will replicate your data; the default for new indexes is 1 replica. You may not want that to start with, however, so you can change the default and change the replica settings on your existing data in bulk with:

http://stackoverflow.com/questions/24553718/updating-the-default-index-number-of-replicas-setting-for-new-indices
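
You can confirm replicas are the cause by checking cluster health; a single node holding indexes that want replicas will sit at yellow:

# "status" : "yellow" with unassigned_shards > 0 means replicas have nowhere to go
curl 'localhost:9200/_cluster/health?pretty'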

Set all existing indexes to a single copy of the data (no replicas)

curl -XPUT 'localhost:9200/_settings' -d '
{ 
  "index" : { "number_of_replicas" : 0 } 
}'
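
To verify the change took, read the settings back:

# Every index should now report "number_of_replicas" : "0"
curl 'localhost:9200/_settings?pretty' | grep number_of_replicas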

Change the default settings for new indexes to a single copy (no replicas)

curl -XPUT 'localhost:9200/_template/logstash_template' -d ' 
{ 
  "template" : "*", 
  "settings" : {"number_of_replicas" : 0 }
} '


Unassigned Shards

You will occasionally have a hiccup where you run out of disk space or something similar and be left with indexes that have no data in them or have shards unassigned. Generally, you will have to delete them but you can also manually reassign them.

http://stackoverflow.com/questions/19967472/elasticsearch-unassigned-shards-how-to-fix
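
A sketch of a manual reassignment using the reroute API; the index, shard number, and node name here are hypothetical, and allow_primary can lose whatever data that shard held, so treat it as a last resort:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "logstash-2014.10.29",
      "shard" : 0,
      "node" : "node-1",
      "allow_primary" : true
    }
  } ]
}'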

Listing Index Info

You can get a decent human readable list of your indexes using the cat api

curl localhost:9200/_cat/indices

If you want to list by size, the docs suggest this example

curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8
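
If you’re not sure which column is which, the v flag prints a header row; store.size is the column the sort above keys on:

curl 'localhost:9200/_cat/indices?v&bytes=b'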