1 - Installation (Linux)
This is circa 2014 - use with a grain of salt.
This is generally the first step, as you need a place to collect your logs. Elasticsearch itself is a NoSQL database and well suited for pure-web style integrations.
Java is required, and you may wish to deploy Oracle's Java per the Elasticsearch team's recommendation. You may also want to dedicate a data partition; by default, data is stored in /var/lib/elasticsearch, and that can fill up. We will also install the 'kopf' plugin, which makes it easier to manage your data.
Install Java and Elasticsearch
# (add a java repo)
sudo yum install java
# (add the elasticsearch repo)
sudo yum install elasticsearch
# Change the storage location
sudo mkdir /opt/elasticsearch
sudo chown elasticsearch:elasticsearch /opt/elasticsearch
sudo vim /etc/elasticsearch/elasticsearch.yml
...
path.data: /opt/elasticsearch/data
...
# Allow connections on ports 9200, 9300-9400 and set the cluster IP
# By design, Elasticsearch is open so control access with care
sudo iptables --insert INPUT --protocol tcp --source 10.18.0.0/16 --dport 9200 --jump ACCEPT
sudo iptables --insert INPUT --protocol tcp --source 10.18.0.0/16 --dport 9300:9400 --jump ACCEPT
sudo vim /etc/elasticsearch/elasticsearch.yml
...
# Failing to set 'publish_host' can result in the cluster auto-detecting an interface clients or other
# nodes can't reach. If you only have one interface you can leave this commented out.
network.publish_host: 10.18.3.1
...
# Increase the heap size
sudo vim /etc/sysconfig/elasticsearch
# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=2g
# Install the kopf plugin and access it via your browser
sudo /usr/share/elasticsearch/bin/plugin -install lmenezes/elasticsearch-kopf
sudo service elasticsearch restart
In your browser, navigate to
http://10.18.3.1:9200/_plugin/kopf/
If everything is working correctly, you should see a web page with KOPF at the top.
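You can also verify from the command line with the cluster health API; a quick sketch, assuming the same address and the default port:
# Basic reachability and cluster health
curl http://10.18.3.1:9200
curl 'http://10.18.3.1:9200/_cluster/health?pretty'
A healthy node will report 'green', or 'yellow' if it has replica shards it can't place.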
2 - Installation (Windows)
You may need to install on Windows to ensure the 'maximum amount of serviceability with existing support staff'. I've used it on both Windows and Linux and it's fine either way. Windows just requires a few more steps.
Requirements and Versions
The current version of Elasticsearch at the time of writing these notes is 7.6. It requires an OS and Java. The latest of those supported are:
- Windows Server 2016
- OpenJDK 13
Installation
The installation instructions are at https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-elastic-stack.html
Note: Elasticsearch has both a ZIP and an MSI. The former comes with a Java distro, but the MSI includes a service installer.
Java
The OpenJDK 13 GA releases at https://jdk.java.net/13/ no longer include installers or the JRE, but you can install via an MSI from https://github.com/ojdkbuild/ojdkbuild
Download the latest java-13-openjdk-jre-13.X MSI and run it. Use the advanced settings to configure JAVA_HOME and other useful variables.
To test the install, open a command prompt and check the version
C:\Users\allen>java --version
openjdk 13.0.2 2020-01-14
OpenJDK Runtime Environment 19.9 (build 13.0.2+8)
OpenJDK 64-Bit Server VM 19.9 (build 13.0.2+8, mixed mode, sharing)
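If you selected the JAVA_HOME option during install, you can confirm it from the same prompt; the output should be the JDK install path (it will vary by version):
C:\Users\allen>echo %JAVA_HOME%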
Elasticsearch
Download the MSI installer from https://www.elastic.co/downloads/elasticsearch. It may be tagged as beta, but it installs the GA product well. Importantly, it also installs a Windows service for Elasticsearch.
Verify the installation by checking your services for ‘Elasticsearch’, which should be running.
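You can also confirm it is answering requests. A quick sketch from a command prompt, assuming the default port (curl ships with recent Windows builds):
curl http://localhost:9200
You should get back a small JSON document that includes the version number.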
Troubleshooting
Elasticsearch only listening on localhost
By default, this is the case. You must edit the config file.
# In an elevated command prompt
notepad C:\ProgramData\Elastic\Elasticsearch\config\elasticsearch.yml
# add
discovery.type: single-node
network.host: 0.0.0.0
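Then restart the service so the change takes effect and verify from another machine. A sketch; the service name below assumes the MSI default and may differ on your install:
# In an elevated command prompt
net stop Elasticsearch
net start Elasticsearch
# From another host (substitute your server's address)
curl http://your-server:9200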
failure while checking if template exists: 405 Method Not Allowed
You can't run newer versions of Filebeat against older versions of Elasticsearch. Download the matching older .deb and install it with sudo apt install ./some.deb
https://discuss.elastic.co/t/filebeat-receives-http-405-from-elasticsearch-after-7-x-8-1-upgrade/303821
https://discuss.elastic.co/t/cant-start-filebeat/181050
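To confirm the mismatch before downgrading, compare the two versions; a quick sketch, assuming Filebeat ships to Elasticsearch on localhost:
filebeat version
curl http://localhost:9200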
3 - Common Tasks
This is circa 2014 - use with a grain of salt.
Configuration of elasticsearch itself is seldom needed. You will, however, have to maintain the data in your indexes. This is done either with the kopf tool or at the command line.
After you have some data in elasticsearch, you'll see that your 'documents' are organized into 'indexes'. An index is simply a container for your data that was specified when logstash originally sent it, and the naming is arbitrary, defined by the client.
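For example, to pull back a single document and see which index it landed in (a sketch; the index name is the one used in the examples below):
curl 'http://localhost:9200/logstash-2014.10.29/_search?size=1&pretty'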
Deleting Data
The first thing you’re likely to need is to delete some badly-parsed data from your testing.
Delete all indexes with the name test*
curl -XDELETE http://localhost:9200/test*
Delete from all indexes documents of type ‘WindowsEvent’
curl -XDELETE http://localhost:9200/_all/WindowsEvent
Delete from all indexes documents that have the attribute 'path' equal to '/var/log/httpd/ssl_request.log'
curl -XDELETE 'http://localhost:9200/_all/_query?q=path:/var/log/httpd/ssl_request.log'
Delete from the index ’logstash-2014.10.29’ documents of type ‘shib-access’
curl -XDELETE http://localhost:9200/logstash-2014.10.29/shib-access
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
Curator
All the maintenance by hand has to stop at some point, and Curator is a good tool to automate some of it. It's essentially a script that does the curls for you, so to speak.
Install
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install elasticsearch-curator
sudo pip install argparse
Use
curator --help
curator delete --help
And in your crontab
# Note: you must escape % characters with a \ in crontabs
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-bb-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-adfsv2-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-20.*'
Sometimes you’ll need to do an inverse match.
0 20 * * * curator delete indices --regex '^((?!logstash).)*$'
A good way to test your regex is by using the show indices method
curator show indices --regex '^((?!logstash).)*$'
Here are some OLD posts and links, but be aware the syntax has changed and it's been several versions since these were written
http://www.ragingcomputer.com/2014/02/removing-old-records-for-logstash-elasticsearch-kibana
http://www.elasticsearch.org/blog/curator-tending-your-time-series-indices/
http://stackoverflow.com/questions/406230/regular-expression-to-match-line-that-doesnt-contain-a-word
Replication and Yellow Cluster Status
By default, elasticsearch assumes you want to have two nodes and replicate your data, and the default for new indexes is 1 replica. On a single node those replicas can never be assigned, which leaves the cluster at 'yellow' status. You may not want replication to start with, so you can change the replica settings on your existing data in bulk and change the default for new indexes:
Set all existing indexes to keep no replicas (a single copy of the data)
curl -XPUT 'localhost:9200/_settings' -d '
{
"index" : { "number_of_replicas" : 0 }
}'
Change the default for new indexes so they also keep no replicas
curl -XPUT 'localhost:9200/_template/logstash_template' -d '
{
"template" : "*",
"settings" : {"number_of_replicas" : 0 }
} '
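Once the replica counts are zero, the unassigned replica shards go away and cluster health should move from yellow to green. A quick check, assuming localhost:
curl 'http://localhost:9200/_cluster/health?pretty'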
Unassigned Shards
You will occasionally have a hiccup where you run out of disk space or something similar and be left with indexes that have no data in them or have shards unassigned. Generally, you will have to delete them, but you can also manually reassign them; a quick way to spot them is sketched below.
http://stackoverflow.com/questions/19967472/elasticsearch-unassigned-shards-how-to-fix
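To list the shards the cluster couldn't place, the cat shards API works well (a sketch, assuming localhost; the state column reads UNASSIGNED for affected shards):
curl -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED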
Listing Index Info
You can get a decent human-readable list of your indexes using the cat API
curl localhost:9200/_cat/indices
If you want to list them by size, the cat API docs give this example
curl -s 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8