Common Tasks
This is circa 2014 - use with a grain of salt.
Configuration of elasticsearch itself is seldom needed. You will, however, have to maintain the data in your indexes, either with the kopf tool or at the command line.
After you have some data in elasticsearch, you’ll see that your ‘documents’ are organized into ‘indexes’. An index is simply a container for your data. Its name was specified when logstash originally sent the data, and the naming is arbitrarily defined by the client.
Deleting Data
The first thing you’re likely to need is to delete some badly-parsed data from your testing.
Delete all indexes with the name test*
curl -XDELETE 'http://localhost:9200/test*'
Delete from all indexes documents of type ‘WindowsEvent’
curl -XDELETE http://localhost:9200/_all/WindowsEvent
Delete from all indexes documents that have the attribute ‘path’ equal to ‘/var/log/httpd/ssl_request.log’
curl -XDELETE 'http://localhost:9200/_all/_query?q=path:/var/log/httpd/ssl_request.log'
Delete from the index ’logstash-2014.10.29’ documents of type ‘shib-access’
curl -XDELETE http://localhost:9200/logstash-2014.10.29/shib-access
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
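Because a wildcard delete cannot be undone, it can help to print the request before sending it. The following is a minimal sketch, not part of elasticsearch: the function name `delete_indices` and the `ES_HOST` and `DRY_RUN` variables are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the index delete shown above.
# ES_HOST and DRY_RUN are illustrative names, not elasticsearch settings.
delete_indices() {
  local pattern="$1"
  local host="${ES_HOST:-http://localhost:9200}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command, quoted so the shell will not expand the *
    printf "curl -XDELETE '%s/%s'\n" "$host" "$pattern"
  else
    # DRY_RUN=0: actually send the delete request
    curl -XDELETE "${host}/${pattern}"
  fi
}
```

Running `delete_indices 'test*'` only prints the command. `DRY_RUN=0 delete_indices 'test*'` sends it.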
Curator
All the maintenance by hand has to stop at some point and Curator is a good tool to automate some of it. This is a script that will do some curls for you, so to speak.
Install
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install elasticsearch-curator
sudo pip install argparse
Use
curator --help
curator delete --help
And in your crontab
# Note: you must escape % characters with a \ in crontabs
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-bb-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-adfsv2-.*'
20 0 * * * curator delete indices --time-unit days --older-than 14 --timestring '\%Y.\%m.\%d' --regex '^logstash-20.*'
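To see roughly which indexes an `--older-than 14` rule would target, you can compare the date suffix of each index name against a cutoff yourself. This is only a sketch of the idea, not how curator is implemented. It assumes index names end in `-YYYY.MM.DD`, and `older_than` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads index names on stdin and prints those whose
# trailing YYYY.MM.DD stamp is earlier than the cutoff given as $1.
older_than() {
  local cutoff="$1" name stamp
  local re='^[0-9]{4}\.[0-9]{2}\.[0-9]{2}$'
  while read -r name; do
    stamp="${name##*-}"   # text after the last '-'
    # Zero-padded YYYY.MM.DD stamps sort correctly as plain strings
    if [[ $stamp =~ $re && $stamp < $cutoff ]]; then
      printf '%s\n' "$name"
    fi
  done
}
```

With GNU date, a preview could look like `curl -s 'localhost:9200/_cat/indices?h=index' | older_than "$(date -d '14 days ago' +%Y.%m.%d)"`. Names without a date suffix are skipped.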
Sometimes you’ll need to do an inverse match.
0 20 * * * curator delete indices --regex '^((?!logstash).)*$'
A good way to test your regex is by using the show indices method
curator show indices --regex '^((?!logstash).)*$'
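The pattern `^((?!logstash).)*$` matches only names that do not contain `logstash` anywhere. As a rough offline cross-check, assuming you already have a list of index names, `grep -v` selects the same names. `non_logstash` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the names that do NOT contain "logstash",
# which is what the negative-lookahead regex above selects.
non_logstash() {
  grep -v 'logstash'
}
```

For example, `curl -s 'localhost:9200/_cat/indices?h=index' | non_logstash` should list the same indexes as the `curator show indices` command above.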
Here are some OLD posts and links, but be aware that the syntax has changed and it has been several versions since these were written.
http://www.ragingcomputer.com/2014/02/removing-old-records-for-logstash-elasticsearch-kibana
http://www.elasticsearch.org/blog/curator-tending-your-time-series-indices/
http://stackoverflow.com/questions/406230/regular-expression-to-match-line-that-doesnt-contain-a-word
Replication and Yellow Cluster Status
By default, elasticsearch assumes you want at least two nodes with your data replicated between them, so new indexes default to 1 replica. On a single node that replica can never be assigned, which leaves the cluster status yellow. You may not want replication to start with, so you can change the default and change the replica settings on your existing data in bulk with:
Set all existing indexes to keep just one copy (no replicas)
curl -XPUT 'localhost:9200/_settings' -d '
{
"index" : { "number_of_replicas" : 0 }
}'
Change the default settings for new indexes to keep just one copy (no replicas)
curl -XPUT 'localhost:9200/_template/logstash_template' -d '
{
"template" : "*",
"settings" : {"number_of_replicas" : 0 }
} '
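To confirm the change took effect, one option is to ask the cat API for each index's replica count and list any index that still requests replicas. This is a sketch that assumes the `h=index,rep` column selection is available in your version. `with_replicas` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expects "index rep" pairs on stdin and prints the
# names of indexes whose replica count is not 0.
with_replicas() {
  awk '$2 != 0 { print $1 }'
}
```

A possible invocation is `curl -s 'localhost:9200/_cat/indices?h=index,rep' | with_replicas`. Empty output means every index is set to zero replicas.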
Unassigned Shards
You will occasionally have a hiccup where you run out of disk space or something similar and be left with indexes that have no data in them or have shards unassigned. Generally, you will have to delete them but you can also manually reassign them.
http://stackoverflow.com/questions/19967472/elasticsearch-unassigned-shards-how-to-fix
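Before deleting or reassigning anything, it helps to see exactly which shards are unassigned. The cat API has a shards listing with a state column. The sketch below filters it, assuming the `h=index,shard,prirep,state` column selection. `unassigned_shards` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expects "index shard prirep state" rows on stdin and
# prints index, shard number and p/r flag for rows in state UNASSIGNED.
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}
```

A possible invocation is `curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state' | unassigned_shards`.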
Listing Index Info
You can get a decent human-readable list of your indexes using the cat API
curl localhost:9200/_cat/indices
If you want to list by size, the elasticsearch documentation uses this example
curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8
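The column number passed to `sort -k` depends on which columns your elasticsearch version prints. One way to avoid counting columns is to request only the ones you need. This is a sketch that assumes the `h` parameter and the `store.size` column are available in your version. `by_size` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expects "index size_in_bytes" rows on stdin and
# sorts them numerically by the second column, largest first.
by_size() {
  sort -k2,2 -rn
}
```

A possible invocation is `curl -s 'localhost:9200/_cat/indices?bytes=b&h=index,store.size' | by_size`.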