- 2015.08.14
- mapper attachments type for elasticsearch
- each node
- bin/plugin install elasticsearch/elasticsearch-mapper-attachments/2.4.3
- note that 2.4.3 is for ES 1.4
- restart
- DELETE /test
- PUT /test
- PUT /test/person/_mapping
{
"person" : {
"properties" : {
"file" : {
"type" : "attachment",
"fields" : {
"file" : {"term_vector" : "with_positions_offsets", "store": true},
"title" : {"store" : "yes"},
"date" : {"store" : "yes"},
"author" : {"store" : "yes"},
"keywords" : {"store" : "yes"},
"content_type" : {"store" : "yes"},
"content_length" : {"store" : "yes"},
"language" : {"store" : "yes"}
}
}
}
}
}
- curl -XPOST "http://localhost:9200/test/person" -d '
{
"file" : {
"_content" : "... base64 encoded attachment ..."
}
}'
- for long base64
- curl -XPOST "http://localhost:9200/test/person" -d @- <<CURL_DATA
{
"file" : {
"_content" : "`base64 my.pdf | perl -pe 's/\n/\\n/g'`"
}
}
CURL_DATA
- GET /test/person/_search
{
"fields": [ "file.date", "file.title", "file.name", "file.author", "file.keywords", "file.language", "file.content_length", "file.content_type", "file" ],
"query": {
"match": {
"file.content_type": "pdf"
}
}
}
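The long-base64 upload above can also be wrapped in a small helper. This is only a sketch: `attachment_body` is a hypothetical name, and it strips the line wraps from the `base64` output instead of escaping them as `\n`, which keeps the value a single valid JSON string.

```shell
# Hypothetical helper: print an index request body for the "file" attachment field.
attachment_body() {
  local path="$1"
  local encoded
  # Remove the line wraps that base64 adds so the value is one JSON string.
  encoded="$(base64 < "$path" | tr -d '\n')"
  # The base64 alphabet contains no characters that need JSON escaping.
  printf '{"file":{"_content":"%s"}}\n' "$encoded"
}
```

Usage, assuming the same index and type as above: `attachment_body my.pdf | curl -XPOST "http://localhost:9200/test/person" -d @-`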
- 2015.03.03
- bashrc
- export INNERIP=`hostname -i`
export ES_HEAP_SIZE=8g
export ES_CLASSPATH=/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/libexec/../../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*
- configuration
- cluster.name: test
- node.name: ${HOSTNAME}
- transport.host: ${INNERIP}
- discovery.zen.ping.multicast.enabled: false
- discovery.zen.ping.unicast.hosts: ["10.0.2.a", "10.0.2.b", "10.0.2.c"]
- indices.fielddata.cache.size: 40%
- 2015.03.02
- snapshot and restore
- repository register
- PUT _snapshot/hdfs
{
"type": "hdfs",
"settings": {
"path": "/backup/elasticsearch"
}
}
- repository verification
- POST _snapshot/hdfs/_verify
- snapshot
- PUT _snapshot/hdfs/20150302
- monitoring snapshot/restore progress
- GET _snapshot/hdfs/20150302/_status
- GET _snapshot/hdfs/20150302
- snapshot information and status
- GET _snapshot/hdfs/20150302
- GET _snapshot/hdfs/_all
- GET _snapshot/_status
- GET _snapshot/hdfs/_status
- GET _snapshot/hdfs/20150302/_status
- restore
- POST _snapshot/hdfs/20150302/_restore
- snapshot deletion / stopping currently running snapshot and restore operations
- DELETE _snapshot/hdfs/20150302
- repository deletion
- DELETE _snapshot/hdfs
- reference
- rolling update
- Disable shard reallocation
- curl -XPUT localhost:9200/_cluster/settings -d '{ "transient" : { "cluster.routing.allocation.enable" : "none" } }'
- Shut down a single node within the cluster
- curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'
- Confirm that all shards are correctly reallocated to the remaining running nodes
- Download newest version
- Extract the zip or tarball to a new directory
- Copy the configuration files from the old Elasticsearch installation’s config directory to the new Elasticsearch installation’s config directory
- Move data files from the old Elasticsearch installation’s data directory
- Install plugins
- Start the now upgraded node
- Confirm that it joins the cluster
- Re-enable shard reallocation
- curl -XPUT localhost:9200/_cluster/settings -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" } }'
- Observe that all shards are properly allocated on all nodes
- Repeat this process for all remaining nodes
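The per-node steps above can be sketched as a script. This is a rough outline, not a tested upgrade procedure: `ES_URL`, `DRY_RUN`, `run`, `set_allocation` and `rolling_restart_node` are names made up for this example, and the final health check with `wait_for_status=green` is one possible way to observe that shards are allocated again. With `DRY_RUN=1` (the default here) the commands are only printed, without shell quoting.

```shell
# ES_URL and DRY_RUN are assumptions of this sketch, not Elasticsearch settings.
ES_URL="${ES_URL:-localhost:9200}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Toggle shard allocation: "none" before the restart, "all" after it.
set_allocation() {
  run curl -XPUT "$ES_URL/_cluster/settings" \
    -d "{\"transient\":{\"cluster.routing.allocation.enable\":\"$1\"}}"
}

rolling_restart_node() {
  set_allocation none
  run curl -XPOST "$ES_URL/_cluster/nodes/_local/_shutdown"
  # The upgrade itself (extract, copy config, move data, install plugins) goes here.
  run bin/elasticsearch -d
  set_allocation all
  # Block until the cluster reports green, or give up after 60 seconds.
  run curl "$ES_URL/_cluster/health?wait_for_status=green&timeout=60s"
}
```

Run `rolling_restart_node` once with the default dry-run setting to review the commands, then set `DRY_RUN=0` on the node being upgraded.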
- Reference
- 2015.02.13
- MySQL Slow Query Log Mapping
PUT msql-2015
{
  "mappings": {
    "log": {
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version": { "type": "string" },
        "host": { "type": "string", "index": "not_analyzed" },
        "ip": { "type": "string", "index": "not_analyzed" },
        "lock_time": { "type": "double" },
        "message": { "type": "string", "index": "not_analyzed" },
        "query": { "type": "string" },
        "query_time": { "type": "double" },
        "rows_examined": { "type": "double" },
        "rows_sent": { "type": "double" },
        "type": { "type": "string" },
        "user": { "type": "string" }
      }
    }
  }
}
- MySQL Slow Query Dump Mapping
PUT msqld-2015
{
  "mappings": {
    "dump": {
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version": { "type": "string" },
        "count": { "type": "double" },
        "host": { "type": "string", "index": "not_analyzed" },
        "ip": { "type": "string", "index": "not_analyzed" },
        "lock": { "type": "double" },
        "message": { "type": "string", "index": "not_analyzed" },
        "query": { "type": "string" },
        "rows": { "type": "double" },
        "time": { "type": "double" },
        "type": { "type": "string" },
        "user": { "type": "string" }
      }
    }
  }
}
- 2015.02.12
- MySQL Slow Query Log & Dump Mappings
PUT msqld-2015
{
  "mappings": {
    "log": {
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version": { "type": "string" },
        "host": { "type": "string", "index": "not_analyzed" },
        "ip": { "type": "string", "index": "not_analyzed" },
        "lock_time": { "type": "double" },
        "message": { "type": "string", "index": "not_analyzed" },
        "query": { "type": "string" },
        "query_time": { "type": "double" },
        "rows_examined": { "type": "double" },
        "rows_sent": { "type": "double" },
        "type": { "type": "string" },
        "user": { "type": "string" }
      }
    },
    "dump": {
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version": { "type": "string" },
        "count": { "type": "double" },
        "host": { "type": "string", "index": "not_analyzed" },
        "ip": { "type": "string", "index": "not_analyzed" },
        "lock": { "type": "double" },
        "message": { "type": "string", "index": "not_analyzed" },
        "query": { "type": "string" },
        "rows": { "type": "double" },
        "time": { "type": "double" },
        "type": { "type": "string" },
        "user": { "type": "string" }
      }
    }
  }
}
- 2015.01.19
- restart script
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
sleep 1s
curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'
sleep 1s
bin/elasticsearch -d
sleep 10s
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":false}}'
- ~ 2015.01.01
- Command
- curl 'http://localhost:9200/?pretty'
- curl -XPOST 'http://localhost:9200/_shutdown'
- curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'
- curl -XPOST 'http://localhost:9200/_cluster/nodes/nodeId1,nodeId2/_shutdown'
- curl -XPOST 'http://localhost:9200/_cluster/nodes/_master/_shutdown'
- Configuration
- config/elasticsearch.yml
- cluster.name
- node.name
- node.master
- node.data
- path.*
- path.conf: -Des.path.conf
- path.data
- path.work
- path.logs
- discovery.zen.ping.multicast.enabled: false
- discovery.zen.ping.unicast.hosts
- gateway.recover_after_nodes: n
- discovery.zen.minimum_master_nodes: (n/2) + 1
- action.disable_delete_all_indices: true
- action.auto_create_index: false
- action.destructive_requires_name: true
- index.mapper.dynamic: false
- script.disable_dynamic: true
- indices.fielddata.cache.size: 40%
- dynamic
- discovery.zen.minimum_master_nodes
curl -XPUT localhost:9200/_cluster/settings -d '{ "persistent" : { "discovery.zen.minimum_master_nodes" : (n/2) + 1 } }'
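The `(n/2) + 1` in the request above is a formula, not valid JSON, so the number has to be computed before sending. A minimal sketch, assuming `n` is the count of master-eligible nodes; `minimum_master_nodes` and `quorum_settings_body` are hypothetical helper names.

```shell
# Majority quorum for n master-eligible nodes: (n/2) + 1 with integer division.
minimum_master_nodes() {
  local n="$1"
  echo $(( n / 2 + 1 ))
}

# Build the settings body with the computed number in place of the formula.
quorum_settings_body() {
  printf '{"persistent":{"discovery.zen.minimum_master_nodes":%d}}\n' \
    "$(minimum_master_nodes "$1")"
}
```

Usage for a three-node cluster: `curl -XPUT localhost:9200/_cluster/settings -d "$(quorum_settings_body 3)"`, which sends a value of 2.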
- disable _all
PUT /my_index/_mapping/my_type
{
  "my_type": {
    "_all": { "enabled": false }
  }
}
- include_in_all
PUT /my_index/my_type/_mapping
{
  "my_type": {
    "include_in_all": false,
    "properties": {
      "title": { "type": "string", "include_in_all": true },
      ...
    }
  }
}
- _alias, _aliases
PUT /my_index_v1
PUT /my_index_v1/_alias/my_index
POST /_aliases
{
  "actions": [
    { "remove": { "index": "my_index_v1", "alias": "my_index" }},
    { "add": { "index": "my_index_v2", "alias": "my_index" }}
  ]
}
- refresh_interval (bulk indexing)
PUT /my_logs
{ "settings": { "refresh_interval": "30s" } }
POST /my_logs/_settings
{ "refresh_interval": -1 }
POST /my_logs/_settings
{ "refresh_interval": "1s" }
- flush
POST /blogs/_flush
POST /_flush?wait_for_ongoing
- optimize
POST /logstash-old-index/_optimize?max_num_segments=1
- field-length norm (for logging)
PUT /my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "text": { "type": "string", "norms": { "enabled": false } }
      }
    }
  }
}
- tune cluster and index recovery settings (test the value)
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.node_initial_primaries_recoveries":25}}'
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.node_concurrent_recoveries":5}}'
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"indices.recovery.max_bytes_per_sec":"100mb"}}'
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"indices.recovery.concurrent_streams":20}}'
- logging.yml
- use node.name instead of cluster.name
file: ${path.logs}/${node.name}.log
- elasticsearch.in.sh
- disable HeapDumpOnOutOfMemoryError
#JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError"
- ES_HEAP_SIZE: 50% of RAM (keep below 32g)
- export ES_HEAP_SIZE=31g
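The heap rule above (half of RAM, below 32g) can be sketched as a small function. `es_heap_size` is a hypothetical helper, the cap of 31g follows the export line above, and the input is assumed to be the machine's RAM in whole gigabytes (at least 2).

```shell
# Half of RAM, capped at 31g so the heap stays under the 32g line.
es_heap_size() {
  local ram_gb="$1"
  local heap=$(( ram_gb / 2 ))
  if [ "$heap" -gt 31 ]; then
    heap=31
  fi
  echo "${heap}g"
}
```

Usage on a 64 GB machine: `export ES_HEAP_SIZE="$(es_heap_size 64)"`, which yields 31g.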
- no swap
- bootstrap.mlockall: true
- ulimit -l unlimited
- thread pools
- thread pool size
- search - 3 * # of processors (3 * 64 = 192)
- index - 2 * # of processors (2 * 64 = 128)
- bulk - 3 * # of processors (3 * 64 = 192)
- queues - set the size to -1 to prevent rejections from ES
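The pool-size multipliers above can be checked with a few lines of shell arithmetic. `thread_pool_sizes` is a hypothetical helper that only prints the three numbers; it does not write any Elasticsearch configuration.

```shell
# Print pool sizes from the multipliers in the notes: search 3x, index 2x, bulk 3x.
thread_pool_sizes() {
  local cores="$1"
  echo "search=$(( 3 * cores ))"
  echo "index=$(( 2 * cores ))"
  echo "bulk=$(( 3 * cores ))"
}
```

For the 64-processor case in the notes, `thread_pool_sizes 64` prints search=192, index=128 and bulk=192. The core count can be taken from `getconf _NPROCESSORS_ONLN` where that is available.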
- buffers
- increased indexing buffer size to 40%
- dynamic node.name
- ES script
export ES_NODENAME=`hostname -s`
- elasticsearch.yml
node.name: "${ES_NODENAME}"
- Hardware
- CPU
- core
- disk
- SSD
- noop / deadline scheduler
- better IOPS
- cheaper with respect to IOPS
- manufacturing tolerance can vary
- RAID
- not necessarily needed
- ES handles redundancy
- Monitoring
- curl 'localhost:9200/_cluster/health'
- curl 'localhost:9200/_nodes/process'
- max_file_descriptors: 30000?
- curl 'localhost:9200/_nodes/jvm'
- version
- mem.heap_max
- curl 'localhost:9200/_nodes/jvm/stats'
- heap_used
- curl 'localhost:9200/_nodes/indices/stats'
- fielddata
- curl 'localhost:9200/_nodes/indices/stats?fields=created_on'
- fields
- curl 'localhost:9200/_nodes/http/stats'
- http
- GET /_stats/fielddata?fields=*
- GET /_nodes/stats/indices/fielddata?fields=*
- GET /_nodes/stats/indices/fielddata?level=indices&fields=*
- Scenario
- adding nodes
- disable allocation to stop shard shuffling until ready
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
- increase speed of transfers
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"indices.recovery.concurrent_streams":6,"indices.recovery.max_bytes_per_sec":"50mb"}}'
- start new nodes
- enable allocation
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":false}}'
- removing nodes
- exclude the nodes from the cluster, this will tell ES to move things off
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.exclude._name":"node-05*,node-06*"}}'
- increase speed of transfers
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"indices.recovery.concurrent_streams":6,"indices.recovery.max_bytes_per_sec":"50mb"}}'
- shutdown old nodes after all shards move off
curl -XPOST 'localhost:9200/_cluster/nodes/node-05*,node-06*/_shutdown'
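When the list of nodes to drain changes often, the exclude request body can be generated instead of typed. A small sketch; `exclude_nodes_body` is a hypothetical helper that joins its arguments with commas into the `cluster.routing.allocation.exclude._name` value used above.

```shell
# Join the node name patterns with commas and wrap them in the settings body.
exclude_nodes_body() {
  local IFS=','
  # "$*" expands to all arguments joined by the first character of IFS.
  printf '{"transient":{"cluster.routing.allocation.exclude._name":"%s"}}\n' "$*"
}
```

Usage: `curl -XPUT localhost:9200/_cluster/settings -d "$(exclude_nodes_body 'node-05*' 'node-06*')"`. Quote the patterns so the shell does not expand the `*` against local files.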
- upgrades / node restarts
- disable auto balancing if doing rolling restarts
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
- restart
- enable auto balancing
curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":false}}'
- re-indexing / bulk indexing
- set replicas to 0
- increase after completion
- configure heap size
- heap size setting
- export ES_HEAP_SIZE=9g
- curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
- curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'
- bin/elasticsearch -d
- curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":false}}'
Monday, August 31, 2015
ElasticSearch