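Getting Started with Elasticsearch

Elasticsearch is an open-source, highly scalable full-text search and analytics engine. It can store massive amounts of data, search and analyze that data in near real time, and support complex query requirements.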

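Typical use cases for Elasticsearch include:

Online store search

Log analysis (the ELK stack)

Monitoring product price changes

Fast investigation, analysis, visualization, and ad hoc querying of large volumes of data

Elasticsearch is powerful yet simple to use. To get you started quickly, the sections below cover setting up an Elasticsearch cluster and some basic usage.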

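Basic Concepts

Cluster

A cluster consists of one or more nodes and is identified by a unique name, which defaults to elasticsearch. If several Elasticsearch clusters run in the same network environment, give each one a different name: nodes join a cluster by its name, so clusters that share a name can end up with nodes joining the wrong cluster.

Node

A node is a single server in the cluster. A node is also identified by a name, which defaults to a UUID assigned at startup and can be configured. A cluster can contain any number of nodes, and a single node can form a cluster on its own.

Index

An index is a collection of documents. You can create as many indices in a cluster as resources allow.

Type

Within an index you can define one or more types. A type is a logical category within the index, and documents that share a common set of fields are usually placed in the same type. This describes the 5.x release used in this article; later releases restrict and then remove types.

Document

A document is the basic unit of information in an index.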

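Shards and Replicas

An index can hold more data than the hardware of a single node can handle. For example, an index of one billion documents that takes up 1 TB of disk space may not fit on a single node's disk, or a single node may be too slow to serve search requests against it.

To solve this, Elasticsearch can split an index into multiple pieces called shards. The number of shards is set when the index is created. Each shard is a fully functional, independent small index in its own right and can be hosted on any node in the cluster.

Sharding matters for two reasons:

1. It lets you split your data horizontally.

2. Operations can run on the shards in parallel, which increases throughput.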

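In a network or cloud environment, failures can happen at any time, so a failover mechanism is essential. Elasticsearch lets you make one or more copies of each shard, called replica shards (replicas for short).

Replicas have two benefits:

1. High availability if a shard or node fails.

2. Higher search throughput, because searches can also run on the replicas.

To summarize, each index can be split into multiple shards and replicated zero or more times, which gives it primary shards and replica shards. Both the number of shards and the number of replicas can be set when the index is created. The difference is that the number of primary shards cannot be changed afterwards, whereas the number of replicas can be changed dynamically.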
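One way to see why the primary shard count is fixed: Elasticsearch routes each document to a shard by hashing its routing value (the document ID by default) modulo the number of primary shards. The Python sketch below mimics that rule. It is an illustration only: `pick_shard` is a hypothetical helper, and `zlib.crc32` stands in for the Murmur3 hash Elasticsearch actually uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a shard number (stand-in hash, not Murmur3)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


doc_ids = [str(i) for i in range(1000)]

# Shard assignment with 5 primaries versus 6 primaries.
with_5 = {d: pick_shard(d, 5) for d in doc_ids}
with_6 = {d: pick_shard(d, 6) for d in doc_ids}

# Documents whose shard would change if the primary count changed.
moved = sum(1 for d in doc_ids if with_5[d] != with_6[d])
print(f"{moved} of {len(doc_ids)} documents would land on a different shard")
```

Because most documents would map to a different shard, changing the primary count would make existing documents unfindable by ID without reindexing. Replicas are plain copies of a shard, so their number does not affect routing.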

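By default in Elasticsearch 5.x, each index has 5 primary shards and 1 replica, that is, 5 primary shards plus one replica of each, for 10 shards in total.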
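As a small sketch of how these two numbers are specified, the Python snippet below builds the JSON body that could be sent when creating an index and computes the resulting number of shard copies. `number_of_shards` and `number_of_replicas` are the real setting names; the helper functions are hypothetical and nothing is sent to a cluster.

```python
import json


def index_settings(primaries: int = 5, replicas: int = 1) -> str:
    """Return a JSON body for creating an index with explicit shard settings."""
    body = {
        "settings": {
            "number_of_shards": primaries,
            "number_of_replicas": replicas,
        }
    }
    return json.dumps(body)


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard exists once, plus once per replica."""
    return primaries * (1 + replicas)


print(index_settings())
print(total_shard_copies(5, 1))  # 10
```

With the defaults, two indices account for 20 shards, 10 of them primaries.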

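Each Elasticsearch shard is a Lucene index. A single Lucene index has an upper limit on the number of documents it can hold; as of LUCENE-5843 the limit is 2,147,483,519 (= Integer.MAX_VALUE - 128). You can monitor shard sizes with the _cat/shards API.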

curl -XGET gd01:9200/_cat/shards/20171229
20171229 1 p STARTED 1904509 369.5mb 132.98.16.178 data-178
20171229 1 r STARTED 1902986 383.6mb 132.98.16.176 master-176
20171229 3 r STARTED 1898048 349.7mb 132.98.16.178 data-178
20171229 3 p STARTED 1898595 492.2mb 132.98.16.177 data-177
20171229 2 r STARTED 1903094 481.2mb 132.98.16.178 data-178
20171229 2 p STARTED 1904497 526.9mb 132.98.16.176 master-176
20171229 4 p STARTED 1902180 487mb 132.98.16.178 data-178
20171229 4 r STARTED 1900635 586.9mb 132.98.16.176 master-176
20171229 0 p STARTED 1902472 421.6mb 132.98.16.177 data-177
20171229 0 r STARTED 1901511 511.8mb 132.98.16.176 master-176
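The columns of this listing are index, shard number, primary or replica (p/r), state, document count, store size, IP, and node name. As a rough sketch, the Python snippet below parses a few of the lines shown above and reports how far each shard is from the Lucene document limit. `parse_cat_shards` is a hypothetical helper and assumes every shard is assigned, so each line has exactly eight columns.

```python
LUCENE_MAX_DOCS = 2_147_483_519  # Integer.MAX_VALUE - 128

# Three lines copied from the _cat/shards listing above.
SAMPLE = """\
20171229 1 p STARTED 1904509 369.5mb 132.98.16.178 data-178
20171229 1 r STARTED 1902986 383.6mb 132.98.16.176 master-176
20171229 3 p STARTED 1898595 492.2mb 132.98.16.177 data-177
"""


def parse_cat_shards(text: str) -> list:
    """Parse _cat/shards text output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8:
            continue  # skip blank lines and rows with missing columns
        index, shard, prirep, state, docs, store, ip, node = parts
        rows.append({
            "index": index, "shard": int(shard), "prirep": prirep,
            "state": state, "docs": int(docs), "store": store,
            "ip": ip, "node": node,
        })
    return rows


rows = parse_cat_shards(SAMPLE)
for row in rows:
    headroom = LUCENE_MAX_DOCS - row["docs"]
    print(f"shard {row['shard']}{row['prirep']} on {row['node']}: "
          f"{row['docs']} docs, {headroom} below the Lucene limit")
```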

Installing an Elasticsearch Cluster

Elasticsearch 5.6 requires Java 8 (JDK 1.8), so install JDK 1.8 before you begin.

Download the installation archive

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.5.tar.gz

Extract it

tar -xvf elasticsearch-5.6.5.tar.gz

Start a single node

elasticsearch-5.6.5/bin/elasticsearch

Cluster configuration

Sample elasticsearch.yml

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-gotcha
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-${HOSTNAME}
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
node.master: true
node.data: false
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/data/es
#
# Path to log files:
#
path.logs: /var/log/es
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 132.98.16.176
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["132.98.16.176", "132.98.16.177", "132.98.16.179", "132.98.16.180", "132.98.16.182", "132.98.16.183", "132.98.16.184"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 3
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
action.destructive_requires_name: true

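The settings you need to configure are:

cluster.name, the cluster name, which must be the same on every node of the cluster

node.name, the name of this node

node.master, whether the node is master-eligible, that is, whether it may be elected master

node.data, whether the node stores data

network.host, the address the node binds to

discovery.zen.ping.unicast.hosts, the list of Elasticsearch hosts the node contacts to discover the cluster

discovery.zen.minimum_master_nodes, the minimum number of master-eligible nodes that must be visible to form a cluster; set it to a majority of the master-eligible nodes to prevent split brain

Deploy Elasticsearch on several machines, then start the nodes one after another. The nodes discover each other automatically and form a cluster.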
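The majority rule quoted in the configuration template (total number of master-eligible nodes / 2 + 1, using integer division) can be sketched in Python as follows. The function name is chosen for illustration and is not part of any Elasticsearch client.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: integer division by 2, plus 1."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 4, 5):
    print(f"{n} master-eligible node(s) -> minimum_master_nodes = {minimum_master_nodes(n)}")
```

By this rule, the value 3 in the sample configuration corresponds to a cluster with 4 or 5 master-eligible nodes.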

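Trying Out the Cluster

Elasticsearch provides a REST API and a Java API. The examples below use the REST API, which lets you:

Check the health, status, and statistics of the cluster, nodes, and indices

Administer cluster, node, and index data and metadata

Perform CRUD operations

Run advanced searches with paging, sorting, filtering, scripting, aggregations, and more

Cluster health

curl -XGET 'gd01:9200/_cat/health?v'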

epoch      timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1514533722 15:48:42  esbds   green           3         3     20  10    0    0        0             0                  -                100.0%
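For scripts, the JSON returned by the `_cluster/health` endpoint is usually easier to consume than the `_cat` text table. The Python sketch below shows one way to interpret the status color. `fetch_health` is defined but not called, and the `sample` dictionary is hand-built from the values in the table above rather than captured from a live cluster.

```python
import json
from urllib.request import urlopen

STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primaries are allocated, but some replicas are not",
    "red": "at least one primary shard is not allocated",
}


def fetch_health(base_url: str) -> dict:
    """GET <base_url>/_cluster/health and decode the JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def describe(health: dict) -> str:
    """Summarize a _cluster/health response in one line."""
    status = health["status"]
    meaning = STATUS_MEANING.get(status, "unknown status")
    return (f"{health['cluster_name']}: {status} ({meaning}), "
            f"{health['active_shards']} active shards")


# Hand-built sample mirroring the _cat/health row shown above.
sample = {
    "cluster_name": "esbds",
    "status": "green",
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_primary_shards": 10,
    "active_shards": 20,
    "unassigned_shards": 0,
}
print(describe(sample))
```

Against a running cluster, `describe(fetch_health("http://gd01:9200"))` would produce the same kind of summary, assuming the host name from this article resolves.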

List the nodes

curl -XGET 'gd01:9200/_cat/nodes?v'

ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
132.98.16.176           64          26   5    1.45    1.49     1.43 mdi       *      master-176
132.98.16.177           84          19   8    1.27    1.43     1.60 di        -      data-177
132.98.16.178           57          78  16    2.24    2.40     2.45 di        -      data-178
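In this listing, the node.role column packs one letter per role (m for master-eligible, d for data, i for ingest), and an asterisk in the master column marks the currently elected master. A minimal Python sketch for decoding the role string, with `decode_roles` as a hypothetical helper:

```python
ROLE_NAMES = {"m": "master-eligible", "d": "data", "i": "ingest"}


def decode_roles(role_string: str) -> list:
    """Expand a node.role value such as 'mdi' into readable role names."""
    if role_string == "-":
        return ["coordinating only"]
    return [ROLE_NAMES.get(letter, f"unknown({letter})") for letter in role_string]


for name, roles in (("master-176", "mdi"), ("data-177", "di"), ("data-178", "di")):
    print(name, decode_roles(roles))
```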

List the indices

curl -XGET 'gd01:9200/_cat/indices?v'

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   20171228 ij-Y05EEQIimzEDYPyzvjw   5   1    7810000      3922446      2.6gb          1.3gb
green  open   20171229 FUabFhc5TYyi4K_y81GJ9w   5   1    9905546      6122165      4.2gb          2.2gb

Create an index

curl -XPUT 'gd01:9200/test_idx?pretty'

Response:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test_idx"
}

Create a document

Create a document with type external and ID 1 in the test_idx index.

curl -XPUT 'gd01:9200/test_idx/external/1?pretty' -H 'Content-Type: application/json' -d '
{
  "name": "John Doe"
}'

Response:

{
  "_index" : "test_idx",
  "_type" : "external",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "created" : true
}
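The same indexing request can be assembled from Python with only the standard library. The sketch below builds the PUT request without sending it; `index_request` is a hypothetical helper, and the host, index, type, and ID are the ones used in this article.

```python
import json
from urllib.request import Request


def index_request(base_url: str, index: str, doc_type: str,
                  doc_id: str, source: dict) -> Request:
    """Build (but do not send) a PUT request that indexes one document."""
    return Request(
        url=f"{base_url}/{index}/{doc_type}/{doc_id}",
        data=json.dumps(source).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = index_request("http://gd01:9200", "test_idx", "external", "1",
                    {"name": "John Doe"})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to the cluster and return a response like the JSON shown above.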

Retrieve a document

curl -XGET 'gd01:9200/test_idx/external/1?pretty'

Response:

{
  "_index" : "test_idx",
  "_type" : "external",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "John Doe"
  }
}

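Bulk operations

Create documents in bulk. The body is newline-delimited JSON, one action line followed by an optional source line, and the last line must end with a newline.

curl -XPOST 'gd01:9200/test_idx/external/_bulk?pretty' -H 'Content-Type: application/json' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

A single bulk request can mix different operations

curl -XPOST 'gd01:9200/test_idx/external/_bulk?pretty' -H 'Content-Type: application/json' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'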
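When the bulk body is generated by a program rather than typed by hand, it is easy to forget that each JSON object must sit on its own line and that the body must end with a newline. A minimal Python sketch of assembling such a body follows; `bulk_body` is a hypothetical helper, not a client-library function.

```python
import json


def bulk_body(operations: list) -> str:
    """Serialize (action, payload) pairs as newline-delimited JSON.

    payload is None for actions without a second line, such as delete.
    """
    lines = []
    for action, payload in operations:
        lines.append(json.dumps(action))
        if payload is not None:
            lines.append(json.dumps(payload))
    # The bulk API expects the final line to be newline-terminated as well.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ({"index": {"_id": "1"}}, {"name": "John Doe"}),
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(body, end="")
```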

Searching

In Elasticsearch, search criteria can be passed either in the URL or in the request body.

Search criteria in the URL

curl -XGET 'gd01:9200/test_idx/external/_search?q=John'

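Response (only the matching document is shown, in the shape a document GET returns; an actual _search response lists each hit under hits.hits alongside took, timed_out, and _shards, and reports _score rather than _version and found):

{
  "_index" : "test_idx",
  "_type" : "external",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source" : {
    "name" : "John Doe"
  }
}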
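Query strings passed in the URL must be URL-encoded once they contain spaces, quotes, or colons. The Python sketch below builds such a search URL with the standard library. `search_url` is a hypothetical helper, and the field-qualified phrase query `name:"John Doe"` is an illustrative example rather than a query from the original article.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base_url: str, index: str, doc_type: str, query: str) -> str:
    """Build a URI search URL with a safely encoded q parameter."""
    params = urlencode({"q": query, "pretty": "true"})
    return f"{base_url}/{index}/{doc_type}/_search?{params}"


url = search_url("http://gd01:9200", "test_idx", "external", 'name:"John Doe"')
print(url)

# Decoding the query string recovers the original query text.
print(parse_qs(urlsplit(url).query)["q"][0])
```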

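Search criteria in the request body

curl -XPOST 'gd01:9200/test_idx/external/_search?pretty' -H 'Content-Type: application/json' -d '
{
  "query": {
    "match": {
      "name": "John Doe"
    }
  }
}'

A match query is used here because name is an analyzed text field. A term query for the exact string "John Doe" is compared against the individual indexed tokens and would not match.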
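To make the difference between term and match queries on an analyzed text field concrete, here is a deliberately simplified Python simulation. `analyze` is a rough stand-in for the standard analyzer (lowercasing and whitespace splitting only), and the two matching functions are hypothetical helpers that ignore scoring entirely.

```python
def analyze(text: str) -> list:
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def term_matches(indexed_text: str, term: str) -> bool:
    """A term query compares the given term, unanalyzed, against indexed tokens."""
    return term in analyze(indexed_text)


def match_matches(indexed_text: str, query: str) -> bool:
    """A match query analyzes the query text first; any shared token matches."""
    tokens = analyze(indexed_text)
    return any(token in tokens for token in analyze(query))


print(term_matches("John Doe", "John Doe"))   # False: no single token equals "John Doe"
print(term_matches("John Doe", "john"))       # True: "john" is an indexed token
print(match_matches("John Doe", "John Doe"))  # True: query tokens "john" and "doe" match
```

With the default dynamic mapping in 5.x, string fields also get a keyword sub-field, so a term query against name.keyword is the usual way to match the exact, unanalyzed string.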

Beyond simple queries, Elasticsearch also supports:

Filtering, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/_executing_filters.html

Aggregations, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/_executing_aggregations.html
