System information APIs
// View ES version and configuration info
http://localhost:9200
// List all installed plugins
http://localhost:9200/_cat/plugins?v
// List all indices
http://localhost:9200/_cat/indices?v
// Check ES cluster health
http://localhost:9200/_cat/health?v
// View shard allocation and current disk usage
http://localhost:9200/_cat/allocation?v
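The _cat endpoints above share one URL pattern, so they can be assembled programmatically. A minimal Python sketch, assuming the node listens on localhost:9200 as in the URLs above; the helper name cat_url is hypothetical, and nothing is sent over the network:

```python
BASE_URL = "http://localhost:9200"  # assumption: same local node as in the URLs above


def cat_url(endpoint, verbose=True):
    """Build a _cat URL; the 'v' flag asks ES to include column headers."""
    url = f"{BASE_URL}/_cat/{endpoint}"
    return url + "?v" if verbose else url


# Print the four _cat URLs listed above
for name in ("plugins", "indices", "health", "allocation"):
    print(cat_url(name))
```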
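Create the index yii2blog with type articles
(In the commands below, "json" stands for the JSON request body that follows them.)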
curl -v -X PUT "http://localhost:9200/yii2blog?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog?pretty=true" -d "json"
{
"settings":{
"refresh_interval": "5s", // refresh every 5 seconds
"number_of_shards": 1, // shards; only one machine for now, so 1
"number_of_replicas": 0 // no replicas
},
"mappings": {
"_default_": {
"_all": {
"enabled": true // index all data into _all
}
},
"articles": { // the type name is arbitrary; the table name is a reasonable choice
"dynamic": false, // dynamic mapping (disabled here)
"properties": {
"article_id": {
"type": "long"
},
"post_title": {
"type": "string",
"index": "analyzed",
"analyzer": "ik"
},
"post_excerpt": {
"type": "string",
"index": "analyzed",
"analyzer": "ik"
}
}
}
}
}
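The // comments in the body above are annotations only. Strict JSON parsers (Python's json module, for example) reject them, so it is safest to strip them before sending the body. A hedged Python sketch that builds the same body as a dictionary and serializes it to comment-free JSON; the function name build_index_body is hypothetical, and the field definitions are copied from the mapping above:

```python
import json


def build_index_body(shards=1, replicas=0):
    """Return the index settings and mappings shown above as a plain dict."""
    ik_text = {"type": "string", "index": "analyzed", "analyzer": "ik"}
    return {
        "settings": {
            "refresh_interval": "5s",
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            "_default_": {"_all": {"enabled": True}},
            "articles": {
                "dynamic": False,
                "properties": {
                    "article_id": {"type": "long"},
                    "post_title": dict(ik_text),
                    "post_excerpt": dict(ik_text),
                },
            },
        },
    }


# Comment-free JSON, suitable as the -d payload of the curl command
print(json.dumps(build_index_body(), indent=2))
```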
Delete the index yii2blog
curl -v -X DELETE "localhost:9200/yii2blog?pretty=true"
Add a document
curl -v -X PUT "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
{
"article_id" : 1,
"post_title" : "這是文章標題",
"post_excerpt" : "這是文章描述"
}
Retrieve a document
curl -v -X GET "http://localhost:9200/yii2blog/articles/1?pretty=true"
Delete a document
curl -v -X DELETE "http://localhost:9200/yii2blog/articles/1?pretty=true"
Update a document
curl -v -X PUT "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
{
"article_id" : 1,
"post_title" : "更新文章標題",
"post_excerpt" : "更新文章描述"
}
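Adding and updating use the same request: sending a document to an ID that already exists replaces the stored document. A minimal Python sketch using only the standard library; the helper name index_request is hypothetical, the host is assumed to be the local node from the curl examples, and the request is only constructed, not sent:

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, as in the curl examples


def index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that stores `document` under the given ID."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}?pretty=true"
    data = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = index_request("yii2blog", "articles", 1, {
    "article_id": 1,
    "post_title": "更新文章標題",
    "post_excerpt": "更新文章描述",
})
print(req.get_method(), req.full_url)
# To actually send it against a running node: urllib.request.urlopen(req)
```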
Querying data
1. Return all documents in Elastic
curl -v -X GET "http://localhost:9200/_search?pretty=true"
2. Return all documents in the yii2blog index
curl -v -X GET "http://localhost:9200/yii2blog/_search?pretty=true"
3. Return all documents in yii2blog/articles
curl -v -X GET "http://localhost:9200/yii2blog/articles/_search?pretty=true"
Full-text search
1. Use a match query; the condition is that the post_excerpt field contains the term "描述"
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }}
}
2. Return only 2 results (Elastic returns 10 results per request by default)
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }},
"size": 2
}
3. Start from offset 3 (the default is offset 0) and return only 5 results.
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }},
"from": 3,
"size": 5
}
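from and size together give offset-based paging: from is the number of hits to skip and size is the number to return. A small Python sketch that converts a page number into these two parameters; paged_match is a hypothetical helper, and pages are assumed to be numbered from 1:

```python
def paged_match(field, text, page=1, per_page=10):
    """Build a match query body for the given 1-based page."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {
        "query": {"match": {field: text}},
        "from": (page - 1) * per_page,  # hits to skip
        "size": per_page,               # hits to return
    }


# Second page with 5 hits per page: skips the first 5 hits
body = paged_match("post_excerpt", "描述", page=2, per_page=5)
print(body["from"], body["size"])
```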
4. Search for "描述" OR "文章".
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述 文章" }}
}
5. Search for "描述" AND "文章", using a bool query with must clauses.
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query": {
"bool": {
"must": [
{ "match": { "post_excerpt": "描述" } },
{ "match": { "post_excerpt": "文章" } }
]
}
}
}
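The bool body above has one match clause per required term, so it can be generated from a list of terms. A minimal Python sketch; all_terms_query is a hypothetical helper name:

```python
def all_terms_query(field, terms):
    """Build a bool query in which every term must match the given field."""
    return {
        "query": {
            "bool": {
                # One match clause per term; "must" requires all of them
                "must": [{"match": {field: term}} for term in terms]
            }
        }
    }


body = all_terms_query("post_excerpt", ["描述", "文章"])
print(len(body["query"]["bool"]["must"]))
```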
6. Search the post_excerpt field for the term "描述" (here with a multi_match query) and highlight the matches
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query": {
"multi_match": {
"query": "描述",
"fields": [
"post_excerpt"
]
}
},
"highlight": {
"pre_tags": [
"<b class=\"highlight\">"
],
"post_tags": [
"</b>"
],
"fields": {
"post_excerpt": {}
}
}
}
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.48288077,
"hits": [
{
"_index": "yii2blog",
"_type": "articles",
"_id": "1",
"_score": 0.48288077,
"_source": {
"article_id": 1,
"post_title": "這是文章標題",
"post_excerpt": "這是文章描述,更新"
},
"highlight": {
"post_excerpt": [
"這是文章描述,<b class=\"highlight\">更新</b>"
]
}
},
{
"_index": "yii2blog",
"_type": "articles",
"_id": "2",
"_score": 0.48288077,
"_source": {
"article_id": 1,
"post_title": "這是文章標題",
"post_excerpt": "這是文章描述,更新"
},
"highlight": {
"post_excerpt": [
"這是文章描述,<b class=\"highlight\">更新</b>"
]
}
}
]
}
}
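Each hit carries its highlighted fragments under the highlight key, next to _source. A hedged Python sketch for pulling them out of a parsed response; extract_highlights is a hypothetical helper, and the sample dictionary is a trimmed, illustrative stand-in with the same shape as the result above, not real server output:

```python
def extract_highlights(response, field):
    """Map each hit's _id to its list of highlighted fragments for `field`."""
    result = {}
    for hit in response.get("hits", {}).get("hits", []):
        # Hits without a highlight for this field get an empty list
        result[hit["_id"]] = hit.get("highlight", {}).get(field, [])
    return result


# Illustrative stand-in shaped like a search response
sample = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_id": "1",
                "_source": {"post_excerpt": "這是文章描述"},
                "highlight": {
                    "post_excerpt": ['這是文章<b class="highlight">描述</b>']
                },
            }
        ],
    }
}
print(extract_highlights(sample, "post_excerpt"))
```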
Chinese word segmentation setup
First, install a Chinese analysis plugin. ik is used here; other plugins (such as smartcn) are also worth considering.
$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
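The command above installs version 5.5.1 of the plugin, which is meant to be used with Elastic 5.5.1.
Next, restart Elastic; the newly installed plugin is loaded automatically.
Then create a new Index and specify the fields that need segmentation. This step depends on the data structure, and the command below applies only to this article. Basically, every Chinese field that needs to be searchable has to be configured individually.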
$ curl -X PUT 'localhost:9200/accounts' -d '
{
"mappings": {
"person": {
"properties": {
"user": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"title": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"desc": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}
}
}'
The command above creates an Index named accounts, which contains a Type named person. person has three fields.
- user
- title
- desc
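All three fields hold Chinese text and are of type text, so a Chinese analyzer must be specified; the default English analyzer cannot be used.
Elastic calls its word segmenters analyzers. We specify an analyzer for each field.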
"user": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
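In the snippet above, analyzer is the analyzer applied to the field's text, and search_analyzer is the analyzer applied to the search terms. The ik_max_word analyzer is provided by the ik plugin and splits the text into as many tokens as possible.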
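Because every searchable Chinese field gets the same three settings, the mapping can be generated rather than written out by hand. A minimal Python sketch; ik_text_mapping is a hypothetical helper, and the type and field names are the ones used in the accounts example:

```python
import json


def ik_text_mapping(type_name, fields, analyzer="ik_max_word"):
    """Build a mappings body that assigns the same analyzer to every listed text field."""
    properties = {
        name: {
            "type": "text",
            "analyzer": analyzer,         # applied to the field's text at index time
            "search_analyzer": analyzer,  # applied to the search terms at query time
        }
        for name in fields
    }
    return {"mappings": {type_name: {"properties": properties}}}


body = ik_text_mapping("person", ["user", "title", "desc"])
print(json.dumps(body, indent=2))
```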
Testing analyzers
1. Built-in standard analyzer
curl -v -X POST "http://localhost:9200/_analyze?analyzer=standard&pretty=true" -d "這是一段漢字"
Result
{
"tokens": [
{
"token": "這",
"start_offset": 1,
"end_offset": 2,
"type": "<IDEOGRAPHIC>",
"position": 0
},
{
"token": "是",
"start_offset": 2,
"end_offset": 3,
"type": "<IDEOGRAPHIC>",
"position": 1
},
{
"token": "一",
"start_offset": 3,
"end_offset": 4,
"type": "<IDEOGRAPHIC>",
"position": 2
},
{
"token": "段",
"start_offset": 4,
"end_offset": 5,
"type": "<IDEOGRAPHIC>",
"position": 3
},
{
"token": "漢",
"start_offset": 5,
"end_offset": 6,
"type": "<IDEOGRAPHIC>",
"position": 4
},
{
"token": "字",
"start_offset": 6,
"end_offset": 7,
"type": "<IDEOGRAPHIC>",
"position": 5
}
]
}
2. Chinese analysis plugin ik
curl -v -X POST "http://localhost:9200/_analyze?analyzer=ik&pretty=true" -d "這是一段漢字"
Result
{
"tokens": [
{
"token": "這是",
"start_offset": 1,
"end_offset": 3,
"type": "CN_WORD",
"position": 0
},
{
"token": "一段",
"start_offset": 3,
"end_offset": 5,
"type": "CN_WORD",
"position": 1
},
{
"token": "一",
"start_offset": 3,
"end_offset": 4,
"type": "TYPE_CNUM",
"position": 2
},
{
"token": "段",
"start_offset": 4,
"end_offset": 5,
"type": "COUNT",
"position": 3
},
{
"token": "漢字",
"start_offset": 5,
"end_offset": 7,
"type": "CN_WORD",
"position": 4
},
{
"token": "漢",
"start_offset": 5,
"end_offset": 6,
"type": "CN_WORD",
"position": 5
},
{
"token": "字",
"start_offset": 6,
"end_offset": 7,
"type": "CN_CHAR",
"position": 6
}
]
}
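The two results are easier to compare when only the token strings are kept: standard emits one token per character, while ik emits words as well. A small Python sketch for reducing an _analyze response to its tokens; token_texts is a hypothetical helper, and the sample holds the first three tokens of the ik result above:

```python
def token_texts(analyze_response):
    """Return the token strings of an _analyze response, in order."""
    return [entry["token"] for entry in analyze_response.get("tokens", [])]


# First three tokens of the ik result shown above
ik_sample = {
    "tokens": [
        {"token": "這是", "start_offset": 1, "end_offset": 3, "type": "CN_WORD", "position": 0},
        {"token": "一段", "start_offset": 3, "end_offset": 5, "type": "CN_WORD", "position": 1},
        {"token": "一", "start_offset": 3, "end_offset": 4, "type": "TYPE_CNUM", "position": 2},
    ]
}
print(token_texts(ik_sample))
```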