Elasticsearch caches

node query cache

All shards on a node share a single query cache. Entries are evicted with an LRU (least recently used) policy.
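
To make the LRU policy concrete, here is a minimal sketch in Python. It is not Elasticsearch's implementation: the class name, the entry-count capacity, and the filter names are assumptions for illustration (the real cache is sized by memory, not by number of entries).

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("filter_a", [1, 0, 0, 0])
cache.put("filter_b", [0, 1, 1, 0])
cache.get("filter_a")                # filter_a is now the most recent
cache.put("filter_c", [1, 1, 0, 0])  # evicts filter_b
print(list(cache.entries))           # ['filter_a', 'filter_c']
```

Reading filter_a before inserting filter_c is what saves it: the eviction order follows use, not insertion.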

The query cache stores query results, but only for queries executed in filter context.

The cache size is set with indices.queries.cache.size.

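Version 5.1.1 removed caching for term queries, because a plain term query takes about as long as the same query run as a cached filter. See https://www.elastic.co/guide/en/elasticsearch/reference/5.1/release-notes-5.1.1.html
So the following query is not cached: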

curl -XPOST 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "term": { "user": "Kimchy" }
  }
}'

For non-scoring queries executed in filter context, the official guide describes the process as follows: https://www.elastic.co/guide/en/elasticsearch/guide/2.x/_finding_exact_values.html

1 Find matching docs.
The term query looks up the term XHDK-A-1293-#fJ3 in the inverted index and retrieves the list of documents that contain that term. In this case, only document 1 has the term we are looking for.
2 Build a bitset.
The filter then builds a bitset (an array of 1s and 0s) that describes which documents contain the term. Matching documents receive a 1 bit. In our example, the bitset would be [1,0,0,0]. Internally, this is represented as a "roaring bitmap", which can efficiently encode both sparse and dense sets.
3 Iterate over the bitset(s).
Once the bitsets are generated for each query, Elasticsearch iterates over the bitsets to find the set of matching documents that satisfy all filtering criteria. The order of execution is decided heuristically, but generally the most sparse bitset is iterated on first (since it excludes the largest number of documents).
4 Increment the usage counter.
Elasticsearch can cache non-scoring queries for faster access, but it's silly to cache something that is used only rarely. Non-scoring queries are already quite fast due to the inverted index, so we only want to cache queries we know will be used again in the future to prevent resource wastage.
To do this, Elasticsearch tracks the history of query usage on a per-index basis. If a query is used more than a few times in the last 256 queries, it is cached in memory. And when the bitset is cached, caching is omitted on segments that have fewer than 10,000 documents (or less than 3% of the total index size). These small segments tend to disappear quickly anyway and it is a waste to associate a cache with them.
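
The first three steps quoted above (find matching docs, build a bitset, iterate over the bitsets) can be sketched in a few lines of Python. This is only an illustration: plain lists stand in for the roaring bitmaps, document positions are 0-based, and the two filters and their matching documents are made up.

```python
def build_bitset(matching_doc_ids, num_docs):
    """Return a list of 0/1 flags, one per document, 1 where the doc matches."""
    matching = set(matching_doc_ids)
    return [1 if doc_id in matching else 0 for doc_id in range(num_docs)]


def intersect(bitsets):
    """Doc ids set in every bitset, iterating the sparsest bitset first."""
    ordered = sorted(bitsets, key=sum)  # fewest 1 bits first
    sparsest, rest = ordered[0], ordered[1:]
    return [
        doc_id
        for doc_id, bit in enumerate(sparsest)
        if bit and all(other[doc_id] for other in rest)
    ]


# Hypothetical 4-document index with two filters.
product_filter = build_bitset([0], num_docs=4)       # [1, 0, 0, 0]
price_filter = build_bitset([0, 2, 3], num_docs=4)   # [1, 0, 1, 1]
print(intersect([price_filter, product_filter]))     # [0]
```

Starting from the sparsest bitset means only one candidate document is checked against the other filter here, instead of three.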

Note that nothing is cached for a segment that holds fewer than 10,000 documents or less than 3% of the index's total documents.
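
One way to read that segment rule, as a small Python check. The two thresholds come from the quoted guide; the function name and the document counts are assumptions for illustration, not the actual Lucene code.

```python
MIN_SEGMENT_DOCS = 10_000   # threshold from the quoted guide
MIN_INDEX_FRACTION = 0.03   # 3% of the index, from the quoted guide


def segment_is_cacheable(segment_docs, index_docs):
    """True when the segment is large enough for its bitset to be cached."""
    if segment_docs < MIN_SEGMENT_DOCS:
        return False
    return segment_docs >= MIN_INDEX_FRACTION * index_docs


print(segment_is_cacheable(5_000, 100_000))      # False: under 10,000 docs
print(segment_is_cacheable(20_000, 1_000_000))   # False: only 2% of the index
print(segment_is_cacheable(50_000, 1_000_000))   # True: 5% of the index
```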
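The blog post http://www.lxweimin.com/p/b5ff856f3190 reports that the same request took 38 ms on its first two runs, 150 ms on the third and fourth runs, and 1 ms on every run after that. This puzzled me for a long time. The fourth step above ("Increment the usage counter") explains it: a bitset is built only for frequently used queries, and building it takes time.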
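
The usage tracking behind that latency pattern might look roughly like the sketch below. The 256-query window comes from the quoted guide; MIN_USES = 3 is purely an assumption (the guide only says "more than a few times"), and the class and query key are hypothetical.

```python
from collections import Counter, deque

HISTORY_SIZE = 256  # window size from the quoted guide
MIN_USES = 3        # assumption: the guide only says "more than a few times"


class UsageTracker:
    """Remembers the most recent queries and counts how often each was used."""

    def __init__(self):
        self.history = deque()
        self.counts = Counter()

    def record(self, query_key):
        """Record one use; return True if the query now qualifies for caching."""
        self.history.append(query_key)
        self.counts[query_key] += 1
        if len(self.history) > HISTORY_SIZE:
            oldest = self.history.popleft()  # forget uses outside the window
            self.counts[oldest] -= 1
        return self.counts[query_key] >= MIN_USES


tracker = UsageTracker()
decisions = [tracker.record("status:active") for _ in range(4)]
print(decisions)  # [False, False, True, True]
```

Under this reading, the early runs execute uncached, the runs that cross the threshold pay the one-time cost of building the bitset, and later runs are served from the cache.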
Cases that run in filter context:

Frequently used filters will be cached automatically by Elasticsearch, to speed up performance.
Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters in the bool query, the filter parameter in the constant_score query, or the filter aggregation.

In my tests, the following queries were cached:
1 range
2 bool/must_not
3 bool/filter
4 constant_score/filter
5 filter aggregation
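
As an illustration of the bool/filter case in the list above, the sketch below builds a request body that places a clause in filter context. The helper name, the "age" field, and the bounds are hypothetical; the printed JSON is what would be sent as the body of a _search request.

```python
import json


def in_filter_context(clause):
    """Wrap a query clause in bool/filter so it runs without scoring."""
    return {"query": {"bool": {"filter": [clause]}}}


# Hypothetical field name and bounds, for illustration only.
body = in_filter_context({"range": {"age": {"gte": 10, "lte": 20}}})
print(json.dumps(body, indent=2))
```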

shard request cache

For shard-level requests, each shard caches its own local results.
The shard request cache stores the following:

By default, the requests cache will only cache the results of search requests where size=0, so it will not cache hits, but it will cache hits.total, aggregations, and suggestions. Most queries that use now (see Date Math) cannot be cached.

That is:
1 hits.total
2 aggregations
3 suggestions

Queries that contain now can be sped up by rounding the dates, as described at https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-search-speed.html#_search_rounded_dates.
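
A small sketch of what date rounding changes in the request body. The helper and the "my_date" field are hypothetical; the now-1h/m and now/m expressions use Elasticsearch date math rounding to the minute.

```python
def recent_range(field, window="1h", round_to=None):
    """Range clause for documents newer than now-<window>, optionally rounded."""
    start, end = f"now-{window}", "now"
    if round_to:
        start, end = f"{start}/{round_to}", f"{end}/{round_to}"
    return {"range": {field: {"gte": start, "lte": end}}}


# "my_date" is a hypothetical field name.
print(recent_range("my_date"))                # now resolves to a new value on every request
print(recent_range("my_date", round_to="m"))  # resolves to the same bounds within a minute
```

The trade-off is precision: with rounding to the minute, the window boundaries can be off by up to a minute, which is often acceptable in exchange for repeatable bounds.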

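Cached results are invalidated when the shard refreshes, but only if the data in the shard has actually changed. So the longer the refresh interval, the longer cached entries stay usable, as long as the interval still meets your freshness deadline. When the cache is full, the least recently used entries are evicted.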

The request cache can be enabled or disabled per index:

curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.requests.cache.enable": false
  }
}
'

It can also be set explicitly for each request:

curl -XGET 'localhost:9200/my_index/_search?request_cache=true&pretty' -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "colors"
      }
    }
  }
}
'

Requests with size greater than 0 are not cached, even when request_cache=true is set explicitly.
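
The rules in this section can be summarized as a small decision function. This is a sketch of the behavior described here, not Elasticsearch code: the function name and parameters are hypothetical, it assumes the per-request flag overrides the index setting when present, and it ignores the extra rule that most queries using now are not cacheable.

```python
def uses_request_cache(size, index_enabled=True, request_cache=None):
    """Decide, per the rules in this section, whether a search is cached.

    request_cache is the per-request query-string flag (None when omitted).
    """
    if size > 0:
        return False  # hits are never cached, whatever the flags say
    if request_cache is not None:
        return request_cache  # the per-request flag overrides the index setting
    return index_enabled


print(uses_request_cache(size=0))                                          # True
print(uses_request_cache(size=10, request_cache=True))                     # False
print(uses_request_cache(size=0, index_enabled=False, request_cache=True)) # True
```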

cache key

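The submitted JSON body is used as the cache key. If the JSON body changes, the cached entry cannot be used. Even a logically identical request misses the cache if its clauses appear in a different order.
The cache size is set with indices.requests.cache.size, for example indices.requests.cache.size: 2%.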
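
Because the JSON body is the cache key, it helps to serialize requests in a canonical form on the client side. The sketch below shows the idea; the SHA-256 hash is only a stand-in for whatever key Elasticsearch derives internally, and the helper names are hypothetical.

```python
import hashlib
import json


def cache_key(body_text):
    """Stand-in for a cache key: a hash of the exact request body text."""
    return hashlib.sha256(body_text.encode("utf-8")).hexdigest()


def canonical(body):
    """Serialize with sorted keys so equal requests give identical text."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


# The same request, written with its top-level keys in a different order.
a = {"size": 0, "aggs": {"popular_colors": {"terms": {"field": "colors"}}}}
b = {"aggs": {"popular_colors": {"terms": {"field": "colors"}}}, "size": 0}

print(cache_key(json.dumps(a)) == cache_key(json.dumps(b)))   # False
print(cache_key(canonical(a)) == cache_key(canonical(b)))     # True
```

Note that sort_keys only normalizes the order of object keys. The order of elements inside arrays, such as a list of filter clauses, still has to be kept consistent by the caller.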
