How Rule-Based Materialized View Query Rewriting Works in Calcite

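1. Terminology

Materialized view: the stored (materialized) result of a view's query.
Materialized view QueryRel: the relational expression of the SQL that defines the materialized view (the query statement).
Materialized view TableRel: the relational expression that reads the stored materialized view result (the TableScan operator over the stored materialized view).
COMPLETE: the query and the materialized view reference exactly the same tables. For example, the query references tables a, b, and c, and the materialized view also references a, b, and c.
VIEW_PARTIAL: the query's tables fully contain the materialized view's tables. For example, the query references tables a, b, and c, while the materialized view references only a and b.
QUERY_PARTIAL: the materialized view's tables fully contain the query's tables. For example, the query references tables a and b, while the materialized view references a, b, and c.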
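The three table-set relationships above can be sketched as a plain set comparison. This is an illustrative helper, not Calcite's API: the class name MatchModalityDemo, the classify method, and the extra NONE value (for table sets where neither contains the other) are assumptions made for this example.

```java
import java.util.Set;

// Hypothetical helper (not part of Calcite): classifies how the tables
// referenced by a query relate to the tables referenced by a materialized view.
class MatchModalityDemo {
    enum Modality { COMPLETE, VIEW_PARTIAL, QUERY_PARTIAL, NONE }

    static Modality classify(Set<String> queryTables, Set<String> viewTables) {
        if (queryTables.equals(viewTables)) {
            return Modality.COMPLETE;       // same tables on both sides
        }
        if (queryTables.containsAll(viewTables)) {
            return Modality.VIEW_PARTIAL;   // query has extra tables
        }
        if (viewTables.containsAll(queryTables)) {
            return Modality.QUERY_PARTIAL;  // view has extra tables
        }
        return Modality.NONE;               // neither set contains the other
    }

    public static void main(String[] args) {
        System.out.println(classify(Set.of("a", "b", "c"), Set.of("a", "b", "c")));
        System.out.println(classify(Set.of("a", "b", "c"), Set.of("a", "b")));
        System.out.println(classify(Set.of("a", "b"), Set.of("a", "b", "c")));
    }
}
```

The three calls in main correspond, in order, to the COMPLETE, VIEW_PARTIAL, and QUERY_PARTIAL examples in the definitions.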

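2. Background

A materialized view stores the result of a SQL query.
Rewriting a query to use a materialized view is an effective way to speed it up: all or part of the query is rewritten to read from the materialized view.
When the materialized view and the query are fully equivalent, the view can answer the query directly. When they are not fully equivalent, techniques such as predicate compensation and aggregate pull-up (rollup) are needed so that the materialized view can rewrite part of the query's relational algebra.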
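Predicate compensation can be illustrated with a deliberately simplified sketch. It assumes filters are conjunctions (AND) of predicates represented as plain strings, and that two predicates match only when their text is identical. A real implementation would also need to reason about implication (for example, a > 10 implies a > 5) and check that the leftover predicates only use columns the view exposes. The class and method names are hypothetical and are not Calcite's API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of predicate compensation over AND-ed predicates.
class FilterCompensationDemo {
    /**
     * Returns the predicates the query must still apply on top of the view,
     * or empty if the view filters on something the query does not, in which
     * case the view may be missing rows the query needs.
     */
    static Optional<Set<String>> residual(List<String> queryConjuncts,
                                          List<String> viewConjuncts) {
        Set<String> remaining = new LinkedHashSet<>(queryConjuncts);
        if (!remaining.containsAll(viewConjuncts)) {
            return Optional.empty();
        }
        remaining.removeAll(viewConjuncts);
        return Optional.of(remaining);
    }

    public static void main(String[] args) {
        // Query: WHERE a > 10 AND b = 5; view: WHERE a > 10.
        // The view can be used if b = 5 is applied on top of it.
        System.out.println(residual(List.of("a > 10", "b = 5"), List.of("a > 10")));
        // Query: WHERE a > 10; view: WHERE b = 5. The view cannot be used.
        System.out.println(residual(List.of("a > 10"), List.of("b = 5")));
    }
}
```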

3. Problem statement

How can a materialized view be used, in a rule-based way, to rewrite part of a query expression, or to replace it entirely?

4. Overview

In Calcite, rule-based (UnifyRule) query rewriting works by repeatedly walking the RelNode relational expression of the SQL query and the QueryRel expression that defines the materialized view, and looking up the UnifyRule rules that apply to each pair of nodes. When a rule's match succeeds, the rule's apply method is called, and the query's relational expression is rewritten using the TableRel expression of the materialized result.

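5. Diagrams

5.1 Structure diagram

SubstitutionVisitor uses UnifyRule to rewrite the mutable nodes of the query (Query MutableRel) with the mutable nodes of the target (Target MutableRel).

[Figure: structure diagram]

5.2 View substitution flowchart

The figure shows the core view substitution flow; nodes drawn in the same color are the same node.

[Figure: materialized view matching flowchart]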

6. Core components

UnifyRule:

A rule that rewrites a query relational expression using the materialized view target. The comment from the source code reads:

  /** Rule that attempts to match a query relational expression
   * against a target relational expression.
   *
   * <p>The rule declares the query and target types; this allows the
   * engine to fire only a few rules in a given context.</p>
   */

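The subclasses of UnifyRule are shown below.

[Figure: UnifyRule subclasses]

SubstitutionVisitor:

The core class for substituting parts of a query relational expression tree. It uses a bottom-up matching algorithm and can perform some rewriting and predicate compensation, so the query relational expression and the materialized view's query expression do not have to be identical.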

/**
 * Substitutes part of a tree of relational expressions with another tree.
 * <p>Uses a bottom-up matching algorithm. Nodes do not need to be identical.
 * At each level, returns the residue.</p>
 */

MutableRel

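Before view substitution, the RelNode relational expression is first converted to a MutableRel. SubstitutionVisitor then performs the rewrite on the MutableRel tree, and once the rewrite is complete the MutableRel is converted back to a RelNode. A MutableRel is equivalent to a RelNode, and it also records its position within its parent, which makes traversal and backtracking easier during view substitution.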

/** Mutable equivalent of {@link RelNode}.
 *
 * <p>Each node has mutable state, and keeps track of its parent and position
 * within parent.
 */
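To show why tracking the parent and the position within the parent is useful, here is a toy stand-in for MutableRel. ToyRel and all of its members are hypothetical names for this sketch and are not the Calcite classes. It supports in-place replacement of a subtree and a pre-order step that either descends to the first input or jumps to the next subtree, similar in spirit to how the loop in section 7.3 computes childOrNext.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a mutable relational node (hypothetical, not Calcite's MutableRel).
class ToyRel {
    final String name;
    final List<ToyRel> inputs = new ArrayList<>();
    ToyRel parent;        // null for the root
    int ordinalInParent;  // index of this node in parent.inputs

    ToyRel(String name, ToyRel... children) {
        this.name = name;
        for (ToyRel child : children) {
            child.parent = this;
            child.ordinalInParent = inputs.size();
            inputs.add(child);
        }
    }

    // Swaps this node for `replacement` in the parent's inputs; returns the parent (null for a root).
    ToyRel replaceInParent(ToyRel replacement) {
        ToyRel p = parent;
        if (p != null) {
            p.inputs.set(ordinalInParent, replacement);
            replacement.parent = p;
            replacement.ordinalInParent = ordinalInParent;
            parent = null;
        }
        return p;
    }

    // Next node in pre-order that is NOT a descendant of `node`, or null if none.
    static ToyRel nextSubtree(ToyRel node) {
        ToyRel current = node;
        while (current.parent != null) {
            int next = current.ordinalInParent + 1;
            if (next < current.parent.inputs.size()) {
                return current.parent.inputs.get(next);
            }
            current = current.parent;
        }
        return null;
    }

    // Full pre-order step: first input if there is one, otherwise the next subtree.
    static ToyRel childOrNext(ToyRel node) {
        return node.inputs.isEmpty() ? nextSubtree(node) : node.inputs.get(0);
    }

    String render() {
        if (inputs.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('(');
        for (int i = 0; i < inputs.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(inputs.get(i).render());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        ToyRel project = new ToyRel("Project", new ToyRel("Scan"));
        ToyRel root = new ToyRel("Aggregate", project);
        System.out.println(root.render());
        project.replaceInParent(new ToyRel("MvScan"));  // no search from the root needed
        System.out.println(root.render());
        for (ToyRel n = root; n != null; n = childOrNext(n)) {
            System.out.println("visit " + n.name);
        }
    }
}
```

Because each node knows its parent and ordinal, a replacement touches only one slot in the parent's input list, and the traversal can resume from any node without keeping an explicit stack.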

Core method

org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.RelNode)

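7. Core view substitution flow

7.1 Data used in the substitution

This section uses an aggregate pull-up example to walk through the rule-based view rewriting mechanism.

Materialized view SQL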

select C,D, count(A) from "@jingda".employees
GROUP BY C,D

Relational expression of the materialized view query

  LogicalAggregate(group=[{0, 1}], EXPR$2=[COUNT($2)])
    LogicalProject(C=[$2], D=[$3], A=[$0])
      ScanCrel(table=["@jingda".employees], columns=[`A`, `B`, `C`, `D`, `E`, `F`], splits=[1])

Operator that reads the stored materialized view result

LogicalProject(C=[$0], D=[$1], EXPR$2=[CAST($2):BIGINT NOT NULL])
  ScanCrel(table=["__accelerator"."7db4b655-d381-4cc8-ba6f-adc2c40d0153"."479ce684-efd6-4420-8a5b-68350789b8bb"], columns=[`C`, `D`, `EXPR$2`], splits=[3])

Query SQL

select D, count(A) from "@jingda".employees
GROUP BY D

Relational expression of the query

 LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1)])
  LogicalProject(D=[$3], A=[$0])
    ScanCrel(table=["@jingda".employees], columns=[`A`, `B`, `C`, `D`, `E`, `F`], splits=[1])

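Sketch of the rewritten SQL (cnt is an alias added here so the outer query can refer to the inner count)

select D, sum(cnt) from
(select C, D, count(A) as cnt from "@jingda".employees GROUP BY C, D)
GROUP BY D

Rewritten relational expression
Here the query relational expression has been rewritten to use the precomputed materialized result: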

LogicalAggregate(group=[{1}], EXPR$1=[$SUM0($2)])
  LogicalProject(C=[$0], D=[$1], EXPR$2=[CAST($2):BIGINT NOT NULL])
    ScanCrel(table=["__accelerator"."7db4b655-d381-4cc8-ba6f-adc2c40d0153"."479ce684-efd6-4420-8a5b-68350789b8bb"], columns=[`C`, `D`, `EXPR$2`], splits=[3])
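The rewrite relies on the fact that a COUNT grouped by D can be obtained by summing the COUNT values already grouped by (C, D). The following self-contained sketch checks this on a few made-up rows. The data, the class RollupDemo, and its methods are assumptions for illustration only and have nothing to do with the actual employees table. The rewritten plan uses $SUM0 rather than SUM; SUM0 returns 0 instead of NULL when there are no input rows, which keeps the result type consistent with COUNT.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical data and helpers showing that SUM over per-(C, D) counts
// equals COUNT(A) grouped by D alone.
class RollupDemo {
    static final class Row {
        final Integer a;  // may be null; COUNT(A) ignores nulls
        final String c;
        final String d;
        Row(Integer a, String c, String d) { this.a = a; this.c = c; this.d = d; }
    }

    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "x", "p"),
            new Row(2, "x", "p"),
            new Row(null, "y", "p"),
            new Row(3, "y", "q"));
    }

    // Contribution of one row to COUNT(A): null values are not counted.
    static long countsAs(Integer a) {
        return a == null ? 0L : 1L;
    }

    // Materialized view: SELECT C, D, COUNT(A) ... GROUP BY C, D
    static Map<List<String>, Long> countByCD(List<Row> rows) {
        Map<List<String>, Long> out = new HashMap<>();
        for (Row r : rows) {
            out.merge(Arrays.asList(r.c, r.d), countsAs(r.a), Long::sum);
        }
        return out;
    }

    // Original query: SELECT D, COUNT(A) ... GROUP BY D
    static Map<String, Long> countByD(List<Row> rows) {
        Map<String, Long> out = new TreeMap<>();
        for (Row r : rows) {
            out.merge(r.d, countsAs(r.a), Long::sum);
        }
        return out;
    }

    // Rewritten query: sum the view's counts, grouping by D only.
    static Map<String, Long> rollUpToD(Map<List<String>, Long> view) {
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<List<String>, Long> e : view.entrySet()) {
            out.merge(e.getKey().get(1), e.getValue(), Long::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        Map<String, Long> direct = countByD(rows);
        Map<String, Long> viaView = rollUpToD(countByCD(rows));
        System.out.println("direct   = " + direct);
        System.out.println("via view = " + viaView);
        System.out.println("equal    = " + direct.equals(viaView));
    }
}
```

The same reasoning does not carry over to every aggregate: an average, for instance, cannot be rolled up by averaging the per-group averages.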

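7.2 Data flow diagram

[Figure: data flow diagram]

In the figure, Query is initially the SQL query, Target is the SQL that defines the materialized view, and Replacement is the operator that reads the location where the materialized view is stored.

The first round matches CalcToCalcUnifyRule, and the lower part of the relational expression is rewritten to: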

Calc(program: (expr#0..2=[{inputs}], D=[$t1], A=[$t2]))
 Calc(program: (expr#0..5=[{inputs}], C=[$t2], D=[$t3], A=[$t0]))
   Scan(table: [@rp_test, employees])

The second round matches AggregateOnCalcToAggregateUnifyRule, and the relational expression is rewritten to:

Aggregate(groupSet: {1}, groupSets: [{1}], calls: [$SUM0($2)])
  Aggregate(groupSet: {0, 1}, groupSets: [{0, 1}], calls: [COUNT($2)])

Finally, the whole query relational expression is rewritten to:

Holder
  Aggregate(groupSet: {1}, groupSets: [{1}], calls: [$SUM0($2)])
    Project(projects: [$0, $1, CAST($2):BIGINT NOT NULL])
      Scan(table: [__accelerator, 1c4b39df-c7c2-4e40-aebb-dfa87faa80a9, 14c0517b-10e6-4d66-92d3-f68e451c4216])

7.3 Core code analysis

The core code is in org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.mutable.MutableRel):

for (;;) {
      int count = 0;
      MutableRel queryDescendant = query;
    outer:
      while (queryDescendant != null) {
        for (Replacement r : attempted) {
          // If this query node has already been replaced using the materialized
          // view, move on to the next branch after queryDescendant.
          if (r.stopTrying && queryDescendant == r.after) {
            // This node has been replaced by previous iterations in the
            // hope to match its ancestors and stopTrying indicates
            // there's no need to be matched again.
            queryDescendant = MutableRels.preOrderTraverseNext(queryDescendant);
            continue outer;
          }
        }
        final MutableRel next = MutableRels.preOrderTraverseNext(queryDescendant);
        final MutableRel childOrNext =
            queryDescendant.getInputs().isEmpty()
                ? next : queryDescendant.getInputs().get(0);
        // For the current queryDescendant, iterate over every node of the
        // materialized view's relational expression.
        for (MutableRel targetDescendant : targetDescendants) {
          // Look up the UnifyRule rules that apply to this pair of nodes.
          for (UnifyRule rule
              : applicableRules(queryDescendant, targetDescendant)) {
            UnifyRuleCall call =
                rule.match(this, queryDescendant, targetDescendant);
            if (call != null) {
              // Apply the rule.
              final UnifyResult result = rule.apply(call);
              if (result != null) {
                // A match against the materialized view was found; handle
                // the partial view substitution.
                ++count;
                attempted.add(
                    new Replacement(result.call.query, result.result, result.stopTrying));
                result.call.query.replaceInParent(result.result);

                // Replace previous equivalents with new equivalents, higher up
                // the tree.
                for (int i = 0; i < rule.slotCount; i++) {
                  Collection<MutableRel> equi = equivalents.get(slots[i]);
                  if (!equi.isEmpty()) {
                    equivalents.remove(slots[i], equi.iterator().next());
                  }
                }
                assert rowTypesAreEquivalent(result.result, result.call.query, Litmus.THROW);
                equivalents.put(result.result, result.call.query);
                // If the matched target node is the root of the materialized
                // view's query, perform the actual substitution.
                if (targetDescendant == target) {
                  // A real substitution happens. We purge the attempted
                  // replacement list and add them into substitution list.
                  // Meanwhile we stop matching the descendants and jump
                  // to the next subtree in pre-order traversal.
                  if (!target.equals(replacement)) {
                    Replacement r = replace(
                        query.getInput(), target, replacement.clone());
                    assert r != null
                        : rule + "should have returned a result containing the target.";
                    attempted.add(r);
                  }
                  substitutions.add(ImmutableList.copyOf(attempted));
                  attempted.clear();
                  queryDescendant = next;
                  continue outer;
                }
                // We will try walking the query tree all over again to see
                // if there can be any substitutions after the replacement
                // attempt.
                break outer;
              }
            }
          }
        }
        queryDescendant = childOrNext;
      }
      // Quit the entire loop if:
      // 1) we have walked the entire query tree with one or more successful
      //    substitutions, thus count != 0 && attempted.isEmpty();
      // 2) we have walked the entire query tree but have made no replacement
      //    attempt, thus count == 0 && attempted.isEmpty();
      // 3) we had done some replacement attempt in a previous walk, but in
      //    this one we have not found any potential matches or substitutions,
      //    thus count == 0 && !attempted.isEmpty().
      if (count == 0 || attempted.isEmpty()) {
        break;
      }
    }
    if (!attempted.isEmpty()) {
      // We had done some replacement attempt in the previous walk, but that
      // did not lead to any substitutions in this walk, so we need to recover
      // the replacement.
      undoReplacement(attempted);
    }
    return substitutions;

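8. Summary of query rewriting techniques

Query rewriting techniques used in the industry fall roughly into three categories:

  • Rewriting based on structural information
  • Rule-based view substitution
  • Syntax-based rewriting

This article covers rule-based view substitution. Its core idea is to find the parts that the query relational expression and the materialized view expression have in common, and then rewrite and replace those parts locally. A later article will cover the characteristics of rewriting based on structural information. A brief comparison of the three techniques follows.

[Figure: comparison of query rewriting techniques]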

This is original work; please credit the source when reposting. Thank you!
