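1. Terminology
Materialized view: the stored (materialized) result of a view's query.
Materialized view QueryRel: the relational expression (query statement) that defines the materialized view.
Materialized view TableRel: the relational expression that reads the stored result of the materialized view (the TableScan operator over the materialization table).
COMPLETE: the query and the materialized view reference exactly the same tables. For example, both reference tables a, b, and c.
VIEW_PARTIAL: the query's tables fully contain the materialized view's tables. For example, the query references a, b, and c, while the materialized view references only a and b.
QUERY_PARTIAL: the materialized view's tables fully contain the query's tables. For example, the query references a and b, while the materialized view references a, b, and c.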
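The three cases above come down to set containment between the tables referenced by the query and by the materialized view. The following is a minimal sketch of that check, not Calcite code: the class `TableCoverage`, the method `classify`, and the extra `NONE` value (for table sets where neither contains the other) are assumptions made for illustration.

```java
import java.util.Set;

/** Hypothetical helper, not part of Calcite: classifies table coverage by set containment. */
final class TableCoverage {
  /** NONE is an illustrative addition for table sets where neither contains the other. */
  enum Modality { COMPLETE, VIEW_PARTIAL, QUERY_PARTIAL, NONE }

  static Modality classify(Set<String> queryTables, Set<String> viewTables) {
    if (queryTables.equals(viewTables)) {
      return Modality.COMPLETE;
    }
    if (queryTables.containsAll(viewTables)) {
      return Modality.VIEW_PARTIAL;   // the query references extra tables
    }
    if (viewTables.containsAll(queryTables)) {
      return Modality.QUERY_PARTIAL;  // the view references extra tables
    }
    return Modality.NONE;
  }

  public static void main(String[] args) {
    System.out.println(classify(Set.of("a", "b", "c"), Set.of("a", "b", "c")));
    System.out.println(classify(Set.of("a", "b", "c"), Set.of("a", "b")));
    System.out.println(classify(Set.of("a", "b"), Set.of("a", "b", "c")));
  }
}
```

The three calls in `main` reproduce the three examples from the definitions, in the same order.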
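2. Background
A materialized view stores the result of a SQL query.
Rewriting queries to use materialized views is an effective way to speed them up: all or part of the query is rewritten to read from the materialized view.
When the materialized view and the query are fully equivalent, the view answers the query directly. When they are not, the materialized view can still be used to rewrite part of the query's relational algebra, by means such as predicate compensation and aggregate rollup.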
3. Problem Definition
How can a rule-based approach use a materialized view to rewrite part of a query expression, or to replace it entirely?
4. Overview
Rule-based (UnifyRule) query rewriting in Calcite works by iterating over the RelNode tree of the query and the tree of the materialized view's QueryRel, and checking each pair of nodes against the available UnifyRule implementations. When a rule matches, its apply method is called, and the matched part of the query is rewritten to use the materialized view's TableRel.
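5. Flow Diagrams
5.1 Structure Diagram
SubstitutionVisitor applies UnifyRule instances to rewrite the query MutableRel tree using the target MutableRel tree.
5.2 View Substitution Flow Diagram
Core flow of view substitution. Nodes with the same color in the diagram are the same node.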
6. Core Components
UnifyRule:
A rule that rewrites a query relational expression using the materialized view target. The Javadoc from the source is shown below.
/** Rule that attempts to match a query relational expression
* against a target relational expression.
*
* <p>The rule declares the query and target types; this allows the
* engine to fire only a few rules in a given context.</p>
*/
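The subclasses of UnifyRule are as follows.
SubstitutionVisitor:
The core class for substituting part of a query's relational expression tree. It uses a bottom-up matching algorithm and can apply some rewriting and predicate compensation, so the query expression and the materialized view's query expression do not need to be identical.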
/**
* Substitutes part of a tree of relational expressions with another tree.
* <p>Uses a bottom-up matching algorithm. Nodes do not need to be identical.
* At each level, returns the residue.</p>
*/
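MutableRel
Before view substitution, each RelNode is converted to a MutableRel. SubstitutionVisitor performs the rewrite on the MutableRel tree, and once the rewrite is complete the MutableRel is converted back to a RelNode. A MutableRel is equivalent to a RelNode, but it also records its parent and its position within that parent, which makes traversal and backtracking easier during substitution.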
/** Mutable equivalent of {@link RelNode}.
*
* <p>Each node has mutable state, and keeps track of its parent and position
* within parent.
*/
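To see why tracking the parent and the position within the parent helps, here is a toy sketch of in-place node replacement. It is not Calcite's MutableRel: the class `ToyNode` and its fields are assumptions made for illustration, and only the idea of a `replaceInParent` operation is borrowed from the text above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a mutable plan node; NOT Calcite's MutableRel. */
final class ToyNode {
  final String name;
  final List<ToyNode> inputs = new ArrayList<>();
  ToyNode parent;        // null for the root
  int ordinalInParent;   // index of this node within parent.inputs

  ToyNode(String name, ToyNode... children) {
    this.name = name;
    for (ToyNode child : children) {
      child.parent = this;
      child.ordinalInParent = inputs.size();
      inputs.add(child);
    }
  }

  /** Swaps this node for {@code replacement} in its parent and returns the parent. */
  ToyNode replaceInParent(ToyNode replacement) {
    if (parent == null) {
      throw new IllegalStateException("the root has no parent");
    }
    ToyNode p = parent;
    p.inputs.set(ordinalInParent, replacement);  // no search needed: the slot is known
    replacement.parent = p;
    replacement.ordinalInParent = ordinalInParent;
    parent = null;                               // detach the old node
    return p;
  }

  @Override public String toString() {
    return inputs.isEmpty() ? name : name + inputs;
  }

  public static void main(String[] args) {
    ToyNode scan = new ToyNode("Scan(employees)");
    ToyNode root = new ToyNode("Holder", new ToyNode("Aggregate", new ToyNode("Project", scan)));
    scan.replaceInParent(new ToyNode("Scan(materialization)"));
    System.out.println(root);
  }
}
```

Because every node knows its slot in its parent, a subtree can be swapped without walking down from the root, and the old node can be put back the same way if a replacement attempt has to be undone.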
Core method
org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.RelNode)
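7. Core View Substitution Flow
7.1 Data Used in the Substitution
An aggregate rollup example is used here to analyze the rule-based view rewriting mechanism.
Materialized view SQL statement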
select C,D, count(A) from "@jingda".employees
GROUP BY C,D
Relational expression of the materialized view query
LogicalAggregate(group=[{0, 1}], EXPR$2=[COUNT($2)])
  LogicalProject(C=[$2], D=[$3], A=[$0])
    ScanCrel(table=["@jingda".employees], columns=[`A`, `B`, `C`, `D`, `E`, `F`], splits=[1])
Operator that reads the stored materialized view result
LogicalProject(C=[$0], D=[$1], EXPR$2=[CAST($2):BIGINT NOT NULL])
  ScanCrel(table=["__accelerator"."7db4b655-d381-4cc8-ba6f-adc2c40d0153"."479ce684-efd6-4420-8a5b-68350789b8bb"], columns=[`C`, `D`, `EXPR$2`], splits=[3])
Query SQL statement
select D, count(A) from "@jingda".employees
GROUP BY D
Relational expression of the query
LogicalAggregate(group=[{0}], EXPR$1=[COUNT($1)])
  LogicalProject(D=[$3], A=[$0])
    ScanCrel(table=["@jingda".employees], columns=[`A`, `B`, `C`, `D`, `E`, `F`], splits=[1])
Sketch of the rewritten SQL statement (the alias cnt is added here for readability)
select D, sum(cnt) from
(select C, D, count(A) as cnt from "@jingda".employees GROUP BY C, D)
GROUP BY D
Rewritten relational expression
Here the query's relational expression has been rewritten to read from the precomputed materialized result.
LogicalAggregate(group=[{1}], EXPR$1=[$SUM0($2)])
  LogicalProject(C=[$0], D=[$1], EXPR$2=[CAST($2):BIGINT NOT NULL])
    ScanCrel(table=["__accelerator"."7db4b655-d381-4cc8-ba6f-adc2c40d0153"."479ce684-efd6-4420-8a5b-68350789b8bb"], columns=[`C`, `D`, `EXPR$2`], splits=[3])
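The rewrite is valid because a COUNT grouped by D can be recovered by summing the partial counts of the finer grouping (C, D). The sketch below checks this on a small in-memory data set. It is illustrative only: the class `CountRollupDemo`, its `Row` type, and the sample rows are assumptions, not data from the example above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: models columns A, C, D of the example table in memory. */
final class CountRollupDemo {
  static final class Row {
    final Integer a;
    final String c;
    final String d;
    Row(Integer a, String c, String d) { this.a = a; this.c = c; this.d = d; }
  }

  /** select D, count(A) ... group by D, computed directly on the base rows. */
  static Map<String, Long> countByD(List<Row> rows) {
    Map<String, Long> out = new TreeMap<>();
    for (Row r : rows) {
      out.merge(r.d, r.a == null ? 0L : 1L, Long::sum);  // COUNT(A) skips NULLs
    }
    return out;
  }

  /** The materialized view: select C, D, count(A) ... group by C, D. */
  static Map<List<String>, Long> materialize(List<Row> rows) {
    Map<List<String>, Long> out = new HashMap<>();
    for (Row r : rows) {
      out.merge(List.of(r.c, r.d), r.a == null ? 0L : 1L, Long::sum);
    }
    return out;
  }

  /** The rewrite: sum the view's partial counts per D. */
  static Map<String, Long> rollUpByD(Map<List<String>, Long> view) {
    Map<String, Long> out = new TreeMap<>();
    for (Map.Entry<List<String>, Long> e : view.entrySet()) {
      out.merge(e.getKey().get(1), e.getValue(), Long::sum);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Row> rows = List.of(
        new Row(1, "x", "p"), new Row(2, "y", "p"),
        new Row(null, "x", "q"), new Row(3, "x", "p"));
    System.out.println(countByD(rows));               // direct aggregation
    System.out.println(rollUpByD(materialize(rows))); // aggregation over the view
  }
}
```

Both lines printed by `main` show the same map, which is the property the rollup rewrite relies on. A sum that starts from zero also keeps groups whose partial counts are all zero, which matches the use of $SUM0 rather than SUM in the rewritten plan.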
7.2 Data Flow Diagram
In the diagram, Query is initially the SQL query, Target is the SQL statement that defines the materialized view, and Replacement is the operator that reads the stored materialized view.
In the first round, CalcToCalcUnifyRule matches, and the lower part of the relational expression is rewritten to
Calc(program: (expr#0..2=[{inputs}], D=[$t1], A=[$t2]))
  Calc(program: (expr#0..5=[{inputs}], C=[$t2], D=[$t3], A=[$t0]))
    Scan(table: [@rp_test, employees])
In the second round, AggregateOnCalcToAggregateUnifyRule matches, and the relational expression is rewritten to
Aggregate(groupSet: {1}, groupSets: [{1}], calls: [$SUM0($2)])
  Aggregate(groupSet: {0, 1}, groupSets: [{0, 1}], calls: [COUNT($2)])
Finally, the whole query's relational expression is rewritten to
Holder
  Aggregate(groupSet: {1}, groupSets: [{1}], calls: [$SUM0($2)])
    Project(projects: [$0, $1, CAST($2):BIGINT NOT NULL])
      Scan(table: [__accelerator, 1c4b39df-c7c2-4e40-aebb-dfa87faa80a9, 14c0517b-10e6-4d66-92d3-f68e451c4216])
7.3 Core Code Analysis
The core code is in org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.mutable.MutableRel):
for (;;) {
  int count = 0;
  MutableRel queryDescendant = query;
outer:
  while (queryDescendant != null) {
    for (Replacement r : attempted) {
      // If this query node was already replaced in an earlier iteration,
      // skip it and continue with the next node in pre-order.
      if (r.stopTrying && queryDescendant == r.after) {
        // This node has been replaced by previous iterations in the
        // hope to match its ancestors and stopTrying indicates
        // there's no need to be matched again.
        queryDescendant = MutableRels.preOrderTraverseNext(queryDescendant);
        continue outer;
      }
    }
    final MutableRel next = MutableRels.preOrderTraverseNext(queryDescendant);
    final MutableRel childOrNext =
        queryDescendant.getInputs().isEmpty()
            ? next : queryDescendant.getInputs().get(0);
    // For the current queryDescendant, try every node of the materialized view's tree.
    for (MutableRel targetDescendant : targetDescendants) {
      // Look up the UnifyRules applicable to this pair of nodes.
      for (UnifyRule rule
          : applicableRules(queryDescendant, targetDescendant)) {
        UnifyRuleCall call =
            rule.match(this, queryDescendant, targetDescendant);
        if (call != null) {
          // Apply the rule.
          final UnifyResult result = rule.apply(call);
          if (result != null) {
            // A match was found; handle the partial replacement.
            ++count;
            attempted.add(
                new Replacement(result.call.query, result.result, result.stopTrying));
            result.call.query.replaceInParent(result.result);
            // Replace previous equivalents with new equivalents, higher up
            // the tree.
            for (int i = 0; i < rule.slotCount; i++) {
              Collection<MutableRel> equi = equivalents.get(slots[i]);
              if (!equi.isEmpty()) {
                equivalents.remove(slots[i], equi.iterator().next());
              }
            }
            assert rowTypesAreEquivalent(result.result, result.call.query, Litmus.THROW);
            equivalents.put(result.result, result.call.query);
            // If the matched target node is the root of the target (the whole
            // materialized view matched), perform the actual substitution.
            if (targetDescendant == target) {
              // A real substitution happens. We purge the attempted
              // replacement list and add them into substitution list.
              // Meanwhile we stop matching the descendants and jump
              // to the next subtree in pre-order traversal.
              if (!target.equals(replacement)) {
                Replacement r = replace(
                    query.getInput(), target, replacement.clone());
                assert r != null
                    : rule + "should have returned a result containing the target.";
                attempted.add(r);
              }
              substitutions.add(ImmutableList.copyOf(attempted));
              attempted.clear();
              queryDescendant = next;
              continue outer;
            }
            // We will try walking the query tree all over again to see
            // if there can be any substitutions after the replacement
            // attempt.
            break outer;
          }
        }
      }
    }
    queryDescendant = childOrNext;
  }
  // Quit the entire loop if:
  // 1) we have walked the entire query tree with one or more successful
  //    substitutions, thus count != 0 && attempted.isEmpty();
  // 2) we have walked the entire query tree but have made no replacement
  //    attempt, thus count == 0 && attempted.isEmpty();
  // 3) we had done some replacement attempt in a previous walk, but in
  //    this one we have not found any potential matches or substitutions,
  //    thus count == 0 && !attempted.isEmpty().
  if (count == 0 || attempted.isEmpty()) {
    break;
  }
}
if (!attempted.isEmpty()) {
  // We had done some replacement attempt in the previous walk, but that
  // did not lead to any substitutions in this walk, so we need to recover
  // the replacement.
  undoReplacement(attempted);
}
return substitutions;
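The loop above walks the query tree with two moves: descend to the first input, or jump to the next node in pre-order that lies outside the current subtree. The toy sketch below shows how those two moves combine into a full pre-order walk. It is an assumption-laden illustration, not Calcite code: `PreOrderWalk`, `Node`, and `nextSkippingSubtree` are hypothetical names, and the skip step is written to mirror the role that MutableRels.preOrderTraverseNext plays in the loop.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy traversal over a parent-linked tree; NOT Calcite's MutableRels. */
final class PreOrderWalk {
  static final class Node {
    final String name;
    final List<Node> inputs = new ArrayList<>();
    Node parent;
    int ordinalInParent;

    Node(String name, Node... children) {
      this.name = name;
      for (Node child : children) {
        child.parent = this;
        child.ordinalInParent = inputs.size();
        inputs.add(child);
      }
    }
  }

  /** Next node in pre-order that is not inside {@code node}'s own subtree, or null. */
  static Node nextSkippingSubtree(Node node) {
    Node parent = node.parent;
    int ordinal = node.ordinalInParent + 1;
    while (parent != null) {
      if (ordinal < parent.inputs.size()) {
        return parent.inputs.get(ordinal);  // next sibling at this level
      }
      node = parent;                         // otherwise climb one level
      parent = node.parent;
      ordinal = node.ordinalInParent + 1;
    }
    return null;                             // walked past the root
  }

  /** Full pre-order walk: first input if there is one, otherwise the skip step. */
  static List<String> walk(Node root) {
    List<String> visited = new ArrayList<>();
    Node current = root;
    while (current != null) {
      visited.add(current.name);
      current = current.inputs.isEmpty()
          ? nextSkippingSubtree(current) : current.inputs.get(0);
    }
    return visited;
  }

  public static void main(String[] args) {
    Node root = new Node("Join", new Node("Filter", new Node("ScanA")), new Node("ScanB"));
    System.out.println(walk(root));
  }
}
```

In the loop above, the skip step alone is used after a real substitution (`queryDescendant = next`), so the descendants of a subtree that has just been replaced are not matched again.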
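8. Summary of Query Rewriting Techniques
In industry, query rewriting techniques fall roughly into three categories:
- Rewriting based on structural information
- Rule-based view substitution
- Syntax-based rewriting
This article covers rule-based view substitution. Its core idea is to find the parts that the query's relational expression and the materialized view's expression have in common, and then rewrite or replace those parts locally. A later article will cover rewriting based on structural information. Below is a simple comparison of the three techniques.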