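What is the model clause
The model clause offers a good alternative to spreadsheets. It can draw on powerful features of SQL statements such as aggregation, parallelism, and multidimensional, multivariate analysis.
The model clause builds a data matrix, or model, with a given number of dimensions. The model uses a subset of the available columns from the tables listed in the from clause and has at least one dimension, at least one measure, and optionally one or more partitions. Think of the model as a spreadsheet file with a separate worksheet for each calculated value; each worksheet has an x axis and a y axis (two dimensions).
Once the model is defined, you create rules that modify the measures. These rules are the heart of the model clause. With just a few rules you can perform complex calculations on the data and even create new rows. The measure columns become arrays indexed by the dimension columns, and the rules are applied to those arrays in every partition. After all rules have been applied, the model is converted back into conventional rows.
The model clause is an extension to the SQL language and scales nearly as well as Oracle Database itself. Multidimensional, multivariate calculations over millions of rows, if not billions, can be implemented easily with it. Database features such as object partitioning and parallel execution can also be used effectively with the model clause, which improves scalability further.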
Preparing test data
-- Denormalized sales_fact table
drop table sales_fact;
create table sales_fact AS
select c.country_name country,c.country_subRegion region,p.prod_name product,
t.calendar_year year,t.calendar_week_number week,
sum(s.amount_sold) sale,
sum(s.amount_sold*
(
case
when mod(rownum,10)=0 then 1.4
when mod(rownum,5)=0 then 0.6
when mod(rownum,2)=0 then 0.9
when mod(rownum,2)=1 then 1.2
else 1
end
)
) receipts
from sh.sales s,sh.times t,sh.customers cu,sh.countries c, sh.products p
where s.time_id=t.time_id
and s.prod_id=p.prod_id
and s.cust_id=cu.cust_id
and cu.country_id=c.country_id
group by c.country_name,c.country_subregion,p.prod_name,t.calendar_year,t.calendar_week_number;
select * from sales_fact where rownum<50;
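The SQL below produces a spreadsheet-style calculation of the running inventory for each week of each year in the sales_fact table. It uses the model clause to emulate what a spreadsheet formula would do.
-- Inventory formula calculation using the model clause
col product format A30
col country format A10
col region format A10
col year format 9999
col week format 99
col sale format 999999
set lines 120 pages 100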
select product,
country,
year,
week,
inventory,
sale,
receipts
from sales_fact sf
where sf.country in ('Australia')
and sf.product = 'Xtend Memory'
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale, receipts)
rules automatic order(
inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ]
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 1998 1 8.88 58.15 67.03
Xtend Memory Australia 1998 2 14.758 29.39 35.268
Xtend Memory Australia 1998 3 20.656 29.49 35.388
Xtend Memory Australia 1998 4 8.86 29.49 17.694
Xtend Memory Australia 1998 5 14.82 29.8 35.76
Xtend Memory Australia 1998 6 8.942 58.78 52.902
Xtend Memory Australia 1998 9 2.939 58.78 61.719
Xtend Memory Australia 1998 10 0.01 117.76 114.831
Xtend Memory Australia 1998 12 -14.9 59.6 44.7
Xtend Memory Australia 1998 14 11.756 58.78 70.536
Xtend Memory Australia 1998 15 5.878 58.78 52.902
Xtend Memory Australia 1998 17 11.756 58.78 70.536
Xtend Memory Australia 1998 18 8.817 117.56 114.621
Xtend Memory Australia 1998 19 2.919 58.98 53.082
Xtend Memory Australia 1998 21 2.98 59.6 62.58
Xtend Memory Australia 1998 23 -11.756 117.56 105.804
Xtend Memory Australia 1998 26 11.756 117.56 129.316
Xtend Memory Australia 1998 27 14.632 57.52 60.396
Xtend Memory Australia 1998 28 0.202 57.72 43.29
Xtend Memory Australia 1998 29 -14.228 57.72 43.29
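The keywords model return updated rows declare that the statement uses the model clause. A SQL statement that uses the model clause has three groups of columns: partitioning columns, dimension columns, and measure columns. A partition is like one worksheet in a spreadsheet, the dimension columns are like the row and column labels, and the measures are like the cells that hold formulas.
Clause | Description |
---|---|
partition by(product, country) | Specifies product and country as the partitioning columns. |
dimension by(year, week) | Specifies year and week as the dimension columns. |
measures(0 inventory, sale, receipts) | Specifies inventory, sale, and receipts as the measure columns. |
inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ] | The rule, which works like a spreadsheet formula. |
The model clause implements partitioned arrays in which the dimension columns are the indexes to the array elements. Each array element, also called a cell, holds one measure value.
All rows with the same partitioning column values belong to the same partition. In this example, all rows with the same product and country are in one partition, and within a partition the dimension columns uniquely identify each row.
cv stands for current value. On the right-hand side of a rule it refers to the value that the corresponding dimension column has on the left-hand side.
For example, if year and week on the left-hand side of the rule are (2001,3), then cv(year) on the right-hand side is the left-hand year value, 2001, and cv(week) is the left-hand week value, 3. The reference inventory [ cv(year), cv(week) - 1 ] therefore returns the inventory measure for the previous week, week 2 of 2001. Similarly, sale [ cv(year), cv(week) ] and receipts [ cv(year), cv(week) ] use the cv function to refer to the sale and receipts values for week 3 of 2001. The partitioning columns product and country are not mentioned in the rule; the rule implicitly uses their values from the current partition.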
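The way cells are addressed can be pictured outside the database. The following is a minimal Python sketch, not Oracle's implementation: it assumes one partition is held as dictionaries keyed by (year, week), and it reuses the first three rows of the output above. The names sale, receipts, and inventory simply mirror the measure names.

```python
from decimal import Decimal as D

# Measures of one partition (product='Xtend Memory', country='Australia'),
# keyed by the dimension values (year, week).
sale = {(1998, 1): D("58.15"), (1998, 2): D("29.39"), (1998, 3): D("29.49")}
receipts = {(1998, 1): D("67.03"), (1998, 2): D("35.268"), (1998, 3): D("35.388")}
inventory = {}

# inventory[year, week] = nvl(inventory[cv(year), cv(week)-1], 0)
#                         - sale[cv(year), cv(week)] + receipts[cv(year), cv(week)]
for (year, week) in sorted(sale):
    # (year, week) plays the role of cv(year), cv(week) on the right-hand side.
    previous = inventory.get((year, week - 1), D(0))  # nvl(..., 0)
    inventory[(year, week)] = previous - sale[(year, week)] + receipts[(year, week)]

for key in sorted(inventory):
    print(key, inventory[key])
```

The three computed values agree with the INVENTORY column of the first three output rows (8.88, 14.758, 20.656).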
Positional notation
-- Initialize the values for 2002 using positional references (upsert)
select product,
country,
year,
week,
inventory,
sale,
receipts
from sales_fact sf
where sf.country in ('Australia')
and sf.product = 'Xtend Memory'
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale, receipts) rules automatic
order(
inventory [ year, week ] = nvl(inventory [ cv(year), cv(week) - 1 ], 0) - sale [ cv(year), cv(week) ] + receipts [ cv(year), cv(week) ],
sale[2002,1]=0,
receipts[2002,1]=0
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 2001 38 -7.795 139 143.384
Xtend Memory Australia 2001 39 5.687 115.57 129.052
Xtend Memory Australia 2001 40 12.174 45.18 51.667
Xtend Memory Australia 2001 41 12.058 67.19 67.074
Xtend Memory Australia 2001 42 6.426 136.98 131.348
Xtend Memory Australia 2001 43 4.053 139.58 137.207
Xtend Memory Australia 2001 44 8.711 23.29 27.948
Xtend Memory Australia 2001 46 2.357 93.58 95.937
Xtend Memory Australia 2001 48 2.314 182.96 185.274
Xtend Memory Australia 2001 49 4.772 45.26 47.718
Xtend Memory Australia 2001 50 9.4 23.14 27.768
Xtend Memory Australia 2001 51 4.86 114.82 110.28
Xtend Memory Australia 2001 52 14.116 23.14 32.396
Xtend Memory Australia 2002 1 0 0 0
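Positional notation can insert a new cell into the result set or update an existing one. If the referenced cell exists, its value is updated; if it does not, a new cell is added. This update-if-present, insert-if-absent behavior is known as upsert, a blend of update and insert, and positional notation is what provides it.
Symbolic notation
Symbolic notation lets you specify a range of values on the left-hand side of a rule.
-- Symbolic references (update only)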
select product,
country,
year,
week,
sale
from sales_fact sf
where sf.country in ('Australia')
and sf.product = 'Xtend Memory'
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(sale)
rules(
sale[year in (2000,2001), week in (1,52,53)] order by year,week
=sale[cv(year),cv(week)]*1.10
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK SALE
-------------------------------------------------- ---------------------------------------- ---------- ---------- ----------
Xtend Memory Australia 2000 1 51.37
Xtend Memory Australia 2000 52 74.195
Xtend Memory Australia 2001 1 101.486
Xtend Memory Australia 2001 52 25.454
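The rule updates the sale values for weeks 1, 52, and 53 of 2000 and 2001 to 110% of their actual values. year in (2000,2001) uses the in operator to give a list of values for the year column; likewise, week in (1,52,53) gives a list of values for the week column.
No rows have week = 53, and no row for week = 53 was added to or updated in the result set. The ability to create new rows is the main difference between symbolic and positional notation: symbolic notation provides only update, whereas positional notation provides upsert.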
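The difference between the two notations can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: a partition is modeled as a dictionary keyed by (year, week), and the helper names update_symbolic and upsert_positional are hypothetical. The starting sale values are the unscaled figures that appear elsewhere in this article.

```python
from decimal import Decimal as D

def update_symbolic(cells, years, weeks, factor):
    """Symbolic reference: only cells that already exist are touched."""
    for (year, week) in list(cells):
        if year in years and week in weeks:
            cells[(year, week)] *= factor

def upsert_positional(cells, year, week, value):
    """Positional reference: update the cell, or create it if it is missing."""
    cells[(year, week)] = value

sale = {(2000, 52): D("67.45"), (2001, 1): D("92.26"), (2001, 52): D("23.14")}

# sale[year in (2000,2001), week in (1,52,53)] = sale[cv(year),cv(week)]*1.10
update_symbolic(sale, {2000, 2001}, {1, 52, 53}, D("1.10"))

# sale[2002,1] = 0
upsert_positional(sale, 2002, 1, D(0))

for key in sorted(sale):
    print(key, sale[key])
```

No week 53 cell appears after the symbolic update, while the positional assignment creates the (2002, 1) cell.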
for loops
-- Positional references: model with a for loop
select product,country, year,week,inventory, sale,receipts
from sales_fact sf
where sf.country in ('Australia') and sf.product = 'Xtend Memory'
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale, receipts)
rules automatic order(
inventory[year,week]=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)],
sale[2002, for week from 1 to 53 increment 1]=0,
receipts[2002, for week from 1 to 53 increment 1]=0
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 2001 38 -7.795 139 143.384
Xtend Memory Australia 2001 39 5.687 115.57 129.052
Xtend Memory Australia 2001 40 12.174 45.18 51.667
Xtend Memory Australia 2001 41 12.058 67.19 67.074
Xtend Memory Australia 2001 42 6.426 136.98 131.348
Xtend Memory Australia 2001 43 4.053 139.58 137.207
Xtend Memory Australia 2001 44 8.711 23.29 27.948
Xtend Memory Australia 2001 46 2.357 93.58 95.937
Xtend Memory Australia 2001 48 2.314 182.96 185.274
Xtend Memory Australia 2001 49 4.772 45.26 47.718
Xtend Memory Australia 2001 50 9.4 23.14 27.768
Xtend Memory Australia 2001 51 4.86 114.82 110.28
Xtend Memory Australia 2001 52 14.116 23.14 32.396
Xtend Memory Australia 2002 1 0 0 0
Xtend Memory Australia 2002 2 0 0 0
Xtend Memory Australia 2002 3 0 0 0
Xtend Memory Australia 2002 4 0 0 0
Xtend Memory Australia 2002 5 0 0 0
Xtend Memory Australia 2002 6 0 0 0
Xtend Memory Australia 2002 7 0 0 0
Xtend Memory Australia 2002 8 0 0 0
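A for loop lets you specify a list of values on the left-hand side of a rule. It can be defined only on the left-hand side, where it adds new cells to the output; it cannot be used on the right-hand side.
Syntax:
for dimension from <value1> to <value2>
[increment | decrement] <value3>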
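As a rough analogy, the for construct expands into one positional assignment per generated value. The Python sketch below assumes a partition stored as a dictionary keyed by (year, week); it illustrates the expansion only and is not how the database executes the rule.

```python
# One existing cell, taken from the last 2001 row shown in the output above.
sale = {(2001, 52): 23.14}

# sale[2002, for week from 1 to 53 increment 1] = 0
for week in range(1, 53 + 1, 1):  # the upper bound is inclusive
    sale[(2002, week)] = 0        # positional reference, so missing cells are created

print(len(sale))  # the original cell plus 53 new ones
```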
Return updated rows
-- SQL statement without return updated rows
select product,country,year,week,sale
from sales_fact sf
where sf.country in ('Australia') and sf.product = 'Xtend Memory'
model --return updated rows
partition by(product, country)
dimension by(year, week)
measures(sale)
rules(
sale[year in (2000,2001), week in (1,52,53)] order by year,week
=sale[cv(year),cv(week)]*1.10
)
order by product, country, year, week;
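The SQL above returns 159 rows, whereas the earlier example returned only 4. The return updated rows clause controls this behavior and limits the cells that the statement returns. Without it, all rows are returned whether or not a rule updated them.
The return updated rows clause applies equally to statements that use positional notation, as in the following example:
-- return updated rows with upsert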
select product,country,year,week,sale
from sales_fact sf
where sf.country in ('Australia') and sf.product = 'Xtend Memory'
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(sale)
rules(
sale[2002,1]=0
)
order by product, country, year, week;
Result
PRODUCT COUNTRY YEAR WEEK SALE
-------------------------------------------------- ---------------------------------------- ---------- ---------- ----------
Xtend Memory Australia 2002 1 0
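Evaluation order
The rules section can declare multiple rules, and rules can depend on one another. Moreover, even within a single rule, the cells must be evaluated in a logical order.
-- Evaluation order that raises ORA-32637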
select product,country,year,week,inventory,sale,receipts
from sales_fact sf
where sf.country in ('Australia')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory,sale, receipts)
rules -- automatic order
(
inventory[year,week]=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product, country, year, week;
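ORA-32637: Self cyclic rule in sequential order MODEL
Commenting out automatic order forces the default behavior, sequential order. The rule makes a cross-row reference through inventory[cv(year),cv(week)-1]: inventory must be computed in ascending week order, so the rule for the previous week has to be evaluated before the rule for the current week. With automatic order, the database engine works out the row dependencies and evaluates the rows strictly in dependency order. Without automatic order, the row evaluation order cannot be determined, which leads to error ORA-32637.
Declaring the row evaluation order explicitly is a better way to avoid this error.
-- Cell-level evaluation order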
select product,country,year,week,inventory,sale,receipts
from sales_fact sf
where sf.country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory,sale, receipts)
rules -- automatic order
(
inventory[year,week] order by year,week =nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 1998 1 8.88 58.15 67.03
Xtend Memory Australia 1998 2 14.758 29.39 35.268
Xtend Memory Australia 1998 3 20.656 29.49 35.388
Xtend Memory Australia 1998 4 8.86 29.49 17.694
Xtend Memory Australia 1998 5 14.82 29.8 35.76
Xtend Memory Australia 1998 6 8.942 58.78 52.902
Xtend Memory Australia 1998 9 2.939 58.78 61.719
Xtend Memory Australia 1998 10 0.01 117.76 114.831
Xtend Memory Australia 1998 12 -14.9 59.6 44.7
Xtend Memory Australia 1998 14 11.756 58.78 70.536
Xtend Memory Australia 1998 15 5.878 58.78 52.902
Xtend Memory Australia 1998 17 11.756 58.78 70.536
Xtend Memory Australia 1998 18 8.817 117.56 114.621
Xtend Memory Australia 1998 19 2.919 58.98 53.082
Xtend Memory Australia 1998 21 2.98 59.6 62.58
Xtend Memory Australia 1998 23 -11.756 117.56 105.804
Xtend Memory Australia 1998 26 11.756 117.56 129.316
Xtend Memory Australia 1998 27 14.632 57.52 60.396
Xtend Memory Australia 1998 28 0.202 57.72 43.29
Xtend Memory Australia 1998 29 -14.228 57.72 43.29
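In the rules section, the order by year,week clause explicitly declares the row evaluation order: rows must be evaluated in ascending order of year and week.
-- Evaluation order using the desc keyword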
select product,country,year,week,inventory,sale,receipts
from sales_fact sf
where sf.country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory,sale, receipts)
rules
(
inventory[year,week] order by year,week desc =nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product, country, year, week;
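Although the statement above is syntactically valid, it does not meet the requirement: with week in descending order, each week is evaluated before the previous week's inventory has been computed, so the running total is wrong.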
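The effect of the evaluation order can be reproduced with a short Python sketch. It assumes the same dictionary model as before (cells keyed by (year, week), inventory initialized to 0 as in measures(0 inventory, ...)); the function name evaluate is hypothetical, and the figures are the first three 1998 rows.

```python
from decimal import Decimal as D

sale = {(1998, 1): D("58.15"), (1998, 2): D("29.39"), (1998, 3): D("29.49")}
receipts = {(1998, 1): D("67.03"), (1998, 2): D("35.268"), (1998, 3): D("35.388")}

def evaluate(order):
    """Apply the inventory rule to the cells in the given order."""
    inventory = {key: D(0) for key in sale}  # measures(0 inventory, ...)
    for (year, week) in order:
        previous = inventory.get((year, week - 1), D(0))  # nvl(..., 0)
        inventory[(year, week)] = previous - sale[(year, week)] + receipts[(year, week)]
    return inventory

ascending = evaluate(sorted(sale))                                 # order by year, week
descending = evaluate(sorted(sale, key=lambda k: (k[0], -k[1])))   # order by year, week desc

print(ascending[(1998, 3)], descending[(1998, 3)])
```

In ascending order week 3 builds on week 2 and reaches 20.656; in descending order week 3 is computed while week 2 is still 0, so it only reflects that week's own receipts minus sales.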
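Rule evaluation order
Besides the row evaluation order, you also have to consider the order in which the rules themselves are evaluated.
-- Rule evaluation order: sequential order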
select * from (
select product,country,year,week,inventory,sale,receipts
from sales_fact sf
where sf.country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by (product, country)
dimension by (year, week)
measures (0 inventory,sale, receipts)
rules sequential order
(
inventory[year,week] order by year,week =nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)],
receipts[year in (2000,2001),week in (51,52,53)] order by year,week =receipts[cv(year),cv(week)]*10
)
order by product, country, year, week
) where week>50;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 1998 51 0.04 58.32 61.236
Xtend Memory Australia 1998 52 5.812 86.38 92.152
Xtend Memory Australia 1999 53 -2.705 27.05 24.345
Xtend Memory Australia 2000 52 -1.383 67.45 660.67
Xtend Memory Australia 2001 51 4.86 114.82 1102.8
Xtend Memory Australia 2001 52 14.116 23.14 323.96
sequential order specifies that the rules are evaluated in the order in which they are listed.
-- Rule evaluation order: automatic order
select * from (
select product,country,year,week,inventory,sale,receipts
from sales_fact sf
where sf.country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by (product, country)
dimension by (year, week)
measures (0 inventory,sale, receipts)
rules automatic order
(
inventory[year,week] order by year,week =nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)],
receipts[year in (2000,2001),week in (51,52,53)] order by year,week =receipts[cv(year),cv(week)]*10
)
order by product, country, year, week
) where week>50;
PRODUCT COUNTRY YEAR WEEK INVENTORY SALE RECEIPTS
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ---------- ----------
Xtend Memory Australia 1998 51 0.04 58.32 61.236
Xtend Memory Australia 1998 52 5.812 86.38 92.152
Xtend Memory Australia 1999 53 -2.705 27.05 24.345
Xtend Memory Australia 2000 52 593.22 67.45 660.67
Xtend Memory Australia 2001 51 997.38 114.82 1102.8
Xtend Memory Australia 2001 52 1298.2 23.14 323.96
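The results of the two statements do not match. automatic order lets the database engine identify the dependencies between rules automatically, so it evaluates the receipts rule first and then the inventory rule. Rule evaluation order matters a great deal; when the interdependencies are complex, consider specifying sequential order and listing the rules in the strict order in which they should be evaluated.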
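The dependency between the two rules can be sketched in Python for a single cell, week 51 of 2001. This is an illustration under assumptions: the rule functions and the run helper are hypothetical, the dependency-first order is hard-coded rather than derived, and the inputs are the unmodified week 50 and week 51 figures from the earlier output.

```python
from decimal import Decimal as D

def inventory_rule(cells):
    # inventory[2001,51] = inventory[2001,50] - sale[2001,51] + receipts[2001,51]
    cells["inventory_51"] = cells["inventory_50"] - cells["sale_51"] + cells["receipts_51"]

def receipts_rule(cells):
    # receipts[2001,51] = receipts[2001,51] * 10
    cells["receipts_51"] = cells["receipts_51"] * 10

def run(rules):
    cells = {"inventory_50": D("9.4"), "sale_51": D("114.82"), "receipts_51": D("110.28")}
    for rule in rules:
        rule(cells)
    return cells

# Rules applied as listed: inventory is computed from the old receipts value.
as_listed = run([inventory_rule, receipts_rule])

# Rules applied in dependency order: receipts first, then the inventory that reads it.
dependency_first = run([receipts_rule, inventory_rule])

print(as_listed["inventory_51"], dependency_first["inventory_51"])
```

Both runs end with the same receipts value, but only the dependency-first run lets the inventory see it.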
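Aggregation
Data warehouse queries make frequent use of aggregation. The model clause can aggregate data by applying aggregate functions over a range of dimension values. Many aggregate functions, such as sum, max, avg, and stddev, as well as OLAP functions, can be used in rules.
-- Aggregation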
select product,country,year,week,inventory,avg_inventory,max_sale
from sales_fact sf
where sf.country in ('Australia') and sf.product ='Xtend Memory'
model return updated rows
partition by (product, country)
dimension by (year, week)
measures (0 inventory,0 avg_inventory,0 max_sale, sale, receipts)
rules automatic order(
inventory[year,week] =nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)],
avg_inventory[year,ANY]= avg(inventory)[cv(year),week],
max_sale[year,ANY]= avg(sale)[cv(year),week]
)
order by product, country, year, week;
PRODUCT COUNTRY YEAR WEEK INVENTORY AVG_INVENTORY MAX_SALE
-------------------------------------------------- ---------------------------------------- ---------- ---------- ---------- ------------- ----------
Xtend Memory Australia 1998 1 8.88 -0.7254166666 71.165
Xtend Memory Australia 1998 2 14.758 -0.7254166666 71.165
Xtend Memory Australia 1998 3 20.656 -0.7254166666 71.165
Xtend Memory Australia 1998 4 8.86 -0.7254166666 71.165
Xtend Memory Australia 1998 5 14.82 -0.7254166666 71.165
Xtend Memory Australia 1998 6 8.942 -0.7254166666 71.165
Xtend Memory Australia 1998 9 2.939 -0.7254166666 71.165
Xtend Memory Australia 1998 10 0.01 -0.7254166666 71.165
Xtend Memory Australia 1998 12 -14.9 -0.7254166666 71.165
Xtend Memory Australia 1998 14 11.756 -0.7254166666 71.165
Xtend Memory Australia 1998 15 5.878 -0.7254166666 71.165
Xtend Memory Australia 1998 17 11.756 -0.7254166666 71.165
Xtend Memory Australia 1998 18 8.817 -0.7254166666 71.165
Xtend Memory Australia 1998 19 2.919 -0.7254166666 71.165
Xtend Memory Australia 1998 21 2.98 -0.7254166666 71.165
Xtend Memory Australia 1998 23 -11.756 -0.7254166666 71.165
Xtend Memory Australia 1998 26 11.756 -0.7254166666 71.165
Xtend Memory Australia 1998 27 14.632 -0.7254166666 71.165
Xtend Memory Australia 1998 28 0.202 -0.7254166666 71.165
Xtend Memory Australia 1998 29 -14.228 -0.7254166666 71.165
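Iteration
Iteration is another feature that lets a concise model SQL statement implement complex business logic. It means that a block of rules is executed in a loop, either a fixed number of times or for as long as a condition remains true.
Syntax: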
[iterate (n) [until <condition>] ]
( <cell_assignment> = <expression> ... )
-- Iteration
select year,week,sale,sale_list
from sales_fact sf
where sf.country in ('Australia') and sf.product ='Xtend Memory'
model return updated rows
partition by (product, country)
dimension by (year, week)
measures ( cast(' ' as varchar2(50) ) sale_list, sale )
rules iterate(5)(
sale_list[year,week] order by year,week =sale[cv(year),cv(week)-iteration_number+2] ||
case when iteration_number=0 then '' else ',' end ||
sale_list [cv(year),cv(week)]
)
order by year, week;
YEAR WEEK SALE SALE_LIST
---------- ---------- ---------- --------------------------------------------------
1998 1 58.15 ,,58.15,29.39,29.49
1998 2 29.39 ,58.15,29.39,29.49,29.49
1998 3 29.49 58.15,29.39,29.49,29.49,29.8
1998 4 29.49 29.39,29.49,29.49,29.8,58.78
1998 5 29.8 29.49,29.49,29.8,58.78,
1998 6 58.78 29.49,29.8,58.78,,
1998 9 58.78 ,,58.78,117.76,
1998 10 117.76 ,58.78,117.76,,59.6
1998 12 59.6 117.76,,59.6,,58.78
1998 14 58.78 59.6,,58.78,58.78,
1998 15 58.78 ,58.78,58.78,,58.78
1998 17 58.78 58.78,,58.78,117.56,58.98
1998 18 117.56 ,58.78,117.56,58.98,
1998 19 58.98 58.78,117.56,58.98,,59.6
1998 21 59.6 58.98,,59.6,,117.56
1998 23 117.56 59.6,,117.56,,
1998 26 117.56 ,,117.56,57.52,57.72
1998 27 57.52 ,117.56,57.52,57.72,57.72
1998 28 57.72 117.56,57.52,57.72,57.72,
1998 29 57.72 57.52,57.72,57.72,,
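The goal is to show the sale values of five weeks as a comma-separated list.
Clause | Description |
---|---|
rules iterate(5) | The rules block is executed in a loop five times. |
iteration_number | Variable holding the current iteration count; it starts at 0 for the first iteration and ends at n-1, where n is the number of iterations given in iterate (n). |
sale[cv(year),cv(week)-iteration_number+2] | Accesses the values from the two preceding weeks through the two following weeks. |
case when iteration_number=0 then '' else ',' end | Adds a comma for every list member except the first. |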
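The loop can be traced by hand with a small Python sketch. It assumes the first six weeks of 1998 from the output above, stores sale values as strings, treats a missing cell as an empty string (a null contributes nothing to a concatenation), and starts each list empty instead of with the single space used in the SQL.

```python
# sale values for 1998, weeks 1 to 6, keyed by week
sale = {1: "58.15", 2: "29.39", 3: "29.49", 4: "29.49", 5: "29.8", 6: "58.78"}
sale_list = {week: "" for week in sale}

for iteration_number in range(5):  # rules iterate(5): 0, 1, 2, 3, 4
    for week in sorted(sale):      # order by year, week
        # sale[cv(year), cv(week) - iteration_number + 2]; missing cell -> ''
        neighbour = sale.get(week - iteration_number + 2, "")
        separator = "" if iteration_number == 0 else ","
        sale_list[week] = neighbour + separator + sale_list[week]

for week in sorted(sale_list):
    print(week, sale_list[week])
```

Each pass prepends one neighbour, so after five passes the list runs from two weeks before to two weeks after the current week, with empty slots where a week is missing.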
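presentv and null values
If a rule accesses a row that does not exist, it returns null. The presentv function tests whether the referenced cell exists: presentv(cell_reference, expr1, expr2) returns expr1 if the cell exists and expr2 otherwise.
-- Iteration with presentv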
select year,week,sale,sale_list
from sales_fact sf
where sf.country in ('Australia') and sf.product ='Xtend Memory'
model return updated rows
partition by (product, country)
dimension by (year, week)
measures ( cast(' ' as varchar2(50) ) sale_list, sale )
rules iterate(5)(
sale_list[year,week] order by year,week =
presentv(sale[cv(year),cv(week)-iteration_number+2],
sale[cv(year),cv(week)-iteration_number+2] ||
case when iteration_number=0 then '' else ',' end ||
sale_list [cv(year),cv(week)],
sale_list [cv(year),cv(week)])
)
order by year, week;
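The presentnnv function is similar to presentv, but it additionally distinguishes a reference to a nonexistent cell from a reference to an existing cell whose value is null.
Syntax of the presentnnv function:
presentnnv (cell_reference, expr1, expr2)
If the first argument, cell_reference, refers to an existing cell and the cell does not contain null, the function returns expr1; otherwise it returns expr2.
Comparison of presentv and presentnnv
Cell exists | Null? | presentv | presentnnv |
---|---|---|---|
Yes | Not null | expr1 | expr1 |
Yes | Null | expr1 | expr2 |
No | (not applicable) | expr2 | expr2 |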
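The comparison can be expressed as two small Python functions. This is a sketch under assumptions: cells are a dictionary keyed by (year, week), a stored None stands for a null cell, and the week 2 cell with a null value is made up for illustration. The Python functions borrow the Oracle names only for readability.

```python
def presentv(cells, key, expr1, expr2):
    """expr1 if the cell exists (even when it holds null), else expr2."""
    return expr1 if key in cells else expr2

def presentnnv(cells, key, expr1, expr2):
    """expr1 only if the cell exists and is not null, else expr2."""
    return expr1 if cells.get(key) is not None else expr2

# (2001, 1) holds a value, (2001, 2) exists but is null, (2002, 1) does not exist.
cells = {(2001, 1): 92.26, (2001, 2): None}

for key in [(2001, 1), (2001, 2), (2002, 1)]:
    print(key, presentv(cells, key, "expr1", "expr2"), presentnnv(cells, key, "expr1", "expr2"))
```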
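Lookup tables
You can define a lookup table and refer to it in the rules section. Such a lookup table is sometimes also called a reference table.
-- Reference model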
select year,week,sale,prod_list_price
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model return updated rows
reference ref_prod on
(select prod_name, max(prod_list_price) prod_list_price
from sh.products group by prod_name)
dimension by (prod_name)
measures (prod_list_price)
MAIN main_section
partition by(product, country)
dimension by(year, week)
measures(sale, receipts, 0 prod_list_price)
rules(
prod_list_price[year, week] order by year, week =ref_prod.prod_list_price[cv(product)]
)
order by year,week;
YEAR WEEK SALE PROD_LIST_PRICE
---------- ---------- ---------- ---------------
2001 38 139 20.99
2001 39 115.57 20.99
2001 40 45.18 20.99
2001 41 67.19 20.99
2001 42 136.98 20.99
2001 43 139.58 20.99
2001 44 23.29 20.99
2001 46 93.58 20.99
2001 48 182.96 20.99
2001 49 45.26 20.99
2001 50 23.14 20.99
2001 51 114.82 20.99
2001 52 23.14 20.99
reference ref_prod on
(select prod_name, max(prod_list_price) prod_list_price
from sh.products group by prod_name)
dimension by (prod_name)
measures (prod_list_price)
The reference clause defines a lookup table: reference ref_prod names the lookup table ref_prod, prod_name is its dimension column, and prod_list_price is its measure column. Note that the dimension columns of a reference table must be unique, so that exactly one row is retrieved for each dimension value.
MAIN main_section
partition by(product, country)
dimension by(year, week)
measures(sale, receipts, 0 prod_list_price)
rules(
prod_list_price[year, week] order by year, week =ref_prod.prod_list_price[cv(product)]
)
The MAIN keyword introduces the main model section. The rule prod_list_price[year, week] order by year, week =ref_prod.prod_list_price[cv(product)] reads the list price from the lookup table: the current value of the product column is passed through cv(product) as the lookup key.
-- More lookup tables
select year,week,sale,prod_list_price,iso_code
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model return updated rows
reference ref_prod on
(select prod_name, max(prod_list_price) prod_list_price from sh.products group by prod_name)
dimension by (prod_name)
measures (prod_list_price)
reference ref_country on
(select c.country_name, c.country_iso_code from sh.countries c)
dimension by (country_name)
measures (country_iso_code)
MAIN main_section
partition by(product, country)
dimension by(year, week)
measures(sale, receipts, 0 prod_list_price, cast(' ' as varchar2(5)) iso_code)
rules(
prod_list_price[year, week] order by year, week =ref_prod.prod_list_price[cv(product)],
iso_code[year, week] order by year, week =ref_country.country_iso_code[cv(country)]
)
order by year,week;
YEAR WEEK SALE PROD_LIST_PRICE ISO_CODE
---------- ---------- ---------- --------------- --------
2001 38 139 20.99 AU
2001 39 115.57 20.99 AU
2001 40 45.18 20.99 AU
2001 41 67.19 20.99 AU
2001 42 136.98 20.99 AU
2001 43 139.58 20.99 AU
2001 44 23.29 20.99 AU
2001 46 93.58 20.99 AU
2001 48 182.96 20.99 AU
2001 49 45.26 20.99 AU
2001 50 23.14 20.99 AU
2001 51 114.82 20.99 AU
2001 52 23.14 20.99 AU
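Null values
In a SQL statement that uses the model clause, a value can be null for two reasons: an existing cell holds a null value, or the reference points to a cell that does not exist.
-- keep nav example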
select product,country, year,week,sale
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model keep nav return updated rows
partition by(product, country)
dimension by(year, week)
measures(sale)
rules sequential order(
sale[2001, 1] order by year, week =sale[2001,1],
sale[2002, 1] order by year, week =sale[2001,1]+sale[2002, 1]
)
order by year,week;
PRODUCT COUNTRY YEAR WEEK SALE
-------------------------------------------------- ---------------------------------------- ---------- ---------- ----------
Xtend Memory Australia 2001 1 92.26
Xtend Memory Australia 2002 1
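The sale[2002, 1] reference accesses the sale value for week 1 of 2002. The sales_fact table has no data for 2002, so sale[2002, 1] refers to a nonexistent cell. Because arithmetic on a null yields null, the output for this cell is null.
NAV stands for non-available values. By default (keep nav), a reference to a nonexistent cell returns null.
-- ignore nav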
select product,country, year,week,sale
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model ignore nav return updated rows
partition by(product, country)
dimension by(year, week)
measures(sale)
rules sequential order(
sale[2001, 1] order by year, week =sale[2001,1],
sale[2002, 1] order by year, week =sale[2001,1]+sale[2002, 1]
)
order by year,week;
PRODUCT COUNTRY YEAR WEEK SALE
-------------------------------------------------- ---------------------------------------- ---------- ---------- ----------
Xtend Memory Australia 2001 1 92.26
Xtend Memory Australia 2002 1 92.26
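The default behavior can be changed with the ignore nav clause. A reference to a nonexistent cell then returns 0 for numeric columns and an empty string for text columns instead of null. sale[2001,1]+sale[2002, 1] returns 92.26 because the nonexistent cell sale[2002, 1] evaluates to 0.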
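The two settings can be contrasted in a few lines of Python. The sketch assumes that None stands for SQL null and that cells are keyed by (year, week); the helpers cell and add are hypothetical names, with add mimicking null propagation in SQL arithmetic.

```python
def cell(cells, key, ignore_nav):
    """Read a numeric cell; with ignore nav a missing or null cell becomes 0."""
    value = cells.get(key)
    if value is None and ignore_nav:
        return 0
    return value

def add(a, b):
    """SQL-style addition: any null operand makes the result null."""
    return None if a is None or b is None else a + b

sale = {(2001, 1): 92.26}  # there is no cell for (2002, 1)

keep_nav = add(cell(sale, (2001, 1), False), cell(sale, (2002, 1), False))
ignore_nav = add(cell(sale, (2001, 1), True), cell(sale, (2002, 1), True))

print(keep_nav, ignore_nav)
```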
Performance tuning with the model clause
1. acyclic
set autotrace traceonly
-- automatic order and acyclic
select product,country, year,week,inventory,sale,receipts
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale,receipts)
rules automatic order(
inventory[year,week] order by year,week=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 612713790
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 251 | 25351 | 311 (1)| 00:00:04 |
| 1 | SORT ORDER BY | | 251 | 25351 | 311 (1)| 00:00:04 |
| 2 | SQL MODEL ACYCLIC | | 251 | 25351 | 311 (1)| 00:00:04 |
|* 3 | TABLE ACCESS FULL| SALES_FACT | 251 | 25351 | 310 (1)| 00:00:04 |
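The keyword acyclic indicates that no cyclic dependencies are possible between the rules. Here order by year,week controls the dependencies and avoids a cyclic dependency.
2. acyclic fast
If a rule is a simple rule that accesses only a single cell, the acyclic fast algorithm can be used.
-- automatic order and acyclic fast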
SELECT DISTINCT product,country,year,week, sale_first_week
FROM sales_fact
WHERE country IN ('Australia') AND product='Xtend Memory'
MODEL RETURN UPDATED ROWS
PARTITION BY (product, country)
DIMENSION BY (year,week)
MEASURES (0 sale_first_week, sale)
RULES AUTOMATIC ORDER(
sale_first_week[2000,1] = 0.12*sale[2000,1]
)
ORDER BY product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 2162534578
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 251 | 22088 | 312 (2)| 00:00:04 |
| 1 | SORT ORDER BY | | 251 | 22088 | 312 (2)| 00:00:04 |
| 2 | SQL MODEL ACYCLIC FAST| | 251 | 22088 | 312 (2)| 00:00:04 |
|* 3 | TABLE ACCESS FULL | SALES_FACT | 251 | 22088 | 310 (1)| 00:00:04 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')
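3. cyclic
The following statement is evaluated with the cyclic algorithm. The only difference from the acyclic example is that the rule has no order by year,week clause.
-- automatic order and cyclic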
select product,country, year,week,inventory,sale,receipts
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale,receipts)
rules automatic order(
inventory[year,week]=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 1486878524
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 251 | 25351 | 311 (1)| 00:00:04 |
| 1 | SORT ORDER BY | | 251 | 25351 | 311 (1)| 00:00:04 |
| 2 | SQL MODEL CYCLIC | | 251 | 25351 | 311 (1)| 00:00:04 |
|* 3 | TABLE ACCESS FULL| SALES_FACT | 251 | 25351 | 310 (1)| 00:00:04 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')
4. sequential
If the rules specify sequential order, the evaluation algorithm is ordered.
-- sequential order
select product,country, year,week,inventory,sale,receipts
from sales_fact
where country in ('Australia') and product in ('Xtend Memory')
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale,receipts)
rules sequential order(
inventory[year,week] order by year, week=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
order by product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 3753083011
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 251 | 25351 | 311 (1)| 00:00:04 |
| 1 | SORT ORDER BY | | 251 | 25351 | 311 (1)| 00:00:04 |
| 2 | SQL MODEL ORDERED | | 251 | 25351 | 311 (1)| 00:00:04 |
|* 3 | TABLE ACCESS FULL| SALES_FACT | 251 | 25351 | 310 (1)| 00:00:04 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')
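In short, the complexity and interdependence of the rules play a key role in the choice of evaluation algorithm. The acyclic fast and ordered fast algorithms scale better, which matters most as the data volume grows.
Predicate pushing
Conceptually, the model clause is a variant of analytic SQL and is typically implemented inside a view or inline view. Predicates are specified outside the view, and for acceptable performance they must be pushed into it. Predicate pushing is in fact critical to model clause performance: without it, the model clause runs over a much larger row set, which can lead to poor performance.
-- Predicate pushing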
SELECT * FROM (
select product,country, year,week,inventory,sale,receipts
from sales_fact
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale,receipts)
rules automatic order(
inventory[year,week] order by year, week=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
) where country in ('Australia') and product in ('Xtend Memory')
order by product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 3432178194
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 251 | 28614 | 311 (1)| 00:00:04 |
| 1 | SORT ORDER BY | | 251 | 28614 | 311 (1)| 00:00:04 |
| 2 | VIEW | | 251 | 28614 | 310 (1)| 00:00:04 |
| 3 | SQL MODEL ACYCLIC | | 251 | 25351 | | |
|* 4 | TABLE ACCESS FULL| SALES_FACT | 251 | 25351 | 310 (1)| 00:00:04 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("COUNTRY"='Australia' AND "PRODUCT"='Xtend Memory')
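The statement above defines an inline view and then applies predicates on the country and product columns. Step 4 of the execution plan shows that both predicates were pushed into the view: the rows are filtered by the two predicates first, and the model clause is then applied to the filtered result set.
-- Predicate not pushed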
SELECT * FROM (
select product,country, year,week,inventory,sale,receipts
from sales_fact
model return updated rows
partition by(product, country)
dimension by(year, week)
measures(0 inventory, sale,receipts)
rules automatic order(
inventory[year,week] order by year, week=nvl(inventory[cv(year),cv(week)-1],0)-sale[cv(year),cv(week)]+receipts[cv(year),cv(week)]
)
) where year=2000
order by product,country,year,week;
Execution plan
----------------------------------------------------------
Plan hash value: 3432178194
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 132K| 14M| | 3711 (1)| 00:00:45 |
| 1 | SORT ORDER BY | | 132K| 14M| 16M| 3711 (1)| 00:00:45 |
|* 2 | VIEW | | 132K| 14M| | 311 (1)| 00:00:04 |
| 3 | SQL MODEL ACYCLIC | | 132K| 12M| | | |
| 4 | TABLE ACCESS FULL| SALES_FACT | 132K| 12M| | 311 (1)| 00:00:04 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("YEAR"=2000)
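This is an example of a predicate that is not pushed into the view. The predicate year=2000 is specified, but it is not pushed into the inline view, and the optimizer estimates show that the model has to process about 132,000 rows. A predicate can be pushed into the view only when it is safe to do so. Here year and week are the dimension columns. Predicates on partitioning columns can be pushed safely, but not every predicate on a dimension column can be pushed.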