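One of the most useful features of the WITH clause is that it tames complex SQL queries. When a query involves many tables and columns, it becomes hard to follow the flow of data through it. With subquery factoring, the more complex parts can be moved out of the main query one subquery at a time, which makes the query easier to understand.
The following uses the PIVOT operator to produce a crosstab report. The innermost query builds a set of aggregates on the key columns of the sales table, and the query wrapped around it simply supplies the columns that the PIVOT operator refers to, yielding the final sales figures for each product by channel and quarter.
--Crosstab query without subquery factoring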
SELECT *
FROM (
SELECT /*+ gather_plan_statistics */
product,channel,quarter,country,quantity_sold
FROM (
SELECT pr.prod_name product,co.country_name country,sa.channel_id channel, Substr(t.calendar_quarter_desc,6,2) quarter,
SUM(sa.amount_sold) amount_sold,
SUM(sa.quantity_sold) quantity_sold
FROM sh.sales sa
JOIN sh.times t ON t.time_id=sa.time_id
JOIN sh.customers cu ON cu.cust_id=sa.cust_id
JOIN sh.countries co ON co.country_id=cu.country_id
JOIN sh.products pr ON pr.prod_id=sa.prod_id
GROUP BY
pr.prod_name,co.country_name,sa.channel_id,Substr(t.calendar_quarter_desc,6,2)
)
) PIVOT (
SUM(quantity_sold) FOR (channel,quarter) IN
(
(5,'02') AS catalog_q2,
(4,'01') AS internet_q1,
(4,'04') AS internet_q4,
(2,'02') AS partners_q2,
(9,'03') AS tele_q3
)
)
ORDER BY product,country;
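The WITH clause breaks this query into easy-to-understand, bite-sized pieces. The rewrite below uses the WITH clause to build three factored subqueries named sales_countries, top_sales, and sales_rpt. sales_countries maps each customer to the country where the sale took place, top_sales collects the sales data, and sales_rpt aggregates that data.
--Crosstab query with subquery factoring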
WITH sales_countries AS (
SELECT /*+ gather_plan_statistics */
cu.cust_id,co.country_name
FROM sh.countries co, sh.customers cu
WHERE cu.country_id=co.country_id
),
top_sales AS
(
SELECT p.prod_name,sc.country_name,sa.channel_id,
t.calendar_quarter_desc,sa.amount_sold,sa.quantity_sold
FROM sh.sales sa
JOIN sh.times t ON t.time_id=sa.time_id
JOIN sh.customers c ON c.cust_id = sa.cust_id
JOIN sales_countries sc ON sc.cust_id = c.cust_id
JOIN sh.products p ON p.prod_id = sa.prod_id
),
sales_rpt AS
(
SELECT ts.prod_name product,
ts.country_name country,
ts.channel_id channel,
SUBSTR(ts.calendar_quarter_desc,6,2) quarter,
SUM(amount_sold) amount_sold,
SUM(quantity_sold) quantity_sold
FROM top_sales ts
GROUP BY ts.prod_name,
ts.country_name,
ts.channel_id,
SUBSTR(ts.calendar_quarter_desc,6,2)
)
SELECT * FROM
(
SELECT product, channel,quarter,country,quantity_sold
FROM sales_rpt
) PIVOT (
SUM(quantity_sold)
FOR (channel,quarter) IN
(
(5,'02') AS catalog_q2,
(4,'01') AS internet_q1,
(4,'04') AS internet_q4,
(2,'02') AS partners_q2,
(9,'03') AS tele_q3
)
)
ORDER BY product,country;
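One practical benefit of the factored form is that each named subquery can be checked on its own. The following is a minimal sketch, assuming the same SH sample schema: it reuses only the sales_countries subquery and counts customers per country. The customer_count alias is introduced here for illustration.

```sql
-- Sketch: run one factored subquery in isolation to sanity-check it.
-- Assumes the Oracle SH sample schema used throughout this section.
WITH sales_countries AS (
  SELECT cu.cust_id, co.country_name
  FROM sh.countries co
  JOIN sh.customers cu ON cu.country_id = co.country_id
)
SELECT country_name,
       COUNT(*) AS customer_count   -- customers per country
FROM sales_countries
GROUP BY country_name
ORDER BY country_name;
```

Once a building block returns what you expect, the next subquery can be layered on top of it in the same way.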
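Defining PL/SQL functions with WITH
Oracle 12c introduced the ability to declare and define PL/SQL functions and procedures in the WITH clause. Once defined, the PL/SQL function can be referenced in the query that declares the clause.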
WITH
FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
IS
BEGIN
RETURN p_markup*p_price;
END;
SELECT t.prod_name,
t.prod_list_price cur_price,
calc_markup(.05,t.prod_list_price) mup5,
ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price
FROM sh.products t;
/
SELECT prod_name,cur_price,mup5,new_price
FROM (
WITH
FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
IS
BEGIN
RETURN p_markup*p_price;
END;
SELECT t.prod_name,
t.prod_list_price cur_price,
calc_markup(.05,t.prod_list_price) mup5,
ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price
FROM sh.products t
) WHERE cur_price<1000
AND new_price>1000;
SELECT /*+ WITH_PLSQL */ prod_name,cur_price,mup5,new_price
FROM (
WITH
FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
IS
BEGIN
RETURN p_markup*p_price;
END;
SELECT t.prod_name,
t.prod_list_price cur_price,
calc_markup(.05,t.prod_list_price) mup5,
ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price
FROM sh.products t
) WHERE cur_price<1000
AND new_price>1000;
/
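These statements must be run with a slash (/), much like executing an anonymous PL/SQL block. When the WITH clause that declares the function is nested inside a subquery, as in the last two statements, the top-level statement needs the /*+ WITH_PLSQL */ hint; without it, as in the second statement, Oracle raises ORA-32034 (unsupported use of WITH clause).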
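Since the WITH clause can hold procedures as well as functions, a function declared there can call a procedure declared before it. The sketch below assumes Oracle 12c or later and SQL*Plus; log_price and price_with_markup are hypothetical names made up for this illustration, not part of the original example.

```sql
-- Sketch only: log_price and price_with_markup are hypothetical names.
SET SERVEROUTPUT ON
WITH
  PROCEDURE log_price(p_price NUMBER) IS
  BEGIN
    -- write the incoming list price to the DBMS_OUTPUT buffer
    DBMS_OUTPUT.PUT_LINE('list price = ' || p_price);
  END;
  FUNCTION price_with_markup(p_markup NUMBER, p_price NUMBER) RETURN NUMBER IS
  BEGIN
    log_price(p_price);                         -- procedure declared above
    RETURN ROUND(p_price * (1 + p_markup), 2);  -- price plus markup
  END;
SELECT t.prod_name,
       price_with_markup(0.10, t.prod_list_price) AS new_price
FROM sh.products t
WHERE ROWNUM <= 5
/
```

The procedure is declared first so that the function can see it; the query itself references only the function.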
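SQL optimization
When a SQL query is designed or modified to take advantage of subquery factoring, the optimizer may handle a factored subquery as a temporary table when it builds the execution plan for the query.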
sqlplus scott/scott@orcl
Command-line login syntax
sqlplus username/password@servername as sysdba
set autotrace on
Displays the query results together with the execution plan and statistics
set autotrace traceonly
Displays only the execution plan and statistics, without the query results
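Outside of autotrace, the plan for a factored query can also be inspected with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. The following is a minimal sketch that assumes the SH sample schema and a default PLAN_TABLE; the simplified query is only an illustration, not the report query used below.

```sql
-- Sketch: show the optimizer's plan without running the query.
EXPLAIN PLAN FOR
WITH cust AS (
  SELECT t.cust_income_level, a.country_name
  FROM sh.customers t
  JOIN sh.countries a ON a.country_id = t.country_id
)
SELECT country_name, COUNT(*)
FROM cust
GROUP BY country_name;

-- Format and print the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Unlike autotrace, this shows only the estimated plan; it reports no run-time statistics such as consistent gets.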
SQL> set autotrace traceonly
SQL> --WITH and MATERIALIZE
SQL> WITH cust AS (
2 SELECT /*+ materialize gather_plan_statistics */
3 t.cust_income_level,
4 a.country_name
5 FROM sh.customers t
6 JOIN sh.countries a ON a.country_id=t.country_id
7 )
8 SELECT c.country_name,cust_income_level,COUNT(c.country_name) country_cust_count
9 FROM cust c
10 HAVING COUNT(country_name) >
11 (
12 SELECT COUNT(*) * .01 FROM cust c2
13 )
14 OR COUNT(cust_income_level) >
15 (
16 SELECT MEDIAN(income_income_count)
17 FROM (
18 SELECT cust_income_level,COUNT(*)* .25 income_income_count
19 FROM cust
20 GROUP BY cust_income_level
21 )
22 )
23 GROUP BY country_name,cust_income_level
24 ORDER BY 1,2;
35 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3455850065
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 620 | 499 (2)| 00:00:06 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | | | | | |
|* 3 | HASH JOIN | | 55500 | 2222K| 410 (1)| 00:00:05 |
| 4 | TABLE ACCESS FULL | COUNTRIES | 23 | 345 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
|* 6 | FILTER | | | | | |
| 7 | SORT GROUP BY | | 20 | 620 | 89 (5)| 00:00:02 |
| 8 | VIEW | | 55500 | 1680K| 86 (2)| 00:00:02 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_61C9C4 | 55500 | 1680K| 86 (2)| 00:00:02 |
| 10 | SORT AGGREGATE | | 1 | | | |
| 11 | VIEW | | 55500 | | 86 (2)| 00:00:02 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_61C9C4 | 55500 | 1680K| 86 (2)| 00:00:02 |
| 13 | SORT GROUP BY | | 1 | 13 | | |
| 14 | VIEW | | 12 | 156 | 89 (5)| 00:00:02 |
| 15 | SORT GROUP BY | | 12 | 252 | 89 (5)| 00:00:02 |
| 16 | VIEW | | 55500 | 1138K| 86 (2)| 00:00:02 |
| 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_61C9C4 | 55500 | 1680K| 86 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
6 - filter(COUNT("COUNTRY_NAME")> (SELECT COUNT(*)*.01 FROM (SELECT /*+ CACHE_TEMP_TABLE
("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_61C9C4"
"T1") "C2") OR COUNT("CUST_INCOME_LEVEL")> (SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP (
ORDER BY "INCOME_INCOME_COUNT") FROM (SELECT "CUST_INCOME_LEVEL"
"CUST_INCOME_LEVEL",COUNT(*)*.25 "INCOME_INCOME_COUNT" FROM (SELECT /*+ CACHE_TEMP_TABLE
("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_61C9C4"
"T1") "CUST" GROUP BY "CUST_INCOME_LEVEL") "from$_subquery$_006"))
Statistics
----------------------------------------------------------
4 recursive calls
314 db block gets
2382 consistent gets
303 physical reads
600 redo size
1916 bytes sent via SQL*Net to client
438 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
35 rows processed
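A report requested by management illustrates why testing is needed to get the best query performance. The report must show the distribution of customers by country and income level, and it must show only those country and income-level combinations that account for 1% or more of all customers. A country and income level must also be included when its number of customers reaches 25% of the total number of customers at that income level. (The query as written uses strict > comparisons.) The factored subquery cust is carried over from the previous query; what is new are the subqueries in the HAVING clause, which enforce the rules that the report specifies.
When this SQL statement runs, everything behaves as expected. Checking the execution plan shows that the join of the customers and countries tables goes through a TEMP TABLE TRANSFORMATION, and the rest of the query reads from the resulting temporary table (SYS_TEMP_0FD9D6607_61C9C4 in this plan). If you doubt whether the chosen execution plan is reasonable, you can test it with the MATERIALIZE and INLINE hints.
SQL> --WITH and INLINE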
SQL> WITH cust AS (
2 SELECT /*+ inline gather_plan_statistics */
3 t.cust_income_level,
4 a.country_name
5 FROM sh.customers t
6 JOIN sh.countries a ON a.country_id=t.country_id
7 )
8 SELECT c.country_name,cust_income_level,COUNT(c.country_name) country_cust_count
9 FROM cust c
10 HAVING COUNT(country_name) >
11 (
12 SELECT COUNT(*) * .01 FROM cust c2
13 )
14 OR COUNT(cust_income_level) >
15 (
16 SELECT MEDIAN(income_income_count)
17 FROM (
18 SELECT cust_income_level,COUNT(*)* .25 income_income_count
19 FROM cust
20 GROUP BY cust_income_level
21 )
22 )
23 GROUP BY country_name,cust_income_level
24 ORDER BY 1,2;
35 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1412345716
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 820 | 413 (2)| 00:00:05 |
|* 1 | FILTER | | | | | |
| 2 | SORT GROUP BY | | 20 | 820 | 413 (2)| 00:00:05 |
|* 3 | HASH JOIN | | 55500 | 2222K| 410 (1)| 00:00:05 |
| 4 | TABLE ACCESS FULL | COUNTRIES | 23 | 345 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
| 6 | SORT AGGREGATE | | 1 | 10 | | |
|* 7 | HASH JOIN | | 55500 | 541K| 408 (1)| 00:00:05 |
| 8 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 270K| 406 (1)| 00:00:05 |
| 10 | SORT GROUP BY | | 1 | 13 | | |
| 11 | VIEW | | 12 | 156 | 411 (2)| 00:00:05 |
| 12 | SORT GROUP BY | | 12 | 372 | 411 (2)| 00:00:05 |
|* 13 | HASH JOIN | | 55500 | 1680K| 408 (1)| 00:00:05 |
| 14 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 |
| 15 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(COUNT(*)> (SELECT COUNT(*)*.01 FROM "SH"."COUNTRIES"
"A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID") OR
COUNT("T"."CUST_INCOME_LEVEL")> (SELECT PERCENTILE_CONT(0.500000)WITHIN GROUP
( ORDER BY "INCOME_INCOME_COUNT") FROM (SELECT "T"."CUST_INCOME_LEVEL"
"CUST_INCOME_LEVEL",COUNT(*)*.25 "INCOME_INCOME_COUNT" FROM "SH"."COUNTRIES"
"A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY
"T"."CUST_INCOME_LEVEL") "from$_subquery$_006"))
3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
7 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
13 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4382 consistent gets
1 physical reads
0 redo size
1916 bytes sent via SQL*Net to client
438 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
35 rows processed
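The execution plan shows three full scans of the CUSTOMERS table and one full scan of COUNTRIES. Two of the executions of the cust subquery need only the information in the COUNTRIES_PK index, so each performs a full scan of the index rather than of the table, which saves a small amount of time and resources.
Flush the shared pool
alter system flush shared_pool;
Flush the buffer cache
alter system flush buffer_cache;
Testing the effect of a query change
Earlier, the report required that the number of people at a given income level in any country equal or exceed 25% of all people at that income level. Now suppose the requirement changes: an income level must also be included in the report if its number of customers is greater than the median customer count across income levels.
The modified query with the INLINE hint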
SQL> WITH cust AS
2 (SELECT /*+ inline gather_plan_statistics */ --income level and country of each customer
3 t.cust_income_level, a.country_name
4 FROM sh.customers t
5 JOIN sh.countries a
6 ON a.country_id = t.country_id
7 ),
8 median_income_set AS
9 (SELECT /*+ inline */
10 cust_income_level, COUNT(*) income_level_count --income levels whose customer count exceeds the median count
11 FROM cust
12 GROUP BY cust_income_level
13 HAVING COUNT(cust_income_level) > (SELECT MEDIAN(income_level_count) income_level_count
14 FROM (SELECT cust_income_level,
15 COUNT(*) income_level_count
16 FROM cust
17 GROUP BY cust_income_level)))
18 SELECT country_name,
19 cust_income_level,
20 COUNT(country_name) country_cust_count
21 FROM cust c
22 HAVING COUNT (country_name) > (SELECT COUNT(*) * .01 FROM cust c2) OR cust_income_level IN (SELECT mis.cust_income_level
23 FROM median_income_set mis)
24 GROUP BY country_name, cust_income_level;
123 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1635819209
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 820 | 413 (2)| 00:00:05 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 20 | 820 | 413 (2)| 00:00:05 |
|* 3 | HASH JOIN | | 55500 | 2222K| 410 (1)| 00:00:05 |
| 4 | TABLE ACCESS FULL | COUNTRIES | 23 | 345 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
| 6 | SORT AGGREGATE | | 1 | 10 | | |
|* 7 | HASH JOIN | | 55500 | 541K| 408 (1)| 00:00:05 |
| 8 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 270K| 406 (1)| 00:00:05 |
|* 10 | FILTER | | | | | |
| 11 | HASH GROUP BY | | 1 | 31 | 411 (2)| 00:00:05 |
|* 12 | HASH JOIN | | 55500 | 1680K| 408 (1)| 00:00:05 |
| 13 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 |
| 14 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
| 15 | SORT GROUP BY | | 1 | 13 | | |
| 16 | VIEW | | 12 | 156 | 411 (2)| 00:00:05 |
| 17 | SORT GROUP BY | | 12 | 372 | 411 (2)| 00:00:05 |
|* 18 | HASH JOIN | | 55500 | 1680K| 408 (1)| 00:00:05 |
| 19 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 |
| 20 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(COUNT(*)> (SELECT COUNT(*)*.01 FROM "SH"."COUNTRIES"
"A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID") OR EXISTS
(SELECT 0 FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
"A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL" HAVING
"T"."CUST_INCOME_LEVEL"=:B1 AND COUNT("T"."CUST_INCOME_LEVEL")> (SELECT
PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM
(SELECT "T"."CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*)
"INCOME_LEVEL_COUNT" FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
"A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL")
"from$_subquery$_005")))
3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
7 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
10 - filter("T"."CUST_INCOME_LEVEL"=:B1 AND COUNT("T"."CUST_INCOME_LEVEL")>
(SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT")
FROM (SELECT "T"."CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*)
"INCOME_LEVEL_COUNT" FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
"A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL")
"from$_subquery$_005"))
12 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
18 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
23362 consistent gets
0 physical reads
0 redo size
5460 bytes sent via SQL*Net to client
504 bytes received via SQL*Net from client
10 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
123 rows processed
SQL>
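The modified query with the MATERIALIZE hint
The inline version of the modified query adds one more full table scan and one more index scan. Below is the performance output when the temp table transformation is allowed: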
SQL> WITH cust AS
2 (SELECT /*+ materialize gather_plan_statistics */ --income level and country of each customer
3 t.cust_income_level, a.country_name
4 FROM sh.customers t
5 JOIN sh.countries a
6 ON a.country_id = t.country_id
7 ),
8 median_income_set AS
9 (SELECT /*+ inline */
10 cust_income_level, COUNT(*) income_level_count --income levels whose customer count exceeds the median count
11 FROM cust
12 GROUP BY cust_income_level
13 HAVING COUNT(cust_income_level) > (SELECT MEDIAN(income_level_count) income_level_count
14 FROM (SELECT cust_income_level,
15 COUNT(*) income_level_count
16 FROM cust
17 GROUP BY cust_income_level)))
18 SELECT country_name,
19 cust_income_level,
20 COUNT(country_name) country_cust_count
21 FROM cust c
22 HAVING COUNT (country_name) > (SELECT COUNT(*) * .01 FROM cust c2) OR cust_income_level IN (SELECT mis.cust_income_level
23 FROM median_income_set mis)
24 GROUP BY country_name, cust_income_level;
123 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2452612
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 620 | 499 (2)| 00:00:06 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | | | | | |
|* 3 | HASH JOIN | | 55500 | 2222K| 410 (1)| 00:00:05 |
| 4 | TABLE ACCESS FULL | COUNTRIES | 23 | 345 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 1409K| 406 (1)| 00:00:05 |
|* 6 | FILTER | | | | | |
| 7 | HASH GROUP BY | | 20 | 620 | 89 (5)| 00:00:02 |
| 8 | VIEW | | 55500 | 1680K| 86 (2)| 00:00:02 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_644975 | 55500 | 1680K| 86 (2)| 00:00:02 |
| 10 | SORT AGGREGATE | | 1 | | | |
| 11 | VIEW | | 55500 | | 86 (2)| 00:00:02 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_644975 | 55500 | 1680K| 86 (2)| 00:00:02 |
|* 13 | FILTER | | | | | |
| 14 | HASH GROUP BY | | 1 | 21 | 89 (5)| 00:00:02 |
| 15 | VIEW | | 55500 | 1138K| 86 (2)| 00:00:02 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_644975 | 55500 | 1680K| 86 (2)| 00:00:02 |
| 17 | SORT GROUP BY | | 1 | 13 | | |
| 18 | VIEW | | 12 | 156 | 89 (5)| 00:00:02 |
| 19 | SORT GROUP BY | | 12 | 252 | 89 (5)| 00:00:02 |
| 20 | VIEW | | 55500 | 1138K| 86 (2)| 00:00:02 |
| 21 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6607_644975 | 55500 | 1680K| 86 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
6 - filter(COUNT("COUNTRY_NAME")> (SELECT COUNT(*)*.01 FROM (SELECT /*+ CACHE_TEMP_TABLE
("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_644975"
"T1") "C2") OR EXISTS (SELECT 0 FROM (SELECT /*+ CACHE_TEMP_TABLE ("T1") */ "C0"
"CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST"
GROUP BY "CUST_INCOME_LEVEL" HAVING "CUST_INCOME_LEVEL"=:B1 AND COUNT("CUST_INCOME_LEVEL")>
(SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM (SELECT
"CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*) "INCOME_LEVEL_COUNT" FROM (SELECT /*+
CACHE_TEMP_TABLE ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM
"SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST" GROUP BY "CUST_INCOME_LEVEL")
"from$_subquery$_005")))
13 - filter("CUST_INCOME_LEVEL"=:B1 AND COUNT("CUST_INCOME_LEVEL")> (SELECT
PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM (SELECT
"CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*) "INCOME_LEVEL_COUNT" FROM (SELECT /*+
CACHE_TEMP_TABLE ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM
"SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST" GROUP BY "CUST_INCOME_LEVEL")
"from$_subquery$_005"))
Statistics
----------------------------------------------------------
138 recursive calls
317 db block gets
6379 consistent gets
303 physical reads
1520 redo size
5460 bytes sent via SQL*Net to client
504 bytes received via SQL*Net from client
10 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
123 rows processed
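Because the modified version of the query adds scans, the logical I/O cost becomes more pronounced: 23362 consistent gets with INLINE versus 6379 with MATERIALIZE. For this query it is clearly more efficient to let Oracle perform the temp table transformation, writing the result of the hash join to a temporary table on disk and then reusing it several times in the query.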
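The gather_plan_statistics hint used in these queries collects row-source statistics at run time, which can be read back with DBMS_XPLAN.DISPLAY_CURSOR. The following is a minimal sketch, assuming SQL*Plus, the SH sample schema, and read access to the V$ views that DISPLAY_CURSOR relies on; the short query is a stand-in for the report queries above.

```sql
-- Sketch: compare estimated and actual row counts for the last statement.
SET AUTOTRACE OFF      -- autotrace runs extra statements of its own
SET SERVEROUTPUT OFF   -- otherwise the "last" cursor may be a DBMS_OUTPUT call

WITH cust AS (
  SELECT /*+ materialize */
         t.cust_income_level, a.country_name
  FROM sh.customers t
  JOIN sh.countries a ON a.country_id = t.country_id
)
SELECT /*+ gather_plan_statistics */
       country_name, COUNT(*)
FROM cust
GROUP BY country_name;

-- NULL, NULL means "the last statement executed in this session".
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Swapping materialize for inline and rerunning gives a per-operation view of where the extra buffer gets come from.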
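Looking for other optimization opportunities
This example computes costs for each product by sales channel: the average, minimum, and maximum cost of each product during 2000. The query below is hard to read and hard to modify, and it is also somewhat inefficient.
SQL> --Old SQL statement used to calculate costs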
SQL> SELECT /*+ gather_plan_statistics */
2 SUBSTR(prod_name,1,30) prod_name,
3 channel_desc,
4 (
5 SELECT AVG(c2.unit_cost) AS avg_cost FROM sh.costs c2
6 WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
7 AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
8 AND to_date('12/31/2000','mm/dd/yyyy')
9 ),
10 (
11 SELECT MIN(c2.unit_cost) AS min_cost FROM sh.costs c2
12 WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
13 AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
14 AND to_date('12/31/2000','mm/dd/yyyy')
15 ),
16 (
17 SELECT MAX(c2.unit_cost) AS max_cost FROM sh.costs c2
18 WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
19 AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
20 AND to_date('12/31/2000','mm/dd/yyyy')
21 )
22 FROM (
23 SELECT DISTINCT pr.prod_id,pr.prod_name,ch.channel_id,ch.channel_desc
24 FROM sh.channels ch,sh.products pr,sh.costs co
25 WHERE ch.channel_id=co.channel_id
26 AND co.prod_id=pr.prod_id
27 AND co.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
28 AND to_date('12/31/2000','mm/dd/yyyy')
29 ) c
30 ORDER BY prod_name,channel_desc;
216 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1877279774
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20640 | 1310K| | 640 (1)| 00:00:08 | | |
| 1 | SORT AGGREGATE | | 1 | 20 | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| COSTS | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
| 4 | BITMAP CONVERSION TO ROWIDS | | | | | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE | COSTS_PROD_BIX | | | | | | 13 | 16 |
| 6 | SORT AGGREGATE | | 1 | 20 | | | | | |
| 7 | PARTITION RANGE ITERATOR | | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
|* 8 | TABLE ACCESS BY LOCAL INDEX ROWID| COSTS | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
| 9 | BITMAP CONVERSION TO ROWIDS | | | | | | | | |
|* 10 | BITMAP INDEX SINGLE VALUE | COSTS_PROD_BIX | | | | | | 13 | 16 |
| 11 | SORT AGGREGATE | | 1 | 20 | | | | | |
| 12 | PARTITION RANGE ITERATOR | | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
|* 13 | TABLE ACCESS BY LOCAL INDEX ROWID| COSTS | 96 | 1920 | | 17 (0)| 00:00:01 | 13 | 16 |
| 14 | BITMAP CONVERSION TO ROWIDS | | | | | | | | |
|* 15 | BITMAP INDEX SINGLE VALUE | COSTS_PROD_BIX | | | | | | 13 | 16 |
| 16 | SORT ORDER BY | | 20640 | 1310K|1632K| 640 (1)| 00:00:08 | | |
| 17 | VIEW | | 20640 | 1310K| | 316 (2)| 00:00:04 | | |
| 18 | HASH UNIQUE | | 20640 | 1169K|1384K| 316 (2)| 00:00:04 | | |
|* 19 | HASH JOIN | | 20640 | 1169K| | 25 (8)| 00:00:01 | | |
| 20 | TABLE ACCESS FULL | PRODUCTS | 72 | 2160 | | 3 (0)| 00:00:01 | | |
|* 21 | HASH JOIN | | 20640 | 564K| | 21 (5)| 00:00:01 | | |
| 22 | TABLE ACCESS FULL | CHANNELS | 5 | 65 | | 3 (0)| 00:00:01 | | |
| 23 | PARTITION RANGE ITERATOR | | 20640 | 302K| | 17 (0)| 00:00:01 | 13 | 16 |
|* 24 | TABLE ACCESS FULL | COSTS | 20640 | 302K| | 17 (0)| 00:00:01 | 13 | 16 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
5 - access("C2"."PROD_ID"=:B1)
8 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
10 - access("C2"."PROD_ID"=:B1)
13 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
15 - access("C2"."PROD_ID"=:B1)
19 - access("CO"."PROD_ID"="PR"."PROD_ID")
21 - access("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
24 - filter("CO"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
29642 consistent gets
0 physical reads
0 redo size
14092 bytes sent via SQL*Net to client
570 bytes received via SQL*Net from client
16 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
216 rows processed
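The rewrite starts by moving the begin_date and end_date values into a separate subquery, bookends, so that the values need to be set in only one place. The product data goes into the prodmaster subquery. These pieces of SQL would work just as well as ordinary subqueries, but moving them into factored subqueries greatly improves the overall readability of the statement.
The average, maximum, and minimum cost calculations are replaced by a single subquery called cost_compare. Finally, the main query joins the prodmaster and cost_compare subqueries.
SQL> --Old SQL statement refactored with the WITH clause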
SQL> WITH bookends AS
2 (SELECT to_date('01/01/2000', 'mm/dd/yyyy') begin_date,
3 to_date('12/31/2000', 'mm/dd/yyyy') end_date
4 FROM dual),
5 prodmaster AS
6 (SELECT DISTINCT pr.prod_id, pr.prod_name, ch.channel_id, ch.channel_desc
7 FROM sh.channels ch, sh.products pr, sh.costs co
8 WHERE ch.channel_id = co.channel_id
9 AND co.prod_id = pr.prod_id
10 AND co.time_id BETWEEN (SELECT begin_date FROM bookends) AND
11 (SELECT end_date FROM bookends)),
12 cost_compare AS
13 (SELECT c2.prod_id,
14 c2.channel_id,
15 AVG(c2.unit_cost) avg_cost,
16 MIN(c2.unit_cost) min_cost,
17 MAX(c2.unit_cost) max_cost
18 FROM sh.costs c2
19 WHERE c2.time_id BETWEEN (SELECT begin_date FROM bookends) AND
20 (SELECT end_date FROM bookends)
21 GROUP BY c2.prod_id, c2.channel_id)
22 SELECT /*+ gather_plan_statistics */
23 SUBSTR(pm.prod_name, 1, 30) prod_name,
24 pm.channel_desc,
25 cc.avg_cost,
26 cc.min_cost,
27 cc.max_cost
28 FROM prodmaster pm
29 JOIN cost_compare cc
30 ON cc.prod_id = pm.prod_id
31 AND cc.channel_id = pm.channel_id
32 ORDER BY pm.prod_id, pm.channel_id;
216 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2361085328
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 138 | 12696 | 83 (5)| 00:00:01 | | |
| 1 | MERGE JOIN | | 138 | 12696 | 83 (5)| 00:00:01 | | |
| 2 | SORT JOIN | | 205 | 9430 | 44 (5)| 00:00:01 | | |
| 3 | VIEW | | 205 | 9430 | 44 (5)| 00:00:01 | | |
| 4 | HASH UNIQUE | | 205 | 11890 | 44 (5)| 00:00:01 | | |
|* 5 | HASH JOIN | | 205 | 11890 | 39 (3)| 00:00:01 | | |
| 6 | TABLE ACCESS FULL | PRODUCTS | 72 | 2160 | 3 (0)| 00:00:01 | | |
| 7 | MERGE JOIN | | 205 | 5740 | 36 (3)| 00:00:01 | | |
| 8 | TABLE ACCESS BY INDEX ROWID | CHANNELS | 5 |65 | 2 (0)| 00:00:01 | | |
| 9 | INDEX FULL SCAN | CHANNELS_PK | 5 | | 1 (0)| 00:00:01 | | |
|* 10 | SORT JOIN | | 205 | 3075 | 34 (3)| 00:00:01 | | |
| 11 | PARTITION RANGE ITERATOR | | 205 | 3075 | 33 (0)| 00:00:01 | KEY | KEY |
| 12 | TABLE ACCESS BY LOCAL INDEX ROWID| COSTS | 205 | 3075 | 33 (0)| 00:00:01 | KEY | KEY |
| 13 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 14 | BITMAP INDEX RANGE SCAN | COSTS_TIME_BIX | | | | | KEY | KEY |
| 15 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | |
| 16 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | |
|* 17 | SORT JOIN | | 145 | 6670 | 39 (6)| 00:00:01 | | |
| 18 | VIEW | | 145 | 6670 | 38 (3)| 00:00:01 | | |
| 19 | HASH GROUP BY | | 145 | 2900 | 38 (3)| 00:00:01 | | |
| 20 | PARTITION RANGE ITERATOR | | 205 | 4100 | 33 (0)| 00:00:01 | KEY | KEY |
| 21 | TABLE ACCESS BY LOCAL INDEX ROWID | COSTS | 205 | 4100 | 33 (0)| 00:00:01 | KEY | KEY |
| 22 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 23 | BITMAP INDEX RANGE SCAN | COSTS_TIME_BIX | | | | | KEY | KEY |
| 24 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | |
| 25 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("CO"."PROD_ID"="PR"."PROD_ID")
10 - access("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
filter("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
14 - access("CO"."TIME_ID">= (SELECT TO_DATE(' 2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL"
"DUAL") AND "CO"."TIME_ID"<= (SELECT TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL"
"DUAL"))
17 - access("CC"."PROD_ID"="PM"."PROD_ID" AND "CC"."CHANNEL_ID"="PM"."CHANNEL_ID")
filter("CC"."CHANNEL_ID"="PM"."CHANNEL_ID" AND "CC"."PROD_ID"="PM"."PROD_ID")
23 - access("C2"."TIME_ID">= (SELECT TO_DATE(' 2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL"
"DUAL") AND "C2"."TIME_ID"<= (SELECT TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL"
"DUAL"))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7436 consistent gets
0 physical reads
0 redo size
13596 bytes sent via SQL*Net to client
570 bytes received via SQL*Net from client
16 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
216 rows processed
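The refactored query still reads bookends through four scalar subqueries and scans COSTS twice. One possible variation, sketched here and not benchmarked, cross-joins the single-row bookends subquery and lets cost_compare supply the product and channel pairs directly. It assumes that every prod_id and channel_id in sh.costs has a matching row in sh.products and sh.channels, which makes the separate prodmaster subquery unnecessary.

```sql
-- Sketch: same report, with bookends cross-joined instead of read via scalar subqueries.
WITH bookends AS (
  SELECT DATE '2000-01-01' AS begin_date,
         DATE '2000-12-31' AS end_date
  FROM dual
),
cost_compare AS (
  SELECT c2.prod_id,
         c2.channel_id,
         AVG(c2.unit_cost) AS avg_cost,
         MIN(c2.unit_cost) AS min_cost,
         MAX(c2.unit_cost) AS max_cost
  FROM sh.costs c2
  CROSS JOIN bookends b              -- bookends has exactly one row
  WHERE c2.time_id BETWEEN b.begin_date AND b.end_date
  GROUP BY c2.prod_id, c2.channel_id
)
SELECT SUBSTR(pr.prod_name, 1, 30) AS prod_name,
       ch.channel_desc,
       cc.avg_cost,
       cc.min_cost,
       cc.max_cost
FROM cost_compare cc
JOIN sh.products pr ON pr.prod_id = cc.prod_id
JOIN sh.channels ch ON ch.channel_id = cc.channel_id
ORDER BY cc.prod_id, cc.channel_id;
```

Whether this is actually cheaper depends on the plan the optimizer picks, so it should be checked with autotrace in the same way as the versions above.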
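Applying subquery factoring to PL/SQL
Example:
- List only customers who purchased products in at least 3 different years
- Total each customer's purchases, grouped by product category
Retrieving the required data with conventional PL/SQL:
First, query all customers who meet the criterion and save their IDs in a temporary table.
Then loop over the saved customer IDs, find and sum each customer's sales, and insert the results into another temporary table. Finally, join the results to the customers and products tables to produce the report.
--Generate the customer report with PL/SQL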
BEGIN
EXECUTE IMMEDIATE 'drop table cust3year';
EXECUTE IMMEDIATE 'drop table sales3year';
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
/
create global temporary table cust3year(cust_id number);
CREATE GLOBAL TEMPORARY TABLE sales3year(
cust_id NUMBER,
prod_category VARCHAR2(50),
total_sale NUMBER
)
/
BEGIN
EXECUTE IMMEDIATE 'truncate table cust3year';
EXECUTE IMMEDIATE 'truncate table sales3year';
INSERT INTO cust3year
SELECT cust_id--,count(cust_years) year_count
FROM (
SELECT DISTINCT cust_id,TRUNC(time_id,'YEAR') cust_years
FROM sh.sales
)
GROUP BY cust_id
HAVING COUNT(cust_years)>=3;
--SELECT * FROM cust3year;
FOR crec IN (SELECT cust_id FROM cust3year)
LOOP
INSERT INTO sales3year
SELECT sa.cust_id,p.prod_category,SUM(co.unit_cost*sa.quantity_sold)
FROM sh.sales sa
JOIN sh.products p ON p.prod_id=sa.prod_id
JOIN sh.costs co ON co.prod_id=sa.prod_id AND co.time_id=sa.time_id
JOIN sh.customers cu ON cu.cust_id=sa.cust_id
WHERE crec.cust_id=sa.cust_id
GROUP BY sa.cust_id,p.prod_category;
END LOOP;
END;
/
SELECT c3.cust_id,c.cust_last_name,c.cust_first_name,s3.prod_category,s3.total_sale FROM sales3year s3
JOIN cust3year c3 ON s3.cust_id=c3.cust_id
JOIN sh.customers c ON c.cust_id=s3.cust_id
ORDER BY 1,4;
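The PL/SQL block above works well, but it can be improved with subquery factoring. First move the part that identifies the customer IDs into the WITH clause, then use the result of that subquery to produce the sales, product, and customer information the report needs.
--Generate the customer report with the WITH clause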
WITH cust3year AS(
SELECT cust_id
FROM (
SELECT DISTINCT cust_id,TRUNC(time_id,'YEAR') cust_years
FROM sh.sales
)
GROUP BY cust_id
HAVING COUNT(cust_years)>=3
),
sales3year AS (
SELECT sa.cust_id,p.prod_category,SUM(co.unit_cost*sa.quantity_sold) AS total_sale
FROM sh.sales sa
JOIN sh.products p ON p.prod_id=sa.prod_id
JOIN sh.costs co ON co.prod_id=sa.prod_id AND co.time_id=sa.time_id
JOIN sh.customers cu ON cu.cust_id=sa.cust_id
WHERE sa.cust_id IN (SELECT cust_id FROM cust3year)
GROUP BY sa.cust_id,p.prod_category
)
SELECT c3.cust_id,c.cust_last_name,c.cust_first_name,s3.prod_category,s3.total_sale FROM sales3year s3
JOIN cust3year c3 ON s3.cust_id=c3.cust_id
JOIN sh.customers c ON c.cust_id=s3.cust_id
ORDER BY 1,4;
WITH custyear AS
(SELECT sa.cust_id, EXTRACT(YEAR FROM time_id) sales_year
FROM sh.sales sa
WHERE EXTRACT(YEAR FROM time_id) BETWEEN 1998 AND 2002
GROUP BY sa.cust_id, EXTRACT(YEAR FROM time_id)),
cust3year AS
(SELECT DISTINCT c3.cust_id
FROM (SELECT cust_id, COUNT(*) OVER(PARTITION BY cust_id) year_count
FROM custyear) c3
WHERE c3.year_count >= 3)
SELECT c.cust_id,
c.cust_last_name,
c.cust_first_name,
p.prod_category,
SUM(co.unit_price * sa.quantity_sold) AS total_sale
FROM cust3year c3
JOIN sh.sales sa
ON sa.cust_id = c3.cust_id
JOIN sh.products p
ON p.prod_id = sa.prod_id
JOIN sh.costs co
ON co.prod_id = sa.prod_id
AND co.time_id = sa.time_id
JOIN sh.customers c
ON c.cust_id = c3.cust_id
GROUP BY c.cust_id, c.cust_last_name, c.cust_first_name, p.prod_category
ORDER BY c.cust_id;
The extract() function pulls the year out of a date and returns it as an integer, which simplifies the year comparison.
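As a small illustration of the point above, the following sketch applies EXTRACT to a date literal. The literal date and the column alias are assumptions chosen only for the example.

```sql
-- EXTRACT(YEAR FROM ...) returns the year as a number,
-- so it can be compared directly with BETWEEN 1998 AND 2002.
SELECT EXTRACT(YEAR FROM DATE '2001-03-15') AS sales_year
FROM dual;
-- returns 2001
```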
Subquery factoring can be used to organize queries better and, in some cases, even as a performance tuning tool. Learning to use it adds a new tool to your Oracle toolbox.
Recursive subqueries
Recursive subquery factoring (RSF)
--basic connect by
SELECT LPAD(' ', LEVEL * 2 - 1, ' ') || emp.emp_last_name emp_last_name,
emp.emp_first_name,
emp.employee_id,
emp.mgr_last_name,
emp.mgr_first_name,
emp.manager_id,
emp.department_name
FROM (SELECT /*+ inline gather_plan_statistics */
e.last_name emp_last_name,
e.first_name emp_first_name,
e.employee_id,
d.department_id,
e.manager_id,
d.department_name,
es.last_name mgr_last_name,
es.first_name mgr_first_name
FROM hr.employees e
LEFT OUTER JOIN hr.departments d
ON e.department_id = d.department_id
LEFT OUTER JOIN hr.employees es
ON es.employee_id = e.manager_id) emp
CONNECT BY PRIOR emp.employee_id = emp.manager_id
START WITH emp.manager_id IS NULL
ORDER SIBLINGS BY emp.emp_last_name;
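The inline view emp joins the employees and departments tables and feeds a single data set to the select ... connect by statement. The prior operator matches the employee_id of the current row against the manager_id of another row. Repeating this match builds the recursive query.
The start with clause directs the query to begin with the row whose manager_id is NULL. The level pseudocolumn holds the recursion depth, which gives a simple way to indent the output so that the organizational hierarchy is visible at a glance.
RSF example
--basic recursive subquery factoring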
WITH emp AS
(SELECT /*+ inline gather_plan_statistics */
e.last_name, e.first_name, e.employee_id, e.manager_id, d.department_name
FROM hr.employees e
LEFT OUTER JOIN hr.departments d
ON e.department_id = d.department_id),
emp_recurse(last_name,
first_name,
employee_id,
manager_id,
department_name,
lvl) AS
(SELECT e.last_name AS last_name,
e.first_name AS first_name,
e.employee_id AS employee_id,
e.manager_id AS manager_id,
e.department_name AS department_name,
1 AS lvl
FROM emp e
WHERE e.manager_id IS NULL
UNION ALL
SELECT emp.last_name AS last_name,
emp.first_name AS first_name,
emp.employee_id AS employee_id,
emp.manager_id AS manager_id,
emp.department_name AS department_name,
empr.lvl + 1 AS lvl
FROM emp
JOIN emp_recurse empr
ON empr.employee_id = emp.manager_id)
search DEPTH FIRST BY last_name SET order1
SELECT LPAD(' ', lvl * 2 - 1, ' ') || er.last_name last_name,
er.first_name,
er.department_name
FROM emp_recurse er
ORDER BY order1;
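A recursive with clause needs two query blocks: the anchor member and the recursive member. The two blocks must be combined with the union all set operator. The anchor member is the query before union all, and the recursive member is the query after it. The recursive member must reference the subquery being defined (emp_recurse in the example), which is what makes the query recursive.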
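The two-block structure can be seen in isolation with a minimal sketch that does not depend on the HR schema. The names nums and n are illustrative, and the limit of 5 is an arbitrary choice for the example.

```sql
-- Anchor member: produces the starting row.
-- Recursive member: references nums itself and stops when n reaches 5.
WITH nums(n) AS (
  SELECT 1 FROM dual
  UNION ALL
  SELECT n + 1 FROM nums WHERE n < 5
)
SELECT n FROM nums;
```

Without the WHERE condition in the recursive member the query would have no stopping point, so a termination condition of some kind is needed whenever the data itself does not end the recursion.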
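RSF restrictions
RSF is considerably more flexible than connect by, but it has some restrictions. The recursive member cannot contain any of the following:
- the distinct keyword or a group by clause
- the model clause
- aggregate functions (analytic functions are allowed in the select list)
- subqueries that refer to query_name
- outer joins that refer to query_name as the right table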
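One way to work within the restrictions above is to keep the recursive member free of aggregation and aggregate in the main query instead. The sketch below assumes the HR sample schema used elsewhere in this section; the alias employees_at_level is illustrative.

```sql
-- The recursive member only walks the hierarchy;
-- GROUP BY and COUNT are applied afterwards, in the main query.
WITH emp_recurse(employee_id, lvl) AS (
  SELECT e.employee_id, 1 AS lvl
  FROM hr.employees e
  WHERE e.manager_id IS NULL
  UNION ALL
  SELECT e.employee_id, er.lvl + 1 AS lvl
  FROM hr.employees e
  JOIN emp_recurse er ON er.employee_id = e.manager_id
)
SELECT lvl, COUNT(*) AS employees_at_level
FROM emp_recurse
GROUP BY lvl
ORDER BY lvl;
```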
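Differences from connect by
Unlike connect by, an RSF query must declare the columns it returns in the query definition, for example emp_recurse(last_name,first_name,employee_id,manager_id,department_name,lvl).
search depth first
The default search is breadth first, which is usually not the output wanted from a hierarchical query. A breadth first search returns the sibling rows at each level before it returns any child rows. Specifying search depth first returns the rows in hierarchical order. The set order1 part of the search clause sets the order1 pseudocolumn to the sequence in which the rows are returned.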
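For contrast with the depth first listings in this section, here is a sketch of the same kind of hierarchy walk ordered breadth first. It assumes the HR sample schema; the column list and the name order1 follow the conventions used elsewhere in this section.

```sql
-- Breadth first: all rows of one level are numbered before any row of the next level.
WITH emp_recurse(employee_id, manager_id, last_name, lvl) AS (
  SELECT e.employee_id, e.manager_id, e.last_name, 1 AS lvl
  FROM hr.employees e
  WHERE e.manager_id IS NULL
  UNION ALL
  SELECT e1.employee_id, e1.manager_id, e1.last_name, e2.lvl + 1 AS lvl
  FROM hr.employees e1
  JOIN emp_recurse e2 ON e2.employee_id = e1.manager_id
)
SEARCH BREADTH FIRST BY last_name SET order1
SELECT lvl, last_name, order1
FROM emp_recurse
ORDER BY order1;
```

Ordering the main query by order1 is what makes the search order visible in the result; in this output lvl never decreases from one row to the next.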
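Type | Name | Purpose |
---|---|---|
Function | sys_connect_by_path | Returns all ancestors of the current row |
Operator | connect_by_root | Returns the value from the root row |
Operator | prior | Marks a hierarchical query; not needed in a recursive subquery |
Pseudocolumn | connect_by_iscycle | Detects cycles in the hierarchy |
Parameter | nocycle | Parameter of connect by, used together with connect_by_iscycle |
Pseudocolumn | connect_by_isleaf | Identifies leaf rows |
Pseudocolumn | level | Indicates the depth within the hierarchy |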
level pseudocolumn
--level pseudocolumn
SELECT LPAD(' ', LEVEL * 2 - 1, ' ') || e.last_name last_name, LEVEL
FROM hr.employees e
CONNECT BY PRIOR e.employee_id = e.manager_id
START WITH e.manager_id IS NULL
ORDER SIBLINGS BY e.last_name;
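The level pseudocolumn is often used in hierarchical queries to indent the output so that the hierarchy is easy to see. RSF has no level pseudocolumn, but a counter column that is incremented in the recursive member gives the same result.
--creating the lvl column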
WITH emp_recurse(employee_id,manager_id,last_name,lvl) AS (
SELECT e.employee_id,NULL,e.last_name,1 AS lvl
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl
FROM hr.employees e1
JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id
)
search DEPTH FIRST BY last_name SET last_time_order
SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.lvl
FROM emp_recurse r
ORDER BY last_time_order;
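sys_connect_by_path function
This function returns the values that make up the hierarchy down to the current row. The following example uses sys_connect_by_path to build a colon-separated path from the root to each node.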
--sys_connect_by_path
SELECT LPAD(' ', 2 * (LEVEL - 1)) || e.last_name AS last_name,
sys_connect_by_path(last_name, ':') path
FROM hr.employees e
START WITH e.manager_id IS NULL
CONNECT BY PRIOR e.employee_id = e.manager_id
ORDER SIBLINGS BY e.last_name;
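Although sys_connect_by_path cannot be used in an RSF query, you can replicate it in nearly the same way as the level pseudocolumn was reproduced. Instead of incrementing a counter, the recursive member appends to a string value.
--building your own sys_connect_by_path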
WITH emp_recurse(employee_id,manager_id,last_name,lvl,PATH) AS (
SELECT e.employee_id,NULL,e.last_name,1 AS lvl,':'||to_char(e.last_name) AS path
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl,e2.path||':'||to_char(e1.last_name) AS path
FROM hr.employees e1
JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id
)
search DEPTH FIRST BY last_name SET last_time_order
SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.path
FROM emp_recurse r
ORDER BY last_time_order;
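If you need the hierarchy displayed as a comma-separated list, sys_connect_by_path on its own cannot do it, because its output always begins with the separator character (a colon in the example above). With RSF you control the first element of the path yourself.
--RSF comma-separated path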
WITH emp_recurse(employee_id,manager_id,last_name,lvl,PATH) AS (
SELECT e.employee_id,NULL,e.last_name,1 AS lvl,to_char(e.last_name) AS path
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl,e2.path||','||to_char(e1.last_name) AS path
FROM hr.employees e1
JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id
)
search DEPTH FIRST BY last_name SET last_time_order
SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.path
FROM emp_recurse r
ORDER BY last_time_order;
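If you would rather stay with connect by, one possible workaround is to trim the leading separator after the fact. This sketch assumes the HR sample schema and relies on no last name containing a comma, since sys_connect_by_path rejects a separator that appears in the column values.

```sql
-- LTRIM strips the leading comma that sys_connect_by_path always emits.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || e.last_name AS last_name,
       LTRIM(sys_connect_by_path(e.last_name, ','), ',') AS path
FROM hr.employees e
START WITH e.manager_id IS NULL
CONNECT BY PRIOR e.employee_id = e.manager_id
ORDER SIBLINGS BY e.last_name;
```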
connect_by_root operator
This operator extends the connect by syntax so that a query can return the root node of the current row.
--connect_by_root
UPDATE hr.employees SET manager_id=NULL WHERE last_name='Kochhar';
SELECT /*+ inline gather_plan_statistics */
LEVEL,LPAD(' ',2*(LEVEL-1))||last_name last_name,first_name,
connect_by_root last_name AS root,
sys_connect_by_path(last_name,':') PATH
FROM hr.employees
WHERE connect_by_root last_name='Kochhar'
CONNECT BY PRIOR employee_id=manager_id
START WITH manager_id IS NULL;
--replicating the connect_by_root operator
WITH emp_recurse(employee_id,manager_id,last_name,lvl,path) AS (
SELECT /*+ gather_plan_statistics */
e.employee_id,NULL AS manager_id,
e.last_name,1 AS lvl,
':'||e.last_name||':' AS path
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT
e.employee_id,e.manager_id,
e.last_name,er.lvl+1 AS lvl,
er.path||e.last_name||':' AS path
FROM hr.employees e
JOIN emp_recurse er ON er.employee_id=e.manager_id
JOIN hr.employees e2 ON e2.employee_id=e.manager_id
)
search DEPTH FIRST BY last_name SET order1,
emps AS (
SELECT lvl,last_name,path,SUBSTR(path,2,INSTR(path,':',2)-2) root
FROM emp_recurse
)
SELECT lvl,LPAD(' ',2*(lvl-1))|| last_name last_name,
root,path FROM emps
WHERE root='Kochhar';
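connect_by_iscycle pseudocolumn and nocycle parameter
The connect_by_iscycle pseudocolumn makes it easy to detect cycles in a hierarchy.
Here an error is introduced deliberately by making Smith the manager of King, which causes the connect by query to fail.
--cycle error in connect by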
SELECT * FROM hr.employees WHERE employee_id IN (100,171);
--make Smith the manager of King
UPDATE hr.employees SET manager_id=171 WHERE employee_id=100;
SELECT LPAD(' ',2*(LEVEL-1))|| last_name last_name,
first_name,employee_id,LEVEL
FROM hr.employees
START WITH employee_id=100
CONNECT BY PRIOR employee_id=manager_id;
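nocycle and connect_by_iscycle can be used to detect cycles in a hierarchy. The nocycle parameter prevents the ORA-01436 error from being raised, so that all rows can be output. The connect_by_iscycle pseudocolumn makes it easy to find the row that causes the error.
--detecting the cycle with connect_by_iscycle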
SELECT LPAD(' ',2*(LEVEL-1))|| last_name last_name,
first_name,employee_id,LEVEL,
connect_by_iscycle
FROM hr.employees
START WITH employee_id=100
CONNECT BY NOCYCLE PRIOR employee_id=manager_id;
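A connect_by_iscycle value of 1 indicates that the row for Smith caused the error. Querying the data for Smith next, everything looks normal. Finally, searching for the employees that Smith manages (by his employee ID) reveals the problem: King appears under Smith, but the company president should have no manager. The fix is therefore to set the manager_id of that row back to NULL.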
SELECT e.last_name, e.first_name, e.employee_id, e.manager_id
FROM hr.employees e
WHERE e.employee_id = 171
OR e.manager_id = 171;
--detecting the cycle in a recursive query
WITH emp(employee_id,manager_id,last_name,first_name,lvl) AS (
SELECT e.employee_id,NULL AS manager_id,e.last_name,e.first_name,1 AS lvl
FROM hr.employees e
WHERE e.employee_id=100
UNION ALL
SELECT e.employee_id,e.manager_id,e.last_name,e.first_name,emp.lvl+1 AS lvl
FROM hr.employees e
JOIN emp ON emp.employee_id=e.manager_id
)
search DEPTH FIRST BY last_name SET order1
CYCLE employee_id SET is_cycle TO '1' DEFAULT '0'
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,first_name,employee_id,lvl,is_cycle
FROM emp ORDER BY order1;
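Note: the cycle clause lets you set the is_cycle column to values such as 0 or 1. Only single-character values are allowed, and the column name is also user defined. Looking at the output, you can see that the cycle clause in RSF does a better job of pointing out the row that causes the cycle. The offending row is clearly marked as the row for King, so you can query that row and quickly determine what is wrong.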
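Once the row has been identified, inspecting and repairing it might look like the following sketch. It assumes the test update shown earlier (manager_id set to 171 for employee 100) is still in place.

```sql
-- Inspect the row flagged by the cycle clause (King, employee_id 100).
SELECT employee_id, last_name, manager_id
FROM hr.employees
WHERE employee_id = 100;

-- Restore the hierarchy: the president has no manager.
UPDATE hr.employees
SET manager_id = NULL
WHERE employee_id = 100;
```

Whether to commit afterwards depends on how the earlier test updates were handled; if they were never committed, a rollback undoes them as well.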
connect_by_isleaf pseudocolumn
connect_by_isleaf identifies the leaf nodes in hierarchical data.
--connect_by_isleaf pseudocolumn
SELECT LPAD(' ',2*(LEVEL-1))|| e.last_name last_name,connect_by_isleaf
FROM hr.employees e
START WITH e.manager_id IS NULL
CONNECT BY PRIOR e.employee_id=e.manager_id
ORDER SIBLINGS BY e.last_name;
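Replicating this in RSF is somewhat harder. You need to identify the leaf nodes in the employee hierarchy. By definition a leaf node is not a manager, so every row that is not a manager is a leaf node.
--finding leaf nodes in a recursive query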
WITH leaves AS (
SELECT e.employee_id FROM hr.employees e
WHERE e.employee_id NOT IN (
SELECT manager_id FROM hr.employees WHERE manager_id IS NOT NULL
)
),
emp(manager_id,employee_id,last_name,lvl,isleaf) AS (
SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl,0 AS isleaf
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl, DECODE(l.employee_id,NULL,0,1) AS isleaf
FROM hr.employees e
JOIN emp ON emp.employee_id=e.manager_id
LEFT OUTER JOIN leaves l ON l.employee_id=e.employee_id
)
search DEPTH FIRST BY last_name SET order1
SELECT LPAD(' ',2*(lvl-1))||last_name last_name, isleaf
FROM emp
ORDER BY order1;
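The leaves subquery finds the leaf nodes, and its result is then left outer joined to the employees table. The value of the leaves.employee_id column indicates whether the current row is a leaf.
Another approach uses the analytic function lead() on the lvl column to determine whether a row is a leaf node: a row is a leaf when the next row in depth first order is not at a deeper level. The lead() function depends on the value of the last_name_order column set in the search clause.
--finding leaf nodes with lead()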
WITH emp(manager_id,employee_id,last_name,lvl) AS (
SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl
FROM hr.employees e
JOIN emp ON emp.employee_id=e.manager_id
)
search DEPTH FIRST BY last_name SET last_name_order
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,lvl,
LEAD(lvl) OVER(ORDER BY last_name_order) leadlvlorder,
CASE
WHEN (lvl-LEAD(lvl) OVER (ORDER BY last_name_order))<0
THEN
0
ELSE 1
END isleaf
FROM emp;
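Because this approach depends on the order of the data, it is somewhat fragile. If the search clause is changed from depth first to breadth first, the output can be incorrect, as the following run shows:
--lead() with breadth first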
WITH emp(manager_id,employee_id,last_name,lvl) AS (
SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl
FROM hr.employees e
WHERE e.manager_id IS NULL
UNION ALL
SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl
FROM hr.employees e
JOIN emp ON emp.employee_id=e.manager_id
)
search breadth FIRST BY last_name SET last_name_order
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,lvl,
LEAD(lvl) OVER(ORDER BY last_name_order) leadlvlorder,
CASE
WHEN (lvl-LEAD(lvl) OVER (ORDER BY last_name_order))<0
THEN
0
ELSE 1
END isleaf
FROM emp;
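Although in most practical cases you can replicate connect by functionality with recursive subquery factoring, the connect by syntax is often simpler, and doing the same thing in RSF usually takes more SQL. connect by can also produce a better execution plan than RSF, especially for relatively simple queries.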