Oracle Study Notes 10: Subquery Factoring

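One of the most useful features of the WITH clause is that it tames complex SQL queries. When a query involves many tables and columns, it becomes hard to follow the flow of data through it. With subquery factoring, the more complex parts can be moved out of the main query one subquery at a time, which makes the query easier to understand.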
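
As a minimal sketch of the idea, assuming the Oracle sample SH schema used throughout these notes: the aggregation gets a name in the WITH clause, and the main query then reads like a simple join. The subquery name channel_totals and the alias total_sold are made up for this illustration.

```sql
-- Factored subquery: total sales amount per channel
WITH channel_totals AS (
  SELECT sa.channel_id,
         SUM(sa.amount_sold) AS total_sold
  FROM sh.sales sa
  GROUP BY sa.channel_id
)
-- Main query: only has to attach the channel description
SELECT ch.channel_desc,
       ct.total_sold
FROM channel_totals ct
JOIN sh.channels ch ON ch.channel_id = ct.channel_id
ORDER BY ct.total_sold DESC;
```

The aggregation logic can now be read and changed on its own, without touching the join in the main query.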

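The following query uses the PIVOT operator to produce a crosstab report. The innermost query builds a set of aggregates over key columns of the sales table, and the next query out merely supplies the columns that appear in the PIVOT operator, which produces the final sales values for each product by channel and quarter.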

-- Crosstab query without subquery factoring
SELECT *
FROM (
 SELECT /*+ gather_plan_statistics */
 product,channel,quarter,country,quantity_sold 
 FROM (
  SELECT pr.prod_name product,co.country_name country,sa.channel_id channel, Substr(t.calendar_quarter_desc,6,2) quarter,
  SUM(sa.amount_sold) amount_sold,
  SUM(sa.quantity_sold) quantity_sold
  FROM sh.sales sa
  JOIN sh.times t ON t.time_id=sa.time_id
  JOIN sh.customers cu ON cu.cust_id=sa.cust_id
  JOIN sh.countries co ON co.country_id=cu.country_id
  JOIN sh.products pr ON pr.prod_id=sa.prod_id
  GROUP BY 
  pr.prod_name,co.country_name,sa.channel_id,Substr(t.calendar_quarter_desc,6,2)
 ) 
) PIVOT (
  SUM(quantity_sold) FOR (channel,quarter) IN
  (
   (5,'02') AS catalog_q2,
   (4,'01') AS internet_q1,
   (4,'04') AS internet_q4,
   (2,'02') AS partners_q2,
   (9,'03') AS tele_q3
  )
)
ORDER BY product,country;

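The WITH clause can break this query into bite-sized chunks that are easy to understand. The rewrite defines three factored subqueries named sales_countries, top_sales, and sales_rpt. sales_countries maps each customer to the country where the sale took place, top_sales gathers the sales data, and sales_rpt aggregates that data.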

-- Crosstab query with subquery factoring
WITH sales_countries AS (
 SELECT /*+ gather_plan_statistics */
 cu.cust_id,co.country_name
 FROM sh.countries co, sh.customers cu
 WHERE cu.country_id=co.country_id
),
top_sales AS
(
 SELECT p.prod_name,sc.country_name,sa.channel_id,
 t.calendar_quarter_desc,sa.amount_sold,sa.quantity_sold
 FROM sh.sales sa
 JOIN sh.times t ON t.time_id=sa.time_id
 JOIN sh.customers c ON c.cust_id = sa.cust_id
 JOIN sales_countries sc ON sc.cust_id = c.cust_id
 JOIN sh.products p ON p.prod_id = sa.prod_id
),
sales_rpt AS
(
 SELECT ts.prod_name product,
 ts.country_name country,
 ts.channel_id channel,
 SUBSTR(ts.calendar_quarter_desc,6,2) quarter,
 SUM(amount_sold) amount_sold,
 SUM(quantity_sold) quantity_sold
 FROM top_sales ts
 GROUP BY ts.prod_name,
 ts.country_name,
 ts.channel_id,
 SUBSTR(ts.calendar_quarter_desc,6,2)
)
SELECT * FROM 
(
 SELECT product, channel,quarter,country,quantity_sold
 FROM sales_rpt
) PIVOT (
 SUM(quantity_sold)
 FOR (channel,quarter) IN
 (
     (5,'02') AS catalog_q2,
     (4,'01') AS internet_q1,
     (4,'04') AS internet_q4,
     (2,'02') AS partners_q2,
     (9,'03') AS tele_q3
 )
)
ORDER BY product,country;

Defining PL/SQL Functions with WITH

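Oracle 12c introduced a feature that lets the WITH clause declare and define PL/SQL functions and procedures. Once defined, the PL/SQL function can be referenced in the query that declares the clause.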

WITH 
  FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
  IS
  BEGIN
         RETURN p_markup*p_price;
    END;
  SELECT t.prod_name,
  t.prod_list_price cur_price,
  calc_markup(.05,t.prod_list_price) mup5,
  ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price 
  FROM sh.products t;
/
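-- When the WITH function is declared in a nested subquery and the top-level
-- statement has no WITH_PLSQL hint, the statement fails with
-- ORA-32034: unsupported use of WITH clause
SELECT prod_name,cur_price,mup5,new_price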
FROM (
  WITH 
    FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
    IS
    BEGIN
           RETURN p_markup*p_price;
      END;
      SELECT t.prod_name,
      t.prod_list_price cur_price,
      calc_markup(.05,t.prod_list_price) mup5,
      ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price  
      FROM sh.products t
) WHERE cur_price<1000
AND new_price>1000;
-- Adding the WITH_PLSQL hint to the top-level SELECT lets the nested declaration compile
SELECT /*+ WITH_PLSQL */ prod_name,cur_price,mup5,new_price
FROM (
  WITH 
    FUNCTION calc_markup(p_markup NUMBER,p_price NUMBER) RETURN NUMBER
    IS
    BEGIN
           RETURN p_markup*p_price;
      END;
      SELECT t.prod_name,
      t.prod_list_price cur_price,
      calc_markup(.05,t.prod_list_price) mup5,
      ROUND(t.prod_list_price+calc_markup(0.10,t.prod_list_price),2) new_price  
      FROM sh.products t
) WHERE cur_price<1000
AND new_price>1000;
/

The statement must be run with a slash (/), just like executing an anonymous PL/SQL block.
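
The listings above only declare functions. As a hedged sketch of the procedure case, the statement below declares a procedure and a function in the same WITH clause; the query calls the function, and the function delegates to the procedure. The names add_markup and calc_new_price are hypothetical, and the statement assumes Oracle 12c or later with the SH sample schema.

```sql
-- add_markup: hypothetical procedure that returns price plus markup through an OUT parameter
-- calc_new_price: hypothetical function callable from SQL; it delegates to add_markup
WITH
  PROCEDURE add_markup(p_markup NUMBER, p_price NUMBER, p_result OUT NUMBER)
  IS
  BEGIN
    p_result := p_price + p_markup * p_price;
  END;
  FUNCTION calc_new_price(p_markup NUMBER, p_price NUMBER) RETURN NUMBER
  IS
    l_result NUMBER;
  BEGIN
    add_markup(p_markup, p_price, l_result);
    RETURN ROUND(l_result, 2);
  END;
SELECT t.prod_name,
       t.prod_list_price cur_price,
       calc_new_price(0.10, t.prod_list_price) new_price
FROM sh.products t
/
```

A procedure cannot be called from the SELECT list directly, which is why the function acts as the entry point here.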

SQL Optimization

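When a SQL query is designed or modified to take advantage of subquery factoring, the optimizer may treat a factored subquery as a temporary table when it builds the execution plan for the query.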

sqlplus scott/scott@orcl

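Command-line login syntax: sqlplus username/password@servername as sysdba
set autotrace on: shows the query results together with the plan and statistics
set autotrace traceonly: shows the plan and statistics without the query results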
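
The queries in these notes carry the gather_plan_statistics hint, which makes Oracle collect actual row counts for each plan step. As a sketch of an alternative to autotrace (assuming autotrace is off and the session has the privileges on the V$ views that DBMS_XPLAN.DISPLAY_CURSOR reads), those statistics can be displayed for the last statement run in the session. The COUNT(*) query is only a stand-in for the statement under test.

```sql
-- SQL*Plus: with serveroutput on, the "last statement" would be a DBMS_OUTPUT call
SET SERVEROUTPUT OFF

SELECT /*+ gather_plan_statistics */
       COUNT(*)
FROM sh.customers t
JOIN sh.countries a ON a.country_id = t.country_id;

-- NULL, NULL = last cursor executed in this session
-- 'ALLSTATS LAST' = estimated and actual rows for the most recent execution
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```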

SQL> set autotrace traceonly
SQL> --WITH and MATERIALIZE
SQL> WITH cust AS (
  2       SELECT /*+ materialize gather_plan_statistics */
  3       t.cust_income_level,
  4       a.country_name
  5       FROM sh.customers t
  6       JOIN sh.countries a ON a.country_id=t.country_id
  7  )
  8  SELECT  c.country_name,cust_income_level,COUNT(c.country_name) country_cust_count
  9  FROM cust c
 10  HAVING COUNT(country_name) >
 11  (
 12         SELECT COUNT(*) * .01 FROM cust c2
 13  )
 14  OR COUNT(cust_income_level) >
 15  (
 16         SELECT MEDIAN(income_income_count)
 17         FROM (
 18              SELECT cust_income_level,COUNT(*)* .25 income_income_count
 19              FROM cust
 20              GROUP BY cust_income_level
 21         )
 22  )
 23  GROUP BY country_name,cust_income_level
 24  ORDER BY 1,2;

35 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3455850065

--------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |    20 |   620 |   499   (2)| 00:00:06 |
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   2 |   LOAD AS SELECT           |                           |       |       |            |          |
|*  3 |    HASH JOIN               |                           | 55500 |  2222K|   410   (1)| 00:00:05 |
|   4 |     TABLE ACCESS FULL      | COUNTRIES                 |    23 |   345 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL      | CUSTOMERS                 | 55500 |  1409K|   406   (1)| 00:00:05 |
|*  6 |   FILTER                   |                           |       |       |            |          |
|   7 |    SORT GROUP BY           |                           |    20 |   620 |    89   (5)| 00:00:02 |
|   8 |     VIEW                   |                           | 55500 |  1680K|    86   (2)| 00:00:02 |
|   9 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6607_61C9C4 | 55500 |  1680K|    86   (2)| 00:00:02 |
|  10 |    SORT AGGREGATE          |                           |     1 |       |            |          |
|  11 |     VIEW                   |                           | 55500 |       |    86   (2)| 00:00:02 |
|  12 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6607_61C9C4 | 55500 |  1680K|    86   (2)| 00:00:02 |
|  13 |    SORT GROUP BY           |                           |     1 |    13 |            |          |
|  14 |     VIEW                   |                           |    12 |   156 |    89   (5)| 00:00:02 |
|  15 |      SORT GROUP BY         |                           |    12 |   252 |    89   (5)| 00:00:02 |
|  16 |       VIEW                 |                           | 55500 |  1138K|    86   (2)| 00:00:02 |
|  17 |        TABLE ACCESS FULL   | SYS_TEMP_0FD9D6607_61C9C4 | 55500 |  1680K|    86   (2)| 00:00:02 |
--------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
   6 - filter(COUNT("COUNTRY_NAME")> (SELECT COUNT(*)*.01 FROM  (SELECT /*+ CACHE_TEMP_TABLE
              ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_61C9C4"
              "T1") "C2") OR COUNT("CUST_INCOME_LEVEL")> (SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP (
              ORDER BY "INCOME_INCOME_COUNT") FROM  (SELECT "CUST_INCOME_LEVEL"
              "CUST_INCOME_LEVEL",COUNT(*)*.25 "INCOME_INCOME_COUNT" FROM  (SELECT /*+ CACHE_TEMP_TABLE
              ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_61C9C4"
              "T1") "CUST" GROUP BY "CUST_INCOME_LEVEL") "from$_subquery$_006"))


Statistics
----------------------------------------------------------
          4  recursive calls
        314  db block gets
       2382  consistent gets
        303  physical reads
        600  redo size
       1916  bytes sent via SQL*Net to client
        438  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         35  rows processed

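The need to test in order to get the best query performance can be illustrated with a report requested by management. The report must show the distribution of customers by country and income level, and only show the countries and income levels that make up 1% or more of all customers. If the number of customers in an income level bracket is equal to or greater than 25% of the total number of customers in that bracket, that country and income level must also be included in the report. The factored subquery cust is retained from the previous query; what is new is the subqueries in the HAVING clause, which enforce the rules specified for the report.
When this SQL statement runs, everything works as expected. Checking the execution plan then shows that the join of the customers and countries tables went through a temp table transformation (TEMP TABLE TRANSFORMATION), and the rest of the query reads from that temporary table (SYS_TEMP_0FD9D6607_61C9C4 in the plan above). If you doubt whether the chosen execution plan is reasonable, you can test it with the MATERIALIZE and INLINE hints.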

SQL> --WITH and INLINE
SQL> WITH cust AS (
  2       SELECT /*+ inline gather_plan_statistics */
  3       t.cust_income_level,
  4       a.country_name
  5       FROM sh.customers t
  6       JOIN sh.countries a ON a.country_id=t.country_id
  7  )
  8  SELECT  c.country_name,cust_income_level,COUNT(c.country_name) country_cust_count
  9  FROM cust c
 10  HAVING COUNT(country_name) >
 11  (
 12         SELECT COUNT(*) * .01 FROM cust c2
 13  )
 14  OR COUNT(cust_income_level) >
 15  (
 16         SELECT MEDIAN(income_income_count)
 17         FROM (
 18              SELECT cust_income_level,COUNT(*)* .25 income_income_count
 19              FROM cust
 20              GROUP BY cust_income_level
 21         )
 22  )
 23  GROUP BY country_name,cust_income_level
 24  ORDER BY 1,2;

35 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1412345716

---------------------------------------------------------------------------------------
| Id  | Operation              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |              |    20 |   820 |   413   (2)| 00:00:05 |
|*  1 |  FILTER                |              |       |       |            |          |
|   2 |   SORT GROUP BY        |              |    20 |   820 |   413   (2)| 00:00:05 |
|*  3 |    HASH JOIN           |              | 55500 |  2222K|   410   (1)| 00:00:05 |
|   4 |     TABLE ACCESS FULL  | COUNTRIES    |    23 |   345 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL  | CUSTOMERS    | 55500 |  1409K|   406   (1)| 00:00:05 |
|   6 |   SORT AGGREGATE       |              |     1 |    10 |            |          |
|*  7 |    HASH JOIN           |              | 55500 |   541K|   408   (1)| 00:00:05 |
|   8 |     INDEX FULL SCAN    | COUNTRIES_PK |    23 |   115 |     1   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL  | CUSTOMERS    | 55500 |   270K|   406   (1)| 00:00:05 |
|  10 |   SORT GROUP BY        |              |     1 |    13 |            |          |
|  11 |    VIEW                |              |    12 |   156 |   411   (2)| 00:00:05 |
|  12 |     SORT GROUP BY      |              |    12 |   372 |   411   (2)| 00:00:05 |
|* 13 |      HASH JOIN         |              | 55500 |  1680K|   408   (1)| 00:00:05 |
|  14 |       INDEX FULL SCAN  | COUNTRIES_PK |    23 |   115 |     1   (0)| 00:00:01 |
|  15 |       TABLE ACCESS FULL| CUSTOMERS    | 55500 |  1409K|   406   (1)| 00:00:05 |
---------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(COUNT(*)> (SELECT COUNT(*)*.01 FROM "SH"."COUNTRIES"
              "A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID") OR
              COUNT("T"."CUST_INCOME_LEVEL")> (SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP
              ( ORDER BY "INCOME_INCOME_COUNT") FROM  (SELECT "T"."CUST_INCOME_LEVEL"
              "CUST_INCOME_LEVEL",COUNT(*)*.25 "INCOME_INCOME_COUNT" FROM "SH"."COUNTRIES"
              "A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY
              "T"."CUST_INCOME_LEVEL") "from$_subquery$_006"))
   3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
   7 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
  13 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       4382  consistent gets
          1  physical reads
          0  redo size
       1916  bytes sent via SQL*Net to client
        438  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         35  rows processed

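The execution plan shows three full scans of the CUSTOMERS table (Ids 5, 9, and 15) and one full scan of COUNTRIES (Id 4). Two of the executions of the cust subquery need only the information in the COUNTRIES_PK index, so they perform a full scan of the index instead of the table (Ids 8 and 14), which saves a small amount of time and resources.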

Flush the shared pool:
alter system flush shared_pool;
Flush the buffer cache:
alter system flush buffer_cache;

Testing the Effects of Query Changes

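Earlier, the report required that the number of customers at a given income level in any country be equal to or greater than 25% of all customers at that income level. Now suppose the requirement changes: an income level must also be included in the report if its customer count is greater than the median of the customer counts across all income levels.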
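
To see what the median threshold in the revised rule looks like on its own, the sketch below lists the customer count for each income level next to the median of those counts. It assumes the SH sample schema; the name income_counts is made up for this illustration, and the join to sh.countries is left out on the assumption that every customer row has a matching country.

```sql
-- Customer count per income level, and the median of those counts
WITH income_counts AS (
  SELECT cust_income_level,
         COUNT(*) AS income_level_count
  FROM sh.customers
  GROUP BY cust_income_level
)
SELECT ic.cust_income_level,
       ic.income_level_count,
       (SELECT MEDIAN(income_level_count) FROM income_counts) AS median_count
FROM income_counts ic
ORDER BY ic.income_level_count DESC;
```

Income levels whose income_level_count exceeds median_count are the ones the revised rule adds to the report.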

Modified query using the INLINE hint

SQL> WITH cust AS
  2   (SELECT /*+ inline gather_plan_statistics */    --income level and country of each customer
  3     t.cust_income_level, a.country_name
  4      FROM sh.customers t
  5      JOIN sh.countries a
  6        ON a.country_id = t.country_id
  7    ),
  8  median_income_set AS
  9   (SELECT /*+ inline */
 10     cust_income_level, COUNT(*) income_level_count --income levels whose count exceeds the median count
 11      FROM cust
 12     GROUP BY cust_income_level
 13    HAVING COUNT(cust_income_level) > (SELECT MEDIAN(income_level_count) income_level_count
 14                                        FROM (SELECT cust_income_level,
 15                                                     COUNT(*) income_level_count
 16                                                FROM cust
 17                                               GROUP BY cust_income_level)))
 18  SELECT country_name,
 19         cust_income_level,
 20         COUNT(country_name) country_cust_count
 21    FROM cust c
 22  HAVING COUNT (country_name) > (SELECT COUNT(*) * .01 FROM cust c2) OR cust_income_level IN (SELECT mis.cust_income_level
 23                                                                                                FROM median_income_set mis)
 24   GROUP BY country_name, cust_income_level;

123 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1635819209

----------------------------------------------------------------------------------------
| Id  | Operation               | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |              |    20 |   820 |   413   (2)| 00:00:05 |
|*  1 |  FILTER                 |              |       |       |            |          |
|   2 |   HASH GROUP BY         |              |    20 |   820 |   413   (2)| 00:00:05 |
|*  3 |    HASH JOIN            |              | 55500 |  2222K|   410   (1)| 00:00:05 |
|   4 |     TABLE ACCESS FULL   | COUNTRIES    |    23 |   345 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL   | CUSTOMERS    | 55500 |  1409K|   406   (1)| 00:00:05 |
|   6 |   SORT AGGREGATE        |              |     1 |    10 |            |          |
|*  7 |    HASH JOIN            |              | 55500 |   541K|   408   (1)| 00:00:05 |
|   8 |     INDEX FULL SCAN     | COUNTRIES_PK |    23 |   115 |     1   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL   | CUSTOMERS    | 55500 |   270K|   406   (1)| 00:00:05 |
|* 10 |   FILTER                |              |       |       |            |          |
|  11 |    HASH GROUP BY        |              |     1 |    31 |   411   (2)| 00:00:05 |
|* 12 |     HASH JOIN           |              | 55500 |  1680K|   408   (1)| 00:00:05 |
|  13 |      INDEX FULL SCAN    | COUNTRIES_PK |    23 |   115 |     1   (0)| 00:00:01 |
|  14 |      TABLE ACCESS FULL  | CUSTOMERS    | 55500 |  1409K|   406   (1)| 00:00:05 |
|  15 |    SORT GROUP BY        |              |     1 |    13 |            |          |
|  16 |     VIEW                |              |    12 |   156 |   411   (2)| 00:00:05 |
|  17 |      SORT GROUP BY      |              |    12 |   372 |   411   (2)| 00:00:05 |
|* 18 |       HASH JOIN         |              | 55500 |  1680K|   408   (1)| 00:00:05 |
|  19 |        INDEX FULL SCAN  | COUNTRIES_PK |    23 |   115 |     1   (0)| 00:00:01 |
|  20 |        TABLE ACCESS FULL| CUSTOMERS    | 55500 |  1409K|   406   (1)| 00:00:05 |
----------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(COUNT(*)> (SELECT COUNT(*)*.01 FROM "SH"."COUNTRIES"
              "A","SH"."CUSTOMERS" "T" WHERE "A"."COUNTRY_ID"="T"."COUNTRY_ID") OR  EXISTS
              (SELECT 0 FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
              "A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL" HAVING
              "T"."CUST_INCOME_LEVEL"=:B1 AND COUNT("T"."CUST_INCOME_LEVEL")> (SELECT
              PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM
              (SELECT "T"."CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*)
              "INCOME_LEVEL_COUNT" FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
              "A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL")
              "from$_subquery$_005")))
   3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
   7 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
  10 - filter("T"."CUST_INCOME_LEVEL"=:B1 AND COUNT("T"."CUST_INCOME_LEVEL")>
              (SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT")
              FROM  (SELECT "T"."CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*)
              "INCOME_LEVEL_COUNT" FROM "SH"."COUNTRIES" "A","SH"."CUSTOMERS" "T" WHERE
              "A"."COUNTRY_ID"="T"."COUNTRY_ID" GROUP BY "T"."CUST_INCOME_LEVEL")
              "from$_subquery$_005"))
  12 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
  18 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      23362  consistent gets
          0  physical reads
          0  redo size
       5460  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
         10  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
        123  rows processed

SQL>

Modified query using the MATERIALIZE hint

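The inline version of the modified query adds one more full table scan and one more index scan. Below is the performance output for the same query when the temp table transformation is allowed: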

SQL> WITH cust AS
  2   (SELECT /*+ materialize gather_plan_statistics */    --income level and country of each customer
  3     t.cust_income_level, a.country_name
  4      FROM sh.customers t
  5      JOIN sh.countries a
  6        ON a.country_id = t.country_id
  7    ),
  8  median_income_set AS
  9   (SELECT /*+ inline */
 10     cust_income_level, COUNT(*) income_level_count --income levels whose count exceeds the median count
 11      FROM cust
 12     GROUP BY cust_income_level
 13    HAVING COUNT(cust_income_level) > (SELECT MEDIAN(income_level_count) income_level_count
 14                                        FROM (SELECT cust_income_level,
 15                                                     COUNT(*) income_level_count
 16                                                FROM cust
 17                                               GROUP BY cust_income_level)))
 18  SELECT country_name,
 19         cust_income_level,
 20         COUNT(country_name) country_cust_count
 21    FROM cust c
 22  HAVING COUNT (country_name) > (SELECT COUNT(*) * .01 FROM cust c2) OR cust_income_level IN (SELECT mis.cust_income_level
 23                                                                                                FROM median_income_set mis)
 24   GROUP BY country_name, cust_income_level;

123 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2452612

--------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |    20 |   620 |   499   (2)| 00:00:06 |
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   2 |   LOAD AS SELECT           |                           |       |       |            |          |
|*  3 |    HASH JOIN               |                           | 55500 |  2222K|   410   (1)| 00:00:05 |
|   4 |     TABLE ACCESS FULL      | COUNTRIES                 |    23 |   345 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL      | CUSTOMERS                 | 55500 |  1409K|   406   (1)| 00:00:05 |
|*  6 |   FILTER                   |                           |       |       |            |          |
|   7 |    HASH GROUP BY           |                           |    20 |   620 |    89   (5)| 00:00:02 |
|   8 |     VIEW                   |                           | 55500 |  1680K|    86   (2)| 00:00:02 |
|   9 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6607_644975 | 55500 |  1680K|    86   (2)| 00:00:02 |
|  10 |    SORT AGGREGATE          |                           |     1 |       |            |          |
|  11 |     VIEW                   |                           | 55500 |       |    86   (2)| 00:00:02 |
|  12 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6607_644975 | 55500 |  1680K|    86   (2)| 00:00:02 |
|* 13 |    FILTER                  |                           |       |       |            |          |
|  14 |     HASH GROUP BY          |                           |     1 |    21 |    89   (5)| 00:00:02 |
|  15 |      VIEW                  |                           | 55500 |  1138K|    86   (2)| 00:00:02 |
|  16 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6607_644975 | 55500 |  1680K|    86   (2)| 00:00:02 |
|  17 |     SORT GROUP BY          |                           |     1 |    13 |            |          |
|  18 |      VIEW                  |                           |    12 |   156 |    89   (5)| 00:00:02 |
|  19 |       SORT GROUP BY        |                           |    12 |   252 |    89   (5)| 00:00:02 |
|  20 |        VIEW                |                           | 55500 |  1138K|    86   (2)| 00:00:02 |
|  21 |         TABLE ACCESS FULL  | SYS_TEMP_0FD9D6607_644975 | 55500 |  1680K|    86   (2)| 00:00:02 |
--------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."COUNTRY_ID"="T"."COUNTRY_ID")
   6 - filter(COUNT("COUNTRY_NAME")> (SELECT COUNT(*)*.01 FROM  (SELECT /*+ CACHE_TEMP_TABLE
              ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_644975"
              "T1") "C2") OR  EXISTS (SELECT 0 FROM  (SELECT /*+ CACHE_TEMP_TABLE ("T1") */ "C0"
              "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM "SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST"
              GROUP BY "CUST_INCOME_LEVEL" HAVING "CUST_INCOME_LEVEL"=:B1 AND COUNT("CUST_INCOME_LEVEL")>
              (SELECT PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM  (SELECT
              "CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*) "INCOME_LEVEL_COUNT" FROM  (SELECT /*+
              CACHE_TEMP_TABLE ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM
              "SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST" GROUP BY "CUST_INCOME_LEVEL")
              "from$_subquery$_005")))
  13 - filter("CUST_INCOME_LEVEL"=:B1 AND COUNT("CUST_INCOME_LEVEL")> (SELECT
              PERCENTILE_CONT(0.500000) WITHIN GROUP ( ORDER BY "INCOME_LEVEL_COUNT") FROM  (SELECT
              "CUST_INCOME_LEVEL" "CUST_INCOME_LEVEL",COUNT(*) "INCOME_LEVEL_COUNT" FROM  (SELECT /*+
              CACHE_TEMP_TABLE ("T1") */ "C0" "CUST_INCOME_LEVEL","C1" "COUNTRY_NAME" FROM
              "SYS"."SYS_TEMP_0FD9D6607_644975" "T1") "CUST" GROUP BY "CUST_INCOME_LEVEL")
              "from$_subquery$_005"))


Statistics
----------------------------------------------------------
        138  recursive calls
        317  db block gets
       6379  consistent gets
        303  physical reads
       1520  redo size
       5460  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
         10  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
        123  rows processed

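Because the modified version of the query adds scans, the cost in logical I/O becomes more pronounced (23362 consistent gets with INLINE versus 6379 with MATERIALIZE). For this query it is clearly more efficient to let Oracle perform the temp table transformation, writing the result of the hash join to a temporary table on disk and then reusing it several times in the query.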
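
The four autotrace runs can be put side by side with a little arithmetic. The sketch below copies the db block gets and consistent gets figures from the statistics sections above and adds them, since logical reads are the sum of the two. The names autotrace_stats and run_label are made up for this illustration.

```sql
-- Figures copied from the four "Statistics" sections in these notes
WITH autotrace_stats AS (
  SELECT 'original query, MATERIALIZE' AS run_label,
         314 AS db_block_gets, 2382 AS consistent_gets FROM dual
  UNION ALL
  SELECT 'original query, INLINE', 0, 4382 FROM dual
  UNION ALL
  SELECT 'modified query, INLINE', 0, 23362 FROM dual
  UNION ALL
  SELECT 'modified query, MATERIALIZE', 317, 6379 FROM dual
)
SELECT run_label,
       db_block_gets + consistent_gets AS logical_reads  -- 2696, 4382, 6696, 23362
FROM autotrace_stats
ORDER BY logical_reads;
```

Materializing cust wins in both versions, but the gap grows from 2696 versus 4382 logical reads in the original query to 6696 versus 23362 in the modified one, where the subquery is referenced more often.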

Looking for Other Optimization Opportunities

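The following query calculates costs by sales channel, finding the average, minimum, and maximum cost of each product for the year 2000. The query is hard to read and hard to modify, and it is also somewhat inefficient.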

SQL>  --Old SQL for calculating costs
SQL> SELECT /*+ gather_plan_statistics */
  2  SUBSTR(prod_name,1,30) prod_name,
  3  channel_desc,
  4  (
  5   SELECT AVG(c2.unit_cost) AS avg_cost FROM sh.costs c2
  6   WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
  7   AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
  8   AND to_date('12/31/2000','mm/dd/yyyy')
  9  ),
 10  (
 11   SELECT MIN(c2.unit_cost) AS min_cost FROM sh.costs c2
 12   WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
 13   AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
 14   AND to_date('12/31/2000','mm/dd/yyyy')
 15  ),
 16  (
 17   SELECT MAX(c2.unit_cost) AS max_cost FROM sh.costs c2
 18   WHERE c2.prod_id=c.prod_id AND c2.channel_id=c.channel_id
 19   AND c2.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
 20   AND to_date('12/31/2000','mm/dd/yyyy')
 21  )
 22  FROM (
 23   SELECT DISTINCT pr.prod_id,pr.prod_name,ch.channel_id,ch.channel_desc
 24   FROM  sh.channels ch,sh.products pr,sh.costs co
 25   WHERE ch.channel_id=co.channel_id
 26   AND co.prod_id=pr.prod_id
 27   AND co.time_id BETWEEN to_date('01/01/2000','mm/dd/yyyy')
 28   AND to_date('12/31/2000','mm/dd/yyyy')
 29  ) c
 30  ORDER BY prod_name,channel_desc;

216 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1877279774

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                | 20640 |  1310K|       |   640   (1)| 00:00:08 |       |       |
|   1 |  SORT AGGREGATE                     |                |     1 |    20 |       |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR          |                |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID| COSTS          |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|   4 |     BITMAP CONVERSION TO ROWIDS     |                |       |       |       |            |          |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE      | COSTS_PROD_BIX |       |       |       |            |          |    13 |    16 |
|   6 |  SORT AGGREGATE                     |                |     1 |    20 |       |            |          |       |       |
|   7 |   PARTITION RANGE ITERATOR          |                |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|*  8 |    TABLE ACCESS BY LOCAL INDEX ROWID| COSTS          |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|   9 |     BITMAP CONVERSION TO ROWIDS     |                |       |       |       |            |          |       |       |
|* 10 |      BITMAP INDEX SINGLE VALUE      | COSTS_PROD_BIX |       |       |       |            |          |    13 |    16 |
|  11 |  SORT AGGREGATE                     |                |     1 |    20 |       |            |          |       |       |
|  12 |   PARTITION RANGE ITERATOR          |                |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|* 13 |    TABLE ACCESS BY LOCAL INDEX ROWID| COSTS          |    96 |  1920 |       |    17   (0)| 00:00:01 |    13 |    16 |
|  14 |     BITMAP CONVERSION TO ROWIDS     |                |       |       |       |            |          |       |       |
|* 15 |      BITMAP INDEX SINGLE VALUE      | COSTS_PROD_BIX |       |       |       |            |          |    13 |    16 |
|  16 |  SORT ORDER BY                      |                | 20640 |  1310K|  1632K|   640   (1)| 00:00:08 |       |       |
|  17 |   VIEW                              |                | 20640 |  1310K|       |   316   (2)| 00:00:04 |       |       |
|  18 |    HASH UNIQUE                      |                | 20640 |  1169K|  1384K|   316   (2)| 00:00:04 |       |       |
|* 19 |     HASH JOIN                       |                | 20640 |  1169K|       |    25   (8)| 00:00:01 |       |       |
|  20 |      TABLE ACCESS FULL              | PRODUCTS       |    72 |  2160 |       |     3   (0)| 00:00:01 |       |       |
|* 21 |      HASH JOIN                      |                | 20640 |   564K|       |    21   (5)| 00:00:01 |       |       |
|  22 |       TABLE ACCESS FULL             | CHANNELS       |     5 |    65 |       |     3   (0)| 00:00:01 |       |       |
|  23 |       PARTITION RANGE ITERATOR      |                | 20640 |   302K|       |    17   (0)| 00:00:01 |    13 |    16 |
|* 24 |        TABLE ACCESS FULL            | COSTS          | 20640 |   302K|       |    17   (0)| 00:00:01 |    13 |    16 |
------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   5 - access("C2"."PROD_ID"=:B1)
   8 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - access("C2"."PROD_ID"=:B1)
  13 - filter("C2"."CHANNEL_ID"=:B1 AND "C2"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  15 - access("C2"."PROD_ID"=:B1)
  19 - access("CO"."PROD_ID"="PR"."PROD_ID")
  21 - access("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
  24 - filter("CO"."TIME_ID"<=TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      29642  consistent gets
          0  physical reads
          0  redo size
      14092  bytes sent via SQL*Net to client
        570  bytes received via SQL*Net from client
         16  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        216  rows processed

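The refactoring starts by moving the begin_date and end_date columns into a separate subquery, bookends, so that the values need to be set in only one place. The product data goes into the prodmaster subquery. Although this SQL would work just as well as ordinary subqueries, moving it into factored subqueries greatly improves the readability of the statement as a whole.
The average, maximum, and minimum cost calculations are replaced by a single subquery called cost_compare. Finally, the SQL that joins the prodmaster and cost_compare subqueries is added.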

SQL> --Old SQL refactored with the WITH clause
SQL> WITH bookends AS
  2   (SELECT to_date('01/01/2000', 'mm/dd/yyyy') begin_date,
  3           to_date('12/31/2000', 'mm/dd/yyyy') end_date
  4      FROM dual),
  5  prodmaster AS
  6   (SELECT DISTINCT pr.prod_id, pr.prod_name, ch.channel_id, ch.channel_desc
  7      FROM sh.channels ch, sh.products pr, sh.costs co
  8     WHERE ch.channel_id = co.channel_id
  9       AND co.prod_id = pr.prod_id
 10       AND co.time_id BETWEEN (SELECT begin_date FROM bookends) AND
 11           (SELECT end_date FROM bookends)),
 12  cost_compare AS
 13   (SELECT c2.prod_id,
 14           c2.channel_id,
 15           AVG(c2.unit_cost) avg_cost,
 16           MIN(c2.unit_cost) min_cost,
 17           MAX(c2.unit_cost) max_cost
 18      FROM sh.costs c2
 19     WHERE c2.time_id BETWEEN (SELECT begin_date FROM bookends) AND
 20           (SELECT end_date FROM bookends)
 21     GROUP BY c2.prod_id, c2.channel_id)
 22  SELECT /*+ gather_plan_statistics */
 23   SUBSTR(pm.prod_name, 1, 30) prod_name,
 24   pm.channel_desc,
 25   cc.avg_cost,
 26   cc.min_cost,
 27   cc.max_cost
 28    FROM prodmaster pm
 29    JOIN cost_compare cc
 30      ON cc.prod_id = pm.prod_id
 31     AND cc.channel_id = pm.channel_id
 32   ORDER BY pm.prod_id, pm.channel_id;

216 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2361085328

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                |   138 | 12696 |    83   (5)| 00:00:01 |       |       |
|   1 |  MERGE JOIN                               |                |   138 | 12696 |    83   (5)| 00:00:01 |       |       |
|   2 |   SORT JOIN                               |                |   205 |  9430 |    44   (5)| 00:00:01 |       |       |
|   3 |    VIEW                                   |                |   205 |  9430 |    44   (5)| 00:00:01 |       |       |
|   4 |     HASH UNIQUE                           |                |   205 | 11890 |    44   (5)| 00:00:01 |       |       |
|*  5 |      HASH JOIN                            |                |   205 | 11890 |    39   (3)| 00:00:01 |       |       |
|   6 |       TABLE ACCESS FULL                   | PRODUCTS       |    72 |  2160 |     3   (0)| 00:00:01 |       |       |
|   7 |       MERGE JOIN                          |                |   205 |  5740 |    36   (3)| 00:00:01 |       |       |
|   8 |        TABLE ACCESS BY INDEX ROWID        | CHANNELS       |     5 |    65 |     2   (0)| 00:00:01 |       |       |
|   9 |         INDEX FULL SCAN                   | CHANNELS_PK    |     5 |       |     1   (0)| 00:00:01 |       |       |
|* 10 |        SORT JOIN                          |                |   205 |  3075 |    34   (3)| 00:00:01 |       |       |
|  11 |         PARTITION RANGE ITERATOR          |                |   205 |  3075 |    33   (0)| 00:00:01 |   KEY |   KEY |
|  12 |          TABLE ACCESS BY LOCAL INDEX ROWID| COSTS          |   205 |  3075 |    33   (0)| 00:00:01 |   KEY |   KEY |
|  13 |           BITMAP CONVERSION TO ROWIDS     |                |       |       |            |          |       |       |
|* 14 |            BITMAP INDEX RANGE SCAN        | COSTS_TIME_BIX |       |       |            |          |   KEY |   KEY |
|  15 |             FAST DUAL                     |                |     1 |       |     2   (0)| 00:00:01 |       |       |
|  16 |             FAST DUAL                     |                |     1 |       |     2   (0)| 00:00:01 |       |       |
|* 17 |   SORT JOIN                               |                |   145 |  6670 |    39   (6)| 00:00:01 |       |       |
|  18 |    VIEW                                   |                |   145 |  6670 |    38   (3)| 00:00:01 |       |       |
|  19 |     HASH GROUP BY                         |                |   145 |  2900 |    38   (3)| 00:00:01 |       |       |
|  20 |      PARTITION RANGE ITERATOR             |                |   205 |  4100 |    33   (0)| 00:00:01 |   KEY |   KEY |
|  21 |       TABLE ACCESS BY LOCAL INDEX ROWID   | COSTS          |   205 |  4100 |    33   (0)| 00:00:01 |   KEY |   KEY |
|  22 |        BITMAP CONVERSION TO ROWIDS        |                |       |       |            |          |       |       |
|* 23 |         BITMAP INDEX RANGE SCAN           | COSTS_TIME_BIX |       |       |            |          |   KEY |   KEY |
|  24 |          FAST DUAL                        |                |     1 |       |     2   (0)| 00:00:01 |       |       |
|  25 |          FAST DUAL                        |                |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CO"."PROD_ID"="PR"."PROD_ID")
  10 - access("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
       filter("CH"."CHANNEL_ID"="CO"."CHANNEL_ID")
  14 - access("CO"."TIME_ID">= (SELECT TO_DATE(' 2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL" "DUAL") AND
              "CO"."TIME_ID"<= (SELECT TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL" "DUAL"))
  17 - access("CC"."PROD_ID"="PM"."PROD_ID" AND "CC"."CHANNEL_ID"="PM"."CHANNEL_ID")
       filter("CC"."CHANNEL_ID"="PM"."CHANNEL_ID" AND "CC"."PROD_ID"="PM"."PROD_ID")
  23 - access("C2"."TIME_ID">= (SELECT TO_DATE(' 2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL" "DUAL") AND
              "C2"."TIME_ID"<= (SELECT TO_DATE(' 2000-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') FROM "SYS"."DUAL" "DUAL"))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       7436  consistent gets
          0  physical reads
          0  redo size
      13596  bytes sent via SQL*Net to client
        570  bytes received via SQL*Net from client
         16  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
        216  rows processed

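Applying subquery factoring to PL/SQL

Example:

  • List only the customers who purchased products in at least 3 different years
  • Total each customer's purchases, grouped by product category
    Fetching the required data with conventional PL/SQL:
    Query all customers who meet the criteria and save their IDs in a temporary table.
    Then loop over the newly saved customer IDs, find each customer's sales, add them up, and insert the results into another temporary table. Finally, join those results to the customers and products tables to produce the report.
-- generate the customer report with PL/SQL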
BEGIN
  EXECUTE IMMEDIATE 'drop table cust3year';
  EXECUTE IMMEDIATE 'drop table sales3year';
  EXCEPTION 
    WHEN OTHERS THEN
      NULL;
END;
/
create global temporary table cust3year(cust_id number);
CREATE GLOBAL TEMPORARY TABLE sales3year(
       cust_id NUMBER,
       prod_category VARCHAR2(50),
       total_sale NUMBER
)
/
BEGIN
  EXECUTE IMMEDIATE 'truncate table cust3year';
  EXECUTE IMMEDIATE 'truncate table sales3year';
  INSERT INTO cust3year
  SELECT cust_id--,count(cust_years) year_count
  FROM (
       SELECT DISTINCT cust_id,TRUNC(time_id,'YEAR') cust_years
       FROM sh.sales
  ) 
  GROUP BY cust_id
  HAVING COUNT(cust_years)>=3;
  --SELECT * FROM cust3year;
  FOR crec IN (SELECT cust_id FROM  cust3year)
    LOOP
      INSERT INTO sales3year 
      SELECT sa.cust_id,p.prod_category,SUM(co.unit_cost*sa.quantity_sold)
      FROM sh.sales sa
      JOIN sh.products p ON p.prod_id=sa.prod_id
      JOIN sh.costs co ON co.prod_id=sa.prod_id AND co.time_id=sa.time_id
      JOIN sh.customers cu ON cu.cust_id=sa.cust_id
      WHERE crec.cust_id=sa.cust_id
      GROUP BY sa.cust_id,p.prod_category; 
    END LOOP;
END;
/    
SELECT c3.cust_id,c.cust_last_name,c.cust_first_name,s3.prod_category,s3.total_sale FROM sales3year s3
JOIN cust3year c3 ON s3.cust_id=c3.cust_id
JOIN sh.customers c ON c.cust_id=s3.cust_id
ORDER BY 1,4;

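The block above is perfectly serviceable PL/SQL, but subquery factoring can improve it. First move the customer ID part into a with clause, then use the result of that subquery to produce the sales, product, and customer information the report needs.

-- generate the customer report with the WITH clause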
WITH cust3year AS(
 SELECT cust_id
  FROM (
       SELECT DISTINCT cust_id,TRUNC(time_id,'YEAR') cust_years
       FROM sh.sales
  ) 
  GROUP BY cust_id
  HAVING COUNT(cust_years)>=3
),
sales3year AS (
  SELECT sa.cust_id,p.prod_category,SUM(co.unit_cost*sa.quantity_sold) AS total_sale
        FROM sh.sales sa
        JOIN sh.products p ON p.prod_id=sa.prod_id
        JOIN sh.costs co ON co.prod_id=sa.prod_id AND co.time_id=sa.time_id
        JOIN sh.customers cu ON cu.cust_id=sa.cust_id
        WHERE sa.cust_id IN (SELECT cust_id FROM cust3year)
        GROUP BY sa.cust_id,p.prod_category
)
SELECT c3.cust_id,c.cust_last_name,c.cust_first_name,s3.prod_category,s3.total_sale FROM sales3year s3
JOIN cust3year c3 ON s3.cust_id=c3.cust_id
JOIN sh.customers c ON c.cust_id=s3.cust_id
ORDER BY 1,4;

WITH custyear AS
 (SELECT sa.cust_id, EXTRACT(YEAR FROM time_id) sales_year
    FROM sh.sales sa
   WHERE EXTRACT(YEAR FROM time_id) BETWEEN 1998 AND 2002
   GROUP BY sa.cust_id, EXTRACT(YEAR FROM time_id)),
cust3year AS
 (SELECT DISTINCT c3.cust_id
    FROM (SELECT cust_id, COUNT(*) OVER(PARTITION BY cust_id) year_count
            FROM custyear) c3
   WHERE c3.year_count >= 3)
SELECT c.cust_id,
       c.cust_last_name,
       c.cust_first_name,
       p.prod_category,
       SUM(co.unit_price * sa.quantity_sold) AS total_sale
  FROM cust3year c3
  JOIN sh.sales sa
    ON sa.cust_id = c3.cust_id
  JOIN sh.products p
    ON p.prod_id = sa.prod_id
  JOIN sh.costs co
    ON co.prod_id = sa.prod_id
   AND co.time_id = sa.time_id
  JOIN sh.customers c
    ON c.cust_id = c3.cust_id
 GROUP BY c.cust_id, c.cust_last_name, c.cust_first_name, p.prod_category
 ORDER BY c.cust_id;

The extract() function pulls the year out of the date as an integer value, which makes the years easier to compare.
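The "at least 3 different years" filter is easy to try without an Oracle instance. The following is a minimal sketch rather than the author's code: it assumes Python's built-in sqlite3 module as a stand-in database, a made-up sales table whose dates are stored as text, and substr() in place of Oracle's extract(). The function name customers_with_three_years and the sample rows are hypothetical.

```python
import sqlite3

def customers_with_three_years(rows):
    """Return ids of customers who bought in at least 3 distinct years.

    rows: iterable of (cust_id, 'YYYY-MM-DD') tuples (made-up data).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (cust_id INTEGER, time_id TEXT)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    sql = """
        WITH custyear AS (          -- one row per customer per year
            SELECT DISTINCT cust_id, substr(time_id, 1, 4) AS sales_year
            FROM sales
        ),
        cust3year AS (              -- keep customers seen in 3 or more years
            SELECT cust_id
            FROM custyear
            GROUP BY cust_id
            HAVING COUNT(*) >= 3
        )
        SELECT cust_id FROM cust3year ORDER BY cust_id
    """
    result = [r[0] for r in con.execute(sql)]
    con.close()
    return result

# Hypothetical sample: customer 2 buys three times but only in two years.
sample_sales = [
    (1, "1998-03-01"), (1, "1999-07-15"), (1, "2000-01-20"),
    (2, "1998-05-05"), (2, "1998-11-30"), (2, "1999-02-14"),
    (3, "1999-01-01"), (3, "2000-06-06"), (3, "2001-09-09"), (3, "2001-10-10"),
]
print(customers_with_three_years(sample_sales))  # [1, 3]
```

Each named subquery does one job, so the final select stays a one-liner; that is the same readability gain the Oracle version gets from its with clause.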

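Subquery factoring can be used to organize some queries better, and in some cases it can even serve as a performance-tuning tool. Learning to use it adds a new tool to your Oracle toolbox.

Recursive subqueries

Recursive subquery factoring (RSF)

-- basic connect by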
SELECT LPAD(' ', LEVEL * 2 - 1, ' ') || emp.emp_last_name emp_last_name,
       emp.emp_first_name,
       emp.employee_id,
       emp.mgr_last_name,
       emp.mgr_first_name,
       emp.manager_id,
       emp.department_name
  FROM (SELECT /*+ inline gather_plan_statistics */
         e.last_name       emp_last_name,
         e.first_name      emp_first_name,
         e.employee_id,
         d.department_id,
         e.manager_id,
         d.department_name,
         es.last_name      mgr_last_name,
         es.first_name     mgr_first_name
          FROM hr.employees e
          LEFT OUTER JOIN hr.departments d
            ON e.department_id = d.department_id
          LEFT OUTER JOIN hr.employees es
            ON es.employee_id = e.manager_id) emp
CONNECT BY PRIOR emp.employee_id = emp.manager_id
 START WITH emp.manager_id IS NULL
 ORDER SIBLINGS BY emp.emp_last_name;

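The inline view emp joins the employees and departments tables and feeds a single data set to the select ... connect by statement. The prior operator matches the employee_id of the current row against the manager_id column of another row; doing this repeatedly builds a recursive query.
The start with clause tells the query to begin with the row whose manager_id is null. The level pseudocolumn holds the depth of the recursion, which gives a simple way to indent the output so that the organizational hierarchy is visible at a glance.

RSF example

-- basic recursive subquery factoring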
WITH emp AS
 (SELECT /*+ inline gather_plan_statistics */
   e.last_name, e.first_name, e.employee_id, e.manager_id, d.department_name
    FROM hr.employees e
    LEFT OUTER JOIN hr.departments d
      ON e.department_id = d.department_id),
emp_recurse(last_name,
first_name,
employee_id,
manager_id,
department_name,
lvl) AS
 (SELECT e.last_name AS last_name,
         e.first_name AS first_name,
         e.employee_id AS employee_id,
         e.manager_id AS manager_id,
         e.department_name AS department_name,
         1 AS lvl
    FROM emp e
   WHERE e.manager_id IS NULL
  UNION ALL
  SELECT emp.last_name AS last_name,
         emp.first_name AS first_name,
         emp.employee_id AS employee_id,
         emp.manager_id AS manager_id,
         emp.department_name AS department_name,
         empr.lvl + 1 AS lvl
    FROM emp
    JOIN emp_recurse empr
      ON empr.employee_id = emp.manager_id)
search DEPTH FIRST BY last_name SET order1      
SELECT LPAD(' ', lvl * 2 - 1, ' ') || er.last_name last_name,
       er.first_name,
       er.department_name
  FROM emp_recurse er
 ORDER BY order1;

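A recursive with clause needs two query blocks: the anchor member and the recursive member. The two blocks must be combined with the set operator union all. The anchor member is the query before union all, and the recursive member is the query after it. The recursive member must reference the subquery being defined, and that self-reference is what makes the query recursive.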
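The anchor/recursive split can be tried in isolation with a small stand-in database. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of Oracle, SQLite spells the clause WITH RECURSIVE where Oracle writes plain WITH, and the employees rows and the function name employee_levels are hypothetical.

```python
import sqlite3

# (employee_id, manager_id, last_name): hypothetical rows, not the real hr data
EMPLOYEES = [
    (100, None, "King"),
    (101, 100, "Kochhar"),
    (102, 100, "De Haan"),
    (108, 101, "Greenberg"),
    (103, 102, "Hunold"),
]

def employee_levels(rows):
    """Return (last_name, lvl) pairs computed by a recursive WITH clause."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER, last_name TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    sql = """
        WITH RECURSIVE emp_recurse(employee_id, last_name, lvl) AS (
            -- anchor member: the root row(s), evaluated once
            SELECT employee_id, last_name, 1
            FROM employees
            WHERE manager_id IS NULL
            UNION ALL
            -- recursive member: joins back to emp_recurse itself
            SELECT e.employee_id, e.last_name, er.lvl + 1
            FROM employees e
            JOIN emp_recurse er ON er.employee_id = e.manager_id
        )
        SELECT last_name, lvl FROM emp_recurse ORDER BY lvl, last_name
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

for name, lvl in employee_levels(EMPLOYEES):
    print(" " * (2 * (lvl - 1)) + name, lvl)
```

The recursion ends on its own once the recursive member returns no new rows, which happens when the last level has no direct reports.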

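RSF restrictions

RSF is far more flexible than connect by, but it does come with some restrictions. The recursive member cannot contain any of the following:

  • The distinct keyword or a group by clause
  • The model clause
  • Aggregate functions (analytic functions are allowed in the select list)
  • Subqueries that refer to query_name
  • Outer joins that refer to query_name as the right table

Differences from connect by

Unlike connect by, the columns returned by an RSF query must be declared in the query definition, for example emp_recurse(last_name,first_name,employee_id,manager_id,department_name,lvl).
The search clause also matters. The default search is breadth first, which is usually not the output you want from a hierarchical query: a breadth first search returns all the sibling rows at each level before it returns any child rows. Specifying search depth first returns the rows in hierarchical order. The set order1 part of the search clause sets the order1 pseudocolumn to the order in which the rows are returned.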
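The difference between the two search orders is easier to see on a tiny tree. The sketch below is plain Python and only illustrates the general idea of the two orders, not how Oracle implements the search clause; the REPORTS dictionary and both function names are made up for this example, and siblings are ordered by name to mirror "by last_name".

```python
from collections import deque

# Hypothetical hierarchy: manager -> direct reports (not the real hr data).
REPORTS = {
    "King": ["Kochhar", "De Haan"],
    "Kochhar": ["Greenberg"],
    "De Haan": ["Hunold"],
    "Greenberg": [],
    "Hunold": [],
}

def depth_first(root):
    """Each row is followed immediately by its own subtree."""
    order = [root]
    for child in sorted(REPORTS[root]):  # siblings ordered by name
        order.extend(depth_first(child))
    return order

def breadth_first(root):
    """All rows of one level come out before any row of the next level."""
    order, queue = [], deque([root])
    while queue:
        name = queue.popleft()
        order.append(name)
        queue.extend(sorted(REPORTS[name]))
    return order

print(depth_first("King"))    # ['King', 'De Haan', 'Hunold', 'Kochhar', 'Greenberg']
print(breadth_first("King"))  # ['King', 'De Haan', 'Kochhar', 'Hunold', 'Greenberg']
```

In the depth first output Hunold sits directly under De Haan, so indenting by level reads as an organization chart. In the breadth first output the two level-3 rows trail at the end, detached from their managers.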

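Type          Name                 Purpose
Function      sys_connect_by_path  Returns all ancestors of the current row
Operator      connect_by_root      Returns the value of the root row
Operator      prior                Marks the query as hierarchical; not needed in a recursive subquery
Pseudocolumn  connect_by_iscycle   Detects a cycle in the hierarchy
Parameter     nocycle              Parameter of connect by; used together with connect_by_iscycle
Pseudocolumn  connect_by_isleaf    Identifies leaf rows
Pseudocolumn  level                Indicates the depth in the hierarchy

The level pseudocolumn

-- the level pseudocolumn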
SELECT LPAD(' ', LEVEL * 2 - 1, ' ') || e.last_name last_name, LEVEL
  FROM hr.employees e
CONNECT BY PRIOR e.employee_id = e.manager_id
 START WITH e.manager_id IS NULL
 ORDER SIBLINGS BY e.last_name;

The level pseudocolumn is often used in hierarchical queries to indent the output so that the hierarchy is easy to read.

-- creating the lvl column
 WITH emp_recurse(employee_id,manager_id,last_name,lvl) AS (
      SELECT e.employee_id,NULL,e.last_name,1 AS lvl
      FROM hr.employees e
      WHERE e.manager_id IS NULL
      UNION ALL
      SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl
      FROM hr.employees e1
      JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id  
 )
 search DEPTH FIRST BY last_name SET last_time_order
 SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.lvl
 FROM emp_recurse r
 ORDER BY last_time_order;

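The sys_connect_by_path function

This function returns the values that make up the hierarchy down to the current row. The following example uses sys_connect_by_path to build a colon-delimited path from the root to each node.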

 --sys_connect_by_path
 SELECT LPAD(' ', 2 * (LEVEL - 1)) || e.last_name AS last_name,
        sys_connect_by_path(last_name, ':') path
   FROM hr.employees e
  START WITH e.manager_id IS NULL
 CONNECT BY PRIOR e.employee_id = e.manager_id
  ORDER SIBLINGS BY e.last_name;

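Although sys_connect_by_path cannot be used in an RSF query, you can replicate its functionality in almost the same way the level pseudocolumn was reproduced. Instead of incrementing a counter, you append to a string value.

-- build your own sys_connect_by_path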
 WITH emp_recurse(employee_id,manager_id,last_name,lvl,PATH) AS (
      SELECT e.employee_id,NULL,e.last_name,1 AS lvl,':'||to_char(e.last_name) AS path
      FROM hr.employees e
      WHERE e.manager_id IS NULL
      UNION ALL
      SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl,e2.path||':'||to_char(e1.last_name) AS path
      FROM hr.employees e1
      JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id  
)
 search DEPTH FIRST BY last_name SET last_time_order
 SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.path
 FROM emp_recurse r
 ORDER BY last_time_order;

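If you need the hierarchy displayed as a comma-separated list, sys_connect_by_path falls short: its output always begins with the delimiter (a colon in the example above). An RSF query does not have that problem, because the anchor member can start the path without a delimiter.

 -- comma-delimited path with RSF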
 WITH emp_recurse(employee_id,manager_id,last_name,lvl,PATH) AS (
      SELECT e.employee_id,NULL,e.last_name,1 AS lvl,to_char(e.last_name) AS path
      FROM hr.employees e
      WHERE e.manager_id IS NULL
      UNION ALL
      SELECT e1.employee_id,e1.manager_id,e1.last_name,e2.lvl+1 AS lvl,e2.path||','||to_char(e1.last_name) AS path
      FROM hr.employees e1
      JOIN emp_recurse e2 ON e2.employee_id=e1.manager_id  
)
 search DEPTH FIRST BY last_name SET last_time_order
 SELECT LPAD(' ', r.lvl * 2 - 1, ' ') || r.last_name last_name,r.path
 FROM emp_recurse r
 ORDER BY last_time_order;
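The string-appending technique can be checked end to end on a small data set. This is a sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle; the employees rows and the function name employee_paths are hypothetical, and the separator is passed as a bind parameter so that both the colon and the comma variants can be tried.

```python
import sqlite3

# (employee_id, manager_id, last_name): hypothetical rows, not the real hr data
EMPLOYEES = [
    (100, None, "King"),
    (101, 100, "Kochhar"),
    (102, 100, "De Haan"),
    (108, 101, "Greenberg"),
    (103, 102, "Hunold"),
]

def employee_paths(rows, sep=","):
    """Return the root-to-row path of every employee, joined by sep."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER, last_name TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    sql = """
        WITH RECURSIVE emp_recurse(employee_id, last_name, path) AS (
            -- anchor member: the path starts with the root name, no leading separator
            SELECT employee_id, last_name, last_name
            FROM employees
            WHERE manager_id IS NULL
            UNION ALL
            -- recursive member: append separator and name to the parent's path
            SELECT e.employee_id, e.last_name, er.path || ? || e.last_name
            FROM employees e
            JOIN emp_recurse er ON er.employee_id = e.manager_id
        )
        SELECT path FROM emp_recurse ORDER BY path
    """
    result = [r[0] for r in con.execute(sql, (sep,))]
    con.close()
    return result

for path in employee_paths(EMPLOYEES):
    print(path)
```

The order by path at the end is there only to make the output deterministic for this sample; it is not a replacement for the search clause.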

The connect_by_root operator

This operator enhances the connect by syntax by returning the root node of the current row.

 --connect_by_root
 UPDATE hr.employees SET manager_id=NULL WHERE last_name='Kochhar';
 SELECT /*+ inline gather_plan_statistics */
 LEVEL,LPAD(' ',2*(LEVEL-1))||last_name last_name,first_name,
 connect_by_root last_name AS root,
 sys_connect_by_path(last_name,':') PATH
 FROM hr.employees
 WHERE connect_by_root last_name='Kochhar'
 CONNECT BY PRIOR employee_id=manager_id
 START WITH manager_id IS NULL;
 -- replicating the connect_by_root operator
 WITH emp_recurse(employee_id,manager_id,last_name,lvl,path) AS (
   SELECT /*+ gather_plan_statistics */
   e.employee_id,NULL AS manager_id,
   e.last_name,1 AS lvl,
   ':'||e.last_name||':' AS path
   FROM hr.employees e
   WHERE e.manager_id IS NULL
   UNION ALL
   SELECT
   e.employee_id,e.manager_id,
   e.last_name,er.lvl+1 AS lvl,
   er.path||e.last_name||':' AS path
   FROM hr.employees e
   JOIN emp_recurse er ON er.employee_id=e.manager_id
   JOIN hr.employees e2 ON e2.employee_id=e.manager_id
 )
 search DEPTH FIRST BY last_name SET order1,
 emps AS (
   SELECT lvl,last_name,path,SUBSTR(path,2,INSTR(path,':',2)-2) root
   FROM emp_recurse
 )
 SELECT lvl,LPAD(' ',2*(lvl-1))|| last_name last_name,
 root,path FROM emps
 WHERE root='Kochhar';
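The expression SUBSTR(path,2,INSTR(path,':',2)-2) in the emps subquery is compact, so it helps to walk through the arithmetic. The sketch below mirrors it in plain Python; the function name root_of is hypothetical, and it assumes a path of the form ':King:Kochhar:' as built by the query above (leading and trailing colon).

```python
def root_of(path, sep=":"):
    """Mimic SUBSTR(path, 2, INSTR(path, ':', 2) - 2) for a path like ':King:Kochhar:'."""
    # INSTR(path, ':', 2): 1-based position of the first separator at or after position 2.
    second_sep = path.index(sep, 1) + 1
    # SUBSTR(path, 2, n): n characters starting at 1-based position 2.
    length = second_sep - 2
    return path[1:1 + length]

print(root_of(":King:Kochhar:Greenberg:"))  # King
print(root_of(":Kochhar:"))                 # Kochhar
```

For ':King:Kochhar:Greenberg:' the second colon is at position 6, so the substring starts at position 2 and is 4 characters long, which is exactly the root name.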

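The connect_by_iscycle pseudocolumn and the nocycle parameter

The connect_by_iscycle pseudocolumn makes it easy to detect a loop in a hierarchy.
Here an error is introduced on purpose by making Smith the manager of King, which causes the connect by query to fail with an error.

 -- loop error in connect by
 SELECT * FROM hr.employees WHERE employee_id IN (100,171);
 -- make Smith the manager of King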
 UPDATE hr.employees SET manager_id=171 WHERE employee_id=100;
 SELECT LPAD(' ',2*(LEVEL-1))|| last_name last_name,
 first_name,employee_id,LEVEL
 FROM hr.employees 
 START WITH employee_id=100
 CONNECT BY PRIOR employee_id=manager_id;

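nocycle and connect_by_iscycle can be used to detect a loop in the hierarchy. The nocycle parameter prevents the ORA-1436 error from being raised, so that all rows can be output. The connect_by_iscycle pseudocolumn makes it easy to find the row that causes the error.

 -- detecting the loop with connect_by_iscycle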
 SELECT LPAD(' ',2*(LEVEL-1))|| last_name last_name,
 first_name,employee_id,LEVEL,
 connect_by_iscycle
 FROM hr.employees 
 START WITH employee_id=100
 CONNECT BY NOCYCLE PRIOR employee_id=manager_id;

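A connect_by_iscycle value of 1 indicates that Smith's row is causing the error. Querying Smith's own data shows nothing unusual. Querying for all the employees managed under Smith's employee ID finally reveals the problem: the company president should not have a manager. The fix is therefore to set manager_id back to null for that row.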

SELECT e.last_name, e.first_name, e.employee_id, e.manager_id
  FROM hr.employees e
 WHERE e.employee_id = 171
    OR e.manager_id = 171;
-- detecting a loop in a recursive query
WITH emp(employee_id,manager_id,last_name,first_name,lvl) AS (
 SELECT e.employee_id,NULL AS manager_id,e.last_name,e.first_name,1 AS lvl
 FROM hr.employees e
 WHERE e.employee_id=100
 UNION ALL 
 SELECT e.employee_id,e.manager_id,e.last_name,e.first_name,emp.lvl+1 AS lvl
 FROM hr.employees e
 JOIN emp ON emp.employee_id=e.manager_id
)
search DEPTH FIRST BY last_name SET order1
CYCLE employee_id SET is_cycle TO '1' DEFAULT '0'
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,first_name,employee_id,lvl,is_cycle 
FROM emp ORDER BY order1;


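Note: the cycle clause lets you set the is_cycle column to 0 or 1; only single-character values are allowed here. The name of the column is also user defined. Looking at the output, the cycle clause in RSF does a better job of pointing at the row that causes the loop: the offending row is clearly flagged as King's row, so you can query that row and quickly work out what is wrong.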
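What the cycle clause does can be imitated by hand, which also shows why the repeated King row is the one that gets flagged. This is a sketch under assumptions: it uses Python's built-in sqlite3 module, which has no cycle clause, so the ids already visited are carried in a path string and checked with instr(). The table rows, the '/'-delimited path format, and the function name walk_with_cycle_flag are all hypothetical.

```python
import sqlite3

# Hypothetical rows: King (100) wrongly reports to Smith (171), closing a loop.
LOOPED = [
    (100, 171, "King"),
    (148, 100, "Cambrault"),
    (171, 148, "Smith"),
    (101, 100, "Kochhar"),
]

def walk_with_cycle_flag(rows, start_id):
    """Walk down from start_id and return (last_name, lvl, is_cycle) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER, last_name TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    sql = """
        WITH RECURSIVE emp(employee_id, last_name, lvl, path, is_cycle) AS (
            SELECT employee_id, last_name, 1, '/' || employee_id || '/', 0
            FROM employees
            WHERE employee_id = ?
            UNION ALL
            SELECT e.employee_id, e.last_name, emp.lvl + 1,
                   emp.path || e.employee_id || '/',
                   -- flag the row if its id is already on the path walked so far
                   instr(emp.path, '/' || e.employee_id || '/') > 0
            FROM employees e
            JOIN emp ON emp.employee_id = e.manager_id
            WHERE emp.is_cycle = 0        -- do not descend below a flagged row
        )
        SELECT last_name, lvl, is_cycle FROM emp ORDER BY lvl, last_name
    """
    result = con.execute(sql, (start_id,)).fetchall()
    con.close()
    return result

for row in walk_with_cycle_flag(LOOPED, 100):
    print(row)
```

With this sample King appears twice: once at level 1 with is_cycle 0 and again at level 4 with is_cycle 1, after which the walk stops instead of looping forever.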

The connect_by_isleaf pseudocolumn

connect_by_isleaf identifies the leaf nodes in hierarchical data.

-- the connect_by_isleaf pseudocolumn
SELECT LPAD(' ',2*(LEVEL-1))|| e.last_name last_name,connect_by_isleaf
FROM hr.employees e
START WITH e.manager_id IS NULL
CONNECT BY PRIOR e.employee_id=e.manager_id
ORDER SIBLINGS BY e.last_name;

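Replicating this in RSF is somewhat harder. You need to identify the leaf nodes in the employee hierarchy; by definition a leaf node is never a manager, so every row that is not a manager is a leaf node.

-- finding leaf nodes in a recursive query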
WITH leaves AS (
 SELECT e.employee_id FROM hr.employees e
 WHERE e.employee_id NOT IN (
  SELECT manager_id FROM hr.employees WHERE manager_id IS NOT NULL
 )
),
emp(manager_id,employee_id,last_name,lvl,isleaf) AS (
 SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl,0 AS isleaf
 FROM hr.employees e
 WHERE e.manager_id IS NULL
 UNION ALL
 SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl, DECODE(l.employee_id,NULL,0,1) AS isleaf
 FROM hr.employees e
 JOIN emp ON emp.employee_id=e.manager_id
 LEFT OUTER JOIN leaves l ON l.employee_id=e.employee_id
)
search DEPTH FIRST BY last_name SET order1
SELECT LPAD(' ',2*(lvl-1))||last_name last_name, isleaf 
FROM emp
ORDER BY order1;

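The leaves subquery finds the leaf nodes, and its result is then left outer joined to the employees table. The value of the leaves.employee_id column indicates whether the current row is a leaf.
Another approach applies the analytic function lead() to the lvl column to decide whether a row is a leaf node. The lead() function relies on the value of the last_name_order column set in the search clause.

-- finding leaf nodes with lead()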
WITH emp(manager_id,employee_id,last_name,lvl) AS (
 SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl
 FROM hr.employees e
 WHERE e.manager_id IS NULL
 UNION ALL
 SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl
 FROM hr.employees e
 JOIN emp ON emp.employee_id=e.manager_id
)
search DEPTH FIRST BY last_name SET last_name_order
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,lvl,
LEAD(lvl) OVER(ORDER BY last_name_order) leadlvlorder,
CASE
  WHEN (lvl-LEAD(lvl) OVER (ORDER BY last_name_order))<0
  THEN 
    0
  ELSE 1
  END isleaf
FROM emp
ORDER BY last_name_order;

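This approach is somewhat fragile because it depends on the order of the data: if the search clause is changed from depth first to breadth first, the output may be incorrect, as the following run shows:

-- lead() with breadth first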
WITH emp(manager_id,employee_id,last_name,lvl) AS (
 SELECT e.manager_id,e.employee_id,e.last_name,1 AS lvl
 FROM hr.employees e
 WHERE e.manager_id IS NULL
 UNION ALL
 SELECT e.manager_id,nvl(e.employee_id,NULL),e.last_name,emp.lvl+1 AS lvl
 FROM hr.employees e
 JOIN emp ON emp.employee_id=e.manager_id
)
search breadth FIRST BY last_name SET last_name_order
SELECT LPAD(' ',2*(lvl-1))||last_name last_name,lvl,
LEAD(lvl) OVER(ORDER BY last_name_order) leadlvlorder,
CASE
  WHEN (lvl-LEAD(lvl) OVER (ORDER BY last_name_order))<0
  THEN 
    0
  ELSE 1
  END isleaf
FROM emp
ORDER BY last_name_order;
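Why the lead() rule breaks under breadth first ordering can be worked out by hand on a handful of rows. The sketch below applies the same case expression in plain Python to two hypothetical level sequences for a tree King -> (De Haan -> Hunold, Kochhar -> Greenberg); the function name leaf_flags and the sample lists are assumptions for illustration.

```python
def leaf_flags(levels):
    """Apply the lead() rule: a row is a leaf unless the next row is deeper."""
    flags = []
    for i, lvl in enumerate(levels):
        # LEAD() returns NULL on the last row; modelled here as None.
        nxt = levels[i + 1] if i + 1 < len(levels) else None
        # In SQL, "lvl - NULL < 0" is not true, so the last row falls through to ELSE 1.
        flags.append(0 if nxt is not None and lvl - nxt < 0 else 1)
    return flags

# Row order: King, De Haan, Hunold, Kochhar, Greenberg
depth_first_levels = [1, 2, 3, 2, 3]
# Row order: King, De Haan, Kochhar, Hunold, Greenberg
breadth_first_levels = [1, 2, 2, 3, 3]

print(leaf_flags(depth_first_levels))    # [0, 0, 1, 0, 1]
print(leaf_flags(breadth_first_levels))  # [0, 1, 0, 1, 1]
```

In depth first order only Hunold and Greenberg are flagged, which is correct. In breadth first order De Haan is followed by a sibling at the same level, so the rule wrongly marks a manager as a leaf.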

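In most practical cases you can replicate connect by functionality in a recursive factored subquery, but in many situations the connect by syntax is simply easier to use: doing the same thing in RSF usually takes more SQL. connect by can also produce a better execution plan than RSF, especially for relatively simple queries.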
