PostgreSQL 源碼解讀(18)- 查詢語句#3(SQL Parse)

本文簡單介紹了PG執行SQL的流程,重點介紹了查詢語句的解析(Parse)過程。

一、SQL執行流程

PG執行SQL的過程有以下幾個步驟:
第一步,根據輸入的SQL語句執行SQL Parse,進行詞法和語法分析等,最終生成解析樹;
第二步,根據解析樹,執行查詢邏輯/物理優化、查詢重寫,最終生成查詢樹;
第三步,根據查詢樹,生成執行計劃;
第四步,執行器根據執行計劃,執行SQL。

二、SQL解析

如前所述,PG的SQL Parse(解析)過程由函數pg_parse_query實現,在exec_simple_query函數中調用。
代碼如下:

 /*
  * Do raw parsing (only).
  *
  * A list of parsetrees (RawStmt nodes) is returned, since there might be
  * multiple commands in the given string.
  *
  * NOTE: for interactive queries, it is important to keep this routine
  * separate from the analysis & rewrite stages.  Analysis and rewriting
  * cannot be done in an aborted transaction, since they require access to
  * database tables.  So, we rely on the raw parser to determine whether
  * we've seen a COMMIT or ABORT command; when we are in abort state, other
  * commands are not processed any further than the raw parse stage.
  */
 List *
 pg_parse_query(const char *query_string)
 {
     List       *raw_parsetree_list;
 
     TRACE_POSTGRESQL_QUERY_PARSE_START(query_string);
 
     if (log_parser_stats)
         ResetUsage();
 
     raw_parsetree_list = raw_parser(query_string);
 
     if (log_parser_stats)
         ShowUsage("PARSER STATISTICS");
 
 #ifdef COPY_PARSE_PLAN_TREES
     /* Optional debugging check: pass raw parsetrees through copyObject() */
     {
         List       *new_list = copyObject(raw_parsetree_list);
 
         /* This checks both copyObject() and the equal() routines... */
         if (!equal(new_list, raw_parsetree_list))
             elog(WARNING, "copyObject() failed to produce an equal raw parse tree");
         else
             raw_parsetree_list = new_list;
     }
 #endif
 
     TRACE_POSTGRESQL_QUERY_PARSE_DONE(query_string);
 
     return raw_parsetree_list;
 }
 
 /*
  * raw_parser
  *      Given a query in string form, do lexical and grammatical analysis.
  *
  * Returns a list of raw (un-analyzed) parse trees.  The immediate elements
  * of the list are always RawStmt nodes.
  */
 List *
 raw_parser(const char *str)
 {
     core_yyscan_t yyscanner;
     base_yy_extra_type yyextra;
     int         yyresult;
 
     /* initialize the flex scanner */
     yyscanner = scanner_init(str, &yyextra.core_yy_extra,
                              ScanKeywords, NumScanKeywords);
 
     /* base_yylex() only needs this much initialization */
     yyextra.have_lookahead = false;
 
     /* initialize the bison parser */
     parser_init(&yyextra);
 
     /* Parse! */
     yyresult = base_yyparse(yyscanner);
 
     /* Clean up (release memory) */
     scanner_finish(yyscanner);
 
     if (yyresult)               /* error */
         return NIL;
 
     return yyextra.parsetree;
 }
 

重要的數據結構:SelectStmt結構體

/* ----------------------
  *      Select Statement
  *
  * A "simple" SELECT is represented in the output of gram.y by a single
  * SelectStmt node; so is a VALUES construct.  A query containing set
  * operators (UNION, INTERSECT, EXCEPT) is represented by a tree of SelectStmt
  * nodes, in which the leaf nodes are component SELECTs and the internal nodes
  * represent UNION, INTERSECT, or EXCEPT operators.  Using the same node
  * type for both leaf and internal nodes allows gram.y to stick ORDER BY,
  * LIMIT, etc, clause values into a SELECT statement without worrying
  * whether it is a simple or compound SELECT.
  * ----------------------
  */
 typedef enum SetOperation
 {
     SETOP_NONE = 0,
     SETOP_UNION,
     SETOP_INTERSECT,
     SETOP_EXCEPT
 } SetOperation;
 
 typedef struct SelectStmt
 {
     NodeTag     type;
 
     /*
      * These fields are used only in "leaf" SelectStmts.
      */
     List       *distinctClause; /* NULL, list of DISTINCT ON exprs, or
                                  * lcons(NIL,NIL) for all (SELECT DISTINCT) */
     IntoClause *intoClause;     /* target for SELECT INTO */
     List       *targetList;     /* the target list (of ResTarget) */
     List       *fromClause;     /* the FROM clause */
     Node       *whereClause;    /* WHERE qualification */
     List       *groupClause;    /* GROUP BY clauses */
     Node       *havingClause;   /* HAVING conditional-expression */
     List       *windowClause;   /* WINDOW window_name AS (...), ... */
 
     /*
      * In a "leaf" node representing a VALUES list, the above fields are all
      * null, and instead this field is set.  Note that the elements of the
      * sublists are just expressions, without ResTarget decoration. Also note
      * that a list element can be DEFAULT (represented as a SetToDefault
      * node), regardless of the context of the VALUES list. It's up to parse
      * analysis to reject that where not valid.
      */
     List       *valuesLists;    /* untransformed list of expression lists */
 
     /*
      * These fields are used in both "leaf" SelectStmts and upper-level
      * SelectStmts.
      */
     List       *sortClause;     /* sort clause (a list of SortBy's) */
     Node       *limitOffset;    /* # of result tuples to skip */
     Node       *limitCount;     /* # of result tuples to return */
     List       *lockingClause;  /* FOR UPDATE (list of LockingClause's) */
     WithClause *withClause;     /* WITH clause */
 
     /*
      * These fields are used only in upper-level SelectStmts.
      */
     SetOperation op;            /* type of set op */
     bool        all;            /* ALL specified? */
     struct SelectStmt *larg;    /* left child */
     struct SelectStmt *rarg;    /* right child */
     /* Eventually add fields for CORRESPONDING spec here */
 } SelectStmt;

重要的結構體:Value

 /*----------------------
  *      Value node
  *
  * The same Value struct is used for five node types: T_Integer,
  * T_Float, T_String, T_BitString, T_Null.
  *
  * Integral values are actually represented by a machine integer,
  * but both floats and strings are represented as strings.
  * Using T_Float as the node type simply indicates that
  * the contents of the string look like a valid numeric literal.
  *
  * (Before Postgres 7.0, we used a double to represent T_Float,
  * but that creates loss-of-precision problems when the value is
  * ultimately destined to be converted to NUMERIC.  Since Value nodes
  * are only used in the parsing process, not for runtime data, it's
  * better to use the more general representation.)
  *
  * Note that an integer-looking string will get lexed as T_Float if
  * the value is too large to fit in an 'int'.
  *
  * Nulls, of course, don't need the value part at all.
  *----------------------
  */
 typedef struct Value
 {
     NodeTag     type;           /* tag appropriately (eg. T_String) */
     union ValUnion
     {
         int         ival;       /* machine integer */
         char       *str;        /* string */
     }           val;
 } Value;
 
 #define intVal(v)       (((Value *)(v))->val.ival)
 #define floatVal(v)     atof(((Value *)(v))->val.str)
 #define strVal(v)       (((Value *)(v))->val.str)
 

實現過程本節暫時擱置,先看過程執行的結果,函數pg_parse_query返回的結果是鏈表List,其中的元素是RawStmt,具體的結構需根據NodeTag確定(這樣的做法類似于Java/C++的多態)。
測試數據

testdb=# -- 單位信息
testdb=# drop table if exists t_dwxx;
ues('Y有限公司','1002','北京市海淀區');
insert into t_dwxx(dwmc,dwbh,dwdz) values('Z有限公司','1003','廣西南寧市五象區');
NOTICE:  table "t_dwxx" does not exist, skipping
DROP TABLE
testdb=# create table t_dwxx(dwmc varchar(100),dwbh varchar(10),dwdz varchar(100));
CREATE TABLE
testdb=# 
testdb=# insert into t_dwxx(dwmc,dwbh,dwdz) values('X有限公司','1001','廣東省廣州市荔灣區');
INSERT 0 1
testdb=# insert into t_dwxx(dwmc,dwbh,dwdz) values('Y有限公司','1002','北京市海淀區');
INSERT 0 1
testdb=# insert into t_dwxx(dwmc,dwbh,dwdz) values('Z有限公司','1003','廣西南寧市五象區');
INSERT 0 1
testdb=# -- 個人信息
testdb=# drop table if exists t_grxx;
NOTICE:  table "t_grxx" does not exist, skipping
DROP TABLE
testdb=# create table t_grxx(dwbh varchar(10),grbh varchar(10),xm varchar(20),nl int);
CREATE TABLE
insert into t_grxx(dwbh,grbh,xm,nl) values('1002','903','王五',43);
testdb=# 
testdb=# insert into t_grxx(dwbh,grbh,xm,nl) values('1001','901','張三',23);
INSERT 0 1
testdb=# insert into t_grxx(dwbh,grbh,xm,nl) values('1002','902','李四',33);
INSERT 0 1
testdb=# insert into t_grxx(dwbh,grbh,xm,nl) values('1002','903','王五',43);
INSERT 0 1
testdb=# -- 個人繳費信息
testdb=# drop table if exists t_jfxx;
NOTICE:  table "t_jfxx" does not exist, skipping
DROP TABLE
testdb=# create table t_jfxx(grbh varchar(10),ny varchar(10),je float);
CREATE TABLE
testdb=# 
testdb=# insert into t_jfxx(grbh,ny,je) values('901','201801',401.30);
insert into t_jfxx(grbh,ny,je) values('901','201802',401.30);
insert into t_jfxx(grbh,ny,je) values('901','201803',401.30);
insert into t_jfxx(grbh,ny,je) values('902','201801',513.30);
insert into t_jfxx(grbh,ny,je) values('902','201802',513.30);
insert into t_jfxx(grbh,ny,je) values('902','201804',513.30);
insert into t_jfxx(grbh,ny,je) values('903','201801',372.22);
insert into t_jfxx(grbh,ny,je) values('903','201804',372.22);
testdb=# insert into t_jfxx(grbh,ny,je) values('901','201801',401.30);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('901','201802',401.30);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('901','201803',401.30);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('902','201801',513.10);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('902','201802',513.30);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('902','201804',513.30);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('903','201801',372.22);
INSERT 0 1
testdb=# insert into t_jfxx(grbh,ny,je) values('903','201804',372.22);
INSERT 0 1
testdb=# -- 獲取pid
testdb=# select pg_backend_pid();
 pg_backend_pid 
----------------
           1560
(1 row)
-- 用于測試的查詢語句
testdb=# select t_dwxx.dwmc,t_grxx.grbh,t_grxx.xm,t_jfxx.ny,t_jfxx.je
testdb-# from t_dwxx,t_grxx,t_jfxx
testdb-# where t_dwxx.dwbh = t_grxx.dwbh 
testdb-# and t_grxx.grbh = t_jfxx.grbh
testdb-# and t_dwxx.dwbh IN ('1001','1002')
testdb-# order by t_grxx.grbh
testdb-# limit 8;
   dwmc    | grbh |  xm  |   ny   |   je   
-----------+------+------+--------+--------
 X有限公司 | 901  | 張三 | 201801 |  401.3
 X有限公司 | 901  | 張三 | 201802 |  401.3
 X有限公司 | 901  | 張三 | 201803 |  401.3
 Y有限公司 | 902  | 李四 | 201801 |  513.1
 Y有限公司 | 902  | 李四 | 201802 |  513.3
 Y有限公司 | 902  | 李四 | 201804 |  513.3
 Y有限公司 | 903  | 王五 | 201801 | 372.22
 Y有限公司 | 903  | 王五 | 201804 | 372.22
(8 rows)

結果分析

[xdb@localhost ~]$ gdb -p 1560
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
...
(gdb) b pg_parse_query
Breakpoint 1 at 0x84c6c9: file postgres.c, line 615.
(gdb) c
Continuing.

Breakpoint 1, pg_parse_query (
    query_string=0x1a46ef0 "select t_dwxx.dwmc,t_grxx.grbh,t_grxx.xm,t_jfxx.ny,t_jfxx.je\nfrom t_dwxx inner join t_grxx on t_dwxx.dwbh = t_grxx.dwbh\ninner join t_jfxx on t_grxx.grbh = t_jfxx.grbh\nwhere t_dwxx.dwbh IN ('1001','100"...) at postgres.c:615
615     if (log_parser_stats)
(gdb) n
618     raw_parsetree_list = raw_parser(query_string);
(gdb) 
620     if (log_parser_stats)
(gdb) 
638     return raw_parsetree_list;
(gdb) p *(RawStmt *)(raw_parsetree_list->head.data->ptr_value)
$7 = {type = T_RawStmt, stmt = 0x1a48c00, stmt_location = 0, stmt_len = 232}
(gdb) p *((RawStmt *)(raw_parsetree_list->head.data->ptr_value))->stmt
$8 = {type = T_SelectStmt}
#轉換為實際類型SelectStmt 
(gdb)  p *(SelectStmt *)((RawStmt *)(raw_parsetree_list->head.data->ptr_value))->stmt
$16 = {type = T_SelectStmt, distinctClause = 0x0, intoClause = 0x0, targetList = 0x1a47b18, 
  fromClause = 0x1a48900, whereClause = 0x1a48b40, groupClause = 0x0, havingClause = 0x0, windowClause = 0x0, 
  valuesLists = 0x0, sortClause = 0x1afd858, limitOffset = 0x0, limitCount = 0x1afd888, lockingClause = 0x0, 
  withClause = 0x0, op = SETOP_NONE, all = false, larg = 0x0, rarg = 0x0}
#設置臨時變量
(gdb) set $stmt=(SelectStmt *)((RawStmt *)(raw_parsetree_list->head.data->ptr_value))->stmt
#查看結構體中的各個變量
#------------------->targetList 
(gdb) p *($stmt->targetList)
$28 = {type = T_List, length = 5, head = 0x1a47af8, tail = 0x1a48128}
#targetList有5個元素,分別對應t_dwxx.dwmc,t_grxx.grbh,t_grxx.xm,t_jfxx.ny,t_jfxx.je
#先看第1個元素
(gdb) set $restarget=(ResTarget *)($stmt->targetList->head.data->ptr_value)
(gdb) p *$restarget->val
$25 = {type = T_ColumnRef}
(gdb) p *(ColumnRef *)$restarget->val
$26 = {type = T_ColumnRef, fields = 0x1a47a08, location = 7}
(gdb) p *((ColumnRef *)$restarget->val)->fields
$27 = {type = T_List, length = 2, head = 0x1a47a88, tail = 0x1a479e8}
(gdb) p *(Node *)(((ColumnRef *)$restarget->val)->fields)->head.data->ptr_value
$32 = {type = T_String}
#fields鏈表的第1個元素是數據表,第2個元素是數據列
(gdb) p *(Value *)(((ColumnRef *)$restarget->val)->fields)->head.data->ptr_value
$37 = {type = T_String, val = {ival = 27556248, str = 0x1a47998 "t_dwxx"}}
(gdb) p *(Value *)(((ColumnRef *)$restarget->val)->fields)->tail.data->ptr_value
$38 = {type = T_String, val = {ival = 27556272, str = 0x1a479b0 "dwmc"}}
#其他類似

#------------------->fromClause 
(gdb) p *(Node *)($stmt->fromClause->head.data->ptr_value)
$41 = {type = T_JoinExpr}
(gdb) set $fromclause=(JoinExpr *)($stmt->fromClause->head.data->ptr_value)
(gdb) p *$fromclause
$42 = {type = T_JoinExpr, jointype = JOIN_INNER, isNatural = false, larg = 0x1a484f8, rarg = 0x1a48560, 
  usingClause = 0x0, quals = 0x1a487d0, alias = 0x0, rtindex = 0}

#------------------->whereClause 
(gdb)  p *(Node *)($stmt->whereClause)
$44 = {type = T_A_Expr}
(gdb)  p *(FromExpr *)($stmt->whereClause)
$46 = {type = T_A_Expr, fromlist = 0x1a48bd0, quals = 0x1a489d0}

#------------------->sortClause 
(gdb)  p *(Node *)($stmt->sortClause->head.data->ptr_value)
$48 = {type = T_SortBy}
(gdb)  p *(SortBy *)($stmt->sortClause->head.data->ptr_value)
$49 = {type = T_SortBy, node = 0x1a48db0, sortby_dir = SORTBY_DEFAULT, sortby_nulls = SORTBY_NULLS_DEFAULT, 
  useOp = 0x0, location = -1}

#------------------->limitCount 
(gdb)  p *(Node *)($stmt->limitCount)
$50 = {type = T_A_Const}
(gdb)  p *(Const *)($stmt->limitCount)
$51 = {xpr = {type = T_A_Const}, consttype = 0, consttypmod = 216, constcollid = 0, constlen = 8, 
  constvalue = 231, constisnull = 16, constbyval = false, location = 0}

以上為簡單的數據結構介紹,下一節將詳細解析parseTree的Tree結構。

三、小結

1、SQL執行流程:簡單介紹了SQL的執行流程,簡單分為解析、查詢優化、執行計劃生成和執行這四步;
2、SQL解析:解析執行后,結果存儲在解析樹中,分為distinctClause、intoClause、targetList等多個部分。

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發布,文章內容僅代表作者本人觀點,簡書系信息發布平臺,僅提供信息存儲服務。
  • 序言:七十年代末,一起剝皮案震驚了整個濱河市,隨后出現的幾起案子,更是在濱河造成了極大的恐慌,老刑警劉巖,帶你破解...
    沈念sama閱讀 228,461評論 6 532
  • 序言:濱河連續發生了三起死亡事件,死亡現場離奇詭異,居然都是意外死亡,警方通過查閱死者的電腦和手機,發現死者居然都...
    沈念sama閱讀 98,538評論 3 417
  • 文/潘曉璐 我一進店門,熙熙樓的掌柜王于貴愁眉苦臉地迎上來,“玉大人,你說我怎么就攤上這事。” “怎么了?”我有些...
    開封第一講書人閱讀 176,423評論 0 375
  • 文/不壞的土叔 我叫張陵,是天一觀的道長。 經常有香客問我,道長,這世上最難降的妖魔是什么? 我笑而不...
    開封第一講書人閱讀 62,991評論 1 312
  • 正文 為了忘掉前任,我火速辦了婚禮,結果婚禮上,老公的妹妹穿的比我還像新娘。我一直安慰自己,他們只是感情好,可當我...
    茶點故事閱讀 71,761評論 6 410
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著,像睡著了一般。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發上,一...
    開封第一講書人閱讀 55,207評論 1 324
  • 那天,我揣著相機與錄音,去河邊找鬼。 笑死,一個胖子當著我的面吹牛,可吹牛的內容都是我干的。 我是一名探鬼主播,決...
    沈念sama閱讀 43,268評論 3 441
  • 文/蒼蘭香墨 我猛地睜開眼,長吁一口氣:“原來是場噩夢啊……” “哼!你這毒婦竟也來了?” 一聲冷哼從身側響起,我...
    開封第一講書人閱讀 42,419評論 0 288
  • 序言:老撾萬榮一對情侶失蹤,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后,有當地人在樹林里發現了一具尸體,經...
    沈念sama閱讀 48,959評論 1 335
  • 正文 獨居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內容為張勛視角 年9月15日...
    茶點故事閱讀 40,782評論 3 354
  • 正文 我和宋清朗相戀三年,在試婚紗的時候發現自己被綠了。 大學時的朋友給我發了我未婚夫和他白月光在一起吃飯的照片。...
    茶點故事閱讀 42,983評論 1 369
  • 序言:一個原本活蹦亂跳的男人離奇死亡,死狀恐怖,靈堂內的尸體忽然破棺而出,到底是詐尸還是另有隱情,我是刑警寧澤,帶...
    沈念sama閱讀 38,528評論 5 359
  • 正文 年R本政府宣布,位于F島的核電站,受9級特大地震影響,放射性物質發生泄漏。R本人自食惡果不足惜,卻給世界環境...
    茶點故事閱讀 44,222評論 3 347
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧,春花似錦、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 34,653評論 0 26
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至,卻和暖如春,著一層夾襖步出監牢的瞬間,已是汗流浹背。 一陣腳步聲響...
    開封第一講書人閱讀 35,901評論 1 286
  • 我被黑心中介騙來泰國打工, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留,地道東北人。 一個月前我還...
    沈念sama閱讀 51,678評論 3 392
  • 正文 我出身青樓,卻偏偏與公主長得像,于是被迫代替她去往敵國和親。 傳聞我的和親對象是個殘疾皇子,可洞房花燭夜當晚...
    茶點故事閱讀 47,978評論 2 374

推薦閱讀更多精彩內容

  • 關于Mongodb的全面總結 MongoDB的內部構造《MongoDB The Definitive Guide》...
    中v中閱讀 31,993評論 2 89
  • 一、背景 為了分析postgresql代碼,了解其執行查詢語句的過程,我采用eclipse + gdb集成調試環境...
    hemny閱讀 12,279評論 2 6
  • 一、正則表達式的作用: 給定的字符串是否符合正則表達式的過濾邏輯(稱作“匹配”) 可以通過正則表達式,從字符串中獲...
    伯恩的遺產閱讀 4,661評論 18 81
  • 早上起床,我們就開始了抱怨的一天。我們抱怨衛生間總是被隔壁住戶占用著,我們抱怨沒有漂亮衣服可以好好裝扮自己, 我們...
    北斗夜行閱讀 196評論 0 0
  • 一周過去了 可是什么都沒干 感覺自己像一個保鏢一樣 每天跟著就好 唉 不知道做什么 什么都不會 要學的還很多 時間...
    洛維閱讀 247評論 0 0