Key genes and hub genes (from a biological network perspective)

Preface

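This post again draws on a few papers and my own accumulated notes, and it is mainly about key genes and hub genes. Many people assume that hub genes are key genes, and some even think that differentially expressed genes are key genes. Before the main text, here is my own brief take on how the three relate. If you see it differently, please leave a comment.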

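  • Differentially expressed genes (DEGs) are genes that differ with statistical significance between two groups. With microarrays, for example, perhaps only about 1,000 of tens of thousands of probes are differential (the number varies a lot with the thresholds you set).
  • Hub genes are genes with high degree, that is, high connectivity in the gene co-expression network. Betweenness and other measures are not involved. Hub selection is also quite subjective: there is no fixed rule on whether to take the top 5% or the top 10%, though 5% is the usual suggestion. In other words, it is a loose criterion.
  • Key genes: some people pick the top-ranked hubs, others pick the DEGs with the smallest p-values. So what actually counts as a key gene? Broadly speaking, if knocking the gene down makes the phenotype clearly disappear, it is certainly a key gene. But how do you pick one from bioinformatics analysis alone? No single method can settle this directly. Here I look only at the expression-network angle, and I will later write a post on screening key genes from several angles. The set of key genes is smaller than the set of hubs. A hub is not necessarily key, and a key gene is not necessarily a hub.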
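
To make the hub definition above concrete, here is a minimal Python sketch of picking hubs as the top 5% of genes by degree. It assumes a hard-thresholded Pearson correlation network; the cutoff `r_cut`, the helper names, and the toy expression values are all illustrative assumptions, not something prescribed by the papers discussed in this post.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def degrees(expr, r_cut=0.5):
    """Degree of each gene: number of other genes with |r| >= r_cut."""
    genes = list(expr)
    deg = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= r_cut:
                deg[g] += 1
                deg[h] += 1
    return deg

def top_fraction(scores, fraction=0.05):
    """Genes with the highest scores; always returns at least one gene."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:k]

# Toy expression matrix (hypothetical values): gene -> values across 4 samples.
expr = {
    "h": [3, -1, -1, -1],   # correlates moderately with a, b and c
    "a": [1, 1, -1, -1],
    "b": [1, -1, 1, -1],
    "c": [1, -1, -1, 1],    # a, b and c are mutually uncorrelated
}

deg = degrees(expr, r_cut=0.5)
hubs = top_fraction(deg, fraction=0.05)
print(deg, hubs)
```

With only four toy genes the 5% rule falls back to a single hub. On real data the choice of `r_cut` and `fraction` changes the hub list considerably, which is exactly the subjectivity described above.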

In short, in terms of number or scope:

DEGs > hubs > key genes (candidate genes)

------------------------------------------------

OK, on to the main text.

Hub genes

The WGCNA approach typically deals with the identification of gene modules by using the gene expression levels that are highly correlated across samples. This technique has been successfully utilized to detect gene modules in Arabidopsis, rice, maize and poplar for various biotic and abiotic stresses. Further, this approach also leads to construction of Gene Co-expression Network (GCN), a scale free network, where genes are represented as nodes and edges depict associations among genes. In such network, highly connected genes are called hub genes, which are expected to play an important role in understanding the biological mechanism of response under stresses/conditions. Identification of hub genes will also help in mitigating the stress in plants through genetic engineering. The existing approaches have mainly focused on hub gene identification, based only on gene connection degrees in the GCN. Moreover, these techniques select such genes empirically without any statistical criteria. Besides, few approaches can be found in the literature for the identification of hub nodes in a scale free network.
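
As a rough illustration of "connection degree" in a weighted co-expression network, the sketch below uses the unsigned WGCNA-style adjacency a_ij = |cor(x_i, x_j)|^beta and sums over the other genes to get each gene's connectivity. It is plain Python rather than the WGCNA R package, and the soft-threshold power `beta = 6`, the function names, and the toy data are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def connectivity(expr, beta=6):
    """Weighted degree k_i = sum over j != i of |cor(i, j)| ** beta."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a
            k[h] += a
    return k

# Toy expression matrix (hypothetical values): gene -> values across 4 samples.
expr = {
    "h": [3, -1, -1, -1],
    "a": [1, 1, -1, -1],
    "b": [1, -1, 1, -1],
    "c": [1, -1, -1, 1],
}

k = connectivity(expr)
ranked = sorted(k, key=k.get, reverse=True)
print(ranked[0], k[ranked[0]])
```

Ranking by `k` gives a hub ordering, but where to cut the list is still an empirical choice, which is the gap the quoted passage points out.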

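As this shows, hub genes live in a scale-free gene co-expression network (GCN) and are defined by degree. Existing methods for identifying hub genes rely on connectivity in the GCN and choose genes empirically, without any statistical criterion. The literature also offers few methods for identifying hub nodes in a scale-free network.
So the authors proposed an algorithm and wrote a package that assigns a p-value to each hub gene, so that the number of hub genes can be reduced with a p-value cutoff.
Package: here
Paper link 1
Paper link 2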

It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has been on highly connected proteins (“hubs”). As a complementary notion, it is possible to define bottlenecks as proteins with a high betweenness centrality (i.e., network nodes that have many “shortest paths” going through them, analogous to major bridges and tunnels on a highway map). Bottlenecks are, in fact, key connector proteins with surprising functional and dynamic properties. In particular, they are more likely to be essential proteins. In fact, in regulatory and other directed networks, betweenness (i.e., “bottleneck-ness”) is a much more significant indicator of essentiality than degree (i.e., “hub-ness”). Furthermore, bottlenecks correspond to the dynamic components of the interaction network: they are significantly less well coexpressed with their neighbors than nonbottlenecks, implying that expression dynamics is wired into the network topology.
A network is a graph consisting of a number of nodes with edges connecting them. Recently, network models have been widely applied to biological systems. Here, we are mainly interested in two types of biological networks: the interaction network, where nodes are proteins and edges connect interacting partners; and the regulatory network, where nodes are proteins and edges connect transcription factors and their targets. Betweenness is one of the most important topological properties of a network. It measures the number of shortest paths going through a certain node. Therefore, nodes with the highest betweenness control most of the information flow in the network, representing the critical points of the network. We thus call these nodes the “bottlenecks” of the network. Here, we focus on bottlenecks in protein networks. We find that, in the regulatory network, where there is a clear concept of information flow, protein bottlenecks indeed have a much higher tendency to be essential genes. In this type of network, betweenness is a good predictor of essentiality. Biological researchers can therefore use the betweenness as one more feature to choose potential targets for detailed analysis.
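
Betweenness as described here (shortest paths passing through a node) can be computed with Brandes' algorithm. The following is a minimal sketch for an unweighted, undirected graph; the toy graph (two triangles joined through a single node `x`) and the function names are made up for illustration and do not come from the paper.

```python
from collections import deque

def build_graph(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def betweenness(adj):
    """Brandes' algorithm: for each node, the number of shortest paths
    between other node pairs that pass through it (fractional when a
    pair has several shortest paths)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

# Two triangles (a, b, c) and (d, e, f) joined only through node x.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "x"), ("x", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = build_graph(edges)
bc = betweenness(adj)
degree = {v: len(adj[v]) for v in adj}
print(degree["x"], bc["x"])
```

In this toy graph `x` has only two neighbors, yet every path between the two triangles runs through it, so it is a bottleneck without being a hub.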


Figure1.png
Figure2.png

Below is an explanation of the difference between hubs and bottlenecks.

Central complex members have a low betweenness and are hub–nonbottlenecks.

Because of the high connectivity inside these complexes, paths can go through them and all their neighbors. On the other hand, hub–bottlenecks tend to correspond to highly central proteins that connect several complexes or are peripheral members of central complexes.

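In other words, hub-bottlenecks tend to be highly central proteins that connect several complexes or sit at the periphery of a central complex. Their high betweenness shows that they are not simply members of a large protein complex (which is what characterizes hub-nonbottlenecks). They are the members that connect the complex to the rest of the network and, in a sense, the real connectivity bottlenecks.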

The fact that they have a high betweenness suggests that these proteins are not, however, simply members of large protein complexes (which is true for nonbottleneck–hubs), but are those members that connect the complex to the rest of the graph; in a sense, real connectivity bottlenecks. While hub–nonbottlenecks mainly consist of structural proteins, hub–bottlenecks are more likely to be part of signal transduction pathways.
To summarize: hub-nonbottlenecks are mainly structural proteins,
while hub-bottlenecks are more likely to be part of signal transduction pathways.
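
The four categories used in this discussion can be sketched as a simple two-way split on degree and betweenness. The snippet below assumes both scores have already been computed; the 20% cutoff, the protein names, and the score values are hypothetical and chosen only so that each category appears once.

```python
def top_set(scores, fraction):
    """Nodes in the top `fraction` by score; ties at the cutoff are kept."""
    k = max(1, round(len(scores) * fraction))
    cutoff = sorted(scores.values(), reverse=True)[k - 1]
    return {n for n, s in scores.items() if s >= cutoff}

def classify(degree, betweenness, fraction=0.2):
    """Label every node as hub/nonhub and bottleneck/nonbottleneck."""
    hubs = top_set(degree, fraction)
    bottlenecks = top_set(betweenness, fraction)
    labels = {}
    for n in degree:
        h = "hub" if n in hubs else "nonhub"
        b = "bottleneck" if n in bottlenecks else "nonbottleneck"
        labels[n] = f"{h}-{b}"
    return labels

# Hypothetical, precomputed scores for ten proteins (not real data).
degree = {"P1": 9, "P2": 8, "P3": 3, "P4": 2, "P5": 2,
          "P6": 1, "P7": 1, "P8": 1, "P9": 1, "P10": 1}
betweenness = {"P1": 30.0, "P2": 2.0, "P3": 25.0, "P4": 1.0, "P5": 0.5,
               "P6": 0.0, "P7": 0.0, "P8": 0.0, "P9": 0.0, "P10": 0.0}

labels = classify(degree, betweenness, fraction=0.2)
for name in ("P1", "P2", "P3", "P4"):
    print(name, labels[name])
```

Here `P1` comes out as a hub-bottleneck, `P2` as a hub-nonbottleneck, and `P3` as a nonhub-bottleneck. Changing `fraction` moves proteins between categories, so the cutoff should be reported alongside any such classification.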

Furthermore, hub–bottlenecks are (by construction) the most efficient in disrupting the network upon hub removal. This relates nicely to the date/party-hub concept by Han et al.: hub–bottlenecks tend to be date-hubs, whereas hub–nonbottlenecks tend to be party-hubs.

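Also, hub-bottlenecks are by construction the hubs whose removal disrupts the network most effectively. This fits the hub concept of Han et al. closely: hub-bottlenecks tend to be date hubs, and hub-nonbottlenecks tend to be party hubs. (Han's paper makes this clear: date hubs are more likely the organizers and maintainers of the overall architecture, the big bosses.) Han's view was published in Nature and is summarized below.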

The Nature paper by Han et al. mentioned above:
https://www.nature.com/articles/nature02555
In apparently scale-free protein–protein interaction networks, or ‘interactome’ networks, most proteins interact with few partners, whereas a small but significant proportion of proteins, the ‘hubs’, interact with many partners.
That is, most proteins have only a few partners, and only a small fraction, the hubs, have many.

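Nonhub-bottlenecks are usually less well coexpressed with their neighbors than nonhub-nonbottlenecks, consistent with the observation that betweenness is an indicator of a protein's average correlation with its neighbors. Nonhub-bottlenecks are rarely members of complexes; most of them are regulatory proteins and signal transduction machinery.
Any scale-free network, biological or not, is robust against random node removal but very sensitive to the removal of hubs.
Roughly: in yeast, deleting a gene that encodes a hub protein is about three times more likely to be lethal than deleting a gene for a non-hub. Han et al. then found two classes of hubs. Party hubs interact with most of their partners simultaneously. Date hubs bind different partners at different times or locations.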
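
The robustness point (random node removal versus hub removal) can be illustrated on a small graph by measuring the largest connected component that survives a removal. The sketch below uses a hypothetical 13-node tree with one central hub; it is a topology toy, not a model of yeast lethality.

```python
def build_graph(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component(adj, removed=()):
    """Size of the largest connected component after deleting `removed`."""
    removed = set(removed)
    seen, best = set(), 0
    for start in adj:
        if start in removed or start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in removed and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

# Hypothetical network: hub H links four connectors, each with two leaves.
edges = []
for i in range(1, 5):
    edges.append(("H", f"c{i}"))
    edges.append((f"c{i}", f"c{i}_leaf1"))
    edges.append((f"c{i}", f"c{i}_leaf2"))
adj = build_graph(edges)

hub = max(adj, key=lambda n: len(adj[n]))          # the single degree-4 node
after_hub = largest_component(adj, removed=[hub])

# Average effect of deleting any one non-hub node (stand-in for random removal).
others = [n for n in adj if n != hub]
after_other = [largest_component(adj, removed=[n]) for n in others]
mean_other = sum(after_other) / len(after_other)
print(after_hub, mean_other)
```

Deleting the hub breaks the network into small pieces, while deleting a typical non-hub node leaves most of it connected.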


Figure3.png

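Thus, hubs in the yeast interaction network can be split into two classes based on their partners' expression profiles: date hubs and party hubs. This distinction suggests a model of organized modularity for the yeast proteome, in which modules are connected through regulators, mediators, or adaptors, and these are the date hubs. Party hubs are integral components inside distinct modules. They matter for the functions those modules mediate (and therefore tend to be essential proteins), and they tend to work at a lower level of proteome organization. (Roughly speaking, date hubs are the big bosses who coordinate and connect, while party hubs are the small bosses inside each module.) The authors propose that date hubs are essential for the global organization of biological modules in the whole proteome network and take part in wide-ranging integrating connections (although some date hubs may simply be shared between modules and regulate local functions within or across modules). Key features of interaction networks, such as genetic robustness and resilience to environmental perturbation, are easier to understand with this modular organization as a framework.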
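
One way to picture the split "based on their partners' expression profiles" is to average the Pearson correlation between a hub and each of its partners: a high average suggests a party hub, a low one a date hub. The sketch below follows that idea under stated assumptions; the 0.5 cutoff, the gene names, and the expression values are illustrative only and are not taken from Han et al.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_pcc(hub, partners, expr):
    """Mean correlation between a hub and its interaction partners."""
    values = [pearson(expr[hub], expr[p]) for p in partners]
    return sum(values) / len(values)

def hub_type(hub, partners, expr, cutoff=0.5):
    """'party' if the hub is coexpressed with its partners, else 'date'."""
    return "party" if average_pcc(hub, partners, expr) >= cutoff else "date"

# Hypothetical expression profiles over six conditions.
expr = {
    "P":  [1, 2, 3, 4, 5, 6],      # hub that tracks all of its partners
    "p1": [2, 4, 6, 8, 10, 12],
    "p2": [1, 2, 3, 4, 5, 7],
    "p3": [0, 1, 3, 4, 5, 6],
    "D":  [1, 2, 1, 2, 1, 2],      # hub whose partners peak at different times
    "q1": [5, 1, 1, 1, 1, 1],
    "q2": [1, 1, 5, 1, 1, 1],
    "q3": [1, 5, 1, 1, 1, 1],
}
# Hypothetical interaction partners of each hub.
interactions = {"P": ["p1", "p2", "p3"], "D": ["q1", "q2", "q3"]}

types = {hub: hub_type(hub, partners, expr)
         for hub, partners in interactions.items()}
print(types)
```

A real analysis would need expression data across many conditions and a cutoff justified by the distribution of average correlations, so treat the threshold here as a placeholder.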

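So the so-called date hubs are the hubs with high betweenness (hub-bottlenecks),
while party hubs are more likely to be hubs with low betweenness (hub-nonbottlenecks).
This finding may point to a link between the dynamic and topological properties of interaction networks that was previously unknown.
The authors believe that, despite current practical difficulties, betweenness will prove to be a very useful tool for many protein networks, especially those with directed edges (regulatory networks).
In summary, the authors provide an integrated analysis of two complementary topological properties, which applies to different network types. This integrated approach reveals previously unknown links between network topology, protein essentiality, and expression dynamics. They believe that integrated approaches like this one will be crucial for future predictive models.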
