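In single-cell data analysis, cell-cell communication is one of the most important steps.
Plenty of tools already exist for communication analysis of single-cell data, such as CellPhoneDB, celltalker, CellChat and iTALK. Each has its own strengths and weaknesses. At the moment CellPhoneDB is the most cited and performs relatively well; NicheNet has also been used in high-profile papers, and some studies combine CellPhoneDB with NicheNet.
Today we look at a newer tool, NATMI. Compared with the others it has some advantages of its own, and its output is well suited to research questions. Let's take a look.
The paper is here: Predicting cell-to-cell communication networks using NATMI
First, the database: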
NATMI uses connectomeDB2020 (a database of 2293 manually curated ligand-receptor pairs with literature support) to predict and visualise cell-to-cell communication networks from single-cell (or bulk) expression data
1. Advantages of the software
(1) Identify the cell-type pairs that are communicating the most (or most specifically) within a network.
(2) Identify the most active (or specific) ligand-receptor pairs within a network.
(3) Identify putative highly-communicating cellular communities.
(4) Identify differences in intercellular communication when profiling given cell types under different conditions.
Judging by this list, the software covers essentially every result we would want. Next, let's look at how it works and how it is applied.
The paper states:
Specifically, NATMI can:
(1) show all cell types predicted to communicate via a user-specified ligand–receptor pair
(2) show all ligand–receptor pairs used for communication between a user-specified pair of cell types
(3) summarise the entire communication network to show how strongly or specifically each cell type in a complex sample communicates to every other cell type, thus identifying highly communicating cell pairs or communities
(4) compare communication networks from two different conditions and identify edges (ligand–receptor pairs) that differ (delta network) between them.
In short, everything is there to serve cell-cell communication analysis.
2. Edge file
This is the first point worth noting.
For each analysed dataset NATMI creates an edge file that summarises the levels and fractions of cells in each cell type expressing each ligand and receptor. From this it calculates two different edge weights. The mean-expression edge weights are calculated by multiplying the mean-expression level of the ligand in the sending cell type by the mean expression of the receptor in the target cell type. This weighting is useful to emphasise highly expressed ligands and receptors but provides no discrimination between cell-type-specific and housekeeping edges. The specificity-based edge weights, on the other hand, help identify the most specific edges in the network regardless of expression levels and are calculated as the product of the ligand and receptor specificities, where each specificity is defined as the mean expression of the ligand/receptor in a given cell type divided by the sum of the mean expression of that ligand/receptor across all cell types. The specificity-based edge weights range from 0 to 1 where a weight of 1 means both the ligand and receptor are only expressed in one (not necessarily the same) cell type.
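Here the authors explain that the edge file records, for each cell type, the expression level of every ligand and receptor and the fraction of cells expressing it, and that two kinds of edge weight are computed from it. The first is the mean-expression edge weight: the mean expression of the ligand in the sending cell type multiplied by the mean expression of the receptor in the target cell type. It provides no discrimination between cell-type-specific and housekeeping edges. The second is the specificity-based edge weight, which picks out the most specific ligand-receptor pairs between cell types regardless of expression level. The calculation of the specificity weight is given in the quote above; we come back to it in the methods section below.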
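Below is a concrete example of the software in use.
Ranking by expression weight gives the following result (top 20):
Ranking by specificity weight gives the following result:
The two rankings differ considerably. In other words, highly expressed ligand-receptor pairs are not necessarily specific ones, so the two weightings complement each other.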
3. Cell-connectivity-summary networks
One of the primary aims of cell-to-cell communication network analysis is to identify which cell types are mutually coordinating their activities by ligand–receptor-mediated communication. Our analyses indicate, however, that all cell types have substantial potential to communicate with each other. Consequently, this leads to the question of which cell types are communicating the most? Or the most specifically? The simplest strategy to measure the degree of communication from one cell type to another is to count the number of ligand–receptor pairs connecting them.
In other words, the question to answer is: for a cell type of interest, which cell type or types communicate with it the most, and which communicate with it the most specifically? Filtering by expression weights can provide users a higher confidence that the ligands and receptors are expressed at sufficient levels. In contrast, filtering on specificity weights highlights a different set of top cell-to-cell pairs.
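As the figure shows, with mean-expression weights fibroblasts are the most communicating cell type, whereas with specificity weights the picture changes: endothelial cells have many specific interactions. The two analyses give somewhat different results.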
We next compared our results with those obtained by filtering edges based on p values calculated by CellPhoneDB. The resulting heatmap (shown below) is similar to that observed for the expression-filtered network, suggesting NATMI may better highlight high specificity edges. Lastly, the network can also be summarised using the summed-specificity weights (discussed in the methods section below) between each cell type pair.
We recommend using summed specificity for most analyses, because it captures specific signalling between cell types.
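To make the idea of a summed-specificity summary concrete, here is a minimal sketch. It assumes the per-edge specificity weights are already available; the cell-type names, the placeholder ligand/receptor names and all numbers are made up for illustration, and this is not NATMI's own code.

```python
from collections import defaultdict

# Hypothetical edges: (sender, target, ligand, receptor, specificity weight).
# All names and values are illustrative placeholders.
edges = [
    ("endothelial", "fibroblast", "LigA", "RecA", 0.40),
    ("endothelial", "fibroblast", "LigB", "RecB", 0.25),
    ("fibroblast", "endothelial", "LigC", "RecC", 0.10),
    ("fibroblast", "fibroblast", "LigD", "RecD", 0.05),
]

def summed_specificity(edges):
    """Sum the specificity weights of all edges for each (sender, target) pair."""
    summary = defaultdict(float)
    for sender, target, _ligand, _receptor, weight in edges:
        summary[(sender, target)] += weight
    return dict(summary)

summary = summed_specificity(edges)

# The directed cell-type pair with the largest summed specificity.
top_pair = max(summary, key=summary.get)
print(top_pair, summary[top_pair])
```

Note that the summary is directed: endothelial → fibroblast and fibroblast → endothelial are kept as separate entries, and a cell type paired with itself corresponds to autocrine signalling.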
The next part mainly covers ligand-receptor pairs expressed within the same cell type (autocrine signalling). Here are the main conclusions.
Autocrine edges had higher rankings, meaning that, on average, autocrine edges tend to be more specific than intra-organ and inter-organ edges, whereas repeated analyses using randomly permuted receptor-ligand pairs abolished these differences (note that permutation is the approach CellPhoneDB uses). A slight enrichment was also observed for intra-organ signalling for outgoing plasma-membrane ligand-mediated edges, while no such enrichment was found for the secreted ligand-mediated edges and for the plasma-membrane receiving edges. We conclude that autocrine signalling is a major predicted feature of cell-to-cell communication networks.
The main conclusion here is that communication within the same cell type (autocrine signalling) is, on average, the most specific and a major component of the predicted networks.
4. Prediction of cellular communities in the Tabula Muris
To examine whether the summed-specificity weighted cell-connectivity-summary networks might help reveal sets of cell types that work together within an organ or to achieve a biological process, we carried out hierarchical clustering of cell types by the vectors of their summed-specificity weights. For both the secreted-ligand and the plasma-membrane-ligand mediated networks, this failed to reveal any underlying clustering of cell types into organs, tissues or cell communities.
We next examined the top 10 summed-specificity edges based on the secreted and plasma-membrane ligands and visualised them as cell-connectivity-summary networks which revealed distinct cell communities for both secreted and plasma-membrane ligands (the top 10 edges separate clearly, as shown in the figure below).
For the connections involving secreted ligands, we observed four disconnected communities.
Examining the most specific ligand–receptor pairs involved in each cellular interaction identified both well-known and novel pairs that appear to be biologically relevant.
Community detection is probably not something we will use very often in our own analyses, but it offers a fresh angle for really understanding how cells are linked through communication.
5. Differential network analysis in NATMI
This is the other point we care about: differences in communication between samples.
Lastly, we used NATMI to predict age-related changes in cell communication within the murine mammary gland (mammary glands from 3- and 18-month-old mice profiled in the Tabula Muris Senis were compared). That is, the comparison is between cell communication in mouse mammary tissue at two ages. A simple edge count analysis, at a detection rate threshold of 20%, revealed that there were substantially more ligand–receptor edges predicted as active at 3 months than at 18 months (2045 edges were detected at both ages; 1247 edges were detected at 3 months only; and 340 edges were detected at 18 months only). This is a difference in the number of ligand-receptor edges. Examining differences in the cell-connectivity-summary networks based on the 3- and 18-month-old mammary gland revealed specific cell types were driving these age-related differences.
In other words, the first thing to analyse is the difference in the number of ligand-receptor edges (the figure shows ligand counts). Furthermore, 266 (78.2%) of the 340 edges only detected in the 18-month-old mammary gland involved signalling to or from T and B cells, while only 141 (11.3%) of the 1247 edges exclusively active at 3 months involved lymphocytes. This is a difference in the share of edges that involve a given cell type. Conversely, 613 (49.2%) of the 1247 ligand–receptor edges only detected at 3 months involved signalling to or from basal cells while only 52 (15.6%) of the 340 edges exclusively active at 18 months involved basal cells. This is step one: compare across ages how many ligands and edges each cell type contributes, and how the proportion of edges involving a given cell type changes.
Examining the basal cell data in more detail identified 9 receptors (Gpc1, Procr, Fzd7, Itga5, Ldlr, Tlr2, Lrp6, Ephb1, and Tfrc) and 8 ligands (Tgfa, Ngf, Col5a2, Il11, Col4a2, Jag1, Col18a1 and Hspg2) at least twofold down-regulated at 18 months. This is step two: changes in the expression strength of ligands and receptors. Notably, many of these top downregulated ligands and receptors are known to be important in maintenance of normal mammary basal stem cells and are implicated in basal-like and triple negative breast cancer. In other words, a change in strength for ligand-receptor pairs that are present in both samples is also a research direction worth attention.
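The analysis strategy here is worth borrowing. There are three main points:
(1) Differences between samples in the number of ligand-receptor pairs expressed between cell types (and in their proportions), which directly reflect how frequent communication is.
(2) Ligand-receptor pairs that disappear or newly appear between samples, which directly reflect differences in the biological function of cell communication under different conditions (including age).
(3) Changes in ligand-receptor strength: even when a pair is present in both samples, a drop or rise in expression can affect cell function.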
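The three points above can be sketched in a few lines. This is only an illustration under stated assumptions: each condition is represented as a dictionary mapping an edge (sender, target, ligand, receptor) to a positive mean-expression weight, the edge names and numbers are invented, and the twofold cut-off is applied to the edge weight rather than to individual genes. It is not how NATMI itself computes its delta network.

```python
import math

# Hypothetical edges, keyed by (sender, target, ligand, receptor).
e1 = ("basal", "luminal", "LigA", "RecA")
e2 = ("basal", "stromal", "LigB", "RecB")
e3 = ("basal", "basal", "LigC", "RecC")
e4 = ("T cell", "B cell", "LigD", "RecD")

# Hypothetical mean-expression edge weights in each condition.
young = {e1: 8.0, e2: 4.0, e3: 2.0}
old = {e1: 2.0, e2: 4.0, e4: 1.0}

def compare_networks(a, b):
    """Split edges into shared / only-in-a / only-in-b and compute
    log2 fold changes (b over a) for the shared edges."""
    shared = a.keys() & b.keys()
    only_a = a.keys() - b.keys()
    only_b = b.keys() - a.keys()
    log2_fc = {edge: math.log2(b[edge] / a[edge]) for edge in shared}
    return shared, only_a, only_b, log2_fc

shared, only_young, only_old, log2_fc = compare_networks(young, old)

# Points (1) and (2): edge counts, plus edges lost or gained with age.
print(len(shared), len(only_young), len(only_old))

# Point (3): shared edges whose weight dropped at least twofold.
down_twofold = sorted(edge for edge, fc in log2_fc.items() if fc <= -1.0)
print(down_twofold)
```

Counting how many of the edges in `only_young` or `only_old` involve a given cell type, as either sender or target, gives the kind of per-cell-type proportions quoted from the paper above.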
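6. Algorithm: here we focus only on the weight calculations
NATMI outputs weights of edges from a ligand-producing cell type/cluster to a receptor-expressing cell-type/cluster using three metrics.
(1) Mean-expression weight: the product of the mean expression of the ligand in the sending cell type/cluster and the mean expression of the receptor in the receiving cell type/cluster: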
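$$\mathrm{edge}(\text{cell-type}_1 \rightarrow \text{cell-type}_2)^{\text{mean}}_{\text{ligand}_1\text{-receptor}_1} = \text{cell-type}_1^{\,\text{mean ligand}_1} \times \text{cell-type}_2^{\,\text{mean receptor}_1}$$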
(2) Specificity weight, calculated as the product of (1) the mean expression of the ligand in a cell type divided by the sum of the mean expression of the ligand across all cell types in the dataset and (2) the mean expression of the receptor in a cell type divided by the sum of the mean expression of the receptor across all cell types in the dataset:
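$$\mathrm{edge}(\text{cell-type}_1 \rightarrow \text{cell-type}_2)^{\text{specificity}}_{\text{ligand}_1\text{-receptor}_1} = \text{cell-type}_1^{\,\text{mean ligand}_1} \times \Big(\sum \text{cell-type}^{\,\text{mean ligand}_1}\Big)^{-1} \times \text{cell-type}_2^{\,\text{mean receptor}_1} \times \Big(\sum \text{cell-type}^{\,\text{mean receptor}_1}\Big)^{-1}$$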
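As a quick check on the two formulas, here is a small sketch that computes both weights for a single edge. The expression table, the placeholder gene names LigA/RecA and the cell-type names are all hypothetical, and the function names are mine rather than NATMI's.

```python
# Hypothetical mean expression of one ligand and one receptor per cell type.
# Gene names, cell types and values are illustrative placeholders.
mean_expr = {
    "LigA": {"basal": 8.0, "fibroblast": 2.0, "T cell": 0.0},
    "RecA": {"basal": 1.0, "fibroblast": 6.0, "T cell": 3.0},
}

def specificity(gene, cell_type):
    """Mean expression in one cell type divided by the sum of the
    mean expression of that gene across all cell types."""
    total = sum(mean_expr[gene].values())
    return mean_expr[gene][cell_type] / total if total > 0 else 0.0

def edge_weights(ligand, receptor, sender, target):
    """Return (mean-expression weight, specificity weight) for one edge."""
    mean_w = mean_expr[ligand][sender] * mean_expr[receptor][target]
    spec_w = specificity(ligand, sender) * specificity(receptor, target)
    return mean_w, spec_w

mean_w, spec_w = edge_weights("LigA", "RecA", "basal", "fibroblast")
print(mean_w, spec_w)
```

With these numbers the ligand specificity is 8/10 and the receptor specificity is 6/10, so the specificity weight is about 0.48 while the mean-expression weight is 48. Scaling every expression value by the same factor changes the mean-expression weight but leaves the specificity weight untouched, which is why the two rankings can disagree.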
(3) Cell-connectivity-summary-network edge weights.
To summarise cell-to-cell connectivity within the network, NATMI generates a matrix of cell-connectivity-summary-network edges. These can be weighted by edge-count or summed expression and specificity. Using the ligand–receptor weights described above users can generate edge-count based summaries that simply count the number of ligand–receptor pairs, from cell-type1 to cell-type2, that pass a set of user-defined thresholds. For example, count all pairs observed at a detectionThreshold of 20%, an expressionThreshold of 10 CPM, or with a specificityThreshold of 0.1.
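An edge-count summary of this kind might look roughly like the sketch below. Several things here are my assumptions rather than anything stated in the paper: the record layout, the field names, the rule that both the ligand and the receptor must pass a threshold, and the use of `>=` for the comparison. The threshold values mirror the detectionThreshold and expressionThreshold examples quoted above.

```python
from collections import Counter

# Hypothetical edge records; every name and number is made up for illustration.
# lig_frac / rec_frac: fraction of cells in the sender / target expressing the gene.
# lig_expr / rec_expr: mean expression (e.g. in CPM) in the sender / target.
edges = [
    {"sender": "A", "target": "B", "lig_frac": 0.50, "rec_frac": 0.30, "lig_expr": 40.0, "rec_expr": 25.0},
    {"sender": "A", "target": "B", "lig_frac": 0.10, "rec_frac": 0.60, "lig_expr": 30.0, "rec_expr": 50.0},
    {"sender": "A", "target": "B", "lig_frac": 0.40, "rec_frac": 0.45, "lig_expr": 5.0, "rec_expr": 80.0},
    {"sender": "B", "target": "A", "lig_frac": 0.25, "rec_frac": 0.50, "lig_expr": 12.0, "rec_expr": 11.0},
]

def edge_count_summary(edges, detection_threshold=0.0, expression_threshold=0.0):
    """Count ligand-receptor pairs per (sender, target) that pass the thresholds.
    Assumption: both the ligand and the receptor must pass each threshold."""
    counts = Counter()
    for e in edges:
        detected = min(e["lig_frac"], e["rec_frac"]) >= detection_threshold
        expressed = min(e["lig_expr"], e["rec_expr"]) >= expression_threshold
        if detected and expressed:
            counts[(e["sender"], e["target"])] += 1
    return counts

by_detection = edge_count_summary(edges, detection_threshold=0.20)
by_expression = edge_count_summary(edges, expression_threshold=10.0)
print(dict(by_detection))
print(dict(by_expression))
```

The two filters keep different edges from A to B even though the counts happen to match: the detection filter drops the pair whose ligand is seen in only 10% of cells, while the expression filter drops the pair whose ligand is expressed at 5 CPM.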
That wraps up the introduction to the software. Its GitHub repository is here: NATMI. Give it a try; personally I think it works quite well.
Stay angry, and make Wang Duoyu go broke.