10X單細胞之細胞通訊篇章-----Connectome

今天我們來分享一篇關于細胞通訊的文章,其實關于細胞通訊我分享的就已經很多了,不知道大家對10X單細胞做細胞通訊有了怎樣的見解,今天我們來分享一篇新的細胞通訊的分析文章,文章在Connectome: computation and visualization of cell-cell signaling topologies in single-cell systems data,還是按照我們原來的方法,先分享文章內容,再來示例參考代碼。

Abstract

Single-cell RNA-sequencing data can revolutionize our understanding of the patterns of cell-cell and ligand-receptor connectivity that influence the function of tissues and organs.(這個是常識,但是大家應該知道,真正發揮通訊作用的其實是蛋白分子)。However,the quantification and visualization of these patterns are major computational and epistemological challenges.(這個軟件的可視化做的相當不錯)。Here, we present Connectome, a software package for R which facilitates rapid calculation, and interactive exploration, of cell-cell signaling network topologies contained in single-cell RNA-sequencing data. Connectome can be used with any reference set of known ligand-receptor mechanisms. It has built-in functionality to facilitate differential and comparative connectomics, in which complete mechanistic networks are quantitatively compared between systems. Connectome includes computational and graphical tools designed to analyze and explore cell-cell connectivity patterns across disparate(不同的) single-cell datasets. We present approaches to quantify these topologies and discuss some of the biologic theory leading to their design.(日常自夸自己的軟件

Introduction

Cell-to-cell communication is a major driver of cell differentiation and physiological function determining tissue/organ development, homeostasis, and response to injury. Within tissues, cells have local neighbors with whom they directly communicate via paracrine signaling and direct cell-cell contact(這個地方涉及到細胞臨近通訊,光有單細胞數據是無法做到的,需要借助10X空間轉錄組技術), and long-range or mobile partners with whom they exchange information via endocrine signaling(通過內分泌信號與他們交換信息的遠程或移動partner)。In solid tissues, cell types have specific cellular niches incorporating localized matrix and signaling environments, which facilitate phenotypic maintenance and support specialized cell functions。(看這里的分析主要還是看重細胞通訊和細胞位置的相對關系)。Circulating immune cells use an extensive library of chemokines to coordinate multicellular system responses to threat or injury.(免疫細胞相關性的信號具有流動性)。

接下來做了一些配受體分析的分析點,之前全部分享過,這里不再贅述。
然后介紹了文章開發的軟件,Connectome,以及一些好用的方法。

Defining the data structure of tissue connectomics

The connectomic mapping discussed here treats every cell parcellation (e.g., different cell types, phenotypically distinct cell states of the same cell type, etc.) as a single node, averaging ligand and receptor values across a given cell parcellation to yield mean values which are then linked(仍然是以平均值作為配受體的強度)。Mean-wise connectomics has the advantage of accommodating the zero-values intrinsic to single-cell data,while simplifying the system so that every cell parcellation is represented by
a single, canonical node.(細胞類型的整體代表了細胞,容納了部分細胞不表達配受體的情況)。However, as this approach will blend the effects of all cellular archetypes within a cluster, initial cell parcellation must be done carefully for the resulting connectomic networks to be biologically meaningful.(同一cluster的不同來源)。
An edgeweight must be defined for each edge in the celltype-celltype connectomic dataset. Connectome, by default, calculates two distinct edgeweights, each of which captures biologically relevant information. The first edgeweight, also referred to as ‘w1’ or ‘weightnorm,’ is defined as a function (by default, the product) of the celltype-wise normalized expression for the ligand and the receptor, or

圖片.png

這里需要注意的是,矩陣是進行均一化的,均一化后的平均值)。edges are compared across tissue conditions.However, Weightnorm is encumbered by the fact that it ignores the cell-type specificity of many ligands and receptors.Weightnnorm will weigh a highly expressed but non-specific cell-cell link as stronger than a highly specific but lowlyexpressed gene.(均一化矩陣使用的缺點,忽視了細胞類型的配受體特異性,在之前的一篇文章中我提出了一個問題,文章在10X單細胞通訊分析之ICELLNET,對于最后的問題,不知道大家心里有沒有答案)。This does not align with biologic intuition for many cell-cell signaling mechanisms, nor with the goal of parsing out rare cell types that may have outsized biological relevance.(這與許多細胞-細胞信號傳導機制的生物學直覺不符,也與分析可能具有過大生物學相關性的稀有細胞類型的目的不符。)Therefore, Connectome also, by default, calculates a second edgeweight, alternatively referred to as ‘w2’ or ‘weightscale,’ which is defined as a function
圖片.png

single tissue system. It also can be used to compare cellular prominence within signaling networks across disparate tissue systems, or across species, in which normalized values may vary but their relative expressions across cell types do not(這里重點標注了,知道什么意思吧)。
Alternative edgeweights, such as those incorporating a score for likelihood of downstream signal transduction or those which take into account proximity information from spatial transcriptomics or histology-registered techniques(這里也需要注意),may reveal additional patterns. For the purposes of this software program, Connectome defaults to the above edgeweights definitions because we have found them to be effective for understanding biological questions addressed in our studies.(看來作者更傾向于scale矩陣的下游分析)。

Statistical significance depends upon the question being asked, the test being applied, and the threshold for important differences, and has a very different meaning depending on whether a research team is investigating a single tissue, comparing multiple tissues, or studying a physiologic process. However, we find two general statistical patterns to be of key importance, and we have built-in functionality to Connectome to help the researcher narrow in on these patterns(分析模式,我們來看一下):
(1)單樣本分析First, when studying a single tissue system (i.e., one organ in one condition), it is often desirable to focus on those ligand-receptor interactions which are highly associated with (markers for) specific celltype-celltype vectors. While such a pattern is not a guarantee of biological relevance, it is often useful to learn those mechanisms which can mark celltypecelltype vectors in the same ways that transcripts can mark celltypes。
(2)多樣本分析Second, when comparing two tissue systems (i.e., one organ in two experimental conditions) it is generally interesting to focus on those mechanisms which, regardless of their association with a specific celltypecelltype vector, are differentially regulated across condition。(多樣本對比分析細胞通訊差異的問題不知道大家有沒有什么深入的認識,多樣本分析是需要不需要整合也是值得考慮的問題

可視化Visualizing the connectomic signature of a single tissue(注釋在圖的后面)

圖片.png

圖片.png

看一下分析過程:
(1)the tissue data are parcellated into defined cell types using a standard clustering workflow.細胞定義,這個是無論如何都逃脫不了的。
(2)Then, a mapping is created against a ground-truth database of known ligand-receptor mechanisms.(匹配已知的配受體對,大部分軟件都是這么做的)。This mapping yields a large edgelist, wherein nodes are defined as cell types and edge attributes contain quantitative information derived from cell type-specific expression levels.Source nodes denote(表示) the ligands, and target nodes refer to the cognate(同源的) receptors.
(3)Conceptually, the data architecture for a single edge attribute (one column in the above discussed edgelist data frame in Figure 1A) can be thought of as a 3D matrix (Figure 1B), where rows are source (sending) cells, columns are target (receiving) cells, and the z-axis is the full list of ground-truth known ligand-receptor mechanisms.(這是對圖的解釋說明),This allows clear visualization of the data that needs to be subset in order to explore a single interactome (red), outgoing network(purple), niche network (blue), or cell-cell vector (green).劃重點了啊)。
我們來分享一下圖一的代碼

library(Seurat)
library(SeuratData)
library(Connectome)
library(ggplot2)
library(cowplot)
####Load Pancreas Data (Used in Fig 1 & 2) ####
data('panc8')
table(Idents(panc8))
Idents(panc8) <- panc8[['celltype']]
table(Idents(panc8))
#Make Connectome
panc8 <- NormalizeData(panc8)
connectome.genes <-
union(Connectome::ncomms8866_human$Ligand.ApprovedSymbol,Connectome::ncomms8866_human$Receptor.A
pprovedSymbol)
genes <- connectome.genes[connectome.genes %in% rownames(panc8)]
panc8 <- ScaleData(panc8,features = genes)
panc8.con <- CreateConnectome(panc8,species = 'human',min.cells.per.ident = 75,p.values = T,calculate.DOR = F)
#Embed
pancreas.list <- SplitObject(panc8, split.by = "tech")
pancreas.list <- pancreas.list[c("celseq", "celseq2", "fluidigmc1", "smartseq2")]
for (i in 1:length(pancreas.list)) {
pancreas.list[[i]] <- NormalizeData(pancreas.list[[i]], verbose = FALSE)
pancreas.list[[i]] <- FindVariableFeatures(pancreas.list[[i]], selection.method = "vst",
nfeatures = 2000, verbose = FALSE)
}
reference.list <- pancreas.list[c("celseq", "celseq2", "smartseq2")]
pancreas.anchors <- FindIntegrationAnchors(object.list = reference.list, dims = 1:30)
pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, dims = 1:30)
DefaultAssay(pancreas.integrated) <- "integrated"
pancreas.integrated <- ScaleData(pancreas.integrated, verbose = FALSE)
pancreas.integrated <- RunPCA(pancreas.integrated, npcs = 30, verbose = FALSE)
pancreas.integrated <- RunUMAP(pancreas.integrated, reduction = "pca", dims = 1:30)
#Save
DefaultAssay(pancreas.integrated) <- "RNA"
save(pancreas.integrated,file = 'pancreas.integrated.Robj')
load("~/Box Sync/Connectome Paper/Figures/pancreas.integrated.Robj")
#### Plotting Figure 1 #####
#UMAP
png('panc8_UMAP.png',width = 5.5,height = 4,units = 'in',res = 300)
DimPlot(pancreas.integrated,reduction = 'umap',group.by = "celltype")
dev.off()
#colors~
gg_color_hue <- function(n) {
hues = seq(15, 375, length = n + 1)
hcl(h = hues, l = 65, c = 100)[1:n]
}
cell.types <- sort(as.character(names(table(Idents(panc8)))))
cols.use <- gg_color_hue(length(cell.types))
names(cols.use) <- cell.types
# x1 Interactome
png('interactome_circos.png',width = 7,height = 4.5,units = 'in',res = 300)
CircosPlot(panc8.con,
mechanisms.include = c('DLL4 - NOTCH3'),
weight.attribute = 'weight_norm',
min.z = NULL,
min.pct = 0.1,
lab.cex = 0.01, max.p = 0.05,
gap.degree = 5,
cols.use = cols.use,
title = 'Single Interactome (DLL4 -> NOTCH3)')
dev.off()
# x1 Vector
png('vector_circos.png',width = 7,height = 4.5,units = 'in',res = 300)
CircosPlot(panc8.con,
sources.include = 'endothelial',
targets.include = 'activated_stellate',
min.z = 0,
min.pct = 0.1,
lab.cex = 0.3, max.p = 0.05,
weight.attribute = 'weight_sc',
cols.use = cols.use,
title = 'Single Vector (Endothelial -> Activated Stellate)')
dev.off()
# x1 Niche
png('niche_circos.png',width = 7,height = 4.5,units = 'in',res = 300)
CircosPlot(panc8.con,
targets.include = 'activated_stellate',
min.pct = 0.1,
lab.cex = 0.3,
min.z = 1,max.p = 0.05,
weight.attribute = 'weight_sc',
cols.use = cols.use,
title = 'Single Niche (Activated Stellate)')
dev.off()

可視化2Tissue network centrality analysis(組織網絡集中度分析)

A major goal of tissue science and cell biology is to understand the roles that individual cells types play and their ability to affect other cell types.It is of interest, therefore, to rank cell types based on their ability to produce or receive information within a given signaling network, or signaling family(細胞之間的通訊網絡)。如下圖:


圖片.png

圖片.png

這個可視化做的還是相當到位的。

In this analysis, the total connectome for a single tissue is first subset down to only those edges belonging to a single signaling family。
(1)This weighted graph is then used to calculate two centrality metrics: the Kleinberg hub and Kleinberg authority scores (這個地方如果大家感興趣可以查一下)(represented as the dot size), and the cumulative directed edgeweight (x-axis), for each cell type within the network.(這個圖就是上面的圖B,x軸的大小可以理解為某個配受體家族的權重,點的大小代表得分,點越大,下周越大,代表這個信號通路主要由該細胞類型“發射”(“接收))。The result is a ranking of outgoing and incoming centrality for every cell type, across every signaling family, for each target cell type within the tissue.

看看圖二的代碼

# Big connectome
png('Big_Connectome_v2.png',width = 10*.82,height = 8*.82,units = 'in',res = 300)
CircosPlot(panc8.con,min.pct = 0.2,max.p = 0.05,lab.cex = 0.01,gap.degree = 0.01,cols.use = cols.use)
dev.off()
png('Single_mode_connectome.png',width = 10*.8,height = 8*.8,units = 'in',res = 300)
CircosPlot(panc8.con,modes.include = 'VEGF',min.pct = 0.05,max.p = 0.05,lab.cex = 0.01,gap.degree =
0.01,cols.use = cols.use)
dev.off()
# Centrality categorization
Centrality(panc8.con,weight.attribute = 'weight_norm',cols.use = cols.use,min.pct = 0.05,max.p = 0.05)
epi.auth <- c('ADAM','Transferrin','MET','TNF','Ephrins','NEGF','MMP','Laminins','FGF','EGF','Intracellular
trafficking','Matrix glycoproteins','UNCAT','Neurotrophins','APOE')
ec.auth <- c('Interleukins','CXCL','VEGF','KIT','Collagens','SLIT','Semaphorins','TGFB','Fibronectin','IGF','WNT','Matrix
(assorted)','NOTCH','ANGPT','BMP')
mes.auth <- c('Vasoactive','Lysophosphatidic acid','PDGF','Protease inhibition')
imm.auth <- c('RARRES','MHC','Cell-cell adhesion','CC','CSF','Complement','TLR','Chemotaxis')
pdf('Centrality.Mes.pdf',width = 12,height = 3)
Centrality(panc8.con,weight.attribute = 'weight_norm',cols.use = cols.use,
modes.include = mes.auth,min.pct = 0.05,max.p = 0.05)
dev.off()
pdf('Centrality.Epi.pdf',width = 12,height = 4)
Centrality(panc8.con,weight.attribute = 'weight_norm',cols.use = cols.use,
modes.include = epi.auth,min.pct = 0.05,max.p = 0.05)
dev.off()
pdf('Centrality.Imm.pdf',width = 12,height = 4)
Centrality(panc8.con,weight.attribute = 'weight_norm',cols.use = cols.use,
modes.include = imm.auth,min.pct = 0.05,max.p = 0.05)
dev.off()
pdf('Centrality.Endo.pdf',width = 12,height = 5)
Centrality(panc8.con,weight.attribute = 'weight_norm',cols.use = cols.use,
modes.include = ec.auth,min.pct = 0.05,max.p = 0.05)
dev.off()

這個地方我們要重點看一下,首先配受體進行了家族的分類,這個很重要,不是所有的軟件都有這個功能,更多時候需要我們的先驗知識去實現這個能力。還有就是這樣的可是畫相當直觀,直接可以看某個配受體家族的主要sender和receiver。。希望引起大家的重視

可視化3 Differential connectomics with aligned nodes

It is common in systems biology to examine changes in cell-cell signaling between two systems,(信號通路的強度變化,這個我多次強調過)。

圖片.png

圖片.png

In one such example, if the same cell types are present in both systems(我們可以理解為多個樣本), we may calculate direct, one-to-one comparison of all edges using the Connectome package.
while conversely, when both the ligand and receptor are downregulated, we may think of the edge as ‘deactivated’. The two alternate cases, when a ligand is upregulated and a receptor down, or vice-versa, are more complicated to interpret. Although such patterns may initially suggest ligand-pressure or ligand-starvation, in which the downstream cell changes its character in response to upstream influence, it is important to note that in most tissue systems there are many cells interacting at once, and that there are likely multi-cell feedback loops in play which confound easy interpretation of such patterns.(四種情況)。
看一下這部分的代碼

InstallData('ifnb')
data('ifnb')
table(Idents(ifnb))
Idents(ifnb) <- ifnb[['seurat_annotations']]
table(Idents(ifnb))
# Make differential connectomes:
# First identify ligands and receptors which have mapped in the dataset:
connectome.genes <-
union(Connectome::ncomms8866_human$Ligand.ApprovedSymbol,Connectome::ncomms8866_human$Receptor.A
pprovedSymbol)
genes <- connectome.genes[connectome.genes %in% rownames(ifnb)]
# Split the object by condition:
ifnb.list <- SplitObject(ifnb,split.by = 'stim')
# Normalize, Scale, and create Connectome:
ifnb.con.list <- list()
for (i in 1:length(ifnb.list)){
ifnb.list[[i]] <- NormalizeData(ifnb.list[[i]])
ifnb.list[[i]] <- ScaleData(ifnb.list[[i]],features = rownames(ifnb.list[[i]]))
ifnb.con.list[[i]] <- CreateConnectome(ifnb.list[[i]],species = 'human',p.values = F)
}
names(ifnb.con.list) <- names(ifnb.list)
diff <- DifferentialConnectome(ifnb.con.list[[1]],ifnb.con.list[[2]])
# Stat sig:
# Stash idents and make new identities which identify each as stimulated vs. control
celltypes <- as.character(unique(Idents(ifnb)))
celltypes.stim <- paste(celltypes, 'STIM', sep = '_')
celltypes.ctrl <- paste(celltypes, 'CTRL', sep = '_')
ifnb$celltype.condition <- paste(Idents(ifnb), ifnb$stim, sep = "_")
ifnb$celltype <- Idents(ifnb)
Idents(ifnb) <- "celltype.condition"
# Identify which ligands and receptors, for which cell populations, have an adjusted p-value < 0.05 based on a
Wilcoxon rank test
diff.p <- data.frame()
for (i in 1:length(celltypes)){
temp <- FindMarkers(ifnb,
ident.1 = celltypes.stim[i],
ident.2 = celltypes.ctrl[i],
verbose = FALSE,
features = genes,
min.pct = 0,
logfc.threshold = 0)
temp2 <- subset(temp, p_val_adj < 0.05)
if (nrow(temp2)>0){
temp3 <- data.frame(genes = rownames(temp2),cells = celltypes[i])
diff.p <- rbind(diff.p, temp3)
}
}
diff.p$cell.gene <- paste(diff.p$cells,diff.p$genes,sep = '.')
# Filter differential connectome to only include significantly perturbed edges
diff$source.ligand <- paste(diff$source,diff$ligand,sep = '.')
diff$target.receptor <- paste(diff$target,diff$receptor,sep = '.')
diff.sub <- subset(diff,source.ligand %in% diff.p$cell.gene & target.receptor %in% diff.p$cell.gene)
#### Plotting Figure 3 ####
pdf('diff.score.small.pdf',width = 12,height=3.25)
DifferentialScoringPlot(diff.sub,min.score = 1.5,min.pct = 0.1,infinity.to.max = T,aligned = F)
dev.off()
pdf('diff.score.big.pdf',width = 22 ,height=4.5)
DifferentialScoringPlot(diff.sub,min.score = 1.5,min.pct = 0.1,infinity.to.max = T,aligned = F)
dev.off()
pdf('diff.score.small.2.pdf',width = 12*0.9,height=3.25*0.9)
DifferentialScoringPlot(diff.sub,min.score = 2,min.pct = 0.2,infinity.to.max = T,aligned = F)
dev.off()
pdf('diff.score.big.2.pdf',width = 22*0.9 ,height=4.5*0.9)
DifferentialScoringPlot(diff.sub,min.score = 2,min.pct = 0.2,infinity.to.max = T,aligned = F)
dev.off()
## Ligand and receptor are both **UP**:
diff.up.up <- subset(diff.sub,ligand.norm.lfc > 0 & recept.norm.lfc > 0 )
diff.up.down <- subset(diff.sub,ligand.norm.lfc > 0 & recept.norm.lfc < 0 )
diff.down.up <- subset(diff.sub,ligand.norm.lfc < 0 & recept.norm.lfc > 0 )
diff.down.down <- subset(diff.sub,ligand.norm.lfc < 0 & recept.norm.lfc < 0 )
png('up.up.png',width = 8.5,height = 4.5,units = 'in',res = 300)
CircosDiff(diff.up.up,min.score = 2,min.pct = 0.1,lab.cex = 0.8)
dev.off()
png('up.down.png',width = 8.5,height = 4.5,units = 'in',res = 300)
CircosDiff(diff.up.down,min.score = 2,min.pct = 0.1,lab.cex = 0.8)
dev.off()
png('down.up.png',width = 8.5,height = 4.5,units = 'in',res = 300)
CircosDiff(diff.down.up,min.score = 2,min.pct = 0.1,lab.cex = 0.8)
dev.off()
png('down.down.png',width = 8.5,height = 4.5,units = 'in',res = 300)
CircosDiff(diff.down.down,min.score = 2,min.pct = 0.1,lab.cex = 0.8)
dev.off()

可視化真的做的相當不錯。,注意作者用的矩陣是經過scale的。

可視化4 分析動態的組織過程

圖片.png

這一部分的代碼,我希望大家可以自己寫出來。

那么問題來了,究竟我們做細胞通訊應該用均一化的矩陣呢?還是scale處理過的?心里有答案的同學不妨寫一下自己的見解吧!

生活很好,有你更好

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發布,文章內容僅代表作者本人觀點,簡書系信息發布平臺,僅提供信息存儲服務。
禁止轉載,如需轉載請通過簡信或評論聯系作者。

推薦閱讀更多精彩內容