Cell-cell communication tools for scRNA-seq, part 1: CellphoneDB

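First, a record of my painful installation experience, in the hope that others can avoid the same pitfalls.
At first, every time I installed it, whether with conda or pip, the installed CellphoneDB (4.0.0) could not be invoked from the shell. I assumed my environment variables were wrong and spent some time resetting them repeatedly, but it still did not work. After some searching I read that the Python version also matters, so I installed Python 3.8. This time cellphonedb could be imported in the Python interpreter and reported version 4.0.0, but it still could not be used from the shell (and most current tutorials run it from the shell), so I struggled some more, again without result. Only later did I stumble on a note on the official site saying that CellphoneDB 4.0.0 apparently can only be used from within Python. (Every attempt gave me 4.0.0, so perhaps the dependencies already in my environment were only compatible with 4.0.0, which would explain why lower versions could not be installed.) Back to the topic: let's try out this cell-cell communication tool!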

I. Installation: in a terminal, first create a new environment with a Python version suited to CellphoneDB

conda create -n cpdb python=3.8

conda activate cpdb  # activate the environment

python3.8 -m pip install cellphonedb -i https://pypi.tuna.tsinghua.edu.cn/simple

Start the Python 3.8 interpreter:

python3.8

II. Test: check whether the installation succeeded

import cellphonedb
cellphonedb.version  #4.0.0
from cellphonedb.src.core.methods import cpdb_statistical_analysis_method
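
Since the trouble described above came down to the Python version and to which package version actually got installed, a small check script can save some guessing. The following is a minimal sketch using only the standard library; the helper names installed_version and python_is_at_least are my own, not part of CellphoneDB.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_at_least(major, minor):
    """True when the running interpreter is at least major.minor."""
    return sys.version_info[:2] >= (major, minor)


if __name__ == "__main__":
    print("Python >= 3.8:", python_is_at_least(3, 8))
    print("cellphonedb version:", installed_version("cellphonedb"))
```

If the second line prints None, the package is not visible to the interpreter you are running, which usually means the wrong environment is active.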

III. Usage:

1. Data preparation: preprocess the data in R beforehand

pbmc <- readRDS("pbmc.rds")  # read in the pbmc3k example data that I had already processed

counts <- as.matrix(pbmc@assays$RNA@data)
write.table(counts, './example_data/cellphonedb_count.txt', sep = '\t', quote = FALSE)  # write.table() has no header argument; column names are written by default

meta_data <- cbind(rownames(pbmc@meta.data), pbmc@meta.data[,'cell_type', drop=F]) 
meta_data <- as.matrix(meta_data)
meta_data[is.na(meta_data)] = "Unkown"  # the cell type must not be empty
write.table(meta_data,'./example_data/cellphonedb_meta.txt', sep='\t', quote=F, row.names=F)
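
Before running the analysis it can be worth confirming that every cell in the meta file also appears in the counts file, and vice versa. The sketch below assumes the layout produced by the R code above (genes as rows and cells as columns in the counts table, cell name in the first column of the meta table); the function names and the tiny example tables are illustrative only.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of non-empty rows (lists of strings)."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def cells_in_counts(counts_rows):
    """Return the cell names from the header row of a genes-by-cells table.

    R's write.table() omits the header field above the row names, so the header
    can be one field shorter than the data rows; both layouts are handled.
    """
    header = counts_rows[0]
    first_data_row = counts_rows[1]
    if len(header) == len(first_data_row):
        return header[1:]  # the first field labels the gene column
    return header          # the header lists cells only


def unmatched_cells(counts_text, meta_text):
    """Return (cells only in counts, cells only in meta), each sorted."""
    count_cells = set(cells_in_counts(read_tsv(counts_text)))
    meta_rows = read_tsv(meta_text)[1:]  # skip the meta header row
    meta_cells = {row[0] for row in meta_rows}
    return sorted(count_cells - meta_cells), sorted(meta_cells - count_cells)


# Made-up miniature inputs, for illustration only.
counts_example = "cellA\tcellB\nCD3D\t1.2\t0\nLYZ\t0\t2.5\n"
meta_example = "Cell\tcell_type\ncellA\tT\ncellB\tMono\ncellC\tNK\n"
print(unmatched_cells(counts_example, meta_example))
```

For real files, read the text with open(path).read() and pass it in; two empty lists mean the cell names line up.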

2. Back in python3.8, infer cell-cell communication with the statistical analysis method

import cellphonedb

cpdb_file_path = './example_data/cellphonedb.zip'
test_meta_file_path = './example_data/cellphonedb_meta.txt'
test_counts_file_path = './example_data/cellphonedb_count.txt'

from cellphonedb.src.core.methods import cpdb_statistical_analysis_method

deconvoluted, means, pvalues, significant_means = cpdb_statistical_analysis_method.call(
  cpdb_file_path = cpdb_file_path,
  meta_file_path = test_meta_file_path,
  counts_file_path = test_counts_file_path,
  counts_data = 'hgnc_symbol',  # type of gene identifier, one of [ensembl | gene_name | hgnc_symbol]
  iterations = 100,
  threshold = 0.1,
  threads = 6,
  output_suffix = "pbmc",
  output_path = "./example_data/pbmc_out_path")
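
For intuition about what iterations controls in the call above: as I understand it, the statistical method compares each observed ligand-receptor score with scores obtained after randomly shuffling the cell-type labels, and the p-value is the fraction of shuffles that score at least as high. The toy sketch below shows that general idea only. It is not CellphoneDB's implementation, and the function names, the scoring formula, and the numbers are all illustrative assumptions.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def interaction_score(ligand, receptor, labels, sender, receiver):
    """Average of mean ligand expression in `sender` and mean receptor expression in `receiver`."""
    lig = [x for x, lab in zip(ligand, labels) if lab == sender]
    rec = [x for x, lab in zip(receptor, labels) if lab == receiver]
    return (mean(lig) + mean(rec)) / 2


def permutation_pvalue(ligand, receptor, labels, sender, receiver,
                       iterations=100, seed=0):
    """Fraction of label shuffles whose score is at least the observed score."""
    rng = random.Random(seed)
    observed = interaction_score(ligand, receptor, labels, sender, receiver)
    shuffled = list(labels)  # shuffling keeps the number of cells per type fixed
    at_least = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        if interaction_score(ligand, receptor, shuffled, sender, receiver) >= observed:
            at_least += 1
    return at_least / iterations


# Made-up expression values: ligand high in T cells, receptor high in monocytes.
labels = ["T"] * 5 + ["Mono"] * 5
ligand = [5, 5, 5, 5, 5, 0, 0, 0, 0, 0]
receptor = [0, 0, 0, 0, 0, 4, 4, 4, 4, 4]
print(permutation_pvalue(ligand, receptor, labels, "T", "Mono"))
```

With only 100 shuffles the smallest non-zero p-value is 0.01, which is why a larger iterations value gives finer resolution at the cost of run time.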
The output files are written to the directory given by output_path.
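
To take a quick look at the p-value table without leaving Python, something like the following could be used. It is a sketch under assumptions: that the table is tab-separated, has an interacting_pair column, and has one column per cell-type pair named like T|Mono. The function name and the miniature table are made up for illustration.

```python
import csv
import io


def significant_pairs(pvalues_text, cell_pair, alpha=0.05):
    """Return interacting pairs whose p-value in column `cell_pair` is below alpha."""
    reader = csv.DictReader(io.StringIO(pvalues_text), delimiter="\t")
    hits = []
    for row in reader:
        if float(row[cell_pair]) < alpha:
            hits.append(row["interacting_pair"])
    return hits


# Made-up miniature table in the assumed layout, for illustration only.
example = (
    "interacting_pair\tT|Mono\tMono|T\n"
    "CCL5_CCR1\t0.01\t1.0\n"
    "CXCL10_CXCR3\t0.2\t0.03\n"
)
print(significant_pairs(example, "T|Mono"))
```

For the real output, pass in the text of the pvalues file from the output directory and the name of a cell-pair column that exists in it.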

IV. Visualizing the results in R (mainly network plots and dot plots; the Python version is still under development)

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
devtools::install_github('zktuong/ktplots', dependencies = TRUE)
setwd('./example_data/pbmc_out_path')
library(Seurat)
library(dplyr)
library(psych)
library(qgraph)
library(igraph)
library(tidyverse)
library(ktplots)
library(SingleCellExperiment)
library(reticulate)
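data(cpdb_output)  # example dataset provided with ktplots
pbmc <- readRDS("./example_data/pbmc.rds")  # note: setwd() above changed the working directory, so adjust this path to where pbmc.rds actually lives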
counts <- as.matrix(pbmc@assays$RNA@data)
meta_data <- cbind(rownames(pbmc@meta.data), pbmc@meta.data[,'cell_type', drop=F])  
meta_data <- as.matrix(meta_data)
meta_data[is.na(meta_data)] = "Unkown"

sce <- SingleCellExperiment(assays = list(counts = counts),colData = meta_data)

pvals <- read.delim("statistical_analysis_pvalues_pbmc.txt", check.names = FALSE)
means <- read.delim("statistical_analysis_means_pbmc.txt", check.names = FALSE)
# cell_type1 and cell_type2 specify the interacting cell types; if they are not specified, all cells are used by default. split.by is for grouped data.
plot_cpdb(cell_type1 = 'T', cell_type2 = 'Mono',scdata = sce,
          idents = 'cell_type', # column name where the cell ids are located in the metadata
          #split.by = 'Experiment', # column name where the grouping column is. Optional.
          means = means, pvals = pvals,
          genes = c("XCR1", "CXCL10", "CCL5")) +
  small_axis(fontsize = 3) + small_grid() + small_guide() + small_legend(fontsize = 2) # some helper functions included in ktplots to help with the plotting
plot_cpdb(cell_type1 = 'DC', cell_type2 = '', scdata = sce,
          idents = 'cell_type', means = means, pvals = pvals,
          gene.family = 'chemokines',highlight = 'blue',keep_significant = F, noir = TRUE) + 
  small_guide() + small_axis() + small_legend(keysize=.5)

Drawing circle (chord) plots with plot_cpdb2:

deconvoluted <- read.delim('statistical_analysis_deconvoluted_pbmc.txt', check.names = FALSE)
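interaction_annotation <- read.delim()  # file path left blank: the website would not open, so I could not download this file; it can instead be loaded earlier via data(cpdb_output)
The example data can also be loaded with data(cpdb_output2); here I still use pbmc3k: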
plot_cpdb2(cell_type1 = 'DC', cell_type2 = 'NK',
           scdata = sce,
           idents = 'cell_type', # column name where the cell ids are located in the metadata
           means = means,
           pvals = pvals,
           deconvoluted = deconvoluted, # new options from here on specific to plot_cpdb2
           desiredInteractions = list(
             c('DC', 'NK'),
             c('NK', 'DC')),
           interaction_grouping = interaction_annotation,
           edge_group_colors = c(
             "Activating" = "#e15759",
             "Chemotaxis" = "#59a14f",
             "Inhibitory" = "#4e79a7",
             "Intracellular trafficking" = "#9c755f",
             "DC_development" = "#B07aa1",
             "Unknown" = "#e7e7e7"
           ),
           node_group_colors = c(
             "DC" = "red",
             "NK" = "blue"),
           keep_significant_only = TRUE,
           standard_scale = TRUE,
           remove_self = TRUE)
plot_cpdb2(cell_type1 = "B|DC|NK", # same usage style as plot_cpdb
           cell_type2 = "CD4 T",
           idents = 'cell_type',
           #split.by = 'treatment_group_1',
           scdata = sce,
           means = means,
           pvals = pvals,
           deconvoluted = deconvoluted, # new options from here on specific to plot_cpdb2
           gene_symbol_mapping = 'index', # column name in rowData holding the actual gene symbols if the row names is ENSG Ids. Might be a bit buggy
           desiredInteractions = list(c('B', 'Naive CD4 T'), c('B', 'Memory CD4 T'), c('DC', 'Naive CD4 T'), c('DC', 'Memory CD4 T'), c('NK', 'Naive CD4 T'), c('NK', 'Memory CD4 T')),
           interaction_grouping = interaction_annotation,
           edge_group_colors = c("Activating" = "#e15759", "Chemotaxis" = "#59a14f", "Inhibitory" = "#4e79a7", "Intracellular trafficking" = "#9c755f", "DC_development" = "#B07aa1"),
           node_group_colors = c("B" = "#86bc86", "DC" = "#79706e", "NK" = "#ff7f0e", 'Naive CD4 T' = "#bcbd22"  ,'Memory CD4 T' = "#17becf"),
           keep_significant_only = TRUE,
           standard_scale = TRUE,
           remove_self = TRUE)

plot_cpdb3

plot_cpdb3(cell_type1 = 'T', cell_type2 = 'Mono',
    scdata = sce,
    idents = 'cell_type', # column name where the cell ids are located in the metadata
    means = means,
    pvals = pvals,
    deconvoluted = deconvoluted,
    keep_significant_only = TRUE,
    standard_scale = TRUE,
    remove_self = TRUE)

plot_cpdb4

# The plots drawn from the pbmc data above do not look great, so from here on I load the example data bundled with the package for the demonstration
data(kidneyimmune)
data(cpdb_output2)
plot_cpdb4(
    interaction = 'CLEC2D-KLRB1',
    cell_type1 = 'NK', cell_type2 = 'Mast',
    scdata = kidneyimmune,
    idents = 'celltype',
    means = means2,
    pvals = pvals2,
    deconvoluted = decon2,
    keep_significant_only = TRUE,
    standard_scale = TRUE)
plot_cpdb4(
        interaction = c('CLEC2D-KLRB1', 'CD40-CD40LG'),
        cell_type1 = 'NK|B', cell_type2 = 'Mast|CD4T',
        scdata = kidneyimmune,
        idents = 'celltype',
        means = means2,
        pvals = pvals2,
        deconvoluted = decon2,
        desiredInteractions = list(
            c('NK cell', 'Mast cell'),
            c('NK cell', 'NKT cell'),
            c('NKT cell', 'Mast cell'),
            c('B cell', 'CD4T cell')),
        keep_significant_only = TRUE)
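
The cell_type1 and cell_type2 arguments in these calls look like regular expressions ('NK|B', 'B|DC|NK', and an empty string for "everything"). Assuming they are matched as patterns against the cell-type names, which I have not verified in the ktplots source, the Python sketch below shows how such matching would behave and why a short pattern can select more cell types than intended. The function name is my own.

```python
import re


def matching_pairs(cell_types, pattern1, pattern2):
    """Return ordered pairs where one side matches pattern1 and the other pattern2."""
    first = [c for c in cell_types if re.search(pattern1, c)]
    second = [c for c in cell_types if re.search(pattern2, c)]
    pairs = []
    for a in first:
        for b in second:
            for pair in ((a, b), (b, a)):  # keep both directions
                if pair not in pairs:
                    pairs.append(pair)
    return pairs


cell_types = ["NK cell", "NKT cell", "B cell", "Mast cell", "CD4T cell"]
for pair in matching_pairs(cell_types, "NK|B", "Mast|CD4T"):
    print(pair)
```

Note that the pattern 'NK' also matches 'NKT cell', so an explicit list such as desiredInteractions is a reasonable way to narrow the selection down to the pairs you actually want.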

That is all for this beginner's introduction to CellphoneDB. If you spot any mistakes, please point them out. Thank you!
