在2019/08/07的Nature刊中,中科院景乃禾課題組發表了文章——Molecular architecture of lineage allocation and tissue organization in early mouse embryo ,我在這篇文章中發現了一個被湯神組 (就是Hemberg-lab單細胞轉錄組數據分析(二)- 實驗平臺中開辟了單細胞轉錄組領域的人)反復用到的R工具-SCENIC
,現在讓我們來一起看看該工具有何妙用!
SCENIC簡介
SCENIC
是一種同時重建基因調控網絡并從單細胞RNA-seq數據中鑒定stable cell states的工具。基于共表達和DNA模基序 (motif)分析推斷基因調控網絡 ,然后在每個細胞中分析網絡活性以鑒定細胞狀態。
SCENIC
發表于2017年的Nature method文章。具體見鏈接:
https://www.nature.com/articles/nmeth.4463
要求:當前版本的SCENIC支持人類,鼠和果蠅(Drosophila melanogaster)。
要將SCENIC應用于其他物種,需要手動調整第二步(例如使用新的RcisTarget
數據庫或使用不同的motif-enrichment-analysis
工具)。
輸入:SCENIC需要輸入的是單細胞RNA-seq表達矩陣
—— 每列對應于樣品(細胞),每行對應一個基因。基因ID應該是gene-symbol并存儲為rownames (尤其是基因名字部分是為了與RcisTarget
數據庫兼容);表達數據是Gene的reads count。根據作者的測試,提供原始的或Normalized UMI count,無論是否log轉換,或使用TPM值,結果相差不大。(Overall, SCENIC is quite robust to this choice, we have applied SCENIC to datasets using raw (logged) UMI counts, normalized UMI counts, and TPM and they all provided reliable results (see Aibar et al. (2017)).)
SCENIC
先安裝R,如果處理大樣本數據,則建議使用Python版 (https://pyscenic.readthedocs.io/en/latest/);
SCENIC在R中實現基于三個R包:
-
GENIE3:
推斷基因共表達網絡
-
RcisTarget:
用于分析轉錄因子結合motif
-
AUCell:
用于鑒定scRNA-seq數據中具有活性基因集(基因網絡)的細胞
運行SCENIC需要安裝這些軟件包以及一些額外的依賴包:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::version()
# If your bioconductor version is previous to 3.9, see the section bellow
## Required
BiocManager::install(c("AUCell", "RcisTarget"))
BiocManager::install(c("GENIE3")) # Optional. Can be replaced by GRNBoost
## Optional (but highly recommended):
# To score the network on cells (i.e. run AUCell):
BiocManager::install(c("zoo", "mixtools", "rbokeh"))
# For various visualizations and perform t-SNEs:
BiocManager::install(c("DT", "NMF", "pheatmap", "R2HTML", "Rtsne"))
# To support paralell execution (not available in Windows):
BiocManager::install(c("doMC", "doRNG"))
# To export/visualize in http://scope.aertslab.org
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("aertslab/SCopeLoomR", build_vignettes = TRUE)
Version: AUCell >=1.4.1 (minimum 1.2.4), RcisTarget>=1.2.0 (minimum 1.0.2) and GENIE3>=1.4.0(minimum 1.2.1).
安裝SCENIC:
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("aertslab/SCENIC")
packageVersion("SCENIC")
下載評分數據庫
除了必要的R包之外,需要下載RcisTarget
的物種特定數據庫(https://resources.aertslab.org/cistarget/
;主題排名)。默認情況下,SCENIC
使用在基因啟動子(TSS上游500 bp
)和TSS周圍 20 kb (+/- 10kb)
中對模序進行評分的數據庫。
- For human:
dbFiles <- c("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather",
"https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
# mc9nr: Motif collection version 9: 24k motifs
- For mouse:
dbFiles <- c("https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather",
"https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather")
# mc9nr: Motif collection version 9: 24k motifs
- For fly:
dbFiles <- c("https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather")
# mc8nr: Motif collection version 8: 20k motifs
注意:下載后最好確認下載的數據是否完整,可基于MD5值評估。(參考鏈接:https://resources.aertslab.org/cistarget/databases/sha256sum.txt)
完成這些設置步驟后,SCENIC
即可運行!
數據格式不同時如何讀入?
最終讀入的信息有兩個,一個是前面說的表達矩陣,還有一個是樣品分組信息。
- a) From
.loom
file
.loom文件可以通過SCopeLoomR
包直接導入SCENIC。(loom
格式是用于存儲非常大的組學數據集的專屬格式,具體見 http://linnarssonlab.org/loompy/)
## Download:download.file("http://loom.linnarssonlab.org/clone/Previously%20Published/Cortex.loom", "Cortex.loom")
loomPath <- "Cortex.loom"
- b) From 10X/CellRanger output files
10X/CellRanger
輸出結果可用作SCENIC的輸入文件 (需要先安裝Seurat
)。
# BiocManager::install("Seurat")
# 測試數據也可以從Seurat官網下載,自己修改路徑
# 測試數據:https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices#r-load-mat>
singleCellMatrix <- Seurat::Read10X(data.dir="data/pbmc3k/filtered_gene_bc_matrices/hg19/")
cellInfo <- data.frame(seuratCluster=Idents(seuratObject))
- c) From other R objects (e.g. Seurat, SingleCellExperiment)
sce <- load_as_sce(dataPath) # any SingleCellExperiment object
exprMat <- counts(sce)
cellInfo <- colData(sce)
- d) From GEO
從GEO下載和格式化數據集的示例(例如,GEOGSE60361
:3005小鼠腦細胞數據集)
# 獲取數據及數據標注
# dir.create("SCENIC_MouseBrain"); setwd("SCENIC_MouseBrain") # if needed
# (This may take a few minutes)
if (!requireNamespace("GEOquery", quietly = TRUE)) BiocManager::install("GEOquery")
library(GEOquery)
geoFile <- getGEOSuppFiles("GSE60361", makeDirectory=FALSE)
gzFile <- grep("Expression", basename(rownames(geoFile)), value=TRUE)
txtFile <- gsub(".gz", "", gzFile)
gunzip(gzFile, destname=txtFile, remove=TRUE)
library(data.table)
geoData <- fread(txtFile, sep="\t")
geneNames <- unname(unlist(geoData[,1, with=FALSE]))
exprMatrix <- as.matrix(geoData[,-1, with=FALSE])
rm(geoData)
dim(exprMatrix)
rownames(exprMatrix) <- geneNames
exprMatrix <- exprMatrix[unique(rownames(exprMatrix)),]
exprMatrix[1:5,1:4]
# Remove file downloaded:
file.remove(txtFile)
運行 SCENIC
SCENIC workflow:
建立基因調控網絡(Gene Regulation Network,GRN
):
基于共表達識別每個轉錄因子TF的潛在靶標。
過濾表達矩陣并運行GENIE3或者GRNBoost,它們是利用表達矩陣推斷基因調控網絡的一種算法,能得到轉錄因子和潛在靶標的相關性網絡;
將目標從GENIE3或者GRNBoost格式轉為共表達模塊。根據DNA模序分析(
motif
)選擇潛在的直接結合靶標(調節因子)(利用RcisTarget
包:TF基序分析)
確定細胞狀態及其調節因子:
- 分析每個細胞中的網絡活性(AUCell)
- 在細胞中評分調節子(計算AUC)
SCENIC完整流程
### 導入數據
loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
library(SCopeLoomR)
loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
cellInfo <- get_cellAnnotation(loom)
close_loom(loom)
### Initialize settings 初始設置,導入評分數據庫
library(SCENIC)
scenicOptions <- initializeScenic(org="mgi", dbDir="cisTarget_databases", nCores=10)
# scenicOptions@inputDatasetInfo$cellInfo <- "int/cellInfo.Rds"
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
### 共表達網絡
genesKept <- geneFiltering(exprMat, scenicOptions)
exprMat_filtered <- exprMat[genesKept, ]
runCorrelation(exprMat_filtered, scenicOptions)
exprMat_filtered_log <- log2(exprMat_filtered+1)
runGenie3(exprMat_filtered_log, scenicOptions)
### Build and score the GRN 構建并給基因調控網絡(GRN)打分
exprMat_log <- log2(exprMat+1)
scenicOptions@settings$dbs <- scenicOptions@settings$dbs["10kb"] # Toy run settings
runSCENIC_1_coexNetwork2modules(scenicOptions)
runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) # Toy run settings
runSCENIC_3_scoreCells(scenicOptions, exprMat_log)
# Export:
export2scope(scenicOptions, exprMat)
# Binarize activity?
# aucellApp <- plotTsne_AUCellApp(scenicOptions, exprMat_log)
# savedSelections <- shiny::runApp(aucellApp)
# newThresholds <- savedSelections$thresholds
# scenicOptions@fileNames$int["aucell_thresholds",1] <- "int/newThresholds.Rds"
# saveRDS(newThresholds, file=getIntName(scenicOptions, "aucell_thresholds"))
# saveRDS(scenicOptions, file="int/scenicOptions.Rds")
runSCENIC_4_aucell_binarize(scenicOptions)
### Exploring output
# Check files in folder 'output'
# .loom file @ http://scope.aertslab.org
# output/Step2_MotifEnrichment_preview.html in detail/subset:
motifEnrichment_selfMotifs_wGenes <- loadInt(scenicOptions, "motifEnrichment_selfMotifs_wGenes")
tableSubset <- motifEnrichment_selfMotifs_wGenes[highlightedTFs=="Sox8"]
viewMotifs(tableSubset)
# output/Step2_regulonTargetsInfo.tsv in detail:
regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
viewMotifs(tableSubset)
舉個栗子!
輸入表達矩陣
在本教程中,我們提供了一個示例,樣本是小鼠大腦的200個細胞和862個基因:
loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
打開loom文件并加載表達矩陣;
library(SCopeLoomR)
loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
cellInfo <- get_cellAnnotation(loom)
close_loom(loom)
dim(exprMat)
細胞信息/表型
# cellInfo$nGene <- colSums(exprMat>0)
head(cellInfo)
cellInfo <- data.frame(cellInfo)
cellTypeColumn <- "Class"
colnames(cellInfo)[which(colnames(cellInfo)==cellTypeColumn)] <- "CellType"
cbind(table(cellInfo$CellType))
# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(CellType=c("microglia"="forestgreen",
"endothelial-mural"="darkorange",
"astrocytes_ependymal"="magenta4",
"oligodendrocytes"="hotpink",
"interneurons"="red3",
"pyramidal CA1"="skyblue",
"pyramidal SS"="darkblue"))
colVars$CellType <- colVars$CellType[intersect(names(colVars$CellType), cellInfo$CellType)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$CellType, legend=names(colVars$CellType))
初始化SCENIC設置
為了在SCENIC
的多個步驟中保持設置一致,SCENIC包中的大多數函數使用一個公共對象,該對象存儲當前運行的選項并代替大多數函數的“參數”。比如下面的org
,dbDir
等,可以在開始就將物種rog(mgi
—— mouse, hgnc
—— human, dmel
—— fly)和RcisTarge
數據庫位置分別讀給對象org
,dbDir
,之后統一用函數initializeScenic
得到對象scenicOptions
。具體參數設置可以用?initializeScenic
help一下。
library(SCENIC)
org="mgi" # or hgnc, or dmel
dbDir="cisTarget_databases" # RcisTarget databases location
myDatasetTitle="SCENIC example on Mouse brain" # choose a name for your analysis
data(defaultDbNames)
dbs <- defaultDbNames[[org]]
scenicOptions <- initializeScenic(org=org, dbDir=dbDir, dbs=dbs, datasetTitle=myDatasetTitle, nCores=10)
# Modify if needed
scenicOptions@inputDatasetInfo$cellInfo <- "int/cellInfo.Rds"
scenicOptions@inputDatasetInfo$colVars <- "int/colVars.Rds"
# Databases:
# scenicOptions@settings$dbs <- c("mm9-5kb-mc8nr"="mm9-tss-centered-5kb-10species.mc8nr.feather")
# scenicOptions@settings$db_mcVersion <- "v8"
# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
共表達網絡
SCENIC
工作流程的第一步是根據表達數據推斷潛在的轉錄因子靶標。為此,我們使用GENIE3或GRNBoost,輸入文件是表達矩陣(過濾后的)和轉錄因子列表。GENIE3/GRBBoost的輸出結果和相關矩陣將用于創建共表達模塊(runSCENIC_1_coexNetwork2modules()
)。
基因過濾/選擇
-
按每個基因的reads總數進行過濾。
該filter旨在去除最可能是噪音的基因。
默認情況下,它(
minCountsPerGene
)保留所有樣品中至少帶有6個UMI reads的基因(例如,如果在1%的細胞中以3的值表達,則基因將具有的總數)。 -
通過基因的細胞數來實現過濾(例如 UMI > 0 ,或log 2(TPM)> 1 )。
默認情況下(
minSamples
),保留下來的基因能在至少1%的細胞中檢測得到。 最后,只保留
RcisTarget
數據庫中可用的基因。
# (Adjust minimum values according to your dataset)
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions,
minCountsPerGene=3*.01*ncol(exprMat),
minSamples=ncol(exprMat)*.01)
在進行網絡推斷之前,檢查是否有任何已知的相關基因被過濾掉(如果缺少任何相關基因,請仔細檢查filter
設置是否合適):
interestingGenes <- c("Sox9", "Sox10", "Dlx5")
# any missing?
interestingGenes[which(!interestingGenes %in% genesKept)]
相關性
GENIE33或者GRNBoost可以檢測正負關聯。為了區分潛在的激活和抑制,我們將目標分為正相關和負相關目標(比如TF與潛在目標之間的Spearman
相關性)。
runCorrelation(exprMat_filtered, scenicOptions)
運行GENIE3
得到潛在轉錄因子TF
## If launched in a new session, you will need to reload...
# setwd("...")
# loomPath <- "..."
# loom <- open_loom(loomPath, mode="r")
# exprMat <- get_dgem(loom)
# close_loom(loom)
# genesKept <- loadInt(scenicOptions, "genesKept")
# exprMat_filtered <- exprMat[genesKept,]
# library(SCENIC)
# scenicOptions <- readRDS("int/scenicOptions.Rds")
# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)
構建并評分GRN(runSCENIC_ …)
必要時重新加載表達式矩陣:
loom <- open_loom(loomPath, mode="r")
exprMat <- get_dgem(loom)
close_loom(loom)
# Optional: log expression (for TF expression plot, it does not affect any other calculation)
logMat <- log2(exprMat+1)
dim(exprMat)
使用wrapper
函數運行其余步驟:
library(SCENIC)
scenicOptions <- readRDS("int/scenicOptions.Rds")
scenicOptions@settings$verbose <- TRUE
scenicOptions@settings$nCores <- 10
scenicOptions@settings$seed <- 123
# For a very quick run:
# coexMethod=c("top5perTarget")
scenicOptions@settings$dbs <- scenicOptions@settings$dbs["10kb"] # For toy run
# save...
runSCENIC_1_coexNetwork2modules(scenicOptions)
runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) #** Only for toy run!!
runSCENIC_3_scoreCells(scenicOptions, logMat)
可選步驟
將network activity轉換成ON/OFF(二進制)格式
nPcs <- c(5) # For toy dataset
# nPcs <- c(5,15,50)
scenicOptions@settings$seed <- 123 # same seed for all of them
# Run t-SNE with different settings:
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50))
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50), onlyHighConf=TRUE, filePrefix="int/tSNE_oHC")
# Plot as pdf (individual files in int/):
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_", list.files("int"), value=T), value=T))
par(mfrow=c(length(nPcs), 3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)
# Using only "high-confidence" regulons (normally similar)
par(mfrow=c(3,3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_oHC_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)
輸出到 loom/SCope
SCENIC
生成的結果既能在http://scope.aertslab.org
查看,還能用函數export2scope()
(需要SCopeLoomR
包)保存成.loom
文件。
# DGEM (Digital gene expression matrix)
# (non-normalized counts)
# exprMat <- get_dgem(open_loom(loomPath))
# dgem <- exprMat
# head(colnames(dgem)) #should contain the Cell ID/name
# Export:
scenicOptions@fileNames$output["loomFile",] <- "output/mouseBrain_SCENIC.loom"
export2scope(scenicOptions, exprMat)
加載.loom文件中的結果
SCopeLoomR
中也有函數可以導入.loom
文件中的內容,比如調節因子,AUC
和封裝內容(比如regulon activity
的t-SNE和UMAP
結果)。
library(SCopeLoomR)
scenicLoomPath <- getOutName(scenicOptions, "loomFile")
loom <- open_loom(scenicLoomPath)
# Read information from loom file:
regulons_incidMat <- get_regulons(loom)
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonsAUC <- get_regulonsAuc(loom)
regulonsAucThresholds <- get_regulonThresholds(loom)
embeddings <- get_embeddings(loom)
解讀結果
1. 細胞狀態
AUCell
提供跨細胞的調節子的活性,AUCell使用“Area under Curve 曲線下面積”(AUC
)來計算輸入基因集的關鍵子集是否在每個細胞的表達基因中富集。通過該調節子活性(連續或二進制AUC矩陣)來聚類細胞,我們可以看出是否存在傾向于具有相同調節子活性的細胞群,并揭示在多個細胞中反復發生的網絡狀態。這些狀態等同于網絡的吸引子狀態。將這些聚類與不同的可視化方法相結合,我們可以探索細胞狀態與特定調節子的關聯。
將AUC和TF表達投射到t-SNE上
logMat <- exprMat # Better if it is logged/normalized
aucellApp <- plotTsne_AUCellApp(scenicOptions, logMat) # default t-SNE
savedSelections <- shiny::runApp(aucellApp)
print(tsneFileName(scenicOptions))
tSNE_scenic <- readRDS(tsneFileName(scenicOptions))
aucell_regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
# Show TF expression:
par(mfrow=c(2,3))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, exprMat, aucell_regulonAUC[onlyNonDuplicatedExtended(rownames(aucell_regulonAUC))[c("Dlx5", "Sox10", "Sox9","Irf1", "Stat6")],], plots="Expression")
# Save AUC as PDF:
Cairo::CairoPDF("output/Step4_BinaryRegulonActivity_tSNE_colByAUC.pdf", width=20, height=15)
par(mfrow=c(4,6))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, cellsAUC=aucell_regulonAUC, plots="AUC")
dev.off()
library(KernSmooth)
library(RColorBrewer)
dens2d <- bkde2D(tSNE_scenic$Y, 1)$fhat
image(dens2d, col=brewer.pal(9, "YlOrBr"), axes=FALSE)
contour(dens2d, add=TRUE, nlevels=5, drawlabels=FALSE)
#par(bg = "black")
par(mfrow=c(1,2))
regulonNames <- c( "Dlx5","Sox10")
cellCol <- plotTsne_rgb(scenicOptions, regulonNames, aucType="AUC", aucMaxContrast=0.6)
text(0, 10, attr(cellCol,"red"), col="red", cex=.7, pos=4)
text(-20,-10, attr(cellCol,"green"), col="green3", cex=.7, pos=4)
regulonNames <- list(red=c("Sox10", "Sox8"),
green=c("Irf1"),
blue=c( "Tef"))
cellCol <- plotTsne_rgb(scenicOptions, regulonNames, aucType="Binary")
text(5, 15, attr(cellCol,"red"), col="red", cex=.7, pos=4)
text(5, 15-4, attr(cellCol,"green"), col="green3", cex=.7, pos=4)
text(5, 15-8, attr(cellCol,"blue"), col="blue", cex=.7, pos=4)
GRN:Regulon靶標和模序
regulons <- loadInt(scenicOptions, "regulons")
regulons[c("Dlx5", "Irf1")]
regulons <- loadInt(scenicOptions, "aucell_regulons")
head(cbind(onlyNonDuplicatedExtended(names(regulons))))
regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
viewMotifs(tableSubset)
2. 細胞群的調控因子
regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regulonActivity_byCellType <- sapply(split(rownames(cellInfo), cellInfo$CellType),
function(cells) rowMeans(getAUC(regulonAUC)[,cells]))
regulonActivity_byCellType_Scaled <- t(scale(t(regulonActivity_byCellType), center = T, scale=T))
pheatmap::pheatmap(regulonActivity_byCellType_Scaled, #fontsize_row=3,
color=colorRampPalette(c("blue","white","red"))(100), breaks=seq(-3, 3, length.out = 100),
treeheight_row=10, treeheight_col=10, border_color=NA)
# filename="regulonActivity_byCellType.pdf", width=10, height=20)
topRegulators <- reshape2::melt(regulonActivity_byCellType_Scaled)
colnames(topRegulators) <- c("Regulon", "CellType", "RelativeActivity")
topRegulators <- topRegulators[which(topRegulators$RelativeActivity>0),]
viewTable(topRegulators)
minPerc <- .7
binaryRegulonActivity <- loadInt(scenicOptions, "aucell_binary_nonDupl")
cellInfo_binarizedCells <- cellInfo[which(rownames(cellInfo)%in% colnames(binaryRegulonActivity)),, drop=FALSE]
regulonActivity_byCellType_Binarized <- sapply(split(rownames(cellInfo_binarizedCells), cellInfo_binarizedCells$CellType),
function(cells) rowMeans(binaryRegulonActivity[,cells, drop=FALSE]))
binaryActPerc_subset <- regulonActivity_byCellType_Binarized[which(rowSums(regulonActivity_byCellType_Binarized>minPerc)>0),]
pheatmap::pheatmap(binaryActPerc_subset, # fontsize_row=5,
color = colorRampPalette(c("white","pink","red"))(100), breaks=seq(0, 1, length.out = 100),
treeheight_row=10, treeheight_col=10, border_color=NA)
參考文獻
Aibar, Sara, Carmen Bravo González-Blas, Thomas Moerman, Jasper Wouters, Van Anh Huynh-Thu, Hana Imrichová, Zeynep Kalender Atak, et al. 2017. “SCENIC: Single-Cell Regulatory Network Inference and Clustering.” Nature Methods 14 (october): 1083–6. doi:10.1038/nmeth.4463.
Davie, Kristofer, Jasper Janssens, Duygu Koldere, and “et al.” 2018. “A Single-Cell Transcriptome Atlas of the Aging Drosophila Brain.” Cell, june. doi:10.1016/j.cell.2018.05.057.
Huynh-Thu, Van Anh, Alexandre Irrthum, Louis Wehenkel, and Pierre Geurts. 2010. “Inferring Regulatory Networks from Expression Data Using Tree-Based Methods.” PloS One 5 (9). doi:10.1371/journal.pone.0012776.
Marbach, Daniel, James C. Costello, Robert Küffner, Nicole M. Vega, Robert J. Prill, Diogo M. Camacho, Kyle R. Allison, et al. 2012. “Wisdom of Crowds for Robust Gene Network Inference.” Nature Methods 9 (8): 796–804. doi:10.1038/nmeth.2016.