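Preface
Previously we covered three packages, Seurat, Harmony, and rliger, as ways to merge 3' and 5' datasets.
Sometimes, however, the two datasets overlap only partially, which makes the situation a little different from the methods covered before.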
Packages used
rm(list = ls())
library(Seurat)
library(SeuratDisk)
library(SeuratWrappers)
library(patchwork)
library(harmony)
library(rliger)
library(RColorBrewer)
library(tidyverse)
library(reshape2)
library(ggsci)
library(ggstatsplot)
Example data
Here we use one 3' PBMC dataset and one whole blood dataset.
umi_gz <- gzfile("./GSE149938_umi_matrix.csv.gz",'rt')
umi <- read.csv(umi_gz,check.names = F,quote = "")
matrix_3p <- Read10X_h5("./3p_pbmc10k_filt.h5",use.names = T)
Create the Seurat objects.
srat_wb <- CreateSeuratObject(t(umi),project = "whole_blood")
srat_3p <- CreateSeuratObject(matrix_3p,project = "pbmc10k_3p")
rm(umi_gz)
rm(umi)
rm(matrix_3p)
srat_wb
srat_3p
Modifying the metadata
To make the downstream analysis easier, we adjust the metadata annotations of the whole blood object here.
colnames(srat_wb@meta.data)[1] <- "cell_type"
srat_wb@meta.data$orig.ident <- "whole_blood"
srat_wb@meta.data$orig.ident <- as.factor(srat_wb@meta.data$orig.ident)
head(srat_wb[[]])
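Why renaming the first metadata column can yield a cell-type column is easier to see on a toy example. The sketch below is base R only and rests on an assumption: that the cell names carry a cell-type prefix separated by underscores, so that Seurat's default `names.field = 1` and `names.delim = "_"` put that prefix into `orig.ident`. The cell names and the `meta` data frame are made up for illustration.

```r
# Hypothetical cell names; the real barcodes in the dataset may look different.
cell_names <- c("BNK_sample1_bar25", "BNK_sample1_bar26", "CMP_sample2_bar01")

# Take the first "_"-delimited field of each name, which mimics what
# CreateSeuratObject() does by default when it fills in orig.ident.
prefix <- vapply(strsplit(cell_names, "_", fixed = TRUE), `[`, character(1), 1)

# Stand-in for the metadata table of a Seurat object.
meta <- data.frame(orig.ident = prefix, row.names = cell_names)

# Rename the first column, then add a fresh orig.ident with a single label.
colnames(meta)[1] <- "cell_type"
meta$orig.ident <- factor(rep("whole_blood", nrow(meta)))

head(meta)
```

If the real cell names do not follow this pattern, the first metadata column will not contain cell types, so it is worth checking `head(srat_wb[[]])` before relying on it.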
Basic QC
As a standard step, we compute the percentage of counts coming from mitochondrial genes and from ribosomal protein genes in each cell.
srat_wb <- SetIdent(srat_wb,value = "orig.ident")
srat_wb[["percent.mt"]] <- PercentageFeatureSet(srat_wb, pattern = "^MT-")
srat_wb[["percent.rbp"]] <- PercentageFeatureSet(srat_wb, pattern = "^RP[SL]")
srat_3p[["percent.mt"]] <- PercentageFeatureSet(srat_3p, pattern = "^MT-")
srat_3p[["percent.rbp"]] <- PercentageFeatureSet(srat_3p, pattern = "^RP[SL]")
p1 <- VlnPlot(srat_wb, ncol = 4,
              features = c("nFeature_RNA","nCount_RNA","percent.mt","percent.rbp"))
p2 <- VlnPlot(srat_3p, ncol = 4,
              features = c("nFeature_RNA","nCount_RNA","percent.mt","percent.rbp"))
p1/p2
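For readers who want to see what these percentages mean, here is a minimal base-R sketch of the same calculation on a toy count matrix: the counts of all genes matching a pattern, divided by the total counts per cell, times 100. The matrix values and the helper name `pct_feature` are made up for illustration; this is not Seurat's own code.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c( 5, 0, 10,
     5, 2,  0,
    10, 8, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MT-CO1", "RPS6", "CD3E"), c("cell1", "cell2", "cell3"))
)

# Hypothetical helper: percentage of each cell's counts that come from
# genes whose names match `pattern`.
pct_feature <- function(counts, pattern) {
  hits <- grepl(pattern, rownames(counts))
  100 * colSums(counts[hits, , drop = FALSE]) / colSums(counts)
}

percent_mt  <- pct_feature(counts, "^MT-")     # mitochondrial genes
percent_rbp <- pct_feature(counts, "^RP[SL]")  # ribosomal protein genes

percent_mt
percent_rbp
```

Note that the patterns only work if gene symbols follow the usual human naming (`MT-`, `RPS`, `RPL`); a dataset annotated differently can return zeros for every cell.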
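Shared genes
The whole blood dataset was annotated with Cell Ranger GRCh38-2020A, which differs considerably from the annotation of the 3' PBMC dataset, so let's first check how many genes the two have in common.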
# table(rownames(srat_3p) %in% rownames(srat_wb))
common_genes <- rownames(srat_3p)[rownames(srat_3p) %in% rownames(srat_wb)]
length(common_genes)
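The `%in%` subsetting above can be checked on a small example. The sketch below uses made-up gene vectors and shows that, for vectors without duplicates, it gives the same result as base R's `intersect()`, and that the commented-out `table()` call reports how many genes are shared.

```r
# Made-up gene lists standing in for rownames(srat_3p) and rownames(srat_wb).
genes_3p <- c("CD3E", "MS4A1", "LYZ", "AC000001.1")
genes_wb <- c("LYZ", "CD3E", "HBB")

# Same construction as in the tutorial: keep 3' genes that also occur in wb.
common_a <- genes_3p[genes_3p %in% genes_wb]

# Equivalent result with intersect(); order follows the first argument.
common_b <- intersect(genes_3p, genes_wb)

# FALSE = genes only in the 3' list, TRUE = shared genes.
table(genes_3p %in% genes_wb)

identical(common_a, common_b)
```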
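Filtering cells and genes
We set filtering thresholds to remove cells with too few or too many detected genes, as well as cells with a high mitochondrial percentage (cells in poor condition). Both objects are then restricted to the shared genes.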
srat_3p <- subset(srat_3p, subset = nFeature_RNA > 500 & nFeature_RNA < 5000 & percent.mt < 15)
srat_wb <- subset(srat_wb, subset = nFeature_RNA > 1000 & nFeature_RNA < 6000)
srat_3p <- srat_3p[rownames(srat_3p) %in% common_genes,]
srat_wb <- srat_wb[rownames(srat_wb) %in% common_genes,]
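The `subset()` calls boil down to a logical mask over the per-cell metadata. A minimal base-R sketch with a made-up metadata table, using the same thresholds as the 3' PBMC filter:

```r
# Made-up per-cell QC metrics; column names follow Seurat's conventions.
meta <- data.frame(
  nFeature_RNA = c(300, 1200, 5500, 2500),
  percent.mt   = c(5, 3, 4, 30),
  row.names    = paste0("cell", 1:4)
)

# A cell is kept only if it passes all three conditions.
keep <- meta$nFeature_RNA > 500 &
  meta$nFeature_RNA < 5000 &
  meta$percent.mt < 15

kept_cells <- rownames(meta)[keep]
kept_cells
```

In this toy table, cell1 fails the lower gene bound, cell3 the upper one, and cell4 the mitochondrial cutoff. The thresholds themselves are dataset-specific, so it is worth checking them against the violin plots before reusing them elsewhere.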
Data integration
8.1 Combine into a list
wb_list <- list()
wb_list[["pbmc10k_3p"]] <- srat_3p
wb_list[["whole_blood"]] <- srat_wb
8.2 Normalization and variable features
for (i in seq_along(wb_list)) {
  wb_list[[i]] <- NormalizeData(wb_list[[i]], verbose = F)
  wb_list[[i]] <- FindVariableFeatures(wb_list[[i]], selection.method = "vst", nfeatures = 2000, verbose = F)
}
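As a rough illustration of the normalization step, the sketch below reproduces in base R what the default `LogNormalize` method is documented to do: divide each cell's counts by that cell's total, multiply by a scale factor of 10,000, and apply `log1p`. The toy matrix and the helper name `log_normalize` are made up; it is a sketch of the idea, not Seurat's implementation.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c(1, 3, 0,    # cellA, total = 4
    4, 6, 0),   # cellB, total = 10
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("cellA", "cellB"))
)

# Hypothetical helper mirroring the LogNormalize idea.
log_normalize <- function(counts, scale_factor = 1e4) {
  # Divide every column by its total, rescale, then take log(1 + x).
  log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
}

norm <- log_normalize(counts)
norm
```

After this step, cells with different sequencing depth become comparable, which is why each dataset in the list is normalized separately before anchors are searched.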
8.3 Find anchors and integrate the data
wb_anchors <- FindIntegrationAnchors(object.list = wb_list, dims = 1:30)
wb_seurat <- IntegrateData(anchorset = wb_anchors, dims = 1:30)
rm(wb_list)
rm(wb_anchors)
Visualizing the integration
9.1 Before integration
DefaultAssay(wb_seurat) <- "RNA"
wb_seurat <- NormalizeData(wb_seurat, verbose = F)
wb_seurat <- FindVariableFeatures(wb_seurat, selection.method = "vst", nfeatures = 2000, verbose = F)
wb_seurat <- ScaleData(wb_seurat, verbose = F)
wb_seurat <- RunPCA(wb_seurat, npcs = 30, verbose = F)
wb_seurat <- RunUMAP(wb_seurat, reduction = "pca", dims = 1:30, verbose = F)
DimPlot(wb_seurat, reduction = "umap") +
  scale_color_npg() +
  plot_annotation(title = "10k 3' PBMC and whole blood, before integration")
9.2 After integration
DefaultAssay(wb_seurat) <- "integrated"
wb_seurat <- ScaleData(wb_seurat, verbose = F)
wb_seurat <- RunPCA(wb_seurat, npcs = 30, verbose = F)
wb_seurat <- RunUMAP(wb_seurat, reduction = "pca", dims = 1:30, verbose = F)
DimPlot(wb_seurat, reduction = "umap") +
  scale_color_npg() +
  plot_annotation(title = "10k 3' PBMC and whole blood, after integration")
Dimensionality reduction and clustering
10.1 Visualizing the clusters
wb_seurat <- FindNeighbors(wb_seurat, dims = 1:30, k.param = 10, verbose = F)
wb_seurat <- FindClusters(wb_seurat, verbose = F)
ncluster <- length(unique(wb_seurat[[]]$seurat_clusters))
mycol <- colorRampPalette(brewer.pal(8, "Set2"))(ncluster)
DimPlot(wb_seurat, label = T, reduction = "umap",
        cols = mycol, repel = T) +
  NoLegend()
10.2 Inspecting and visualizing cluster composition
count_table <- table(wb_seurat@meta.data$seurat_clusters,
wb_seurat@meta.data$orig.ident)
count_table
#### Visualization
count_table %>%
  as.data.frame() %>%
  ggbarstats(x = Var2,
             y = Var1,
             counts = Freq) +
  scale_fill_npg()
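To read the cluster-by-dataset table without the plot, the counts can be converted to row proportions. A minimal base-R sketch on made-up labels (the vectors `clusters` and `dataset` stand in for `seurat_clusters` and `orig.ident`):

```r
# Made-up cluster assignments and dataset labels for six cells.
clusters <- factor(c(0, 0, 1, 1, 1, 2))
dataset  <- c("pbmc10k_3p", "whole_blood", "pbmc10k_3p",
              "pbmc10k_3p", "pbmc10k_3p", "whole_blood")

# Cells per cluster and dataset.
count_table <- table(clusters, dataset)

# Fraction of each cluster contributed by each dataset (rows sum to 1).
prop_table <- prop.table(count_table, margin = 1)

# Long format with a Freq column, the shape ggbarstats() takes via `counts`.
long <- as.data.frame(count_table)

count_table
round(prop_table, 2)
```

Because the two vectors are passed as plain variables here, the long table's columns are named `clusters` and `dataset`; in the tutorial's call they come out as `Var1` and `Var2`. A cluster made up almost entirely of one dataset, like cluster 1 or 2 in this toy table, is a candidate for a cell type present in only one of the two samples.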