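0. Background
A higher TIDE score indicates a greater likelihood of immune escape and a lower likelihood of benefit from immunotherapy. A score > 0 is treated as non-responder and a score < 0 as responder.
TIDE is available only as a web tool and as a Python package. The web tool requires registration and login; an ordinary email address is enough to register.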
https://github.com/liulab-dfci/TIDEpy
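As a quick illustration of the cutoff described above, here is a minimal sketch that turns TIDE scores into response calls. The sample names and scores are made up, and `tide_call` is a name introduced only for this example; it is not part of TIDE or TIDEpy.

```r
# Hypothetical TIDE scores for five samples (made-up values for illustration).
tide_score <- c(S1 = 0.90, S2 = 0.68, S3 = -0.12, S4 = -1.05, S5 = 0.03)

# Rule from the text: score > 0 -> non-responder, score < 0 -> responder.
# A score of exactly 0 is not covered by the rule, so it is left as NA here.
tide_call <- ifelse(tide_score > 0, "non-responder",
                    ifelse(tide_score < 0, "responder", NA))

table(tide_call, useNA = "ifany")
```

The web tool already returns a Responder column, so this is only meant to make the direction of the score explicit.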
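1. Subtypes
The subtypes can be ready-made:
Download link for the TCGA subtype data: https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com/download/TCGASubtype.20170308.tsv.gz
They can also come from your own clustering, for example:
Hands-on: NMF clustering and visualization of TCGA data
Hands-on: consensus clustering and visualization of TCGA data
Subtype analysis has also been packaged as a one-stop R package:
MOVICS, a multi-omics, multi-algorithm clustering tool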
rm(list = ls())
library(stringr)
sub = rio::import("TCGASubtype.20170308.tsv.gz")
k = stringr::str_starts(sub$Subtype_Selected,"ACC");table(k)
## k
## FALSE TRUE
## 7643 91
sub = sub[k,]
table(sub$Subtype_Selected)
##
## ACC.CIMP-high ACC.CIMP-intermediate ACC.CIMP-low
## 20 27 32
## ACC.NA
## 12
k2 = sub$Subtype_Selected!="ACC.NA";table(k2)
## k2
## FALSE TRUE
## 12 79
sub = sub[k2,]
sub = data.frame(row.names = sub$sample,
subtype = str_remove_all(sub$Subtype_Selected,"ACC.CIMP-| "))
head(sub)
## subtype
## TCGA-OR-A5J1-01 high
## TCGA-OR-A5J2-01 low
## TCGA-OR-A5J3-01 intermediate
## TCGA-OR-A5J4-01 high
## TCGA-OR-A5J5-01 intermediate
## TCGA-OR-A5J6-01 low
The goal is simply a data frame whose row names are the sample names and whose single column holds the subtype.
2. Expression matrix
load("D:/TCGA_RNA_seq/count/acc_exp.Rdata")
acc[1:3,1:3]
## TCGA-OR-A5JD-01A-11R-A29S-07 TCGA-OR-A5L5-01A-11R-A29S-07
## ENSG00000000003.15 2086 3813
## ENSG00000000005.6 2 3
## ENSG00000000419.13 2086 2909
## TCGA-OR-A5KX-01A-11R-A29S-07
## ENSG00000000003.15 2145
## ENSG00000000005.6 2
## ENSG00000000419.13 2546
library(tinyarray)
exp = trans_exp_new(acc)
table(make_tcga_group(exp)) # all samples are tumor here; otherwise remove the normal samples
##
## normal tumor
## 0 79
#exp = exp[,make_tcga_group(exp)=="tumor"]
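For readers without tinyarray, the tumor/normal split can be sketched in base R from the TCGA barcode convention, in which characters 14-15 are the sample type code (01-09 tumor, 10-19 normal). This only illustrates the convention and is not necessarily how `make_tcga_group` is implemented; the second barcode is made up.

```r
# TCGA barcode: characters 14-15 hold the sample type code.
# Codes 01-09 denote tumor samples, 10-19 denote normal samples.
barcodes <- c("TCGA-OR-A5JD-01A-11R-A29S-07",   # taken from the matrix above
              "TCGA-XX-0000-11A-11R-A29S-07")   # made-up normal sample

type_code <- as.integer(substr(barcodes, 14, 15))
group <- factor(ifelse(type_code < 10, "tumor", "normal"),
                levels = c("normal", "tumor"))

table(group)
```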
3. Match the sample order of the expression matrix and the subtype data
head(colnames(exp))
## [1] "TCGA-OR-A5JD-01A-11R-A29S-07" "TCGA-OR-A5L5-01A-11R-A29S-07"
## [3] "TCGA-OR-A5KX-01A-11R-A29S-07" "TCGA-OR-A5JY-01A-31R-A29S-07"
## [5] "TCGA-OR-A5JV-01A-11R-A29S-07" "TCGA-PK-A5H8-01A-11R-A29S-07"
head(rownames(sub))
## [1] "TCGA-OR-A5J1-01" "TCGA-OR-A5J2-01" "TCGA-OR-A5J3-01" "TCGA-OR-A5J4-01"
## [5] "TCGA-OR-A5J5-01" "TCGA-OR-A5J6-01"
colnames(exp) = str_sub(colnames(exp),1,15)
s = intersect(colnames(exp),rownames(sub));length(s)
## [1] 78
exp = exp[,s]
sub = sub[s,,drop=F]
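The matching step can be checked on a toy example. The sketch below uses made-up barcodes and values (`toy_exp` and `toy_sub` are hypothetical names) and adds two checks that the code above leaves implicit: that truncating barcodes to 15 characters creates no duplicate column names, and that both objects end up in the same sample order.

```r
# Toy expression matrix (genes x samples) and subtype table; all values made up.
toy_exp <- matrix(1:12, nrow = 3,
                  dimnames = list(paste0("gene", 1:3),
                                  c("TCGA-AA-0001-01A-11R-A29S-07",
                                    "TCGA-AA-0002-01A-11R-A29S-07",
                                    "TCGA-AA-0003-01A-11R-A29S-07",
                                    "TCGA-AA-0004-01A-11R-A29S-07")))
toy_sub <- data.frame(row.names = c("TCGA-AA-0003-01", "TCGA-AA-0001-01",
                                    "TCGA-AA-0009-01"),
                      subtype = c("low", "high", "high"))

# Truncate barcodes to 15 characters, then keep only the shared samples.
colnames(toy_exp) <- substr(colnames(toy_exp), 1, 15)
stopifnot(!anyDuplicated(colnames(toy_exp)))   # truncation can create duplicates
s <- intersect(colnames(toy_exp), rownames(toy_sub))
toy_exp <- toy_exp[, s, drop = FALSE]
toy_sub <- toy_sub[s, , drop = FALSE]

# Both objects now list the same samples in the same order.
stopifnot(identical(colnames(toy_exp), rownames(toy_sub)))
toy_sub
```

Comparing `length(s)` with the original sample counts, as the real data does (78 shared samples), shows how many samples were dropped by the intersection.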
4. Normalize the expression matrix and export it
The TIDE home page carries a prominent note:
Note: The gene expression value should be normalized toward a control sample which could be either normal tissues related with a cancer type or mixture sample from diverse tumor samples. The log2(RPKM+1) values from a RNA-seq experiment may not be meaningful unless a good reference control is available to adjust the batch effect and cancer type difference. In our study, we used the all sample average in each study as the normalization control.
The last sentence says that the average over all samples in each study is used as the normalization control. In code:
exp2 <- sweep(exp,1, apply(exp,1,mean,na.rm=T))
write.table(exp2,"TIDE.txt",sep = "\t",quote = F)
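To see what the `sweep()` call does, here is a small sketch on a made-up 2 x 3 matrix. Because `sweep()` subtracts by default, each gene (row) has its mean across samples removed, so every row of the result averages to zero.

```r
# Toy matrix: 2 genes x 3 samples (made-up values).
m <- matrix(c( 1,  2,  3,
              10, 20, 60), nrow = 2, byrow = TRUE,
            dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))

# sweep() subtracts by default, so this removes each gene's mean across samples.
centered <- sweep(m, 1, rowMeans(m, na.rm = TRUE))

centered
rowMeans(centered)                    # each row mean is now 0
all.equal(centered, m - rowMeans(m))  # same result via column-wise recycling
```

`rowMeans()` is used here in place of `apply(exp, 1, mean)`; the two give the same row means, and `rowMeans()` is usually faster on large matrices.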
5. Read the results and plot
Upload this file to the TIDE web tool; the result it returns is saved as tide.csv.
res <- read.csv("tide.csv",row.names = 1,check.names = F)
res[1:4,1:4]
## No benefits Responder TIDE IFNG
## TCGA-OR-A5J9-01 False False 0.90 -1353.97
## TCGA-P6-A5OF-01 False False 0.68 -818.80
## TCGA-OR-A5KV-01 False False 0.66 -1885.47
## TCGA-OR-A5JF-01 False False 0.64 -1489.63
res = merge(res,sub,by = "row.names")
table(res$Responder,res$subtype)
##
## high intermediate low
## False 11 11 14
## True 8 16 18
f = fisher.test(table(res$subtype,res$Responder))
label = paste("fisher.test p value =",round(f$p.value,3))
label
## [1] "fisher.test p value = 0.525"
fisher.test checks whether subtype and Responder are associated; p < 0.05 is taken as evidence of an association.
Unfortunately, in this example the two are not significantly associated.
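The test can be rerun without the TIDE output by typing in the counts from the contingency table printed above. This is a sketch for checking the numbers only; `tab` is a name introduced here. The table is oriented as Responder x subtype, whereas the code above uses subtype x Responder, and transposing the table does not change the Fisher exact p-value.

```r
# Counts copied from the Responder x subtype table above.
tab <- matrix(c(11, 11, 14,
                 8, 16, 18), nrow = 2, byrow = TRUE,
              dimnames = list(Responder = c("False", "True"),
                              subtype   = c("high", "intermediate", "low")))

f <- fisher.test(tab)
f$p.value            # compare with the p value reported above
sum(tab)             # 78 samples in total
```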
5. Plotting
The TIDE column holds the TIDE score. Responder indicates whether the sample is predicted to respond to immunotherapy.
5.1 Bar plot of TIDE scores
library(ggplot2)
library(dplyr)
res = arrange(res,desc(TIDE))
p1 = ggplot(res, aes(x = 1:nrow(res),
y = TIDE,
fill = Responder)) +
geom_bar(stat = "identity") +
scale_fill_manual(values = c("#f87669","#2fa1dd"))+
xlab("patient")+
annotate("text", x = 40, y = 1, label = label,size = 5) +
theme_minimal()
5.2 Immune response and subtype
library(dplyr)
dat = count(res,subtype,Responder)
dat = dat %>% group_by(subtype) %>%
mutate(n = n/sum(n)) # proportion within each subtype; mutate() keeps one row per subtype/Responder pair
dat$Responder = factor(dat$Responder,levels = c("False","True"))
dat
## # A tibble: 6 × 3
## # Groups: subtype [3]
## subtype Responder n
## <chr> <fct> <dbl>
## 1 high False 0.579
## 2 high True 0.421
## 3 intermediate False 0.407
## 4 intermediate True 0.593
## 5 low False 0.438
## 6 low True 0.562
library(ggplot2)
p2 = ggplot(data = dat)+
geom_bar(aes(x = subtype,y = n,
fill = Responder),
stat = "identity")+
scale_fill_manual(values = c("#f87669","#2fa1dd"))+
geom_label(aes(x = subtype,y = n,
label = scales::percent(n),
fill = Responder),
color = "white",
size = 4,label.size = 0,
show.legend = F,
position = position_fill(vjust = 0.5))+
theme_minimal()+
guides(label = "none")
library(patchwork)
p1+p2+ plot_layout(widths = c(3,2),guides = "collect")
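The proportions shown in the stacked bar plot can be cross-checked in base R. The sketch below rebuilds the counts by hand from the Responder x subtype table in section 5 and divides each count by its subtype total; the `prop` and `prop_label` columns are names introduced only for this example.

```r
# Counts from the Responder x subtype table in section 5.
dat <- data.frame(
  subtype   = rep(c("high", "intermediate", "low"), each = 2),
  Responder = rep(c("False", "True"), times = 3),
  n         = c(11, 8, 11, 16, 14, 18)
)

# Divide each count by the total of its subtype.
dat$prop <- dat$n / ave(dat$n, dat$subtype, FUN = sum)
dat$prop_label <- sprintf("%.1f%%", 100 * dat$prop)

dat
```

Within each subtype the two proportions sum to 1, which is what the stacked bars and `position_fill()` rely on.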
6. Other scores
colnames(res)
## [1] "Row.names" "No benefits" "Responder" "TIDE" "IFNG"
## [6] "MSI Expr Sig" "Merck18" "CD274" "CD8" "CTL.flag"
## [11] "Dysfunction" "Exclusion" "MDSC" "CAF" "TAM M2"
## [16] "subtype"
IFNG: interferon-gamma, a cytokine produced by immune cells, especially T cells and natural killer cells.
CTL.flag: cytotoxic T lymphocyte (CTL).
Dysfunction: T cell dysfunction score.
Exclusion: T cell exclusion score.
CAF: cancer-associated fibroblasts.
MDSC: myeloid-derived suppressor cells.
TAM M2: M2 macrophages.
How the scores are calculated is described in detail here: https://liulab-dfci.github.io/RIMA/Response.html
They can be compared by plotting:
dat = t(res[,c(4:5,8:9,11:15)])
draw_boxplot(dat,res$Responder)+
facet_wrap(~rows,scales = "free")
draw_boxplot(dat,res$subtype)+
facet_wrap(~rows,scales = "free")