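Hello everyone. With 10X single-cell and 10X spatial transcriptomics work in full swing, our analyses and methods are moving into deeper water, and many in-depth, detailed analyses now need special care. Today I will share two very good points; I hope they help you dig deeper into your own data and publish high-impact papers.
The first point is Spatial Correlation Analysis. I have covered it several times already, in the posts "10X Spatial Transcriptomics: Colocalization Analysis (Cell Types and Ligand-Receptor Genes)", "10X Spatial Transcriptomics: Spatial Expression Patterns of Genes", and "10X Spatial Transcriptomics (10X Single Cell): On the Importance of the Spatial Distribution of Cell Communication", among others. This time I will share some classic and noteworthy methods from the paper Multimodal Analysis of Composition and Spatial Architecture in Human Squamous Cell Carcinoma. Please pay close attention to them.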
We reasoned that genes expressed in adjacent spots in ST were potentially meaningful and that a simple correlation of genes across spots could overlook this adjacency structure within the data (I have stressed this point many times and hope everyone takes it seriously). Thus, we calculated average normalized gene expression across a "sliding window" of spot groups consisting of a central spot surrounded by its N nearest neighbors (the adjacent spots), where N = 4 in the original ST data and N = 6 in Visium samples for each spot in the tissue, generating a matrix of genes by average spot group expression across all spots (pay close attention: averaging over neighboring spots produces a new matrix). This matrix can be correlated with any "anchoring" gene of interest (FOXP3 in our case) by calculating pairwise Pearson correlations of the FOXP3 expression vector across all spots and the gene average group expression vectors across spots (this is where preparing that matrix pays off). These values reflect if the expression of a gene in the area surrounding the anchoring gene is correlated with the expression of the anchoring gene and termed "spatial gene correlation" with FOXP3.
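The sliding-window calculation quoted above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the authors' code: the function names, the brute-force nearest-neighbor search, and the synthetic input are all assumptions made for the example.

```python
import numpy as np

def neighbor_average(expr, coords, n_neighbors):
    """Average each gene over every spot plus its n_neighbors nearest spots.

    expr:   (n_spots, n_genes) normalized expression (hypothetical input layout)
    coords: (n_spots, 2) spot positions
    """
    coords = np.asarray(coords, dtype=float)
    # Brute-force pairwise Euclidean distances between spots.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Force every spot to rank first in its own group.
    np.fill_diagonal(dist, -1.0)
    group_idx = np.argsort(dist, axis=1, kind="stable")[:, : n_neighbors + 1]
    # expr[group_idx] has shape (n_spots, n_neighbors + 1, n_genes).
    return expr[group_idx].mean(axis=1)

def spatial_gene_correlation(expr, coords, anchor, n_neighbors=6):
    """Pearson r between the anchor gene's per-spot expression and every
    gene's neighborhood-averaged expression."""
    smoothed = neighbor_average(expr, coords, n_neighbors)
    a = expr[:, anchor] - expr[:, anchor].mean()
    s = smoothed - smoothed.mean(axis=0)
    denom = np.sqrt((a ** 2).sum() * (s ** 2).sum(axis=0))
    # Genes with zero variance yield NaN rather than raising.
    with np.errstate(invalid="ignore", divide="ignore"):
        return (a @ s) / denom

# Synthetic demo: an 8 x 8 grid of spots and 5 random genes.
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(8), np.arange(8))
demo_coords = np.column_stack([xs.ravel(), ys.ravel()])
demo_expr = rng.random((64, 5))
demo_r = spatial_gene_correlation(demo_expr, demo_coords, anchor=0, n_neighbors=4)
```

Here n_neighbors would be 4 for the original ST data and 6 for Visium, as the quoted methods state; the anchor index stands in for whichever gene (FOXP3 in the paper) is used as the anchoring gene.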
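I have stressed spatial gene correlation analysis many times: tissue is an ordered physical entity, and the distribution of cell types and of gene expression across it carries real biological meaning, so be sure to give it close attention.
The second point is combining CellPhoneDB and NicheNet for cell-cell communication analysis, a classic approach.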
Ligand-receptor interactions were inferred using a similar approach as previously described (Vento-Tormo et al., 2018) (this is the CellPhoneDB analysis). We first calculated average expression of ligand and receptor pairs across cell type pairs in normalized scRNA-seq data from an aggregate of the seven patient tumor samples containing TSK cells (the usual routine). We only considered genes with more than 10% of cells demonstrating expression within each cell type considered. We calculated a null distribution for average ligand-receptor by shuffling cell identities in the aggregated data and re-calculating ligand-receptor average pair expression across 1,000 permutations of randomized cell identities. The P value was the number of randomized pairs exceeding the observed data. For bar plots shown in Figures 6B and 6C, in addition to including only ligand-receptor pairs with p < 0.001, we further thresholded individual ligand or receptor expression with a cutoff of average expression > 0.2 (in log space). The 0.2 cutoff was determined by calculating the average log gene expression distribution for all genes across each cell type, and genes expressed at or above this cutoff corresponded with the top 12% or higher of expressed genes for each cell type. (This is the general CellPhoneDB workflow.)
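The label-shuffling test described above can be sketched as follows. This is a rough illustration under stated assumptions, not CellPhoneDB itself and not the authors' code: the function names are hypothetical, the pair score is assumed to be the mean of the ligand average in the sender type and the receptor average in the receiver type, and the P value is reported as the fraction (rather than the raw count) of shuffles that exceed the observed score.

```python
import numpy as np

def passes_expression_filter(gene, labels, cell_type, min_frac=0.10):
    """True if more than min_frac of the cells of cell_type express the gene."""
    in_type = np.asarray(labels) == cell_type
    return (np.asarray(gene)[in_type] > 0).mean() > min_frac

def lr_permutation_pvalue(ligand, receptor, labels, sender, receiver,
                          n_perm=1000, seed=0):
    """Empirical P value for one ligand-receptor pair and one cell-type pair.

    ligand, receptor: (n_cells,) normalized expression of the two genes
    labels:           (n_cells,) cell-type label of every cell
    """
    rng = np.random.default_rng(seed)
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    labels = np.asarray(labels)

    def pair_score(lab):
        # Assumed score: average of the two cell-type means.
        return 0.5 * (ligand[lab == sender].mean() + receptor[lab == receiver].mean())

    observed = pair_score(labels)
    # Shuffling labels keeps the number of cells per type fixed.
    null = np.array([pair_score(rng.permutation(labels)) for _ in range(n_perm)])
    # Fraction of shuffled scores strictly exceeding the observed score.
    p_value = float((null > observed).mean())
    return observed, p_value
```

With 1,000 permutations, the p < 0.001 threshold quoted above corresponds to no shuffled score exceeding the observed one. A more conservative variant would count ties as well or add one to both numerator and denominator.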
For NicheNet analysis, we derived TME cell type signatures by taking the top 100 differentially expressed genes in cells isolated from tumors or normal skin, including B cells, endothelial cells, fibroblasts, Langerhans cells, plasmacytoid DCs, CD1C DCs, CLEC9A DCs, T cells, NK cells, macrophages, and MDSCs (anyone familiar with this software will recognize this: it needs a list of target genes as input, but choosing those target genes takes care, and it is not simply a matter of differences between clusters). We input these signatures into NicheNet to derive a union set of predicted ligands modulating tumor-specific TME cell type signatures (ligands are predicted from the target genes). For ligands predicting TSK modulation, we input the top 100 TSK-differentially expressed genes. The top 15% of predicted ligands (ligand selection) by regulatory potential that also demonstrated significance in our scRNA-seq ligand-receptor interaction analysis [...]. We used the FindAllMarkers function in Seurat to generate average logFC values per cell type compared to other cell types from the scRNAseq data. (Be sure to note this.)
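The selection rule in the passage above (top 15% of predicted ligands by regulatory potential, restricted to those that were also significant in the ligand-receptor analysis) can be illustrated with plain Python. NicheNet itself is an R package and is not called here; the function name, the input dictionaries, and the placeholder gene names are assumptions made purely for illustration.

```python
import math

def select_ligand_pairs(ligand_scores, lr_pvalues, top_frac=0.15, p_cutoff=0.001):
    """Keep ligand-receptor pairs whose ligand ranks in the top fraction by
    regulatory potential and whose pair is significant.

    ligand_scores: {ligand: regulatory potential score}  (hypothetical input)
    lr_pvalues:    {(ligand, receptor): P value}         (hypothetical input)
    """
    # Rank ligands from highest to lowest score.
    ranked = sorted(ligand_scores, key=ligand_scores.get, reverse=True)
    n_keep = max(1, math.ceil(top_frac * len(ranked)))
    top_ligands = set(ranked[:n_keep])
    # Intersect the top ligands with the significant pairs.
    return sorted(
        pair for pair, p in lr_pvalues.items()
        if pair[0] in top_ligands and p < p_cutoff
    )

# Placeholder names only; these are not results from the paper.
example_scores = {f"LIG{i}": 1.0 - 0.1 * i for i in range(10)}
example_pvalues = {("LIG0", "REC0"): 0.0, ("LIG1", "REC1"): 0.5, ("LIG5", "REC5"): 0.0}
example_pairs = select_ligand_pairs(example_scores, example_pvalues)
```

In this toy input only the two best-scoring ligands fall within the top 15%, and only one of them belongs to a significant pair, so a single pair survives both filters.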
For ligand-receptor spatial transcriptomic proximity analysis, the average value of all ligand-receptor pairs across the leading edge from the eight sections from patients 2, 4, and 10 were calculated first by averaging the ligand and receptor expression among each leading edge spot and its 4-6 nearest neighbors (depending on ST technology), and then taking the average values of all of these groups of five or seven spots across the leading edge. This calculation for each ligand-receptor pair was then performed on 1,000 randomized permutations of spot identities while preserving total number of spots per replicate section to generate a null distribution per patient. P value was calculated by number of randomized permutation calculations that exceeded the true average. (This is the leading-edge analysis.)
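The leading-edge proximity test above can be sketched for a single section as follows. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, the per-spot value is assumed to be the mean of ligand and receptor expression, neighbors are found by brute force, and the P value is reported as the fraction of permutations exceeding the observed average.

```python
import numpy as np

def nearest_groups(coords, n_neighbors):
    """Index array (n_spots, n_neighbors + 1): each spot, then its nearest spots."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # every spot ranks first in its own group
    return np.argsort(dist, axis=1, kind="stable")[:, : n_neighbors + 1]

def leading_edge_score(ligand, receptor, groups, edge_mask):
    """Mean ligand/receptor value over the spot groups centered on edge spots."""
    lr = 0.5 * (ligand + receptor)         # assumed per-spot pair value
    group_means = lr[groups].mean(axis=1)  # one value per spot group
    return group_means[edge_mask].mean()

def leading_edge_pvalue(ligand, receptor, coords, edge_mask,
                        n_neighbors=6, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    edge_mask = np.asarray(edge_mask, dtype=bool)
    groups = nearest_groups(coords, n_neighbors)
    observed = leading_edge_score(ligand, receptor, groups, edge_mask)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle which expression profile sits at which position; the
        # number of spots in the section stays the same.
        perm = rng.permutation(len(ligand))
        if leading_edge_score(ligand[perm], receptor[perm], groups, edge_mask) > observed:
            exceed += 1
    return observed, exceed / n_perm
```

For several sections, one would run the shuffle within each section separately so that the spot count per replicate section is preserved, as the quoted methods require.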
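To summarize briefly: CellPhoneDB analyzes the ligand-receptor pairs; then, starting from the target genes of interest, NicheNet picks out the highly active ligands, which are matched against the significant ligand-receptor pairs from CellPhoneDB to complete the analysis. It sounds simple, but actually doing it takes real judgment and skill.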
Life is good, and it is waiting for you to go further.