2024-03-29 Plotting genotype distributions as boxplots with significance annotations

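After a GWAS analysis, we often want to visualize how the phenotype is distributed across the genotype classes of a significant SNP in the population. The most popular approach at present is a boxplot combined with scatter points and significance labels, as in the figure below:

[Figure]

In the figure, A, B, and C are three genotypes (for example AA, AT, and TT). Genotype A is significantly lower than B and C, while B and C do not differ significantly. The plot combines a boxplot, scatter points, and significance labels, so it is very intuitive: one picture is worth a thousand words.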

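Beyond the plot above, several other plot types can be drawn. They are summarized below.

This post produces the following plots:

「Single factor, two levels: boxplot with t-test」

[Figure]

「Single factor, three levels: boxplot with t-tests」

[Figure]

「Single factor, three levels: bar plot」
[Figure]

「Single factor, three levels: line plot」

[Figure]

「Two factors: bar plot」
[Figure]

「Two factors: line plot」
[Figure]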
  1. Single factor, two levels
    A typical example of this design: two varieties are compared for plant height, and 10 plants are measured per variety.

「Simulated data:」

set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
dd = data.frame(Group = rep(c("A","B"),each=10),y = c(y1,y2))
dd
str(dd)
dd$Group = as.factor(dd$Group)

「Data:」

> dd
   Group         y
1      A  4.439524
2      A  4.769823
3      A  6.558708
4      A  5.070508
5      A  5.129288
6      A  6.715065
7      A  5.460916
8      A  3.734939
9      A  4.313147
10     A  4.554338
11     B 16.224082
12     B 15.359814
13     B 15.400771
14     B 15.110683
15     B 14.444159
16     B 16.786913
17     B 15.497850
18     B 13.033383
19     B 15.701356
20     B 14.527209

The plots here are drawn with the ggpubr package:
1.1 Draw a boxplot

library(ggplot2)
library(ggpubr)
ggboxplot(dd,x = "Group",y = "y")
[Figure]

1.2 Color the boxes by group

ggboxplot(dd,x = "Group",y = "y",color = "Group")
[Figure]

1.3 Add scatter points to the boxplot

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter")
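[Figure]

1.4 Boxplot + scatter points + significance level
For a two-group comparison, stat_compare_means() defaults to the non-parametric Wilcoxon test. To use a t-test instead, see the next step.

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means()
[Figure]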
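The p-value printed on the plot can be cross-checked in base R. The following is a minimal sketch, assuming that stat_compare_means() hands the two groups to wilcox.test() by default and to t.test() (Welch's version, which is R's default) when method = "t.test" is requested. The object names res_wilcox and res_t are introduced here for illustration only.

```r
# Rebuild the two-level data exactly as in the simulation above
set.seed(123)
y1 <- rnorm(10) + 5
y2 <- rnorm(10) + 15
dd <- data.frame(Group = factor(rep(c("A", "B"), each = 10)),
                 y = c(y1, y2))

# Wilcoxon rank-sum test (assumed to match the two-group default)
res_wilcox <- wilcox.test(y ~ Group, data = dd)

# Welch two-sample t-test (assumed to match method = "t.test")
res_t <- t.test(y ~ Group, data = dd)

print(res_wilcox$p.value)
print(res_t$p.value)
```

Because every value in group A lies below every value in group B, the Wilcoxon test already reaches the smallest p-value possible for two groups of 10, while the t-test can report a much smaller one. This is one reason the two annotations can look very different on the same data.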

1.5 Use t.test as the statistical method

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means(method = "t.test")
[Figure]

1.6 Show significance symbols directly

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means(method = "t.test",label = "p.signif")
[Figure]
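To make the symbols easier to read, the mapping from p-values to stars can be written out explicitly. This is a sketch only: the cutpoints below are what I understand to be ggpubr's documented defaults (ns for p > 0.05, then one star each at 0.05, 0.01, 0.001, and 0.0001), and the helper p_to_signif is a hypothetical name, not a ggpubr function.

```r
# Hypothetical helper: convert p-values to significance symbols.
# The cutpoints are assumed to mirror the ggpubr defaults.
p_to_signif <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.0001, 0.001, 0.01, 0.05, Inf),
                   labels = c("****", "***", "**", "*", "ns"),
                   include.lowest = TRUE, right = TRUE))
}

signif_labels <- p_to_signif(c(0.00001, 0.0005, 0.004, 0.03, 0.2))
print(signif_labels)
```

If your ggpubr version uses different thresholds, the symnum.args argument of stat_compare_means() is the place to check or override them.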
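  2. Single factor, three levels
    Two levels can be compared with a t-test, but how should data with three or more levels be tested? Here we use a one-way ANOVA for the overall test, followed by pairwise t-tests.

「Simulated data:」

# Build three levels for ANOVA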
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
y3 = rnorm(10) + 15

dd = data.frame(Group = rep(c("A","B","C"),each=10),y = c(y1,y2,y3))
dd
str(dd)
dd$Group = as.factor(dd$Group)

「The data look like this:」

> dd
   Group         y
1      A  4.439524
2      A  4.769823
3      A  6.558708
4      A  5.070508
5      A  5.129288
6      A  6.715065
7      A  5.460916
8      A  3.734939
9      A  4.313147
10     A  4.554338
11     B 16.224082
12     B 15.359814
13     B 15.400771
14     B 15.110683
15     B 14.444159
16     B 16.786913
17     B 15.497850
18     B 13.033383
19     B 15.701356
20     B 14.527209
21     C 13.932176
22     C 14.782025
23     C 13.973996
24     C 14.271109
25     C 14.374961
26     C 13.313307
27     C 15.837787
28     C 15.153373
29     C 13.861863
30     C 16.253815

2.1 Boxplot + scatter points

p = ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") 
p
[Figure]

2.2 Boxplot + scatter points + significance

p + stat_compare_means(method = "anova")
[Figure]
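The overall ANOVA p-value shown on the plot can be reproduced without ggpubr. A minimal sketch, assuming method = "anova" corresponds to an ordinary one-way ANOVA on y ~ Group; the names anova_tab and anova_p are illustrative.

```r
# Rebuild the three-level data exactly as in the simulation above
set.seed(123)
y1 <- rnorm(10) + 5
y2 <- rnorm(10) + 15
y3 <- rnorm(10) + 15
dd <- data.frame(Group = factor(rep(c("A", "B", "C"), each = 10)),
                 y = c(y1, y2, y3))

# One-way ANOVA table: one row for Group, one for the residuals
anova_tab <- anova(lm(y ~ Group, data = dd))
anova_p <- anova_tab[["Pr(>F)"]][1]

print(anova_tab)
print(anova_p)
```

A small p-value here only says that at least one group mean differs; it does not say which groups differ, which is why pairwise comparisons are still needed.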

2.3 Pairwise significance brackets

my_comparisons = list( c("A", "B"), c("A", "C"), c("B", "C") )
p + stat_compare_means(comparisons = my_comparisons,
                       # label = "p.signif",
                       method = "t.test")
[Figure]
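With three pairwise tests, multiple testing becomes a concern. As far as I know, the brackets drawn through the comparisons argument show unadjusted p-values; treat that as an assumption to verify against your ggpubr version. The sketch below uses base R's pairwise.t.test() to show the raw and the Holm-adjusted p-values side by side. pool.sd = FALSE is chosen so that each pair gets its own Welch t-test.

```r
# Rebuild the three-level data exactly as in the simulation above
set.seed(123)
y1 <- rnorm(10) + 5
y2 <- rnorm(10) + 15
y3 <- rnorm(10) + 15
dd <- data.frame(Group = factor(rep(c("A", "B", "C"), each = 10)),
                 y = c(y1, y2, y3))

# Separate Welch t-test for each pair, without p-value adjustment
pw_raw <- pairwise.t.test(dd$y, dd$Group,
                          p.adjust.method = "none", pool.sd = FALSE)

# Same tests with Holm correction for multiple comparisons
pw_holm <- pairwise.t.test(dd$y, dd$Group,
                           p.adjust.method = "holm", pool.sd = FALSE)

print(pw_raw$p.value)   # rows B, C; columns A, B
print(pw_holm$p.value)
```

On this simulated data A differs clearly from both B and C, while B and C do not differ, which matches how the data were generated (B and C share the same mean).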

2.4 Show significance symbols

p + stat_compare_means(comparisons = my_comparisons,
                       label = "p.signif",
                       method = "t.test")
  3. Two factors
    「Simulated data:」
# Two-factor data
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 8
y3 = rnorm(10) + 7
y4 = rnorm(10) + 15
y5 = rnorm(10) + 18
y6 = rnorm(10) + 17

dd = data.frame(Group1 = rep(c("A","B","C"),each=10),
                Group2 = rep(c("X","Y"),each=30),
                y = c(y1,y2,y3,y4,y5,y6))
dd
str(dd)
dd$Group1 = as.factor(dd$Group1)
dd$Group2 = as.factor(dd$Group2)
str(dd)

「Data preview:」

> dd
   Group1 Group2         y
1       A      X  4.439524
2       A      X  4.769823
3       A      X  6.558708
4       A      X  5.070508
5       A      X  5.129288
6       A      X  6.715065
7       A      X  5.460916
8       A      X  3.734939
9       A      X  4.313147
10      A      X  4.554338
11      B      X  9.224082
12      B      X  8.359814
13      B      X  8.400771
14      B      X  8.110683
15      B      X  7.444159
16      B      X  9.786913
17      B      X  8.497850
18      B      X  6.033383
19      B      X  8.701356
20      B      X  7.527209
21      C      X  5.932176
22      C      X  6.782025
23      C      X  5.973996
24      C      X  6.271109
25      C      X  6.374961
26      C      X  5.313307
27      C      X  7.837787
28      C      X  7.153373
29      C      X  5.861863
30      C      X  8.253815
31      A      Y 15.426464
32      A      Y 14.704929
33      A      Y 15.895126
34      A      Y 15.878133
35      A      Y 15.821581
36      A      Y 15.688640
37      A      Y 15.553918
38      A      Y 14.938088
39      A      Y 14.694037
40      A      Y 14.619529
41      B      Y 17.305293
42      B      Y 17.792083
43      B      Y 16.734604
44      B      Y 20.168956
45      B      Y 19.207962
46      B      Y 16.876891
47      B      Y 17.597115
48      B      Y 17.533345
49      B      Y 18.779965
50      B      Y 17.916631
51      C      Y 17.253319
52      C      Y 16.971453
53      C      Y 16.957130
54      C      Y 18.368602
55      C      Y 16.774229
56      C      Y 18.516471
57      C      Y 15.451247
58      C      Y 17.584614
59      C      Y 17.123854
60      C      Y 17.215942

3.1 Draw a grouped boxplot

p = ggboxplot(dd,x = "Group1",y="y",color = "Group2",
              add = "jitter")
p
[Figure]

3.2 Add p-values

p + stat_compare_means(aes(group = Group2),method = "t.test")
[Figure]
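With aes(group = Group2), one test is run inside each Group1 level. The same per-level comparison can be sketched in base R by splitting the data on Group1 and running a t-test of X against Y in each piece. This assumes the plotted p-values come from Welch t-tests; p_by_group is an illustrative name.

```r
# Rebuild the two-factor data exactly as in the simulation above
set.seed(123)
y <- c(rnorm(10) + 5, rnorm(10) + 8, rnorm(10) + 7,
       rnorm(10) + 15, rnorm(10) + 18, rnorm(10) + 17)
dd <- data.frame(Group1 = factor(rep(c("A", "B", "C"), each = 10, times = 2)),
                 Group2 = factor(rep(c("X", "Y"), each = 30)),
                 y = y)

# One X-versus-Y t-test within each level of Group1
p_by_group <- sapply(split(dd, dd$Group1),
                     function(d) t.test(y ~ Group2, data = d)$p.value)

print(p_by_group)
```

The result is a named vector with one p-value per Group1 level, in the same order as the boxes on the x axis.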

3.3 Switch to significance symbols

p + stat_compare_means(aes(group = Group2),method = "t.test",label = "p.signif")

[Figure]

3.4 Plot each group in its own panel

p = ggboxplot(dd,x = "Group2",y="y",color = "Group1",
              add = "jitter",facet.by = "Group1")
p
[Figure]

3.5 Show the test result in each panel

p + stat_compare_means(method = "t.test")
[Figure]

3.6 Show significance symbols in each panel

p + stat_compare_means(method = "t.test",label = "p.signif",label.y = 17)

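[Figure]
  4. Single factor: bar plot

Drawing a bar plot with standard-error bars used to take a lot of ggplot2 code; ggpubr offers a shorter route. The examples below reuse the two-factor dd from section 3, so each Group1 bar pools the X and Y observations.
4.1 Bar plot + standard error

p = ggbarplot(dd,x = "Group1",y = "y",add = "mean_se",color = "Group1")
p
[Figure]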
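The bar heights and error bars requested by add = "mean_se" are the group mean and the standard error of the mean. A minimal base R sketch of those two quantities, assuming the usual definition SE = sd / sqrt(n); summary_tab is an illustrative name.

```r
# Rebuild the two-factor data exactly as in the simulation above
set.seed(123)
y <- c(rnorm(10) + 5, rnorm(10) + 8, rnorm(10) + 7,
       rnorm(10) + 15, rnorm(10) + 18, rnorm(10) + 17)
dd <- data.frame(Group1 = factor(rep(c("A", "B", "C"), each = 10, times = 2)),
                 Group2 = factor(rep(c("X", "Y"), each = 30)),
                 y = y)

# Mean and standard error of y for each Group1 level (X and Y pooled)
grp_mean <- tapply(dd$y, dd$Group1, mean)
grp_se   <- tapply(dd$y, dd$Group1, function(x) sd(x) / sqrt(length(x)))

summary_tab <- data.frame(Group1 = names(grp_mean),
                          mean = as.numeric(grp_mean),
                          se = as.numeric(grp_se))
print(summary_tab)
```

Because X and Y differ by about 10 units, pooling them inflates the standard error of each bar; splitting by Group2, as in the two-factor plots, gives much tighter error bars.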

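4.2 Bar plot + standard error + significance
my_comparisons is the list of pairs defined in section 2.3.

p + stat_compare_means(method = "anova",label.y = 15)+ 
  stat_compare_means(comparisons = my_comparisons)
[Figure]
  5. Single factor: line plot
    5.1 Line plot + standard error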
p = ggline(dd,x = "Group1",y = "y",add = "mean_se")
p
[Figure]

5.2 Line plot + standard error + significance

p + stat_compare_means(method = "anova",label.y = 15)+ 
  stat_compare_means(comparisons = my_comparisons)
[Figure]
  6. Two factors: bar plot
    6.1 Bar plot + standard error
p = ggbarplot(dd,x = "Group1",y = "y",add = "mean_se",color = "Group2", position = position_dodge(0.8))
p
[Figure]

6.2 Bar plot + standard error + significance

p + stat_compare_means(aes(group=Group2), label = "p.signif")

[Figure]
  7. Two factors: line plot
    7.1 Line plot + standard error
p = ggline(dd,x = "Group1",y = "y",add = "mean_se",color = "Group2", position = position_dodge(0.8))
p
[Figure]

7.2 Line plot + standard error + significance

p + stat_compare_means(aes(group=Group2), label = "p.signif")
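[Figure]
  8. Code summary
    The code below collects everything from this post: it generates the data and draws each type of plot. Once your data are arranged in the same format, the plots can be produced directly, which makes this a simple and quick route for beginners. Show you the code!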

# Two levels: t-test
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
dd = data.frame(Group = rep(c("A","B"),each=10),y = c(y1,y2))
dd
str(dd)
dd$Group = as.factor(dd$Group)

library(ggplot2)
library(ggpubr)
ggboxplot(dd,x = "Group",y = "y")
ggboxplot(dd,x = "Group",y = "y",color = "Group")
ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter")
ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means()

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means(method = "t.test")

ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") + 
  stat_compare_means(method = "t.test",label = "p.signif")




# Three levels: ANOVA
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
y3 = rnorm(10) + 15

dd = data.frame(Group = rep(c("A","B","C"),each=10),y = c(y1,y2,y3))
dd
str(dd)
dd$Group = as.factor(dd$Group)

p = ggboxplot(dd,x = "Group",y = "y",color = "Group",add = "jitter") 
p
p + stat_compare_means(method = "anova")


# Perform pairwise comparisons
# compare_means(y ~ Group,  data = dd,method = "anova")

my_comparisons = list( c("A", "B"), c("A", "C"), c("B", "C") )
p + stat_compare_means(comparisons = my_comparisons,
                       # label = "p.signif",
                       method = "t.test")

p + stat_compare_means(comparisons = my_comparisons,
                       label = "p.signif",
                       method = "t.test")

# Two-factor data
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 8
y3 = rnorm(10) + 7
y4 = rnorm(10) + 15
y5 = rnorm(10) + 18
y6 = rnorm(10) + 17

dd = data.frame(Group1 = rep(c("A","B","C"),each=10),
                Group2 = rep(c("X","Y"),each=30),
                y = c(y1,y2,y3,y4,y5,y6))
dd
str(dd)
dd$Group1 = as.factor(dd$Group1)
dd$Group2 = as.factor(dd$Group2)
str(dd)

## Grouped view: Group2 compared within each Group1 level
p = ggboxplot(dd,x = "Group1",y="y",color = "Group2",
              add = "jitter")
p
p + stat_compare_means(aes(group = Group2),method = "t.test")
p + stat_compare_means(aes(group = Group2),method = "t.test",label = "p.signif")

## Faceted view: one panel per Group1 level
p = ggboxplot(dd,x = "Group2",y="y",color = "Group1",
              add = "jitter",facet.by = "Group1")
p
p + stat_compare_means(method = "t.test")
p + stat_compare_means(method = "t.test",label = "p.signif",label.y = 17)


# Single factor
# Three-level bar plot
p = ggbarplot(dd,x = "Group1",y = "y",add = "mean_se",color = "Group1")
p
p + stat_compare_means(method = "anova",label.y = 15)+ 
  stat_compare_means(comparisons = my_comparisons)

# Line plot with error bars
p = ggline(dd,x = "Group1",y = "y",add = "mean_se")
p
p + stat_compare_means(method = "anova",label.y = 15)+ 
  stat_compare_means(comparisons = my_comparisons)


# Two factors
p = ggbarplot(dd,x = "Group1",y = "y",add = "mean_se",color = "Group2", position = position_dodge(0.8))
p
p + stat_compare_means(aes(group=Group2), label = "p.signif")


# Line plot with error bars
p = ggline(dd,x = "Group1",y = "y",add = "mean_se",color = "Group2", position = position_dodge(0.8))
p
p + stat_compare_means(aes(group=Group2), label = "p.signif")

This article is reposted from 育种数据分析之放飞自我.
