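So far, the main place I have needed factors is plotting (box plots, for example), where the level order controls the order in which groups are drawn. The forcats package provides convenience functions for adjusting the order of a factor's levels.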
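1. Basic functions
- factor(): converts a non-factor vector into a factor. By default the levels are sorted alphabetically for character vectors and in ascending order for numeric vectors. Use the levels= argument to set a custom level order.
- levels(): shows which levels a factor has and in what order.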
test = factor(c("c","b","c"))
test
# [1] c b c
# Levels: b c
levels(test)
# [1] "b" "c"
factor(c("c","b","c"), levels = c("c","b"))
# [1] c b c
# Levels: c b
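A small sketch of the defaults described above, using made-up vectors (`num`, `chr`, and `partial` are illustrative names, not part of the original notes). It suggests that numeric input is ordered numerically, that the same digits stored as strings are ordered as text, and that any value missing from `levels=` turns into NA.

```r
# Numeric input: levels follow ascending numeric order
num <- factor(c(10, 2, 33, 2))
levels(num)
# [1] "2"  "10" "33"

# The same values as strings are sorted as text, so "10" comes before "2"
chr <- factor(c("10", "2", "33", "2"))
levels(chr)
# [1] "10" "2"  "33"

# Values that are not listed in levels= become NA
partial <- factor(c("c", "b", "c"), levels = "c")
is.na(partial)
# [1] FALSE  TRUE FALSE
```

The last case is worth keeping in mind when typing a custom order by hand: a misspelled level silently drops data rather than raising an error.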
2. Adjusting level order with the forcats package
forcats offers a wide range of tools for working with factors; these notes only cover the functions that deal with level order.
2.1 Reordering levels based on the vector itself
- fct_relevel(): sets a custom level order
library(forcats)
f <- factor(c("a", "c", "b", "a"),
            levels = c("a", "b", "c"))
fct_relevel(f, c("b", "c", "a"))
# [1] a c b a
# Levels: b c a
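fct_relevel() does not need the full list of levels. As a rough sketch reusing the same example factor, naming a single level moves it to the front, and the `after` argument (here `after = Inf`) moves it to the end instead.

```r
library(forcats)

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

# Name only the level to move; it goes first and the rest keep their order
front <- fct_relevel(f, "c")
levels(front)
# [1] "c" "a" "b"

# after = Inf moves the named level to the end instead
back <- fct_relevel(f, "a", after = Inf)
levels(back)
# [1] "b" "c" "a"
```

This is handy in plots when only one group, such as a control or an "Other" category, needs to sit first or last.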
- fct_infreq(): orders levels by frequency, most frequent first
f <- factor(c("b", "c", "c", "b","a","b"))
fct_infreq(f)
# [1] b c c b a b
# Levels: b c a
- fct_inorder(): orders levels by first appearance in the vector
f <- factor(c("b", "c", "c", "a","b","a"))
fct_inorder(f)
# [1] b c c a b a
# Levels: b c a
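For comparison, the same ordering can be built in base R by passing `unique()` of the raw values as the levels. The sketch below assumes the data is still available as a plain character vector (`x` is an illustrative name).

```r
library(forcats)

x <- c("b", "c", "c", "a", "b", "a")

# Base R: unique() keeps values in order of first appearance
by_hand <- factor(x, levels = unique(x))

# forcats: same idea, applied to an existing factor
by_forcats <- fct_inorder(factor(x))

identical(levels(by_hand), levels(by_forcats))
# [1] TRUE
```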
- fct_rev(): reverses the level order
f <- factor(c("b", "c", "c", "a","b","a"))
fct_rev(f)
# [1] b c c a b a
# Levels: c b a
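These helpers can be chained. As a minimal sketch with the frequency example from earlier, wrapping fct_infreq() in fct_rev() gives the least frequent level first, which is often what a horizontal bar chart needs.

```r
library(forcats)

f <- factor(c("b", "c", "c", "b", "a", "b"))
table(f)  # counts: a = 1, b = 3, c = 2

most_first <- fct_infreq(f)
levels(most_first)
# [1] "b" "c" "a"

least_first <- fct_rev(fct_infreq(f))
levels(least_first)
# [1] "a" "c" "b"
```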
2.2 Reordering levels based on a related variable (a data.frame column)
The built-in iris dataset serves as the example.
data("iris")
summary(iris)
#   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
#  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
#  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
#  Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
#  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
#  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
#  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
str(iris)
# 'data.frame': 150 obs. of 5 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
levels(iris$Species)
# [1] "setosa" "versicolor" "virginica"
fct_reorder()
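- The first argument is the factor to reorder, the second is the related variable, and the third is the summary function applied to that variable within each level
- For example, to order Species by mean sepal width, from low to high (mean is passed explicitly as .fun because the default summary function is median):
levels(fct_reorder(iris$Species, iris$Sepal.Width, .fun = mean))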
# [1] "versicolor" "virginica" "setosa"
aggregate(iris$Sepal.Width, by=list(iris$Species), mean)
#      Group.1     x
# 1     setosa 3.428
# 2 versicolor 2.770
# 3  virginica 2.974
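To make the mechanics concrete, here is a rough base R equivalent, offered as a sketch rather than as how forcats is implemented: compute the per-group means with tapply(), sort them, and use the sorted names as the new levels. The names `means`, `manual`, and `auto` are illustrative.

```r
library(forcats)
data("iris")

# Mean sepal width per species, as a named numeric vector
means <- tapply(iris$Sepal.Width, iris$Species, mean)

# Sort ascending and use the names as the level order
manual <- factor(iris$Species, levels = names(sort(means)))

# The same ordering via forcats
auto <- fct_reorder(iris$Species, iris$Sepal.Width, .fun = mean)

levels(manual)
# [1] "versicolor" "virginica"  "setosa"
identical(levels(manual), levels(auto))
# [1] TRUE
```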
- fct_reorder() uses median as its default summary function and sorts in ascending order by default (.desc = FALSE). Both can be changed as needed:
# aggregate(iris$Sepal.Width, by = list(iris$Species), median)
levels(fct_reorder(iris$Species, iris$Sepal.Width, .fun = median, .desc = TRUE))
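In iris the mean and the median happen to rank the species identically, so the choice of summary function is easy to overlook. The toy data below (`g` and `x` are made up for illustration) is a sketch of a case where the two disagree: group a has one large outlier that lifts its mean but not its median. It relies on the forcats documentation listing `.fun = median` as the default.

```r
library(forcats)

g <- factor(c("a", "a", "a", "b", "b", "b"))
x <- c(1, 1, 10, 2, 2, 2)
# a: median 1, mean 4;  b: median 2, mean 2

by_median <- fct_reorder(g, x)               # default summary: median
levels(by_median)
# [1] "a" "b"

by_mean <- fct_reorder(g, x, .fun = mean)    # the outlier pushes a behind b
levels(by_mean)
# [1] "b" "a"

by_median_desc <- fct_reorder(g, x, .desc = TRUE)
levels(by_median_desc)
# [1] "b" "a"
```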
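- The ggplot2 plots below demonstrate fct_reorder():
library(ggplot2)
library(forcats)
library(patchwork)
# Original level order of Species
p1 <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot() +
  ggtitle("default factor levels")
# Default fct_reorder(): ascending by median Sepal.Width
p2 <- ggplot(iris, aes(x = fct_reorder(Species, Sepal.Width), y = Sepal.Width)) +
  geom_boxplot() +
  ggtitle("fct_reorder default levels")
# .desc = TRUE: descending by median Sepal.Width
p3 <- ggplot(iris, aes(x = fct_reorder(Species, Sepal.Width, .desc = TRUE), y = Sepal.Width)) +
  geom_boxplot() +
  ggtitle("fct_reorder descending levels")
# patchwork places the three plots side by side
p1 | p2 | p3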
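An alternative worth considering is to reorder the factor in a copy of the data before plotting, rather than inside aes(). This is only a sketch (`iris_sorted` and `p4` are illustrative names); one practical difference is that the x-axis label then reads "Species" instead of the full fct_reorder(...) expression.

```r
library(ggplot2)
library(forcats)

data("iris")

# Work on a copy so the original iris stays untouched
iris_sorted <- iris
iris_sorted$Species <- fct_reorder(
  iris_sorted$Species, iris_sorted$Sepal.Width, .desc = TRUE
)
levels(iris_sorted$Species)
# [1] "setosa"     "virginica"  "versicolor"

# The plot picks up the stored level order
p4 <- ggplot(iris_sorted, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot() +
  ggtitle("levels reordered in the data")
p4
```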