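Simple vertical merge
merge(df1, df2, all = TRUE)
# all = TRUE keeps unmatched rows from both tables (a full outer join on the shared columns); to stack rows without any matching, use rbind(df1, df2)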
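To show how merge(..., all = TRUE) differs from plain row stacking, here is a minimal sketch. The two small tables and their column names (id, value) are made up for illustration.

```r
# Hypothetical example data: both tables have the same columns, id and value
df1 <- data.frame(id = c(1, 2), value = c("a", "b"))
df2 <- data.frame(id = c(2, 3), value = c("b", "c"))

# merge() matches rows on all shared columns; all = TRUE also keeps unmatched rows
merged <- merge(df1, df2, all = TRUE)

# rbind() only stacks the rows, so the shared row (2, "b") appears twice
stacked <- rbind(df1, df2)

nrow(merged)   # 3
nrow(stacked)  # 4
```

If the goal is only to append one table under the other, rbind() (or dplyr's bind_rows()) is probably the closer fit.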
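Merging data frames with dplyr's join family
# Load the tidyverse package (it includes dplyr)
library(tidyverse)
# In the calls below, replace '?' with the name of the key column
# Full join: keeps all rows from both tables (union)
full_join(df1, df2, by = '?')
# Inner join: keeps only rows whose key appears in both tables (intersection)
inner_join(df1, df2, by = '?')
# Left join: keeps all rows of df1 and adds matching columns from df2
left_join(df1, df2, by = '?')
# Right join: keeps all rows of df2 and adds matching columns from df1
right_join(df1, df2, by = '?')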
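The four joins are easiest to compare on a tiny example. The sketch below assumes two made-up tables that share a key column named id; the column names x and y are also hypothetical.

```r
library(dplyr)

# Hypothetical tables sharing the key column "id"
df1 <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
df2 <- data.frame(id = c(2, 3, 4), y = c(TRUE, FALSE, TRUE))

full  <- full_join(df1, df2, by = "id")   # ids 1, 2, 3, 4; unmatched cells are NA
inner <- inner_join(df1, df2, by = "id")  # ids 2, 3
left  <- left_join(df1, df2, by = "id")   # ids 1, 2, 3; y is NA for id 1
right <- right_join(df1, df2, by = "id")  # ids 2, 3, 4; x is NA for id 4
```

Only the set of rows kept differs between the four calls; the result always has the columns id, x and y.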
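Splitting data
- separate_rows(df_x, col_y, sep = ",")
Splits the values in col_y at the delimiter and puts each piece on its own row.
Note: sep is interpreted as a regular expression, so when the delimiter is "|" it must be escaped: sep = '\\|'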
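As a minimal sketch of the "|" case, assuming a made-up data frame df_x whose col_y column packs several values into one cell:

```r
library(tidyr)

# Hypothetical data: col_y holds several values joined by "|"
df_x <- data.frame(id = c(1, 2),
                   col_y = c("a|b", "c"),
                   stringsAsFactors = FALSE)

# sep is a regular expression, so the literal "|" has to be escaped
long <- separate_rows(df_x, col_y, sep = "\\|")

nrow(long)  # 3: the row with id 1 is split into two rows
```

The other columns (here id) are repeated for every piece that comes out of the same original row.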
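Deduplication
Remove duplicate rows, keeping one copy of each
- unique(df): returns df with duplicate rows removed
- duplicated(df): returns a logical vector marking rows that repeat an earlier row; df[!duplicated(df), ] keeps the first occurrence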
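Both functions can be tried on a small data frame. The data below is invented for illustration, and the two approaches should give the same rows.

```r
# Hypothetical data: the third row repeats the first
df_dup <- data.frame(Name = c("Tom", "Nick", "Tom"),
                     Age = c(23, 30, 23))

# unique() returns the data frame with repeated rows dropped
dedup1 <- unique(df_dup)

# duplicated() flags rows that repeat an earlier row; negate it to keep first occurrences
dedup2 <- df_dup[!duplicated(df_dup), ]

nrow(dedup1)  # 2
nrow(dedup2)  # 2
```

duplicated() is the more flexible of the two: passing a single column, for example df_dup[!duplicated(df_dup$Name), ], deduplicates on that column alone.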
Removing rows with missing values
- Delete the whole row if it contains any NA;
# Create a data frame
df <- data.frame(Name = c("Tom", "Nick", "John", "Tom"),
                 Age = c(23, NA, 21, 25),
                 Score = c(85, 90, NA, 88))
# Use na.omit() to drop every row that contains an NA value
df <- na.omit(df)