These are notes for my own reference.
Counting the number of TRUE/FALSE values
> a = c(TRUE, TRUE, FALSE, TRUE, FALSE)
> sum(a)
[1] 3
> b <- data.frame(x=c(1, 2, 3, NA), y=c(NA, 4, 5, NA), z=c(6, 7, 8, 9))
> table(is.na(b))
FALSE  TRUE
    9     3
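A few related counts, as a minimal sketch in base R: `sum(!a)` gives the number of FALSE values, `table(a)` gives both counts at once, and `colSums(is.na(b))` gives the NA count per column. The names `n_true`, `n_false`, `counts`, `n_na_total`, and `n_na_by_col` are only for illustration.

```r
# Logical vector from the example above
a <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

n_true  <- sum(a)    # TRUE counts as 1, FALSE as 0
n_false <- sum(!a)   # negate first to count the FALSE values
counts  <- table(a)  # both counts in one table

# Data frame with missing values from the example above
b <- data.frame(x = c(1, 2, 3, NA), y = c(NA, 4, 5, NA), z = c(6, 7, 8, 9))

n_na_total  <- sum(is.na(b))      # total number of NA cells
n_na_by_col <- colSums(is.na(b))  # number of NA cells in each column

print(n_false)
print(n_na_by_col)
```

If the logical vector itself can contain NA, `sum(a)` returns NA; `sum(a, na.rm = TRUE)` skips the missing entries.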
Omitting rows with specific IDs from a dataset
> df = data.frame(ID = c("ID1","ID2","ID3","ID4","ID5","ID6","ID7"),
+                 score1=c(4,6,2,9,5,7,8),
+                 score2=c(45,78,44,89,66,49,72),
+                 score3=c(56,52,45,88,33,90,47))
> df
   ID score1 score2 score3
1 ID1      4     45     56
2 ID2      6     78     52
3 ID3      2     44     45
4 ID4      9     89     88
5 ID5      5     66     33
6 ID6      7     49     90
7 ID7      8     72     47
>
> omit_ID <- c("ID2","ID3","ID6")
> # keep only the rows whose ID is not listed in omit_ID
> df2 <- df[!(df$ID %in% omit_ID), ]
> df2
   ID score1 score2 score3
1 ID1      4     45     56
4 ID4      9     89     88
5 ID5      5     66     33
7 ID7      8     72     47
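For reference, a minimal sketch of three equivalent ways to write the "not in this list" condition in base R. `match()` returns the position of each ID in `omit_ID`, or NA when there is no match, so `is.na(match(...))` and `!(... %in% ...)` pick the same rows. The names `keep_a`, `keep_b`, and `df3` are only for illustration.

```r
df <- data.frame(
  ID     = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6", "ID7"),
  score1 = c(4, 6, 2, 9, 5, 7, 8),
  score2 = c(45, 78, 44, 89, 66, 49, 72),
  score3 = c(56, 52, 45, 88, 33, 90, 47)
)
omit_ID <- c("ID2", "ID3", "ID6")

# match() gives NA for IDs that are not in omit_ID
keep_a <- is.na(match(df$ID, omit_ID))

# %in% gives a plain TRUE/FALSE vector, so it can be negated directly
keep_b <- !(df$ID %in% omit_ID)

df2 <- df[keep_b, ]

# subset() expresses the same condition without repeating df$
df3 <- subset(df, !(ID %in% omit_ID))

# Subsetting keeps the original row names (1, 4, 5, 7); this renumbers them
rownames(df2) <- NULL
print(df2)
```

Resetting the row names is optional; it only matters when the leftover numbering would be confusing later.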
Omitting rows where a specific column contains a specific keyword (case-insensitive)
> df
  ID        score1 score2 score3
1  1           Yes     45     56
2  2            No     78     52
3  3     Maybe yes     44     45
4  4            No     89     88
5  5  No Thank you     66     33
6  6      Yes I do     49     90
7  7 of course Yes     72     47
>
> library(dplyr)  # provides %>% and filter()
> df2 <- df %>% filter(!grepl("yes", score1, ignore.case = TRUE))
> df2
  ID       score1 score2 score3
1  2           No     78     52
2  4           No     89     88
3  5 No Thank you     66     33
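The same filter can also be written without dplyr. A minimal sketch in base R, rebuilding the table shown above by hand; the whole-word variation at the end is my own assumption about a likely follow-up need, not part of the original memo.

```r
df <- data.frame(
  ID     = 1:7,
  score1 = c("Yes", "No", "Maybe yes", "No", "No Thank you", "Yes I do", "of course Yes"),
  score2 = c(45, 78, 44, 89, 66, 49, 72),
  score3 = c(56, 52, 45, 88, 33, 90, 47)
)

# TRUE where score1 contains "yes" in any letter case
has_yes <- grepl("yes", df$score1, ignore.case = TRUE)

# Keep only the rows that do not contain the keyword
df2 <- df[!has_yes, ]
print(df2)

# Assumed variation: match "yes" only as a whole word,
# so a value such as "yesterday" would not be dropped
is_word <- grepl("\\byes\\b", c("Maybe yes", "yesterday"),
                 ignore.case = TRUE, perl = TRUE)
print(is_word)
```

Two small differences from the dplyr version: base subsetting keeps the original row names (2, 4, 5) instead of renumbering them, and because `grepl()` returns FALSE for NA, rows with a missing `score1` are kept.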
Renaming a specific column
> df
  ID        score1 score2 score3
1  1           Yes     45     56
2  2            No     78     52
3  3     Maybe yes     44     45
4  4            No     89     88
5  5  No Thank you     66     33
6  6      Yes I do     49     90
7  7 of course Yes     72     47
>
> df <- dplyr::select(df, No = ID, dplyr::everything())
> df
  No        score1 score2 score3
1  1           Yes     45     56
2  2            No     78     52
3  3     Maybe yes     44     45
4  4            No     89     88
5  5  No Thank you     66     33
6  6      Yes I do     49     90
7  7 of course Yes     72     47
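A rename can also be done in base R by assigning into `names()`. This is a minimal sketch; the small `df` below is a shortened, made-up version of the table above, used only to keep the example self-contained.

```r
df <- data.frame(
  ID     = 1:3,
  score1 = c("Yes", "No", "Maybe yes"),
  score2 = c(45, 78, 44)
)

# Rename by name rather than by position, so the column order does not matter
names(df)[names(df) == "ID"] <- "No"

print(names(df))
```

Unlike `select(..., everything())`, this leaves the column where it is. With dplyr loaded, `dplyr::rename(df, No = ID)` should do the same thing, using the same `new = old` order.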