
Is the teaser in the August issue of ar, "The legend that great hair makes you look absolutely cute," just a mistranslation of "legend"?


What do they even mean by "legend"? Ridiculous.

Putting counts on a box plot (reference)


R

# Label helper for stat_summary(): put the group size n 10 units above the group mean
give.n <- function(x){
  return(c(y = mean(x) + 10, label = length(x)))
}
ggplot(df, aes(qtr, value)) +
  geom_boxplot(colour = "gold", outlier.colour = "gray", outlier.size = 1) +
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +  # fun.y is deprecated since ggplot2 3.3.0
  stat_summary(fun.data = give.n, geom = "text", size = 2)
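To try the count labels without my data, here is a minimal sketch on made-up numbers. The data frame and the name give.n.df are assumptions for illustration only; the helper returns a one-row data frame instead of a named vector, which is the shape the stat_summary() documentation describes for fun.data. The plotting part only runs if ggplot2 is installed.

```r
# Hypothetical toy data: three quarters with 5, 8 and 12 observations
set.seed(1)
df <- data.frame(
  qtr   = rep(c("Q1", "Q2", "Q3"), times = c(5, 8, 12)),
  value = rnorm(25, mean = 50, sd = 5)
)

# Same idea as give.n, but returning a one-row data frame
give.n.df <- function(x) {
  data.frame(y = mean(x) + 10, label = length(x))
}

# The label each group would get, computed by hand with tapply()
counts <- tapply(df$value, df$qtr, function(x) give.n.df(x)$label)
print(counts)

# Draw the plot only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(qtr, value)) +
    geom_boxplot() +
    stat_summary(fun.data = give.n.df, geom = "text", size = 3)
}
```

The tapply() call is only there to check the labels; in the plot, stat_summary() applies the helper to each group itself.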

Putting various things on the plot


R

# Label helper: print the group mean, offset from the count label by 1% of the mean
give.l <- function(x){
  return(c(y = mean(x)*1.01 + 10, label = mean(x)))
}
ggplot(subset(sample, sample$lib_strategy == "RNA-Seq"), aes(qtr, log(total_sequences))) +
  geom_boxplot(colour = "lightskyblue", outlier.colour = "gray", outlier.size = 1) +
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +  # fun.y is deprecated since ggplot2 3.3.0
  stat_summary(fun.data = give.n, geom = "text", size = 2) +
  stat_summary(fun.data = give.l, geom = "text", size = 2)

I can't stand it when huge outliers wreck the y-axis (reference)

An excerpt of the code:


R

# create a dummy data frame with outliers

df = data.frame(y = c(-100, rnorm(100), 100))

# create boxplot that includes outliers
p0 = ggplot(df, aes(y = y)) + geom_boxplot(aes(x = factor(1)))

# compute lower and upper whiskers
ylim1 = boxplot.stats(df$y)$stats[c(1, 5)]

# scale y limits based on ylim1
p1 = p0 + coord_cartesian(ylim = ylim1*1.05)
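To see what the excerpt is actually picking out, here is a small base-R sketch: boxplot.stats()$stats holds the lower whisker, lower hinge, median, upper hinge and upper whisker, so indices 1 and 5 are the whisker ends. The helper pad_limits is a hypothetical name of mine, not part of the referenced code. I added it because multiplying by 1.05 only widens the range when the lower limit is negative and the upper one positive.

```r
# Same kind of dummy data as in the excerpt, with a fixed seed
set.seed(42)
y <- c(-100, rnorm(100), 100)

bs <- boxplot.stats(y)
whiskers <- bs$stats[c(1, 5)]  # elements 1 and 5 are the whisker ends

print(range(y))   # full range, dominated by the two extreme values
print(whiskers)   # range covered by the whiskers
print(bs$out)     # points flagged as outliers

# Hypothetical helper: widen limits by a fraction of their span on both sides
pad_limits <- function(lim, frac = 0.05) {
  lim + c(-1, 1) * frac * diff(lim)
}

print(pad_limits(c(10, 20)))  # 9.5 20.5, widened on both sides
print(c(10, 20) * 1.05)       # 10.5 21.0, the lower limit moves up instead
```

With all-positive data such as sequence counts, something like pad_limits(ylim1) may be a safer thing to hand to coord_cartesian() than ylim1*1.05.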


This is what I ended up with


R

ylim1 = log10(boxplot.stats(rnaseq_sample$total_sequences)$stats[c(1,5)])

ggplot(rnaseq_sample, aes(qtr, log10(total_sequences))) +
  geom_boxplot(colour = "lightskyblue", outlier.size = 2, outlier.colour = "gray") +
  coord_cartesian(ylim = ylim1+2) +
  stat_summary(fun = median, geom = "line", aes(group = 1)) +  # fun.y is deprecated since ggplot2 3.3.0
  stat_summary(fun.data = give.n, geom = "text", size = 2) +
  stat_summary(fun.data = give.l, geom = "text", size = 2)
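One thing worth checking here: taking log10 of the whiskers of the raw values is not the same as taking the whiskers of the log10 values, and the box plot above is drawn on the log10 scale. The sketch below uses made-up log-normal numbers as a stand-in for total_sequences (an assumption, not my real data) to show the gap.

```r
# Hypothetical stand-in for total_sequences: log-normal "read counts"
set.seed(7)
x <- 10^rnorm(200, mean = 6, sd = 1)

# Whiskers computed on the raw values, then converted to log10
from_raw <- log10(boxplot.stats(x)$stats[c(1, 5)])

# Whiskers computed directly on the log10 values (what the plot draws)
from_log <- boxplot.stats(log10(x))$stats[c(1, 5)]

print(from_raw)
print(from_log)
```

On skewed data like this, the upper whisker from the raw values falls well below the one on the log10 scale, so limits derived from it can cut off part of the plot unless they are shifted. Computing boxplot.stats() on log10(total_sequences) directly might avoid the need for a manual offset, though I have not verified that on the real data.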

Another option is to not draw the outliers at all


R

ggplot(data, aes(qtr, values)) + geom_boxplot(outlier.shape = NA)


I'll clean this up later


R

ggplot(sample, aes(log10(total_sequences), log10(max_length))) + geom_point(alpha = 1/10)


Scatter plots and so on