Getting the list of packages that depend on ggplot2 with rvest #rstatsj

Posted on 2015-04-16

Here is what prompted this.

If we scrape this page, shouldn't it be easy to put together at least a list?

So, here we go:

R
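library(rvest)

pkg_name <- "ggplot2"
url <- sprintf("http://cran.r-project.org/web/packages/%s/index.html", pkg_name)
# read_html() replaces html(), which was deprecated in later rvest releases
page <- read_html(url)

# The XPath positions below follow the CRAN page layout at the time of writing;
# the third table is presumably the reverse-dependency table
# (row 1: reverse depends, row 2: reverse imports).
depends <- page %>%
  html_nodes(xpath = '/html/body/table[3]/tr[1]/td[2]/a') %>%
  html_text()
imports <- page %>%
  html_nodes(xpath = '/html/body/table[3]/tr[2]/td[2]/a') %>%
  html_text()

targets <- c(depends, imports)

library(lambdaR)

# Visit each package page and collect its title and description
result <- targets %>% Map_(pkg_name: {
  cat(pkg_name, "\n")  # progress output
  url <- sprintf("http://cran.r-project.org/web/packages/%s/index.html", pkg_name)
  Sys.sleep(1)  # wait one second between requests to go easy on the server
  page <- read_html(url)

  title <- page %>%
    html_nodes(xpath = '/html/body/h2') %>%
    html_text()
  desc <- page %>%
    html_nodes(xpath = '/html/body/p') %>%
    html_text()

  data.frame(name=pkg_name, title, desc)
}) %>% Reduce_(rbind)  # stack the per-package data frames into one

write.csv(result, "ggplot2_extend.csv", row.names=FALSE)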
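
The positional XPath used above (`table[3]/tr[1]`) stops working as soon as the page layout shifts. As a rough sketch of an alternative, the rows could be picked by their label text instead. Everything in the sample below is an assumption for illustration: the HTML is a made-up, simplified stand-in for a CRAN package page, the package names `pkgA` to `pkgD` are placeholders, and `reverse_links()` is a hypothetical helper rather than part of rvest. The real page's labels and markup may differ.

```r
library(rvest)

# Made-up, simplified stand-in for a CRAN package page (not real page content).
sample_page <- read_html(paste0(
  "<html><body><table>",
  "<tr><td>Depends:</td><td><a href='#'>stats</a></td></tr>",
  "<tr><td>Reverse&nbsp;depends:</td>",
  "<td><a href='#'>pkgA</a>, <a href='#'>pkgB</a></td></tr>",
  "<tr><td>Reverse imports:</td><td><a href='#'>pkgC</a></td></tr>",
  "<tr><td>Reverse suggests:</td><td><a href='#'>pkgD</a></td></tr>",
  "</table></body></html>"
))

# Hypothetical helper: return the link texts from the row whose first cell
# starts with "Reverse" and contains the given keyword. Matching on a keyword
# rather than the full label tolerates both plain and non-breaking spaces.
reverse_links <- function(page, keyword) {
  xpath <- sprintf(
    "//tr[td[1][starts-with(normalize-space(.), 'Reverse') and contains(., '%s')]]/td[2]//a",
    keyword
  )
  page %>% html_nodes(xpath = xpath) %>% html_text()
}

depends <- reverse_links(sample_page, "depends")
imports <- reverse_links(sample_page, "imports")
targets <- c(depends, imports)
print(targets)
```

With a helper like this, the plain `Depends:` row and the `Reverse suggests:` row are skipped because they fail the prefix or keyword check, so only the reverse depends and reverse imports entries end up in `targets`.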

The results are here.

Enjoy!

Continued

Related
