
[Ruby] Super easy! Just rip out the h1, meta title, and meta description with Nokogiri!

Last updated at 2015-07-20, posted at 2015-07-18

I've been doing SEO work lately.
As a first step I want to improve our HTML, but collecting the current tags from each page by hand was a hassle.

Based on advice from a senior colleague and others, I wrote a simple Ruby script that rips them out with Nokogiri.

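#
# Script that extracts the h1, title, and meta description.
# Should work as long as nokogiri is installed.
# ex) ruby get_meta.rb http://google.com
#

require 'open-uri'
require 'nokogiri'

ARGV.each do |uri|
  # URI.open replaces the bare open(uri), which no longer fetches URLs on Ruby 3.0+
  html = URI.open(uri).read
  doc = Nokogiri::HTML.parse(html)

  # to_s guards against pages without a <title>, where doc.title is nil
  print doc.title.to_s + "\t"
  print doc.xpath('/html/head/meta[@name="description"]/@content').to_s + "\t"
  doc.css("h1").each do |h1|
    print h1.text
  end
  print "\n"
end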

Just keep adding URLs as arguments and you can rip out as many pages as you like (＾ω＾)

Feel free to tweak it however you like, for example by collecting the results in an array or reading the URLs from an external file☆
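As one possible take on the external-file idea, here is a minimal sketch that loads the URLs from a text file into an array. The helper name `read_urls`, the one-URL-per-line layout, and the convention of skipping blank lines and lines starting with `#` are all assumptions for illustration, not part of the original script.

```ruby
# Hypothetical helper (not in the original script): reads one URL per line
# from a text file and returns them as an array.
# Assumption: blank lines and lines starting with "#" should be skipped.
def read_urls(path)
  File.readlines(path, chomp: true)
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") }
end
```

With something like this in place, the `ARGV.each` loop could become `read_urls("urls.txt").each`, where `urls.txt` is a made-up file name.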

Reference

Introduction to scraping with nokogiri (nokogiriを使ったスクレイピング入門), @gaaamii's blog
