
Trying out Python's feedparser

Posted at 2013-01-05, last updated at 2013-01-05

This is a memo for my own reference.

(Never mind which feed I am fetching.)

import feedparser
skeameba = feedparser.parse('http://rssblog.ameba.jp/ske48official/rss.html')

As is, the Japanese text shows up escaped:

In [21]: skeameba['entries'][0]['title']
u'(\u5927\u77e2\u771f\u90a3)\u6709\u540d\u306a\u306e\u304b\u306a'

So, here is a helper for displaying the Japanese text:

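import pprint
import re

def pp(obj):
    # Pretty-print obj, then turn \uXXXX escapes back into readable characters.
    printer = pprint.PrettyPrinter(indent=4, width=160)
    text = printer.pformat(obj)
    return re.sub(r"\\u([0-9a-f]{4})", lambda m: unichr(int(m.group(1), 16)), text)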
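
The helper above targets Python 2 (`unichr`). As a rough illustration of the same substitution trick, here is a minimal, self-contained sketch that assumes Python 3. The name `unescape_unicode` is hypothetical and not part of feedparser, and `ascii()` is used only to reproduce the escaped form without fetching the feed. It handles four-digit `\uXXXX` escapes only.

```python
import re


def unescape_unicode(text):
    # Hypothetical helper: replace each literal backslash-u escape
    # (four hex digits) with the character it names.
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


title = "(大矢真那)有名なのかな"
escaped = ascii(title)                 # repr-style text with non-ASCII escaped
restored = unescape_unicode(escaped)   # escapes turned back into characters

print(escaped)
print(restored)
```

On Python 3, printing the title directly already shows readable text, so a helper like this should only matter when working with an escaped representation.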

With that, the output looks like this:

for entry in skeameba['entries']:
    title = entry['title']
    link = entry['link']
    print "title:", pp(title)
    print "link: ", link

title: u'(大矢真那)有名なのかな'
link:  http://ameblo.jp/ske48official/entry-11442256228.html
title: u'ちゅり＝(((O゜◇゜)o｛２０１３年もカメラ女子楽しんでこ～!!!'
link:  http://ameblo.jp/ske48official/entry-11442248464.html
title: u'初詣。（花・з・）～♪'
link:  http://ameblo.jp/ske48official/entry-11442205471.html
title: u'オレンジジュースとおにぎり(・∀・＞)'
link:  http://ameblo.jp/ske48official/entry-11442174092.html
title: u'古川(゜∀。*)ポチ'
link:  http://ameblo.jp/ske48official/entry-11442139096.html
title: u'おいひ～＊♪(゜ー゜夏)'
link:  http://ameblo.jp/ske48official/entry-11442056727.html
title: u'小木曽(・*・隠れ家。)3汐莉'
link:  http://ameblo.jp/ske48official/entry-11442023303.html
title: u'秦＊その瞬間'
link:  http://ameblo.jp/ske48official/entry-11441959773.html
title: u'☆あけおめです!!★KUMI'
link:  http://ameblo.jp/ske48official/entry-11441813459.html
title: u'れな(初詣・ω・さん)'
link:  http://ameblo.jp/ske48official/entry-11441713875.html
