Python: Searching Hatena Bookmark by Keyword

Posted at 2021-02-11

Hatena Bookmark entries are fetched through its RSS feeds.
Official documentation ... http://developer.hatena.ne.jp/ja/documents/bookmark/misc/feed

RSS

Parameters
q ... keyword
sort ... sort order (newest first: recent, most popular: popular)
threshold ... minimum number of bookmarks
date_begin ... start date {YYYY-MM-DD}
date_end ... end date {YYYY-MM-DD}
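
The parameters above can be combined into a query string. The following is a minimal sketch that uses only the standard library; the helper name `build_query` and the `BASE_URL` constant are hypothetical and not part of the original article. It assumes that parameters left unset can simply be omitted from the URL.

```python
from typing import Optional
from urllib.parse import urlencode

# Keyword-search endpoint taken from the example URLs in this article.
BASE_URL = "https://b.hatena.ne.jp/search/text"


def build_query(
    q: str,
    sort: Optional[str] = None,
    threshold: Optional[int] = None,
    date_begin: Optional[str] = None,
    date_end: Optional[str] = None,
) -> str:
    """Return a keyword-search RSS URL; unset parameters are left out."""
    if sort is not None and sort not in ("recent", "popular"):
        raise ValueError("sort must be 'recent' or 'popular'")
    params = {
        "q": q,
        "mode": "rss",
        "sort": sort,
        "threshold": threshold,
        "date_begin": date_begin,
        "date_end": date_end,
    }
    # urlencode percent-encodes the values, so non-ASCII keywords are safe.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}?{query}"


print(build_query("PHP", sort="popular", threshold=3,
                  date_begin="2021-02-01", date_end="2021-02-11"))
```

Building the query with `urlencode` rather than string concatenation keeps keywords containing spaces or Japanese characters valid in the URL.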

Keyword
https://b.hatena.ne.jp/search/text?q=PHP&mode=rss&date_begin=2021-02-01&date_end=2021-02-01

Tag
https://b.hatena.ne.jp/search/tag?q=PHP&mode=rss&date_begin=2021-02-01&date_end=2021-02-01

Title
https://b.hatena.ne.jp/search/title?q=PHP&mode=rss&date_begin=2021-02-01&date_end=2021-02-01
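
Note that keyword search uses the path segment `text`, while tag and title search use `tag` and `title`. One way to keep the three apart in code is a small enum. This is only a sketch: `SearchType` and `search_endpoint` are hypothetical names introduced for illustration.

```python
from enum import Enum


class SearchType(Enum):
    """Path segment used by each kind of Hatena Bookmark search."""

    KEYWORD = "text"
    TAG = "tag"
    TITLE = "title"


def search_endpoint(search_type: SearchType) -> str:
    """Return the search URL (without query string) for the given type."""
    return f"https://b.hatena.ne.jp/search/{search_type.value}"


for search_type in SearchType:
    print(search_type.name, search_endpoint(search_type))
```

Passing an unknown value such as `SearchType("body")` raises `ValueError`, so a typo in the search type fails before any request is made.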

Python side

The RSS is fetched with feedparser.
feedparser ... https://github.com/kurtmckee/feedparser

Installation

$ pip install feedparser pandas

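The function below searches by type, keyword, and start and end date.
It fetches the RSS with feedparser and stores the entries in a DataFrame.

Usage (the call implies that the function is saved in a module named hatena):

import hatena

search = hatena.get_search("tag", "PHP", "2021-02-01", "2021-02-11")

Module: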
from urllib.parse import urlencode

import feedparser
import pandas as pd

# Fields copied from each RSS entry into the DataFrame.
COLUMNS = [
    "id",
    "title",
    "link",
    "summary",
    "updated",
    "hatena_bookmarkcount",
    "hatena_bookmarkcommentlistpageurl",
    "hatena_imageurl",
]


def get_search(search_type: str, q: str, start_date: str, end_date: str) -> pd.DataFrame:
    # search_type is "text" (keyword), "tag", or "title"; dates are YYYY-MM-DD.
    # urlencode percent-encodes the keyword so non-ASCII queries work.
    params = urlencode(
        {"q": q, "mode": "rss", "date_begin": start_date, "date_end": end_date}
    )
    rss = f"https://b.hatena.ne.jp/search/{search_type}?{params}"
    d = feedparser.parse(rss)

    # DataFrame.append was removed in pandas 2.0, so collect the rows first
    # and build the DataFrame once. Fields missing from an entry become None.
    rows = [{col: entry.get(col) for col in COLUMNS} for entry in d.entries]
    return pd.DataFrame(rows, columns=COLUMNS)

Result

                                                  id  ...                                    hatena_imageurl
0  https://www.1st-net.jp/blog/2021/02/04/php_mai...  ...  https://www.1st-net.jp/blog/wp-content/uploads...
1  https://www.datadoghq.com/blog/engineering/php...  ...  https://imgix.datadoghq.com/img/blog/engineeri...
2  https://qiita.com/tajima_taso/items/18a2c593a3...  ...  https://qiita-user-contents.imgix.net/https%3A...

The Hatena Bookmark entries were retrieved successfully.
