(Personal memo) Getting the percentage of each rating rank for an iOS app by scraping

Posted at 2019-10-11

This is a memo for my own use.

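After looking into it, the iTunes Search API can return the total number of ratings and the average rating (apparently rounded in steps of 0.5, so it is a coarse figure), but it does not seem to be possible to get the number of ratings per rank.
So my idea is that a rough per-rank count can be obtained as "total number of ratings × that rank's share of all ratings".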
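
As a rough illustration of that calculation, here is a minimal sketch. The function name `estimate_rank_counts` and the numbers are made up for this example; the total would come from the iTunes Search API and the percentages from scraping.

```python
def estimate_rank_counts(total_ratings, percentages):
    """Estimate the number of ratings per star rank.

    total_ratings: total number of ratings for the app.
    percentages:   mapping of star rank -> share of all ratings, in percent.
    """
    return {
        rank: round(total_ratings * percent / 100)
        for rank, percent in percentages.items()
    }


# Made-up numbers, for illustration only.
example = estimate_rank_counts(2000, {5: 80, 4: 10, 3: 5, 2: 3, 1: 2})
print(example)
```

Because the page only shows whole-number percentages, the estimated counts are approximate and will not necessarily add up to the total.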

Scraping each app's page on the App Store turned out to work.

Sample code below.

import re
import urllib.request as url_req
from html.parser import HTMLParser


class AppStoreScraping(HTMLParser):
    """Collect the average rating and per-rank percentages from an App Store page."""

    # The bar width is written inline, e.g. style="width: 84%;"
    WIDTH_RE = re.compile(r'^width: ([0-9]+)%;$')

    def __init__(self):
        super().__init__()
        # The bars appear in the page from 5 stars down to 1 star.
        self.cnt = 5
        self.__rating = {}
        self.average_flg = False

    @property
    def rating(self):
        assert isinstance(self.__rating, dict), 'Sorry, could not get the rating.'
        return self.__rating

    def handle_starttag(self, name, attrs):
        attrs = dict(attrs)
        if name == "div":
            if attrs.get('class') == 'we-star-bar-graph__bar__foreground-bar':
                # A missing or unexpected style attribute is recorded as '0%'.
                h_r = self.WIDTH_RE.match(attrs.get('style') or '')
                h = h_r.group(1) + '%' if h_r else '0%'
                self.__rating['Rating{cnt}Percentage'.format(cnt=self.cnt)] = h
                self.cnt -= 1

        elif name == "span":
            if attrs.get('class') == 'we-customer-ratings__averages__display':
                # The next text node holds the average rating.
                self.average_flg = True

    def handle_data(self, data):
        if self.average_flg:
            self.__rating['AverageRating'] = data
            self.average_flg = False

    @classmethod
    def req_exec(cls, appstoreID):
        request = url_req.Request(
            url='https://apps.apple.com/jp/app/id{}'.format(appstoreID)
        )
        parser = cls()
        # Close the response once the body has been read.
        with url_req.urlopen(request, timeout=15) as tmp:
            # Fall back to UTF-8 when the response does not declare a charset.
            charset = tmp.info().get_content_charset() or 'utf-8'
            parser.feed(tmp.read().decode(charset))
        return parser.rating


# Replace the placeholder with the App Store ID of the target app.
res = AppStoreScraping.req_exec(appstoreID='<App Store ID>')
print(res)

Below is the result of running it for フォトンドリヴン世界救済RPG (a "photon-driven world-saving RPG").

>>{'AverageRating': '4.8', 'Rating5Percentage': '84%', 'Rating4Percentage': '12%', 'Rating3Percentage': '3%', 'Rating2Percentage': '1%', 'Rating1Percentage': '1%'}
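
The values in this result are strings, so they need converting before any arithmetic. The following is a minimal sketch of that conversion; `parse_rating_result` is a name made up for this example, and it assumes the key names shown in the output above.

```python
import re


def parse_rating_result(result):
    """Split the scraped result dict into (average, {rank: percent})."""
    average = float(result['AverageRating']) if 'AverageRating' in result else None
    percentages = {}
    for key, value in result.items():
        m = re.fullmatch(r'Rating([1-5])Percentage', key)
        if m:
            # '84%' -> 84
            percentages[int(m.group(1))] = int(str(value).rstrip('%'))
    return average, percentages


sample = {'AverageRating': '4.8', 'Rating5Percentage': '84%',
          'Rating4Percentage': '12%', 'Rating3Percentage': '3%',
          'Rating2Percentage': '1%', 'Rating1Percentage': '1%'}
average, percentages = parse_rating_result(sample)
print(average, percentages, sum(percentages.values()))
```

Note that the percentages in this sample add up to 101, presumably because each bar width is rounded to a whole number, so anything computed from them is only approximate.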

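In my environment, requests quite often came back with a 500 status code, so I think fairly thorough exception handling is needed. Pages for popular apps seemed to be the ones that tended to misbehave.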
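
One possible way to cope with the intermittent 500 responses is to retry a few times with a short wait in between. This is only a sketch and not something I tested against the App Store; `fetch_with_retry` and its parameters (`retries`, `wait`, `opener`, `sleep`) are names made up for this example.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, retries=3, wait=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body as bytes, retrying on HTTP 5xx responses.

    opener and sleep are injectable so the function can be tried offline.
    """
    if retries < 1:
        raise ValueError('retries must be at least 1')
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=15) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # 4xx errors will not fix themselves, so re-raise them at once.
            # Also give up once the last attempt has failed.
            if err.code < 500 or attempt == retries:
                raise
            # Wait a little longer after each failed attempt.
            sleep(wait * attempt)
```

The returned bytes could then be decoded and passed to the parser's `feed()` method in place of the direct `urlopen` call. Timeouts and connection errors (`urllib.error.URLError`) are not retried here and would need their own handling.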

That's all.
