Monitoring website changes with Python

Posted at 2020-01-08

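I wanted to watch a certain site for changes, so I wrote this script. It currently monitors my own profile text, but it works on other sites too if you change url and class_name (a CSS selector for the element to watch).
Every 20 seconds it fetches the page, extracts the selected element, and reports whether the result differs from what was saved on the previous fetch.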

import requests
import time

from bs4 import BeautifulSoup


url = "https://qiita.com/sssssssiiiiinnn"
class_name = 'div.newUserPageProfile_info_body.newUserPageProfile_description'
file = "elems_text.txt"


def is_not_changed(old_elem, new_elem):
    return old_elem == new_elem


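def set_old_elems():
    # Load the snapshot saved on a previous run; use an empty string if none exists yet.
    try:
        with open(file) as f:
            old_elems = f.read()
        print(f'{"old_elem":10} : {old_elems}')
    except OSError:
        old_elems = ''
    return old_elems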


def set_new_elems():
    response = requests.get(url)
    response.encoding = response.apparent_encoding
    bs = BeautifulSoup(response.text, 'html.parser')
    new_elems = str(bs.select(class_name))
    print(f'{"new_elem":10} : {new_elems}')
    return new_elems


def display_result(old_elem, new_elem):
    # Overwrite the saved snapshot only when the fetched element differs from it.
    if not is_not_changed(old_elem, new_elem):
        with open(file, 'w') as f:
            f.write(new_elem)
        print("Change is detected!!")
    else:
        print("not changed...")


if __name__ == '__main__':
    try:
        while True:
            print("="*100)
            new_elems = set_new_elems()
            old_elems = set_old_elems()
            display_result(old_elems, new_elems)
            time.sleep(20)
    except KeyboardInterrupt:
        print("Interrupted by Ctrl + C")
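
The script above saves the full text of the selected element and compares it with the next fetch. As a variation, the snapshot file could hold only a hash of that text, which keeps the file small when the watched element is large. The following is a minimal sketch of that idea, not part of the original script: the names digest, has_changed, and elems_digest.txt are hypothetical, and it assumes the extracted element is already available as a string (for example, the return value of set_new_elems).

```python
import hashlib
from pathlib import Path


def digest(text: str) -> str:
    """Return the SHA-256 hex digest of the given text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(new_text: str, digest_path: Path) -> bool:
    """Compare new_text against the stored digest and update the file on change.

    Returns True when no digest has been stored yet or when the digest differs.
    """
    new_digest = digest(new_text)
    try:
        old_digest = digest_path.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        # First run: nothing has been saved yet.
        old_digest = None
    if old_digest == new_digest:
        return False
    digest_path.write_text(new_digest, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Hypothetical file name; any writable path would do.
    path = Path("elems_digest.txt")
    print(has_changed("<div>profile text</div>", path))
```

As in the original script, the first run reports a change because nothing has been saved yet. The trade-off is that the old content can no longer be printed, only the fact that it changed.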