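TL;DR
- Exactly what the title says
- I assumed this kind of setup was common, but I found very little information about it online, so I am writing it down here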
Source code
.github/workflows/update_atcoder.yml
name: Upload AtCoder Submissions
on:
  schedule:
    # Every Monday at 00:00 JST (cron is evaluated in UTC, so this is Sunday 15:00 UTC)
    - cron: "0 15 * * SUN"
  workflow_dispatch:
permissions:
  contents: write
jobs:
  upload:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: |
          pip install get-chrome-driver --upgrade
          pip install -r scripts/requirements.txt
      - name: Fetch AtCoder submissions
        run: python scripts/fetch_atcoder_submissions.py
      - name: Commit and push changes
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git config --local user.email "github-actions[bot]@users.noreply.github.com"
          git config --local user.name "github-actions[bot]"
          git add submissions/
          # Skip the commit when nothing changed; `git commit` would exit non-zero and fail the job
          if git diff --cached --quiet; then
            echo "No new submissions to commit"
          else
            git commit -m "Update AtCoder submissions"
            git push
          fi
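One detail that is easy to get wrong in the schedule: GitHub Actions evaluates `schedule` cron expressions in UTC, so a time given in JST has to be shifted back by nine hours, and that shift can also move the day of the week. Below is a minimal Python sketch of the conversion. `jst_to_utc_cron` is a hypothetical helper written only for illustration; it is not part of the workflow or the script.

```python
DAYS = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]
JST_OFFSET_MINUTES = 9 * 60  # JST is UTC+9 and has no daylight saving time


def jst_to_utc_cron(weekday: str, hour: int, minute: int = 0) -> str:
    """Return a weekly cron expression in UTC for a time given in JST.

    Hypothetical helper for illustration only.
    """
    total = hour * 60 + minute - JST_OFFSET_MINUTES
    day_index = DAYS.index(weekday.upper())
    if total < 0:
        # Crossing midnight backwards lands on the previous day in UTC
        total += 24 * 60
        day_index = (day_index - 1) % 7
    utc_hour, utc_minute = divmod(total, 60)
    return f"{utc_minute} {utc_hour} * * {DAYS[day_index]}"


if __name__ == "__main__":
    # Monday 00:00 JST is Sunday 15:00 UTC
    print(jst_to_utc_cron("MON", 0))  # 0 15 * * SUN
```

Any JST time earlier than 09:00 falls on the previous day in UTC, which is why the weekday field has to change together with the hour.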
scripts/fetch_atcoder_submissions.py
import json
import os
import time

import requests
from get_chrome_driver import GetChromeDriver
from selenium import webdriver
from tqdm import tqdm


def main():
    # Load settings from settings.json
    with open("settings.json") as f:
        settings = json.load(f)
    user_id = settings["user_id"]
    # LastUpdate stores the epoch second from which to fetch submissions
    with open("submissions/LastUpdate") as f:
        last_update = int(f.read())
    api_url = f"https://kenkoooo.com/atcoder/atcoder-api/v3/user/submissions/?user={user_id}&from_second={last_update}"

    # Fetch submissions made since the last run
    res = requests.get(api_url)
    res.raise_for_status()
    submissions = res.json()
    if not submissions:
        return

    # Sort oldest first and record where the next run should start
    submissions.sort(key=lambda x: x["id"])
    with open("submissions/LastUpdate", "w") as f:
        f.write(str(submissions[-1]["epoch_second"] + 1))

    # Group AC submissions by contest
    submissions_by_contest = dict()
    for submission in submissions:
        if submission["result"] != "AC":
            continue
        contest_id = submission["contest_id"]
        if contest_id not in submissions_by_contest:
            submissions_by_contest[contest_id] = []
        submissions_by_contest[contest_id].append(submission)

    # Download and save the source code of each submission
    get_driver = GetChromeDriver()
    get_driver.install()
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    driver.implicitly_wait(2)
    root_dir = "submissions"
    try:
        with tqdm(submissions_by_contest.items()) as pbar1:
            for contest_id, contest_submissions in pbar1:
                pbar1.set_description(contest_id)
                contest_dir = f"{root_dir}/{contest_id}"
                os.makedirs(contest_dir, exist_ok=True)
                with tqdm(contest_submissions, leave=False) as pbar2:
                    for submission in pbar2:
                        problem_id = submission["problem_id"]
                        pbar2.set_description(problem_id)
                        language = submission["language"]
                        if "C++" in language:
                            ext = "cpp"
                        elif "Python" in language or "PyPy" in language:
                            # Older PyPy entries are labelled like "PyPy3 (7.3.0)"
                            ext = "py"
                        else:
                            # No mapping for other languages; fall back to .txt
                            ext = "txt"
                        path = f"{contest_dir}/{problem_id}.{ext}"
                        submission_url = f"https://atcoder.jp/contests/{contest_id}/submissions/{submission['id']}"
                        driver.get(submission_url)
                        # Read the code from the Ace editor on the submission page
                        script = "return ace.edit('submission-code').getValue();"
                        source_code = driver.execute_script(script)
                        # A later AC for the same problem overwrites the earlier file
                        with open(path, "w") as f:
                            f.write(source_code)
                        # Wait between page loads to avoid hammering AtCoder
                        time.sleep(3)
    finally:
        # Close the browser even if scraping fails midway
        driver.quit()


if __name__ == "__main__":
    main()
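The script makes a single request to the AtCoder Problems API. As far as I know, that endpoint caps the number of submissions per response (500, if I am reading the API documentation correctly), so a long gap between runs could leave some submissions unfetched. Below is a minimal sketch of paging with `from_second`. `fetch_page` and `fetch_all_submissions` are hypothetical helpers, and the sketch uses `urllib` from the standard library instead of `requests` only to stay self-contained.

```python
import json
import time
import urllib.parse
import urllib.request

# Same endpoint as in the script above
API_URL = "https://kenkoooo.com/atcoder/atcoder-api/v3/user/submissions/"


def fetch_page(user_id, from_second):
    """Fetch one page of submissions (hypothetical helper)."""
    query = urllib.parse.urlencode({"user": user_id, "from_second": from_second})
    with urllib.request.urlopen(f"{API_URL}?{query}") as res:
        return json.load(res)


def fetch_all_submissions(user_id, from_second, fetch_page=fetch_page, pause=1.0):
    """Keep requesting pages until the API returns an empty list.

    Assumes each page contains submissions with epoch_second >= from_second.
    """
    seen = {}
    cursor = from_second
    while True:
        page = fetch_page(user_id, cursor)
        if not page:
            break
        for sub in page:
            seen[sub["id"]] = sub  # de-duplicate overlapping pages by id
        last = max(sub["epoch_second"] for sub in page)
        # Restart from the newest timestamp seen; always move forward by at
        # least one second so the loop is guaranteed to terminate.
        cursor = max(last, cursor + 1)
        time.sleep(pause)  # be polite to the API between requests
    return sorted(seen.values(), key=lambda sub: sub["id"])
```

Restarting from the newest timestamp rather than one second after it re-fetches a few submissions, but it avoids skipping any that share a timestamp with the last entry of a page. The duplicates are dropped by keying on `id`.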
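Basic approach
- Fetch the list of submissions with the AtCoder Problems API
- Keep only the latest AC submission for each problem
- Scrape the submitted source code with Selenium
- Save the code in a separate directory per contest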
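The filtering and grouping steps can be tried out without touching the network. Below is a minimal sketch, assuming each submission is a dict with the `id`, `result`, `contest_id`, and `problem_id` fields that the script reads. `latest_ac_by_contest` is a hypothetical name, and the sample data is made up. Unlike the script, which lets a later file overwrite an earlier one, this version drops older ACs before anything is downloaded.

```python
from collections import defaultdict


def latest_ac_by_contest(submissions):
    """Map contest_id -> {problem_id: latest AC submission}."""
    grouped = defaultdict(dict)
    # Assumption: submission ids increase over time, so iterating in id
    # order lets a newer AC replace an older one for the same problem.
    for sub in sorted(submissions, key=lambda s: s["id"]):
        if sub["result"] != "AC":
            continue
        grouped[sub["contest_id"]][sub["problem_id"]] = sub
    return dict(grouped)


if __name__ == "__main__":
    # Made-up sample data for illustration
    sample = [
        {"id": 3, "result": "AC", "contest_id": "abc100", "problem_id": "abc100_a"},
        {"id": 1, "result": "AC", "contest_id": "abc100", "problem_id": "abc100_a"},
        {"id": 2, "result": "WA", "contest_id": "abc100", "problem_id": "abc100_b"},
        {"id": 4, "result": "AC", "contest_id": "abc101", "problem_id": "abc101_a"},
    ]
    for contest_id, problems in latest_ac_by_contest(sample).items():
        for problem_id, sub in problems.items():
            print(f"submissions/{contest_id}/{problem_id} <- submission {sub['id']}")
```

Filtering first means Selenium only has to open one page per problem, which matters because every page load is followed by a three-second wait.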
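Articles I referred to
- A post by someone who did this before me
- The AtCoder Problems API
- How to write GitHub Actions workflows
- How to run Selenium on GitHub Actions
- Migrating Selenium 3 code to Selenium 4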
Closing
If there is demand, I would like to write this up as a step-by-step walkthrough, so feel free to leave a comment!