Using Python to automatically extract CTR improvement candidates and near-page-one articles from Search Console and output a Markdown report

Posted at 2026-04-20 (last updated at 2026-04-20)

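Conclusion

This script fetches performance data from the Search Console API, extracts pages that are CTR improvement candidates and pages ranking at positions 11 to 20, and writes them to a Markdown report.

A single command produces the list of improvement candidates on your machine.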

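Problems this solves

You registered your site with Search Console but cannot tell which articles to improve
Manually hunting for articles with many impressions but few clicks is tedious
A weekly routine check is hard to keep up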

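Steps

STEP 1: Enable the API in Google Cloud Console

Create a project in Google Cloud Console
Under "APIs & Services → Enable APIs and services", enable the "Google Search Console API"
Under "APIs & Services → Credentials", create an OAuth client ID
- User Type on the consent screen: "External"
- Application type: "Desktop app"
Place the downloaded JSON file in scripts/
Add your own Google account under "Test users" on the consent screen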
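Before moving on, it can help to confirm that the downloaded file is in place and belongs to a desktop-app client. The sketch below is not part of the original script: `find_client_secrets` is a hypothetical helper, and it assumes the file keeps Google's default `client_secret_*.json` name and that a desktop-app client stores its settings under a top-level `installed` key.

```python
import glob
import json
import os


def find_client_secrets(directory="scripts"):
    """Return paths of desktop-app OAuth client files found in `directory`."""
    matches = []
    pattern = os.path.join(directory, "client_secret_*.json")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or not valid JSON: skip this file
        # Assumption: desktop-app clients use the "installed" top-level key.
        if isinstance(data, dict) and "installed" in data:
            matches.append(path)
    return matches
```

An empty list suggests the file is missing, was renamed, or was created with a different application type.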

STEP 2: Install the libraries

pip3 install google-auth google-auth-oauthlib google-api-python-client
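To verify the installation, the following sketch reports which of the three packages cannot be imported. `missing_packages` and `REQUIRED_MODULES` are hypothetical names introduced here; the module names are the ones the script itself imports.

```python
import importlib.util

# pip package name -> module name that the script imports
REQUIRED_MODULES = {
    "google-auth": "google.auth",
    "google-auth-oauthlib": "google_auth_oauthlib",
    "google-api-python-client": "googleapiclient",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose module cannot be found."""
    missing = []
    for package, module in required.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(package)
    return missing
```

Calling `missing_packages()` with no arguments returns an empty list when everything from the `pip3 install` line is available.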

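STEP 3: Add the script

Save it as scripts/search_console_report.py.

Change SITE_URL and CREDENTIALS_FILE to match your environment. SITE_URL must match the property as it is registered in Search Console (a Domain property is written as sc-domain:example.com rather than as a URL).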

#!/usr/bin/env python3
import os
import re
import time
import urllib.request
from datetime import datetime, timedelta, timezone
from html.parser import HTMLParser

from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.discovery import build

BASE_DIR = os.path.dirname(__file__)
CREDENTIALS_FILE = os.path.join(BASE_DIR, "client_secret_***.apps.googleusercontent.com.json")
TOKEN_FILE = os.path.join(BASE_DIR, "token.json")
SITE_URL = "https://yoursite.com/"
SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
REPORT_DIR = os.path.join(BASE_DIR, "..", "posts", "report")


class TitleParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch_title(url):
    try:
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(req, timeout=5) as res:
            html = res.read().decode("utf-8", errors="ignore")
        parser = TitleParser()
        parser.feed(html)
        title = parser.title.strip()
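        # Drop a trailing " | Site name" or " - Site name" suffix. Requiring
        # whitespace around the separator keeps hyphenated words intact.
        title = re.sub(r"\s+[|\-–]\s+.*$", "", title).strip()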
        return title or url
    except Exception:
        return url


def fetch_titles(urls):
    titles = {}
    for i, url in enumerate(urls):
        titles[url] = fetch_title(url)
        if i < len(urls) - 1:
            time.sleep(0.5)
    return titles


def get_credentials():
    creds = None
    if os.path.exists(TOKEN_FILE):
        creds = Credentials.from_authorized_user_file(TOKEN_FILE, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(CREDENTIALS_FILE, SCOPES)
            creds = flow.run_local_server(port=0)
        with open(TOKEN_FILE, "w") as f:
            f.write(creds.to_json())
    return creds


def fetch_page_data(service, days=28):
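    # startDate and endDate are both inclusive, so subtract days - 1 to cover
    # exactly `days` days. The most recent few days are usually incomplete.
    end_date = datetime.now(timezone.utc).date()
    start_date = end_date - timedelta(days=days - 1)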
    response = service.searchanalytics().query(
        siteUrl=SITE_URL,
        body={
            "startDate": str(start_date),
            "endDate": str(end_date),
            "dimensions": ["page"],
            "rowLimit": 500,
        }
    ).execute()
    return response.get("rows", [])


def analyze(rows):
    low_ctr, borderline = [], []
    for row in rows:
        url = row["keys"][0]
        clicks = row.get("clicks", 0)
        impressions = row.get("impressions", 0)
        ctr = row.get("ctr", 0) * 100
        position = row.get("position", 0)

        if impressions >= 50 and ctr < 3.0:
            low_ctr.append({"url": url, "impressions": int(impressions), "clicks": int(clicks), "ctr": round(ctr, 1), "position": round(position, 1)})
        if 11 <= position <= 20 and impressions >= 20:
            borderline.append({"url": url, "impressions": int(impressions), "clicks": int(clicks), "ctr": round(ctr, 1), "position": round(position, 1)})

    low_ctr.sort(key=lambda x: x["impressions"], reverse=True)
    borderline.sort(key=lambda x: x["position"])
    return low_ctr, borderline


def generate_report(low_ctr, borderline, titles):
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    lines = [
        f"# Search Console routine check ({today})", "",
        "## CTR improvement candidates (many impressions, low click-through rate)",
        "Impressions >= 50 and CTR < 3%", "",
        "| Article title | Impressions | Clicks | CTR | Position |",
        "|---------------|-------------|--------|-----|----------|",
    ]
    for r in low_ctr[:20]:
        title = titles.get(r["url"], r["url"])
        lines.append(f"| [{title}]({r['url']}) | {r['impressions']} | {r['clicks']} | {r['ctr']}% | {r['position']} |")

    lines += [
        "", "## Near-page-one candidates (positions 11-20, one push away from page 1)",
        "Impressions >= 20 and average position 11-20", "",
        "| Article title | Impressions | Clicks | CTR | Position |",
        "|---------------|-------------|--------|-----|----------|",
    ]
    for r in borderline[:20]:
        title = titles.get(r["url"], r["url"])
        lines.append(f"| [{title}]({r['url']}) | {r['impressions']} | {r['clicks']} | {r['ctr']}% | {r['position']} |")

    return "\n".join(lines)


def main():
    creds = get_credentials()
    service = build("searchconsole", "v1", credentials=creds)

    print("Fetching data from Search Console...")
    rows = fetch_page_data(service)
    print(f"Fetched {len(rows)} rows")

    low_ctr, borderline = analyze(rows)

    all_urls = list({r["url"] for r in low_ctr + borderline})
    print(f"Fetching article titles ({len(all_urls)} URLs)...")
    titles = fetch_titles(all_urls)

    report = generate_report(low_ctr, borderline, titles)

    os.makedirs(REPORT_DIR, exist_ok=True)
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    report_path = os.path.join(REPORT_DIR, f"{today}_search_console_report.md")
    with open(report_path, "w", encoding="utf-8") as f:
        f.write(report)

    print(f"Report written: {report_path}")
    print(f"CTR improvement candidates: {len(low_ctr)} / near-page-one candidates: {len(borderline)}")


if __name__ == "__main__":
    main()
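The script requests at most 500 rows in a single call. For a site with more pages than that, the Search Analytics `rowLimit` and `startRow` request fields can be used to page through the results. This is a minimal sketch, assuming `service` is the object returned by `build(...)` in the script; `fetch_all_rows` and `page_size` are hypothetical names that do not appear in the original.

```python
def fetch_all_rows(service, site_url, start_date, end_date, page_size=500):
    """Collect page rows by advancing startRow until a short page comes back."""
    rows = []
    start_row = 0
    while True:
        response = service.searchanalytics().query(
            siteUrl=site_url,
            body={
                "startDate": str(start_date),
                "endDate": str(end_date),
                "dimensions": ["page"],
                "rowLimit": page_size,
                "startRow": start_row,
            },
        ).execute()
        batch = response.get("rows", [])
        rows.extend(batch)
        if len(batch) < page_size:
            break  # fewer rows than requested: this was the last page
        start_row += page_size
    return rows
```

The returned list has the same shape as the one from `fetch_page_data`, so it could be passed to `analyze` unchanged.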

STEP 4: Run it

python3 scripts/search_console_report.py

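On the first run, a browser opens and asks you to authorize with your Google account. Once you approve, token.json is generated and later runs reuse it without asking again.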

STEP 5: Check the output

A report such as posts/report/2026-04-20_search_console_report.md is written.

# Search Console routine check (2026-04-20)

## CTR improvement candidates (many impressions, low click-through rate)
Impressions >= 50 and CTR < 3%

| Article title | Impressions | Clicks | CTR | Position |
|---------------|-------------|--------|-----|----------|
| [Article title A](https://...) | 187 | 1 | 0.5% | 7.7 |
| [Article title B](https://...) | 82 | 1 | 1.2% | 18.0 |
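A page title that contains `|`, `[`, `]`, or a line break would break a table row like the ones above. The helper below is a small sketch of one way to guard against that; `md_cell` is a hypothetical name and is not part of the original script.

```python
def md_cell(text):
    """Make text safe to use as link text inside one Markdown table cell."""
    # Collapse newlines and repeated whitespace so the row stays on one line.
    text = " ".join(str(text).split())
    # Escape characters that would end the cell or the link text early.
    for ch in ("|", "[", "]"):
        text = text.replace(ch, "\\" + ch)
    return text
```

In `generate_report`, it could wrap the title lookup, for example `title = md_cell(titles.get(r["url"], r["url"]))`.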

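Pitfalls

"An authentication error stops the script"

An expired or revoked token.json can cause an authentication error. Delete token.json and run the script again to redo the browser authorization. While the OAuth consent screen is in the "Testing" publishing status, Google expires refresh tokens after 7 days, so this can recur about once a week.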
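The delete-and-retry step can also be automated. The wrapper below is a sketch under assumptions: `credentials_with_reset` is a hypothetical helper, `load_credentials` stands for the script's `get_credentials`, and the exception to catch is assumed to be `google.auth.exceptions.RefreshError`, which the refresh call raises when the stored token is rejected.

```python
import os


def credentials_with_reset(load_credentials, token_file, auth_errors=(Exception,)):
    """Call load_credentials(); on an auth error, delete token_file and retry once."""
    try:
        return load_credentials()
    except auth_errors:
        # The stored token is unusable: remove it so the next call
        # falls through to the browser authorization flow.
        if os.path.exists(token_file):
            os.remove(token_file)
        return load_credentials()


# Possible use inside main() of the script (assumption, not in the original):
#   from google.auth.exceptions import RefreshError
#   creds = credentials_with_reset(get_credentials, TOKEN_FILE, (RefreshError,))
```

Passing a narrow exception tuple keeps unrelated failures, such as a missing client secrets file, from silently deleting the token.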

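"Impression data is sparse when no sitemap is registered"

If you see little data, register a sitemap in Search Console first. With Rank Math SEO installed on WordPress, /sitemap_index.xml is generated. If the permalink setting is still "Plain", the sitemap URL takes a query-parameter form, but it can be registered and works without problems.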
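Whether a sitemap is already registered can be checked from the same API client. This is a sketch that assumes `service` is the object built in the script and that the sitemaps list response carries its entries under a `sitemap` key with a `path` field, as I understand the API; `list_sitemaps` is a hypothetical name.

```python
def list_sitemaps(service, site_url):
    """Return the paths of the sitemaps submitted for the given property."""
    response = service.sitemaps().list(siteUrl=site_url).execute()
    # An empty response means no sitemap has been submitted yet.
    return [entry["path"] for entry in response.get("sitemap", [])]
```

An empty result for `list_sitemaps(service, SITE_URL)` would suggest registering the sitemap before relying on the report.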
