Challenges from ChatGPT (Python) Advent Calendar 2025

Day 16. A Super-Simple Markdown → HTML Converter - My Unofficial ChatGPT Challenge (Python)

Last updated at 2025-12-16. Posted at 2025-12-16.

Premise

Today's challenge

17. A super-simple Markdown → HTML converter

What to build?
A script that converts Markdown using only a minimal set of rules, such as "# heading" to <h1> and "* bullet" to <li>.

What you can learn

Line-by-line text processing
Simple parser design
String formatting (f-strings) and join
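
As a rough illustration of those three ideas together, here is a minimal sketch (not the script itself; the input string and variable names are made up for this example) that walks a text line by line, builds tags with f-strings, and glues the pieces back together with join:

```python
# Hypothetical three-line input, used only for illustration.
md_text = "# Title\nfirst line\nsecond line"

html_lines = []
for line in md_text.splitlines():  # process the text one line at a time
    if line.startswith("# "):
        # An f-string builds the tag around the text after "# ".
        html_lines.append(f"<h1>{line[2:]}</h1>")
    else:
        html_lines.append(f"<p>{line}</p>")

# join puts the converted lines back into a single string.
html_text = "\n".join(html_lines)
print(html_text)
```

Collecting lines in a list and joining once at the end keeps the loop body simple and avoids repeated string concatenation.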

What makes it fun

You get a simple taste of how enjoyable it is to write a parser
You can keep adding rules and grow it into a "sort-of Markdown engine"
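
One way such an extension might look, purely as a sketch: an inline pass for **bold** and `code`. The helper name apply_inline is hypothetical and not part of the script. It escapes the text first and then applies two regular expressions, and it makes no attempt to handle nesting or backslash-escaped markers.

```python
import html
import re

# Assumed inline rules for this sketch: **text** and `text`.
BOLD_RE = re.compile(r"\*\*(.+?)\*\*")
CODE_RE = re.compile(r"`(.+?)`")


def apply_inline(text: str) -> str:
    """Escape HTML special characters, then convert bold and inline code."""
    escaped = html.escape(text)  # "*" and "`" are not touched by html.escape
    escaped = BOLD_RE.sub(r"<strong>\1</strong>", escaped)
    escaped = CODE_RE.sub(r"<code>\1</code>", escaped)
    return escaped


print(apply_inline("**bold** and `x < y`"))
```

In a converter like this one, such a helper could be called wherever a line's text is escaped before being wrapped in <li> or <p>. Escaping before substitution matters: doing it afterwards would escape the freshly generated tags as well.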

Answer

Code

17_md2html.py

"""
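Super-simple Markdown → HTML converter

Supported rules (bare minimum):
- "# Heading"   → <h1>Heading</h1>
- "## Heading"  → <h2>Heading</h2>
- "### Heading" → <h3>Heading</h3> (the number of # selects h1 to h6)
- "* item" / "- item" → <ul><li>item</li>...</ul>
- Any other non-blank line → <p>paragraph</p>
- Blank line → ends the current list (each non-blank text line is already its own <p>)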

Usage:
  python 17_md2html.py input.md
  python 17_md2html.py input.md -o output.html
  cat input.md | python 17_md2html.py
"""

import argparse
import sys
import html
from pathlib import Path


def convert_markdown(md_text: str) -> str:
    """Super-simple Markdown → HTML conversion."""
    lines = md_text.splitlines()
    result: list[str] = []
    in_list = False  # whether we are currently inside a <ul>

    for raw_line in lines:
        line = raw_line.rstrip()

        # Blank line: close any open list, then skip
        if not line:
            if in_list:
                result.append("</ul>")
                in_list = False
            continue

        stripped = line.lstrip()

        # Heading (#, ##, ### ...)
        if stripped.startswith("#"):
            # Close the list if we are in the middle of one
            if in_list:
                result.append("</ul>")
                in_list = False

            # Count the leading # characters
            level = 0
            for ch in stripped:
                if ch == "#":
                    level += 1
                else:
                    break
            level = max(1, min(level, 6))  # clamp to h1 through h6

            # Take the text that follows the leading # characters
            content = stripped[level:].lstrip()
            content = html.escape(content)
            result.append(f"<h{level}>{content}</h{level}>")
            continue

        # Bullet item (a line starting with "* " or "- ")
        if stripped.startswith("* ") or stripped.startswith("- "):
            if not in_list:
                result.append("<ul>")
                in_list = True
            # The part after the "* " or "- " marker
            item_text = stripped[2:].strip()
            item_text = html.escape(item_text)
            result.append(f"  <li>{item_text}</li>")
            continue

        # Ordinary text line → <p>...</p>
        if in_list:
            result.append("</ul>")
            in_list = False

        paragraph = html.escape(line.strip())
        result.append(f"<p>{paragraph}</p>")

    # Close the list if it is still open at the end of the input
    if in_list:
        result.append("</ul>")

    return "\n".join(result)


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="A script that converts Markdown to HTML in a very simplified way"
    )
    parser.add_argument(
        "input",
        nargs="?",
        help="path to the input Markdown file (stdin if omitted)",
    )
    parser.add_argument(
        "-o",
        "--output",
        type=Path,
        help="path to the output HTML file (stdout if omitted)",
    )
    return parser.parse_args()


def main() -> None:
    args = parse_args()

    # Read the input
    if args.input:
        in_path = Path(args.input)
        if not in_path.exists():
            print(f"[ERROR] Input file not found: {in_path}", file=sys.stderr)
            sys.exit(1)
        md_text = in_path.read_text(encoding="utf-8")
    else:
        # Read from standard input
        md_text = sys.stdin.read()

    html_text = convert_markdown(md_text)

    if args.output:
        args.output.write_text(html_text, encoding="utf-8")
    else:
        print(html_text)


if __name__ == "__main__":
    main()
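
A note on the heading branch: as written, any line starting with # counts as a heading, so "#hashtag" becomes <h1>hashtag</h1>, and a line with seven # characters is clamped to h6 with the extra # left in the text. If stricter behaviour is wanted, one option is a regular expression that requires one to six # characters followed by whitespace. This is a sketch of an alternative, not the script's method, and the names HEADING_RE and parse_heading are hypothetical.

```python
import html
import re
from typing import Optional

# One to six "#", at least one whitespace character, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def parse_heading(line: str) -> Optional[str]:
    """Return an <hN> element for a heading line, or None if it is not one."""
    match = HEADING_RE.match(line.strip())
    if match is None:
        return None
    level = len(match.group(1))  # number of "#" gives the heading level
    text = html.escape(match.group(2).strip())
    return f"<h{level}>{text}</h{level}>"


print(parse_heading("## Sub heading"))
print(parse_heading("#hashtag"))
```

With this rule, lines such as "#hashtag" or a run of seven # characters return None, so the caller could fall through and treat them as ordinary paragraphs.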

Example run

$ python 17_md2html.py 17_sample.md
<h1>Title</h1>
<p>This is a paragraph.</p>
<h2>Subheading</h2>
<ul>
  <li>Apple</li>
  <li>Banana</li>
  <li>Mandarin orange</li>
</ul>
<ul>
  <li>Another list 1</li>
  <li>Another list 2</li>
</ul>
<p>An ordinary line of text.</p>
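
The two back-to-back <ul> blocks in this output follow from the blank-line rule: a blank line closes the open list, so the next bullet starts a new one. The sample file itself is not shown, so the input below is an assumption. Here is a simplified stand-in for the list handling only (render_lists is a hypothetical name; non-bullet lines are dropped and nothing is escaped):

```python
def render_lists(md_text: str) -> str:
    """Render only bullet lines, tracking whether a <ul> is currently open."""
    out: list[str] = []
    in_list = False
    for line in md_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("* ", "- ")):
            if not in_list:
                out.append("<ul>")  # first bullet opens a new list
                in_list = True
            out.append(f"  <li>{stripped[2:].strip()}</li>")
        elif in_list:
            # A blank (or any non-bullet) line ends the current list.
            out.append("</ul>")
            in_list = False
    if in_list:
        out.append("</ul>")  # close a list left open at the end of the input
    return "\n".join(out)


# Assumed input: two groups of bullets separated by one blank line.
sample = "- apple\n- banana\n\n- other 1\n- other 2"
print(render_lists(sample))
```

Removing the blank line from the assumed input merges the bullets into a single <ul>, which shows that the single in_list flag is all the state this part of the parser needs.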

Thoughts

Nothing in particular!!!
