Learning Python Through the Classics: A Curriculum

Posted at 2025-03-30

Curriculum Structure for Learning Python Through the Classics

Chapter 1: Introduction and environment setup

What is Python, and why pair it with the classics?
How to use Google Colab / Jupyter Notebook
Preparing to load classical text data (txt, csv)
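
As a rough idea of what "preparing to load text data" could look like, here is a minimal sketch using only the standard library. The file names `rongo.txt` and `passages.csv`, the `title`/`passage` column names, and the helper functions `load_text` and `load_csv` are assumptions made up for this illustration; the sample files are written to a temporary folder first so the code runs anywhere.

```python
import csv
import tempfile
from pathlib import Path


def load_text(path):
    """Read a whole UTF-8 text file and strip surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def load_csv(path):
    """Read a UTF-8 CSV file with a header row into a list of dicts."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


# Write two small sample files (hypothetical names) so the sketch is self-contained.
workdir = Path(tempfile.mkdtemp())
txt_path = workdir / "rongo.txt"
csv_path = workdir / "passages.csv"
txt_path.write_text("学而時習之、不亦説乎。\n", encoding="utf-8")
csv_path.write_text(
    "title,passage\n論語,学而時習之、不亦説乎。\n徒然草,つれづれなるままに\n",
    encoding="utf-8",
)

passage = load_text(txt_path)
rows = load_csv(csv_path)

print(passage)
for row in rows:
    print(row["title"], ":", row["passage"])
```

Passing `encoding="utf-8"` explicitly is the main point here: without it, the platform default encoding may garble Japanese text on some systems.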

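Chapter 2: Reading kanbun and waka with string processing

The str type, slicing, and counting characters
Loading and cleaning up classical Japanese and kanbun texts
Example: read a passage from 『論語』 (the Analects) or 『徒然草』 (Tsurezuregusa) and display it

# Example: the Analects, 「学而時習之、不亦説乎」
text = "学而時習之、不亦説乎。"
print(text[0:2])  # 学而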

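Chapter 3: Savoring famous sayings again and again with for loops

for loops and enumerate
Printing one character or one phrase at a time
Stepping through the 5-7-5-7-7 structure of a waka, counted in kana rather than kanji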

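Chapter 4: Literary judgment with conditionals

The basics of if / else
Color-coding the output according to specific words (e.g., sorrow, grief, joy)

# Example: classify a word that expresses a feeling
word = "喜"
if word in ["喜", "楽"]:
    print("Positive feeling")
elif word in ["哀", "憂"]:
    print("Negative feeling")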
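
The example above only prints a label, so here is one possible way to do the color-coded output mentioned in this chapter. It is a sketch that assumes the output goes to a terminal or notebook cell that understands ANSI escape codes; the `colorize` function and the color choices (green for positive, blue for negative) are my own assumptions, not something fixed by the curriculum.

```python
# ANSI escape codes; most terminals and notebook outputs render them as colors.
GREEN = "\033[32m"
BLUE = "\033[34m"
RESET = "\033[0m"

POSITIVE = {"喜", "楽"}
NEGATIVE = {"哀", "憂"}


def colorize(word):
    """Wrap a feeling word in a color code according to its category."""
    if word in POSITIVE:
        return f"{GREEN}{word}{RESET}"
    elif word in NEGATIVE:
        return f"{BLUE}{word}{RESET}"
    else:
        return word  # unclassified words are left uncolored


for w in ["喜", "憂", "怒"]:
    print(colorize(w))
```

Using sets for the two categories keeps the membership test readable and makes it easy to add more words later.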

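Chapter 5: Managing vocabulary with dictionaries and lists

How to use dict and list
Example: classify and manage seasonal words (kigo) for seasons such as 「春」 (spring) and 「秋」 (autumn)
Build a dictionary that maps pillow words (makurakotoba) to the words they introduce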

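Chapter 6: Morphological analysis and the meaning of kanji

Morphological analysis with MeCab or Janome
Extracting particles and auxiliary verbs from classical Japanese and kanbun
Readings of kanji compounds, and separating okurigana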
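
To give a feel for the particle and auxiliary-verb extraction in this chapter, here is a minimal sketch built around Janome. `Tokenizer`, `tokenize`, `surface`, and `part_of_speech` are Janome's own names; the helper functions `pick_function_words` and `tokenize_with_janome` are hypothetical names I made up. The sample tokens are tagged by hand for illustration and are not Janome's actual output. Keep in mind that Janome's bundled dictionary targets modern Japanese, so results on classical text are likely to be rough.

```python
def pick_function_words(tokens):
    """Keep the surfaces of particles (助詞) and auxiliary verbs (助動詞).

    `tokens` is an iterable of (surface, part_of_speech) pairs, where
    part_of_speech is a comma-separated string such as "助詞,格助詞,一般,*".
    """
    targets = ("助詞", "助動詞")
    return [surface for surface, pos in tokens if pos.split(",")[0] in targets]


def tokenize_with_janome(text):
    """Tokenize with Janome (third-party; install with `pip install janome`)."""
    from janome.tokenizer import Tokenizer  # imported lazily on purpose

    tokenizer = Tokenizer()
    return [(t.surface, t.part_of_speech) for t in tokenizer.tokenize(text)]


# Hand-tagged tokens for 「今は昔、竹取の翁といふものありけり」 (illustrative only).
sample = [
    ("今", "名詞,副詞可能,*,*"),
    ("は", "助詞,係助詞,*,*"),
    ("昔", "名詞,副詞可能,*,*"),
    ("竹取", "名詞,固有名詞,*,*"),
    ("の", "助詞,連体化,*,*"),
    ("翁", "名詞,一般,*,*"),
    ("と", "助詞,格助詞,引用,*"),
    ("いふ", "動詞,自立,*,*"),
    ("もの", "名詞,非自立,一般,*"),
    ("あり", "動詞,自立,*,*"),
    ("けり", "助動詞,*,*,*"),
]

print(pick_function_words(sample))  # ['は', 'の', 'と', 'けり']
```

With Janome installed, `pick_function_words(tokenize_with_janome("今は昔、竹取の翁といふものありけり"))` runs the same filter on real tokenizer output.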

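Chapter 7: Visualizing data (word frequency and networks)

matplotlib / wordcloud / networkx
Show a ranking of frequent words in the classics as a bar chart
Visualize the co-occurrence network around specific words (such as 「心」 and 「夢」)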
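
The co-occurrence idea can be sketched as follows. To stay dependency-free, the counting part works on single characters instead of properly tokenized words, which is a simplification, and the three short "sentences" are made-up toy data. The function names `cooccurrence_counts`, `neighbors_of`, and `draw_network` are hypothetical; only `draw_network` needs networkx and matplotlib, and it imports them lazily.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often two distinct characters appear in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        unique_chars = sorted(set(sentence))
        pairs.update(combinations(unique_chars, 2))
    return pairs


def neighbors_of(target, pairs):
    """Return {other_char: count} for every counted pair that contains `target`."""
    result = {}
    for (a, b), n in pairs.items():
        if a == target:
            result[b] = n
        elif b == target:
            result[a] = n
    return result


def draw_network(target, pairs):
    """Draw `target` and its neighbors (needs networkx and matplotlib)."""
    import matplotlib.pyplot as plt
    import networkx as nx

    graph = nx.Graph()
    for other, n in neighbors_of(target, pairs).items():
        graph.add_edge(target, other, weight=n)
    nx.draw(graph, with_labels=True)
    plt.show()


sentences = ["心の夢", "夢の橋", "心の月"]  # made-up toy data
pairs = cooccurrence_counts(sentences)
print(neighbors_of("心", pairs))
```

Calling `draw_network("心", pairs)` would then plot the graph; as with other matplotlib output, the Japanese node labels only display properly when a Japanese-capable font is configured.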

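Chapter 8: Identifying stylistic features with machine learning

Vectorization (TF-IDF) and clustering
Classify the stylistic differences among 『源氏物語』, 『枕草子』, and 『徒然草』
Classifier: SVM or random forest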
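
To show what TF-IDF vectorization does before handing it to a library, here is a small pure-Python sketch. It is a stand-in, not the planned pipeline: it uses character bigrams as terms, one common TF-IDF variant (`tf * (log(N / df) + 1)`), and cosine similarity instead of an actual clustering step or classifier. The function names and the three short sample strings are assumptions for illustration.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Split a text into overlapping two-character units."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of terms in the document
    idf = log(N / df) + 1, where df is the number of documents containing the term
    """
    tokenized = [char_bigrams(d) for d in docs]
    n_docs = len(docs)
    df = Counter()
    for terms in tokenized:
        df.update(set(terms))
    vectors = []
    for terms in tokenized:
        counts = Counter(terms)
        total = len(terms)
        vectors.append(
            {t: (c / total) * (math.log(n_docs / df[t]) + 1) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


docs = [
    "春はあけぼの。やうやう白くなりゆく山ぎは",
    "つれづれなるままに、日くらし硯に向かひて",
    "春はあけぼの。やうやう白くなりゆく",
]
vecs = tfidf_vectors(docs)
print("doc 0 vs doc 2:", round(cosine(vecs[0], vecs[2]), 3))
print("doc 0 vs doc 1:", round(cosine(vecs[0], vecs[1]), 3))
```

In a real notebook, scikit-learn's `TfidfVectorizer(analyzer="char", ngram_range=(2, 2))` followed by a classifier such as `SVC` or `RandomForestClassifier` would probably replace all of this; its TF-IDF weighting differs slightly from the formula used here.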

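Chapter 9: Automatic generation and creative writing

Automatic waka generation with a Markov chain
Working with the ChatGPT API to compose waka in "modern" and "classical" styles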
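
For the Markov chain part, a minimal character-level sketch could look like this. The corpus is just the kana reading of the single waka quoted in this article, so the output will mostly reshuffle that poem; the function names `build_chain` and `generate`, the start character, and the fixed random seed are assumptions for illustration. The sketch does not enforce the 5-7-5-7-7 structure.

```python
import random
from collections import defaultdict


def build_chain(corpus):
    """Map each character to the list of characters that follow it."""
    chain = defaultdict(list)
    for text in corpus:
        for current, following in zip(text, text[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, length, rng):
    """Walk the chain from `start` until `length` characters or a dead end."""
    out = [start]
    while len(out) < length:
        candidates = chain.get(out[-1])
        if not candidates:
            break  # the last character never had a successor in the corpus
        out.append(rng.choice(candidates))
    return "".join(out)


corpus = [
    "はるのよのゆめのうきはしとだえしてみねにわかるるよこぐものそら",
]
chain = build_chain(corpus)
rng = random.Random(0)  # fixed seed so repeated runs give the same poem
poem = generate(chain, "は", 31, rng)
print(poem)
```

Because successors are stored with repetition, frequent transitions are picked more often. Adding more poems to `corpus` is what makes the output start to vary in interesting ways.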

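Target deliverables

Automatic analysis tools for classical Japanese and kanbun
A waka analysis app linked to a kigo dictionary
Network diagrams and graphs of emotion distribution
A classics-generating bot you can play with in Python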

Trying it out with GPT

# ---------------------
#  Chapter 2: String processing
# ---------------------

text = "学而時習之、不亦説乎。"
print("Chapter 2: Reading kanbun with string processing")
print("Full text:", text)
print("First 2 characters (slice):", text[0:2])  # 学而
print("Character count:", len(text))
print()

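# ---------------------
#  Chapter 3: Repeating a famous saying with a for loop
# ---------------------

print("Chapter 3: Savoring a famous saying with a for loop (one character at a time)")
for i, char in enumerate(text, start=1):
    print(f"Character {i}: {char}")
print()

# Check the waka structure (5-7-5-7-7).
# The pattern counts kana, not kanji, so the slices are taken from the
# kana reading rather than from the kanji spelling.
waka = "春の夜の夢の浮橋とだえして峰にわかるる横雲の空"
waka_kana = "はるのよのゆめのうきはしとだえしてみねにわかるるよこぐものそら"
print("Waka:", waka)
print("Reading:", waka_kana)
print("Structure check:")
lines = [waka_kana[0:5], waka_kana[5:12], waka_kana[12:17], waka_kana[17:24], waka_kana[24:]]
for i, line in enumerate(lines, start=1):
    print(f"Line {i} ({len(line)} characters): {line}")
print()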

# ---------------------
#  Chapter 4: Literary judgment with conditionals
# ---------------------

print("Chapter 4: Classifying feeling words")
words = ["喜", "楽", "哀", "憂", "怒"]
for word in words:
    if word in ["喜", "楽"]:
        print(f"{word}: positive feeling")
    elif word in ["哀", "憂"]:
        print(f"{word}: negative feeling")
    else:
        print(f"{word}: cannot be classified")
print()

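# ---------------------
#  Chapter 5: Managing vocabulary with dictionaries and lists
# ---------------------

# Seasonal words (kigo) grouped by season
kigo_dict = {
    "春": ["桜", "霞", "うぐいす"],
    "秋": ["紅葉", "虫の声", "月"]
}

# Pillow words (makurakotoba) and the words they introduce
makurakotoba_dict = {
    "あしひきの": "山",
    "しろたへの": "衣・袖",
    "たらちねの": "母"
}

print("Chapter 5: Dictionary of seasonal words (kigo)")
for season, kigo_list in kigo_dict.items():
    print(f"Kigo for {season}: {', '.join(kigo_list)}")
print()

print("Dictionary of pillow words (makurakotoba) and the words they introduce")
for makura, target in makurakotoba_dict.items():
    print(f"「{makura}」 → introduces 「{target}」")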

# Import the required libraries
import matplotlib.pyplot as plt
from wordcloud import WordCloud
from collections import Counter
import re

# --------------------------
# Define the classical Japanese texts
# --------------------------
texts = {
    '源氏物語': "いづれの御時にか、女御、更衣あまたさぶらひたまひけるなかに...",
    '枕草子': "春はあけぼの。やうやう白くなりゆく山ぎは、少しあかりて...",
    '徒然草': "つれづれなるままに、日くらし硯に向かひて、心にうつりゆくよしなし事を..."
}

# --------------------------
# Preprocessing: remove punctuation and symbols
# --------------------------
def preprocess(text):
    # Remove punctuation, brackets, and whitespace, including the ASCII "..."
    # that marks where each sample is truncated
    return re.sub(r'[、。…「」『』・（）.\s]', '', text)

# Preprocess all texts
preprocessed_texts = {title: preprocess(content) for title, content in texts.items()}

# --------------------------
# Count frequent characters
# --------------------------
all_characters = []
for content in preprocessed_texts.values():
    all_characters.extend(content)  # A string is extended one character at a time

# Count character frequencies
char_freq = Counter(all_characters)

# --------------------------
# Bar chart of the top 10 frequent characters
# --------------------------
top_chars = char_freq.most_common(10)
chars, counts = zip(*top_chars)

# The x-axis labels are Japanese characters, so matplotlib needs a font with
# Japanese glyphs; with the default font they may be drawn as empty boxes.
plt.figure(figsize=(10, 5))
plt.bar(chars, counts)
plt.title("Top 10 Frequent Characters in Classical Texts")
plt.xlabel("Character")
plt.ylabel("Frequency")
plt.grid(True)
plt.tight_layout()
plt.show()

# --------------------------
# Draw the word cloud
# --------------------------
# Font path (change it to match your environment)
font_path = "/usr/share/fonts/opentype/ipafont-gothic/ipagp.ttf"

# Create the WordCloud object
wc = WordCloud(font_path=font_path, background_color="white", width=800, height=400)

# Generate the word cloud from the character frequencies
wc.generate_from_frequencies(char_freq)

# Plot
plt.figure(figsize=(12, 6))
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.title("Word Cloud of Classical Texts")
plt.tight_layout()
plt.show()
