Examining the ages and number of election wins of Japan's prefectural governors

Posted at 2020-10-31

Scraping the incumbent governors from the National Governors' Association (全国知事会) governor file

import datetime
import re

import pandas as pd
import matplotlib.pyplot as plt


def wareki2date(s):

    m = re.search("(昭和|平成|令和)([ 0-9元]{1,2})年([ 0-9]{1,2})月([ 0-9]{1,2})日", s)

    if m:

        year, month, day = [1 if i.strip() == "元" else int(i.strip()) for i in m.group(2, 3, 4)]

        if m.group(1) == "昭和":
            year += 1925
        elif m.group(1) == "平成":
            year += 1988
        elif m.group(1) == "令和":
            year += 2018

        return pd.Timestamp(year, month, day)

    else:
        return pd.NaT


df = pd.read_html("http://www.nga.gr.jp/app/chijifile/", attrs={"summary": "検索結果一覧"})[0]

# Convert Japanese era dates (wareki) to Gregorian dates
df["生年月日"] = df["生年月日"].apply(wareki2date)
df["選挙施行日"] = df["選挙施行日"].apply(wareki2date)
df["任期満了日"] = df["任期満了日"].apply(wareki2date)
df["就任年月"] = df["就任年月"].apply(wareki2date)

df["年齢"] = df["年齢"].str.rstrip("歳").astype(int)
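As a quick check of the era arithmetic in wareki2date, here is a minimal sketch that uses only the standard library. The helper name `wareki_to_date`, the `ERA_OFFSETS` table, and the sample strings are assumptions made for illustration and are not taken from the governor file. It returns `datetime.date` (or `None`) instead of `pd.Timestamp` / `pd.NaT` so that it runs without pandas.

```python
import datetime
import re

# Offset added to the era year to get the Gregorian year
# (昭和1年 = 1926, 平成1年 = 1989, 令和1年 = 2019).
ERA_OFFSETS = {"昭和": 1925, "平成": 1988, "令和": 2018}

PATTERN = re.compile(
    r"(昭和|平成|令和)([ 0-9元]{1,2})年([ 0-9]{1,2})月([ 0-9]{1,2})日"
)


def wareki_to_date(s):
    """Hypothetical helper: return a datetime.date, or None if no era date is found."""
    m = PATTERN.search(s)
    if m is None:
        return None
    era, year, month, day = m.groups()
    # "元" (gannen) denotes the first year of an era.
    year = 1 if year.strip() == "元" else int(year)
    # int() ignores surrounding spaces, so space-padded fields such as " 3" are fine.
    return datetime.date(ERA_OFFSETS[era] + year, int(month), int(day))


print(wareki_to_date("昭和39年10月10日"))  # 1964-10-10
print(wareki_to_date("令和元年5月1日"))    # 2019-05-01
print(wareki_to_date("不明"))              # None
```

Strings that contain no day component do not match this pattern and yield `None`, just as wareki2date yields `pd.NaT` for them.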

Age (年齢)

df["年齢"].value_counts(bins=[30,40,45,50,55,60,65,70,75,80]).sort_index().plot.bar()

Figure: ages.png (bar chart of the number of governors per age bin)
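To make the binning explicit: when `bins` is given as a list of edges, `value_counts` counts values per right-closed interval, so (55, 60] includes 60 but not 55, and, as far as I understand, the lowest edge itself falls into the first bin. The following is a standard-library sketch of that counting. The helper `count_by_bins` and the `sample_ages` list are hypothetical and are not the governor data.

```python
from bisect import bisect_left


def count_by_bins(values, edges):
    """Hypothetical helper: count values per right-closed interval (edges[i], edges[i+1]]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside all bins: not counted
        # bisect_left places a value equal to an edge into the interval that
        # ends at that edge; max(..., 0) keeps the lowest edge in the first bin.
        i = max(bisect_left(edges, v) - 1, 0)
        counts[i] += 1
    return counts


edges = [30, 40, 45, 50, 55, 60, 65, 70, 75, 80]
sample_ages = [39, 55, 56, 60, 61, 78]  # made-up values, not the real data

for lo, hi, n in zip(edges, edges[1:], count_by_bins(sample_ages, edges)):
    print(f"({lo}, {hi}]: {n}")
```

With these made-up ages, 55 is counted in (50, 55] while 56 and 60 are both counted in (55, 60].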

df["年齢"].describe()

count 47.000000
mean 61.680851
std 9.273868
min 39.000000
25% 56.000000
50% 60.000000
75% 69.500000
max 78.000000
Name: 年齢, dtype: float64

Number of election wins (当選回数)

df["当選回数"].describe()

count 47.000000
mean 2.765957
std 1.447828
min 1.000000
25% 1.000000
50% 3.000000
75% 4.000000
max 7.000000
Name: 当選回数, dtype: float64

Japan Association of City Mayors (全国市長会)

The list published by the Japan Association of City Mayors contains only names.
