Automating GIS Processes 2024 code-along: Exercise 4 (Problem 2) and Lesson 4 review

Last updated at 2025-08-13. Posted at 2025-08-12.

Problem 2: Calculate and visualise the dominance areas of shopping centres (10 points)

In short: compute and visualise the dominance area of each shopping centre (10 points).

In this problem, the aim is to delineate the dominance area of each shopping centre. For this exercise, we define the ‘dominance area’ of a shopping centre as the area from which it can be reached faster than other shopping centres. We will use public transport travel times.

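The goal is to outline the dominance area of each shopping centre.
Here, a shopping centre's "dominance area" is the area from which it can be reached faster than any other shopping centre, measured by public transport travel time.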

Sample result: a map showing the areas of dominance of each shopping centre, and the travel times to the closest shopping centre in the entire metropolitan area

Sample answer: a single map showing each shopping centre's dominance area and the travel time to the closest shopping centre.

Data

The input data is identical to what you have used for problem 1, see there for detailed data descriptions.

The input data is the same as in Problem 1, so see that problem for the details.

An overview of the tasks

Overview

This task comprises three major subtasks. In contrast to earlier exercises, we do not provide a detailed, step-by-step ‘cooking recipe’. Rather, you are free to implement the necessary steps in any order you see fit, and choose any variable names of your liking.

There are three subtasks. Unlike the earlier exercises, no detailed step-by-step recipe is given: implement the necessary steps in whatever order seems best, with whatever variable names you like.

To test intermediate results, implement assert statements, output the head() of a data frame, or plot the data. Remember to add comments to all of your code, so future you (and us) can understand what each section does.

To test intermediate results, use assert, show the first rows of the data frame with the head() method, or plot the data.

The only strict requirement is the file name of the output map plot: DATA_DIRECTORY / "dominance_areas.png".

The only requirement is that the plotted map is saved as DATA_DIRECTORY / "dominance_areas.png".

Load the YKR grid and the individual travel time data sets, and combine them into one geo data frame. This is essentially the same as problem 1, except that you must load all eight shopping centre data files. (2 points)

Find the closest shopping centre to each grid cell. In the combined data set, find the minimum travel time to any of the shopping centres, save the value in a new column, and the shopping centre name in another new column. (4 points) See the hints to this exercise for suggestions on how to achieve this (pandas.DataFrame.min() and pandas.DataFrame.idxmin() will be helpful)

Visualise the dominance areas and travel times. Use 2⨉1 subplots to plot the most dominant (closest) shopping centre for each grid cell, and the travel time to the closest shopping centre for each grid cell. (4 points)

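Load the YKR grid and the travel time data set of each shopping centre, and combine them into a single GeoDataFrame.
For each grid cell, find the closest shopping centre: from the combined data set, find the minimum travel time to any of the shopping centres and save it in a new column, then save the name of that shopping centre in another new column. (4 points) The hints for this exercise explain how (pandas.DataFrame.min() and pandas.DataFrame.idxmin() help).
Visualise the dominance areas and the travel times. Use 2⨉1 subplots to show the most dominant (closest) shopping centre and the travel time to it side by side. (4 points)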

Now let's put it into practice.

import pathlib
NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"

Check that the files exist.
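The cell that performs this check is not reproduced in this post. As a minimal sketch of one way to do it, the helper below (the name `summarise_inputs` is hypothetical) reports whether the grid file exists and which travel time files are present. It assumes the grid file name and the `travel_times_to_*.txt` pattern that the code further down uses.

```python
import pathlib


def summarise_inputs(data_directory: pathlib.Path) -> dict:
    """Report whether the grid file exists and which travel time files were found."""
    # File name and glob pattern are taken from the cells later in this post.
    grid_file = data_directory / "YKR_grid_EPSG3067.gpkg"
    travel_time_files = sorted(data_directory.glob("travel_times_to_*.txt"))
    return {
        "grid_exists": grid_file.exists(),
        "travel_time_files": [path.name for path in travel_time_files],
    }
```

Calling `summarise_inputs(DATA_DIRECTORY)` in the notebook would then show at a glance whether anything is missing before the data is loaded.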

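Just in case, install geopandas. It is occasionally missing, and I do not know why.

Load the YKR grid data set and show its first rows. Because it will be join()ed later (an index-based join), also use set_index() to turn the join key "YKR_ID" into the index.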

import geopandas as gpd

grid: gpd.GeoDataFrame = gpd.read_file(DATA_DIRECTORY / "YKR_grid_EPSG3067.gpkg").set_index("YKR_ID")  # use the YKR_ID column as the index

grid.head()

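Load the data for each shopping centre and join it to grid. There are several shopping centres, so this is done in a loop.

File names follow the pattern travel_times_to_[7-digit number]_[shopping centre name].txt.

So use the glob() method to collect the file names that match travel_times_to_*.txt and loop over them.
Inside the loop:

・Because the data will be join()ed later (an index-based join), use set_index() to turn the join key "from_id" into the index
・Use the replace() method to replace "pt_r_t" (travel time) values of -1 with np.nan (not a number)
・Use the rename() method to rename "pt_r_t" (travel time). For each file, input_file.stem[prefix_len:] takes input_file.stem (the file name without its extension) from the 25th character (index 24) onward, and that becomes the column name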
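As a small illustration of the name extraction described above, here is a sketch using hypothetical file names that assume a seven-digit number in the middle. It shows the fixed-length slice and, as an alternative that does not depend on the length of the number, a split on underscores.

```python
import pathlib

# Hypothetical file name following the pattern described above.
example_file = pathlib.Path("travel_times_to_1234567_Jumbo.txt")

# Slicing approach: drop a fixed-length prefix from the stem.
prefix_len = len("travel_times_to_") + len("1234567_")  # 16 + 8 = 24
name_by_slice = example_file.stem[prefix_len:]

# Splitting approach: keep everything after the fourth underscore,
# so a name that itself contains an underscore stays intact.
other_file = pathlib.Path("travel_times_to_1234567_Iso_Omena.txt")
name_by_split = other_file.stem.split("_", 4)[-1]

print(name_by_slice)  # Jumbo
print(name_by_split)  # Iso_Omena
```

The slice is shorter to write, but it silently produces a wrong column name if the number ever has a different length; the split avoids that dependency.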

import pandas as pd
import numpy as np

prefix_len = len("travel_times_to_99999999")  # 24 characters, up to and including the underscore after the number

for input_file in DATA_DIRECTORY.glob("travel_times_to_*.txt"):

    place_name = input_file.stem[prefix_len:]  # file name without extension, minus its first 24 characters

    df: pd.DataFrame = pd.read_csv(
        input_file, sep=";"
    )[
        ["from_id", "pt_r_t"]
    ].set_index(
        "from_id"
    ).replace(
        -1, np.nan
    ).rename(
        columns={"pt_r_t": place_name}
    )
    grid = grid.join(df)


grid

Store the smallest of the travel times to the shopping centres in the "min" column.
Passing axis="columns" to the min() method returns, for each row of grid, the minimum over the ["Myyrmanni", "Iso_Omena", "Jumbo", "Dixi", "Ruoholahti", "Forum", "Itis"] columns.

# store the minimum travel time across the shopping centres in the "min" column
grid["min"] = grid[["Myyrmanni", "Iso_Omena", "Jumbo", "Dixi", "Ruoholahti", "Forum", "Itis"]].min(axis="columns")

grid

Next, store the name of the column with the smallest travel time in the "idxmin" column. This is much the same as above.

# store the name of the column with the minimum travel time in the "idxmin" column
grid["idxmin"] = grid[["Myyrmanni", "Iso_Omena", "Jumbo", "Dixi", "Ruoholahti", "Forum", "Itis"]].idxmin(axis="columns")

grid
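To make the row-wise behaviour of min() and idxmin() concrete, here is a sketch on a toy data frame. The values and the extra column "x" are made up, and the travel time columns are derived from the frame instead of being typed out. That derivation is an assumption about how one might avoid leaving a shopping centre out of a hard-coded list; it is not what the cells above do.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the joined grid: one non-travel-time column plus
# three shopping centre columns with made-up travel times in minutes.
toy = pd.DataFrame(
    {
        "x": [0, 1, 2],
        "Jumbo": [30.0, 55.0, 20.0],
        "Forum": [25.0, np.nan, 40.0],
        "Itis": [45.0, 50.0, 20.0],
    },
    index=pd.Index([101, 102, 103], name="YKR_ID"),
)

# Columns that existed before the travel times were joined.
original_columns = ["x"]
time_columns = [c for c in toy.columns if c not in original_columns]

toy["min"] = toy[time_columns].min(axis="columns")        # NaN values are skipped
toy["idxmin"] = toy[time_columns].idxmin(axis="columns")  # ties go to the first column

print(toy[["min", "idxmin"]])
# 101 -> 25.0, Forum; 102 -> 50.0, Itis; 103 -> 20.0, Jumbo
```

In the notebook, the equivalent of `original_columns` could be captured with `list(grid.columns)` before the join loop runs. Row 103 shows that when two centres tie, idxmin() reports whichever column comes first.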

A FutureWarning (a warning about future behaviour) has appeared.

/tmp/ipython-input-1566533038.py:2: FutureWarning: The behavior of DataFrame.idxmin with all-NA values, or any-NA and skipna=False, is deprecated. In a future version this will raise ValueError
  grid["idxmin"] = grid[["Myyrmanni","Iso_Omena",	"Jumbo",	"Dixi", "Ruoholahti", "Forum", "Itis"]].idxmin(axis="columns")

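In other words: calling idxmin on all-NA values, or on data containing any NA with skipna=False, is deprecated, and a future version will raise ValueError.

Hmm, that is a little puzzling. The pandas documentation says skipna is treated as True when it is not specified.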

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmin.html
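One possible reading of the warning is that its first condition, "all-NA values", applies whatever skipna is set to. The grid presumably contains cells with no valid travel time to any shopping centre, since -1 was replaced with np.nan. The sketch below, on made-up data, shows one way to avoid the situation under that assumption: run idxmin() only on rows that have at least one valid value.

```python
import numpy as np
import pandas as pd

# Made-up travel times; the second row has no valid value at all.
toy = pd.DataFrame(
    {
        "Jumbo": [30.0, np.nan, 20.0],
        "Forum": [25.0, np.nan, np.nan],
    }
)
time_columns = ["Jumbo", "Forum"]

# Rows with at least one valid travel time.
has_value = toy[time_columns].notna().any(axis="columns")

# idxmin() only sees rows with data. The assignment aligns on the index,
# so the all-NaN row ends up as a missing value in the new column.
toy["idxmin"] = toy.loc[has_value, time_columns].idxmin(axis="columns")

print(toy["idxmin"].tolist())
```

The first and third rows come out as "Forum" and "Jumbo", and the second row stays missing, which matches the rows that are filtered out before plotting anyway.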

Moving on for now. The calculation is done, so let's plot the result, starting with the dominance areas.

Note that grid[~pd.isna(grid["min"])] excludes the rows whose value is NaN (not a number) from the plot.

grid[~pd.isna(grid["min"])].plot(
    column="idxmin",
    legend=True,
)

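Hmm, compared with the sample result (figure below), the lower right (the colouring of Forum and Itis) is different. There may be other differences too.

Well, never mind; moving on. Next, plot the travel time.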

grid[~pd.isna(grid["min"])].plot(
    column="min",
    cmap="RdYlBu",
    legend=True,
)

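This one looks the same as the sample result (figure below).

In most areas (grid cells), it seems you can reach one of the shopping centres within an hour by public transport.

Finally, plot the two together. The legend is laid out differently here, for readability.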

import matplotlib.pyplot as plt

# define a figure with 1 row ⨉ 2 columns and a size of 15 ⨉ 15 inches
fig, ax = plt.subplots(
    nrows=1,
    ncols=2,
    figsize=(15, 15)
)

# left: plot the closest shopping centre, excluding cells whose travel time is NaN
target_ax = ax[0]
target_ax.set_title("Area of dominance of shopping centres")

grid[~pd.isna(grid["min"])].plot(
    ax=target_ax,
    column="idxmin",
    legend=True,
    legend_kwds={
        "loc": "upper center",
        "ncol": 4,
    },
)

# right: plot the travel time to the closest shopping centre, excluding cells whose travel time is NaN
target_ax = ax[1]
target_ax.set_title("Travel time to the closest shopping centre")

grid[~pd.isna(grid["min"])].plot(
    ax=target_ax,
    column="min",
    cmap="RdYlBu",
)

Save the plot.

fig.savefig(DATA_DIRECTORY / "dominance_areas.png")

Verify that the file was saved.

# NON-EDITABLE TEST CELL
# Check that output figure file exists
assert (DATA_DIRECTORY / "dominance_areas.png").exists()

Looks good.

The rest below is skipped.

Do not forget to plot the result map, and save it to DATA_DIRECTORY / "dominance_areas.png"!

Reflections

This was a significantly more complex exercise than previous ones, and it included finding a solution yourself.

What was most difficult part?

Where did you get stuck?

What was the easiest, and

what was the most fun part of this exercise?

Add your answer below

Well done!

Congratulations, you completed exercise 4. Good Job!

Lesson 4 review

Learning goals
After this lesson, you should know how to:

create new geometries by adding, subtracting or intersecting two geometries,

combine geometries based on a common attribute (dissolving them),

create categories for numerical data based on classifiers such as natural breaks, equal interval, or quantiles, and

simplify geometries according to a maximum-error threshold

Create new geometries by adding, subtracting, or intersecting two geometries → done
Combine geometries that share a common attribute (into a single geometry) → done
Create categories for numerical data with classifiers such as natural breaks, equal interval, or quantiles → done
Simplify geometries according to a maximum-error threshold (the largest acceptable error) → done

Next is Lesson 5. I wonder how long it will take.
