0
0

pythonのdockerでeccodesのv28のpygribを使って気象庁のgrib2から気温と日射量を取り出したメモ

Last updated at Posted at 2023-03-06

概要

気象庁のデータを読んでみたかった。
ソース

pygribについてはpygribで気象庁の数値予報GPVデータを読み込む
eccodesについてはecCodesに入ったGRIB2 Template 5.200/7.200サポートを用いてpygribで気象庁全国合成レーダーGPVを読むを参考にした。

フォルダ構成

- docker-compose.yml
- eccodes
  - Dockerfile 
- src
  - get-rad-tem.py
- grib2
  - Z__C_RJTD_20171205000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin

ソース

eccodes/dockerfile
FROM python:3

WORKDIR /app
RUN wget https://confluence.ecmwf.int/download/attachments/45757960/eccodes-2.28.0-Source.tar.gz \
  && tar -xvf eccodes-2.28.0-Source.tar.gz \
  && rm eccodes-2.28.0-Source.tar.gz

RUN apt-get update

# eccodes v28からc++のコンパイルが必要なのでcmakeの最新版をインストール
RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null
RUN apt install -y software-properties-common 
RUN apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
RUN apt update
RUN apt install -y cmake

RUN apt-get install -y \
  build-essential \
  gfortran \
  libaec-dev  \
  && rm -rf /var/lib/apt/lists/*

RUN mkdir build && cd build && cmake ../eccodes-2.28.0-Source && make && make install

RUN pip3 install eccodes
RUN pip3 install pygrib
RUN pip3 install pandas
docker-compose.yml
version: "3.5"

services:
  eccodes:
    build:
      context: ./eccodes
    volumes:
      - ./grib2:/grib2
      - ./src:/src
    command: python /src/get-rad-tem.py '{"lat":35.6745,"lon":139.7169}'
get-rad-tem.py
from datetime import timedelta
import pygrib
import pandas as pd
import json
import sys

# 日本時間に直す用
time_diff = timedelta(hours=9)

# ケルビンから℃変換用
F_C_DIFF=273.15

# 1kmメッシュの刻み幅
LAT_STEP=0.025
LON_STEP=0.03125

# マージ用
def merge_lists(list1, list2):
    merged_list = []
    for item1 in list1:
        for item2 in list2:
            if item1[0] == item2[0] and item1[1] == item2[1] and item1[2] == item2[2]:
                merged_list.append([item1[0], item1[1], item1[2], item1[3], item2[3]])
                break
    return merged_list

# 引数取得
json_str = sys.argv[1]
json_obj = json.loads(json_str)
lat = json_obj["lat"]
lon = json_obj["lon"]

gpv_file = pygrib.open("/grib2/Z__C_RJTD_20171205000000_MSM_GPV_Rjp_Lsurf_FH00-15_grib2.bin")
t_messages = gpv_file.select(parameterName="Downward short-wave radiation flux")

# 日射量データ取得
radiation_data = []
for grb in t_messages:
    values, lats, lons = grb.data(lat1=lat - LAT_STEP,lat2=lat + LAT_STEP,lon1=lon - LON_STEP,lon2=lon + LON_STEP)
    jst =  grb.validDate + time_diff
    for i, lat in enumerate(lats):
        for j, x in enumerate(lat):
          radiation_data.append([jst, lats[i][j], lons[i][j], values[i][j]])
    
t_messages_temperature = gpv_file.select(name="Temperature")

# 気温データ取得
temperature_data = []
for grb in t_messages_temperature:
    values, lats, lons = grb.data(lat1=lat - LAT_STEP,lat2=lat + LAT_STEP,lon1=lon - LON_STEP,lon2=lon + LON_STEP)
    jst =  grb.validDate + time_diff
    for i, lat in enumerate(lats):
        for j, x in enumerate(lat):
          temperature_data.append([jst, lats[i][j], lons[i][j], values[i][j] - F_C_DIFF])

gpv_file.close()

# DataFrameの作成
merged = merge_lists(radiation_data, temperature_data)
df = pd.DataFrame(merged, columns=['Date', 'Latitude', 'Longitude', 'Radiation', 'Temperature'])


print(df)

実行結果

$docker-compose run eccodes
                  Date  Latitude  Longitude   Radiation  Temperature
0  2017-12-05 09:00:00     35.65   139.6875  423.758726     9.810709
1  2017-12-05 10:00:00     35.65   139.6875  517.047066    11.409219
2  2017-12-05 11:00:00     35.65   139.6875  551.867081    12.358789
3  2017-12-05 12:00:00     35.65   139.6875  506.780586    13.116861
4  2017-12-05 13:00:00     35.65   139.6875  373.049217    13.770517
5  2017-12-05 14:00:00     35.65   139.6875  289.076391    14.698541
6  2017-12-05 15:00:00     35.65   139.6875  125.875000    14.488687
7  2017-12-05 16:00:00     35.65   139.6875    7.375000    13.063196
8  2017-12-05 17:00:00     35.65   139.6875    0.000000    11.467722
9  2017-12-05 18:00:00     35.65   139.6875    0.000000    10.320886
10 2017-12-05 19:00:00     35.65   139.6875    0.000000     9.592111
11 2017-12-05 20:00:00     35.65   139.6875    0.000000     8.797952
12 2017-12-05 21:00:00     35.65   139.6875    0.000000     8.171686
13 2017-12-05 22:00:00     35.65   139.6875    0.000000     7.832407
14 2017-12-05 23:00:00     35.65   139.6875    0.000000     7.461450

2023.11.24 追記 GSMからも読む

GSM-GPV(日本域0.1度×0.125度) より、刻み幅がMSMと異なるため、STEPを変更する。

docker-comopse.yml
-    command: python /src/get-rad-tem.py '{"lat":35.6745,"lon":139.7169}'
+    command: python /src/get-rad-tem-gsm.py '{"lat":35.6745,"lon":139.7169}'
# メッシュの刻み幅
- LAT_STEP=0.025
- LON_STEP=0.03125
+ LAT_STEP=0.1 / 2
+ LON_STEP=0.125 / 2

実行結果

docker-python-eccodes-1  |                   Date  Latitude  Longitude   Radiation  Temperature
docker-python-eccodes-1  | 0  2022-10-13 09:00:00      35.7     139.75  251.806757    17.588800
docker-python-eccodes-1  | 1  2022-10-13 10:00:00      35.7     139.75  246.188946    18.273828
docker-python-eccodes-1  | 2  2022-10-13 11:00:00      35.7     139.75  246.010120    18.831323
docker-python-eccodes-1  | 3  2022-10-13 12:00:00      35.7     139.75  236.026525    19.298059
docker-python-eccodes-1  | 4  2022-10-13 13:00:00      35.7     139.75  165.826084    19.541956
docker-python-eccodes-1  | 5  2022-10-13 14:00:00      35.7     139.75   79.939624    19.261438
docker-python-eccodes-1  | 6  2022-10-13 15:00:00      35.7     139.75   52.655281    18.694604
docker-python-eccodes-1  | 7  2022-10-13 16:00:00      35.7     139.75    8.115316    18.399561
docker-python-eccodes-1  | 8  2022-10-13 17:00:00      35.7     139.75    0.062500    18.090021
docker-python-eccodes-1  | 9  2022-10-13 18:00:00      35.7     139.75    0.000000    17.945459
docker-python-eccodes-1  | 10 2022-10-13 19:00:00      35.7     139.75    0.000000    17.840509
docker-python-eccodes-1  | 11 2022-10-13 20:00:00      35.7     139.75    0.000000    17.723596
docker-python-eccodes-1  | 12 2022-10-13 21:00:00      35.7     139.75    0.000000    17.731744
docker-python-eccodes-1  | 13 2022-10-13 22:00:00      35.7     139.75    0.000000    17.806970
docker-python-eccodes-1  | 14 2022-10-13 23:00:00      35.7     139.75    0.000000    17.807825
docker-python-eccodes-1  | 15 2022-10-14 00:00:00      35.7     139.75    0.000000    17.732416
docker-python-eccodes-1  | 16 2022-10-14 01:00:00      35.7     139.75    0.000000    17.541833
docker-python-eccodes-1  | 17 2022-10-14 02:00:00      35.7     139.75    0.000000    17.342340
docker-python-eccodes-1  | 18 2022-10-14 03:00:00      35.7     139.75    0.000000    17.204065
docker-python-eccodes-1  | 19 2022-10-14 04:00:00      35.7     139.75    0.000000    17.130579
docker-python-eccodes-1  | 20 2022-10-14 05:00:00      35.7     139.75    0.687500    17.083429
docker-python-eccodes-1  | 21 2022-10-14 06:00:00      35.7     139.75   55.875000    16.925287
docker-python-eccodes-1  | 22 2022-10-14 07:00:00      35.7     139.75  205.463552    17.102747
docker-python-eccodes-1  | 23 2022-10-14 08:00:00      35.7     139.75  386.634411    17.931696
docker-python-eccodes-1 exited with code 0

2023.12.11 追記 validDateを使うのは日射量と降水量では不適

テンプレートが異なるせいか、期待する値とならない(rainがnullとなってしまっていた)。
代わりにvalidityDateとvalidityTimeを使う。
ソースコード
変更コミット

  # データの探索
  for grb in t_messages:
      values, lats, lons = grb.data(lat1=la1,lat2=la2,lon1=lo1,lon2=lo2)
      # 予報時刻(UTC)を日本時間に直す
-      jst =  grb.validDate + time_diff
+      jst =  datetime.strptime(str(grb.validityDate) + str(grb.validityTime).zfill(4), "%Y%m%d%H%M") + time_diff
      for i, lat in enumerate(lats):
          for j, x in enumerate(lat):
            la = util.round_up_to_5_digits(lats[i][j])
            lo = util.round_up_to_5_digits(lons[i][j])
            key = str(la) + "_" + str(lo)
            if key not in dataMap:
              dataMap[key] = {}
            dataMap[key][util.t2s(jst)] = util.round_up_to_5_digits(values[i][j])

実行結果。
1つ目(FD0000-0100)のデータはnullとなるが、それ以外は値が埋まることを確認した。

[
    {
        "lat_lon": "35.7_139.75",
        "values": [
            {
                "validDate": "2023-11-24 09:00:00",
                "analDate": "2023-11-24 09:00:00",
                "temperature": 17.0838,
                "radiation": null,
                "pressure": 100524.3591,
                "mslp": 100768.1763,
                "uwind": 5.155,
                "vwind": 5.5459,
                "rh": 64.6628,
                "rain": null,
                "lcloud": 0.0,
                "mcloud": 0.0,
                "hcloud": 0.0,
                "tcloud": 0.0
            },
            {
                "validDate": "2023-11-24 10:00:00",
                "analDate": "2023-11-24 09:00:00",
                "temperature": 19.1107,
                "radiation": 443.9706,
                "pressure": 100531.3416,
                "mslp": 100773.4863,
                "uwind": 5.2847,
                "vwind": 5.8537,
                "rh": 59.387,
                "rain": 0.0,
                "lcloud": 0.0,
                "mcloud": 0.0,
                "hcloud": 0.0,
                "tcloud": 0.0
            },
            ...

AWS Lambdaで動かしてみた

pygribはファイルから読み出すことを基本としているので、S3から読んだファイルをいったん一時ファイルに出力してから読み出す。
ソースコード
メモリ512Mbで気温だけ読み出すのに13秒程度かかった。
メモリ128Mbでは30秒以上でタイムアウトとなった。

Dockerfile
FROM public.ecr.aws/lambda/python:3.11

RUN yum install openssl openssl-devel wget tar gzip cmake3 gcc-gfortran gcc-c++ perl -y \
  && yum install -y https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/l/libaec-1.0.4-1.el7.x86_64.rpm \
  && yum install -y https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/l/libaec-devel-1.0.4-1.el7.x86_64.rpm \
  && yum install -y libaec libaec-devel \
  && wget https://confluence.ecmwf.int/download/attachments/45757960/eccodes-2.32.1-Source.tar.gz \
  && tar -xvf eccodes-2.32.1-Source.tar.gz \
  && rm eccodes-2.32.1-Source.tar.gz \
  && mkdir build && cd build && cmake3 ../eccodes-2.32.1-Source && make && make install && make clean \
  && yum remove openssl openssl-devel wget tar gzip cmake3 gcc-gfortran perl  gcc-c++ libaec libaec-devel -y \
  && rm -rf /var/cache/yum/* \
  && yum clean all

COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install -r requirements.txt
COPY lambda_function.py ${LAMBDA_TASK_ROOT}
CMD [ "lambda_function.handler" ]
requirements.txt
boto3[crt]
eccodes
pygrib
pandas
lambda-function.py
import sys
import os
import json
import io
import boto3
import pygrib
import pandas as pd
from datetime import timedelta, datetime
import math
import tempfile


def handler(event, context):
    lat = 35.6745
    lon = 139.7169
    region = os.environ.get("AWS_REGION")
    bucket_name = os.environ.get("S3_BUCKET_NAME")
    s3_client = boto3.client('s3')
    filepath = tempfile.NamedTemporaryFile().name
    s3_client.download_file(bucket_name, 'Z__C_RJTD_20221013000000_GSM_GPV_Rjp_Gll0p1deg_Lsurf_FD0000-0100_grib2.bin', filepath)
    data = execute(filepath, lat, lon)
    s3_client.put_object(Bucket=bucket_name, Key='test2.json', Body=json.dumps(data))
    os.remove(filepath)
    return 'Success'

# メッシュの刻み幅
LAT_STEP=0.1 / 2 # 緯度
LON_STEP=0.125 / 2 # 経度
def execute(filepath, lat, lon):
    gpv_file = pygrib.open(filepath)
    analDate = getBaseData(gpv_file, lat, lon,  LAT_STEP, LON_STEP)
    temperature = getParamData(gpv_file, "Temperature", lat, lon, LAT_STEP, LON_STEP)
    return toOutputJsonSingle(temperature, analDate)

# 日本時間に直す用
time_diff = timedelta(hours=9)

def getParamData(gpv_file, parameterName, lat, lon, LAT_STEP, LON_STEP):
  la1 = lat - LAT_STEP
  la2 = lat + LAT_STEP
  lo1 = lon - LON_STEP
  lo2 = lon + LON_STEP

  t_messages = gpv_file.select(parameterName=parameterName)

  dataMap = {}
  for grb in t_messages:
      values, lats, lons = grb.data(lat1=la1,lat2=la2,lon1=lo1,lon2=lo2)
      # 予報時刻(UTC)を日本時間に直す
      jst =  datetime.strptime(str(grb.validityDate) + str(grb.validityTime).zfill(4), "%Y%m%d%H%M") + time_diff
      for i, lat in enumerate(lats):
          for j, x in enumerate(lat):
            la = round_up_to_5_digits(lats[i][j])
            lo = round_up_to_5_digits(lons[i][j])
            key = str(la) + "_" + str(lo)
            if key not in dataMap:
              dataMap[key] = {}
            dataMap[key][t2s(jst)] = round_up_to_5_digits(values[i][j])

  return dataMap

# 代表として気温のデータから、解析基準時刻を取得する
def getBaseData(gpv_file, lat, lon, LAT_STEP, LON_STEP):
  t_messages = gpv_file.select(parameterName="Temperature")
  la1 = lat - LAT_STEP
  la2 = lat + LAT_STEP
  lo1 = lon - LON_STEP
  lo2 = lon + LON_STEP
    # データの探索
  for grb in t_messages:
      values, lats, lons = grb.data(lat1=la1,lat2=la2,lon1=lo1,lon2=lo2)
      analDate = t2s(grb.analDate + time_diff)
      return analDate


DATE_FORMAT="%Y-%m-%d %H:%M:%S"
# datetime型をstringに変換
def t2s(time):
    return time.strftime(DATE_FORMAT)
def s2t(time_str):
    return datetime.strptime(time_str, DATE_FORMAT)
def round_up_to_5_digits(number):
    return round(number, 4)


# ケルビンから℃変換用
F_C_DIFF=273.15

def toOutputJsonSingle(temperature, analDate):
    result_list = [{'lat_lon': key, 'values': toOutputSingle(key, temperature,analDate)} for key, value in temperature.items()]
    return result_list

def toOutputSingle( lat_lon, temperature, analDate):
    result = [{'validDate': key
              , 'analDate': analDate
              , 'temperature': round_up_to_5_digits(value - F_C_DIFF)
              } for key, value in temperature[lat_lon].items()]
    return result
python-docker-lambda-stack.ts
import { Aspects,Duration,RemovalPolicy,Stack,StackProps,Tag,Tags } from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { BlockPublicAccess,Bucket,BucketAccessControl } from 'aws-cdk-lib/aws-s3';
import { BucketDeployment, Source } from 'aws-cdk-lib/aws-s3-deployment';
import { Construct } from 'constructs';

interface PythonDockerLambdaS3StackProps extends StackProps {
  projectDirectory: string;
}

export class PythonDockerLambdaGribStack extends Stack {
  constructor(
    scope: Construct,
    id: string,
    props: PythonDockerLambdaS3StackProps,
  ) {
    super(scope, id, props);
    const bucket = new Bucket(this, 'Bucket', {
      bucketName: 'hibohiboo-python-lambda-project-grib',
      publicReadAccess: false,
      accessControl: BucketAccessControl.PRIVATE,
      blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
    });
    new BucketDeployment(this, 'DeployFiles', {
      destinationBucket: bucket,
      sources: [Source.asset(`${props.projectDirectory}/s3Data/`)],
    });

    const gribImageFunction = new lambda.DockerImageFunction(
      this,
      'AssetFunction',
      {
        functionName: 'docker-lambda-python-grib',
        code: lambda.DockerImageCode.fromImageAsset(
          `${props.projectDirectory}/grib/`,
        ),
        timeout: Duration.seconds(30),
        retryAttempts: 0,
        environment: { S3_BUCKET_NAME: bucket.bucketName },
        memorySize: 512, //  128だと30秒タイムアウト
      },
    );
    bucket.grantReadWrite(helloImageFunction);
    Tags.of(gribImageFunction).add('Service', 'lambda');

    Aspects.of(this).add(new Tag('Stack', id));
  }
}

参考

GRIB2の基礎知識:気象情報

気象庁サンプルデータ
メソ数値予報モデルGPV (MSM) - サンプル ... MSM
全球数値予報モデルGPV (GSM全球域・日本域) - サンプル(日本域)... GSM

気象庁データ - 京都大学生存研究所 - 教育機関向け
気象庁サンプルデータより量が豊富。

気象庁が出しているGRIBの読み方

GRIB2フォーマットをpygrib.fromstringとio.BytesIOでメモリ上で扱う

wgrib2をdockerで動かしている人の話

eccodes

【cmake】最新版CMakeをapt installする方法【Ubuntu】
ubuntuでcmakeをビルド

GPVデータで格子点毎の気象要素をAWS上で時系列分析する方法を考えた課程
pygribで気象庁のGRIB2ファイルを読む
公式:pygrbi

気象庁数値予報天気図のカラー版

pygribのインストール方法:Amazon Linux2

0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0