
Exporting 100,000 rows from an SQLite3 database to CSV with Python on a Raspberry Pi Zero

Last updated at 2023-11-07 / Posted at 2023-11-05

This article shows how to use Python on a Raspberry Pi Zero to export 100,000 weather sensor records stored in an SQLite3 database to CSV.

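The weather data CSV is downloaded to the local development PC from the weather data display board running on the Raspberry Pi Zero (pictured on the right).
※ Application development refers to a database on a NAS, so I download the data periodically to refresh its tables when they become stale.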

An overview of this system is available in the following GitHub repository.
GitHub(pipito-yukio) Home weather data monitoring system with Raspberry Pi

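The CSV downloaded from the Raspberry Pi Zero has the format shown below and contains 103,326 rows.
※ The system has been running on the Raspberry Pi Zero since 2021-07-30 and had recorded 113,234 records as of 2023-11-04.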

"did","measurement_time","temp_out","temp_in","humid","pressure"
1,"2022-01-01 00:02:08",-8.6,13.6,41.9,1001.3
1,"2022-01-01 00:11:52",-8.6,12.9,39.2,1001.4
1,"2022-01-01 00:21:35",-8.5,12.8,43.2,1001.4
...(rows omitted)...
1,"2023-10-30 23:41:12",7.8,17.4,53.9,1018.0
1,"2023-10-30 23:50:56",7.8,17.4,53.9,1018.1

Script execution environment

OS: Raspbian GNU/Linux 10 (buster)
Python virtual environment: py37_pigpio ※ Libraries such as I2C, SPI and pigpio are installed
Python version: 3.7.3

Raspberry Pi Zero WH (UD-RPZWH) basic specifications

CPU: Broadcom BCM2835, single-core ARM1176JZF-S (ARMv6) SoC
CPU clock: 1GHz
Main memory: 512MB

CPU and memory information is shown below. ※ About 188MB of memory is free.

pi@raspi-zero:~ $ lscpu
Architecture:        armv6l
Byte Order:          Little Endian
CPU(s):              1
On-line CPU(s) list: 0
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           1
Vendor ID:           ARM
Model:               7
Model name:          ARM1176
Stepping:            r0p7
CPU max MHz:         1000.0000
CPU min MHz:         700.0000
BogoMIPS:            797.66
Flags:               half thumb fastmult vfp edsp java tls
pi@raspi-zero:~ $
pi@raspi-zero:~ $ vmstat --unit M
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0    188     48    136    0    0     0     0   18    2  7  2 91  0  0

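1. CSV download feature

Both the Flask app and the batch app use the CSV generation class from the same module for the CSV download feature.

The execution times were as follows; the batch app is far faster.
(1) Flask app: about 3.5 minutes (210 seconds)
(2) Batch app: about 30 seconds

1-1. CSV download with the Flask app

[Execution time] Measured on the development PC: about 3.5 minutes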

$ date +"%Y-%m-%d %H:%M:%S"
2023-11-03 18:47:00

$ date +"%Y-%m-%d %H:%M:%S"
2023-11-03 18:50:23
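The elapsed time between the two `date` outputs above can be checked with a few lines of Python. This is only a small sketch; the helper name `elapsed_seconds` is hypothetical and not part of the original scripts.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two 'YYYY-mm-dd HH:MM:SS' timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Timestamps copied from the two `date` outputs above
flask_elapsed = elapsed_seconds("2023-11-03 18:47:00", "2023-11-03 18:50:23")
print(flask_elapsed)  # 203
```

The result, 203 seconds, is consistent with the rounded figure of about 3.5 minutes.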

The web app source code is available in the following GitHub repository.
GitHub(pipito-yukio) Home weather data monitoring system with Raspberry Pi flask_web

flask_web/src/
├── run.py
├── start.sh
└── weather_finder
    ├── __init__.py
    ├── config.py
    ├── db -> /home/pi/bin/pigpio/db  # ★ Symbolic link to the db package of the batch app
    ├── log
    │   ├── __init__.py
    │   ├── logconf_main.json
    │   └── logsetting.py
    ├── messages
    │   └── messages.conf
    ├── static
    │   ├── css
    │   │   ├── bootstrap-grid.min.css
    │   │   ├── bootstrap-reboot.min.css
    │   │   ├── bootstrap.min.css
    │   │   └── styles.min.css
    │   └── js
    │       ├── bootstrap.bundle.min.js
    │       └── bootstrap.min.js
    ├── templates
    │   └── findweather.html
    └── views
        ├── __init__.py
        └── app_main.py

1-2. CSV file output with the Python batch app

Download the CSV file and the log file to the development PC

$ scp pi@raspi-zero:~/datas/weather_esp8266*.csv .
weather_esp8266_1_20211201-20231031_20231103.csv       100% 4609KB   3.5MB/s   00:01    
$ scp pi@raspi-zero:~/logs/pigpio/application_*.log .
application_202311031900.log                           100%  808   269.8KB/s   00:00

[Execution time] Completed in about 30 seconds
※ The following is the content of the log file copied to the local PC

2023-11-03 19:00:45 INFO GetCSVFromWeather.py(22)[<module>] Namespace(date_from='2021-12-01', date_to='2023-10-31', device_name='esp8266_1', with_device_csv=False, without_header=False)
2023-11-03 19:00:45 DEBUG weatherdb.py(335)[_build_query] 
    SELECT did, datetime(measurement_time, 'unixepoch', 'localtime'), temp_out, temp_in, humid, pressure
    FROM t_weather WHERE did=? 
    AND (measurement_time >= strftime('%s', ?, '-9 hours') AND measurement_time < strftime('%s', ?, '-9 hours')) ORDER BY did, measurement_time
2023-11-03 19:00:46 INFO weatherdb.py(390)[find] Record count: 103473
2023-11-03 19:00:46 INFO weatherdb.py(395)[find] Return CSV iterator
2023-11-03 19:01:08 INFO GetCSVFromWeather.py(35)[<module>] Saved Weather CSV: /home/pi/datas/weather_esp8266_1_20211201-20231031_20231103.csv

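The batch source code is available in the following GitHub repository.
GitHub(pipito-yukio) Home weather data monitoring system with Raspberry Pi raspi_zero/bin

[Excerpt: only the sources related to batch processing]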

bin/
├── pigpio
│   ├── GetCSVFromWeather.py # Batch Python script
│   ├── conf
│   │   ├── ...unrelated configuration files omitted...
│   │   └── logconf_main_app.json
│   ├── db
│   │   ├── __init__.py
│   │   ├── sqlite3conv.py
│   │   ├── sqlite3db.py
│   │   └── weatherdb.py     # ★ Module containing the DAO class that outputs CSV ★
│   └── log
│       ├── __init__.py
│       └── logsetting.py
└── getcsv_from_weather.sh   # Batch shell script

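From here on, I present an implementation of the CSV output module (weatherdb.py) simplified for this post.

The key point of the Python implementation is to use a generator to fetch data from the database.

2. Implementation simplified for this article

This implementation is explained in detail in the following Qiita article, so please refer to it.
※ Topics covered in that article are omitted from this post.
(Qiita) What is the difference between the INTEGER and TEXT types that can store a timestamp column in an SQLite3 database?

2-1. CSV output module

2-1-1. Defining the SQLite3 database connection function and related functions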

import logging
import sqlite3
from datetime import date, datetime, timedelta
from typing import List, Optional, Tuple

"""
Weather database CRUD functions, Finder class
for python 3.7.x
"""

FMT_ISO8601_DATE: str = "%Y-%m-%d"

# Function that returns an SQLite3 database connection
def get_connection(db_path: str,
                   auto_commit: bool = False, read_only: bool = False,
                   logger: Optional[logging.Logger] = None) -> sqlite3.Connection:
    try:
        conn: sqlite3.Connection
        if read_only:
            db_uri: str = "file://{}?mode=ro".format(db_path)
            conn = sqlite3.connect(db_uri, uri=True)
        else:
            conn = sqlite3.connect(db_path)
            if auto_commit:
                conn.isolation_level = None
    except sqlite3.Error as e:
        if logger is not None:
            logger.error(e)
        raise e
    return conn


# Function that looks up the device ID from the device name
def find_device(conn: sqlite3.Connection, device_name: str,
                logger: Optional[logging.Logger] = None, log_level_debug: bool = False
                ) -> Optional[int]:
    rec: Optional[Tuple[int]]
    with conn:
        cur: sqlite3.Cursor = conn.execute(
            "SELECT id FROM t_device WHERE name = ?", (device_name,)
        )
        rec = cur.fetchone()
    if logger is not None and log_level_debug:
        logger.debug("{}: {}".format(device_name, rec))
    # fetchone() returns None when there is no (more) data
    if rec is not None:
        return rec[0]

    return None

2-1-1 (2) Defining a function that returns the next day of an ISO 8601 date string

※ This function computes the day after the search end date

def get_next_iso8601date(s_date: str) -> str:
    try:
        # Convert the ISO 8601 string to a datetime object
        dt_obj: datetime = datetime.strptime(s_date, FMT_ISO8601_DATE)
        # Next day = given date + 1 day
        dt_obj += timedelta(days=1)
        # Convert back to an ISO 8601 string
        return dt_obj.strftime(FMT_ISO8601_DATE)
    except ValueError as e:
        raise e
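As an aside, the same next-day calculation can be sketched with `date.fromisoformat()`, which is available from Python 3.7. The function name `next_iso_date` is hypothetical and is not part of the original module; it is shown only as a compact alternative under that assumption.

```python
from datetime import date, timedelta


def next_iso_date(s_date: str) -> str:
    """Hypothetical alternative to get_next_iso8601date()."""
    # date.fromisoformat() raises ValueError for malformed input
    return (date.fromisoformat(s_date) + timedelta(days=1)).isoformat()


print(next_iso_date("2023-10-31"))  # 2023-11-01
print(next_iso_date("2023-12-31"))  # 2024-01-01
print(next_iso_date("2024-02-28"))  # 2024-02-29 (leap year)
```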

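2-1-2. WeatherFinder class that outputs CSV from records

2-1-2 (1) Constructor, search query definitions, etc.

measurement time >= search start date AND measurement time < day after the search end date
※ This way, data up to 23:59:59 on the search end date is included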

class WeatherFinder:
    # Private constants
    _SELECT_WEATHER_COUNT: str = """
SELECT
   COUNT(*)
FROM
   t_weather
WHERE
   did = ?
   AND (
      measurement_time >= strftime('%s', ? ,'-9 hours')
      AND
      measurement_time < strftime('%s', ? ,'-9 hours')
   )
"""
    _SELECT_WEATHER: str = """
SELECT
   did, datetime(measurement_time, 'unixepoch', 'localtime'), temp_out, temp_in, humid, pressure
FROM
   t_weather
WHERE
   did = ?
   AND (
      measurement_time >= strftime('%s', ? ,'-9 hours')
      AND
      measurement_time < strftime('%s', ? ,'-9 hours')
   )
ORDER BY measurement_time;
    """
    # if record count > _GENERATOR_WEATHER_THRESHOLD then CSV generator else CSV list
    _GENERATOR_WEATHER_THRESHOLD: int = 10000
    _GENERATOR_WEATHER_BATCH_SIZE: int = 1000
    # CSV constants
    _FMT_WEATHER_CSV_LINE: str = '{},"{}",{},{},{},{}'
    # Public const
    # CSV t_weather Header
    CSV_WEATHER_HEADER: str = '"did","measurement_time","temp_out","temp_in","humid","pressure"\n'

    def __init__(self, db_path: str, logger: Optional[logging.Logger] = None):
        self.logger = logger
        if logger is not None and (logger.getEffectiveLevel() <= logging.DEBUG):
            self.isLogLevelDebug = True
        else:
            self.isLogLevelDebug = False
        self.db_path: str = db_path
        self.conn: Optional[sqlite3.Connection] = None
        self.cursor: Optional[sqlite3.Cursor] = None
        self.csv_iter = None
        self._csv_name: Optional[str] = None

    def close(self):
        """ Close cursor and connection close """
        if self.cursor is not None:
            self.cursor.close()
        if self.conn is not None:
            self.conn.close()

    @property
    def csv_filename(self) -> str:
        """ CSV filename: 'weather_[device_name]_[from_date]-[to_date]_[today].csv' """
        return "weather_{}.csv".format(self._csv_name)
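The half-open date range used by the two queries above can be tried out on an in-memory database. The sketch below assumes that `measurement_time` is an INTEGER column holding UNIX epoch seconds and that the dates are given in JST (UTC+9), which is what the `'-9 hours'` modifier suggests; the helper `epoch_jst` and the two-column table are simplifications for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))


def epoch_jst(y, m, d, hh, mm, ss) -> int:
    """UNIX epoch seconds for a wall-clock time in JST (hypothetical helper)."""
    return int(datetime(y, m, d, hh, mm, ss, tzinfo=JST).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_weather (did INTEGER, measurement_time INTEGER)")
rows = [
    (1, epoch_jst(2023, 10, 30, 23, 59, 59)),  # before the range: excluded
    (1, epoch_jst(2023, 10, 31, 0, 0, 0)),     # first second of the start date
    (1, epoch_jst(2023, 10, 31, 23, 59, 59)),  # last second of the end date
    (1, epoch_jst(2023, 11, 1, 0, 0, 0)),      # next day: excluded
]
conn.executemany("INSERT INTO t_weather VALUES (?, ?)", rows)

SQL = """
SELECT COUNT(*) FROM t_weather
WHERE did = ?
  AND measurement_time >= strftime('%s', ?, '-9 hours')
  AND measurement_time <  strftime('%s', ?, '-9 hours')
"""
# date_from = date_to = 2023-10-31, so the upper bound is the next day
count = conn.execute(SQL, (1, "2023-10-31", "2023-11-01")).fetchone()[0]
conn.close()
print(count)  # 2
```

Only the two rows that fall on 2023-10-31 in JST are counted, so a single-day search needs no explicit 23:59:59 upper bound.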

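2-1-2 (2) Generator implementation

Here are the sites I referred to for the CSV output and the generator implementation.
The source code of the Peewee library in particular was very helpful.
As I recall, it was an ORM library that was somewhat popular four or five years ago.
※ Implementations ① and ② are shown below; the code is essentially identical.

(① For practical use of generators I referred to https://stackoverflow.com/questions/102535/what-can-you-use-generator-functions-for [What can you use generator functions for? : Real World Example])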

def ResultGenerator(cursor, batchsize=1000):
    while True:
        results = cursor.fetchmany(batchsize)
        if not results:
            break
        for result in results:
            yield result

(② The code my generator implementation is based on is https://github.com/coleifer/peewee/blob/master/playhouse/postgres_ext.py [peewee/playhouse/postgres_ext.py: class FetchManyCursor])

    def row_gen(self):
        while True:
            rows = self.cursor.fetchmany(self.array_size)
            if not rows:
                return
            for row in rows:
                yield row

(Peewee library documentation: https://docs.peewee-orm.com/en/latest/peewee/playhouse.html [Playhouse, extensions to Peewee])

(For the CSV output I referred to the source code scappy/scrapy/iterator.py)

When the record count exceeds 10,000, a generator with a batch size of 1,000 records is created.

    def _csv_iterator(self):
        """
        Generate Csv generator
          line: did, "YYYY-mm-DD HH:MM:SS(measurement_time)",temp_out,temp_in,humid,pressure
          (*) temp_out,temp_in,humid, pressure: if field value is None then empty string
        :return: Record generator
        """
        while True:
            batch_records: List[tuple] = self.cursor.fetchmany(
                self._GENERATOR_WEATHER_BATCH_SIZE)
            # Stop when the batch contains no more records
            if not batch_records:
                break

            for rec in batch_records:
                yield self._FMT_WEATHER_CSV_LINE.format(rec[0],
                                                        rec[1],
                                                        rec[2] if rec[2] is not None else '',
                                                        rec[3] if rec[3] is not None else '',
                                                        rec[4] if rec[4] is not None else '',
                                                        rec[5] if rec[5] is not None else '')
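The batch-fetching pattern can also be tried in isolation. The following is a minimal, self-contained sketch and not the article's class: the table `t_demo`, its two columns and the function `csv_lines` are hypothetical, and an in-memory database stands in for weather.db.

```python
import io
import sqlite3
from typing import Iterator


def csv_lines(cursor: sqlite3.Cursor, batch_size: int = 1000) -> Iterator[str]:
    """Yield one CSV line per row, fetching at most batch_size rows at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for did, temp_out in batch:
            # NULL columns become empty fields, as in _csv_iterator()
            yield "{},{}".format(did, temp_out if temp_out is not None else "")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_demo (did INTEGER, temp_out REAL)")  # hypothetical table
conn.executemany("INSERT INTO t_demo VALUES (?, ?)",
                 [(1, -8.6), (1, None), (1, 7.8)])

cur = conn.execute("SELECT did, temp_out FROM t_demo ORDER BY rowid")
buf = io.StringIO()
for line in csv_lines(cur, batch_size=2):  # small batch to exercise the loop twice
    buf.write(line + "\n")
conn.close()
print(buf.getvalue(), end="")
```

Note that the generator must be fully consumed before the connection is closed, because rows are only fetched while the loop is running.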

(3-3) Getting the list of all records (CSV format)

When the record count is 10,000 or fewer, the CSV list of all records is built in one go.

    def _csv_list(self) -> List[str]:
        """
        Get CSV list
          line: did, "YYYY-mm-DD HH:MM:SS(measurement_time)",temp_out,temp_in,humid,pressure
          (*) temp_out,temp_in,humid, pressure: if field value is None then empty string
        :return: Record list, if no record then blank list
        """
        return [self._FMT_WEATHER_CSV_LINE.format(rec[0],
                                                  rec[1],
                                                  rec[2] if rec[2] is not None else '',
                                                  rec[3] if rec[3] is not None else '',
                                                  rec[4] if rec[4] is not None else '',
                                                  rec[5] if rec[5] is not None else '') for rec in self.cursor]

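2-1-2 (3) Main CSV output processing

The key point is to get the record count before fetching the records that match the search criteria.
With the count obtained in advance, the method can switch between building a list and creating a generator.

Why the return type of find() is omitted
On newer Python versions the return type could be written as Generator (or Iterator or Iterable) [str] | List[str], but
Python 3.7 on the Raspberry Pi Zero stopped the script with one of the following errors:
TypeError: Too few parameters for typing.Generator; actual 1, expected 3
or
TypeError: unsupported operand type(s) for |: '_GenericAlias' and '_GenericAlias'
so the return type of the function was left unannotated. ※ The same applies to the caller of this function.
(The first error occurs because typing.Generator takes three type parameters; the second because the | operator between types was only added in Python 3.10.)
※ The script runs without errors on Python 3.10.12 on the development PC.
I did not resolve this here, but I intend to investigate further and fix it.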

    def find(self, device_name: str, date_from: str, date_to: str):
        # Build the file name suffix
        date_part: date = date.today()
        name_suffix: str = "{}_{}_{}".format(
            device_name, date_from.replace("-", ""), date_to.replace("-", "")
        )
        self._csv_name = name_suffix + "_" + date_part.strftime("%Y%m%d")

        # Create the connection object
        if self.conn is None:
            self.conn = get_connection(self.db_path, read_only=True, logger=self.logger)
        # Look up the device ID from the device name
        did: Optional[int] = find_device(self.conn, device_name, logger=self.logger)
        # Return an empty list when no device name matches
        if did is None:
            return []

        # Day after the search end date
        exclude_to_date: str = get_next_iso8601date(date_to)
        # start date <= measurement time < day after end date (data from start date through end date)
        params: Tuple = (did, date_from, exclude_to_date)
        try:
            # ★★ Get the record count for the search criteria ★★
            self.cursor = self.conn.cursor()
            self.cursor.execute(self._SELECT_WEATHER_COUNT, params)
            row_count: int = self.cursor.fetchone()[0]
            if self.logger is not None:
                self.logger.info("Record count: {}".format(row_count))
            # Return an empty list when no record matches
            if row_count == 0:
                return []

            # Execute the query that actually fetches the records
            self.cursor.execute(self._SELECT_WEATHER, params)
            if row_count > self._GENERATOR_WEATHER_THRESHOLD:
                if self.logger is not None:
                    self.logger.info("Return CSV Generator")
                # ★ More than 10000 records: return a generator ★
                return self._csv_iterator()
            else:
                if self.logger is not None:
                    self.logger.info("Return CSV list")
                # ★ 10000 records or fewer: return the whole CSV list at once ★
                return self._csv_list()
        except sqlite3.Error as err:
            if self.logger is not None:
                self.logger.warning("criteria: {}\nerror:{}".format(params, err))
            raise err
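Regarding the return type left unannotated above, one possible annotation that Python 3.7 accepts is `Union[Iterator[str], List[str]]` (or simply `Iterable[str]` on the caller side); `typing.Generator` would need all three parameters, for example `Generator[str, None, None]`. The sketch below uses hypothetical names (`CsvLines`, `find_lines`, `write_all`) and is not the author's code.

```python
from typing import Iterable, Iterator, List, Union

# Hypothetical alias; Union[...] is valid on Python 3.7, unlike the `|` operator
CsvLines = Union[Iterator[str], List[str]]


def find_lines(values: List[int], threshold: int = 3) -> CsvLines:
    """Return a generator above the threshold, otherwise a list."""
    if len(values) > threshold:
        return (str(v) for v in values)
    return [str(v) for v in values]


def write_all(lines: Iterable[str]) -> int:
    """Callers only need Iterable[str]; returns the number of lines seen."""
    return sum(1 for _ in lines)


print(type(find_lines([1, 2])).__name__)        # list
print(type(find_lines([1, 2, 3, 4])).__name__)  # generator
print(write_all(find_lines([1, 2, 3, 4])))      # 4
```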

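2-2. CSV output Python script

2-2-1. Imports and constant definitions

The database file path and the CSV output path are taken from environment variables.
※1 The Raspberry Pi Zero runs a headless OS, so the ~/Downloads directory does not exist.
※2 On the test Raspberry Pi Zero they are not set in .bashrc; they are exported before execution.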

# bin/pigpio/GetCSVFromWeather.py
import argparse
import logging
import os
from typing import Optional
from db.weatherdb import WeatherFinder

# SQLite3 database file path
PATH_WEATHER_DB: str = os.environ.get("PATH_WEATHER_DB", "~/db/weather.db")
# CSV output path
OUTPUT_CSV_PATH = os.environ.get("OUTPUT_CSV_PATH", "~/Downloads/csv/")
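Because the default output directory may not exist on a headless machine (note ※1), one option is to create it on demand. This is a sketch under that assumption only; the helper `resolve_output_dir` is hypothetical and the demo uses a temporary directory instead of a real home directory.

```python
import os
import tempfile


def resolve_output_dir(env_name: str = "OUTPUT_CSV_PATH",
                       default: str = "~/Downloads/csv/") -> str:
    """Expand '~' in the configured path and create the directory if missing."""
    path = os.path.expanduser(os.environ.get(env_name, default))
    os.makedirs(path, exist_ok=True)
    return path


# Demo: point the environment variable at a not-yet-existing directory
with tempfile.TemporaryDirectory() as tmp:
    os.environ["OUTPUT_CSV_PATH"] = os.path.join(tmp, "datas")
    out_dir = resolve_output_dir()
    created = os.path.isdir(out_dir)
    print(created)  # True
```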

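2-2-2. Getting the input parameters for the search criteria

　(1) Device name --device-name ※required
　(2) Search start date --date-from, ISO 8601 format ※required
　(3) Search end date (inclusive) --date-to, ISO 8601 format ※required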

if __name__ == '__main__':
    # ...logging setup omitted...
    # Set up the input parameters
    parser = argparse.ArgumentParser()
    parser.add_argument("--device-name", type=str, required=True,
                        help="Device name with t_device name.")
    parser.add_argument("--date-from", type=str, required=True,
                        help="Date from with t_weather.measurement_time.")
    parser.add_argument("--date-to", type=str, required=True,
                        help="Date to with t_weather.measurement_time.")
    args: argparse.Namespace = parser.parse_args()
    app_logger.info(args)
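The parser above accepts any string for the two dates, so a malformed date is only detected later. As a hedged sketch, argparse's `type=` hook can reject such input up front; the validator name `iso_date` is hypothetical and not part of the original script.

```python
import argparse
from datetime import datetime


def iso_date(value: str) -> str:
    """argparse type function: accept only YYYY-mm-dd strings."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        raise argparse.ArgumentTypeError(
            "not an ISO 8601 date (YYYY-mm-dd): {!r}".format(value))
    return value


parser = argparse.ArgumentParser()
parser.add_argument("--date-from", type=iso_date, required=True)
parser.add_argument("--date-to", type=iso_date, required=True)

# Parse an explicit argument list so the sketch runs without a command line
args = parser.parse_args(["--date-from", "2021-12-01", "--date-to", "2023-10-31"])
print(args.date_from, args.date_to)  # 2021-12-01 2023-10-31
```

With an invalid value such as `2023-13-01`, argparse prints a usage error and exits before any database access happens.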

2-2-3. Creating a WeatherFinder object and saving the file

The key point is to write the CSV to the file one line at a time.

    weather_finder: Optional[WeatherFinder] = None
    db_path: str = os.path.expanduser(PATH_WEATHER_DB)
    try:
        weather_finder = WeatherFinder(db_path, logger=app_logger)
        app_logger.info(weather_finder)
        # from t_weather to csv
        csv_iterable = weather_finder.find(
            args.device_name, date_from=args.date_from, date_to=args.date_to
        )
        app_logger.info(f"type(csv_iterable): {type(csv_iterable)}")
        # filename: build "" + "device name" + "date_from" + "date_to" + "date now" + ".csv"
        csv_file: str = os.path.join(
            os.path.expanduser(OUTPUT_CSV_PATH), weather_finder.csv_filename
        )
        # Write the CSV file
        with open(csv_file, 'w', newline='') as fp:
            # Write the CSV header
            fp.write(WeatherFinder.CSV_WEATHER_HEADER)
            # Write the data rows
            if csv_iterable is not None:
                for line in csv_iterable:
                    fp.write(line + "\n")
        app_logger.info("Saved Weather CSV: {}".format(csv_file))
    except Exception as e:
        app_logger.warning("WeatherFinder error: {}".format(e))
    finally:
        if weather_finder is not None:
            weather_finder.close()

2-3. CSV output shell script

Since the script runs on the Raspberry Pi (Linux), a shell script (bash) is needed.

#!/bin/bash

readonly SCRIPT_NAME=${0##*/}

print_help()
{
   cat << END
Usage: $SCRIPT_NAME OPTIONS
Execute GetCSVFromWeather.py OPTIONS

--device-name: Required 'ESP module device name'
--date-from: Required SQL Criteria Start date in t_weather.
--date-to: Required SQL Criteria End date in t_weather.
--help	display this help and exit

Example:
[short options]
  $SCRIPT_NAME -d esp8266_1 -f 2021-08-01 -t 2021-09-30
[long options]
  $SCRIPT_NAME --device-name esp8266_1 --date-from 2021-08-01 --date-to 2021-09-30
END
}

print_error()
{
   cat << END 1>&2
$SCRIPT_NAME: $1
Try --help option
END
}

params=$(getopt -n "$SCRIPT_NAME" \
       -o d:f:t:\
       -l device-name: -l date-from: -l date-to: -l help \
       -- "$@")

# Check command status: $?
if [[ $? -ne 0 ]]; then
  echo 'Try --help option for more information' 1>&2
  exit 1
fi

eval set -- "$params"

device_name=
date_from=
date_to=

# Parse options
# Positional parameter count: $#
while [[ $# -gt 0 ]]
do
  case "$1" in
    -d | --device-name)
      device_name=$2
      shift 2
      ;;
    -f | --date-from)
      date_from=$2
      shift 2
      ;;
    -t | --date-to)
      date_to=$2
      shift 2
      ;;
    --help)
      print_help
      exit 0
      ;;
    --)
      shift
      break
      ;;
    *)
      echo "Internal Error"
      exit 1
      ;;
  esac
done

echo "$SCRIPT_NAME --device-name $device_name --date-from $date_from --date-to $date_to"

# Check required option: --device-name
if [ -z "$device_name" ]; then
  print_error "Required --device-name xxxxx"
  exit 1
fi
if [ -z "$date_from" ]; then
  print_error "Required --date-from iso-8601 date"
  exit 1
fi
if [ -z "$date_to" ]; then
  print_error "Required --date-to iso-8601 date"
  exit 1
fi

option_device_name="--device-name $device_name"
option_date_from="--date-from $date_from"
option_date_to="--date-to $date_to"

echo "pigpio/GetCSVFromWeather.py  $option_device_name $option_date_from $option_date_to"

. ~/py_venv/py37_pigpio/bin/activate

python pigpio/GetCSVFromWeather.py $option_device_name $option_date_from $option_date_to

deactivate

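3. Execution environment for the article's scripts

3-1. Execution environment

A development Raspberry Pi is used, so the execution environment is the same as on the production Raspberry Pi.

Log in to the development Raspberry Pi Zero and check the install location and the Python virtual environment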

pi@raspi-zerodev:~ $ ls -l --time-style long-iso
合計 24
drwxr-xr-x 3 pi pi 4096 2023-11-04 10:50 bin
drwxr-xr-x 2 pi pi 4096 2023-11-04 11:02 datas
drwxr-xr-x 2 pi pi 4096 2021-12-14 13:32 db
drwxr-xr-x 3 pi pi 4096 2021-11-02 20:14 logs
drwxr-xr-x 3 pi pi 4096 2023-11-04 10:30 py_venv
drwxr-xr-x 2 pi pi 4096 2023-11-04 10:29 work
pi@raspi-zerodev:~ $
pi@raspi-zerodev:~ $ ls -l py_venv/ --time-style long-iso
合計 4
drwxr-xr-x 6 pi pi 4096 2021-12-14 13:30 py37_pigpio

Copy the sources from the development PC to the development Raspberry Pi

(py_sqlite3) $ scp GetCSVFromWeather.py pi@raspi-zerodev:~/bin/pigpio
GetCSVFromWeather.py                       100% 2389   723.2KB/s   00:00    
(py_sqlite3) $ scp -r db/ pi@raspi-zerodev:~/bin/pigpio
weatherdb.py                               100% 6942     1.4MB/s   00:00    
__init__.py                                100%    0     0.0KB/s   00:00    
(py_sqlite3) $ scp bin/getcsv_from_weather.sh  pi@raspi-zerodev:~/bin
getcsv_from_weather.sh                     100% 2090   639.6KB/s   00:00

Get the latest weather database from the production machine

pi@raspi-zerodev:~/db $ scp pi@raspi-zero:~/db/weather.db .
pi@raspi-zero's password: 
weather.db            100% 6908KB   3.3MB/s   00:02

Check the sources on the development Raspberry Pi
※ The datas directory is where the CSV files are stored

pi@raspi-zerodev:~ $ tree -f bin db datas
bin
├── bin/getcsv_from_weather.sh
└── bin/pigpio
    ├── bin/pigpio/GetCSVFromWeather.py
    └── bin/pigpio/db
        ├── bin/pigpio/db/__init__.py
        └── bin/pigpio/db/weatherdb.py
db
├── db/weather.db
└── db/weather_db.sql
datas

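3-2. Running the script

The run completed in about 25 seconds.
※1 The search criteria are the same as on the production machine.
※2 The test machine does not have the two application services installed, which seems to be why it finished slightly faster.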

pi@raspi-zerodev:~/bin $ export OUTPUT_CSV_PATH=~/datas/
pi@raspi-zerodev:~/bin $ ./getcsv_from_weather.sh --device-name esp8266_1 \
> --date-from '2021-12-01' --date-to '2023-10-31'
getcsv_from_weather.sh --device-name esp8266_1 --date-from 2021-12-01 --date-to 2023-10-31
pigpio/GetCSVFromWeather.py  --device-name esp8266_1 --date-from 2021-12-01 --date-to 2023-10-31
2023-11-05 09:34:13,336 INFO PATH_WEATHER_DB: /home/pi/db/weather.db
2023-11-05 09:34:13,340 INFO OUTPUT_CSV_PATH: /home/pi/datas/
2023-11-05 09:34:13,378 INFO Namespace(date_from='2021-12-01', date_to='2023-10-31', device_name='esp8266_1')
2023-11-05 09:34:13,383 INFO <db.weatherdb.WeatherFinder object at 0xb6693830>
2023-11-05 09:34:13,663 INFO Record count: 103473
2023-11-05 09:34:13,669 INFO Return CSV Generator
2023-11-05 09:34:13,673 INFO type(csv_iterable): <class 'generator'>
2023-11-05 09:34:37,353 INFO Saved Weather CSV: /home/pi/datas/weather_esp8266_1_20211201_20231031_20231105.csv

For reference, here is the result on my development PC (Dell T7500, a machine more than 10 years old).

Development PC execution environment

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
#...(rest omitted)...

$ LANG=C lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         40 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU           X5690  @ 3.47GHz
    CPU family:          6
    Model:               44
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           2
#...(rest omitted)...

The run completed in about 0.5 seconds.

(py_sqlite3) $ python -V
Python 3.10.12
(py_sqlite3) yukio@Dell-T7500:~/project/Qiita/download_csv_with_raspi-zero$ python GetCSVFromWeather.py --device-name esp8266_1 --date-from '2021-12-01' --date-to '2023-10-31'
2023-11-05 10:49:19,897 INFO PATH_WEATHER_DB: /home/yukio/db/weather.db
2023-11-05 10:49:19,898 INFO OUTPUT_CSV_PATH: ~/Downloads/csv/
2023-11-05 10:49:19,903 INFO Namespace(device_name='esp8266_1', date_from='2021-12-01', date_to='2023-10-31')
2023-11-05 10:49:19,903 INFO <db.weatherdb.WeatherFinder object at 0x7fb77736bdf0>
2023-11-05 10:49:19,905 INFO params: (1, '2021-12-01', '2023-11-01')
2023-11-05 10:49:19,912 INFO Record count: 103473
2023-11-05 10:49:19,913 INFO Return CSV Generator
2023-11-05 10:49:19,913 INFO type(csv_iterable): <class 'generator'>
2023-11-05 10:49:20,444 INFO Saved Weather CSV: /home/yukio/Downloads/csv/weather_esp8266_1_20211201_20231031_20231105.csv

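4. Conclusion

In this article I introduced a generator implementation that is effective for running on a computer with scarce resources.

Because the Raspberry Pi Zero has very limited CPU and memory, a memory-conscious implementation is important.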
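To make the memory argument concrete, the standard `tracemalloc` module can compare the peak memory of building a full list with that of consuming a generator. This is an illustrative sketch with synthetic lines rather than a measurement of the article's scripts; `make_lines` and `peak_bytes` are hypothetical helpers.

```python
import tracemalloc


def make_lines(n):
    """Generator that yields n CSV-like lines one at a time."""
    for i in range(n):
        yield '1,"2023-10-30 23:50:56",{},17.4,53.9,1018.1'.format(i)


def peak_bytes(consume) -> int:
    """Run consume() and return the peak traced memory in bytes."""
    tracemalloc.start()
    consume()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


N = 100_000
# Materialize every line in a list before counting
peak_list = peak_bytes(lambda: len(list(make_lines(N))))
# Consume the generator one line at a time
peak_gen = peak_bytes(lambda: sum(1 for _ in make_lines(N)))
print("list peak:", peak_list, "bytes / generator peak:", peak_gen, "bytes")
print(peak_list > peak_gen)
```

The absolute numbers depend on the Python version and platform, so only the relative difference is meaningful.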

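Much of what you find online or in reference books about generators consists of code examples or explanations with little practical value, which makes generators a stumbling block for Python beginners. I felt this strongly myself after I started using Python.

Today, the source code of well-known libraries is easy to obtain from GitHub and similar sites. Why not use that source code as a reference to improve your own coding skills?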

The batch source code simplified for this article is published in the following GitHub repository.

GitHub(pipito-yukio) qiita-posts/python/download_csv_with_raspi-zero

The SQLite3 database file used for this article is published in the following GitHub repository.
GitHub(pipito-yukio) matplotlib_knowhow/pandas-read_sql/db/sqlite3/weather.db

Observation point: Nakanoshima, Toyohira-ku, Sapporo, Hokkaido, Japan (43.03990246713225, 141.3598626293165)
Record count: 113,353

$ cat sql/MinMaxRec.sql | sqlite3 weather.db 
1|2021-07-30 01:26:51|-11.1|29.9|58.8|999.0
1|2023-11-05 11:47:09|12.3|17.6|45.5|1026.5
$ echo "SELECT COUNT(*) FROM t_weather WHERE did=1;" | sqlite3 weather.db 
113353
