Qiita Engineer Festa20242024年7月17日まで開催中！

PostgreSQL: Python3 で csv をインポート

Last updated at 2024-07-09Posted at 2024-07-04

テーブルのコラムが、総て　csv のコラムにあるとします。

プログラム

csv_import.py

#! /usr/bin/python
#	csv_import.py
#
#						Jul/10/2024
# ------------------------------------------------------------------
import pandas as pd
import psycopg2
import sys
# ------------------------------------------------------------------
from prepare_sql import prepare_sql_proc
from prepare_sql import get_columns_order
# ------------------------------------------------------------------
csv_file = sys.argv[1]

dbname = "city"
user = "scott"
password = "tiger123"
host = "localhost"
port = "5432"

conn = psycopg2.connect(dbname=dbname, user=user, password=password, host=host, port=port)

df = pd.read_csv(csv_file)

table_name = 'cities'

cur = conn.cursor()

array_columns = get_columns_order(cur,table_name)
sql = prepare_sql_proc(table_name,array_columns)
print(sql)

try:
	for index, row in df.iterrows():
		data = ()
		for col in array_columns:
			data += (row[col],)
		print(data)
 
		cur.execute(sql, data)
	
	conn.commit()
	print("データのインポートが完了しました。")

except psycopg2.Error as e:
	print(f"データのインポート中にエラーが発生しました: {e}")

finally:
	cur.close()
	conn.close()
#
# ------------------------------------------------------------------

prepare_sql.py

#	prepare_sql.py
#
#						Jul/10/2024
# ------------------------------------------------------------------
def get_columns_order(cur,table_name):
	array_columns = []
	sql = """
	SELECT column_name
	FROM information_schema.columns
	WHERE table_name = %s
	ORDER BY ordinal_position;
	"""
	cur.execute(sql, (table_name,))
#	
	columns = cur.fetchall()
	for row in columns:
		array_columns.append(row[0])
#
	return array_columns
# ------------------------------------------------------------------
def prepare_sql_proc(table_name,array_columns):
	icount = 0
	portion_a = "("
	portion_b = "("
	for col in array_columns:
		if 0 < icount:
			portion_a += ", " 
			portion_b += ", " 
		portion_a += col
		portion_b += "%s"
		icount += 1
	portion_a += ")"
	portion_b += ")"
#
	sql = f"INSERT INTO {table_name} "
	sql += portion_a + " VALUES "
	sql += portion_b + ";"
#
	return sql
# ------------------------------------------------------------------

実行コマンド

./csv_import.py cities.csv

サンプルの CSV

cities.csv

id,name,population,date_mod
t1271,千葉,73561,2019-7-30
t1272,勝浦,24895,2019-8-10
t1273,市原,67294,2019-1-14
t1274,流山,73612,2019-5-9
t1275,八千代,63491,2019-8-4
t1276,我孫子,41827,2019-1-21
t1277,鴨川,12946,2019-7-23
t1278,銚子,79128,2019-11-26
t1279,市川,13572,2019-10-15

コラムの順が違っても問題ありません

cities2.csv

id,name,date_mod,population
t1271,千葉,2019-7-30,73561
t1272,勝浦,2019-8-10,24895
t1273,市原,2019-1-14,67294
t1274,流山,2019-5-9,73612
t1275,八千代,2019-8-4,63491
t1276,我孫子,2019-1-21,41827
t1277,鴨川,2019-7-23,12946
t1278,銚子,2019-11-26,79128
t1279,市川,2019-10-15,13572

テーブル定義

ity=# \d cities
                        Table "public.cities"
   Column   |         Type          | Collation | Nullable | Default 
------------+-----------------------+-----------+----------+---------
 id         | character varying(10) |           | not null | 
 name       | text                  |           |          | 
 population | integer               |           |          | 
 date_mod   | date                  |           |          | 
Indexes:
    "cities_pkey" PRIMARY KEY, btree (id)

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up