はじめに
前回の記事の続きです。
https://qiita.com/mizuk1013/items/fb976c1c3957d5679feb
今回は前回利用のデータと一緒に配布されていた在庫や販売内容の情報もテーブルとして追加して、結合や段階的なデータ探索が可能かを検証してみました。
データの追加
PostgreSQLのGUI管理ツールであるpgAdminはインポートする際に事前にテーブルを用意する必要があります。
この時のCREATE TABLE文もデータを数行読み込ませることで生成が可能です。
データ探索
下記のようなリクエストをしました。
ありがとう。これはECサイトを意識したサンプルデータです。
データは偏りがあると思いますが、複数の注文をしている顧客を洗い出し、3か月ごとの注文頻度が高い順に表にしてください。
発行されたSQLは上から順に次のような内容でした。
{
`sql`: `
WITH customer_orders AS (
SELECT
o.customerid,
c.lastname,
c.firstname,
o.orderdate,
o.orderno,
EXTRACT(YEAR FROM o.orderdate) as order_year,
EXTRACT(QUARTER FROM o.orderdate) as order_quarter
FROM \"order\" o
JOIN cust c ON o.customerid = c.customerid
),
customer_order_counts AS (
SELECT
customerid,
COUNT(DISTINCT orderno) as total_orders
FROM customer_orders
GROUP BY customerid
HAVING COUNT(DISTINCT orderno) > 1
),
quarterly_analysis AS (
SELECT
co.customerid,
co.lastname,
co.firstname,
co.order_year,
co.order_quarter,
COUNT(DISTINCT co.orderno) as orders_in_quarter,
CONCAT(co.order_year, '-Q', co.order_quarter) as year_quarter
FROM customer_orders co
JOIN customer_order_counts coc ON co.customerid = coc.customerid
GROUP BY co.customerid, co.lastname, co.firstname, co.order_year, co.order_quarter
ORDER BY co.customerid, co.order_year, co.order_quarter
)
SELECT
customerid,
lastname,
firstname,
COUNT(DISTINCT year_quarter) as active_quarters,
SUM(orders_in_quarter) as total_orders,
ROUND(AVG(orders_in_quarter), 2) as avg_orders_per_quarter,
STRING_AGG(CONCAT(year_quarter, ':', orders_in_quarter), ', ' ORDER BY year_quarter) as quarterly_breakdown
FROM quarterly_analysis
GROUP BY customerid, lastname, firstname
ORDER BY avg_orders_per_quarter DESC, total_orders DESC;
`
}
このSQLは結果が長すぎたようでClaude Desktopにエラーメッセージが表示されました。
{
`sql`: `
WITH customer_orders AS (
SELECT
o.customerid,
c.lastname,
c.firstname,
o.orderdate,
o.orderno,
EXTRACT(YEAR FROM o.orderdate) as order_year,
EXTRACT(QUARTER FROM o.orderdate) as order_quarter
FROM \"order\" o
JOIN cust c ON o.customerid = c.customerid
),
customer_order_counts AS (
SELECT
customerid,
COUNT(DISTINCT orderno) as total_orders
FROM customer_orders
GROUP BY customerid
HAVING COUNT(DISTINCT orderno) > 1
),
quarterly_analysis AS (
SELECT
co.customerid,
co.lastname,
co.firstname,
co.order_year,
co.order_quarter,
COUNT(DISTINCT co.orderno) as orders_in_quarter,
CONCAT(co.order_year, '-Q', co.order_quarter) as year_quarter
FROM customer_orders co
JOIN customer_order_counts coc ON co.customerid = coc.customerid
GROUP BY co.customerid, co.lastname, co.firstname, co.order_year, co.order_quarter
)
SELECT
customerid,
lastname,
firstname,
COUNT(DISTINCT year_quarter) as active_quarters,
SUM(orders_in_quarter) as total_orders,
ROUND(AVG(orders_in_quarter), 2) as avg_orders_per_quarter
FROM quarterly_analysis
GROUP BY customerid, lastname, firstname
ORDER BY avg_orders_per_quarter DESC, total_orders DESC
LIMIT 20;
`
}
次のSQLは出力に制限を掛けたものが自動的に生成されました。
{
`sql`: `
WITH customer_orders AS (
SELECT
o.customerid,
c.lastname,
c.firstname,
o.orderdate,
o.orderno,
EXTRACT(YEAR FROM o.orderdate) as order_year,
EXTRACT(QUARTER FROM o.orderdate) as order_quarter
FROM \"order\" o
JOIN cust c ON o.customerid = c.customerid
),
customer_order_counts AS (
SELECT
customerid,
COUNT(DISTINCT orderno) as total_orders
FROM customer_orders
GROUP BY customerid
HAVING COUNT(DISTINCT orderno) > 1
),
quarterly_analysis AS (
SELECT
co.customerid,
co.lastname,
co.firstname,
co.order_year,
co.order_quarter,
COUNT(DISTINCT co.orderno) as orders_in_quarter,
CONCAT(co.order_year, '-Q', co.order_quarter) as year_quarter
FROM customer_orders co
JOIN customer_order_counts coc ON co.customerid = coc.customerid
GROUP BY co.customerid, co.lastname, co.firstname, co.order_year, co.order_quarter
)
SELECT
customerid,
lastname,
firstname,
COUNT(DISTINCT year_quarter) as active_quarters,
SUM(orders_in_quarter) as total_orders,
ROUND(AVG(orders_in_quarter), 2) as avg_orders_per_quarter
FROM quarterly_analysis
GROUP BY customerid, lastname, firstname
HAVING COUNT(DISTINCT year_quarter) >= 2
ORDER BY avg_orders_per_quarter DESC, total_orders DESC;
`
}
こちらも件数が多くClaude Desktopではエラーメッセージが出ていました。
{
`sql`: `
WITH customer_orders AS (
SELECT
o.customerid,
c.lastname,
c.firstname,
o.orderdate,
o.orderno,
EXTRACT(YEAR FROM o.orderdate) as order_year,
EXTRACT(QUARTER FROM o.orderdate) as order_quarter
FROM \"order\" o
JOIN cust c ON o.customerid = c.customerid
),
customer_order_counts AS (
SELECT
customerid,
COUNT(DISTINCT orderno) as total_orders
FROM customer_orders
GROUP BY customerid
HAVING COUNT(DISTINCT orderno) > 1
),
quarterly_analysis AS (
SELECT
co.customerid,
co.lastname,
co.firstname,
co.order_year,
co.order_quarter,
COUNT(DISTINCT co.orderno) as orders_in_quarter,
CONCAT(co.order_year, '-Q', co.order_quarter) as year_quarter
FROM customer_orders co
JOIN customer_order_counts coc ON co.customerid = coc.customerid
GROUP BY co.customerid, co.lastname, co.firstname, co.order_year, co.order_quarter
)
SELECT
customerid,
lastname,
firstname,
COUNT(DISTINCT year_quarter) as active_quarters,
SUM(orders_in_quarter) as total_orders,
ROUND(AVG(orders_in_quarter), 2) as avg_orders_per_quarter
FROM quarterly_analysis
GROUP BY customerid, lastname, firstname
HAVING COUNT(DISTINCT year_quarter) >= 2
ORDER BY avg_orders_per_quarter DESC, total_orders DESC
LIMIT 20;
`
}
こちらも同じく制限を自動的に掛けました。
この結果が最終回答の表データになりました。
{
`sql`: `
-- 複数注文顧客の総数とその分布
WITH customer_orders AS (
SELECT
o.customerid,
c.lastname,
c.firstname,
o.orderdate,
o.orderno,
EXTRACT(YEAR FROM o.orderdate) as order_year,
EXTRACT(QUARTER FROM o.orderdate) as order_quarter
FROM \"order\" o
JOIN cust c ON o.customerid = c.customerid
),
customer_order_counts AS (
SELECT
customerid,
COUNT(DISTINCT orderno) as total_orders
FROM customer_orders
GROUP BY customerid
HAVING COUNT(DISTINCT orderno) > 1
),
quarterly_analysis AS (
SELECT
co.customerid,
co.lastname,
co.firstname,
co.order_year,
co.order_quarter,
COUNT(DISTINCT co.orderno) as orders_in_quarter,
CONCAT(co.order_year, '-Q', co.order_quarter) as year_quarter
FROM customer_orders co
JOIN customer_order_counts coc ON co.customerid = coc.customerid
GROUP BY co.customerid, co.lastname, co.firstname, co.order_year, co.order_quarter
)
SELECT
'複数注文顧客の分布' as analysis_type,
COUNT(DISTINCT customerid) as multiple_order_customers,
SUM(CASE WHEN active_quarters >= 2 THEN 1 ELSE 0 END) as multi_quarter_customers,
ROUND(AVG(avg_orders_per_quarter), 2) as overall_avg_orders_per_quarter
FROM (
SELECT
customerid,
COUNT(DISTINCT year_quarter) as active_quarters,
SUM(orders_in_quarter) as total_orders,
ROUND(AVG(orders_in_quarter), 2) as avg_orders_per_quarter
FROM quarterly_analysis
GROUP BY customerid
) subquery;
`
}
全顧客数10,000人の中で
複数回注文している顧客: 8,602人(86.02%)
単発注文の顧客: 1,398人(13.98%)
複数四半期にわたって注文している顧客: 8,353人(83.53%)
というデータをまとめてくれました。
また、レポートにはわかりやすい洞察も含めてくれています。
自社の売り上げ情報の分析など簡易的な自然文によるチャットBI体験をするのであれば、Claude DesktopとPostgreSQLで簡単に実現できることが分かりました。
次は複数MCPを組み合わせた検証をしてみたいと思います。