はじめに
ローカルの SQL Server だけで「DWHっぽい」データモデルを触ってみたいときに、さっと使えるサンプルがあると便利です。(たまに、Microsoft Fabric の Lakehouse や Warehouse を検証するときに、SQL Server からデータベースをコピーしたいとき、これらのダミーデータがあると良いかもしれませんね)
ここでは、以下のようなミニ DWH を一括で作る SQL スクリプトを遊んでみようと思います。
- データベース:
SalesStarDW - スキーマ:典型的な スター・スキーマ
- 日付ディメンション(
DimDate) - 顧客ディメンション(
DimCustomer) - 商品ディメンション(
DimProduct) - 店舗ディメンション(
DimStore) - 営業ディメンション(
DimSalesperson) - 売上ファクト(
FactSales、計算列SalesAmount付き)
- 日付ディメンション(
- データ:
- 日付:2020-01-01〜2024-12-31
- 各ディメンション:30件ずつ
- ファクト:ランダムに 1,000 行生成
前半:概要と簡単な遊び方
後半:スクリプト全文
という構成にしています。
何が作られるか(ざっくり解説)
ディメンションテーブル
-
DimDate-
DateValue(自然キー)、DayOfWeekName、MonthName、CalendarYear、IsWeekendなど - 日付分析(年・月・曜日・平日/週末)がしやすい構造
-
-
DimCustomer-
CustomerSegment(Retail / SMB / Enterprise) -
City、Country、SignupDate、IsActiveなど
-
-
DimProduct-
Brand、Category、SubCategory -
UnitPrice(標準単価)、IsActive
-
-
DimStore-
StoreType(Retail / Outlet / Online) -
City、Region、Country、OpenDate、IsActive
-
-
DimSalesperson-
Title(Sales Rep / Senior Sales Rep / Sales Manager) -
Region、HireDate、IsActive
-
ファクトテーブル
FactSales は、典型的な売上ファクトテーブルです。
- 外部キー:
DateKey、CustomerKey、ProductKey、StoreKey、SalespersonKey - 明細情報:
OrderNumber、OrderLineNumber、Quantity、UnitPrice、DiscountAmount - 計算列:
SalesAmount(Quantity * UnitPrice - DiscountAmount)
計算列 SalesAmount は PERSISTED なので、クエリ時に毎回式を書かなくてもそのまま合計を取ることができます。
どうやって遊ぶか(最低限のステップ)
1. スクリプトを実行する
SQL Server Management Studio (SSMS) や Azure Data Studio などで、後半の「サンプル SQL コード全体」をそのまま実行します。
-
SalesStarDWデータベースが作成される - 既存のテーブルがあれば DROP される
- ディメンションとファクトが作成され、データが投入される
2. 行数の確認
本当に 1,000 行入っているか確認します。
USE SalesStarDW;
GO
SELECT COUNT(*) AS FactSalesRowCount
FROM dbo.FactSales;
3. 売上明細のサンプルを確認
ディメンションを JOIN して、どんなデータが入っているか簡単に確認します。
SELECT TOP 5
f.SalesKey,
d.DateValue,
c.FullName AS CustomerName,
p.ProductName,
st.StoreName,
sp.FullName AS SalespersonName,
f.Quantity,
f.UnitPrice,
f.DiscountAmount,
f.SalesAmount
FROM dbo.FactSales f
JOIN dbo.DimDate d ON f.DateKey = d.DateKey
JOIN dbo.DimCustomer c ON f.CustomerKey = c.CustomerKey
JOIN dbo.DimProduct p ON f.ProductKey = p.ProductKey
JOIN dbo.DimStore st ON f.StoreKey = st.StoreKey
JOIN dbo.DimSalesperson sp ON f.SalespersonKey = sp.SalespersonKey;
4. 年別・国別の売上を集計してみる
簡単な集計クエリで、スター・スキーマの感覚を掴めます。
SELECT
d.CalendarYear,
st.Country,
SUM(f.SalesAmount) AS TotalSales,
COUNT(*) AS LineCount
FROM dbo.FactSales f
JOIN dbo.DimDate d ON f.DateKey = d.DateKey
JOIN dbo.DimStore st ON f.StoreKey = st.StoreKey
GROUP BY d.CalendarYear, st.Country
ORDER BY d.CalendarYear, st.Country;
実行例
サンプル SQL コード全体
以下が、今回利用しているスクリプト全文です。
そのまま貼り付けて実行すれば、SalesStarDW ミニ DWH が作成されます。
------------------------------------------------------------
-- 0. Create and select database
------------------------------------------------------------
IF DB_ID('SalesStarDW') IS NULL
CREATE DATABASE SalesStarDW;
GO
USE SalesStarDW;
GO
------------------------------------------------------------
-- 1. Drop tables if you re-run the script
------------------------------------------------------------
IF OBJECT_ID('dbo.FactSales', 'U') IS NOT NULL DROP TABLE dbo.FactSales;
IF OBJECT_ID('dbo.DimSalesperson', 'U') IS NOT NULL DROP TABLE dbo.DimSalesperson;
IF OBJECT_ID('dbo.DimStore', 'U') IS NOT NULL DROP TABLE dbo.DimStore;
IF OBJECT_ID('dbo.DimProduct', 'U') IS NOT NULL DROP TABLE dbo.DimProduct;
IF OBJECT_ID('dbo.DimCustomer', 'U') IS NOT NULL DROP TABLE dbo.DimCustomer;
IF OBJECT_ID('dbo.DimDate', 'U') IS NOT NULL DROP TABLE dbo.DimDate;
GO
------------------------------------------------------------
-- 2. Dimension tables
------------------------------------------------------------
-- 2.1 Date dimension
CREATE TABLE dbo.DimDate (
DateKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
DateValue DATE NOT NULL, -- natural key
DayNumber TINYINT NOT NULL, -- 1-31
DayOfWeekName NVARCHAR(20) NOT NULL,
MonthNumber TINYINT NOT NULL, -- 1-12
MonthName NVARCHAR(20) NOT NULL,
QuarterNumber TINYINT NOT NULL, -- 1-4
CalendarYear SMALLINT NOT NULL,
IsWeekend BIT NOT NULL
);
GO
CREATE UNIQUE INDEX UX_DimDate_DateValue
ON dbo.DimDate(DateValue);
GO
-- 2.2 Customer dimension
CREATE TABLE dbo.DimCustomer (
CustomerKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
CustomerCode NVARCHAR(20) NOT NULL,
FullName NVARCHAR(100) NOT NULL,
CustomerSegment NVARCHAR(20) NOT NULL, -- Retail / SMB / Enterprise
City NVARCHAR(50) NOT NULL,
Country NVARCHAR(50) NOT NULL,
SignupDate DATE NOT NULL,
IsActive BIT NOT NULL
);
GO
CREATE UNIQUE INDEX UX_DimCustomer_CustomerCode
ON dbo.DimCustomer(CustomerCode);
GO
-- 2.3 Product dimension
CREATE TABLE dbo.DimProduct (
ProductKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
ProductCode NVARCHAR(20) NOT NULL,
ProductName NVARCHAR(100) NOT NULL,
Brand NVARCHAR(50) NOT NULL,
Category NVARCHAR(50) NOT NULL,
SubCategory NVARCHAR(50) NULL,
UnitPrice DECIMAL(10,2) NOT NULL, -- << key column
IsActive BIT NOT NULL
);
GO
CREATE UNIQUE INDEX UX_DimProduct_ProductCode
ON dbo.DimProduct(ProductCode);
GO
-- 2.4 Store dimension
CREATE TABLE dbo.DimStore (
StoreKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
StoreCode NVARCHAR(20) NOT NULL,
StoreName NVARCHAR(100) NOT NULL,
StoreType NVARCHAR(20) NOT NULL, -- Retail / Outlet / Online
City NVARCHAR(50) NOT NULL,
Region NVARCHAR(50) NULL,
Country NVARCHAR(50) NOT NULL,
OpenDate DATE NOT NULL,
IsActive BIT NOT NULL
);
GO
CREATE UNIQUE INDEX UX_DimStore_StoreCode
ON dbo.DimStore(StoreCode);
GO
-- 2.5 Salesperson dimension
CREATE TABLE dbo.DimSalesperson (
SalespersonKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
EmployeeCode NVARCHAR(20) NOT NULL,
FullName NVARCHAR(100) NOT NULL,
Title NVARCHAR(50) NOT NULL,
HireDate DATE NOT NULL,
Region NVARCHAR(50) NOT NULL,
IsActive BIT NOT NULL
);
GO
CREATE UNIQUE INDEX UX_DimSalesperson_EmployeeCode
ON dbo.DimSalesperson(EmployeeCode);
GO
------------------------------------------------------------
-- 3. Fact table
------------------------------------------------------------
CREATE TABLE dbo.FactSales (
SalesKey BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
DateKey INT NOT NULL,
CustomerKey INT NOT NULL,
ProductKey INT NOT NULL,
StoreKey INT NOT NULL,
SalespersonKey INT NULL,
OrderNumber NVARCHAR(20) NOT NULL,
OrderLineNumber TINYINT NOT NULL,
Quantity INT NOT NULL,
UnitPrice DECIMAL(10,2) NOT NULL,
DiscountAmount DECIMAL(10,2) NOT NULL
CONSTRAINT DF_FactSales_Discount DEFAULT (0),
SalesAmount AS (Quantity * UnitPrice - DiscountAmount) PERSISTED
);
GO
------------------------------------------------------------
-- 4. Relationships (Foreign Keys + Indexes)
------------------------------------------------------------
ALTER TABLE dbo.FactSales
ADD CONSTRAINT FK_FactSales_DimDate
FOREIGN KEY (DateKey) REFERENCES dbo.DimDate(DateKey);
ALTER TABLE dbo.FactSales
ADD CONSTRAINT FK_FactSales_DimCustomer
FOREIGN KEY (CustomerKey) REFERENCES dbo.DimCustomer(CustomerKey);
ALTER TABLE dbo.FactSales
ADD CONSTRAINT FK_FactSales_DimProduct
FOREIGN KEY (ProductKey) REFERENCES dbo.DimProduct(ProductKey);
ALTER TABLE dbo.FactSales
ADD CONSTRAINT FK_FactSales_DimStore
FOREIGN KEY (StoreKey) REFERENCES dbo.DimStore(StoreKey);
ALTER TABLE dbo.FactSales
ADD CONSTRAINT FK_FactSales_DimSalesperson
FOREIGN KEY (SalespersonKey) REFERENCES dbo.DimSalesperson(SalespersonKey);
GO
CREATE INDEX IX_FactSales_DateKey ON dbo.FactSales(DateKey);
CREATE INDEX IX_FactSales_CustomerKey ON dbo.FactSales(CustomerKey);
CREATE INDEX IX_FactSales_ProductKey ON dbo.FactSales(ProductKey);
CREATE INDEX IX_FactSales_StoreKey ON dbo.FactSales(StoreKey);
CREATE INDEX IX_FactSales_SalespersonKey ON dbo.FactSales(SalespersonKey);
GO
------------------------------------------------------------
-- 5. Populate DimDate : full 5-year range
-- 2020-01-01 to 2024-12-31
------------------------------------------------------------
DECLARE @StartDate DATE = '2020-01-01';
DECLARE @EndDate DATE = '2024-12-31';
DECLARE @d DATE = @StartDate;
WHILE @d <= @EndDate
BEGIN
INSERT INTO dbo.DimDate
(DateValue,
DayNumber,
DayOfWeekName,
MonthNumber,
MonthName,
QuarterNumber,
CalendarYear,
IsWeekend)
VALUES
(@d,
DAY(@d),
DATENAME(WEEKDAY, @d),
MONTH(@d),
DATENAME(MONTH, @d),
DATEPART(QUARTER, @d),
YEAR(@d),
CASE WHEN DATENAME(WEEKDAY, @d) IN (N'Saturday', N'Sunday') THEN 1 ELSE 0 END);
SET @d = DATEADD(DAY, 1, @d);
END;
GO
------------------------------------------------------------
-- 6. Populate DimCustomer : 30 customers
------------------------------------------------------------
;WITH N AS (
SELECT TOP (30)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM master.sys.objects
)
INSERT INTO dbo.DimCustomer
(CustomerCode, FullName, CustomerSegment, City, Country, SignupDate, IsActive)
SELECT
'CUST-' + RIGHT('000' + CAST(n AS VARCHAR(3)), 3) AS CustomerCode,
'Customer ' + CAST(n AS VARCHAR(3)) AS FullName,
CASE (n % 3)
WHEN 1 THEN 'Retail'
WHEN 2 THEN 'SMB'
ELSE 'Enterprise'
END AS CustomerSegment,
CASE (n % 5)
WHEN 1 THEN 'Tokyo'
WHEN 2 THEN 'Osaka'
WHEN 3 THEN 'Nagoya'
WHEN 4 THEN 'Jakarta'
ELSE 'Singapore'
END AS City,
CASE (n % 5)
WHEN 1 THEN 'Japan'
WHEN 2 THEN 'Japan'
WHEN 3 THEN 'Japan'
WHEN 4 THEN 'Indonesia'
ELSE 'Singapore'
END AS Country,
DATEADD(DAY, n, '2018-01-01') AS SignupDate,
CASE WHEN n % 10 = 0 THEN 0 ELSE 1 END AS IsActive
FROM N;
GO
------------------------------------------------------------
-- 7. Populate DimProduct : 30 products
------------------------------------------------------------
;WITH N AS (
SELECT TOP (30)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM master.sys.objects
)
INSERT INTO dbo.DimProduct
(ProductCode, ProductName, Brand, Category, SubCategory, UnitPrice, IsActive)
SELECT
'PROD-' + RIGHT('000' + CAST(n AS VARCHAR(3)), 3) AS ProductCode,
'Product ' + CAST(n AS VARCHAR(3)) AS ProductName,
CASE (n % 4)
WHEN 1 THEN 'Contoso'
WHEN 2 THEN 'Fabrikam'
WHEN 3 THEN 'AdventureWorks'
ELSE 'Tailspin'
END AS Brand,
CASE (n % 3)
WHEN 1 THEN 'Electronics'
WHEN 2 THEN 'Furniture'
ELSE 'Accessories'
END AS Category,
CASE
WHEN (n % 3) = 1 THEN
CASE WHEN (n % 2) = 0 THEN 'Laptop' ELSE 'Monitor' END
WHEN (n % 3) = 2 THEN
CASE WHEN (n % 2) = 0 THEN 'Desk' ELSE 'Chair' END
ELSE
CASE WHEN (n % 2) = 0 THEN 'Mouse' ELSE 'Keyboard' END
END AS SubCategory,
CAST(ROUND(50.0 + (n * 15.0), 2) AS DECIMAL(10,2)) AS UnitPrice,
CASE WHEN n % 12 = 0 THEN 0 ELSE 1 END AS IsActive
FROM N;
GO
------------------------------------------------------------
-- 8. Populate DimStore : 30 stores
------------------------------------------------------------
;WITH N AS (
SELECT TOP (30)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM master.sys.objects
)
INSERT INTO dbo.DimStore
(StoreCode, StoreName, StoreType, City, Region, Country, OpenDate, IsActive)
SELECT
'STORE-' + RIGHT('000' + CAST(n AS VARCHAR(3)), 3) AS StoreCode,
'Store ' + CAST(n AS VARCHAR(3)) AS StoreName,
CASE (n % 3)
WHEN 1 THEN 'Retail'
WHEN 2 THEN 'Outlet'
ELSE 'Online'
END AS StoreType,
CASE (n % 6)
WHEN 1 THEN 'Tokyo'
WHEN 2 THEN 'Osaka'
WHEN 3 THEN 'Nagoya'
WHEN 4 THEN 'Fukuoka'
WHEN 5 THEN 'Jakarta'
ELSE 'Singapore'
END AS City,
CASE (n % 6)
WHEN 1 THEN 'Kanto'
WHEN 2 THEN 'Kansai'
WHEN 3 THEN 'Tokai'
WHEN 4 THEN 'Kyushu'
WHEN 5 THEN 'Java'
ELSE 'Singapore Region'
END AS Region,
CASE (n % 6)
WHEN 5 THEN 'Indonesia'
WHEN 0 THEN 'Singapore'
ELSE 'Japan'
END AS Country,
DATEADD(DAY, n * 30, '2015-01-01') AS OpenDate,
CASE WHEN n % 15 = 0 THEN 0 ELSE 1 END AS IsActive
FROM N;
GO
------------------------------------------------------------
-- 9. Populate DimSalesperson : 30 salespeople
------------------------------------------------------------
;WITH N AS (
SELECT TOP (30)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM master.sys.objects
)
INSERT INTO dbo.DimSalesperson
(EmployeeCode, FullName, Title, HireDate, Region, IsActive)
SELECT
'EMP-' + RIGHT('000' + CAST(n AS VARCHAR(3)), 3) AS EmployeeCode,
'Salesperson ' + CAST(n AS VARCHAR(3)) AS FullName,
CASE
WHEN n % 10 = 0 THEN 'Sales Manager'
WHEN n % 5 = 0 THEN 'Senior Sales Rep'
ELSE 'Sales Rep'
END AS Title,
DATEADD(DAY, n * 20, '2016-01-01') AS HireDate,
CASE (n % 5)
WHEN 1 THEN 'Kanto'
WHEN 2 THEN 'Kansai'
WHEN 3 THEN 'Tokai'
WHEN 4 THEN 'Indonesia'
ELSE 'Singapore'
END AS Region,
CASE WHEN n % 18 = 0 THEN 0 ELSE 1 END AS IsActive
FROM N;
GO
------------------------------------------------------------
-- 10. Populate FactSales : 1,000 order lines
-- Spread across 5-year date range and all dimensions
------------------------------------------------------------
;WITH N AS (
SELECT TOP (1000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
FROM master.sys.objects s1
CROSS JOIN master.sys.objects s2
)
INSERT INTO dbo.FactSales
(DateKey,
CustomerKey,
ProductKey,
StoreKey,
SalespersonKey,
OrderNumber,
OrderLineNumber,
Quantity,
UnitPrice,
DiscountAmount)
SELECT
d.DateKey,
c.CustomerKey,
p.ProductKey,
st.StoreKey,
sp.SalespersonKey,
'SO-' + RIGHT('000000' + CAST(((n - 1) / 3) + 1 AS VARCHAR(6)), 6) AS OrderNumber,
CAST(((n - 1) % 3) + 1 AS TINYINT) AS OrderLineNumber,
((n - 1) % 10) + 1 AS Quantity, -- 1–10
p.UnitPrice AS UnitPrice, -- from DimProduct
CASE WHEN n % 7 = 0
THEN CAST(ROUND(p.UnitPrice * 0.10, 2) AS DECIMAL(10,2)) -- ~10% discount
ELSE 0.00
END AS DiscountAmount
FROM N
CROSS APPLY (
SELECT TOP 1 DateKey
FROM dbo.DimDate
ORDER BY NEWID()
) AS d
CROSS APPLY (
SELECT TOP 1 CustomerKey
FROM dbo.DimCustomer
ORDER BY NEWID()
) AS c
CROSS APPLY (
-- *** THIS IS THE IMPORTANT FIX ***
-- Select BOTH ProductKey and UnitPrice so alias p exposes p.UnitPrice
SELECT TOP 1 ProductKey, UnitPrice
FROM dbo.DimProduct
ORDER BY NEWID()
) AS p
CROSS APPLY (
SELECT TOP 1 StoreKey
FROM dbo.DimStore
ORDER BY NEWID()
) AS st
CROSS APPLY (
SELECT TOP 1 SalespersonKey
FROM dbo.DimSalesperson
ORDER BY NEWID()
) AS sp;
GO
------------------------------------------------------------
-- 11. Quick sanity checks (optional)
------------------------------------------------------------
SELECT COUNT(*) AS FactSalesRowCount FROM dbo.FactSales;
SELECT TOP 5
f.SalesKey,
d.DateValue,
c.FullName AS CustomerName,
p.ProductName,
st.StoreName,
sp.FullName AS SalespersonName,
f.Quantity,
f.UnitPrice,
f.DiscountAmount,
f.SalesAmount
FROM dbo.FactSales f
JOIN dbo.DimDate d ON f.DateKey = d.DateKey
JOIN dbo.DimCustomer c ON f.CustomerKey = c.CustomerKey
JOIN dbo.DimProduct p ON f.ProductKey = p.ProductKey
JOIN dbo.DimStore st ON f.StoreKey = st.StoreKey
JOIN dbo.DimSalesperson sp ON f.SalespersonKey = sp.SalespersonKey;
SELECT
d.CalendarYear,
st.Country,
SUM(f.SalesAmount) AS TotalSales,
COUNT(*) AS LineCount
FROM dbo.FactSales f
JOIN dbo.DimDate d ON f.DateKey = d.DateKey
JOIN dbo.DimStore st ON f.StoreKey = st.StoreKey
GROUP BY d.CalendarYear, st.Country
ORDER BY d.CalendarYear, st.Country;
SELECT TOP 3 * FROM dbo.DimCustomer;
SELECT TOP 3 * FROM dbo.DimDate;
SELECT TOP 3 * FROM dbo.DimProduct;
SELECT TOP 3 * FROM dbo.DimSalesperson;
SELECT TOP 3 * FROM dbo.DimStore;
SELECT TOP 3 * FROM dbo.FactSales

