このハンズオンについて
- このハンズオンでは、Redshiftのクラスターとそのクラスタに対してクエリを発行するインスタンスの作成を実施します。
- クエリの発行には、「psql」を利用します。本手順では、Amazon Linux上へのインストールと利用方法を説明します。
前提条件
バージョン確認
このハンズオンは以下のバージョンで動作確認を行いました。
aws --version
aws-cli/1.10.23 Python/2.7.10 Linux/4.4.5-15.26.amzn1.x86_64 botocore/1.4.14
必要な権限
作業にあたっては、以下の権限を有したIAMユーザもしくはIAMロールを利用してください。
- EC2に対するフルコントロール権限
- RedShiftに関するフルコントロール権限
- IAMに関するフルコントロール権限
- S3に関するフルコントロール権限
- STSに関するフルコントロール権限
0. 準備
リージョンを指定
export AWS_DEFAULT_REGION='ap-northeast-1'
資格情報を確認
aws configure list
Name Value Type Location
---- ----- ---- --------
profile <not set> None None
access_key ****************RDPA iam-role
secret_key ****************9GA8 iam-role
region ap-northeast-1 env AWS_DEFAULT_REGION
1. 作業用EC2インスタンスへログイン
SSHで接続
初回接続時、yesを選択
ssh -l ec2-user -i ~/.ssh/${KEY_MATERIAL_FILE} ${PUBLIC_IP_ADDRESS}
リージョンを指定
export AWS_DEFAULT_REGION='ap-northeast-1'
資格情報を確認
インスタンスプロファイル経由で認証情報を自動的に取得しています。
aws configure list
Name Value Type Location
---- ----- ---- --------
profile <not set> None None
access_key ****************RDPA iam-role
secret_key ****************9GA8 iam-role
region ap-northeast-1 env AWS_DEFAULT_REGION
2.クライアントツールのインストール
作成したRedshiftクラスターへアクセスするための準備を行います。
SQLクライアントとして、psqlを利用します。
参考情報
http://docs.aws.amazon.com/ja_jp/redshift/latest/mgmt/connecting-from-psql.html
PostgreSQL(とついでにjq)のインストール
sudo yum install postgresql-server -y
sudo yum install jq -y
(省略)
エンドポイントの確認
(クラスタを作成中の場合はしばらくお待ちください)
REDSHIFT_ENDPOINT=`redshift describe-clusters --query Clusters[?ClusterIdentifier==\'mycluster\'].Endpoint.Address --output text` \
&& echo ${REDSHIFT_ENDPOINT}
ユーザ名、DB名、ポート番号、インスタンスプロファイルの確認
DB_NAME="mydb"
PORT="5439"
MASTER_USER_NAME="awsuser"
IAM_ROLE="redshift-role"
パラメータの確認
cat << ETX
REDSHIFT_ENDPOINT:${REDSHIFT_ENDPOINT}
DB_NAME:${DB_NAME}
PORT:${PORT}
MASTER_USER_NAME:${MASTER_USER_NAME}
IAM_ROLE:${IAM_ROLE}
ETX
REDSHIFT_ENDPOINT:mycluster.c1vv0vckcvmq.ap-northeast-1.redshift.amazonaws.com
DB_NAME:mydb
PORT:5439
MASTER_USER_NAME:awsuser
IAM_ROLE:redshift-role
接続確認
(パスワード入力を要求されます)
psql -h ${REDSHIFT_ENDPOINT} -U ${MASTER_USER_NAME} -d ${DB_NAME} -p ${PORT}
バージョン確認
select version();
version
------------------------------------------------------------------------------------------------
--------------------------
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-
6.fc3), Redshift 1.0.1043
(1 row)
ログアウト
\q
3. 認証情報の取得
ここでは、データのロードにIAMロール(インスタンスプロファイル)を利用して取得出来る一時認証情報を利用します。
一時認証情報が取得できることを確認
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${IAM_ROLE}/
{
"Code" : "Success",
"LastUpdated" : "2016-04-30T09:46:48Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "********************",
"SecretAccessKey" : "*********************************************",
"Token" : "*************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************",
"Expiration" : "2016-04-30T16:20:52Z"
}
アクセスキーの取得
ACCESS_KEY_ID=`curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${IAM_ROLE}/ | jp.py AccessKeyId |sed 's/\"//g'` && echo ${ACCESS_KEY_ID}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 950 100 950 0 0 221k 0 --:--:-- --:--:-- --:--:-- 231k
********************
シークレットアクセスキーの取得
SECRET_ACCESS_KEY=`curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${IAM_ROLE}/ | jq -r .SecretAccessKey` && echo ${SECRET_ACCESS_KEY}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 950 100 950 0 0 221k 0 --:--:-- --:--:-- --:--:-- 231k
****************************************
トークンの取得
TOKEN=`curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${IAM_ROLE}/ | jq -r .Token` && echo ${TOKEN}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 950 100 950 0 0 221k 0 --:--:-- --:--:-- --:--:-- 231k
*************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
確認
cat << ETX
ACCESS_KEY_ID:${ACCESS_KEY_ID}
SECRET_ACCESS_KEY:${SECRET_ACCESS_KEY}
TOKEN:${TOKEN}
ETX
ACCESS_KEY_ID:********************
SECRET_ACCESS_KEY:****************************************
TOKEN:*************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
4.データのロード
AWSが提供するサンプルデータをインポートします。
サンプルデータをコピーするには、AWS アカウント認証情報(アクセスキー ID およびシークレットアクセスキー)が必要になります。
認証されたユーザーのみがこのデータにアクセスできます。
データをロードするためのコマンドを生成
LOAD_SAMPLE_DATA_FILE='load_sample_data.txt'
cat << EOF > ${LOAD_SAMPLE_DATA_FILE}
copy users from 's3://awssampledbuswest2/tickit/allusers_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' region 'us-west-2';
copy venue from 's3://awssampledbuswest2/tickit/venue_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' region 'us-west-2';
copy category from 's3://awssampledbuswest2/tickit/category_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' region 'us-west-2';
copy date from 's3://awssampledbuswest2/tickit/date2008_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' region 'us-west-2';
copy event from 's3://awssampledbuswest2/tickit/allevents_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' timeformat 'YYYY-MM-DD HH:MI:SS' region 'us-west-2';
copy listing from 's3://awssampledbuswest2/tickit/listings_pipe.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '|' region 'us-west-2';
copy sales from 's3://awssampledbuswest2/tickit/sales_tab.txt'
credentials 'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}'
delimiter '\t' timeformat 'MM/DD/YYYY HH:MI:SS' region 'us-west-2';
EOF
確認
(ここで出力したコマンドをpsqlでの接続後に使用します)
cat ${LOAD_SAMPLE_DATA_FILE}
(省略)
クラスタに接続
(パスワード入力を要求されます)
psql -h ${REDSHIFT_ENDPOINT} -U ${MASTER_USER_NAME} -d ${DB_NAME} -p ${PORT}
テーブルを作成
(「CREATE TABLE」が表示されればOKです)
create table users(
userid integer not null distkey sortkey,
username char(8),
firstname varchar(30),
lastname varchar(30),
city varchar(30),
state char(2),
email varchar(100),
phone char(14),
likesports boolean,
liketheatre boolean,
likeconcerts boolean,
likejazz boolean,
likeclassical boolean,
likeopera boolean,
likerock boolean,
likevegas boolean,
likebroadway boolean,
likemusicals boolean);
create table venue(
venueid smallint not null distkey sortkey,
venuename varchar(100),
venuecity varchar(30),
venuestate char(2),
venueseats integer);
create table category(
catid smallint not null distkey sortkey,
catgroup varchar(10),
catname varchar(10),
catdesc varchar(50));
create table date(
dateid smallint not null distkey sortkey,
caldate date not null,
day character(3) not null,
week smallint not null,
month character(5) not null,
qtr character(5) not null,
year smallint not null,
holiday boolean default('N'));
create table event(
eventid integer not null distkey,
venueid smallint not null,
catid smallint not null,
dateid smallint not null sortkey,
eventname varchar(200),
starttime timestamp);
create table listing(
listid integer not null distkey,
sellerid integer not null,
eventid integer not null,
dateid smallint not null sortkey,
numtickets smallint not null,
priceperticket decimal(8,2),
totalprice decimal(8,2),
listtime timestamp);
create table sales(
salesid integer not null,
listid integer not null distkey,
sellerid integer not null,
buyerid integer not null,
eventid integer not null,
dateid smallint not null sortkey,
qtysold smallint not null,
pricepaid decimal(8,2),
commission decimal(8,2),
saletime timestamp);
データのロード
一時認証情報を含んだcopyコマンドを貼り付けます。
INFO: Load into table 'users' completed, 49990 record(s) loaded successfully.
INFO: Load into table 'venue' completed, 202 record(s) loaded successfully.
INFO: Load into table 'category' completed, 11 record(s) loaded successfully.
INFO: Load into table 'date' completed, 365 record(s) loaded successfully.
INFO: Load into table 'event' completed, 8798 record(s) loaded successfully.
INFO: Load into table 'listing' completed, 192497 record(s) loaded successfully.
INFO: Load into table 'sales' completed, 172456 record(s) loaded successfully.
ロード結果の確認
select count(*) from users;
select count(*) from venue;
select count(*) from category;
select count(*) from date;
select count(*) from event;
select count(*) from listing;
select count(*) from sales;
count
-------
49990
(1 row)
count
-------
202
(1 row)
count
-------
11
(1 row)
count
-------
365
(1 row)
count
-------
8798
(1 row)
count
--------
192497
(1 row)
count
--------
172456
(1 row)
5. クエリ
テーブル名のリストを表示する
select distinct(tablename) from pg_table_def where schemaname = 'public';
tablename
-----------
category
date
event
listing
sales
users
venue
(7 rows)
データベースユーザーを表示する
select * from pg_user;
usename | usesysid | usecreatedb | usesuper | usecatupd | passwd | valuntil | useconfig
---------+----------+-------------+----------+-----------+----------+----------+-----------
rdsdb | 1 | t | t | t | ******** | infinity |
awsuser | 100 | t | t | f | ******** | |
(2 rows)
最近のクエリを表示する
select query, pid, elapsed, substring from svl_qlog
where userid = 100
order by starttime desc
limit 5;
query | pid | elapsed | substring
-------+-------+---------+-------------------------------
737 | 32764 | 5577 | select count(*) from sales;
695 | 32764 | 825968 | select count(*) from sales;
694 | 32764 | 5337 | select count(*) from listing;
693 | 32764 | 802109 | select count(*) from event;
692 | 32764 | 835171 | select count(*) from date;
(5 rows)
実行中のクエリのプロセス ID を調べる
select pid, user_name, starttime, query
from stv_recents
where status='Running';
pid | user_name | starttime | query
-----+-----------+-----------+-------
(0 rows)
ユーザデータに対してクエリを実行する
select * from users limit 10;
userid | username | firstname | lastname | city | state | email | phone | likesports | liketheatre | likeconcerts | likejazz | likeclassical | likeopera | likerock | likevegas | li
kebroadway | likemusicals
--------+----------+-----------+----------+--------------+-------+---------------------------------------------+----------------+------------+-------------+--------------+----------+---------------+-----------+----------+-----------+---
-----------+--------------
2 | PGL08LJI | Vladimir | Humphrey | Murfreesboro | SK | Suspendisse.tristique@nonnisiAenean.edu | (783) 492-1886 | | | | t | t | | | t | f
| t
4 | XDZ38RDD | Barry | Roy | Omaha | AB | sed@lacusUtnec.ca | (355) 452-8168 | f | t | | f | | | | |
| f
5 | AEB55QTM | Reagan | Hodge | Forest Lake | NS | Cum@accumsan.com | (476) 519-9131 | | | t | f | | | t | t | f
| t
7 | OWY35QYB | Tamekah | Juarez | Moultrie | WV | elementum@semperpretiumneque.ca | (297) 875-7247 | | | | t | t | f | | | f
| f
9 | MSD36KVR | Mufutau | Watkins | Port Orford | MD | Integer.mollis.Integer@tristiquealiquet.org | (725) 719-7670 | t | f | | f | t | | | | f
| t
10 | WKW41AIW | Naida | Calderon | Waterbury | MB | Donec.fringilla@sodalesat.org | (197) 726-8249 | f | f | f | | f | t | | t |
|
15 | OWU78MTR | Scarlett | Mayer | Gadsden | GA | lorem.ipsum@Vestibulumante.com | (189) 882-8412 | t | f | t | | | t | | | t
|
16 | ZMG93CDD | Kieran | Drake | Hot Springs | BC | molestie.tellus@dapibusgravidaAliquam.com | (192) 914-0016 | | t | t | | f | | t | t |
| f
18 | VDP05MXU | Germaine | Valdez | Kokomo | WY | cursus.Integer@arcuVestibulumante.com | (998) 879-8668 | | t | t | | t | t | | f | t
| t
19 | CXQ97IWP | Amal | Landry | Lomita | NT | euismod@turpis.org | (891) 526-1468 | | f | t | | t | | | f | f
| t
(10 rows)
select * from venue limit 10;
venueid | venuename | venuecity | venuestate | venueseats
---------+----------------------------+---------------+------------+------------
2 | Columbus Crew Stadium | Columbus | OH | 0
4 | CommunityAmerica Ballpark | Kansas City | KS | 0
5 | Gillette Stadium | Foxborough | MA | 68756
7 | BMO Field | Toronto | ON | 0
9 | Dick's Sporting Goods Park | Commerce City | CO | 0
10 | Pizza Hut Park | Frisco | TX | 0
15 | McAfee Coliseum | Oakland | CA | 63026
16 | TD Banknorth Garden | Boston | MA | 0
18 | Madison Square Garden | New York City | NY | 20000
19 | Wachovia Center | Philadelphia | PA | 0
(10 rows)
select * from category limit 10;
catid | catgroup | catname | catdesc
-------+----------+----------+---------------------------------
2 | Sports | NHL | National Hockey League
4 | Sports | NBA | National Basketball Association
5 | Sports | MLS | Major League Soccer
7 | Shows | Plays | All non-musical theatre
9 | Concerts | Pop | All rock and pop music concerts
10 | Concerts | Jazz | All jazz singers and bands
1 | Sports | MLB | Major League Baseball
3 | Sports | NFL | National Football League
6 | Shows | Musicals | Musical theatre
8 | Shows | Opera | All opera and light opera
(10 rows)
select * from date limit 10;
dateid | caldate | day | week | month | qtr | year | holiday
--------+------------+-----+------+-------+-------+------+---------
1827 | 2008-01-01 | WE | 1 | JAN | 1 | 2008 | t
1831 | 2008-01-05 | SU | 2 | JAN | 1 | 2008 | f
1836 | 2008-01-10 | FR | 2 | JAN | 1 | 2008 | f
1837 | 2008-01-11 | SA | 3 | JAN | 1 | 2008 | f
1840 | 2008-01-14 | TU | 3 | JAN | 1 | 2008 | f
1843 | 2008-01-17 | FR | 3 | JAN | 1 | 2008 | f
1845 | 2008-01-19 | SU | 4 | JAN | 1 | 2008 | f
1846 | 2008-01-20 | MO | 4 | JAN | 1 | 2008 | f
1847 | 2008-01-21 | TU | 4 | JAN | 1 | 2008 | f
1849 | 2008-01-23 | TH | 4 | JAN | 1 | 2008 | f
(10 rows)
select * from event limit 10;
eventid | venueid | catid | dateid | eventname | starttime
---------+---------+-------+--------+-------------------+---------------------
1217 | 238 | 6 | 1827 | Mamma Mia! | 2008-01-01 20:00:00
1433 | 248 | 6 | 1827 | Grease | 2008-01-01 19:00:00
2811 | 207 | 7 | 1827 | Spring Awakening | 2008-01-01 15:00:00
3915 | 51 | 9 | 1827 | Return To Forever | 2008-01-01 14:00:00
4135 | 16 | 9 | 1827 | Nas | 2008-01-01 14:30:00
5807 | 45 | 9 | 1827 | Return To Forever | 2008-01-01 15:00:00
6649 | 6 | 9 | 1827 | Hannah Montana | 2008-01-01 19:30:00
6711 | 72 | 9 | 1827 | Smashing Pumpkins | 2008-01-01 19:00:00
7747 | 6 | 9 | 1827 | K.D. Lang | 2008-01-01 15:00:00
521 | 248 | 6 | 1828 | West Side Story | 2008-01-02 15:00:00
(10 rows)
select * from listing limit 10;
listid | sellerid | eventid | dateid | numtickets | priceperticket | totalprice | listtime
--------+----------+---------+--------+------------+----------------+------------+---------------------
614 | 25339 | 770 | 1827 | 10 | 236.00 | 2360.00 | 2008-01-01 05:07:30
776 | 20797 | 1811 | 1827 | 18 | 133.00 | 2394.00 | 2008-01-01 06:59:39
2092 | 42560 | 8609 | 1827 | 22 | 194.00 | 4268.00 | 2008-01-01 05:49:06
2688 | 10629 | 8791 | 1827 | 3 | 20.00 | 60.00 | 2008-01-01 09:18:40
5736 | 32170 | 1221 | 1827 | 8 | 221.00 | 1768.00 | 2008-01-01 11:20:20
6635 | 30023 | 2426 | 1827 | 8 | 50.00 | 400.00 | 2008-01-01 07:09:40
7786 | 47561 | 117 | 1827 | 6 | 124.00 | 744.00 | 2008-01-01 02:12:34
8301 | 45285 | 8582 | 1827 | 4 | 71.00 | 284.00 | 2008-01-01 08:09:16
8434 | 46015 | 3937 | 1827 | 4 | 233.00 | 932.00 | 2008-01-01 08:51:49
8942 | 23600 | 2557 | 1827 | 12 | 105.00 | 1260.00 | 2008-01-01 07:20:48
(10 rows)
select * from sales limit 10;
salesid | listid | sellerid | buyerid | eventid | dateid | qtysold | pricepaid | commission | saletime
---------+--------+----------+---------+---------+--------+---------+-----------+------------+---------------------
33095 | 36572 | 30047 | 660 | 2903 | 1827 | 2 | 234.00 | 35.10 | 2008-01-01 09:41:06
88268 | 100813 | 45818 | 698 | 8649 | 1827 | 4 | 836.00 | 125.40 | 2008-01-01 07:26:20
110917 | 127048 | 37631 | 116 | 1749 | 1827 | 1 | 337.00 | 50.55 | 2008-01-01 07:05:02
150314 | 173969 | 48680 | 816 | 8762 | 1827 | 2 | 688.00 | 103.20 | 2008-01-01 03:50:02
157751 | 206999 | 3003 | 157 | 6605 | 1827 | 1 | 1730.00 | 259.50 | 2008-01-01 12:50:55
1134 | 1176 | 37614 | 301 | 5414 | 1828 | 1 | 218.00 | 32.70 | 2008-01-02 08:22:32
1924 | 2067 | 27144 | 256 | 6977 | 1828 | 2 | 494.00 | 74.10 | 2008-01-02 01:11:16
8325 | 8942 | 23600 | 1078 | 2557 | 1828 | 5 | 525.00 | 78.75 | 2008-01-02 05:27:54
40331 | 45102 | 13005 | 1091 | 1756 | 1828 | 2 | 58.00 | 8.70 | 2008-01-02 05:57:53
40741 | 45570 | 18692 | 226 | 4650 | 1828 | 3 | 90.00 | 13.50 | 2008-01-02 01:40:59
(10 rows)
-- Get definition for the sales table.
SELECT *
FROM pg_table_def
WHERE tablename = 'sales';
-- Find total sales on a given calendar date.
SELECT sum(qtysold)
FROM sales, date
WHERE sales.dateid = date.dateid
AND caldate = '2008-01-05';
-- Find top 10 buyers by quantity.
SELECT firstname, lastname, total_quantity
FROM (SELECT buyerid, sum(qtysold) total_quantity
FROM sales
GROUP BY buyerid
ORDER BY total_quantity desc limit 10) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity desc;
-- Find events in the 99.9 percentile in terms of all time gross sales.
SELECT eventname, total_price
FROM (SELECT eventid, total_price, ntile(1000) over(order by total_price desc) as percentile
FROM (SELECT eventid, sum(pricepaid) total_price
FROM sales
GROUP BY eventid)) Q, event E
WHERE Q.eventid = E.eventid
AND percentile = 1
ORDER BY total_price desc;
しばし、クエリをお楽しみください。
6. バキューム、分析
説明は割愛しますw
バキューム
vacuum;
VACUUM
分析
analyze;
ANALYZE
7. アンロードする
Redshift上のデータをS3バケットにアンロード(出力)します。
クラスターからログアウトする
\q
アンロード先バケット名の指定
バケット名は、グローバルでユニークである必要があります。適宜、バケット名を修正してください。
BUCKET_NAME="jawsug-cli-unloaded-data-2016-04-28"
バケットの作成
aws s3 mb s3://${BUCKET_NAME}
make_bucket: s3://jawsug-cli-unloaded-data-2016-04-28/
バケットの確認
aws s3 ls
2016-04-30 07:33:44 jawsug-cli-unloaded-data-2016-04-28
バケット内のファイルの確認
aws s3 ls s3://${BUCKET_NAME}
(返値無し)
パラメータの確認
cat << ETX
BUCKET_NAME:${BUCKET_NAME}
ACCESS_KEY_ID:${ACCESS_KEY_ID}
SECRET_ACCESS_KEY:${SECRET_ACCESS_KEY}
TOKEN:${TOKEN}
ETX
BUCKET_NAME:jawsug-cli-unloaded-data-2016-04-28
ACCESS_KEY_ID:********************
SECRET_ACCESS_KEY:****************************************
TOKEN:********************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
コマンドの生成(アンロード)
UNLOAD_SAMPLE_DATA_FILE='unload_sample_data.txt'
cat << EOF > ${UNLOAD_SAMPLE_DATA_FILE}
unload ('select * from users')
to 's3://${BUCKET_NAME}/unload/users_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from venue')
to 's3://${BUCKET_NAME}/unload/venue_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from category')
to 's3://${BUCKET_NAME}/unload/category_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from date')
to 's3://${BUCKET_NAME}/unload/date_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from event')
to 's3://${BUCKET_NAME}/unload/event_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from listing')
to 's3://${BUCKET_NAME}/unload/listing_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
unload ('select * from sales')
to 's3://${BUCKET_NAME}/unload/sales_' credentials
'aws_access_key_id=${ACCESS_KEY_ID};aws_secret_access_key=${SECRET_ACCESS_KEY};token=${TOKEN}';
EOF
確認
cat ${UNLOAD_SAMPLE_DATA_FILE}
(省略)
クラスターへのログイン
cat << ETX
REDSHIFT_ENDPOINT:${REDSHIFT_ENDPOINT}
DB_NAME:${DB_NAME}
PORT:${PORT}
MASTER_USER_NAME:${MASTER_USER_NAME}
ETX
REDSHIFT_ENDPOINT:mycluster.c1vv0vckcvmq.ap-northeast-1.redshift.amazonaws.com
DB_NAME:mydb
PORT:5439
MASTER_USER_NAME:awsuser
psql -h ${REDSHIFT_ENDPOINT} -U ${MASTER_USER_NAME} -d ${DB_NAME} -p ${PORT}
psql (9.2.15, server 8.0.2)
WARNING: psql version 9.2, server version 8.0.
Some psql features might not work.
SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.
mydb=#
アンロード
生成したコマンドを貼り付けます。
アンロード結果の確認
select query, substring(path,0,100) as path
from stl_unload_log
order by path;
query | path
-------+-----------------------------------------------------------------------
315 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/category_0000_part_00
315 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/category_0001_part_00
316 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/date_0000_part_00
316 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/date_0001_part_00
317 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/event_0000_part_00
317 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/event_0001_part_00
318 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/listing_0000_part_00
318 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/listing_0001_part_00
319 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/sales_0000_part_00
319 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/sales_0001_part_00
313 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/users_0000_part_00
313 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/users_0001_part_00
314 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/venue_0000_part_00
314 | s3://jawsug-cli-unloaded-data-2016-04-28/unload/venue_0001_part_00
(14 rows)
クラスターからログアウト
\q
S3を参照
aws s3 ls s3://${BUCKET_NAME} --recursive
2016-05-01 09:15:55 243 unload/category_0000_part_00
2016-05-01 09:15:55 211 unload/category_0001_part_00
2016-05-01 09:15:56 6583 unload/date_0000_part_00
2016-05-01 09:15:56 6133 unload/date_0001_part_00
2016-05-01 09:15:58 218011 unload/event_0000_part_00
2016-05-01 09:15:58 219029 unload/event_0001_part_00
2016-05-01 09:15:59 5680079 unload/listing_0000_part_00
2016-05-01 09:15:59 5712457 unload/listing_0001_part_00
2016-05-01 09:16:01 5956875 unload/sales_0000_part_00
2016-05-01 09:16:01 5997091 unload/sales_0001_part_00
2016-05-01 09:15:52 2494270 unload/users_0000_part_00
2016-05-01 09:15:52 2502481 unload/users_0001_part_00
2016-05-01 09:15:54 3521 unload/venue_0000_part_00
2016-05-01 09:15:54 4235 unload/venue_0001_part_00
オブジェクトの削除
aws s3 rm s3://${BUCKET_NAME} --recursive
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/category_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/users_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/users_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/event_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/venue_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/listing_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/date_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/listing_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/date_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/category_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/sales_0000_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/event_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/venue_0001_part_00
delete: s3://jawsug-cli-unloaded-data-2016-04-28/unload/sales_0001_part_00
バケットの削除
aws s3 rb s3://${BUCKET_NAME}
remove_bucket: s3://jawsug-cli-unloaded-data-2016-04-28/
EC2インスタンスからログアウト
exit
以上