These are the commands as run from Cloud Shell or from a Mac terminal with the Cloud SDK installed.
1. General
List credentialed accounts
$ gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* xxxxxxx@gmail.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
Show the currently configured project ID
$ gcloud config list project
[core]
project = xxxproject-xxxxx
Your active configuration is: [default]
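To see every project the account can access, or to switch the active project, the following standard gcloud commands can be used (not captured in the session above; [PROJECT_ID] is a placeholder):
# List all projects accessible to the active account
$ gcloud projects list
# Switch the active project
$ gcloud config set project [PROJECT_ID]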
Connect to Cloud Shell from a local machine
# The first run generates an SSH key and prompts you to set a passphrase
$ gcloud beta cloud-shell ssh
Automatic authentication with GCP CLI tools in Cloud Shell is disabled. To enable, please rerun command with `--authorize-session` flag.
WARNING: The private SSH key file for gcloud does not exist.
WARNING: The public SSH key file for gcloud does not exist.
WARNING: You do not have an SSH key for gcloud.
WARNING: SSH keygen will be executed to generate a key.
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/xxx/.ssh/google_compute_engine.
Your public key has been saved in /Users/xxx/.ssh/google_compute_engine.pub.
# From the second run onward, you connect by entering the passphrase
$ gcloud beta cloud-shell ssh
Automatic authentication with GCP CLI tools in Cloud Shell is disabled. To enable, please rerun command with `--authorize-session` flag.
Enter passphrase for key '/Users/xxx/.ssh/google_compute_engine':
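As the message above notes, automatic authentication for CLI tools inside the Cloud Shell session is disabled by default; rerunning with the suggested flag enables it:
# Enable automatic authentication with GCP CLI tools in the session
$ gcloud beta cloud-shell ssh --authorize-session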
2. BigQuery
Show help for the query command
# bq help lists all available bq commands
$ bq help query
Output:
Python script for interacting with BigQuery.
USAGE: bq.py [--global_flags] <command> [--command_flags] [args]
query Execute a query.
Query should be specified on command line, or passed on stdin.
Examples:
bq query 'select count(*) from publicdata:samples.shakespeare'
echo 'select count(*) from publicdata:samples.shakespeare' | bq query
Usage:
query [<sql_query>]
Flags for query:
/Users/xxxx/gcp/cloud-sdk/google-cloud-sdk/platform/bq/bq.py:
--[no]allow_large_results: Enables larger destination table sizes for legacy SQL queries.
--[no]append_table: When a destination table is specified, whether or not to append.
(default: 'false')
--[no]batch: Whether to run the query in batch mode.
(default: 'false')
--clustering_fields: Comma separated field names. Can only be specified with time based
partitioning. Data will be first partitioned and subsequently "clustered on these
fields.
--destination_kms_key: Cloud KMS key for encryption of the destination table data.
--destination_schema: Schema for the destination table. Either a filename or a comma-
separated list of fields in the form name[:type].
(default: '')
--destination_table: Name of destination table for query results.
(default: '')
--display_name: Display name for the created scheduled query configuration.
(default: '')
--[no]dry_run: Whether the query should be validated without executing.
--external_table_definition: Specifies a table name and either an inline table definition
or a path to a file containing a JSON table definition to use in the query. The format
is "table_name::path_to_file_with_json_def" or
"table_name::schema@format=uri@connection". Note using connection is an experimental
feature and is still under development.For example, "--
external_table_definition=Example::/tmp/example_table_def.txt" will define a table
named "Example" using the URIs and schema encoded in example_table_def.txt.;
repeat this option to specify a list of values
--[no]flatten_results: Whether to flatten nested and repeated fields in the result schema
for legacy SQL queries. If not set, the default behavior is to flatten.
--job_timeout_ms: Maximum time to run the entire script.
--label: A label to set on a query job. The format is "key:value";
repeat this option to specify a list of values
--max_child_jobs: Maximum number of child jobs to fetch results from after executing a
script. If the number of child jobs exceeds this limit, only the final result will be
displayed.
(default: '1000')
(an integer)
-n,--max_rows: How many rows to return in the result.
(default: '100')
(an integer)
--max_statement_results: Maximum number of script statements to display the results for.
(default: '100')
(an integer)
--maximum_billing_tier: The upper limit of billing tier for the query.
(an integer)
--maximum_bytes_billed: The upper limit of bytes billed for the query.
(an integer)
--min_completion_ratio: [Experimental] The minimum fraction of data that must be scanned
before a query returns. If not set, the default server value (1.0) will be used.
(a number in the range [0, 1.0])
--parameter: Either a file containing a JSON list of query parameters, or a query
parameter in the form "name:type:value". An empty name produces a positional parameter.
The type may be omitted to assume STRING: name::value or ::value. The value "NULL"
produces a null value.;
repeat this option to specify a list of values
--range_partitioning: Enables range partitioning on the table. The format should be
"field,start,end,interval". The table will be partitioned based on the value of the
field. Field must be a top-level, non-repeated INT64 field. Start, end, and interval
are INT64 values defining the ranges.
--[no]replace: If true, erase existing contents before loading new data.
(default: 'false')
--request_id: The request_id to use for the jobs.query request. Only valid when used in
combination with --rpc.
--[no]require_cache: Whether to only run the query if it is already cached.
--[no]require_partition_filter: Whether to require partition filter for queries over this
table. Only apply to partitioned table.
--[no]rpc: If true, use rpc-style query API instead of jobs.insert().
(default: 'false')
--schedule: Scheduled query schedule. If non-empty, this query requests could create a
scheduled query understand the customer project. See
https://cloud.google.com/appengine/docs/flexible/python/scheduling-jobs-with-cron-
yaml#the_schedule_format for the schedule format
--schema_update_option: Can be specified when append to a table, or replace a table
partition. When specified, the schema of the destination table will be updated with the
schema of the new data. One or more of the following options can be specified:
ALLOW_FIELD_ADDITION: allow new fields to be added
ALLOW_FIELD_RELAXATION: allow relaxing required fields to nullable;
repeat this option to specify a list of values
--script_statement_byte_budget: Maximum bytes that can be billed for any statement in a
script.
--script_statement_timeout_ms: Maximum time to complete each statement in a script.
-s,--start_row: First row to return in the result.
(default: '0')
(an integer)
--target_dataset: Target dataset used to create scheduled query.
--time_partitioning_expiration: Enables time based partitioning on the table and sets the
number of seconds for which to keep the storage for the partitions in the table. The
storage in a partition will have an expiration time of its partition time plus this
value.
(an integer)
--time_partitioning_field: Enables time based partitioning on the table and the table
will be partitioned based on the value of this field. If time based partitioning is
enabled without this value, the table will be partitioned based on the loading time.
--time_partitioning_type: Enables time based partitioning on the table and set the type.
The default value is DAY, which will generate one partition per day. Other supported
values are HOUR, MONTH, and YEAR.
--udf_resource: The URI or local filesystem path of a code file to load and evaluate
immediately as a User-Defined Function resource.;
repeat this option to specify a list of values
--[no]use_cache: Whether to use the query cache to avoid rerunning cached queries.
--[no]use_legacy_sql: Whether to use Legacy SQL for the query. If not set, the default
value is true.
absl.flags:
--flagfile: Insert flag definitions from the given file into the command line.
(default: '')
--undefok: comma-separated list of flag names that it is okay to specify on the command
line even if the program does not define a flag with that name. IMPORTANT: flags in
this list that have arguments MUST use the --flag=value format.
(default: '')
Run 'bq.py --help' to get help for global flags.
Run 'bq.py help' to see the list of available commands.
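Before running a potentially expensive query, the flags listed above can be used to validate it or cap its cost first; a minimal sketch using --dry_run and --maximum_bytes_billed, with placeholder names:
# Validate the query without executing it; reports the estimated bytes processed
$ bq query --use_legacy_sql=false --dry_run \
> 'select col1 from [DATASET_NAME].[TABLE_NAME]'
# Refuse to run the query if it would bill more than ~100 MB
$ bq query --use_legacy_sql=false --maximum_bytes_billed=100000000 \
> 'select col1 from [DATASET_NAME].[TABLE_NAME]'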
Create a dataset
$ bq mk [DATASET_NAME]
Dataset 'xxxproject-xxxx:[DATASET_NAME]' successfully created.
# Specify a location (without this, the dataset is created in the default US region)
$ bq mk -d --data_location=asia-northeast1 [DATASET_NAME]
Dataset 'xxxproject-xxxx:[DATASET_NAME]' successfully created.
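The dataset's location can be confirmed afterwards; bq show also accepts a dataset (a quick check, not part of the session above):
$ bq show --format=prettyjson [DATASET_NAME]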
List datasets
$ bq ls
datasetId
------------
[DATASET_NAME]
# To list datasets in a specific project:
$ bq ls xxxproject-xxxx:
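Passing a dataset name to bq ls lists the tables inside it instead (not captured above):
$ bq ls [DATASET_NAME]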
Create or update a table and load data
# Command arguments:
# datasetID: [DATASET_NAME]
# tableID: [TABLE_NAME]
# source: [data file, e.g. xxx.txt]
# schema: col1:string,col2:string,col3:string
# Create the table
$ bq load [DATASET_NAME].[TABLE_NAME] xxx.txt \
> col1:string,col2:string,col3:string
Upload complete.
Waiting on bqjob_xxxx ... (0s) Current status: DONE
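bq load also accepts format flags; a sketch for reloading a CSV file with a header row (these are standard bq load flags, and xxx.csv is a placeholder, not part of the session above):
# Overwrite existing contents, parse the file as CSV, and skip the header row
$ bq load --replace --source_format=CSV --skip_leading_rows=1 \
> [DATASET_NAME].[TABLE_NAME] xxx.csv \
> col1:string,col2:string,col3:string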
Inspect a table's schema
$ bq show [DATASET_NAME].[TABLE_NAME]
Table [PROJECT_NAME]:[DATASET_NAME].[TABLE_NAME]
Last modified Schema Type Total URIs Expiration Labels
----------------- --------------------- ---------- ------------ ------------ --------
28 Dec 11:39:03 |- col1: string EXTERNAL 1
|- col2: string
|- col3: string
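The schema alone can also be dumped as JSON (a variant of bq show, not run above):
$ bq show --schema --format=prettyjson [DATASET_NAME].[TABLE_NAME]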
Run a SQL query
$ bq query --use_legacy_sql=false \
> 'select
> col1,
> col2,
> col3
> from
> [DATASET_NAME].[TABLE_NAME]
> '
Waiting on bqjob_xxxx ... (0s) Current status: DONE
+--------+--------+-----------+
| col1 | col2 | col3 |
+--------+--------+-----------+
| 001 | 000001 | 123456789 |
| 001 | 000002 | 234567890 |
| 003 | 000001 | 145678908 |
| 004 | 000002 | 123456789 |
| 004 | 000010 | 098765432 |
| 005 | 000010 | 000000001 |
| 007 | 000100 | 000000010 |
| 010 | 123456 | 00000009 |
| 011 | 123456 | 111111111 |
| 100 | 234577 | 234567899 |
+--------+--------+-----------+
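Query results can be written to a table instead of stdout using --destination_table from the flag list above; a minimal sketch ([RESULT_TABLE] is a placeholder):
$ bq query --use_legacy_sql=false \
> --destination_table=[DATASET_NAME].[RESULT_TABLE] \
> 'select col1, col2, col3 from [DATASET_NAME].[TABLE_NAME]'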