2
1

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

StudioZ TechAdvent Calendar 2019

Day 10

mysqlに大型データを圧縮保存する

Posted at

MySQLに大型のデータを圧縮したい

経緯

同じリクエストに対して同じレスポンスを数日程度返せるようにしたい
redisに保存していたがメモリ逼迫しはじめたのでMySQLに逃がすこととする
この時、DB容量ケアの為レスポンスは圧縮データとしたい

テーブルの作成

対象カラムはlongblob型とした

CREATE TABLE IF NOT EXISTS `user_data_t` (
  `user_data_id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID',
  `user_id` int(11) unsigned NOT NULL COMMENT 'ユーザーID',
  `bin_data` longblob,
  `create_time` datetime NOT NULL COMMENT '作成日時',
  `update_time` datetime NOT NULL COMMENT '更新日時',
  `delete_flg` tinyint(4) unsigned NOT NULL DEFAULT '0' COMMENT '削除フラグ',
  PRIMARY KEY (`user_data_id`),
  UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COMMENT='テスト' AUTO_INCREMENT=1;

検証

非圧縮

		$sampleData = array(
			'hoge' => array(1,2,3),
			'huga'	=> array(3,4,5),
			'hego'	=> 11213,
			'huga1'	=> array(3,4,5),
			'huga2'	=> array(3,4,5),
			'huga3'	=> array(3,4,5),
			'huga4'	=> array(3,4,5),
			'huga5'	=> array(3,4,5),
			'huga6'	=> array(3,4,5),
			'huga7'	=> array(3,4,5),
			'huga8'	=> array(3,4,5),
			'huga9'	=> array(3,4,5),
			'hugaA'	=> array(3,4,5),
			'hugaB'	=> array(3,4,5),
			'hugaC'	=> array(3,4,5),
			'hugaD'	=> array(3,4,5),
			'hugaE'	=> array(3,4,5),
			'hugaF'	=> array(3,4,5),
			'hugaG'	=> array(3,4,5),
			'hugaH'	=> array(3,4,5),
		);
		$insertData = [
			'user_id'		=> $userId,
			'resume_type'	=> $resumeType,
			'bin_data'		=> json_encode($targetData),
		];
		$insertKey = $this->insert($insertData);

サイズ確認

mysql [xxxx]  select octet_length(bin_data) from user_data_t WHERE user_data_id = 1;
--------------
select octet_length(bin_data) from user_data_t WHERE user_data_id = 1
--------------

+------------------------+
| octet_length(bin_data) |
+------------------------+
|                    316 |
+------------------------+
1 row in set (0.00 sec)

圧縮

gzcompress

文字列を圧縮する標準メソッド(gzip圧縮とは違うらしい)
https://www.php.net/manual/ja/function.gzcompress.php

		$sampleData = array(
			'hoge' => array(1,2,3),
			'huga'	=> array(3,4,5),
			'hego'	=> 11213,
			'huga1'	=> array(3,4,5),
			'huga2'	=> array(3,4,5),
			'huga3'	=> array(3,4,5),
			'huga4'	=> array(3,4,5),
			'huga5'	=> array(3,4,5),
			'huga6'	=> array(3,4,5),
			'huga7'	=> array(3,4,5),
			'huga8'	=> array(3,4,5),
			'huga9'	=> array(3,4,5),
			'hugaA'	=> array(3,4,5),
			'hugaB'	=> array(3,4,5),
			'hugaC'	=> array(3,4,5),
			'hugaD'	=> array(3,4,5),
			'hugaE'	=> array(3,4,5),
			'hugaF'	=> array(3,4,5),
			'hugaG'	=> array(3,4,5),
			'hugaH'	=> array(3,4,5),
		);
		$insertData = [
			'user_id'		=> $userId,
			'resume_type'	=> $resumeType,
			'bin_data'		=> gzcompress(json_encode($reportData)),
		];
		$insertKey = $this->insert($insertData);

サイズ確認

mysql [xxx]  select octet_length(bin_data) from user_data_t WHERE user_data_id = 2;
--------------
select octet_length(bin_data) from user_data_t WHERE user_data_id = 2
--------------

+------------------------+
| octet_length(bin_data) |
+------------------------+
|                     95 |
+------------------------+
1 row in set (0.00 sec)

圧縮結果

実際のarrayデータから1/3のサイズで保存できた

スピード

上記insertを100回で計測

非圧縮

float(3.0973808765411)
float(3.1544151306152)
float(3.0928800106049)
float(3.3873400688171)
float(3.2760629653931)

圧縮

float(3.1296010017395)
float(3.3456451892853)
float(3.3879458904266)
float(3.1482610702515)
float(3.1463890075684)

結論

10%ほど遅くなるか。inpuptするデータサイズにもよりそう?

2
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
1

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?