Help us understand the problem. What is going on with this article?

【MySQL5.6】完全外部結合(FULL OUTER JOIN)のやりかた

More than 1 year has passed since last update.

※例となるようなSQLは一番下にあります。
SQLだけ知りたい方は、下部にある結論から読んでみてください。
下部にある結論に飛ぶ

■ はじめに

MySQLで完全外部結合FULL OUTER JOINはまだ使えません。
ですので、同じ結果を得る別の方法を考える必要があります。
なお基礎知識は本当に基礎的なことなので、読み飛ばしていただいても大丈夫です。

■ 基礎知識

まず結合について、次の2テーブルを考えます。

テーブルA (table_a)

user_id user_name
100001 Miura
100002 Suzuki
100003 Kimura

テーブルB (table_b)

user_id synonym
100002 beast
100004 lizard

▼ INNER JOIN(内部結合)する

→ 両テーブルの共通行を結合する

● SQL

SELECT 
    *
FROM
    table_a AS a
        INNER JOIN
    table_b AS b ON b.user_id = a.user_id
;

● 出力結果

user_id user_name user_id synonym
100002 Suzuki 100002 beast

● テーブルイメージ

image11.png

▼ LEFT OUTER JOIN(左外部結合)する

→左側テーブルの全行と、右側テーブルの共通行を結合する ※OUTERは省略してOK

● SQL

SELECT 
    *
FROM
    table_a AS a
        LEFT OUTER JOIN
    table_b AS b ON b.user_id = a.user_id
;

● 出力結果

user_id user_name user_id synonym
100002 Suzuki 100002 beast
100001 Miura null null
100003 Kimura null null

● テーブルイメージ

image12.png

▼ RIGHT OUTER JOIN(右外部結合)する

→ 左側テーブルの共通行と、右側テーブルの全行を結合する ※OUTERは省略してOK

● SQL

SELECT 
    *
FROM
    table_a AS a
        RIGHT OUTER JOIN
    table_b AS b ON b.user_id = a.user_id
;

● 出力結果

user_id user_name user_id synonym
100002 Suzuki 100002 beast
null null 100004 lizard

● テーブルイメージ

image13.png

▼ FULL OUTER JOIN(完全外部結合)する

→ 左側テーブルの全行と、右側テーブルの全行を結合する ※MySQLだと使えません

● SQL

SELECT 
    *
FROM
    table_a AS a
        FULL OUTER JOIN
    table_b AS b ON b.user_id = a.user_id
;

● 出力結果(たぶんこうなる)

user_id user_name user_id synonym
100002 Suzuki 100002 beast
100001 Miura null null
100003 Kimura null null
null null 100004 lizard

● テーブルイメージ

image14.png

▼ UNIONする

→ 2つのテーブルを重複抜きしながら結合する
※2つのテーブルは列数と列の内容を合わせる必要があります

● SQL

SELECT 
    a.user_id AS user_id
FROM
    table_a AS a
UNION SELECT 
    b.user_id AS user_id
FROM
    table_b AS b
;

● 出力結果

user_id
100001
100002
100003
100004

▼ UNION ALLする

→ 2つのテーブルを重複抜きせず結合する
※2つのテーブルは列数と列の内容を合わせる必要があります

● SQL

SELECT 
    a.user_id AS user_id
FROM
    table_a AS a
UNION ALL SELECT 
    b.user_id AS user_id
FROM
    table_b AS b
;

● 出力結果

user_id
100001
100002
100003
100002
100004

▼ ためしにそれっぽくUNIONしてみる

● SQL

SELECT 
    a.user_id AS user_id
    , a.user_name AS user_name
    , NULL AS synonym
FROM
    table_a AS a
UNION SELECT 
    b.user_id AS user_id
    , NULL AS user_name
    , b.synonym AS synonym
FROM
    table_b AS b
;

● 出力結果

user_id user_name synonym
100001 Miura null
100002 Suzuki null
100003 Kimura null
100002 null beast
100004 null lizard

※本当はuser_id: 100002が1行になってほしいのに2行のままなのでダメ

■ 実現方法

はじめにもどってMySQLでFULL OUTER JOINはまだ使えないので、
別の方法で実現することを考えます。ついでに実行速度も計測します。

使用したMySqlのバージョン: 5.6.32

▼ 準備

CREATE SCHEMA `sandbox`;
USE `sandbox`;
CREATE TABLE city_list_2010 (
    city_code CHAR(5) NOT NULL PRIMARY KEY,
    city_name VARCHAR(10) NOT NULL
);
CREATE TABLE city_list_2005 (
    city_code CHAR(5) NOT NULL PRIMARY KEY,
    city_name VARCHAR(10) NOT NULL
);
CREATE TABLE mutex (
    i INT NOT NULL PRIMARY KEY
);
INSERT INTO mutex values (0), (1);

・ここから2010年と2005年それぞれのファイルについて、
 『都道府県・市区町村コード』と『都道府県・市区町村名』をとってきます
 http://www.e-stat.go.jp/SG1/estat/List.do?bid=000001037709

▼ 1. UNIONと非排他なJOIN

UNIONで重複抜きしつつ結合する

SELECT 
    *
FROM
    city_list_2010 AS cl2010
        LEFT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code 
UNION SELECT 
    *
FROM
    city_list_2010 AS cl2010
        RIGHT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code
;

LEFT OUTER JOINのみ(RIGHT OUTER JOINを使わない)版

SELECT 
    cl2010.*
    , cl2005.*
FROM
    city_list_2010 AS cl2010
        LEFT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code 
UNION SELECT 
    cl2010.*
    , cl2005.*
FROM
    city_list_2005 AS cl2005
        LEFT OUTER JOIN
    city_list_2010 AS cl2010 ON cl2010.city_code = cl2005.city_code
;

▼ 2. UNION ALLと排他なJOIN

UNION ALLで結合してWHEREで重複抜きする

SELECT 
    *
FROM
    city_list_2010 AS cl2010
        LEFT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code 
UNION ALL SELECT 
    *
FROM
    city_list_2010 AS cl2010
        RIGHT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code
WHERE
    cl2010.city_code IS NULL
;

LEFT OUTER JOINのみ(RIGHT OUTER JOINを使わない)版

SELECT 
    cl2010.*
    , cl2005.*
FROM
    city_list_2010 AS cl2010
        LEFT OUTER JOIN
    city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code 
UNION ALL SELECT 
    cl2010.*
    , cl2005.*
FROM
    city_list_2005 AS cl2005
        LEFT OUTER JOIN
    city_list_2010 AS cl2010 ON cl2010.city_code = cl2005.city_code
WHERE
    cl2010.city_code IS NULL
;

▼ 3. mutex tableをつかう

SELECT 
    cl2010.*, cl2005.*
FROM
    mutex
        LEFT OUTER JOIN
    city_list_2010 AS cl2010 ON i = 0
        LEFT OUTER JOIN
    city_list_2005 AS cl2005 ON i = 1 OR cl2010.city_code = cl2005.city_code
        LEFT OUTER JOIN
    city_list_2010 AS cl2010_2 ON i = 1 AND cl2010_2.city_code = cl2005.city_code
WHERE
    cl2005.city_code IS NULL OR cl2010_2.city_code IS NULL
;

■ 実現方法のベンチマーク

▼ 1. UNIONと非排他なJOIN

$ mysqlslap --user=rhap --password --create-schema=sandbox --concurrency=50 --iterations=100 --query "SELECT * FROM city_list_2010 AS cl2010 LEFT OUTER JOIN city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code UNION SELECT * FROM city_list_2010 AS cl2010 RIGHT OUTER JOIN city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code;"
Benchmark
        Average number of seconds to run all queries: 0.565 seconds
        Minimum number of seconds to run all queries: 0.545 seconds
        Maximum number of seconds to run all queries: 0.977 seconds
        Number of clients running queries: 50
        Average number of queries per client: 1

▼ 2. UNION ALLと排他なJOIN

$ mysqlslap --user=rhap --password --create-schema=sandbox --concurrency=50 --iterations=100 --query "SELECT * FROM city_list_2010 AS cl2010 LEFT OUTER JOIN city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code UNION ALL SELECT * FROM city_list_2010 AS cl2010 RIGHT OUTER JOIN city_list_2005 AS cl2005 ON cl2010.city_code = cl2005.city_code WHERE cl2010.city_code IS NULL;"
Benchmark
        Average number of seconds to run all queries: 0.307 seconds
        Minimum number of seconds to run all queries: 0.295 seconds
        Maximum number of seconds to run all queries: 0.537 seconds
        Number of clients running queries: 50
        Average number of queries per client: 1

▼ 3. mutex tableをつかう

$ mysqlslap --user=rhap --password --create-schema=sandbox --concurrency=50 --iterations=100 --query "SELECT cl2010.*, cl2005.* FROM mutex LEFT OUTER JOIN city_list_2010 AS cl2010 ON i = 0 LEFT OUTER JOIN city_list_2005 AS cl2005 ON i = 1 OR cl2010.city_code = cl2005.city_code LEFT OUTER JOIN city_list_2010 AS cl2010_2 ON i = 1 AND cl2010_2.city_code = cl2005.city_code WHERE cl2005.city_code IS NULL OR cl2010_2.city_code IS NULL;"
Benchmark
        Average number of seconds to run all queries: 27.726 seconds
        Minimum number of seconds to run all queries: 27.470 seconds
        Maximum number of seconds to run all queries: 28.201 seconds
        Number of clients running queries: 50
        Average number of queries per client: 1

■ 結論

・MySQLでFULL OUTER JOINしたいときは
まずUNION ALLしてWHERE ... IS NULLしてみてください。

● 結合の対象となるテーブル

テーブルA (table_a)

user_id user_name
100001 Miura
100002 Suzuki
100003 Kimura

テーブルB (table_b)

user_id synonym
100002 beast
100004 lizard

● SQL

SELECT 
    a.user_id AS user_id,
    a.user_name AS user_name,
    b.synonym AS synonym
FROM
    table_a AS a
        LEFT OUTER JOIN
    table_b AS b ON b.user_id = a.user_id 
UNION ALL SELECT 
    b.user_id AS user_id,
    a.user_name AS user_name,
    b.synonym AS synonym
FROM
    table_a AS a
        RIGHT OUTER JOIN
    table_b AS b ON b.user_id = a.user_id
WHERE
    a.user_id IS NULL
ORDER BY user_id
;

LEFT OUTER JOINのみ(RIGHT OUTER JOINを使わない)版

SELECT 
    a.user_id AS user_id,
    a.user_name AS user_name,
    b.synonym AS synonym
FROM
    table_a AS a
        LEFT OUTER JOIN
    table_b AS b ON b.user_id = a.user_id 
UNION ALL SELECT 
    b.user_id AS user_id,
    a.user_name AS user_name,
    b.synonym AS synonym
FROM
    table_b AS b
        LEFT OUTER JOIN
    table_a AS a ON b.user_id = a.user_id
WHERE
    a.user_id IS NULL
ORDER BY user_id
;

● 出力結果

user_id user_name synonym
100001 Miura null
100002 Suzuki beast
100003 Kimura null
100004 null lizard

なおこの文章はどこにでも載っている情報ですが、mutex tableが試したかったから書きました
参照サイト: http://www.xaprb.com/blog/2006/05/26/how-to-write-full-outer-join-in-mysql/
※『mutex tableはUNIONをサポートしていない古いバージョン向け』と参照サイトにあるように早いわけでもありませんでした
UNION ALLしたあとuser_idでGROUP BYしてSELECTするときMAX()するとかは考えるまでもなく遅いはず
※個人的には、テーブルは左に足していくものという頭があるので、RIGHT OUTER JOINは極力使いません(というか使っているところ初めて見た

rhap
Why not register and get more from Qiita?
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away