概要
SF映画の愛好家の多くは、「地球上に異星人はいるのか」という疑問に興味を持ちます。私自身、データサイエンスの実践者であり、地球上のエイリアンの詳細がわかるデータセットがないかとよく考えていました。今回、Senckenberg World of Biodiversityが所有するSenckenberg(メタ)データポータルで、その1つを見つけることができ、嬉しく思っています。
ゼンケンベルグ(メタ)データポータルにあるように、このデータセットは藻類から哺乳類まで、世界中の外来種を詳細にまとめたものです。早速ですが、分析に入りましょう。この分析には、GridDB CloudとPythonを使用する予定です。
jupyterファイルへのリンクはこちらです:
https://github.com/griddbnet/Blogs/tree/alien_life
全体的な前提
データセットをSenckenberg(メタ)データポータルからダウンロードし、GridDBのWebApiを使用してGridDB Cloudにアップロードする予定です。そして、GridDBのSQL機能を利用して、Pythonで分析を行います。以下は、そのアプローチの概要です。
- Pythonによるデータ準備
- GridDBで使用するContainer構造の作成
- GridDB WebApiを使ってGridDB CloudにContainerをセットアップする
- GridDBでコンテナにデータを追加する
- GridDBクラウドからデータを取得する
- 分析フェーズ - GridDBのSQLエンジンの力を利用する
前提条件: i. Web apiによるGridDBのセットアップ ii. クラスタの作成 i と ii の詳細は - https://griddb.net/en/blog/an-introduction-to-griddb-cloud/ に記載されています。iii. python環境(バージョン3以上)と任意のIDE。iv. データセットは https://dataportal.senckenberg.de/dataset/global-alien-species-first-record-database/resource/bfcc1603-c923-4299-a92a-3168a072d2a4 からダウンロードします。
Pythonによるデータ準備
データ準備では、まずSenckenberg(メタ)データポータルからダウンロードしたデータセットを読み込むことから始めます。また、GridDB WebAPI用のコンテナ構造の作成も含まれます。
# Read data from the downloaded dataset
import pandas as pd
Alien_species_global_trends = pd.read_excel("Alien_Species.xlsx", sheet_name=1, header=0,keep_default_na=False, index_col=None)
# Extracting column headers for the GridDB WebApi Container structure
col_vals = (list(Alien_species_global_trends.columns.values))
col_dtypes = (list(Alien_species_global_trends.dtypes))
data_tuples = list(zip(col_vals,col_dtypes)) # Binding lists to a tuple
# Converting list of tuples to pandas dataframe
container_columns = pd.DataFrame(data_tuples, columns=['col_vals','col_datatypes'])
container_columns.to_csv('container_columns.csv',index=False)
こうして作成されたデータフレームには、すべてのヘッダとそのデータ型が含まれます。この手順は、多くのヘッダを持つデータセットがある場合に有効です。GridDBのデータ型は、Pythonのデータ型とは異なることに注意してください。例えば、カラム 'Bryophytes' は Python では 'object' ですが、GridDB では 'STRING' です。
GridDB WebAPIを使用したGridDBのContainer構造の作成
前提条件として、GridDBにおけるコンテナについて詳しく知るには、https://docs.griddb.net/ja/ を読んでください。
クラスタ内にコンテナを作成するための基本的な構造は以下の通りです。
curl --location --request POST :GridDBCloudURL/:cluster/dbs/:database/containers/ \ --header 'Authorization: Basic base encoded version of username:password \ --header 'Content-Type: application/json; charset=UTF-8' \ --data '{
"container_name": “Container Name",
"container_type": “Container Type",
"rowkey": true,
"columns": [
{
"name": "timestamp",
"type": "TIMESTAMP"
},
{
"name": “HEADER_1",
"type": "BOOL"
},
{
"name": “HEADER_N",
"type": “INTEGER"
}
]
}'
以下は、Pythonによるコード・スニペットです。
GridDB CloudでWebApiを使ってコンテナをセットアップする
import requests # to make http calls
import http
http.client.HTTPConnection.debuglevel = 1 #Tip - This command enables logging of http calls
#Construct an object to hold the request headers
header_obj = {"Authorization":"Basic U3ViaGE6Yjk+RWtDUA==","Content-Type":"application/json; charset=UTF-8","User-Agent":"PostmanRuntime/7.29.0"}
#Construct an object to hold the request body (i.e., the container that needs to be created)
data_obj = {
"container_name": "Alien_Species_Analysis",
"container_type": "COLLECTION",
"rowkey": True,
"columns": [
{
"name": "Year",
"type": "INTEGER"
},
{
"name": "All",
"type": "INTEGER"
},
{
"name": "VascularPlants",
"type": "STRING"
},
{
"name": "Bryophytes",
"type": "STRING"
},
{
"name": "Algae",
"type": "STRING"
},
{
"name": "Mammals",
"type": "STRING"
},
{
"name": "Birds",
"type": "STRING"
},
{
"name": "Reptiles",
"type": "STRING"
},
{
"name": "Amphibians",
"type": "STRING"
},
{
"name": "Fishes",
"type": "STRING"
},
{
"name": "Insects",
"type": "STRING"
},
{
"name": "Molluscs",
"type": "STRING"
},
{
"name": "Crustaceans",
"type": "STRING"
},
{
"name": "Arachnids",
"type": "STRING"
},
{
"name": "Arthropods",
"type": "STRING"
},
{
"name": "Invertebrates",
"type": "STRING"
},
{
"name": "Fungi",
"type": "STRING"
},
{
"name": "BacteriaProtozoans",
"type": "STRING"
}
]
}
#Set up the GridDB WebAPI URL
url = 'https://cloud1.griddb.com/trial1302/griddb/v2/gs_clustertrial1302/dbs/public/containers'
#Invoke the GridDB WebAPI with the headers and the request body
x = requests.post(url, json = data_obj, headers = header_obj})
GridDBでコンテナにデータを追加する
次のステップは、新しく作成したコンテナ('Alien_Species_Analysis')にデータを追加することです。GridDB WebAPIはこれを実現するためのAPIを提供しています。
コンテナ内に行を作成するための基本的な構造は以下の通りです。
curl --location --request PUT :GridDBCloudURL/:cluster/dbs/:database/containers/:container/rows \ --header 'Authorization: Basic base encoded version of username:password \ --header 'Content-Type: application/json; charset=UTF-8' \ --data '[
[
"Year": "0",
"All": "1503",
"Vascular plants": "2",
"Bryophytes": "",
"Algae": "",
"Mammals": "",
"Birds": "2",
"Reptiles": "",
"Amphibians": "",
"Fishes": "",
"Insects": "",
"Molluscs": "",
"Crustaceans": "",
"Arachnids": "",
"Arthropods p.p. (Myriapods, Diplopods etc.)": "",
"Invertebrates (excl. Arthropods, Molluscs)": "",
"Fungi": "",
"Bacteria and protozoans": ""
]
]'
詳しくは、http://www.toshiba-sol.co.jp/en/pro/griddb/docs-en/v4_3/GridDB_Web_API_Reference.html#%E3%83%AD%E3%82%A6%E7%99%BB%E9%8C%B2 をご覧ください。
以下は、Pythonによるコード・スニペットです。
ヒント:pandasの'to_json'関数は、データフレームのデータをjson構造体に変換するために使用されることに注意してください。
#Convert the data in the dataframe to the JSON format
Alien_species_global_trends_json = Alien_species_global_trends.to_json(orient='values')
request_body = Alien_species_global_trends_json
#Setup the URL to be used to invoke the GridDB WebAPI to register rows in the container created previously
url = 'https://cloud1.griddb.com/trial1302/griddb/v2/gs_clustertrial1302/dbs/public/containers/Alien_Species_Analysis/rows'
#Invoke the GridDB WebAPI using the request constructed
x = requests.put(url, data=request_body, headers=header_obj)
send: b'PUT /trial1302/griddb/v2/gs_clustertrial1302/dbs/public/containers/Alien_Species_Analysis/rows HTTP/1.1\r\nHost: cloud1.griddb.com\r\nUser-Agent: PostmanRuntime/7.29.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: Basic U3ViaGE6Yjk+RWtDUA==\r\nContent-Type: application/json; charset=UTF-8\r\nContent-Length: 24016\r\n\r\n'
send: b'[[1503,2,"","","",2,"","","","","","","","","","","",""],[1504,2,1,"","",1,"","","","","","","","","","","",""],[1508,1,1,"","","","","","","","","","","","","","",""],[1509,3,1,"","",1,"","","","","",1,"","","","","",""],[1511,10,"","","",10,"","","","","","","","","","","",""],[1512,2,"","","",1,1,"","","","","","","","","","",""],[1513,2,"","","",1,1,"","","","","","","","","","",""],[1515,1,"","","",1,"","","","","","","","","","","",""],[1517,1,1,"","","","","","","","","","","","","","",""],[1520,2,"","","",1,1,"","","","","","","","","","",""],[1521,2,"","","",1,1,"","","","","","","","","","",""],[1523,1,1,"","","","","","","","","","","","","","",""],[1525,1,"","","",1,"","","","","","","","","","","",""],[1526,2,1,"","",1,"","","","","","","","","","","",""],[1529,1,"","","","",1,"","","","","","","","","","",""],[1530,3,"","","",1,2,"","","","","","","","","","",""],[1531,3,"","","","",3,"","","","","","","","","","",""],[1532,5,5,"","","","","","","","","","","","","","",""],[1535,30,24,"","",6,"","","","","","","","","","","",""],[1536,2,"","","",2,"","","","","","","","","","","",""],[1538,6,6,"","","","","","","","","","","","","","",""],[1544,2,"","","",2,"","","","","","","","","","","",""],[1547,1,"","","",1,"","","","","","","","","","","",""],[1548,26,26,"","","","","","","","","","","","","","",""],[1550,16,4,"","",9,3,"","","","","","","","","","",""],[1551,8,8,"","","","","","","","","","","","","","",""],[1553,7,7,"","","","","","","","","","","","","","",""],[1554,2,2,"","","","","","","","","","","","","","",""],[1557,2,1,"","","",1,"","","","","","","","","","",""],[1560,2,"","","",1,"","","",1,"","","","","","","",""],[1561,1,1,"","","","","","","","","","","","","","",""],[1562,11,11,"","","","","","","","","","","","","","",""],[1563,4,3,"","",1,"","","","","","","","","","","",""],[1565,3,2,"","","","","","",1,"","","","","","","",""],[1568,2,2,"","","","","","","","","","","","","","",""],[1570,10,10,"","","","","","","","","","","","","","",""],[1572,1,"","","","",1,"","","","","","","","","","",""],[1574,2,"","","",2,"","","","","","","","","","","",""],[1575,1,"","","","",1,"","","","","","","","","","",""],[1576,3,3,"","","","","","","","","","","","","","",""],[1577,2,1,"","","",1,"","","","","","","","","","",""],[1578,3,2,"","","",1,"","","","","","","","","","",""],[1579,2,1,"","","",1,"","","","","","","","","","",""],[1580,2,1,"","",1,"","","","","","","","","","","",""],[1581,1,1,"","","","","","","","","","","","","","",""],[1583,3,3,"","","","","","","","","","","","","","",""],[1584,1,1,"","","","","","","","","","","","","","",""],[1585,1,1,"","","","","","","","","","","","","","",""],[1588,1,1,"","","","","","","","","","","","","","",""],[1589,4,4,"","","","","","","","","","","","","","",""],[1590,3,"","","",1,2,"","","","","","","","","","",""],[1595,1,"","","","",1,"","","","","","","","","","",""],[1596,3,2,"","",1,"","","","","","","","","","","",""],[1597,45,44,"","","",1,"","","","","","","","","","",""],[1598,2,"","","",2,"","","","","","","","","","","",""],[1599,5,2,"","",2,1,"","","","","","","","","","",""],[1600,58,44,"","",4,3,"","",1,2,4,"","","","","",""],[1601,2,"","","",1,1,"","","","","","","","","","",""],[1602,1,"","","","","","","","","",1,"","","","","",""],[1604,1,1,"","","","","","","","","","","","","","",""],[1605,2,1,"","",1,"","","","","","","","","","","",""],[1606,5,4,"","","",1,"","","","","","","","","","",""],[1607,3,2,"","",1,"","","","","","","","","","","",""],[1610,3,1,"","",1,1,"","","","","","","","","","",""],[1611,1,1,"","","","","","","","","","","","","","",""],[1612,1,1,"","","","","","","","","","","","","","",""],[1613,6,5,"","",1,"","","","","","","","","","","",""],[1614,2,2,"","","","","","","","","","","","","","",""],[1615,1,"","","","",1,"","","","","","","","","","",""],[1616,1,"","","",1,"","","","","","","","","","","",""],[1617,18,"","","",1,"","","","",17,"","","","","","",""],[1618,2,1,"","","","","","","",1,"","","","","","",""],[1620,4,"","","",1,"","","","",3,"","","","","","",""],[1621,1,"","","","",1,"","","","","","","","","","",""],[1622,1,"","","",1,"","","","","","","","","","","",""],[1625,3,3,"","","","","","","","","","","","","","",""],[1626,4,2,"","",2,"","","","","","","","","","","",""],[1628,1,1,"","","","","","","","","","","","","","",""],[1629,9,9,"","","","","","","","","","","","","","",""],[1630,1,1,"","","","","","","","","","","","","","",""],[1631,1,1,"","","","","","","","","","","","","","",""],[1632,9,9,"","","","","","","","","","","","","","",""],[1633,13,13,"","","","","","","","","","","","","","",""],[1634,6,4,"","","","","","",2,"","","","","","","",""],[1635,2,1,"","","","","","","",1,"","","","","","",""],[1638,1,1,"","","","","","","","","","","","","","",""],[1639,1,"","","",1,"","","","","","","","","","","",""],[1640,6,5,"","","",1,"","","","","","","","","","",""],[1641,1,1,"","","","","","","","","","","","","","",""],[1642,12,10,"","","",2,"","","","","","","","","","",""],[1644,2,2,"","","","","","","","","","","","","","",""],[1645,3,3,"","","","","","","","","","","","","","",""],[1648,1,1,"","","","","","","","","","","","","","",""],[1649,1,"","","",1,"","","","","","","","","","","",""],[1650,29,10,"","",8,11,"","","","","","","","","","",""],[1651,2,2,"","","","","","","","","","","","","","",""],[1652,8,6,"","","",2,"","","","","","","","","","",""],[1653,1,1,"","","","","","","","","","","","","","",""],[1654,5,1,"","",4,"","","","","","","","","","","",""],[1656,5,3,"","",2,"","","","","","","","","","","",""],[1657,1,"","","","","","","",1,"","","","","","","",""],[1658,33,30,"","",3,"","","","","","","","","","","",""],[1659,2,1,"","",1,"","","","","","","","","","","",""],[1660,7,6,"","",1,"","","","","","","","","","","",""],[1661,14,14,"","","","","","","","","","","","","","",""],[1662,10,10,"","","","","","","","","","","","","","",""],[1663,1,1,"","","","","","","","","","","","","","",""],[1665,2,2,"","","","","","","","","","","","","","",""],[1666,5,5,"","","","","","","","","","","","","","",""],[1668,1,1,"","","","","","","","","","","","","","",""],[1669,2,"","","","",1,1,"","","","","","","","","",""],[1670,11,7,"","",1,"","","","",3,"","","","","","",""],[1672,19,17,"","",2,"","","","","","","","","","","",""],[1673,4,"","","","",2,"","","",2,"","","","","","",""],[1674,1,1,"","","","","","","","","","","","","","",""],[1675,1,"","","",1,"","","","","","","","","","","",""],[1677,3,3,"","","","","","","","","","","","","","",""],[1678,3,1,"","",1,1,"","","","","","","","","","",""],[1679,2,1,"","",1,"","","","","","","","","","","",""],[1680,6,4,"","",1,1,"","","","","","","","","","",""],[1683,2,2,"","","","","","","","","","","","","","",""],[1684,1,1,"","","","","","","","","","","","","","",""],[1685,12,9,"","",3,"","","","","","","","","","","",""],[1686,2,1,"","",1,"","","","","","","","","","","",""],[1688,8,8,"","","","","","","","","","","","","","",""],[1689,5,4,"","","","","","",1,"","","","","","","",""],[1690,10,7,"","",1,2,"","","","","","","","","","",""],[1693,1,"","","","","","","","","","","","","",1,"",""],[1694,10,9,"","","",1,"","","","","","","","","","",""],[1695,1,1,"","","","","","","","","","","","","","",""],[1696,4,3,"","","","","",1,"","","","","","","","",""],[1697,2,1,"","","","","","",1,"","","","","","","",""],[1699,12,11,"","","","","","","",1,"","","","","","",""],[1700,142,104,"","",13,6,"","",3,11,4,1,"","","","",""],[1701,10,5,"","",3,"","","",2,"","","","","","","",""],[1702,2,2,"","","","","","","","","","","","","","",""],[1703,1,1,"","","","","","","","","","","","","","",""],[1704,4,4,"","","","","","","","","","","","","","",""],[1705,3,1,"","",2,"","","","","","","","","","","",""],[1707,1,"","","",1,"","","","","","","","","","","",""],[1708,1,1,"","","","","","","","","","","","","","",""],[1709,3,2,"","",1,"","","","","","","","","","","",""],[1710,4,"","","","",4,"","","","","","","","","","",""],[1711,3,3,"","","","","","","","","","","","","","",""],[1713,7,7,"","","","","","","","","","","","","","",""],[1714,1,"","","","",1,"","","","","","","","","","",""],[1715,1,"","","","",1,"","","","","","","","","","",""],[1716,3,3,"","","","","","","","","","","","","","",""],[1718,1,1,"","","","","","","","","","","","","","",""],[1719,3,3,"","","","","","","","","","","","","","",""],[1720,6,4,"","",1,1,"","","","","","","","","","",""],[1722,2,1,"","",1,"","","","","","","","","","","",""],[1723,5,5,"","","","","","","","","","","","","","",""],[1724,11,11,"","","","","","","","","","","","","","",""],[1725,6,1,"","",3,1,"","","",1,"","","","","","",""],[1726,1,1,"","","","","","","","","","","","","","",""],[1727,1,"","","",1,"","","","","","","","","","","",""],[1728,3,2,"","","",1,"","","","","","","","","","",""],[1729,1,1,"","","","","","","","","","","","","","",""],[1730,6,2,"","",2,1,"","","","",1,"","","","","",""],[1731,1,1,"","","","","","","","","","","","","","",""],[1732,14,11,"","","",3,"","","","","","","","","","",""],[1733,3,2,"","","",1,"","","","","","","","","","",""],[1734,1,1,"","","","","","","","","","","","","","",""],[1735,2,1,"","",1,"","","","","","","","","","","",""],[1736,2,2,"","","","","","","","","","","","","","",""],[1737,8,7,"","","","","","",1,"","","","","","","",""],[1738,4,2,"","",2,"","","","","","","","","","","",""],[1739,3,"","","","",3,"","","","","","","","","","",""],[1740,10,1,"","","",1,"","","",8,"","","","","","",""],[1741,8,8,"","","","","","","","","","","","","","",""],[1742,1,"","","",1,"","","","","","","","","","","",""],[1743,20,20,"","","","","","","","","","","","","","",""],[1744,10,10,"","","","","","","","","","","","","","",""],[1745,10,10,"","","","","","","","","","","","","","",""],[1746,4,3,"","",1,"","","","","","","","","","","",""],[1747,4,2,"","",1,1,"","","","","","","","","","",""],[1748,4,3,"","","",1,"","","","","","","","","","",""],[1749,2,2,"","","","","","","","","","","","","","",""],[1750,52,2,"","",22,26,"","",1,1,"","","","","","",""],[1752,1,1,"","","","","","","","","","","","","","",""],[1753,4,4,"","","","","","","","","","","","","","",""],[1754,3,1,"","","",1,"","",1,"","","","","","","",""],[1755,3,3,"","","","","","","","","","","","","","",""],[1756,1,1,"","","","","","","","","","","","","","",""],[1757,3,2,"","",1,"","","","","","","","","","","",""],[1758,1,1,"","","","","","","","","","","","","","",""],[1759,1,1,"","","","","","","","","","","","","","",""],[1760,24,20,"","",1,2,"",1,"","","","","","","","",""],[1761,3,2,"","","","","","",1,"","","","","","","",""],[1762,12,9,"","",1,1,"","","","","","","","",1,"",""],[1763,10,8,"","",1,"","","",1,"","","","","","","",""],[1764,9,2,"","",6,"","","","","","",1,"","","","",""],[1765,5,2,"","","",2,"","","","",1,"","","","","",""],[1767,11,6,"","",2,3,"","","","","","","","","","",""],[1768,7,5,"","",1,1,"","","","","","","","","","",""],[1769,7,4,"","",1,1,"",1,"","","","","","","","",""],[1770,25,11,"","",5,5,"","","",4,"","","","","","",""],[1771,2,1,"","","",1,"","","","","","","","","","",""],[1772,26,23,"","",2,1,"","","","","","","","","","",""],[1773,8,3,"","",1,4,"","","","","","","","","","",""],[1774,5,1,"","",4,"","","","","","","","","","","",""],[1775,21,19,"","",2,"","","","","","","","","","","",""],[1776,3,2,"","","","","","","",1,"","","","","","",""],[1777,16,14,"","",2,"","","","","","","","","","","",""],[1778,21,10,"","",7,"","",1,"",1,2,"","","","","",""],[1779,6,3,"","",3,"","","","","","","","","","","",""],[1780,10,7,"","",1,1,"","","",1,"","","","","","",""],[1781,4,3,"","",1,"","","","","","","","","","","",""],[1782,4,4,"","","","","","","","","","","","","","",""],[1783,3,3,"","","","","","","","","","","","","","",""],[1784,7,6,"","",1,"","","","","","","","","","","",""],[1785,14,12,1,"",1,"","","","","","","","","","","",""],[1786,7,6,"","",1,"","","","","","","","","","","",""],[1787,13,11,"","",1,"","","","",1,"","","","","","",""],[1788,18,8,"","",6,4,"","","","","","","","","","",""],[1789,8,4,"","",1,1,"","","","",2,"","","","","",""],[1790,21,9,"","",4,7,"","","",1,"","","","","","",""],[1791,47,41,"","",1,1,"","","",4,"","","","","","",""],[1792,14,14,"","","","","","","","","","","","","","",""],[1793,13,10,"","",3,"","","","","","","","","","","",""],[1794,25,22,"","",1,1,"","","",1,"","","","","","",""],[1795,10,8,"","",1,"","","","",1,"","","","","","",""],[1796,12,10,"","","",1,"","","",1,"","","","","","",""],[1797,16,12,"","",3,1,"","","","","","","","","","",""],[1798,14,12,"","",1,1,"","","","","","","","","","",""],[1799,29,22,"","",2,4,"","","",1,"","","","","","",""],[1800,122,65,"","",22,8,1,"",1,19,5,"","","",1,"",""],[1801,21,11,"","","",2,"",1,"",2,4,1,"","","","",""],[1802,24,19,"","","","","",1,"",4,"","","","","","",""],[1803,24,19,"","",2,"",1,"","",1,1,"","","","","",""],[1804,41,38,"","",1,"","",2,"","","","","","","","",""],[1805,26,22,"","","","",2,"","",1,"",1,"","","","",""],[1806,22,20,"","","","",1,1,"","","","","","","","",""],[1807,14,12,"","",1,"",1,"","","","","","","","","",""],[1808,12,8,"",1,"","",1,"","","",1,1,"","","","",""],[1809,27,22,"","",2,1,1,"",1,"","","","","","","",""],[1810,23,10,"","",6,6,"","","",1,"","","","","","",""],[1811,25,24,"","","","","","","","","",1,"","","","",""],[1812,8,7,"","","",1,"","","","","","","","","","",""],[1813,61,55,"","",1,3,"","","",2,"","","","","","",""],[1814,15,14,"","",1,"","","","","","","","","","","",""],[1815,16,11,"","",4,1,"","","","","","","","","","",""],[1816,16,16,"","","","","","","","","","","","","","",""],[1817,26,24,"","",1,"","","","",1,"","","","","","",""],[1818,24,22,"","",1,1,"","","","","","","","","","",""],[1819,47,44,"","",3,"","","","","","","","","","","",""],[1820,55,32,"","",6,6,1,1,1,7,1,"","","","","",""],[1821,65,64,"","","","","","","","",1,"","","","","",""],[1822,15,14,"","","","","",1,"","","","","","","","",""],[1823,22,19,"","",2,"","","","",1,"","","","","","",""],[1824,39,32,"","",2,1,1,"","","",3,"","","","","",""],[1825,55,47,"","",3,"","","","",5,"","","","","","",""],[1826,41,32,"","",2,2,"","","",3,2,"","","","","",""],[1827,42,34,"","","",3,"","","",3,"",2,"","","","",""],[1828,19,18,"","","","","","","",1,"","","","","","",""],[1829,17,9,"","",4,1,"","","",3,"","","","","","",""],[1830,59,51,"","",6,2,"","","","","","","","","","",""],[1831,14,10,"","",1,"","","",2,1,"","","","","","",""],[1832,49,37,"",1,6,1,"","",1,2,"",1,"","","","",""],[1833,32,26,"","",1,1,"",1,"",2,1,"","","","","",""],[1834,69,65,"","","","","",1,"",3,"","","","","","",""],[1835,57,43,"","",2,2,"","","",7,2,1,"","","","",""],[1836,40,35,"","",1,1,"",1,"",2,"","","","","","",""],[1837,63,55,"","",2,"","",1,"",4,"","","",1,"","",""],[1838,215,208,"","",1,2,"","","",4,"","","","","","",""],[1839,35,23,"",1,1,4,1,"","",4,1,"","","","","",""],[1840,89,56,"","",7,8,1,1,"",8,5,1,"","",1,1,""],[1841,41,29,"","",2,"","","","",9,1,"","","","","",""],[1842,107,104,"","","",2,"","","",1,"","","","","","",""],[1843,63,62,"","",1,"","","","","","","","","","","",""],[1844,82,68,"","",1,1,"",2,"",2,"",7,"","","",1,""],[1845,71,59,"","",4,4,"","","","",1,"","","","",3,""],[1846,71,60,"","",3,2,"","","",2,4,"","","","","",""],[1847,78,65,"","",1,2,"","","",8,2,"","","","","",""],[1848,91,80,1,"",1,"","","","",7,1,"","","",1,"",""],[1849,54,51,"","","",1,1,"","",1,"","","","","","",""],[1850,213,99,"",1,30,45,"","",1,27,5,2,1,"",1,1,""],[1851,66,49,"","",6,4,"","","",7,"","","","","","",""],[1852,70,57,"","",2,4,"","","",6,"","",1,"","","",""],[1853,69,50,"","",2,4,"","",2,6,4,1,"","","","",""],[1854,67,48,"","",4,5,1,"",1,6,1,1,"","","","",""],[1855,54,42,"","",2,6,"","","",2,1,"","","",1,"",""],[1856,39,21,"","",2,5,"","",2,6,2,"","","",1,"",""],[1857,79,72,"","",1,2,"","",1,1,"",1,"","",1,"",""],[1858,94,74,"",1,6,4,"","","",5,1,2,"","",1,"",""],[1859,100,80,"","","",3,"","",2,7,3,"",3,1,"",1,""],[1860,116,58,"","",14,19,2,1,2,15,3,"",1,"",1,"",""],[1861,72,57,"","",1,9,"","",1,3,"","",1,"","","",""],[1862,106,81,1,"",6,12,1,"",1,"",2,1,"","","",1,""],[1863,52,34,"","",1,8,"",1,2,5,"","","","",1,"",""],[1864,97,81,"","",3,6,"","",1,4,1,1,"","","","",""],[1865,131,106,"","",2,10,1,"",1,7,1,"","","",1,2,""],[1866,194,177,"","",1,6,"","",1,4,4,1,"","","","",""],[1867,161,132,"","",5,7,1,2,4,7,1,"",1,"",1,"",""],[1868,123,103,"","","",6,"","",1,12,1,"","","","","",""],[1869,131,109,"","",4,10,1,"","",6,1,"","","","","",""],[1870,167,104,"","",25,8,"",1,4,19,3,1,"","",1,1,""],[1871,93,83,"","",2,"","","",2,5,1,"","","","","",""],[1872,124,87,"","",2,7,1,"",5,19,2,1,"","","","",""],[1873,68,42,"","",3,3,1,"",4,10,1,"",2,1,1,"",""],[1874,106,83,"","",6,1,1,"",1,9,2,1,"","",2,"",""],[1875,145,96,"","",4,8,2,3,1,29,"","","","",1,1,""],[1876,158,125,"","",2,4,3,"",4,16,1,"",2,"","",1,""],[1877,110,68,"","",3,3,1,1,10,18,2,1,1,"",1,1,""],[1878,127,88,"","",3,6,"","",6,21,1,1,1,"","","",""],[1879,153,105,1,"",1,5,"",3,3,21,8,1,"","",4,1,""],[1880,205,115,"","",15,19,3,2,7,33,2,1,2,"",3,3,""],[1881,181,154,"","",3,1,"",1,1,17,3,"","","",1,"",""],[1882,190,164,"","",5,5,"","",8,7,"","","","",1,"",""],[1883,158,125,"",1,9,5,2,1,6,5,1,"",1,"",2,"",""],[1884,172,127,"","",6,6,"",1,6,19,1,"","","",6,"",""],[1885,198,154,"","",7,11,"",3,7,12,"","","","",2,2,""],[1886,160,120,"","",2,9,"",1,5,19,2,1,"","","",1,""],[1887,165,124,"","",3,6,1,1,3,21,2,2,"","",2,"",""],[1888,183,148,"",1,7,6,"","",7,11,"","","","",3,"",""],[1889,108,70,"","","",4,"","",4,19,"","","",1,10,"",""],[1890,198,116,"","",9,20,"",1,10,31,7,3,"","",1,"",""],[1891,157,125,1,"",2,5,"",1,6,13,2,1,"","",1,"",""],[1892,124,93,1,"",3,1,"",1,3,18,2,1,"","","",1,""],[1893,142,111,"","",2,2,"","",1,15,8,"",1,"","",2,""],[1894,254,218,"",1,4,4,1,"",6,18,1,"",1,"","","",""],[1895,222,147,"",1,12,9,"",2,5,38,5,"","",1,1,1,""],[1896,201,151,"",2,4,6,"","",5,19,4,5,"",3,"",2,""],[1897,172,137,"","",3,4,1,1,4,18,4,"","","","","",""],[1898,221,178,"",1,1,7,"",1,3,20,7,2,1,"","","",""],[1899,230,144,"",2,5,7,1,2,5,54,4,2,"","",2,2,""],[1900,422,185,"",3,28,27,5,2,20,120,14,3,1,"",10,4,""],[1901,131,92,"","",5,3,1,"",3,22,2,"","",1,2,"",""],[1902,220,152,"",6,3,4,"","",7,35,3,1,1,3,1,3,1],[1903,250,202,"",3,3,10,1,"",1,22,2,2,2,1,1,"",""],[1904,204,134,"",2,8,10,1,2,13,27,1,"","","",1,5,""],[1905,232,138,"",5,7,8,3,"",14,43,5,2,1,"",6,"",""],[1906,216,136,"",1,1,8,1,"",7,52,5,"",1,1,"",3,""],[1907,185,121,"",5,5,2,1,"",5,42,"",1,"",1,"",2,""],[1908,175,111,1,3,3,4,"","",7,40,2,2,"","","",2,""],[1909,228,167,"","",3,2,3,"",1,41,3,2,"","",3,3,""],[1910,289,183,1,2,25,20,1,3,7,43,2,1,"","",1,"",""],[1911,192,120,"",2,8,4,1,"",4,44,4,"",1,"",2,2,""],[1912,223,144,"","",9,5,"",1,4,45,2,6,2,"",4,1,""],[1913,217,147,"","",4,9,2,"",3,42,5,2,2,"",1,"",""],[1914,216,144,"","",8,"","",1,9,51,"",1,"","",2,"",""],[1915,217,131,"","",4,7,1,2,6,53,3,1,1,"",6,1,""],[1916,205,144,"","",3,2,"",2,4,46,1,2,"","",1,"",""],[1917,157,103,"","",1,5,1,"","",43,2,"","","",1,1,""],[1918,158,112,1,"",2,4,"",1,1,32,2,"","","",2,1,""],[1919,130,83,1,1,1,7,"",1,2,27,3,1,1,"",2,"",""],[1920,299,144,"",1,26,31,4,4,10,69,5,1,"","",2,1,1],[1921,160,68,"",1,4,4,1,"",12,55,1,4,"","",9,1,""],[1922,208,120,"",3,3,13,2,2,11,41,2,5,1,"",5,"",""],[1923,238,151,"",2,4,6,1,"",3,63,1,1,1,"",3,2,""],[1924,219,97,"",2,7,6,1,2,17,51,1,18,2,2,8,5,""],[1925,255,139,"","",2,21,1,1,16,65,1,5,"","",3,1,""],[1926,216,130,"",4,3,10,1,"",4,57,2,3,"","",1,"",1],[1927,220,114,"",1,7,9,3,1,16,51,4,3,1,"",5,5,""],[1928,229,120,1,"",7,15,1,1,14,50,8,6,"","",5,1,""],[1929,270,130,2,1,16,17,2,"",21,52,5,12,2,"",9,1,""],[1930,327,146,1,3,30,16,2,2,25,72,3,13,2,3,8,1,""],[1931,260,147,"",2,14,13,"",2,9,58,4,7,"","",3,1,""],[1932,305,186,1,1,18,6,4,5,6,64,4,5,"","",3,2,""],[1933,253,115,2,1,14,10,"",3,5,77,2,11,"",1,12,"",""],[1934,252,146,"",1,7,9,1,2,10,59,6,6,"",1,3,1,""],[1935,274,131,"",2,14,16,2,7,17,56,11,6,"","",8,3,""],[1936,209,116,"",1,8,6,1,2,6,55,5,3,1,"",3,2,""],[1937,267,136,"",1,11,13,"",3,16,65,6,6,"",1,8,1,""],[1938,224,125,"",3,10,22,1,2,10,44,2,2,"","",2,"",1],[1939,246,119,"",3,12,12,4,2,12,67,6,3,1,"",3,2,""],[1940,275,155,1,"",18,13,3,5,17,40,"",8,1,"",10,3,""],[1941,144,87,1,1,6,4,4,1,3,27,1,2,"","",6,"",""],[1942,153,89,1,1,2,"",1,1,10,40,1,1,"",1,3,2,""],[1943,146,79,1,1,3,6,"","",10,37,2,3,"",1,2,1,""],[1944,186,123,2,7,5,4,"",1,5,30,"",2,1,"",5,1,""],[1945,205,108,"",2,7,14,1,1,9,55,2,1,"","",2,3,""],[1946,212,99,2,1,6,24,"","",10,49,5,5,1,"",7,3,""],[1947,206,146,"",2,3,3,1,"",5,33,1,5,"","",5,2,""],[1948,218,98,2,20,8,2,2,"",18,40,5,5,1,2,10,4,""],[1949,196,85,"",3,5,6,1,1,15,57,7,7,1,"",4,3,1],[1950,377,145,1,6,24,21,2,1,34,99,10,12,3,1,11,5,1],[1951,233,91,1,2,7,9,2,2,27,68,2,10,1,2,5,4,""],[1952,286,125,3,2,14,7,4,2,12,76,8,10,6,3,14,"",""],[1953,259,113,1,"",15,7,2,2,19,74,3,8,3,"",7,4,1],[1954,327,165,1,1,24,5,1,1,36,74,5,4,3,"",5,2,""],[1955,333,134,3,2,6,33,3,3,46,52,14,16,2,3,12,4,""],[1956,289,135,"",2,21,13,4,"",33,55,2,14,2,"",8,"",""],[1957,312,138,1,8,15,14,1,"",29,86,4,9,1,"",6,"",""],[1958,245,113,2,1,8,9,2,1,28,54,8,4,2,1,10,2,""],[1959,306,124,"",1,11,25,2,1,23,83,8,14,5,"",8,"",1],[1960,545,267,1,5,17,53,4,6,48,87,12,22,4,2,12,4,1],[1961,320,133,2,4,10,37,5,1,28,66,8,9,3,1,10,1,1],[1962,345,168,1,1,10,14,5,2,40,68,5,11,1,"",16,1,1],[1963,337,159,1,3,8,16,4,3,33,63,14,7,2,"",22,2,""],[1964,379,160,1,7,4,10,9,4,29,115,10,11,7,"",8,4,""],[1965,353,140,6,1,9,46,5,1,35,63,9,7,5,3,12,9,""],[1966,374,166,3,9,5,10,2,1,32,101,9,7,2,"",18,4,3],[1967,316,127,4,6,11,11,1,3,25,91,9,10,2,"",13,1,""],[1968,367,183,2,3,5,19,5,3,27,74,14,7,3,1,18,1,1],[1969,325,148,"",8,6,21,2,3,18,71,11,18,"",1,12,5,""],[1970,436,179,3,15,17,33,7,2,28,100,19,13,5,"",14,1,""],[1971,443,240,1,6,17,17,9,1,28,79,7,5,6,"",22,3,1],[1972,327,114,1,6,7,23,8,6,27,70,15,15,3,2,24,6,""],[1973,323,132,8,13,3,20,4,2,13,78,18,13,1,1,11,5,1],[1974,370,155,3,7,5,22,9,1,21,98,14,3,4,1,17,7,3],[1975,427,132,8,8,4,47,4,5,38,131,10,14,2,4,12,8,""],[1976,373,147,3,10,7,15,8,2,25,106,11,14,5,2,12,4,1],[1977,353,138,4,6,4,15,1,2,20,112,8,18,4,"",16,4,1],[1978,417,159,2,16,5,19,7,1,27,123,20,15,3,"",17,2,""],[1979,324,116,2,13,6,28,7,3,25,89,12,8,1,"",10,3,1],[1980,365,155,3,10,11,16,6,2,13,89,22,6,2,"",18,9,1],[1981,400,117,1,8,3,81,5,1,18,105,17,19,11,1,7,5,1],[1982,402,158,3,8,2,37,7,5,23,108,13,10,1,"",20,2,4],[1983,373,127,3,6,3,32,8,4,26,110,15,14,5,1,16,3,""],[1984,415,188,"",15,3,31,5,3,32,109,14,6,3,"",5,"",1],[1985,479,182,1,15,4,39,3,2,23,143,24,11,4,1,14,11,2],[1986,409,110,2,24,5,21,5,5,26,113,49,16,4,2,15,4,5],[1987,421,160,"",13,4,34,3,5,21,138,11,11,4,"",10,6,""],[1988,391,137,1,7,2,37,2,3,14,131,24,7,3,"",14,5,2],[1989,514,158,2,15,2,40,8,"",13,217,16,11,8,1,12,7,4],[1990,473,161,"",12,9,18,8,6,24,166,22,8,8,2,23,4,1],[1991,401,114,1,26,2,30,7,5,16,142,19,13,5,1,12,6,2],[1992,439,165,1,14,3,20,6,2,29,133,22,19,2,2,17,3,1],[1993,438,160,2,17,3,36,11,6,20,121,15,17,4,"",17,4,3],[1994,431,135,3,19,3,21,6,2,16,159,19,19,5,2,19,1,""],[1995,492,133,5,11,6,22,6,2,24,204,15,29,8,1,15,6,3],[1996,585,160,1,14,1,132,10,4,13,144,23,22,4,2,22,13,19],[1997,474,176,1,15,4,22,7,6,23,124,20,29,7,"",28,8,1],[1998,490,125,2,20,6,89,5,2,22,130,24,26,4,2,26,3,2],[1999,494,190,2,24,5,33,1,1,23,128,18,26,8,4,15,9,5],[2000,507,182,3,16,3,31,4,"",23,147,25,27,17,7,16,4,""],[2001,443,172,3,26,2,24,3,3,21,116,22,15,6,3,12,12,1],[2002,407,121,"",15,"",20,2,1,29,124,16,31,7,2,31,6,1],[2003,459,147,"",18,3,53,4,1,19,116,36,24,4,"",30,2,2],[2004,439,151,2,20,"",12,2,3,18,129,12,29,15,4,15,17,4],[2005,484,104,1,9,3,107,2,"",26,164,12,18,8,2,13,11,2],[2006,382,100,1,16,1,31,2,"",9,152,11,16,4,1,22,11,4],[2007,299,72,2,8,"",38,"",2,15,111,12,8,10,2,7,10,2],[2008,266,58,1,28,1,13,1,"",21,87,3,10,8,"",16,12,5],[2009,266,61,2,3,"",41,1,1,21,90,8,16,3,"",15,4,""],[2010,202,46,"","","",60,"","",15,53,6,7,"","",10,3,2],[2011,159,36,"",1,2,16,"","",8,81,3,3,"","",8,1,""],[2012,78,16,"",3,"",2,"","",2,42,"",3,"","",6,3,1],[2013,53,13,"","",1,1,"","",3,24,2,3,"",1,2,3,""],[2014,28,2,"","","","","","",1,19,1,4,"","","",1,""]]'
reply: 'HTTP/1.1 200 \r\n'
header: Date: Thu, 14 Jul 2022 20:09:04 GMT
header: Content-Type: application/json;charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Server: Apache/2.4.6 (CentOS)
ステータスコード200で応答があることを確認します。以下は、私たちのケースで受け取った応答メッセージのスニペットです。
reply: 'HTTP/1.1 200 \r\n'
header: Date: Thu, 14 Jul 2022 20:09:04 GMT
header: Content-Type: application/json;charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Server: Apache/2.4.6 (CentOS)
分析フェーズ - GridDBのSQLエンジンの力を活用する
GridDBは、高いスケーラビリティとパフォーマンスが要求される処理に適したNo-sqlデータベースであることは周知の事実です。しかし、GridDBはSQLのクエリ構文に慣れている開発者向けに、SQLユーティリティも提供しています。GridDBが提供するSQLユーティリティは、使いやすいだけでなく、結果の取得が非常に速いのが良いところです。今回の分析では、GridDBの外来種データを様々なSQL文を使って問い合わせ、その結果をPythonでプロットする予定です。
以下に例を示します。 - request_body = '[{"type":"sql-select", "stmt":"select * from Alien_Species_Analysis where Year=2000"}]' x = requests.post(url, data=request_body, headers=header_obj)
なお、ここでのテーブル名は、作成したコンテナ名です。
何世紀もの間、どれだけの異星人がいたのでしょうか?
#Setup the URL to be used to invoke the GridDB WebAPI to retrieve data from the container
url = 'https://cloud1.griddb.com/trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql'
#Construct the request body which has the SQL that is to be used to retrieve the data
request_body = '[{"type":"sql-select", "stmt":"SELECT CASE when Year between 1500 and 1599 then 16 when Year between 1600 and 1699 then 17 when Year between 1700 and 1799 then 18 when Year between 1800 and 1899 then 19 when Year between 1900 and 1999 then 20 when Year between 2000 and 2099 then 21 end as Century, sum([All]) as cnt from Alien_Species_Analysis group by 1"}]'
#Invoke the GridDB WebAPI
x = requests.post(url, data=request_body, headers=header_obj)
send: b'POST /trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql HTTP/1.1\r\nHost: cloud1.griddb.com\r\nUser-Agent: PostmanRuntime/7.29.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: Basic U3ViaGE6Yjk+RWtDUA==\r\nContent-Type: application/json; charset=UTF-8\r\nContent-Length: 358\r\n\r\n'
send: b'[{"type":"sql-select", "stmt":"SELECT CASE when Year between 1500 and 1599 then 16 when Year between 1600 and 1699 then 17 when Year between 1700 and 1799 then 18 when Year between 1800 and 1899 then 19 when Year between 1900 and 1999 then 20 when Year between 2000 and 2099 then 21 end as Century, sum([All]) as cnt from Alien_Species_Analysis group by 1"}]'
reply: 'HTTP/1.1 200 \r\n'
header: Date: Thu, 14 Jul 2022 21:24:39 GMT
header: Content-Type: application/json;charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Server: Apache/2.4.6 (CentOS)
'[{"columns":[{"name":"Century","type":"LONG"},{"name":"cnt","type":"LONG"}],"results":[[16,265],[17,438],[18,922],[19,9270],[20,30446],[21,4472]]}]'
ヒント: 私のデータセットには 'All' というカラムがあります。上のクエリでは、SUM([All])を角括弧で囲んで使っています。これは、'All'がSQLの予約キーワードだからです。
#Process the response received and construct a Pandas dataframe with the data from the response
myJson = x.json()
Alien_spcs_centuries = pd.DataFrame(myJson[0]["results"], columns=[myJson[0]["columns"][0]["name"], myJson[0]["columns"][1]["name"]])
Century | cnt | |
---|---|---|
0 | 16 | 265 |
1 | 17 | 438 |
2 | 18 | 922 |
3 | 19 | 9270 |
4 | 20 | 30446 |
5 | 21 | 4472 |
#Plot a barplot using Seaborn
import seaborn as sns
plt.figure(figsize = (15,8))
ax = sns.barplot(x = 'Century', y = 'cnt', data = Alien_spcs_centuries)
ax.set_xticklabels(['16th Century', '17th Century', '18th Century', '19th Century', '20st Century', '21st Century'])
ax.set(ylabel='Number of alien species')
ax.set_title ('How many alien species have we had across different centuries?',fontsize=22)
Text(0.5, 1.0, 'How many alien species have we had across different centuries?')
推論 外来種が最も多く記録されたのは20世紀前後です。しかし、これは20世紀に人口が増加したことと、それ以前に比べて技術が進歩したことが原因であると思われます。21世紀は100年のうち20年しか経過しておらず、そのうち14年分のデータしかないため、全体の数は少ないようです。21世紀が完全に過ぎれば、20世紀よりも高い数字になる可能性があります。
これまでどのような外来種が記録されてきたのでしょうか?
Tip: 以下のクエリは、タイプキャスティングを使用しています。GridDBは標準SQLのタイプキャスティングをサポートしています。
#Constuct the SQL query to be used to retrieve the data for each alien type.
sql =("SELECT 'Vascular Plants' AS AlienType, SUM(CASE WHEN LENGTH(VascularPlants)=0 THEN 0 ELSE CAST(VascularPlants AS INTEGER) END) AS Count "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Bryophytes', SUM(CASE WHEN LENGTH(Bryophytes)=0 THEN 0 ELSE CAST(Bryophytes AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Algae', SUM(CASE WHEN LENGTH(Algae)=0 THEN 0 ELSE CAST(Algae AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Mammals', SUM(CASE WHEN LENGTH(Mammals)=0 THEN 0 ELSE CAST(Mammals AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Birds', SUM(CASE WHEN LENGTH(Birds)=0 THEN 0 ELSE CAST(Birds AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Reptiles', SUM(CASE WHEN LENGTH(Reptiles)=0 THEN 0 ELSE CAST(Reptiles AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Amphibians', SUM(CASE WHEN LENGTH(Amphibians)=0 THEN 0 ELSE CAST(Amphibians AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Fishes', SUM(CASE WHEN LENGTH(Fishes)=0 THEN 0 ELSE CAST(Fishes AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Insects', SUM(CASE WHEN LENGTH(Insects)=0 THEN 0 ELSE CAST(Insects AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Molluscs', SUM(CASE WHEN LENGTH(Molluscs)=0 THEN 0 ELSE CAST(Molluscs AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Crustaceans', SUM(CASE WHEN LENGTH(Crustaceans)=0 THEN 0 ELSE CAST(Crustaceans AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Arachnids', SUM(CASE WHEN LENGTH(Arachnids)=0 THEN 0 ELSE CAST(Arachnids AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Arthropods', SUM(CASE WHEN LENGTH(Arthropods)=0 THEN 0 ELSE CAST(Arthropods AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Invertebrates', SUM(CASE WHEN LENGTH(Invertebrates)=0 THEN 0 ELSE CAST(Invertebrates AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'Fungi', SUM(CASE WHEN LENGTH(Fungi)=0 THEN 0 ELSE CAST(Fungi AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT 'BacteriaProtozoans', SUM(CASE WHEN LENGTH(BacteriaProtozoans)=0 THEN 0 ELSE CAST(BacteriaProtozoans AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1")
#Setup the GridDB WebAPI URL to retrieve data from the previously created container
url = 'https://cloud1.griddb.com/trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql'
#Construct the request to be used for the GridDB WebAPI call
request_body = '[{"type":"sql-select", "stmt":"'+sql+'"}]'
#Invoke the GridDB WebAPI
x = requests.post(url, data=request_body, headers=header_obj)
send: b'POST /trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql HTTP/1.1\r\nHost: cloud1.griddb.com\r\nUser-Agent: PostmanRuntime/7.29.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: Basic U3ViaGE6Yjk+RWtDUA==\r\nContent-Type: application/json; charset=UTF-8\r\nContent-Length: 2360\r\n\r\n'
send: b'[{"type":"sql-select", "stmt":"SELECT \'Vascular Plants\' AS AlienType, SUM(CASE WHEN LENGTH(VascularPlants)=0 THEN 0 ELSE CAST(VascularPlants AS INTEGER) END) AS Count from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Bryophytes\', SUM(CASE WHEN LENGTH(Bryophytes)=0 THEN 0 ELSE CAST(Bryophytes AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Algae\', SUM(CASE WHEN LENGTH(Algae)=0 THEN 0 ELSE CAST(Algae AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Mammals\', SUM(CASE WHEN LENGTH(Mammals)=0 THEN 0 ELSE CAST(Mammals AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Birds\', SUM(CASE WHEN LENGTH(Birds)=0 THEN 0 ELSE CAST(Birds AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Reptiles\', SUM(CASE WHEN LENGTH(Reptiles)=0 THEN 0 ELSE CAST(Reptiles AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Amphibians\', SUM(CASE WHEN LENGTH(Amphibians)=0 THEN 0 ELSE CAST(Amphibians AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Fishes\', SUM(CASE WHEN LENGTH(Fishes)=0 THEN 0 ELSE CAST(Fishes AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Insects\', SUM(CASE WHEN LENGTH(Insects)=0 THEN 0 ELSE CAST(Insects AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Molluscs\', SUM(CASE WHEN LENGTH(Molluscs)=0 THEN 0 ELSE CAST(Molluscs AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Crustaceans\', SUM(CASE WHEN LENGTH(Crustaceans)=0 THEN 0 ELSE CAST(Crustaceans AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Arachnids\', SUM(CASE WHEN LENGTH(Arachnids)=0 THEN 0 ELSE CAST(Arachnids AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Arthropods\', SUM(CASE WHEN LENGTH(Arthropods)=0 THEN 0 ELSE CAST(Arthropods AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Invertebrates\', SUM(CASE WHEN LENGTH(Invertebrates)=0 THEN 0 ELSE CAST(Invertebrates AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'Fungi\', SUM(CASE WHEN LENGTH(Fungi)=0 THEN 0 ELSE CAST(Fungi AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT \'BacteriaProtozoans\', SUM(CASE WHEN LENGTH(BacteriaProtozoans)=0 THEN 0 ELSE CAST(BacteriaProtozoans AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1"}]'
reply: 'HTTP/1.1 200 \r\n'
header: Date: Fri, 15 Jul 2022 00:47:46 GMT
header: Content-Type: application/json;charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Server: Apache/2.4.6 (CentOS)
'[{"columns":[{"name":"AlienType","type":"STRING"},{"name":"Count","type":"LONG"}],"results":[["Insects",10023],["Arthropods",99],["Fishes",2153],["Birds",2909],["Vascular Plants",23354],["Reptiles",375],["Bryophytes",145],["Amphibians",257],["Molluscs",1193],["Invertebrates",1172],["BacteriaProtozoans",104],["Mammals",1389],["Fungi",414],["Algae",746],["Crustaceans",1099],["Arachnids",322]]}]'
#Process the response received from the GriDB WebAPI invocation
myJson = x.json()
#Create a Pandas dataframe from the response received
Types_Alien_spcs = pd.DataFrame(myJson[0]["results"], columns=[myJson[0]["columns"][0]["name"], myJson[0]["columns"][1]["name"]])
AlienType | Count | |
---|---|---|
0 | Insects | 10023 |
1 | Arthropods | 99 |
2 | Fishes | 2153 |
3 | Birds | 2909 |
4 | Vascular Plants | 23354 |
5 | Reptiles | 375 |
6 | Bryophytes | 145 |
7 | Amphibians | 257 |
8 | Molluscs | 1193 |
9 | Invertebrates | 1172 |
10 | BacteriaProtozoans | 104 |
11 | Mammals | 1389 |
12 | Fungi | 414 |
13 | Algae | 746 |
14 | Crustaceans | 1099 |
15 | Arachnids | 322 |
#Calculating percentage of totals for each type of species
pct_alien_types= (Types_Alien_spcs['Count']/Types_Alien_spcs['Count'].sum())*100
Types_Alien_spcs['pct_alien_types'] = pct_alien_types
#Create a vertical stem plot
fig, ax = plt.subplots(figsize=(15,10))
plt.hlines(y=Types_Alien_spcs['AlienType'],
xmin=0, xmax=Types_Alien_spcs['pct_alien_types'],
color='blue',
linestyle='dashed', linewidth=2)
plt.plot(Types_Alien_spcs['pct_alien_types'],Types_Alien_spcs['AlienType'],"pc",alpha=0.5,markersize=18)
plt.title('Types of Alien Species recorded so far', fontsize=22)
plt.xlim(0,55)
plt.yticks(fontsize=12)
plt.xticks(fontsize=12)
ax.tick_params()
sns.despine()
plt.show()
推論 上記のプロットは、外来種として最も多く記録されているのが維管束植物であり、次いで昆虫であることを示しています。
20世紀、外来種の分布はどうなっているのだろう?
GridDBはSQLのサブクエリもサポートしています。今回は、宇宙人の種類を世紀別に集計するためにサブクエリを使用しています。
#Construct the sub-query SQL to be used
sql = ("SELECT CASE when Year between 1500 and 1599 then 16 when Year between 1600 and 1699 then 17 when Year between 1700 and 1799 then 18 when Year between 1800 and 1899 then 19 when Year between 1900 and 1999 then 20 when Year between 2000 and 2099 then 21 end as Century, AlienType, SUM(Count) As Count FROM ("
"SELECT Year,'Vascular Plants' AS AlienType, SUM(CASE WHEN LENGTH(VascularPlants)=0 THEN 0 ELSE CAST(VascularPlants AS INTEGER) END) AS Count "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Bryophytes', SUM(CASE WHEN LENGTH(Bryophytes)=0 THEN 0 ELSE CAST(Bryophytes AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Algae', SUM(CASE WHEN LENGTH(Algae)=0 THEN 0 ELSE CAST(Algae AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Mammals', SUM(CASE WHEN LENGTH(Mammals)=0 THEN 0 ELSE CAST(Mammals AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Birds', SUM(CASE WHEN LENGTH(Birds)=0 THEN 0 ELSE CAST(Birds AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Reptiles', SUM(CASE WHEN LENGTH(Reptiles)=0 THEN 0 ELSE CAST(Reptiles AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Amphibians', SUM(CASE WHEN LENGTH(Amphibians)=0 THEN 0 ELSE CAST(Amphibians AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Fishes', SUM(CASE WHEN LENGTH(Fishes)=0 THEN 0 ELSE CAST(Fishes AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Insects', SUM(CASE WHEN LENGTH(Insects)=0 THEN 0 ELSE CAST(Insects AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Molluscs', SUM(CASE WHEN LENGTH(Molluscs)=0 THEN 0 ELSE CAST(Molluscs AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Crustaceans', SUM(CASE WHEN LENGTH(Crustaceans)=0 THEN 0 ELSE CAST(Crustaceans AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Arachnids', SUM(CASE WHEN LENGTH(Arachnids)=0 THEN 0 ELSE CAST(Arachnids AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Arthropods', SUM(CASE WHEN LENGTH(Arthropods)=0 THEN 0 ELSE CAST(Arthropods AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Invertebrates', SUM(CASE WHEN LENGTH(Invertebrates)=0 THEN 0 ELSE CAST(Invertebrates AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'Fungi', SUM(CASE WHEN LENGTH(Fungi)=0 THEN 0 ELSE CAST(Fungi AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,'BacteriaProtozoans', SUM(CASE WHEN LENGTH(BacteriaProtozoans)=0 THEN 0 ELSE CAST(BacteriaProtozoans AS INTEGER) END) "
" from Alien_Species_Analysis GROUP BY 1 "
") Table_A GROUP BY 1,2")
#Construct the URL to be used to invoke the GridDB WebAPI to retrieve data
url = 'https://cloud1.griddb.com/trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql'
#Construct the request to be used during the GridDB WebAPI invocation
request_body = '[{"type":"sql-select", "stmt":"'+sql+'"}]'
#Invoke the GridDB WebAPI
x = requests.post(url, data=request_body, headers=header_obj)
send: b'POST /trial1302/griddb/v2/gs_clustertrial1302/dbs/public/sql HTTP/1.1\r\nHost: cloud1.griddb.com\r\nUser-Agent: PostmanRuntime/7.29.0\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: Basic U3ViaGE6Yjk+RWtDUA==\r\nContent-Type: application/json; charset=UTF-8\r\nContent-Length: 2768\r\n\r\n'
send: b'[{"type":"sql-select", "stmt":"SELECT CASE when Year between 1500 and 1599 then 16 when Year between 1600 and 1699 then 17 when Year between 1700 and 1799 then 18 when Year between 1800 and 1899 then 19 when Year between 1900 and 1999 then 20 when Year between 2000 and 2099 then 21 end as Century, AlienType, SUM(Count) As Count FROM (SELECT Year,\'Vascular Plants\' AS AlienType, SUM(CASE WHEN LENGTH(VascularPlants)=0 THEN 0 ELSE CAST(VascularPlants AS INTEGER) END) AS Count from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Bryophytes\', SUM(CASE WHEN LENGTH(Bryophytes)=0 THEN 0 ELSE CAST(Bryophytes AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Algae\', SUM(CASE WHEN LENGTH(Algae)=0 THEN 0 ELSE CAST(Algae AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Mammals\', SUM(CASE WHEN LENGTH(Mammals)=0 THEN 0 ELSE CAST(Mammals AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Birds\', SUM(CASE WHEN LENGTH(Birds)=0 THEN 0 ELSE CAST(Birds AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Reptiles\', SUM(CASE WHEN LENGTH(Reptiles)=0 THEN 0 ELSE CAST(Reptiles AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Amphibians\', SUM(CASE WHEN LENGTH(Amphibians)=0 THEN 0 ELSE CAST(Amphibians AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Fishes\', SUM(CASE WHEN LENGTH(Fishes)=0 THEN 0 ELSE CAST(Fishes AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Insects\', SUM(CASE WHEN LENGTH(Insects)=0 THEN 0 ELSE CAST(Insects AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Molluscs\', SUM(CASE WHEN LENGTH(Molluscs)=0 THEN 0 ELSE CAST(Molluscs AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Crustaceans\', SUM(CASE WHEN LENGTH(Crustaceans)=0 THEN 0 ELSE CAST(Crustaceans AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Arachnids\', SUM(CASE WHEN LENGTH(Arachnids)=0 THEN 0 ELSE CAST(Arachnids AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Arthropods\', SUM(CASE WHEN LENGTH(Arthropods)=0 THEN 0 ELSE CAST(Arthropods AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Invertebrates\', SUM(CASE WHEN LENGTH(Invertebrates)=0 THEN 0 ELSE CAST(Invertebrates AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'Fungi\', SUM(CASE WHEN LENGTH(Fungi)=0 THEN 0 ELSE CAST(Fungi AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 UNION SELECT Year,\'BacteriaProtozoans\', SUM(CASE WHEN LENGTH(BacteriaProtozoans)=0 THEN 0 ELSE CAST(BacteriaProtozoans AS INTEGER) END) from Alien_Species_Analysis GROUP BY 1 ) Table_A GROUP BY 1,2"}]'
reply: 'HTTP/1.1 200 \r\n'
header: Date: Fri, 15 Jul 2022 03:12:56 GMT
header: Content-Type: application/json;charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Server: Apache/2.4.6 (CentOS)
#Process the response received from the invocation of the GridDB WebAPI
myJson = x.json()
#Construct a Pandas dataframe with the data from the response
Alien_Dist_century = pd.DataFrame(myJson[0]["results"], columns=[myJson[0]["columns"][0]["name"], myJson[0]["columns"][1]["name"],myJson[0]["columns"][2]["name"]])
Century | AlienType | Count | |
---|---|---|---|
0 | 19 | Insects | 831 |
1 | 20 | Reptiles | 315 |
2 | 20 | Bryophytes | 124 |
3 | 18 | Vascular Plants | 636 |
4 | 18 | Reptiles | 0 |
... | ... | ... | ... |
91 | 18 | Molluscs | 10 |
92 | 20 | Arachnids | 220 |
93 | 16 | Birds | 24 |
94 | 21 | Arthropods | 22 |
95 | 19 | Arachnids | 20 |
96 rows × 3 columns
#Create a Strip Plot with the data
fig, ax = plt.subplots(figsize=(16,9))
sns.set(style='darkgrid')
plt.title("Distribution of Alien Species across centuries", fontsize=16)
plt.ylim(0,1000)
plt.xticks(rotation=45)
plt.yticks([-40,-20,0,20,40,60,80,100,150,200,400,600])
sns.stripplot(x='AlienType',y='Count',hue='Century',data=Alien_Dist_century)
<AxesSubplot:title={'center':'Distribution of Alien Species across centuries'}, xlabel='AlienType', ylabel='Count'>
推論:菌類の外来種は、ほぼすべての世紀にわたって流行していたようです。昆虫は16世紀にはすでに記録されており、外来種の維管束植物は16世紀と17世紀に記録されています。外来種のクモ類、両生類、爬虫類は20世紀と21世紀に記録されています。
外来種の分布に異常はないのか?
先ほどのGridDB WebAPI呼び出しのデータを再利用してみましょう。
#Construct a scatter plot
fig = plt.figure(figsize=(14, 7), dpi= 80, facecolor='w', edgecolor='k')
plt.scatter('AlienType', 'Century', data=Alien_Dist_century, s='Count', c='Count', cmap='Reds', edgecolors='black', linewidths=.5)
plt.title("Density Mapping of Alien types ", fontsize=16)
plt.xlabel('AlienType', fontsize=18)
plt.ylabel('Century', fontsize=18)
plt.xticks(rotation=90)
plt.colorbar()
plt.show()
推論:維管束植物と昆虫の間には、3つの大きな異常が見られます。19世紀から20世紀にかけて、維管束植物の発生率が驚くほど高くなっていて昆虫類も20世紀には高い発生率を示しています。もう一つの興味深い異常は、魚類と哺乳類の外来種です。魚類と哺乳類の歴史的傾向を見ると、魚類は常に哺乳類より発生率が低いのです。しかし、20世紀に入ってからは、魚類の外来種が哺乳類の外来種よりはるかに多くなっています。
まとめ
この記事では、GridDBのSQL機能とGridDBのWebAPIの威力、そしてカスタムPythonアプリケーションとの連携のしやすさについて説明しました。