About the interactive Splunk COVID-19 Dashboard (Part 1)

Posted at 2020-03-20

This post checks what the interactive Splunk COVID-19 Dashboard, the dashboard introduced in "Bringing Data to COVID-19", is actually doing.

GitHub

https://github.com/splunk/corona_virus
The source data comes from:
https://github.com/CSSEGISandData/COVID-19

Data

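The Johns Hopkins University blog post "Mapping 2019-nCoV" is the original source, so I expected it to describe the data format, but it does not. Note that the original dashboard is built with ArcGIS.

So let's look at the data in CSSEGISandData/COVID-19 directly.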

COVID-19/who_covid_19_situation_reports/who_covid_19_sit_rep_time_series/who_covid_19_sit_rep_time_series.csv (as of 2020/3/20)

Columns and what they contain:
  • Province/States: 38 locations
  • Country/Region: 87 countries
  • WHO region: 8 (including Other)
  • 1/21/2020 ....: number of cases per day
code.spl
| makeresults
| eval _raw="Province/States,Country/Region,WHO region,1/21/2020, ....
Confirmed,Globally,,282, ....
"
| table P* C* W* 
| fieldsummary

I pasted the CSV in and used the SPL above to get an overview. It was slow. :sweat:
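For readers without a Splunk instance at hand, a similar overview can be sketched in Python. This only approximates two of the figures that `fieldsummary` reports (count and distinct count per field). Apart from the header names and the `Confirmed,Globally` row, the sample rows are made-up placeholders, not real data, and `summarize` is a hypothetical helper name.

```python
import csv
import io

# Tiny stand-in for who_covid_19_sit_rep_time_series.csv.
# Only the header names and the first row come from the article;
# the remaining rows are hypothetical placeholders.
SAMPLE = """Province/States,Country/Region,WHO region,1/21/2020
Confirmed,Globally,,282
,Exampleland,Region A,1
,Otherland,Region A,2
Example Province,Exampleland,Region B,3
"""

def summarize(text, columns):
    """Return {column: (non_empty_count, distinct_count)} for the given columns."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for col in columns:
        values = [row[col] for row in rows if row[col] != ""]
        summary[col] = (len(values), len(set(values)))
    return summary

result = summarize(SAMPLE, ["Province/States", "Country/Region", "WHO region"])
for name, (count, distinct) in result.items():
    print(f"{name}: count={count} distinct_count={distinct}")
```

Pointing `summarize` at the real file contents instead of `SAMPLE` should give the kind of per-column counts listed above, without pasting the CSV into a search.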

COVID-19/who_covid_19_situation_reports/who_covid_19_sit_rep_pdfs/

This folder holds the PDFs of the situation reports (sitreps) published by WHO.

COVID-19/csse_covid_19_data/

Good, this one comes with a proper explanation.

CSSE COVID-19 Dataset

Daily reports (csse_covid_19_daily_reports)

This folder contains daily case reports. All timestamps are in UTC (GMT+0).

File naming convention

MM-DD-YYYY.csv in UTC.

Field description
  • Province/State: China - province name; US/Canada/Australia/ - city name, state/province name; Others - name of the event (e.g., "Diamond Princess" cruise ship); other countries - blank.
  • Country/Region: country/region name conforming to WHO (will be updated).
  • Last Update: MM/DD/YYYY HH:mm (24 hour format, in UTC).
  • Confirmed: the number of confirmed cases. For Hubei Province: from Feb 13 (GMT +8), we report both clinically diagnosed and lab-confirmed cases. For lab-confirmed cases only (Before Feb 17), please refer to who_covid_19_situation_reports. For Italy, diagnosis standard might be changed since Feb 27 to "slow the growth of new case numbers." (Source)
  • Deaths: the number of deaths.
  • Recovered: the number of recovered cases.
Update frequency
  • Files after Feb 1 (UTC): once a day around 23:59 (UTC).
  • Files on and before Feb 1 (UTC): the last updated files before 23:59 (UTC). Sources: archived_data and dashboard.

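So the Diamond Princess is treated as an "event". This also shows that the figures for Japan are not broken down by prefecture.
The notes on Hubei Province are hard to follow.

It also says that, for Italy, the diagnosis standard might have been changed since Feb 27 to "slow the growth of new case numbers."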
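The naming convention and the `Last Update` format quoted above translate directly into date format strings. A minimal sketch, assuming the files follow the documented formats exactly (real files may deviate); both function names are hypothetical and the timestamp is an illustrative value:

```python
from datetime import datetime, timezone

def daily_report_filename(moment):
    """Build the daily report file name (MM-DD-YYYY.csv) for a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).strftime("%m-%d-%Y") + ".csv"

def parse_last_update(text):
    """Parse a 'Last Update' value in the documented MM/DD/YYYY HH:mm (UTC) format."""
    return datetime.strptime(text, "%m/%d/%Y %H:%M").replace(tzinfo=timezone.utc)

# Illustrative timestamp, not taken from an actual report.
stamp = parse_last_update("03/19/2020 23:59")
print(daily_report_filename(stamp))
```

Converting to UTC before formatting matters because the README states that file names are in UTC, so a local timestamp early on the next day can still belong to the previous day's file.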

Time series summary (csse_covid_19_time_series)

This folder contains daily time series summary tables, including confirmed, deaths and recovered. All data are from the daily case report.

Field description

  • Province/State: same as above.
  • Country/Region: same as above.
  • Lat and Long: a coordinates reference for the user.
  • Date fields: M/DD/YYYY (UTC), the same data as MM-DD-YYYY.csv file.

Update frequency

Once a day.
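Because the time series tables carry one column per date, they usually need reshaping before charting. The following is a minimal sketch of turning that wide layout into one record per location and date. The sample rows and numbers are hypothetical placeholders, and `wide_to_long` is an illustrative name, not something from the repository.

```python
import csv
import io

# Hypothetical two-row sample in the documented layout; the numbers are
# placeholders, not real case counts.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/2020,1/23/2020
,Exampleland,10.0,20.0,1,2
Example Province,Otherland,30.0,40.0,0,5
"""

ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")

def wide_to_long(text):
    """Turn one-column-per-date rows into (province, country, date, count) records."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # everything that is not an identifier column is a date
            records.append(
                (row["Province/State"], row["Country/Region"], column, int(value))
            )
    return records

long_rows = wide_to_long(SAMPLE)
for record in long_rows:
    print(record)
```

The date headers are kept as plain strings here on purpose, so the sketch does not depend on the exact date format used in the real files.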

Data modification records

We are also monitoring the curve change. Any errors made by us will be corrected in the dataset. Any possible errors from the original data sources will be listed here as a reference.

  • NHC 2/14: Hubei Province deducted 108 prior deaths from the death toll due to double counting.
  • About DP 3/1: All cases of COVID-19 in repatriated US citizens from the Diamond Princess are grouped together, and their location is currently designated at the ship’s port location off the coast of Japan. These individuals have been assigned to various quarantine locations (in military bases and hospitals) around the US. This grouping is consistent with the CDC.

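That covers the other dataset, the time series summary.

In short, the Diamond Princess cases are grouped together, and their location is currently set to the ship's port location off the coast of Japan. The individuals themselves have been assigned to various quarantine locations (military bases and hospitals) around the US. This grouping is consistent with the CDC.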
:thinking:

COVID-19/archived_data/

This folder contains the previously posted dashboard case reports from Jan 21 to Feb 14, 2020
So the older data is kept here.

Apps

As described at https://github.com/splunk/corona_virus, clone the repository under $SPLUNK_HOME/etc/apps:
git clone --recurse-submodules https://github.com/splunk/corona_virus.git

tree
$ tree
.
├── README.md
├── appserver
│   ├── static
│   │   ├── covid-19.js
│   │   ├── dashboard.css
│   │   ├── dashboard.json
│   │   └── images
│   │       ├── confirmed.svg
│   │       ├── deaths.svg
│   │       ├── legend.svg
│   │       ├── logo.svg
│   │       └── recovered.svg
│   └── templates
│       └── covid-19.html
├── bin
│   └── update_git.sh
├── default
│   ├── app.conf
│   ├── collections.conf
│   ├── data
│   │   └── ui
│   │       ├── nav
│   │       │   └── default.xml
│   │       └── views
│   │           ├── README
│   │           ├── confirmed_cases_location_overlay.xml
│   │           ├── corona_virus.xml
│   │           ├── coronavirus_timelapse.xml
│   │           └── covid-19.xml
│   ├── inputs.conf
│   └── transforms.conf
├── git
│   └── COVID-19
│       ├── README.md
│       ├── archived_data
│       │   ├── README.md
│       │   ├── archived_daily_case_updates
│       │   │   ├── 01-21-2020_2200.csv
│       │   │   |   ....
│       │   │   ├── 02-14-2020_1123.csv
│       │   │   └── README.md
│       │   └── archived_time_series
│       │       ├── README.md
│       │       ├── time_series_2019-ncov-Confirmed.csv
│       │       ├── time_series_2019-ncov-Deaths.csv
│       │       └── time_series_2019-ncov-Recovered.csv
│       ├── csse_covid_19_data
│       │   ├── README.md
│       │   ├── csse_covid_19_daily_reports
│       │   │   ├── 01-22-2020.csv
│       │   │   |   ....
│       │   │   ├── 03-19-2020.csv
│       │   │   └── README.md
│       │   └── csse_covid_19_time_series
│       │       ├── README.md
│       │       ├── time_series_19-covid-Confirmed.csv
│       │       ├── time_series_19-covid-Deaths.csv
│       │       └── time_series_19-covid-Recovered.csv
│       └── who_covid_19_situation_reports
│           ├── README.md
│           ├── who_covid_19_sit_rep_pdfs
│           │   ├── 20200121-sitrep-1-2019-ncov.pdf
│           │   |   ....
│           │   └── 20200310-sitrep-50-covid-19.pdf
│           └── who_covid_19_sit_rep_time_series
│               └── who_covid_19_sit_rep_time_series.csv
├── lookups
│   ├── confirmed.csv -> ../git/COVID-19/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv
│   ├── deaths.csv -> ../git/COVID-19/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv
│   ├── locations.csv
│   └── recovered.csv -> ../git/COVID-19/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Recovered.csv
└── metadata
    └── default.meta

23 directories, 199 files

Because the repository was cloned with git, the PDFs were pulled in as well.
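One detail in the tree output is worth spelling out: three of the files under lookups are symbolic links into the submodule, so the lookups presumably pick up new data whenever the submodule is updated, with no copy step. A minimal sketch of how such a relative link resolves (purely lexical, it does not touch the filesystem; `resolve_relative_link` is a hypothetical helper):

```python
import posixpath

# Symlink name and target as printed by tree for lookups/confirmed.csv.
LINK = "lookups/confirmed.csv"
TARGET = "../git/COVID-19/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv"

def resolve_relative_link(link_path, target):
    """Resolve a relative symlink target against the directory that holds the link."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(link_path), target))

resolved = resolve_relative_link(LINK, TARGET)
print(resolved)
```

The resolved path is relative to the app directory, which is why the links keep working wherever the app is installed.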

conf

inputs.conf

inputs.conf
[script://$SPLUNK_HOME/etc/apps/corona_virus/bin/update_git.sh]
disabled = false
interval = 60
index = main
sourcetype = git_update_corona

According to the inputs.conf reference in the Splunk docs, interval = 60 means that the script runs every 60 seconds.
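To make the stanza's settings concrete, here is a minimal sketch that reads it with Python's `configparser`. This is only an approximation: Splunk .conf files are not strictly INI files, so `configparser` is an assumption that happens to work for this simple stanza. The derived `runs_per_day` figure is plain arithmetic, not something Splunk reports.

```python
import configparser

# The stanza shown above, embedded as a string so the sketch is self-contained.
INPUTS_CONF = """
[script://$SPLUNK_HOME/etc/apps/corona_virus/bin/update_git.sh]
disabled = false
interval = 60
index = main
sourcetype = git_update_corona
"""

# interpolation=None keeps configparser from treating any character specially.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INPUTS_CONF)

stanza = parser.sections()[0]
interval_seconds = parser.getint(stanza, "interval")
enabled = not parser.getboolean(stanza, "disabled")
runs_per_day = 24 * 60 * 60 // interval_seconds

print(stanza)
print(f"enabled={enabled} interval={interval_seconds}s runs_per_day={runs_per_day}")
```

An interval of 60 seconds works out to 1440 script runs per day, which is far more often than the once-a-day update frequency stated for the upstream data.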

update_git.sh
update_git.sh
cd $SPLUNK_HOME/etc/apps/corona_virus
git submodule foreach git pull origin master

git submodule foreach: "runs the specified command in every submodule under the repository" (from the article "Git Submoduleについてまとめてみる", a summary of Git submodules).

.gitmodules
.gitmodules
[submodule "git/COVID-19"]
	path = git/COVID-19
	url = https://github.com/CSSEGISandData/COVID-19

This shows that the data is pulled from the original GitHub repository.

transforms.conf

transforms.conf
[user_timer]
external_type = kvstore
fields_list = inc, username

It is empty for now, but a lookup backed by the KV store is configured.

collections.conf

collections.conf
[user_timer]
enforceTypes = true 
field.inc = number 
field.username = string 

The details (the field types) are configured here.
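As far as I understand the collections.conf spec, enforceTypes = true makes the KV store reject inserts whose values do not match the declared field types. The sketch below only mimics that idea in plain Python. The type mapping, the `validate` helper, and the sample records are all assumptions, and Splunk's own coercion rules are not modelled.

```python
# Field types as declared in the collections.conf stanza above.
SCHEMA = {"inc": "number", "username": "string"}

# Assumed mapping from the declared type names to Python types.
PYTHON_TYPES = {"number": (int, float), "string": (str,)}

def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record matches the schema."""
    problems = []
    for field, type_name in schema.items():
        if field not in record:
            continue  # missing fields are not checked in this sketch
        value = record[field]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[type_name]):
            problems.append(f"{field}: expected {type_name}, got {type(value).__name__}")
    return problems

# Hypothetical records for illustration only.
print(validate({"inc": 3, "username": "admin"}))
print(validate({"inc": "3", "username": "admin"}))
```

The first record passes, while the second is flagged because `inc` arrives as a string rather than a number.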

app.conf

app.conf
#
# Splunk app configuration file
#

[install]
is_configured = 0

[ui]
is_visible = true
label = Corona

[launcher]
author =
description =
version = 1.0.0

corona_virus/default/data/ui/views/

The definitions of each dashboard are kept here.

covid-19.xml

covid-19.xml
<?xml version="1.0"?>
<view template="corona_virus:/templates/covid-19.html" type="html">
    <label>covid-19 Patterns &amp; Trends</label>
</view>

corona_virus/appserver/templates/covid-19.html

covid-19.html
<%!
if cherrypy.config['product_type'] == 'hunk':
        faviconFile = 'favicon_hunk.ico'
elif cherrypy.config['product_type'] == 'enterprise':
        faviconFile = 'favicon.ico'
else:
        faviconFile = 'favicon.ico'

app_name = cherrypy.request.path_info.split('/')[3]
app_root = "/" + "/".join(["static","app",app_name])
config_qs = dict(autoload=1)
%>\
<!DOCTYPE html>
<html class="no-js" lang="">
    <head>
        <title>covid-19 Patterns & Trends</title>
        <meta charset="utf-8" />
        <meta http-equiv="x-ua-compatible" content="ie=edge" />
        <meta name="description" content="listen to your data" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />
        <link rel="apple-touch-icon" href="apple-touch-icon.png" />
    </head>
    <body>
        <script src="${make_url('/config', _qs=config_qs)}"></script>
        <script src="${make_url('/static/js/i18n.js')}"></script>
        <script src="${make_url('/i18ncatalog?autoload=1')}"></script>
        <script>
            __requirejs_base_url = ${json_decode(make_url('/static/build'))};
            __splunkd_partials__ = ${json_decode(splunkd)};
        </script>
        <script src="${make_url('/static/app/corona_virus/covid-19.js')}"></script>
    </body>
</html>

I did not know what cherrypy was, so I looked it up: CherryPy is "a pythonic, object-oriented web framework".
For now, all I can tell is that everything ultimately ends up in covid-19.js. :sweat:
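The Python part at the top of the template is at least easy to trace. A minimal sketch of the app_name and app_root computation, run outside Splunk. The request path is a hypothetical example (the locale prefix and view name are assumptions about what `cherrypy.request.path_info` holds), and `app_paths` is an illustrative wrapper:

```python
def app_paths(path_info):
    """Mirror the template logic: take the fourth path segment as the app name."""
    app_name = path_info.split("/")[3]
    app_root = "/" + "/".join(["static", "app", app_name])
    return app_name, app_root

# Hypothetical request path; the locale prefix and view name are assumptions.
name, root = app_paths("/en-US/app/corona_virus/covid-19")
print(name, root)
```

Index 3 is the app name because the leading slash makes the first element of the split an empty string.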

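covid-19.js loads dashboard.json, which is written in the JSON format used by the Splunk Dashboards App (Beta).
Its SPL is a mix of searches identical to those in corona_virus.xml and queries against index=pandemic.
I wonder what that index is. :question: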
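One way to find which searches touch index=pandemic is to walk the JSON and collect every string that mentions it. The sketch below does that without assuming where queries live in the file. The sample document, including its key names and the second query, is a hypothetical stand-in rather than the actual contents of dashboard.json.

```python
import json

# Hypothetical stand-in for dashboard.json; the key names and the second query
# are assumptions made for illustration, not copied from the app.
SAMPLE = json.loads("""
{
  "dataSources": {
    "a": {"options": {"query": "| inputlookup confirmed.csv | fields *"}},
    "b": {"options": {"query": "index=pandemic | stats count"}}
  }
}
""")

def find_strings(node, needle):
    """Recursively collect every string value in a JSON tree that contains needle."""
    if isinstance(node, str):
        return [node] if needle in node else []
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return []
    found = []
    for child in children:
        found.extend(find_strings(child, needle))
    return found

pandemic_queries = find_strings(SAMPLE, "index=pandemic")
print(pandemic_queries)
```

Replacing `SAMPLE` with the parsed contents of the real file would list the searches to compare against corona_virus.xml.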

corona_virus.xml

The file is 613 lines long, so I will only link to it (Link).

base
|inputlookup confirmed.csv | fields *

The various charts are built on top of this base search.

I want to go through this part in detail, so I will stop here for now.
