More than 3 years have passed since last update.

【Edge Bundling】Pythonを用いてエッジ束化

Posted at 2022-05-27

はじめに

前回記事では、情報可視化について勉強したことのまとめをしました。
今回は、実際にネットワーク図を作成し、ネットワークを作成した際の問題点の解決手法の一つであるEdge BundlingをPythonライブラリであるNetworkxとDatashadeを用いて簡単に実装します。

これが

こうなったり

こうなるような技術の紹介です。

ネットワークに関する用語

まずはNWに関する基礎知識として，基本的な用語をまとめます．
・ノード｜ネットワークの頂点
・エッジ｜ノードとノードをつなぐ辺
・次数｜ノードにつながるエッジの数
・有向グラフ｜ネットワークの辺に向きがあるもの
・無向グラフ｜ネットワークの辺に向きがないもの

ネットワーク図を作成するときの問題点

ノードの処理

　ノードの配置はネットワークの可読性に大きく関わります。例えば、はじめにの1枚目に示したようなネットワーク図では、情報を読み取ることは困難です。ノードを一定以上の距離に保つことや、配置する位置に意味を持たせることが大切です。

エッジの処理

　ネットワーク図作成する場合はエッジに重みをつけて長さや太さに意味を持たせ、類似度で長さや太さを設定します。またエッジ表現で人が情報を読み取る際の適した手法に関する研究も行われており、自分のサーベイ論文の中からわかりやすかったものを1つあげておきます。

(連結図における重み付きグラフのエッジ表現に関する研究)こちらは修士論文なので、分量多めではありますが、29ページから始まる第6章の議論はとても勉強になります。また被験者実験によって評価している点が参考になります。

　ほかにもエッジの長さの総和を短くしたり、エッジが別のノードと重ならないようにすることが大切です。

Edge Bundling

今回は先ほどあげた問題点の中でも、特にエッジの処理にフォーカスを当て、エッジ束化に取り組みます。

エッジ束化（edge bundling）
Danny Holten,Hierarchical Edge Bundles: Visualization of Adjacency Relations in Hierarchical

　そうですね、なによりもビジュアルがかっこよかった！ので実装してみたいと考えていたのですが、「Edge Bundluing」と検索しても日本語の文献も少なく、むずかしそうだな～と諦めていました。
ある日サーベイしていたところ、Datashaderを発見し、Edge Bundlingができる
とのことだったので早速使ってみました。
　ちなみに今回対象とするネットワークはNetworkxでおなじみのkarate_club_graphを用いてみます。

（はじめにのところで示したは、ノード数100個をランダムに発生させた完全グラフ（全てのノードにエッジが張られているグラフ）になります。あまり面白くはないと思うので割愛します。）

edge_bundlung_karate.py

import networkx as nx
import numpy as np
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
from datashader.bundling import connect_edges, hammer_bundle
from itertools import chain

G = nx.karate_club_graph()
cvsopts = dict(plot_height=400, plot_width=650)

def ng(graph,name):
    graph.name = name
    return graph

def nodesplot(nodes, name=None, canvas=None, cat=None):
    canvas = ds.Canvas(**cvsopts) if canvas is None else canvas
    aggregator=None if cat is None else ds.count_cat(cat)
    agg=canvas.points(nodes,'x','y',aggregator)
    return tf.spread(tf.shade(agg, cmap=["#FF3333"]), px=3, name=name)

def edgesplot(edges, name=None, canvas=None):
    canvas = ds.Canvas(**cvsopts) if canvas is None else canvas
    return tf.shade(canvas.line(edges, 'x','y', agg=ds.count()), name=name)
    
def graphplot(nodes, edges, name="", canvas=None, cat=None):
    if canvas is None:
        xr = nodes.x.min(), nodes.x.max()
        yr = nodes.y.min(), nodes.y.max()
        canvas = ds.Canvas(x_range=xr, y_range=yr, **cvsopts)
    np = nodesplot(nodes, name + " nodes", canvas, cat)
    ep = edgesplot(edges, name + " edges", canvas)
    return tf.stack(ep, np, how="over", name=name)

def ng(graph,name):
    graph.name = name
    return graph

def nx_layout(graph):
    seed = 0
    np.random.seed(seed)
    layout = nx.spring_layout(graph, k=0.3, seed=seed)
    data = [[node]+layout[node].tolist() for node in graph.nodes]
    nodes = pd.DataFrame(data, columns=['id', 'x', 'y'])
    nodes.set_index('id', inplace=True)
    edges = pd.DataFrame(list(graph.edges), columns=['source', 'target'])
    return nodes, edges

def nx_plot(graph,bundl,dec,name=""):
    print(graph.name, len(graph.edges))
    print("Bundled bw="+str(bundl)+"")
    print("decay="+str(dec)+"")
    nodes, edges = nx_layout(graph)
    bundled_bw = hammer_bundle(nodes, edges, decay=dec, initial_bandwidth=bundl) 
    return [graphplot(nodes, bundled_bw, "Bundled bw="+str(bundl)+"decay="+str(dec)+"")]
    
bundling_nums = [0, 0.0001, 0.001, 0.01, 0.03, 0.05, 0.1, 0.3, 0.5, 1]
decay_nums = [0.1, 0.25, 0.5, 0.9]
plots = [nx_plot(ng(G,name="test"),bundl = bundling_num, dec = decay_num) for bundling_num in bundling_nums for decay_num in decay_nums]

tf.Images(*chain.from_iterable(plots)).cols(4)

コード解説は時間があるときに追記させてください。

こちらが何もしていないkarate_club_graphになります。
正直、ネットワーク図としての可読性は高いとは思いますが、今回はEdge Bundlingを実装してみたという点が重要なのでご了承ください。

Datashaderのhammer_bundleにはdecayとinitial_bandwidthというパラメータがあります。
（decayは近隣のノードとのエッジを束化、initial_bandwidthはネットワークの中心ノードとのエッジを束化というように理解していますが、まだ勉強不足でちゃんと理解できていません。）
それぞれを
decayを0.1、0.25、0.5、0.9
initial_bandwidthを0.3、0.5、1
というように動かしました。

decay=0.1、initial_bandwidth=1あたりでは、エッジ束化により、だいぶネットワークの構造が単純化されていることがわかります。
右下の流線形というのかな？なんなんでしょうか。うーん奥が深そうです。
もう少し勉強を進めて、使いこなせるようになりたいです。

おわりに

あまりいないかもしれませんが、僕と同様にEdge Bundlingしてみたいけどなかなか...というような人のお力に少しでもなれたら幸いです。

またDatashaderのほかにも、CytoscapeでもEdge Bundlingが可能みたいです。
Cytoscapeは何度か使用したことがありますが、データファイルを読み込むだけでネットワーク図を作成することができ、データの関係を可視化してみるか～ってときなんかにも便利です。

Another GEXF import example w/tweaks. Custom Cytoscape visual style & edge bundling applied to the original data. pic.twitter.com/AV76WYb1n2
— Cytoscape (@cytoscape) May 3, 2016

ここまで規模の大きいネットワークだと威力を発揮するな、という印象です。

この記事は以下の情報を参考にして執筆しました。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up