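For people who want to use graph representation learning but cannot format their data (convert it to GEXF)

This article is for you if you want to use graph representation learning but cannot get your input data into shape.
The code introduced here generates subgraphs that can serve as input data for graph2vec, GNNs, and GCNs. (The subgraphs are written as GEXF files.)

The full code is listed at the end.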

Environment

neo4j-desktop-offline-1.2.1-x86_64.AppImage
python 3.7.3
Ubuntu 18.04.2

1. Start Neo4j and keep it running

bbb.png

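2. Create an empty folder in advance

For this article I created a folder named subgraph_call. The code below writes to ./../data/subgraph_call/, relative to the directory the script is run from.
The GEXF files are generated inside this folder.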
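
If you would rather create the folder from Python than by hand, the following is a minimal sketch. The helper name ensure_output_dir is hypothetical, and the demo uses a temporary directory instead of the real ./../data path so that it can be run anywhere.

```python
import os
import tempfile


def ensure_output_dir(path):
    """Create the folder (and any missing parents) if it does not exist yet."""
    os.makedirs(path, exist_ok=True)  # no error if the folder is already there
    return os.path.abspath(path)


# Demo in a temporary location; in the article the target is ./../data/subgraph_call
base = tempfile.mkdtemp()
output_dir = ensure_output_dir(os.path.join(base, "data", "subgraph_call"))
print(os.path.isdir(output_dir))
```

Calling the helper a second time on the same path is harmless, so it can sit at the top of the script.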

3. Walking through the code

import pprint
import networkx as nx
from neo4jrestclient.client import GraphDatabase

pprint: a Python standard library module for pretty-printing data structures such as lists and dicts
networkx: a Python library for graph and network computations

Here, neo4jrestclient is used to connect to Neo4j from Python.

url = 'http://localhost:7474/db/data'
gdb = GraphDatabase(url, username='neo4j', password='password')

For password, use the password you chose when you created the graph database.
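
To keep the password out of the script, one option is to read the connection settings from environment variables. This is only a sketch: the variable names NEO4J_URL, NEO4J_USER, and NEO4J_PASSWORD and the helper neo4j_settings are assumptions made for illustration, not something neo4jrestclient defines. The defaults are the values used in this article.

```python
import os


def neo4j_settings(env=None):
    """Return connection settings, falling back to the values used in this article."""
    env = os.environ if env is None else env
    return {
        "url": env.get("NEO4J_URL", "http://localhost:7474/db/data"),
        "username": env.get("NEO4J_USER", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "password"),
    }


# A plain dict stands in for os.environ so the demo runs without any setup
settings = neo4j_settings({"NEO4J_PASSWORD": "s3cret"})
print(settings["url"])

# The settings would then be passed to the client as in the article:
# gdb = GraphDatabase(settings["url"], username=settings["username"], password=settings["password"])
```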

codes = gdb.query("MATCH (START)-[:CALL]->() return collect(DISTINCT START.name) AS list_d1", data_contents=True)

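This collects the names of the nodes that have an outgoing CALL edge into a single list, with duplicates removed.
The query string passed to gdb.query uses the same syntax as a Cypher query run directly in Neo4j.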
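
Because the Cypher text is an ordinary Python string, a node name that contains a single quote would break a query built with str.format, such as the per-node query used in the loop in this article. The sketch below escapes the name before inserting it. Both helper names are hypothetical, and it assumes the name ends up inside a single-quoted Cypher string literal, as it does in that query.

```python
def escape_cypher_string(value):
    """Escape backslashes and single quotes for a single-quoted Cypher literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_call_query(name):
    """Build the per-node CALL query from this article with the name escaped."""
    template = "MATCH p=(({{name: '{0}'}})-[:CALL]->()) return p"
    return template.format(escape_cypher_string(name))


print(build_call_query("main"))
print(build_call_query("O'Brien"))
```

The returned string can be passed to gdb.query in place of the inline .format call.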

G = nx.DiGraph()

# codes.rows[0][0] is the list of distinct node names returned by the query above
for i, name in enumerate(codes.rows[0][0]):
    G.clear()  # start each subgraph from an empty graph
    print("No." + str(i) + ": " + name)

    # Fetch every CALL path that starts at the node with this name
    behavior = gdb.query("MATCH p=(({{name: '{0}'}})-[:CALL]->()) return p".format(name), data_contents=True)

    for graph in behavior.graph:
        for node in graph['nodes']:
            print("id: " + node['id'] + " - labels: " + node['labels'][0])
            G.add_node(node['id'])
            G.nodes[node['id']]['label'] = node['id']
            G.nodes[node['id']]['Label'] = node['labels'][0]
        for relationship in graph['relationships']:
            print("id: " + relationship['id'] + " - type: " + relationship['type'])
            edge = (relationship['startNode'], relationship['endNode'])
            G.add_edge(*edge)
            # Keep the first relationship type seen for this edge
            if 'Type' not in G.edges[edge]:
                G.edges[edge]['Type'] = relationship['type']

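For each of those nodes, this builds the subgraph reached through its outgoing CALL edges, so one subgraph is produced per node name. Duplicate nodes and edges are merged.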
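
The merging step can be tried without a running Neo4j instance. The sketch below mirrors the loop's logic in plain Python, using dicts instead of a networkx graph. The sample payload is hypothetical: it only imitates the keys the article's code reads (id, labels, startNode, endNode, type), and the Function label is made up.

```python
def merge_paths(paths):
    """Merge path payloads into one node dict and one edge dict, dropping duplicates."""
    nodes = {}  # node id -> first label
    edges = {}  # (start id, end id) -> relationship type
    for path in paths:
        for node in path["nodes"]:
            nodes.setdefault(node["id"], node["labels"][0])
        for rel in path["relationships"]:
            # setdefault keeps the first type seen for an edge, like the loop above
            edges.setdefault((rel["startNode"], rel["endNode"]), rel["type"])
    return nodes, edges


# Hypothetical payload: two paths that share the start node "1"
sample_paths = [
    {"nodes": [{"id": "1", "labels": ["Function"]}, {"id": "2", "labels": ["Function"]}],
     "relationships": [{"id": "10", "startNode": "1", "endNode": "2", "type": "CALL"}]},
    {"nodes": [{"id": "1", "labels": ["Function"]}, {"id": "3", "labels": ["Function"]}],
     "relationships": [{"id": "11", "startNode": "1", "endNode": "3", "type": "CALL"}]},
]

nodes, edges = merge_paths(sample_paths)
print(sorted(nodes))
print(sorted(edges))
```

Node "1" appears in both paths but ends up in the result only once, which is the same effect G.add_node has when it is called twice with the same id.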

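    # Still inside the for loop: write the current subgraph as <index>.gexf
    nx.write_gexf(G, "./../data/subgraph_call/{0}.gexf".format(i))
    # Append "<index>.gexf 0" to the label file
    with open('./../data/subgraph_call.Labels', 'a') as f:
        f.write("{0}.gexf 0\n".format(i))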

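nx.write_gexf writes the subgraph to a GEXF file in the specified folder.
The script then appends one line per subgraph to a .Labels file, which serves as the label file for machine learning. Every subgraph is given the label 0 here.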
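
To check what ends up in the label file, the sketch below writes a few lines in the same "<index>.gexf <label>" format and reads them back into a dict. The helper names append_label and read_labels are hypothetical, and the demo writes to a temporary directory rather than ./../data.

```python
import os
import tempfile


def append_label(labels_path, graph_index, label=0):
    """Append one '<index>.gexf <label>' line, as the article's loop does."""
    with open(labels_path, "a") as f:
        f.write("{0}.gexf {1}\n".format(graph_index, label))


def read_labels(labels_path):
    """Read the label file back into a {filename: label} dict."""
    labels = {}
    with open(labels_path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            filename, label = line.rsplit(None, 1)
            labels[filename] = int(label)
    return labels


demo_path = os.path.join(tempfile.mkdtemp(), "subgraph_call.Labels")
for index in range(3):
    append_label(demo_path, index)

labels = read_labels(demo_path)
print(labels)
```

Because the file is opened in append mode, running the script twice adds every line a second time. Deleting the old .Labels file before a rerun avoids that.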

Running the full code at the end of this article produces results like the following.

Screenshot from 2019-12-04 14-04-21.png
Screenshot from 2019-12-04 .png

Full code

# coding:utf-8
import pprint
import networkx as nx
from neo4jrestclient.client import GraphDatabase

url = 'http://localhost:7474/db/data'
gdb = GraphDatabase(url, username='neo4j', password='password')

# Names of all nodes that have an outgoing CALL edge (duplicates removed)
codes = gdb.query("MATCH (START)-[:CALL]->() return collect(DISTINCT START.name) AS list_d1", data_contents=True)

G = nx.DiGraph()

for i, name in enumerate(codes.rows[0][0]):
    G.clear()  # start each subgraph from an empty graph
    print("No." + str(i) + ": " + name)

    # Fetch every CALL path that starts at the node with this name
    behavior = gdb.query("MATCH p=(({{name: '{0}'}})-[:CALL]->()) return p".format(name), data_contents=True)

    for graph in behavior.graph:
        for node in graph['nodes']:
            print("id: " + node['id'] + " - labels: " + node['labels'][0])
            G.add_node(node['id'])
            G.nodes[node['id']]['label'] = node['id']
            G.nodes[node['id']]['Label'] = node['labels'][0]
        for relationship in graph['relationships']:
            print("id: " + relationship['id'] + " - type: " + relationship['type'])
            edge = (relationship['startNode'], relationship['endNode'])
            G.add_edge(*edge)
            # Keep the first relationship type seen for this edge
            if 'Type' not in G.edges[edge]:
                G.edges[edge]['Type'] = relationship['type']

    # Write the current subgraph as <index>.gexf
    nx.write_gexf(G, "./../data/subgraph_call/{0}.gexf".format(i))

    # Append "<index>.gexf 0" to the label file
    with open('./../data/subgraph_call.Labels', 'a') as f:
        f.write("{0}.gexf 0\n".format(i))

    print("--------------------------")

You can now apply representation learning to the data generated here!
