Graphical Image for Hyperbolic Space

代表的なPoincaré (Ball)、Lorentz (Hyperboloid)、Klein の関係性に関してイメージを掴むのに以下が参考になる。
Hyperboloid and projections

Basic Definitions

curvatureはEuclideanは 0 、Poincaré、Lorentz、Kleinは -1 だが、後者3つは任意の負の定数をとれるよう一般化して $-c\:(c>0)$ または $-1/K\:(K>0)$ と定義する。この時、Poincaré Ballなどの半径は$\sqrt{K}$となる。また、Space（空間）の点を${\bf x}$と${\bf y}$で表し、Tangent Space（接ベクトル、速度ベクトル、勾配の空間）の方向を${\bf u}$と${\bf v}$で表す。以下の定義は¹²³⁴⁵から参照し、まとめたものである。Gyrovecotrに関しては後日まとめる予定。

	Euclidean	Poincaré (Ball)	Lorentz (Hyperboloid)	Klein
Space $\mathcal{M}^{n,K}$	$\{{\bf x} \in \mathbb{R}^{n}\}$	$\{{\bf x} \in \mathbb{R}^{n}:\lVert {\bf x} \rVert^2<K\}$	$\{{\bf x} \in \mathbb{R}^{n+1}: \langle {\bf x}, {\bf x} \rangle_{\mathcal{L}}=-K, x_0>0\}$ where $\langle {\bf x}, {\bf x} \rangle_{\mathcal{L}}:=-x_0^2+\sum^n_{i=1} x_i^2$ and $x_0=\sqrt{K+\lVert\hat{{\bf x}}\rVert^2}$, where $\lVert\hat{{\bf x}}\rVert^2=\sum^n_{i=1} x_i^2$	$\{{\bf x} \in \mathbb{R}^{n}:\lVert {\bf x} \rVert^2<K\}$
Tangent Space $T_{{\bf x}}\mathcal{M}^{n,K}$	-	-	$\{{\bf v} \in \mathbb{R}^{n+1}: \langle {\bf v}, {\bf x} \rangle_{\mathcal{L}}=0\}$	-
Metric Tensor in $\mathrm{g}^K_{\bf x} : T_{{\bf x}}\mathcal{M}^{n,K} \times T_{{\bf x}}\mathcal{M}^{n,K} \rightarrow \mathbb{R}$ which defines the inner product, norm, angle and distance in the Tangent Space	$I_n$	$(\lambda^K_{\bf x})^2I_n$ where $\lambda^K_{\bf x}$ is the conformal factor $\frac{2}{(1-\frac{1}{K}\lVert {\bf x} \rVert^2)}$	$\mathrm{diag}(-1,{\bf 1}_n^{\mathrm{T}})=\left(\begin{array}{cccc}-1&0&\cdots &0 \\ 0&1&\cdots &0 \\\vdots&&\ddots& \\ 0&0&\cdots &1\end{array}\right)$	$g^1_{ij}=\frac{\delta{ij}}{1 - \lVert {\bf x} \rVert^2}+\frac{x_ix_j}{(1 - \lVert {\bf x} \rVert^2)^2}$ when $K=1$, where $g^1_{ij}$ is the element of a matrix, derived from ³
Distance $\mathrm{d}^K: \mathcal{M}^{n,K} \times \mathcal{M}^{n,K} \rightarrow \mathbb{R}$	$\lVert {\bf x} - {\bf y} \rVert$		$\sqrt{K}\mathrm{arcosh}(-\frac{\langle {\bf x}, {\bf y} \rangle_{\mathcal{L}}}{K})$	-
Exponential Map $\mathrm{exp}^K_{{\bf x}} : T_{{\bf x}}\mathcal{M}^{n,K} \rightarrow \mathcal{M}^{n,K}$	${\bf x}+{\bf v}$		$\mathrm{cosh}(\frac{\lVert {\bf v} \rVert_{\mathcal{L}}}{\sqrt{K}}){\bf x} + \sqrt{K}\mathrm{sinh}(\frac{\lVert {\bf v} \rVert_{\mathcal{L}}}{\sqrt{K}})\frac{{\bf v}}{\lVert {\bf v} \rVert_{\mathcal{L}}}$ where $\lVert {\bf v} \rVert_{\mathcal{L}}=\sqrt{\langle {\bf v}, {\bf v} \rangle_{\mathcal{L}}}$	-
Logarithmic Map $\mathrm{log}^K_{{\bf x}} : \mathcal{M}^{n,K} \rightarrow T_{{\bf x}}\mathcal{M}^{n,K}$ which is the inverse function of Exponential Map	${\bf y}-{\bf x}$		$\mathrm{d}^K_{\mathcal{L}}({\bf x}, {\bf y})\frac{{\bf y}+\frac{1}{K}\langle {\bf x}, {\bf y} \rangle_{\mathcal{L}}{\bf x}}{\lVert {\bf y}+\frac{1}{K}\langle {\bf x}, {\bf y} \rangle_{\mathcal{L}}{\bf x}\rVert_{\mathcal{L}}}$ where $\mathrm{d}^K_{\mathcal{L}}({\bf x}, {\bf y})$ is the distance of Lorentz.	-
Parallel Transport $P^K_{{\bf x} \rightarrow {\bf y}}:T_{\bf x}M \rightarrow T_{\bf y}M$	${\bf v}$		${\bf v} - \frac{\langle \log^K_{\bf x}({\bf y}), {\bf v}\rangle_{\mathcal{L}}}{\mathrm{d}^K_{\mathcal{L}}({\bf x}, {\bf y})^2}(\log^K_{\bf x}({\bf y}) + \log^K_{\bf y}({\bf x}))$	-
To Poincaré	-	-	$\frac{\hat{{\bf x}}}{1+\frac{1}{\sqrt{K}}x_0}$ where ${\bf x}=(x_0, \hat{{\bf x}}^{\mathrm{T}})^{\mathrm{T}}$	$\frac{2{\bf x}}{1+\frac{1}{K}\lVert {\bf x} \rVert^2}$
To Lorents	-	$(\sqrt{K}\frac{1+\frac{1}{K}\lVert {\bf x}\rVert^2}{1-\frac{1}{K}\lVert {\bf x}\rVert^2}, \frac{2}{1-\frac{1}{K}\lVert {\bf x}\rVert^2}{\bf x}^{\mathrm{T}})^{\mathrm{T}}$	-	$\frac{1}{\sqrt{1 - \frac{1}{K}\lVert {\bf x} \rVert^2}}(\sqrt{K},{\bf x}^{\mathrm{T}})^{\mathrm{T}}$
To Klein	-	$\frac{{\bf x}}{1+\sqrt{1-\frac{1}{K}\lVert {\bf x} \rVert^2}}$	$\frac{\hat{{\bf x}}}{\frac{1}{\sqrt{K}}x_0}$ where ${\bf x}=(x_0, \hat{{\bf x}}^{\mathrm{T}})^{\mathrm{T}}$	-

Metric Tensorはある地点${\bf x}$を中心としたTangent Spaceにおける方向（接ベクトル、速度ベクトル、勾配）${\bf u}$と${\bf v}$の大きさと角度を定義する。大きさはEuclideanの場合、$\sqrt{\mathrm{g}^K_{\bf x}({\bf u},{\bf u})}=\sqrt{{\bf u}^{\mathrm{T}}I_n{\bf u}}=\lVert {\bf u} \rVert$で表され、これはL2 normである。他の空間でも、Metric Tensorを間に掛け合わせ、平方根を取ることで大きさを得られる。
Tangent Spaceにおける角度はEuclideanとPoincaréでは同じ（これを等角（conformal）という）で以下のように等価を証明できる（後ろから三番目がEuclidean、後ろから二番目がPoincaréのMetric Tensorを使った式である）。⁴

$$cos\theta=\frac{\mathrm{g}^K_{\bf x}({\bf u},{\bf v})}{\sqrt{\mathrm{g}^K_{\bf x}({\bf u},{\bf u})} \sqrt{\mathrm{g}^K_{\bf x}({\bf v},{\bf v})}}=\frac{{\bf u}^{\mathrm{T}}I_n{\bf v}}{\sqrt{{\bf u}^{\mathrm{T}}I_n{\bf u}}\sqrt{{\bf v}^{\mathrm{T}}I_n{\bf v}}}=\frac{(\lambda^K_{\bf x})^2{\bf u}^{\mathrm{T}}I_n{\bf v}}{\sqrt{(\lambda^K_{\bf x})^2{\bf u}^{\mathrm{T}}I_n{\bf u}}\sqrt{(\lambda^K_{\bf x})^{2}{\bf v}^{\mathrm{T}}I_n{\bf v}}}=\frac{{\bf u}^{\mathrm{T}}{\bf v}}{\lVert {\bf u} \rVert\lVert {\bf v} \rVert}$$

$\lambda^K_{\bf x}$は分子分母で打ち消され、Euclideanと同じ角度となる。 $\lambda^K_{\bf x}$をconformal factorという。
Space 上の二点間の距離は以下のように$\mathrm{g}^K_{\gamma(t)}$を使って定義できる。⁴

$$\mathrm{d}^K({\bf x}, {\bf y}) = \mathrm{inf}_{\gamma}\int_0^1\sqrt{\mathrm{g}^K_{\gamma(t)}(\dot{\gamma}(t),\dot{\gamma}(t))}dt$$

ここで $\gamma$ は $\gamma(0)={\bf x}$ 、 $\gamma(1)={\bf y}$ となる二点を結ぶ任意の線を表す関数である。この任意の線を表す関数のうち $\mathrm{inf}_{\gamma}$ 内の値を最小にする $\gamma$ を選び、その値を二点の距離として定義している。また選ばれた $\gamma$ を測地線（geodesic）という。 $\dot{\gamma}(t)$ は $t$ 時点での速度ベクトルを表し（$t=0$ の時、 ${\bf x}$ 地点の速度ベクトルを表す）、この大きさを足し合わせていくことで距離を算出している。各空間の Distance $\mathrm{d}^K$ はこの積分を解いたものである。

Aggregation Operations

Tangent Space で通常の Euclidean Space のように average pooling をしたり（その後、Exponential Map で Hyperbolic 上に戻す）、以下のように Klein Space で Einstein Midpoint （Hyperbolic上での平均）を求める方法が提案されている。⁶

$${\bf y}=\sum_i\frac{\gamma_i {\bf x}_i}{\sum_j\gamma_j}$$ $$\mathrm{where}\;\gamma_i=\frac{1}{\sqrt{1 - \frac{1}{K}\lVert {\bf x}_i \rVert^2}}\;\mathrm{and}\;{\bf x}_i\;\mathrm{is}\;\mathrm{a}\;\mathrm{point}\;\mathrm{of}\;\mathrm{Klein}\;\mathrm{Space}$$

Poincaré、Lorentzの場合は、一旦Klein Spaceに移動してから、aggregation を行い、その後もとの空間に戻す。
また attention のようなケースで重み付きの Einstein Midpoint も提案されている。²

$${\bf y}=\sum_i\frac{\alpha_i\gamma_i {\bf x}_i}{\sum_j\alpha_j\gamma_j}$$ $$\mathrm{where}\;\alpha_i\;\mathrm{is}\;\mathrm{an}\;\mathrm{attention}\;\mathrm{weight}$$

他の方法としては Gyrovector により aggregation する方法も提案されている。
（Hyperbolic上での）max pooling や bi-interaction pooling などは提案されておらず、今の所 Tangent Space で行う方法しかない。

Euclidean Space to Hyperbolic Space

Euclidean Space から Hyperbolic Space への変換方法として以下が提案されている。¹

$$\mathrm{exp}^K_{{\bf o},\mathcal{L}}((0,{\bf v}^{\mathrm{T}})^{\mathrm{T}})=(\sqrt{K}\mathrm{cosh}(\frac{\lVert {\bf v} \rVert}{\sqrt{K}}), \sqrt{K}\mathrm{sinh}(\frac{\lVert {\bf v} \rVert}{\sqrt{K}})\frac{\bf v}{\lVert {\bf v} \rVert})^{\mathrm{T}}$$ $$\mathrm{where}\; \mathrm{exp}^K_{{\bf o},\mathcal{L}}\; \mathrm{is}\; \mathrm{exp}^K_{\bf o}\; \mathrm{of}\; \mathrm{Lorentz}\; \mathrm{at}\; {\bf o}:=(\sqrt{K},0,\cdots,0)^{\mathrm{T}}\; \mathrm{and}\; {\bf v} \in \mathbb{R}^{n}$$

この方法を使えば Euclidean Space の特徴量を Hyperbolic Space に持っていくことも可能である。

Hyperbolic Graph Convolutional Neural Networks： https://arxiv.org/pdf/1910.12933.pdf ↩
Hyperbolic Attention Networks： https://openreview.net/pdf?id=rJxHsjRqFQ ↩
https://math.stackexchange.com/questions/1292707/comparing-metric-tensors-of-the-poincare-and-the-klein-disk-models-of-hyperbolic/1295924#1295924 ↩
Hyperbolic Neural Networks： https://arxiv.org/pdf/1805.09112.pdf ↩
Hyperbolic Neural Networks++: https://arxiv.org/pdf/2006.08210.pdf ↩
HyperText: Endowing FastText with Hyperbolic Geometry: https://arxiv.org/pdf/2010.16143.pdf ↩

Hyperbolic Definitions for Machine Learning

Graphical Image for Hyperbolic Space

Basic Definitions

Aggregation Operations

Euclidean Space to Hyperbolic Space