More than 1 year has passed since last update.

bi-gram頻度からの論理配列評価

Posted at 2023-01-21

目的

キーボードの論理配列について、入力文書の各bi-gramの頻度と各打鍵パターンのコストから、その総打鍵コストを算出する。
ただし、ここでのコストとは、打鍵にかかる時間や指の移動量などを想定する。

定義

文字とキー

$n$種の文字を各々$c_i$とする。
ただし、$0 \leqq i < n, i \in \mathbb{N}$とする。

\{c_0, \cdots,c_{n-1}\} \\

同様にキーを各々$k_i$とする。

\{k_0, \cdots, k_{n-1}\} \\

論理配列

文字$c_i$とキー$k_i$の対応が以下の例ように与えられたとする。
($\circ$が対応する箇所, $\times$が対応しない箇所)

	$c_0$	$c_1$	$\cdots$	$c_{n-1}$
$k_0$	$\times$	$\circ$	$\cdots$	$\times$
$k_1$	$\circ$	$\times$	$\cdots$	$\times$
$\vdots$	$\vdots$	$\vdots$		$\vdots$
$k_{n-1}$	$\times$	$\times$	$\cdots$	$\circ$

これを論理配列$\boldsymbol{A}$とする。

\begin{aligned}
    \boldsymbol{A} &= \begin{pmatrix}
        a_{0\ 0} &  \cdots & a_{0\ n-1} \\
        \vdots &         & \vdots \\
        a_{n-1\ 0} &  \cdots & a_{n-1\ n-1} \\
    \end{pmatrix} \qquad where \quad a_{ij} \in \{0, 1\} \\
    &\ni \begin{pmatrix}
        0 & 1 & \cdots & 0 \\
        1 & 0 & \cdots & 0 \\
        \vdots & \vdots && \vdots \\
        0 & 0 & \cdots & 1 \\
    \end{pmatrix} \qquad\cdots\ [例]\ 上記の表の場合
\end{aligned}

論理配列$\boldsymbol{A}$は置換行列である。

bi-gramの出現頻度

各bi-gramの出現頻度が以下ように与えられたとする。

	$c_0$	$\cdots$	$c_{n-1}$
$c_0$	$b_{0\ 0}$	$\cdots$	$b_{0\ n-1}$
$\vdots$	$\vdots$		$\vdots$
$c_{n-1}$	$b_{n-1\ 0}$	$\cdots$	$b_{n-1\ n-1}$

これをbi-gram頻度行列$\boldsymbol{B}$とする。

\boldsymbol{B} = \begin{pmatrix}
    b_{0\ 0} & \cdots & b_{0\ n-1} \\
    \vdots &        & \vdots \\
    b_{n-1\ 0} & \cdots & b_{n-1\ n-1} \\
\end{pmatrix}

打鍵パターンのコスト

各キーの打鍵パターンにかかるコストが以下のように与えられたとする。

	$k_0$	$\cdots$	$k_{n-1}$
$k_0$	$t_{0\ 0}$	$\cdots$	$t_{0\ n-1}$
$\vdots$	$\vdots$		$\vdots$
$k_{n-1}$	$t_{n-1\ 0}$	$\cdots$	$t_{n-1\ n-1}$

これを打鍵コスト行列 $\boldsymbol{T}$ とする。

\boldsymbol{T} = \begin{pmatrix}
    t_{0\ 0} & \cdots & t_{0\ n-1} \\
    \vdots &        & \vdots \\
    t_{n-1\ 0} & \cdots & t_{n-1\ n-1} \\
\end{pmatrix}

モデル化

打鍵パターンの頻度

論理配列$\boldsymbol{A}$とbi-gram頻度行列$\boldsymbol{B}$から、キー打鍵のパターン頻度行列$\boldsymbol{P}$が以下で求まる。

\boldsymbol{P} = \boldsymbol{A}\boldsymbol{B}\boldsymbol{A}^T \qquad \cdots 式(1) \\

[例]

論理配列$\boldsymbol{A}$とbi-gram頻度行列$\boldsymbol{B}$を以下と仮定する。

\boldsymbol{A} = \begin{pmatrix}
    0 & 0 & 1 \\
    1 & 0 & 0 \\
    0 & 1 & 0 \\
\end{pmatrix},\qquad
\boldsymbol{B} = \begin{pmatrix}
    b_{00} & b_{01} & b_{02} \\
    b_{10} & b_{11} & b_{12} \\
    b_{20} & b_{21} & b_{22} \\
\end{pmatrix}\\

この時のパターン頻度行列$\boldsymbol{P}$は以下となる。

\begin{aligned}
    \boldsymbol{P} &= \boldsymbol{A}\boldsymbol{B}\boldsymbol{A}^T \qquad \cdots 式(1) 再掲 \\
    &= \begin{pmatrix}
        0 & 0 & 1 \\
        1 & 0 & 0 \\
        0 & 1 & 0 \\
    \end{pmatrix}\begin{pmatrix}
        b_{00} & b_{01} & b_{02} \\
        b_{10} & b_{11} & b_{12} \\
        b_{20} & b_{21} & b_{22} \\
    \end{pmatrix}\begin{pmatrix}
        0 & 0 & 1 \\
        1 & 0 & 0 \\
        0 & 1 & 0 \\
    \end{pmatrix}^T \\
    &= \begin{pmatrix}
        b_{20} & b_{21} & b_{22} \\
        b_{00} & b_{01} & b_{02} \\
        b_{10} & b_{11} & b_{12} \\
    \end{pmatrix}\begin{pmatrix}
        0 & 1 & 0 \\
        0 & 0 & 1 \\
        1 & 0 & 0 \\
    \end{pmatrix} \\
    &= \begin{pmatrix}
        b_{22} & b_{20} & b_{21} \\
        b_{02} & b_{00} & b_{01} \\
        b_{12} & b_{10} & b_{11} \\
    \end{pmatrix}
\end{aligned}

以下の手順でも求めてみる。

	$c_0$	$c_1$	$c_2$
$c_0$	$b_{00}$	$b_{01}$	$b_{02}$
$c_1$	$b_{10}$	$b_{11}$	$b_{12}$
$c_2$	$b_{20}$	$b_{21}$	$b_{22}$

$\boldsymbol{M}$に従い$c_i$を$k_i$置換

	$k_1$	$k_2$	$k_0$
$k_1$	$b_{00}$	$b_{01}$	$b_{02}$
$k_2$	$b_{10}$	$b_{11}$	$b_{12}$
$k_0$	$b_{20}$	$b_{21}$	$b_{22}$

並び替え

	$k_0$	$k_1$	$k_2$
$k_0$	$b_{22}$	$b_{20}$	$b_{21}$
$k_1$	$b_{02}$	$b_{00}$	$b_{01}$
$k_2$	$b_{12}$	$b_{10}$	$b_{11}$

\therefore \quad \boldsymbol{P} = \begin{pmatrix}
    b_{22} & b_{20} & b_{21} \\
    b_{02} & b_{00} & b_{01} \\
    b_{12} & b_{10} & b_{11} \\
\end{pmatrix}

式(1)で求めた結果と一致する。

総打鍵コスト

論理配列$\boldsymbol{A}$の総打鍵コスト$E$を以下の数式でモデル化できる。

\begin{aligned}
    E &= \boldsymbol{T} \cdot \boldsymbol{P} \\
    &= \boldsymbol{T} \cdot (\boldsymbol{A}\boldsymbol{B}\boldsymbol{A}^T) \\
    &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} t_{ij}(\boldsymbol{A}\boldsymbol{B}\boldsymbol{A}^T)_{ij}\\
    &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} t_{ij} \lbrace \sum_{k=0}^{n-1} (\boldsymbol{A}\boldsymbol{B})_{ik}(\boldsymbol{A}^T)_{kj}  \rbrace \\
    &= \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} t_{ij} \lbrace \sum_{k=0}^{n-1} a_{jk} (\sum_{l=0}^{n-1} a_{il}b_{lk}) \rbrace \qquad // \\
\end{aligned}

TODO:

式(1)の証明

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up