Least Squares Method
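Suppose two variables $x_i,y_i \ (i = 1,2, \cdots ,n)$, such as age and blood pressure, are related approximately by
$$y_i = ax_i + b$$
The best-fitting line is the one whose $a,b$ minimize the sum of squared residuals
$$ L=\displaystyle \sum_{i=1}^n [y_i - (ax_i + b)]^2$$
Regarding $L$ as a quadratic function of $a$ and $b$, the minimum is found by setting both partial derivatives to zero.
First, expand $L$: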
\begin{align}
L & = \displaystyle \sum_{i=1}^n [y_i - (ax_i + b)]^2 \\
& = \displaystyle \sum_{i=1}^n [y_i^2 - 2y_i(ax_i + b) + (ax_i + b)^2 ] \\
& = \displaystyle \sum_{i=1}^n [y_i^2 - 2y_i(ax_i + b) + a^2x_i^2 + 2abx_i + b^2 ] \\
& = \displaystyle \sum_{i=1}^n y_i^2 + \displaystyle \sum_{i=1}^n [ -2y_i(ax_i + b) + a^2x_i^2 + 2abx_i + b^2 ]
\end{align}
Therefore, differentiating with respect to $a$ and $b$ and setting each derivative to zero,
\begin{align}
\frac{ \partial L }{ \partial a } & = \displaystyle \sum_{i=1}^n ( -2y_ix_i + 2ax_i^2 + 2bx_i ) \\
& = - 2\displaystyle \sum_{i=1}^n y_ix_i + 2a \displaystyle \sum_{i=1}^n x_i^2 + 2b \displaystyle \sum_{i=1}^n x_i = 0 \qquad \cdots (1) \\\\[1pt]
\frac{ \partial L }{ \partial b } &= \displaystyle \sum_{i=1}^n ( -2y_i + 2ax_i +2b) \\
&= - 2\displaystyle \sum_{i=1}^n y_i +2a\displaystyle \sum_{i=1}^n x_i + 2nb = 0 \qquad \cdots (2)
\end{align}
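The two partial derivatives can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `loss`, `grad_loss`, and `numeric_grad` and the sample data are hypothetical, chosen only for illustration. It compares the analytic expressions $(1)$ and $(2)$ (before setting them to zero) with central finite differences.

```python
def loss(a, b, xs, ys):
    """Sum of squared residuals L(a, b)."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def grad_loss(a, b, xs, ys):
    """Analytic partial derivatives (dL/da, dL/db)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    dL_da = -2 * sxy + 2 * a * sxx + 2 * b * sx
    dL_db = -2 * sy + 2 * a * sx + 2 * n * b
    return dL_da, dL_db


def numeric_grad(a, b, xs, ys, h=1e-6):
    """Central finite differences, used only as a cross-check."""
    dL_da = (loss(a + h, b, xs, ys) - loss(a - h, b, xs, ys)) / (2 * h)
    dL_db = (loss(a, b + h, xs, ys) - loss(a, b - h, xs, ys)) / (2 * h)
    return dL_da, dL_db


# Hypothetical sample data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]
print(grad_loss(0.5, 1.0, xs, ys))
print(numeric_grad(0.5, 1.0, xs, ys))
```

Because $L$ is quadratic in $a$ and $b$, the two results should agree up to floating-point rounding.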
Dividing equations $(1)$ and $(2)$ by 2 and rearranging gives the simultaneous equations
\begin{align}
\left\{
\begin{array}{l}
a\displaystyle \sum_{i=1}^n x_i^2 + b\displaystyle \sum_{i=1}^n x_i = \displaystyle \sum_{i=1}^n y_ix_i \qquad \cdots (1)' \\
a\displaystyle \sum_{i=1}^n x_i +nb = \displaystyle \sum_{i=1}^n y_i \qquad \cdots (2)'
\end{array}
\right.
\end{align}
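Equations $(1)'$ and $(2)'$ form a $2 \times 2$ linear system in $a$ and $b$, so they can also be solved directly. The sketch below uses Cramer's rule; the function name `solve_normal_equations` and the sample data are assumptions made for illustration, not something the text prescribes.

```python
def solve_normal_equations(xs, ys):
    """Solve (1)' and (2)' for (a, b) by Cramer's rule."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Coefficient matrix is [[sxx, sx], [sx, n]]. Its determinant is zero
    # when all x_i are equal, in which case no unique line exists.
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Hypothetical data lying exactly on y = 2x + 1.
print(solve_normal_equations([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]))
```

For data that lie exactly on a line, the solution should reproduce that line's slope and intercept.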
From $(2)'$,
\begin{align}
b &= \frac{1}{n}\displaystyle \sum_{i=1}^n y_i - \frac{a}{n} \displaystyle \sum_{i=1}^n x_i \\\\[1pt]
&= \overline{y} - a\overline{x} \qquad \cdots (2)''
\end{align}
where $\overline{x}$ and $\overline{y}$ denote the sample means of $x_i$ and $y_i$. Substituting $(2)''$ into $(1)'$,
\begin{align}
a\displaystyle \sum_{i=1}^n x_i^2 + (\overline{y} - a\overline{x})\displaystyle \sum_{i=1}^n x_i = \displaystyle \sum_{i=1}^n y_ix_i \\\\[1pt]
a\displaystyle \sum_{i=1}^n x_i^2 + \overline{y}\displaystyle \sum_{i=1}^n x_i - a\overline{x}\displaystyle \sum_{i=1}^n x_i = \displaystyle \sum_{i=1}^n y_ix_i \\\\[1pt]
a\displaystyle \sum_{i=1}^n x_i^2 + \overline{y}\displaystyle \sum_{i=1}^n x_i - an\overline{x}\frac{1}{n}\displaystyle \sum_{i=1}^n x_i = \displaystyle \sum_{i=1}^n y_ix_i \\\\[1pt]
a[ \displaystyle \sum_{i=1}^n x_i^2 -n\overline{x}^2] + \overline{y}\displaystyle \sum_{i=1}^n x_i = \displaystyle \sum_{i=1}^n y_ix_i
\end{align}
Hence
\begin{align}
a &= \frac{\displaystyle \sum_{i=1}^n y_ix_i - \overline{y}\displaystyle \sum_{i=1}^n x_i }{\displaystyle \sum_{i=1}^n x_i^2 -n\overline{x}^2} \\\\[1pt]
&= \frac{\displaystyle \sum_{i=1}^n y_ix_i - n \overline{y} \cdot \overline{x}}
{\displaystyle \sum_{i=1}^n x_i^2 -n\overline{x}^2} \qquad \cdots (1)''
\end{align}
To summarize, the $a,b$ that give the best fit for $y_i = ax_i + b$ are
\begin{align}
\left\{
\begin{array}{l}
a = \frac{\displaystyle \sum_{i=1}^n y_ix_i - n \overline{y} \cdot \overline{x}}
{\displaystyle \sum_{i=1}^n x_i^2 -n\overline{x}^2} \qquad \cdots (1)'' \\\\[1pt]
b = \overline{y} - a\overline{x} \qquad \cdots (2)''
\end{array}
\right.
\end{align}
so the fitted line passes through the point $(\overline{x},\overline{y})$ with slope $a$ and can be written as
$$y - \overline{y} = a(x- \overline{x})$$
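The closed-form results $(1)''$ and $(2)''$ translate almost line by line into code. The following is a minimal sketch under stated assumptions: the function name `fit_line` and the sample data are hypothetical, and the inputs are assumed to be equal-length lists of numbers with at least two distinct $x$ values.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of (1)''.
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    if sxx == 0:
        # Exact-zero check only; nearly constant x can still be ill-conditioned.
        raise ValueError("all x values are identical; the slope is undefined")
    a = sxy / sxx
    b = y_bar - a * x_bar  # (2)''
    return a, b


# Hypothetical sample data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]
a, b = fit_line(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
# The fitted line should pass through the point of means.
print(a, b, abs((a * x_bar + b) - y_bar) < 1e-12)
```

Subtracting $n\overline{x}^2$ from $\sum x_i^2$ can lose precision when the $x_i$ are large and close together; summing $(x_i - \overline{x})^2$ directly is an algebraically equivalent alternative that is usually more stable.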