【Python】 LASSO Regression Using the Primal-Dual Algorithm

Posted at 2019-07-11

I implemented LASSO (least absolute shrinkage and selection operator) regression using the primal-dual algorithm¹.

Problem Setting

\min_{w}{\frac1n\|Xw-y\|_1 + \alpha\|w\|_1}

Here $X\in\mathbb{R}^{n\times m}$ and $y\in\mathbb{R}^n$ are the input data, $w\in\mathbb{R}^m$ is the decision variable, and $\alpha$ is the regularization parameter. Note that the data-fidelity term is the $\ell_1$ loss rather than the usual squared loss, so the whole objective is piecewise linear.

\|w\|_1 = \max_{-1\leq p\leq1}\left<w, p\right>

Using this identity, the optimization problem above can be rewritten as the saddle-point problem

\min_w{\max_{-1 \leq (p, q) \leq 1}{\frac1n\left<Xw-y, q\right> + \alpha\left<w, p\right>}}
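
The identity itself can be checked numerically: over the box $-1\leq p\leq1$, the inner product $\left<w, p\right>$ is maximized at $p = \mathrm{sign}(w)$. A minimal sketch (the variable names are mine):

import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(4)

# the maximizer over [-1, 1]^m is p = sign(w), giving <w, sign(w)> = ||w||_1
p_star = np.sign(w)
assert np.isclose(p_star @ w, np.abs(w).sum())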

Primal-Dual Algorithm

Using the proximal point method, the minimization and the maximization are solved alternately:

\left\{
\begin{align}
&w^{k+1} = \mathop{\mathrm{arg~min}}\limits_{w}{\left\{\frac1n\left<Xw-y, q^k\right> + \alpha\left<w, p^k\right> + \frac12\|w - w^k\|_X^2\right\}}\\
&\bar{w}^{k+1} = 2w^{k+1} - w^k\\
&(p^{k+1}, q^{k+1}) = \mathop{\mathrm{arg~max}}\limits_{-1 \leq (p, q) \leq 1}{\left\{\frac1n\left<X\bar{w}^{k+1}-y, q\right> + \alpha\left<\bar{w}^{k+1}, p\right> - \frac12\|(p, q) - (p^k, q^k)\|_Y^2\right\}}
\end{align}
\right.

where

K = \left(\frac1nX^\top\hspace{-1pt}, \alpha I\right)^\top\hspace{-1pt}\\
\|x\|_X^2 = \left<x, T^{-1}x\right>\\
T = \mathrm{diag}(\tau)\quad \mathrm{where}\quad \tau_j=\frac{1}{\sum_{i}|K_{i,j}|^{2-\beta}}\\
\|z\|_Y^2 = \left<z, \Sigma^{-1}z\right>\\
\Sigma = \mathrm{diag}(\sigma)\quad \mathrm{where}\quad \sigma_i=\frac{1}{\sum_{j}|K_{i,j}|^{\beta}}
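
To make the updates explicit, set the gradient of the (strongly convex, diagonal-quadratic) $w$-subproblem to zero:

\frac1nX^\top\hspace{-1pt}q^k + \alpha p^k + T^{-1}\left(w^{k+1} - w^k\right) = 0
\quad\Longleftrightarrow\quad
w^{k+1} = w^k - T\left(\frac1nX^\top\hspace{-1pt}q^k + \alpha p^k\right)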

Treating the dual maximization analogously (a diagonal gradient step followed by clipping to $[-1, 1]$), the updates become

\left\{
\begin{align}
&w^{k+1} = w^k - T\left(\frac1nX^\top\hspace{-1pt}q^k + \alpha p^k\right)\\
&\bar{w}^{k+1} = 2w^{k+1} - w^k\\
&(p^{k+1}, q^{k+1}) = \mathrm{proj}_{[-1, 1]}\left((p^k, q^k) + \Sigma\left(\alpha\bar{w}^{k+1}, \frac1n(X\bar{w}^{k+1} - y)\right)\right)
\end{align}
\right.

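The step sizes can also be read directly off the stacked matrix $K$. Below is a minimal sanity-check sketch (the toy sizes and the names K, tau, sigma are mine); it should agree with the _step_size method in the implementation that follows.

import numpy as np

n, m, alpha, beta = 5, 3, 0.1, 0.5
rng = np.random.default_rng(0)
X = rng.standard_normal((n, m))

# K = ((1/n) X^T, alpha I)^T stacks (1/n) X on top of alpha * I, shape (n + m, m)
K = np.vstack([X / n, alpha * np.eye(m)])
abs_K = np.abs(K)

# tau_j = 1 / sum_i |K_ij|^(2 - beta)  (column sums of |K|^(2 - beta))
tau = 1.0 / np.sum(abs_K ** (2 - beta), axis=0)

# sigma_i = 1 / sum_j |K_ij|^beta  (row sums of |K|^beta)
sigma = 1.0 / np.sum(abs_K ** beta, axis=1)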

Source Code

Implemented in Python, under the following environment:

  • macOS Mojave : 10.14.3
  • Python : 3.7.3
  • numpy : 1.16.4

The source code and the executed notebook are available on GitHub.

from typing import Tuple

import numpy as np
from tqdm import trange


class Lasso:
    """
    Lasso regression model using the preconditioned primal dual algorithm
    """
    def __init__(self,
            alpha: float = 1.0,
            beta: float = 0.5,
            max_iter: int = 1000,
            extended_output: bool = False):
        """
        Parameters
        ----------
        alpha : float
            A regularization parameter.
        beta : float in [0.0, 2.0]
            A parameter used in step size.
        max_iter : int
            The maximum number of iterations.
        extended_output : bool
            If True, return the value of the objective function of each iteration.
        """
        self.alpha = alpha
        self.beta = beta
        self.max_iter = max_iter
        self.extended_output = extended_output

        self.coef_ = None
        self.objective_function = list()

    def _step_size(self, X: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """
        Compute the diagonal preconditioners tau and sigma for K = ((1/n)X^T, alpha I)^T.
        Note: the X passed here is already scaled by 1/n.
        """
        n_samples, n_features = X.shape
        abs_X = np.abs(X)

        # tau_j = 1 / sum_i |K_ij|^(2 - beta); the alpha*I block contributes alpha^(2 - beta)
        tau = np.sum(abs_X ** (2 - self.beta), axis=0)
        tau += self.alpha ** (2 - self.beta)
        tau = 1 / tau

        # sigma_i = 1 / sum_j |K_ij|^beta; each row of the alpha*I block contributes alpha^beta
        sigma = np.empty(n_samples + n_features, dtype=X.dtype)
        sigma[:n_samples] = np.sum(abs_X ** self.beta, axis=1)
        sigma[n_samples:] = self.alpha ** self.beta
        sigma = 1. / sigma

        return tau, sigma

    def fit(self, X: np.ndarray, y: np.ndarray) -> None:
        """
        Parameters
        ----------
        X : array, shape = (n_samples, n_features)
        y : array, shape = (n_samples, )
        """
        n_samples, n_features = X.shape
        n_samples_inv = 1 / n_samples
        bar_X = X * n_samples_inv
        bar_y = y * n_samples_inv
        tau, sigma = self._step_size(bar_X)

        # initialize: w^0 = 0; dual stacks q = dual[:n_samples] and p = dual[n_samples:]
        res = np.zeros(n_features, dtype=X.dtype)
        dual = np.zeros(n_samples + n_features, dtype=X.dtype)
        # warm-start q with one clipped dual step from w = 0
        dual[:n_samples] = np.clip(-sigma[:n_samples] * bar_y, -1, 1)

        # objective function
        if self.extended_output:
            self.objective_function.append(
                np.sum(np.abs(bar_y)) + self.alpha * np.sum(np.abs(res))
            )

        # main loop
        for _ in trange(self.max_iter):
            # primal step: w^{k+1} = w^k - T((1/n) X^T q^k + alpha p^k)
            w = res - tau * (bar_X.T.dot(dual[:n_samples]) + self.alpha * dual[n_samples:])
            # extrapolation: bar_w = 2 w^{k+1} - w^k
            bar_w = 2 * w - res
            # dual ascent step, then projection onto [-1, 1]
            dual[:n_samples] += sigma[:n_samples] * (bar_X.dot(bar_w) - bar_y)
            dual[n_samples:] += sigma[n_samples:] * self.alpha * bar_w
            dual = np.clip(dual, -1, 1)
            res = w
            if self.extended_output:
                self.objective_function.append(
                    np.sum(np.abs(bar_X.dot(w) - bar_y)) + self.alpha * np.sum(np.abs(res))
                )
        self.coef_ = res

Results: $X$ and $w$ are generated at random, and we examine how well $w$ is recovered. The recovery error depends on $\alpha$; the Lasso implemented here converges slowly, and scikit-learn's implementation appears to do better.
[Figure: lasso.png]
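
A rough way to reproduce the experiment (the data-generating setup and the parameter values are my assumptions; the exact notebook is on GitHub). Note that scikit-learn's Lasso minimizes the squared loss $\frac{1}{2n}\|Xw-y\|_2^2 + \alpha\|w\|_1$ rather than the $\ell_1$ loss used here, so the errors are only roughly comparable:

import numpy as np
from sklearn.linear_model import Lasso as SkLasso

rng = np.random.default_rng(0)
n, m = 200, 50
X = rng.standard_normal((n, m))
w_true = np.zeros(m)
w_true[:5] = rng.standard_normal(5)  # sparse ground truth
y = X @ w_true

model = Lasso(alpha=1e-3, max_iter=10000)
model.fit(X, y)
print('primal-dual error:', np.linalg.norm(model.coef_ - w_true))

sk = SkLasso(alpha=1e-3, max_iter=10000)
sk.fit(X, y)
print('scikit-learn error:', np.linalg.norm(sk.coef_ - w_true))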


  1. Thomas Pock and Antonin Chambolle, "Diagonal preconditioning for first order primal-dual algorithms in convex optimization," 2011 International Conference on Computer Vision (ICCV), 2011. https://ieeexplore.ieee.org/abstract/document/6126441
