
Tips for inspecting instance information in Python

Last updated at 2021-05-22, posted at 2021-05-14

A memo of techniques I expect to use often when inspecting classes in machine learning work.

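Methods for inspecting class information
Example: checking the instance variables held by a PCA (principal component analysis) instance

Methods for inspecting class information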

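module.__file__

Returns the location of the module.

vars(hoge)

Returns the __dict__ attribute of a module, class, instance, or any other object that has a __dict__ attribute.

hoge.__dict__

The dictionary that stores the object's writable attributes. For an ordinary instance, this is the same dictionary that vars(hoge) returns.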

from pprint import pprint

import sklearn

# Location of the installed package
sklearn.__file__
>>>'C:\\anaconda3\\hoge\\hoge\\hoge\\__init__.py'

class hoge:
    def __init__(self,x,y):
        self.x = x
        self.y = y

ins_sample = hoge(30,20)

# vars() and __dict__ return the same dictionary of instance variables
pprint(vars(ins_sample),width=100)
# {'x': 30, 'y': 20}
pprint(ins_sample.__dict__)
# {'x': 30, 'y': 20}
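
One caveat: vars() only works on objects that actually have a __dict__. The sketch below illustrates this with two classes and a helper that are not part of the original article (Hoge, SlotPoint, and safe_vars are hypothetical names). Instances of a class that declares __slots__, as well as built-in values such as integers, have no __dict__, so vars() raises TypeError for them.

```python
class Hoge:
    """Ordinary class: instance variables are stored in __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    """Hypothetical class using __slots__, so its instances have no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def safe_vars(obj):
    """Return vars(obj), or None when the object has no __dict__."""
    try:
        return vars(obj)
    except TypeError:
        return None


ordinary = safe_vars(Hoge(30, 20))      # dictionary of instance variables
slotted = safe_vars(SlotPoint(30, 20))  # None: no __dict__ because of __slots__
number = safe_vars(42)                  # None: int instances have no __dict__

print(ordinary)
print(slotted)
print(number)
```

When vars() fails like this, dir() still works, because it does not depend on __dict__.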

help(np.dot)

Displays the documentation for a function or method.

import numpy as np

help(np.dot)
>>>Help on function dot in module numpy:

dot(...)
    dot(a, b, out=None)
    
    Dot product of two arrays. Specifically,
    
    - If both `a` and `b` are 1-D arrays, it is inner product of vectors
      (without complex conjugation).

    ...
    
    - If `a` is an N-D array and `b` is an M-D array (where ``M>=2``),...

        dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
    
    Parameters
    ----------
    a : array_like
        First argument.
    b : array_like
        Second argument.
    out : ndarray, optional
        Output argument. This must have the exact kind that would be returned
        if it was not used. In particular, it must have the ...
    
    Returns
    -------
    output : ndarray
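
help() prints its text to the screen (often through a pager), which is inconvenient if you want to search or save it. As a minimal sketch, the same text can be obtained as a string with pydoc.render_doc from the standard library. The built-in len is used here only so that the example runs without NumPy; passing np.dot should work the same way.

```python
import pydoc

# Render the help text as a plain string instead of printing it.
# renderer=pydoc.plaintext avoids the bold/overstrike characters used for terminals.
help_text = pydoc.render_doc(len, renderer=pydoc.plaintext)

# The string can now be searched, sliced, or written to a file.
first_line = help_text.splitlines()[0]
print(first_line)
print("number of lines:", len(help_text.splitlines()))
```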

inspect.getmembers(slr)

I often use it like
[x for x in inspect.getmembers(slr) ]
This shows the contents of the class in more detail than vars().

The difference between vars(slr) and inspect.getmembers(slr) is shown below.

import inspect
from sklearn.linear_model import LinearRegression

# X (feature matrix with a single column) and y (targets) are assumed to be defined beforehand
slr = LinearRegression()
slr.fit(X, y)

[x for x in inspect.getmembers(slr) ]

>>>
[('__abstractmethods__', frozenset()),
 ('__class__', sklearn.linear_model._base.LinearRegression),
 ('__delattr__',
  <method-wrapper '__delattr__' of LinearRegression object at 0x0000019507D16910>),
 ('__dict__',
  {'fit_intercept': True,
   'normalize': False,
   'copy_X': True,
   'n_jobs': None,
   'n_features_in_': 1,
   'coef_': array([9.10210898]),
   '_residues': 22061.879196211798,
   'rank_': 1,
   'singular_': array([15.78935652]),
   'intercept_': -34.67062077643857}),
 ('__dir__', <function LinearRegression.__dir__()>),
 ('__doc__',
  '\n    Ordinary least squares Linear Regression.\n\n    LinearRegression fits a linear model with coefficients w = (w1, ..., wp)\n    to minimize the residual sum of squares between the observed targets in\n    the dataset, and the targets predicted by the linear approximation.\n\n    Parameters\n    ----------\n    fit_intercept : bool, default=True\n        Whether to calculate the intercept for this model. If set\n        to False, no intercept will be used in calculations\n        (i.e. data is expected to be centered).\n\n    normalize : bool, default=False\n        This parameter is ignored when ``fit_intercept`` is set to False.\n        If True, the regressors X will be normalized before regression by\n        subtracting the mean and dividing by the l2-norm.\n        If you wish to standardize, please use\n        :class:`sklearn.preprocessing.StandardScaler` before calling ``fit`` on\n        an estimator with ``normalize=False``.\n\n    copy_X : bool, default=True\n        If True, X will be copied; else, it may be overwritten.\n\n    n_jobs : int, default=None\n        The number of jobs to use for the computation. This will only provide\n        speedup for n_targets > 1 and sufficient large problems.\n        ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.\n        ``-1`` means using all processors. See :term:`Glossary <n_jobs>`\n        for more details.\n\n    Attributes\n    ----------\n    coef_ : array of shape (n_features, ) or (n_targets, n_features)\n        Estimated coefficients for the linear regression problem.\n        If multiple targets are passed during the fit (y 2D), this\n        is a 2D array of shape (n_targets, n_features), while if only\n        one target is passed, this is a 1D array of length n_features.\n\n    rank_ : int\n        Rank of matrix `X`. Only available when `X` is dense.\n\n    singular_ : array of shape (min(X, y),)\n        Singular values of `X`. 
Only available when `X` is dense.\n\n    intercept_ : float or array of shape (n_targets,)\n        Independent term in the linear model. Set to 0.0 if\n        `fit_intercept = False`.\n\n    See Also\n    --------\n    sklearn.linear_model.Ridge : Ridge regression addresses some of the\n        problems of Ordinary Least Squares by imposing a penalty on the\n        size of the coefficients with l2 regularization.\n    sklearn.linear_model.Lasso : The Lasso is a linear model that estimates\n        sparse coefficients with l1 regularization.\n    sklearn.linear_model.ElasticNet : Elastic-Net is a linear regression\n        model trained with both l1 and l2 -norm regularization of the\n        coefficients.\n\n    Notes\n    -----\n    From the implementation point of view, this is just plain Ordinary\n    Least Squares (scipy.linalg.lstsq) wrapped as a predictor object.\n\n    Examples\n    --------\n    >>> import numpy as np\n    >>> from sklearn.linear_model import LinearRegression\n    >>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])\n    >>> # y = 1 * x_0 + 2 * x_1 + 3\n    >>> y = np.dot(X, np.array([1, 2])) + 3\n    >>> reg = LinearRegression().fit(X, y)\n    >>> reg.score(X, y)\n    1.0\n    >>> reg.coef_\n    array([1., 2.])\n    >>> reg.intercept_\n    3.0000...\n    >>> reg.predict(np.array([[3, 5]]))\n    array([16.])\n    '),
 ('__eq__',
  <method-wrapper '__eq__' of LinearRegression object at 0x0000019507D16910>),
 ('__format__', <function LinearRegression.__format__(format_spec, /)>),
 ('__ge__',
  <method-wrapper '__ge__' of LinearRegression object at 0x0000019507D16910>),
 ('__getattribute__',
  <method-wrapper '__getattribute__' of LinearRegression object at 0x0000019507D16910>),
 ('__getstate__',
  <bound method BaseEstimator.__getstate__ of LinearRegression()>),
 ('__gt__',
  <method-wrapper '__gt__' of LinearRegression object at 0x0000019507D16910>),
 ('__hash__',
  <method-wrapper '__hash__' of LinearRegression object at 0x0000019507D16910>),
 ('__init__', <bound method LinearRegression.__init__ of LinearRegression()>),
 ('__init_subclass__', <function LinearRegression.__init_subclass__>),
 ('__le__',
  <method-wrapper '__le__' of LinearRegression object at 0x0000019507D16910>),
 ('__lt__',
  <method-wrapper '__lt__' of LinearRegression object at 0x0000019507D16910>),
 ('__module__', 'sklearn.linear_model._base'),
 ('__ne__',
  <method-wrapper '__ne__' of LinearRegression object at 0x0000019507D16910>),
 ('__new__', <function object.__new__(*args, **kwargs)>),
 ('__reduce__', <function LinearRegression.__reduce__()>),
 ('__reduce_ex__', <function LinearRegression.__reduce_ex__(protocol, /)>),
 ('__repr__', <bound method BaseEstimator.__repr__ of LinearRegression()>),
 ('__setattr__',
  <method-wrapper '__setattr__' of LinearRegression object at 0x0000019507D16910>),
 ('__setstate__',
  <bound method BaseEstimator.__setstate__ of LinearRegression()>),
 ('__sizeof__', <function LinearRegression.__sizeof__()>),
 ('__str__',
  <method-wrapper '__str__' of LinearRegression object at 0x0000019507D16910>),
 ('__subclasshook__', <function LinearRegression.__subclasshook__>),
 ('__weakref__', None),
 ('_abc_impl', <_abc_data at 0x1950076adb0>),
 ('_check_n_features',
  <bound method BaseEstimator._check_n_features of LinearRegression()>),
 ('_decision_function',
  <bound method LinearModel._decision_function of LinearRegression()>),
 ('_estimator_type', 'regressor'),
 ('_get_param_names',
  <bound method BaseEstimator._get_param_names of <class 'sklearn.linear_model._base.LinearRegression'>>),
 ('_get_tags', <bound method BaseEstimator._get_tags of LinearRegression()>),
 ('_more_tags',
  <bound method MultiOutputMixin._more_tags of LinearRegression()>),
 ('_preprocess_data',
  <function sklearn.linear_model._base._preprocess_data(X, y, fit_intercept, normalize=False, copy=True, sample_weight=None, return_mean=False, check_input=True)>),
 ('_repr_html_inner',
  <bound method BaseEstimator._repr_html_inner of LinearRegression()>),
 ('_repr_mimebundle_',
  <bound method BaseEstimator._repr_mimebundle_ of LinearRegression()>),
 ('_residues', 22061.879196211798),
 ('_set_intercept',
  <bound method LinearModel._set_intercept of LinearRegression()>),
 ('_validate_data',
  <bound method BaseEstimator._validate_data of LinearRegression()>),
 ('coef_', array([9.10210898])),
 ('copy_X', True),
 ('fit', <bound method LinearRegression.fit of LinearRegression()>),
 ('fit_intercept', True),
 ('get_params', <bound method BaseEstimator.get_params of LinearRegression()>),
 ('intercept_', -34.67062077643857),
 ('n_features_in_', 1),
 ('n_jobs', None),
 ('normalize', False),
 ('predict', <bound method LinearModel.predict of LinearRegression()>),
 ('rank_', 1),
 ('score', <bound method RegressorMixin.score of LinearRegression()>),
 ('set_params', <bound method BaseEstimator.set_params of LinearRegression()>),
 ('singular_', array([15.78935652]))]

vars(slr) 

>>>
{'fit_intercept': True,
 'normalize': False,
 'copy_X': True,
 'n_jobs': None,
 'n_features_in_': 1,
 'coef_': array([9.10210898]),
 '_residues': 22061.879196211798,
 'rank_': 1,
 'singular_': array([15.78935652]),
 'intercept_': -34.67062077643857}
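
The two outputs above differ because vars() returns only what is stored on the instance itself, while inspect.getmembers() returns everything reachable through attribute lookup, including methods and special attributes inherited from the class. The sketch below reproduces this on a small scale. TinyModel is a hypothetical stand-in for an estimator, used so that the example runs without scikit-learn. It also shows one possible way to narrow getmembers() with a predicate.

```python
import inspect


class TinyModel:
    """Hypothetical stand-in for an estimator such as LinearRegression."""

    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept

    def fit(self, xs, ys):
        # Store a "learned" value; the trailing underscore mimics scikit-learn naming.
        self.mean_ = sum(ys) / len(ys)
        return self


model = TinyModel().fit([1, 2, 3], [2.0, 4.0, 6.0])

# vars(): only the attributes stored on the instance itself
instance_vars = vars(model)

# getmembers(): every attribute, including methods and special attributes
all_members = dict(inspect.getmembers(model))

# Narrow getmembers() down to data attributes with a predicate and a name filter
data_members = {
    name: value
    for name, value in inspect.getmembers(model, lambda v: not callable(v))
    if not name.startswith("__")
}

print(instance_vars)
print("fit" in all_members, "fit" in instance_vars)
print(data_members)
```

For this simple class, the filtered getmembers() result matches vars(). Real estimators may also expose class-level attributes and properties, which appear only in getmembers().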

Example: checking the instance variables held by a PCA (principal component analysis) instance

import numpy as np
from pprint import pprint
from sklearn.decomposition import PCA

# Use a lowercase name for the instance so that the PCA class is not shadowed
pca = PCA()

X0 = np.arange(100).reshape(100,1)
X1 = 2*X0
X_demo = np.append(X0,X1, axis =1) + np.random.normal(loc = 0, scale=15, size=200).reshape(100,2)

pca.fit(X_demo)
pc = pca.transform(X_demo)

# Returns the __dict__ attribute of a module, class, instance, or any other object that has a __dict__ attribute
pprint(vars(pca),width=100) # hard to read with width=80

{'_fit_svd_solver': 'full',
 'components_': array([[ 0.31587351,  0.94880131],
       [-0.94880131,  0.31587351]]),
 'copy': True,
 'explained_variance_': array([9245.73144191,  190.24761265]), # eigenvalues of the variance-covariance matrix
 'explained_variance_ratio_': array([0.97983806, 0.02016194]),
 'iterated_power': 'auto',
 'mean_': array([ 47.17932959, 147.75046498]),
#  X_demo.mean(axis=0) --> array([ 47.17932959, 147.75046498])
 'n_components': None,
 'n_components_': 2, # number of principal components
 'n_features_': 2,
 'n_features_in_': 2,
 'n_samples_': 100,
 'noise_variance_': 0.0,
 'random_state': None,
 'singular_values_': array([956.72744956, 137.23889264]),
 'svd_solver': 'auto',
 'tol': 0.0,
 'whiten': False}

# __dict__ is the same dictionary that vars() returns
pprint(pca.__dict__)
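
In the dictionary above, constructor parameters (n_components, whiten, ...) are mixed with values learned during fit (components_, mean_, ...). scikit-learn names learned attributes with a trailing underscore, so one possible way to separate them is to filter the keys of vars(). The following is only a sketch: split_params_and_fitted is a hypothetical helper and FakePCA is a hypothetical stand-in that carries a few of the attribute names shown above, so the example runs without scikit-learn.

```python
def split_params_and_fitted(obj):
    """Split vars(obj) into parameters, fitted attributes, and private attributes.

    Assumes the scikit-learn naming convention: learned attributes end with
    an underscore, and names starting with an underscore are private.
    """
    params, fitted, private = {}, {}, {}
    for name, value in vars(obj).items():
        if name.startswith("_"):
            private[name] = value
        elif name.endswith("_"):
            fitted[name] = value
        else:
            params[name] = value
    return params, fitted, private


class FakePCA:
    """Hypothetical stand-in holding a few attribute names from the output above."""

    def __init__(self):
        self.n_components = None
        self.whiten = False
        self.n_components_ = 2
        self.n_samples_ = 100
        self._fit_svd_solver = "full"


params, fitted, private = split_params_and_fitted(FakePCA())
print("params :", params)
print("fitted :", fitted)
print("private:", private)
```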


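# To get the list of attributes an object has
dir([object])
Without an argument, returns the list of names in the current local scope. With an argument, attempts to return a list of valid attributes for that object.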

pprint(dir(X_demo))
['T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_finalize__',
 '__array_function__',
 '__array_interface__',
 '__array_prepare__',
 '__array_priority__',
 '__array_struct__',
 '__array_ufunc__',
 '__array_wrap__',
 '__bool__',
 '__class__',
 '__complex__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__ilshift__',
 '__imatmul__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmatmul__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__setitem__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__xor__',
 'all',
 'any',
 'argmax',
 'argmin',
 'argpartition',
 'argsort',
 'astype',
 'base',
 'byteswap',
 'choose',
 'clip',
 'compress',
 'conj',
 'conjugate',
 'copy',
 'ctypes',
 'cumprod',
 'cumsum',
 'data',
 'diagonal',
 'dot',
 'dtype',
 'dump',
 'dumps',
 'fill',
 'flags',
 'flat',
 'flatten',
 'getfield',
 'imag',
 'item',
 'itemset',
 'itemsize',
 'max',
 'mean',
 'min',
 'nbytes',
 'ndim',
 'newbyteorder',
 'nonzero',
 'partition',
 'prod',
 'ptp',
 'put',
 'ravel',
 'real',
 'repeat',
 'reshape',
 'resize',
 'round',
 'searchsorted',
 'setfield',
 'setflags',
 'shape',
 'size',
 'sort',
 'squeeze',
 'std',
 'strides',
 'sum',
 'swapaxes',
 'take',
 'tobytes',
 'tofile',
 'tolist',
 'tostring',
 'trace',
 'transpose',
 'var',
 'view']
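
As the output shows, most of what dir() returns is special methods, which makes the interesting names hard to spot. A minimal sketch of one way to tidy this up: drop names starting with an underscore and split the rest into methods and data attributes. public_attributes is a hypothetical helper, and a complex number is used instead of an ndarray so that the example runs without NumPy.

```python
def public_attributes(obj):
    """Return (methods, data) name lists for the public attributes of obj."""
    methods, data = [], []
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and special names
        if callable(getattr(obj, name)):
            methods.append(name)
        else:
            data.append(name)
    return methods, data


methods, data = public_attributes(3 + 4j)
print(methods)  # ['conjugate']
print(data)     # ['imag', 'real']
```

Note that getattr() evaluates properties, so on some objects this can be slow or raise an exception. Wrapping the call in try/except may be necessary there.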

If you want only the docstring or only the source code, refer to the following.
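
As one possibility (not necessarily what the original reference described), the standard inspect module provides getdoc() for the docstring and getsource() for the source code. The sketch below uses textwrap.dedent from the standard library as the target. getsource() only works for objects implemented in Python whose source file is available, so built-ins written in C are handled with try/except here.

```python
import inspect
import textwrap

# Docstring only, with indentation cleaned up
doc = inspect.getdoc(textwrap.dedent)
print(doc.splitlines()[0])

# Source code of a function implemented in Python
source = inspect.getsource(textwrap.dedent)
print(source.splitlines()[0])


def try_getsource(obj):
    """Return the source of obj, or None when it is unavailable."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: built-in objects; OSError: source file not found
        return None


builtin_source = try_getsource(len)  # len is implemented in C
print(builtin_source)
```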

Bonus
Checking reserved words

import keyword

Show the full list of reserved words
keyword.kwlist

To check whether the word 'etc' is a reserved word
keyword.iskeyword('etc')
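
A small sketch that combines keyword.iskeyword() with str.isidentifier() to check whether a string can be used as a variable name. check_name is a hypothetical helper introduced only for this illustration.

```python
import keyword


def check_name(name):
    """Return a short verdict on whether name can be used as a variable name."""
    if keyword.iskeyword(name):
        return "reserved word"
    if not name.isidentifier():
        return "not a valid identifier"
    return "ok"


for candidate in ["etc", "class", "lambda", "2nd"]:
    print(candidate, "->", check_name(candidate))
```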
