On assignment to numpy.ndarray

Posted at 2017-06-12

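A memo, since I got tripped up by assignment to numpy.ndarray.

In three lines

This is about slices and masks on np.ndarray.
In simple uses, both work the same way when assigning into the original object.
A slice returns a view, but a mask (fancy index) returns a copy, so take care when chaining them or doing anything complicated.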

Slices and masks on ndarray

As you know, you can assign to a range of an ndarray specified by a slice.

slice.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, :3] = 1

In [4]: table
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  0.,  0.]])

This works because indexing an ndarray with a slice returns a reference (a view) rather than a copy.
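The view behaviour can be checked directly. The following is a minimal sketch, assuming a reasonably recent NumPy where np.shares_memory is available; the variable name view is only for illustration.

```python
import numpy as np

table = np.zeros((2, 5))
view = table[1, :3]  # basic slicing

# A view shares its buffer with the array it was taken from.
print(np.shares_memory(table, view))
print(view.base is table)

# Writing through the view changes the original array.
view[:] = 1
print(table)
```

If both checks print True, the slice is a view, and the final print shows the first three entries of row 1 set to 1.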

A Boolean mask can do the same thing.

mask.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, [True, True, True, False, False]] = 1

In [4]: table
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  0.,  0.]])

The result is the same.

Where I got stuck: chaining slices and masks on ndarray

Now let's chain them.

chain1.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, :3][1:] = 1

In [4]: table
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  0.,  0.]])

A slice returns a reference, and a reference to a reference is still a reference, so the assignment reached the original table without trouble. Easy.
Now let's try the same thing with a mask.

chain2.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, [True, True, True, False, False]][[False, True, True]] = 1

In [4]: table  # not assigned!!!!
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])

Hm, this is not the result I expected.

chain3.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, [True, True, True, False, False]][1:] = 1

In [4]: table  # not assigned!!!!
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])

The mask+slice chain behaves the same way.

So what is going on...?

Cause and explanation

The SciPy Cookbook has a detailed explanation:
http://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html

The section
"But fancy indexing does seem to return views sometimes, doesn't it?"
contrasts

sample1.py

>>> import numpy
>>> a = numpy.arange(10)
>>> a[[1,2]] = 100
>>> a
array([  0, 100, 100,   3,   4,   5,   6,   7,   8,   9])

with

sample2.py

>>> import numpy
>>> a = numpy.arange(10)
>>> c1 = a[[1,2]]
>>> c1[:] = 100
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> c1
array([100, 100])

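and explains why they differ. The case in this post follows the same pattern.

In short, a mask (fancy index) does not actually return a view.

The mask could overwrite the original object not because it returns a view, but because the Python interpreter translates the assignment statement into an in-place __setitem__ call.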
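This dispatch can be observed without NumPy at all. The sketch below uses a hypothetical class Traced (not part of any library) that only records which special method Python calls on it; its __getitem__ returns a separate object, standing in for the copy that fancy indexing produces.

```python
class Traced:
    """Hypothetical stand-in for an array; it only records the calls it receives."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append((self.name, "__getitem__"))
        # Return a separate object, the way fancy indexing returns a copy.
        return Traced(self.name + "_copy", self.log)

    def __setitem__(self, key, value):
        self.log.append((self.name, "__setitem__"))


log = []
t = Traced("t", log)

t[[1, 2]] = 100      # single statement: one __setitem__ on t
t[[1, 2]][1:] = 100  # chained: __getitem__ on t, then __setitem__ on the result

for entry in log:
    print(entry)
```

In the chained form, the object that receives __setitem__ is whatever __getitem__ returned, not t itself.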

fancy_index1.py

>>> a[[1,2]] = 100

is translated to

fancy_index1_translated.py

a.__setitem__([1,2], 100)

so the values in the original a are changed. However,

fancy_index2.py

c1 = a[[1,2]]

is translated to

fancy_index2_translated.py

c1 = a.__getitem__([1,2])

which returns a copy rather than a view, so the result changes the moment the operation is split across two lines.
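The copy can be confirmed in the same way as a view. This is a minimal sketch, assuming np.shares_memory is available; the index values and the names c1 and s are chosen only for illustration.

```python
import numpy as np

a = np.arange(10)

# Statement form: Python calls __setitem__ on a itself, so a is modified.
a[[1, 2]] = 100

# Two-step form: __getitem__ with a list index builds a new array first.
c1 = a[[3, 4]]
print(np.shares_memory(a, c1))  # the fancy-indexed result has its own buffer
c1[:] = -1                      # only the copy changes

# For comparison, a basic slice does share memory with a.
s = a[5:7]
print(np.shares_memory(a, s))

print(a)
print(c1)
```

Only the statement-form assignment reaches a; the write to c1 stays in the copy.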

Applying this also explains the results of chain2.py and chain3.py.

table[1, [True, True, True, False, False]]

returns a copy, so no matter how much you overwrite it afterwards, the original table is never modified.
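One possible way to get the intended effect of chain2.py is to collapse the two masks into a single index, so that only one __setitem__ call is made on table. This is a sketch of that idea, not the only approach; the names outer, inner and cols are assumptions for illustration.

```python
import numpy as np

table = np.zeros((2, 5))
outer = np.array([True, True, True, False, False])  # columns selected first
inner = np.array([False, True, True])               # selection within those columns

# Translate the nested selection into column indices of the original row.
cols = np.flatnonzero(outer)[inner]

# A single indexing expression, so __setitem__ runs on table itself.
table[1, cols] = 1
print(table)
```

Here np.flatnonzero turns the outer mask into positions, and the inner mask then picks from those positions, so the assignment targets columns of the original array directly.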

Finally, a quiz

chain4.py

In [1]: import numpy as np

In [2]: table = np.zeros((2, 5))

In [3]: table[1, :3][False, True, True] = 1

In [4]: table
Out[4]: ???

What result does this return?
