Finding the position at or above a threshold with NumPy

Posted at 2017-04-20

With NumPy, find the index where value >= threshold first holds in data sorted in ascending order.

Methods

  1. argmin of "np.array < th"
  2. searchsorted for "np.array >= th"
  3. length of "np.array < th"
  4. sum of "np.array < th"
  5. where on "np.array >= th"
  6. search with a for loop
  7. length of "takewhile"
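
As a small illustration of method 2, the sketch below shows how the `side` argument of `np.searchsorted` relates to the "value >= threshold" condition. The array and threshold are made-up values chosen so that the two options differ; they are not from the measurements in this article.

```python
import numpy as np

# Sorted data with a repeated value, so the two `side` options give different answers.
x = np.array([1, 2, 2, 3])
th = 2

# Default side="left": first index i with x[i] >= th.
left = np.searchsorted(x, th)
# side="right": first index i with x[i] > th.
right = np.searchsorted(x, th, side="right")

print(left, right)  # 1 3
```

The default `side="left"` is the one that matches the goal of this article.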

Confirm that all seven methods give the correct result

python
import numpy as np
from more_itertools import ilen
from itertools import takewhile

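def find_index(a, th):
    # Return the index of the first element with value >= th.
    for i, v in enumerate(a):
        if v >= th:
            return i
    return len(a)  # no element reaches th (same answer as searchsorted)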

n = 10000000
x = np.arange(n)
th = n / 10
print((x < th).argmin())
print(np.searchsorted(x, th))
print(len(x[x < th]))
print((x < th).sum())
print(np.where(x>=th)[0][0])
print(find_index(x, th))
print(ilen(takewhile(lambda i: i < th, x)))
>>>
1000000
1000000
1000000
1000000
1000000
1000000
1000000
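
The methods agree here because the threshold lies inside the data. As a hedged aside, the sketch below shows how some of them differ when no element reaches the threshold; the small array and the threshold of 10 are illustrative values, not part of the original test.

```python
import numpy as np

x = np.arange(5)   # [0 1 2 3 4]
th = 10            # larger than every element

mask = x < th                          # all True
idx_argmin = int(mask.argmin())        # 0: argmin returns the first index when all values are equal
idx_search = int(np.searchsorted(x, th))  # 5: len(x), i.e. one past the end
idx_sum = int(mask.sum())              # 5
hits = np.where(x >= th)[0]            # empty array: hits[0] would raise IndexError

print(idx_argmin, idx_search, idx_sum, hits.size)  # 0 5 5 0
```

So with argmin a result of 0 is ambiguous, and with where the result needs an emptiness check before indexing.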

Timing

python
%timeit np.searchsorted(x, th)
%timeit (x < th).argmin()
%timeit len(x[x < th])
%timeit (x < th).sum()
%timeit np.where(x >= th)[0][0]
%timeit find_index(x, th)
%timeit ilen(takewhile(lambda i: i < th, x))
>>>
3.84 µs ± 347 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
10.2 ms ± 215 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
12.5 ms ± 62.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
21.2 ms ± 411 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
43.1 ms ± 635 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
3.36 µs ± 19.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.74 µs ± 75.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

python
th = 10
%timeit np.searchsorted(x, th)
%timeit (x < th).argmin()
%timeit len(x[x < th])
%timeit (x < th).sum()
%timeit np.where(x >= th)[0][0]
%timeit find_index(x, th)
%timeit ilen(takewhile(lambda i: i < th, x))
>>>
3.31 µs ± 24.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
9.86 ms ± 28.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.1 ms ± 530 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
21.7 ms ± 761 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
46.9 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
3.76 µs ± 310 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
6.01 µs ± 231 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
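
`%timeit` is an IPython magic. Outside IPython, a comparable measurement can be sketched with the standard `timeit` module. The helper name `bench`, the smaller `n`, and the repeat counts below are assumptions made to keep the run short, so the numbers will not match the ones above.

```python
import timeit
import numpy as np

n = 1_000_000          # smaller than the article's n, to keep the run short (assumption)
x = np.arange(n)
th = n // 10

def bench(func, number=20, repeat=3):
    # Hypothetical helper: best of `repeat` runs, reported as seconds per call.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

t_search = bench(lambda: np.searchsorted(x, th))
t_argmin = bench(lambda: (x < th).argmin())

print(f"searchsorted: {t_search * 1e6:.2f} us per call")
print(f"argmin:       {t_argmin * 1e6:.2f} us per call")
```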

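Discussion

  • searchsorted is the best choice: it uses binary search on the sorted data and stays in the microsecond range, while argmin, len, sum and where scan the whole array and take roughly 10 to 47 ms
  • With argmin or the for loop, the data does not need to be sorted (they return the first position where the value reaches the threshold)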
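
To illustrate the second point, here is a minimal sketch on unsorted data. The values are made up for the example. `np.searchsorted` is left out because it assumes sorted input.

```python
import numpy as np

x = np.array([1, 7, 3, 9, 2])   # not sorted
th = 5

first = int((x < th).argmin())            # first index whose value is >= th
also_first = int(np.where(x >= th)[0][0]) # where gives the same position
count_below = int((x < th).sum())         # number of elements below th, not a position

print(first, also_first, count_below)  # 1 1 3
```

On unsorted data the count-based methods (length and sum) no longer give a position, which is why they are only valid for ascending input.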

That's all.
