
pandas・numpy・python

Posted at 2016-02-13

Counting boolean values

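import numpy as np
# x may be a list, DataFrame, or Series of booleans
np.count_nonzero(x)  # number of True values
np.sum(df)  # not suitable for a DataFrame: sums down the rows, giving one value per column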
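
As a rough illustration of the difference, here is a minimal sketch on a made-up boolean DataFrame (the data and the names `total_true`, `per_column`, and `list_true` are assumptions for this example only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original article
df = pd.DataFrame({"a": [True, False, True], "b": [False, False, True]})

# One overall count of True values across the whole frame
total_true = int(np.count_nonzero(df))

# Column-wise counts: one value per column rather than a single total
per_column = df.sum()

# The same call also works on a plain list
list_true = int(np.count_nonzero([True, False, True, True]))

print(total_true, list_true)
print(per_column.to_dict())
```

When a single number is wanted, `np.count_nonzero` avoids having to think about which axis a sum runs along.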

Partial match on string elements

df[xxx].str.contains("substring")
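
A small sketch of using the result as a row filter, assuming a made-up column called `name`. Note that `str.contains` treats the pattern as a regular expression by default; pass `regex=False` for a literal match.

```python
import pandas as pd

# Hypothetical sample data, including one missing value
df = pd.DataFrame({"name": ["apple pie", "banana", "pineapple", None]})

# na=False treats missing values as "no match", so the mask is purely boolean
mask = df["name"].str.contains("apple", na=False)

# Use the boolean mask to select the matching rows
matched = df[mask]["name"].tolist()
print(matched)
```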

Detecting duplicates

df[xxx].duplicated()
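
The `keep` argument decides which occurrence is treated as the original. A minimal sketch on a made-up `key` column:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"key": ["a", "b", "a", "c", "b"]})

# Default (keep="first"): later occurrences are flagged as duplicates
first_kept = df["key"].duplicated()

# keep="last": earlier occurrences are flagged instead
last_kept = df["key"].duplicated(keep="last")

# keep=False: every member of a duplicated group is flagged
all_marked = df["key"].duplicated(keep=False)

print(first_kept.tolist())
print(last_kept.tolist())
print(all_marked.tolist())
```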

Applying a function to data

sr.apply(func)
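
For instance, assuming a hypothetical helper `to_label` that classifies each value, `apply` calls it once per element and returns a new Series:

```python
import pandas as pd

def to_label(value):
    """Hypothetical helper: label a number as 'high' or 'low'."""
    return "high" if value >= 10 else "low"

sr = pd.Series([3, 12, 10])

# The function is applied element by element; sr itself is not modified
labels = sr.apply(to_label)
print(labels.tolist())
```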

Removing duplicated index entries

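To keep the last row in each group of duplicates:

dup_index = df.index.duplicated(keep='last')  # True for every duplicate except the last occurrence
df[~dup_index]  # invert the mask to drop the flagged rows

Note: the older take_last=True argument raises a FutureWarning and was removed in later pandas releases; use keep='last' instead.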
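
A minimal sketch on made-up data. The mask returned by `duplicated` marks the rows to discard, so it is inverted with `~` before indexing:

```python
import pandas as pd

# Hypothetical sample data with the index label "x" appearing twice
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["x", "y", "x", "z"])

# Boolean array: True for duplicates that should be dropped (all but the last)
dup_index = df.index.duplicated(keep="last")

# Keep only the rows that are not flagged
deduped = df[~dup_index]
print(deduped)
```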

python

Sorting

list.sort(key=func,reverse=True|False)

For func, pass a function that returns the value to sort by:
list.sort(key=lambda a:a[1])
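
A short sketch with made-up tuples. `list.sort` sorts in place and returns `None`, so the list is copied here only to keep both results for comparison:

```python
# Hypothetical sample data: (name, score) pairs
pairs = [("b", 3), ("a", 1), ("c", 2)]

# Sort in place by the second element of each tuple
pairs.sort(key=lambda a: a[1])
by_value = list(pairs)

# Same key, descending order
pairs.sort(key=lambda a: a[1], reverse=True)
by_value_desc = list(pairs)

print(by_value)
print(by_value_desc)
```

If the original order must be preserved, the built-in `sorted()` accepts the same `key` and `reverse` arguments and returns a new list.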

Variable-length arguments

def func(*args, **kwargs): ...
def func(var1=1, var2=2, *args, **kwargs):
    # ~~ (omitted) ~~
    child_func(*args, **kwargs)  # forward the remaining arguments

Note: the order matters. Named parameters come first, then *args, then **kwargs.
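
A runnable sketch of forwarding leftover arguments. Here `child_func` is a stand-in that simply returns what it receives, so the split between named and variable-length arguments is visible:

```python
def child_func(*args, **kwargs):
    # Stand-in for the real callee: return whatever was passed in
    return args, kwargs

def func(var1=1, var2=2, *args, **kwargs):
    # var1 and var2 consume the first two positional arguments;
    # everything else is forwarded unchanged
    return child_func(*args, **kwargs)

result = func(10, 20, 30, 40, key="v")
print(result)
```

The first two positional values are bound to `var1` and `var2`, so only the remainder reaches `child_func`.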

BeautifulSoup and lxml

BeautifulSoup

from bs4 import BeautifulSoup as bs
soup = bs(src)

# soup.find("tag", {"attribute": "value"})
# Example
soup.find("div", {"class": "string"})
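
A self-contained sketch using an inline HTML string instead of a fetched page. The markup and class names are made up, and the parser is set explicitly to the standard library's `html.parser`, which is an assumption of this example rather than something the original specifies:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML source
src = '<html><body><div class="title">Hello</div><div class="body">World</div></body></html>'

# Naming the parser explicitly avoids depending on which parsers are installed
soup = BeautifulSoup(src, "html.parser")

# find() returns the first matching tag, or None when nothing matches
node = soup.find("div", {"class": "body"})
text = node.get_text() if node is not None else None
print(text)
```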

lxml

from lxml import html
dom = html.fromstring(src)

# e.g. copy the XPath from the browser's developer tools
# xpath() returns the matches as a list, so take the first one
XPATH = "//*[@id=\"rfindex\"]/div[2]/div[1]/dl/dd/strong"
dom.xpath(XPATH)[0]
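
A self-contained sketch with a made-up HTML string and XPath. Because `xpath()` returns an empty list when nothing matches, indexing with `[0]` would raise an `IndexError`, so this version checks the list first:

```python
from lxml import html

# Hypothetical HTML source
src = '<html><body><div id="main"><dl><dd><strong>42</strong></dd></dl></div></body></html>'
dom = html.fromstring(src)

# xpath() always returns a list of matching elements
matches = dom.xpath('//*[@id="main"]/dl/dd/strong')

# Guard against an empty result before taking the first element
first_text = matches[0].text if matches else None
print(first_text)
```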

lxml more often fails to parse a page because of problems in the source HTML.
When scraping across page transitions, bs tends to be more stable than lxml.
