Beautiful Soup 4 Advent Calendar 2025

@CookieBox26 (Chihiro Mihara)

How [] and in behave on a Beautiful Soup Tag

Posted at 2025-12-12, last updated at 2026-01-04

Verified with beautifulsoup4==4.12.3.

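Summary

For a bs4.element.Tag object x, x[attr_name] means "the value of x's attribute attr_name", whereas y in x means "whether the HTML element y (a child tag or a text string) sits directly under x".
- Counterintuitively, [] and in operate on different members. Being able to read an attribute with x[attr_name] does not make attr_name in x a test for that attribute's existence; using it that way is a mistake.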

Snippet

from bs4 import BeautifulSoup

text = '<div id="hoge"><p>Apple</p>Banana</div>'
soup = BeautifulSoup(text, 'html.parser')
div = soup.find('div')
p = soup.find('p')

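# 1: square brackets read an attribute value
assert div['id'] == 'hoge'
# 2: in on the tag itself does NOT look at attributes
assert 'id' not in div
# 3: in on .attrs (a dict) tests whether an attribute exists
assert 'id' in div.attrs
assert type(div.attrs) is dict
# 4: in on the tag tests direct children (.contents, a list)
assert p in div
assert 'Apple' not in div  # 'Apple' is inside <p>, not directly under <div>
assert 'Banana' in div     # 'Banana' is a text node directly under <div>
assert type(div.contents) is list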

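1. A bs4.element.Tag object gives access to its attributes through square brackets.
2. That does not mean the in operator can test whether an attribute exists; it cannot.
3. To test whether an attribute exists, apply the in operator to the .attrs member (.attrs exposes the dictionary of the tag's attributes).
4. Applying the in operator directly to a bs4.element.Tag object tests for a direct child instead, because in on a Tag is implemented as in on .contents, the list of its direct children.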
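
The practical consequences of these points can be sketched as follows. The HTML string here is a made-up variation on the snippet above with one extra level of nesting, and has_attr() and get() are shown as alternatives to in on .attrs; this is an illustration under those assumptions rather than a recommendation from the Beautiful Soup documentation.

```python
from bs4 import BeautifulSoup

# Made-up input: <span> is a grandchild of <div>, not a direct child.
html = '<div id="hoge"><p><span>Apple</span></p>Banana</div>'
soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div')
p = soup.find('p')
span = soup.find('span')

# Attribute presence: ask the attribute dictionary, not the tag itself.
assert 'id' in div.attrs
assert div.has_attr('id')
assert not div.has_attr('class')

# get() returns None for a missing attribute,
# whereas div['class'] would raise KeyError.
assert div.get('id') == 'hoge'
assert div.get('class') is None

# Membership on the tag itself only looks at direct children.
assert p in div
assert span not in div   # span is nested one level deeper
assert span in p

# To include nested nodes, search the descendants explicitly.
assert span in div.descendants
```

If the question is "does this tag contain that node anywhere below it", iterating .descendants (or using find() / find_all()) is the tool; in on the tag only answers the direct-child question.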

References

Documentation and source for 1-3:
- https://www.crummy.com/software/BeautifulSoup/bs4/doc/#Tag.attrs
- https://git.launchpad.net/beautifulsoup/tree/bs4/element.py?h=4.13#n2203
Source for 4:
- https://git.launchpad.net/beautifulsoup/tree/bs4/element.py?h=4.13#n2216
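
The mechanism described in this article, [] delegating to the attribute dictionary and in delegating to the list of direct children, can be modeled with a few lines of plain Python. ToyTag below is a hypothetical stand-in written for illustration; it is not Beautiful Soup's code and omits everything else a real Tag does.

```python
class ToyTag:
    """Hypothetical stand-in for a tag; not Beautiful Soup's actual class."""

    def __init__(self, name, attrs=None, contents=None):
        self.name = name
        self.attrs = dict(attrs or {})        # attribute name -> value
        self.contents = list(contents or [])  # direct children only

    def __getitem__(self, key):
        # tag[key] is routed to the attribute dictionary.
        return self.attrs[key]

    def __contains__(self, item):
        # item in tag is routed to the list of direct children.
        return item in self.contents


# Same shape as '<div id="hoge"><p>Apple</p>Banana</div>'.
p = ToyTag('p', contents=['Apple'])
div = ToyTag('div', attrs={'id': 'hoge'}, contents=[p, 'Banana'])

assert div['id'] == 'hoge'   # [] sees attributes
assert 'id' not in div       # in does not see attributes
assert 'id' in div.attrs     # the attribute dictionary does
assert p in div              # in sees direct children
assert 'Apple' not in div    # 'Apple' belongs to p, not to div
assert 'Banana' in div
```

Because __getitem__ and __contains__ are independent special methods, nothing in Python forces them to consult the same container, which is why the two operators can diverge on one object.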
