[Python Beginner] Basics of Fetching Web Data and Handling Responses

Posted at 2025-06-23

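This post summarizes what I learned about the basic ways to fetch data from the web in Python and inspect the contents of a response. It uses urllib.request from the standard library to send HTTP requests and process responses.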

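Fetching the contents of a web page (`urlopen()`)

urlopen() retrieves data such as HTML or JSON from the specified URL. read() returns bytes, so decode them to get a string.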

from urllib.request import urlopen

url = "https://www.example.com"
with urlopen(url) as response:
    html = response.read().decode("utf-8")

print(html[:200])  # show only the first 200 characters

<!doctype html>
<html>
<head>
    <title>Example Domain</title>
    ...
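The example above hard-codes "utf-8" when decoding. As a minimal sketch, assuming the server declares its encoding in the Content-Type header, the charset can be read from the response instead. The names fetch_text and fallback_charset are my own and not part of urllib.

```python
from urllib.request import urlopen


def fetch_text(url, fallback_charset="utf-8", timeout=10):
    """Fetch a URL and decode the body with the charset the server declares.

    fetch_text and fallback_charset are illustrative names, not urllib APIs.
    """
    with urlopen(url, timeout=timeout) as response:
        # response.headers is an email.message.Message; get_content_charset()
        # returns the charset parameter of Content-Type, or None if absent.
        charset = response.headers.get_content_charset() or fallback_charset
        # errors="replace" avoids an exception if a byte cannot be decoded.
        return response.read().decode(charset, errors="replace")
```

Calling fetch_text("https://www.example.com")[:200] is the equivalent of the print() line above. The timeout argument keeps the call from waiting indefinitely on a slow server.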

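Checking the HTTP status code

You can read the response's status code (200, 404, and so on) from response.status. Note that urlopen() raises urllib.error.HTTPError for error statuses such as 404, so in that case the code is available as the exception's code attribute.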

from urllib.request import urlopen

url = "https://www.example.com"
with urlopen(url) as response:
    print(response.status)
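Since urlopen() raises an exception for error statuses, a status check usually needs a try/except around it. The following is a rough sketch of one way to do that; check_status and its return shape are hypothetical choices of mine, not something urllib provides.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def check_status(url, timeout=10):
    """Return (status_code, note).

    status_code is None when no HTTP response was received at all.
    check_status is an illustrative helper, not part of urllib.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status, "ok"
    except HTTPError as err:
        # The server answered with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return err.code, str(err.reason)
    except URLError as err:
        # No usable response: DNS failure, refused connection, bad scheme, ...
        return None, str(err.reason)
```

With this, a missing page comes back as a value like (404, "Not Found") instead of a traceback, and a network-level failure comes back with None as the status.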

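Getting the response headers

You can inspect the HTTP headers included in the response. getheaders() returns them as a list of (name, value) tuples.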

from urllib.request import urlopen

url = "https://www.example.com"
with urlopen(url) as response:
    headers = response.getheaders()
    for header in headers:
        print(header)

('Content-Type', 'text/html; charset=UTF-8')
('Content-Length', '1256')
...

To get a single header, call getheader() on the same response object.

print(response.getheader("Content-Type"))

text/html; charset=UTF-8
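The response also exposes the headers through its headers attribute, which allows case-insensitive lookups. Here is a small sketch that collects them into a plain dict; headers_as_dict is a helper name I made up for illustration.

```python
from urllib.request import urlopen


def headers_as_dict(url, timeout=10):
    """Return response headers as a dict keyed by lower-cased header name.

    headers_as_dict is an illustrative helper, not part of urllib. If a
    header appears more than once, only the last value is kept.
    """
    with urlopen(url, timeout=timeout) as response:
        # response.headers behaves like a case-insensitive mapping;
        # items() yields (name, value) pairs much like getheaders().
        return {name.lower(): value for name, value in response.headers.items()}
```

For example, headers_as_dict("https://www.example.com").get("content-type") looks up the same header as the getheader() call above, without depending on how the server capitalizes the name.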

Sending a request with a specific HTTP method (`Request` class)

To use a method other than GET (POST, PUT, DELETE, and so on), build a Request object and pass it to urlopen().

from urllib.request import Request, urlopen

url = "https://httpbin.org/put"
req = Request(url, method="PUT")

with urlopen(req) as response:
    print(response.status)
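The PUT request above has no body. In practice a PUT or DELETE to an API often carries a JSON body and headers, so here is a sketch of building such a request. build_json_request is a hypothetical helper, and the payload is made up; only Request itself comes from urllib.

```python
import json
from urllib.request import Request


def build_json_request(url, payload, method="PUT"):
    """Build a Request that carries `payload` as a JSON body.

    build_json_request is a hypothetical helper; only Request is urllib's API.
    """
    body = json.dumps(payload).encode("utf-8")  # Request data must be bytes
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return Request(url, data=body, headers=headers, method=method)


req = build_json_request("https://httpbin.org/put", {"name": "alice"})
print(req.get_method())                # PUT
# Request stores header names via str.capitalize(), hence "Content-type".
print(req.get_header("Content-type"))  # application/json
```

Nothing is sent until req is passed to urlopen(), so the request can be inspected like this before it goes out.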

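Sending data with POST

Passing a bytes object as data makes the request a POST by default. Use urlencode() to build a form-encoded body from a dict, then encode it to bytes.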

from urllib.request import Request, urlopen
from urllib.parse import urlencode

data = urlencode({'name': 'alice', 'age': 30}).encode("utf-8")
url = "https://httpbin.org/post"

req = Request(url, data=data, method="POST")

with urlopen(req) as response:
    print(response.read().decode("utf-8"))

{
  "form": {
    "age": "30",
    "name": "alice"
  },
  ...
}
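To see how the method is chosen and how the JSON reply can be used, the sketch below inspects Request objects without sending anything, then parses a trimmed-down sample of the response shown above. The sample string is my own shortened copy of that output, not a real capture.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

url = "https://httpbin.org/post"

# urlencode() turns a dict into a form-encoded string.
form = urlencode({"name": "alice", "age": 30})
body = form.encode("utf-8")
print(form)  # name=alice&age=30

get_req = Request(url)              # no data: defaults to GET
post_req = Request(url, data=body)  # data given: defaults to POST
print(get_req.get_method())   # GET
print(post_req.get_method())  # POST

# The response body is JSON text, so json.loads() turns it into a dict.
# This sample keeps only the "form" part of the output shown above.
sample = '{"form": {"age": "30", "name": "alice"}}'
parsed = json.loads(sample)
print(parsed["form"]["name"])  # alice
```

Because data alone already selects POST, the method="POST" argument in the example above is optional, though spelling it out makes the intent easier to read.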

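Conclusion

This post covered the basic ways to fetch data from the web and inspect the contents of a response. Response headers and status codes can matter a lot when working with APIs, so I plan to keep trying these out hands-on.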
