How to save PDF pages as images in Python

Introduction

These are notes I keep as a personal reminder.

Methods

PyMuPDF

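import fitz  # PyMuPDF

with fitz.open("test.pdf") as doc:
    for i, page in enumerate(doc):
        # Render the page to a raster image (Pixmap) at the default resolution
        image = page.get_pixmap()
        # Output files are numbered from 1: 1.jpg, 2.jpg, ...
        file_name = f"{i+1}.jpg"
        image.save(file_name)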
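
The snippet above renders at PyMuPDF's default scale, where one PDF point becomes one pixel. If a sharper image is needed, a zoom matrix can be passed to get_pixmap. The following is only a rough sketch: zoom_for_dpi and render_pages are hypothetical helper names, not part of PyMuPDF, and PNG is used as the output format here for simplicity.

```python
from pathlib import Path


def zoom_for_dpi(dpi, base_dpi=72):
    """Zoom factor relative to the PDF default of 72 points per inch."""
    return dpi / base_dpi


def render_pages(pdf_path, out_dir, dpi=144):
    """Render every page of pdf_path into out_dir as 1.png, 2.png, ..."""
    import fitz  # PyMuPDF, imported here so the helper above works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    zoom = zoom_for_dpi(dpi)
    saved = []
    with fitz.open(pdf_path) as doc:
        for number, page in enumerate(doc, start=1):
            # Scale both axes by the same factor to keep the aspect ratio
            pixmap = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
            target = out_dir / f"{number}.png"
            pixmap.save(str(target))
            saved.append(target)
    return saved
```

Calling render_pages("test.pdf", "out", dpi=144) would then write images with twice the pixel dimensions of the default rendering.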

pikepdf + pdf2image

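from pdf2image import convert_from_path
from pikepdf import Pdf

source_path = "test.pdf"  # input PDF (placeholder path)

with Pdf.open(source_path) as pdf:
    for i, page in enumerate(pdf.pages):
        page_size = page.MediaBox  # [llx, lly, urx, ury] in points
        # Write the current page to its own single-page PDF
        temp_pdf_file = f"temp_{i+1}.pdf"
        temp_pdf = Pdf.new()
        temp_pdf.pages.append(page)
        temp_pdf.save(temp_pdf_file)
        image_file_path = convert_from_path(
            temp_pdf_file,
            fmt="jpeg",
            # paths_only requires output_folder; without it, images are returned in memory and nothing is saved
            output_folder=".",
            paths_only=True,
            # The default output size differs from the page size, so the page size is passed explicitly
            size=(page_size[2], page_size[3]),
        )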
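
The size mismatch noted in the comment is presumably because pdf2image renders at 200 dpi by default, while MediaBox values are in points (1/72 inch). The code above also takes the upper-right corner as the size, which only equals the page size when the box starts at (0, 0). As a small sketch, the hypothetical helper below (media_box_to_pixels is my own name, not a pikepdf or pdf2image function) derives a pixel size from any MediaBox and an optional target dpi.

```python
POINTS_PER_INCH = 72  # PDF default user space unit: 1 point = 1/72 inch


def media_box_to_pixels(media_box, dpi=POINTS_PER_INCH):
    """Return (width, height) in pixels for a [llx, lly, urx, ury] box."""
    llx, lly, urx, ury = (float(value) for value in media_box)
    # Subtract the lower-left corner so boxes that do not start at (0, 0) work too
    width_pt = abs(urx - llx)
    height_pt = abs(ury - lly)
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


a4 = [0, 0, 595.28, 841.89]
print(media_box_to_pixels(a4))           # one pixel per point
print(media_box_to_pixels(a4, dpi=144))  # twice the pixel dimensions
```

The returned tuple could be passed as the size argument of convert_from_path in place of (page_size[2], page_size[3]).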

PyPDF2 + pdf2image

This is basically the same as above; only the PDF loading has been switched to PyPDF2.

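from pdf2image import convert_from_path
# Written against the PyPDF2 1.x API. PyPDF2 3.0 removed these names in favor of
# PdfReader, PdfWriter, reader.pages, add_page and mediabox.
from PyPDF2 import PdfFileReader, PdfFileWriter

source_path = "test.pdf"  # input PDF (placeholder path)

with open(source_path, "rb") as file:
    pdf = PdfFileReader(file)
    page_count = pdf.getNumPages()
    for i in range(page_count):
        page = pdf.getPage(i)
        page_size = page.mediaBox
        # Write the current page to its own single-page PDF
        temp_pdf_file = f"temp_{i+1}.pdf"
        temp_pdf = PdfFileWriter()
        temp_pdf.addPage(page)
        with open(temp_pdf_file, "wb") as new_file:
            temp_pdf.write(new_file)
        image_file_path = convert_from_path(
            temp_pdf_file,
            fmt="jpeg",
            # paths_only requires output_folder; without it, images are returned in memory and nothing is saved
            output_folder=".",
            paths_only=True,
            # Pass the page size explicitly, as in the pikepdf version
            size=(page_size.upperRight[0], page_size.upperRight[1]),
        )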
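
Both pdf2image variants leave a temp_N.pdf file behind for every page. One way to avoid that, shown here only as a sketch, is to write the intermediate files into a tempfile.TemporaryDirectory so they are removed automatically. The helper name temp_page_path is hypothetical, and the placeholder bytes stand in for the real single-page PDF.

```python
import tempfile
from pathlib import Path


def temp_page_path(work_dir, page_number):
    """Path of the intermediate single-page PDF, named like temp_1.pdf."""
    return Path(work_dir) / f"temp_{page_number}.pdf"


with tempfile.TemporaryDirectory() as work_dir:
    for page_number in range(1, 4):  # stand-in for looping over real pages
        path = temp_page_path(work_dir, page_number)
        # This is where temp_pdf.write(...) or temp_pdf.save(...) would go
        path.write_bytes(b"placeholder")
        # convert_from_path(str(path), ...) would run here, before cleanup
    created = sorted(p.name for p in Path(work_dir).iterdir())

print(created)                   # the three intermediate file names
print(Path(work_dir).exists())   # False: the directory is gone after the block
```

The images themselves should be written somewhere outside work_dir, since everything inside it is deleted when the with block ends.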

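Bonus

As of 2022/02/08, trying to install pikepdf on an AWS Lambda Arm64 instance fails with qpdf-related errors,
so PyPDF2 looks like the better choice in that case.
Everywhere else (x86), pikepdf looks like the better choice. (Its main processing is implemented in C++, so it is presumably faster.)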
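
The rule of thumb above could be written as a small check on the CPU architecture. This is just a sketch: pick_pdf_backend is a hypothetical helper, the set of Arm machine names is my assumption, and the pikepdf installation problem was only observed as of the date noted above, so it may no longer apply.

```python
import platform

# Machine names commonly reported for 64-bit Arm (assumed list)
ARM_MACHINES = {"arm64", "aarch64"}


def pick_pdf_backend(machine=None):
    """Return the library name suggested by the note above for this CPU."""
    machine = (machine or platform.machine()).lower()
    return "PyPDF2" if machine in ARM_MACHINES else "pikepdf"


print(pick_pdf_backend())
```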
