
Saving PIL.Image images to s3

Posted at 2019-09-20

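I built an API for uploading JPEG images. It processes each image with Pillow and then saves it to AWS s3, using aiobotocore for the upload.

…My Japanese isn't good enough yet, so the rest is in English…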

So, I built an HTTP API for uploading JPEG images.

The images are read by the API, processed by Python Pillow, and then stored in AWS s3. I use aiobotocore to asynchronously upload the raw binary data to s3.

Here's the code to async upload a file to s3:

import aiohttp.web
import io
import aiobotocore
from engine.config import config


async def upload_object(
    request: aiohttp.web.Request,
    key: str,
    bucket: str,
    data: bytes,
    public_read: bool = False,
):
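    """
    Helper function to upload a single object to s3.
    Args:
        :request aiohttp.web.Request: The incoming request; provides the event loop and the shared s3 semaphore.
        :key str: The path to where the object will be stored in s3, e.g. data/annoy/test.py
        :bucket str: Name of the s3 bucket.
        :data bytes: The encoded file contents to upload.
        :public_read bool: If True, set the object's ACL to public-read after uploading.
    """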
    loop = request.app.loop
    semaphore = request.app["s3_semaphore"]

    async with semaphore:
        try:
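            # 2019-era aiobotocore API; newer releases use
            # aiobotocore.session.get_session() with no loop argument
            session = aiobotocore.get_session(loop=loop)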
            async with session.create_client(
                "s3",
                aws_access_key_id=config["aws"]["access_key_id"],
                aws_secret_access_key=config["aws"]["access_key_secret"],
            ) as aclient:
                await aclient.put_object(
                    Bucket=bucket,
                    Key=key,
                    Body=io.BytesIO(data),
                )
                if public_read:
                    await aclient.put_object_acl(
                        Bucket=bucket, Key=key, ACL="public-read"
                    )
        except TypeError as e:
            # HTTPException is only the base class; raise a concrete 500 response
            raise aiohttp.web.HTTPInternalServerError(
                text="Failed to upload file"
            ) from e

Note that the data argument is of type bytes.
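
The semaphore in upload_object limits how many uploads talk to s3 at the same time. The article doesn't show how request.app["s3_semaphore"] is created, so here is a minimal sketch using only asyncio. The names fake_upload, run_uploads and MAX_CONCURRENT_UPLOADS are made up for illustration, and the sleep stands in for the real s3 call.

```python
import asyncio

MAX_CONCURRENT_UPLOADS = 2  # hypothetical limit, not taken from the article


async def fake_upload(semaphore, stats):
    """Stand-in for upload_object: only the semaphore handling is real."""
    async with semaphore:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # pretend to talk to s3
        stats["running"] -= 1


async def run_uploads(n_uploads):
    # In the API, this object would be stored once in app["s3_semaphore"].
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_UPLOADS)
    stats = {"running": 0, "peak": 0}
    await asyncio.gather(
        *(fake_upload(semaphore, stats) for _ in range(n_uploads))
    )
    return stats["peak"]


# Five uploads are started, but no more than two ever run at once.
peak = asyncio.run(run_uploads(5))
print(peak)
```

The remaining uploads simply wait at `async with semaphore` until a slot frees up.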

The API reads the HTTP multipart request, and creates a PIL.Image object. After manipulating the image, the API calls await upload_object() like this:

resp = await s3.upload_object(
    request=request,
    bucket=image_bucket,
    key=image_key,
    data=pil_image.tobytes(),
    public_read=public_read,
)

where request is the aiohttp.web.Request sent to the handler. The other arguments should be self-explanatory.

However this didn't work

The first version of the API didn't upload the image data to s3 correctly: the file downloaded from s3 was corrupted. The problem, I found out, is that image.tobytes() returns the raw bytes of the internal PIL image representation, not the JPEG binary data. In other words, you get the decoded, uncompressed pixel values with no JPEG header and no compression, so the object stored in s3 is not a valid JPEG file. This behavior is documented:

In [154]: i = Image.open('/Users/halfdan/Desktop/food.jpg')

In [155]: i.tobytes?
Signature: i.tobytes(encoder_name='raw', *args)
Docstring:
Return image as a bytes object.

.. warning::

    This method returns the raw image data from the internal
    storage.  For compressed image data (e.g. PNG, JPEG) use
    :meth:`~.save`, with a BytesIO parameter for in-memory
    data.

:param encoder_name: What encoder to use.  The default is to
                     use the standard "raw" encoder.
:param args: Extra arguments to the encoder.
:rtype: A bytes object.
File:      ~/.pyenv/versions/3.6.5/lib/python3.6/site-packages/PIL/Image.py
Type:      method
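
To make the warning in the docstring concrete, here is a small sketch comparing the two outputs. It uses a synthetic 64x48 RGB image instead of the food.jpg from the session above; the size and color are arbitrary choices for illustration.

```python
import io

from PIL import Image

# A small synthetic RGB image stands in for the uploaded photo.
image = Image.new("RGB", (64, 48), color=(200, 30, 30))

# tobytes() gives the decoded pixels: 3 bytes per pixel for an RGB image.
raw = image.tobytes()

# save() into a BytesIO gives an encoded JPEG file.
buffer = io.BytesIO()
image.save(buffer, format="JPEG")
jpeg = buffer.getvalue()

print(len(raw) == 64 * 48 * 3)   # raw size depends only on the dimensions
print(jpeg[:2] == b"\xff\xd8")   # JPEG files start with the SOI marker
print(raw[:2] == b"\xff\xd8")    # the raw pixel data does not

# The encoded bytes can be opened again as an image.
reopened = Image.open(io.BytesIO(jpeg))
print(reopened.format, reopened.size)
```

The raw bytes are just pixel values, so an image viewer has no header to tell it the format or the dimensions.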

Solution

The solution was to use image.save and write the data into a memory buffer, like this:

# using pil_image.tobytes() doesn't work, so use pil_image.save instead
image_bytes = io.BytesIO()
pil_image.save(image_bytes, format="JPEG")
image_bytes = image_bytes.getvalue()
resp = await s3.upload_object(
    request=request,
    bucket=image_bucket,
    key=image_key,
    data=image_bytes,
    public_read=public_read,
)

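Note that

image_bytes.read()

returns an empty bytes object (b""), because save() leaves the stream position at the end of the buffer and there is nothing left to read from there. Hence you need to call

image_bytes.getvalue()

which returns the whole buffer regardless of the position. Calling image_bytes.seek(0) before read() also works.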
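
The difference between read() and getvalue() comes from the stream position of io.BytesIO. Here is a minimal sketch with the standard library only; the payload b"jpeg data" is a placeholder for the bytes that pil_image.save would write.

```python
import io

buffer = io.BytesIO()
buffer.write(b"jpeg data")    # like save(), this leaves the position at the end

after_write = buffer.read()   # reads from the current position: nothing left
whole = buffer.getvalue()     # returns the full contents, ignoring the position

buffer.seek(0)                # rewind to the start
after_seek = buffer.read()    # now read() returns everything

print(after_write)
print(whole)
print(after_seek)
```

getvalue() is the simplest choice when you want the whole buffer as bytes. seek(0) is useful when you hand the buffer itself to something that reads from a file-like object.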

That was a good lesson!
