[AWS] Amazon Textract: Implementation and Features

Posted at 2023-10-02

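Overview and features of Amazon Textract on AWS

Overview

Amazon Textract is AWS's cloud-based optical character recognition (OCR) service. It extracts printed text, handwritten notes, tables, and similar content from documents as structured data that can be fed into automated processing and analysis. Along with the recognized text, Textract returns the position of each word and line and the row and column of each table cell, so downstream code can work with the document's layout rather than a flat string of characters.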
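To make the "structured data" above concrete, here is a minimal Go sketch that decodes a Textract-style JSON response and lists each WORD block with its position. The field names (Blocks, BlockType, Text, Confidence, Geometry.BoundingBox) follow the Textract response format; the sample JSON is hand-written for illustration and is not real service output, and the helper name blocksOfType is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Block mirrors a small subset of the fields of a Textract response block.
type Block struct {
	BlockType  string
	Text       string
	Confidence float64
	Geometry   struct {
		// Coordinates are ratios of the page width and height (0 to 1).
		BoundingBox struct {
			Left, Top, Width, Height float64
		}
	}
}

// sampleResponse is hand-written illustrative data, not real Textract output.
const sampleResponse = `{"Blocks":[
 {"BlockType":"PAGE"},
 {"BlockType":"LINE","Text":"Invoice 42","Confidence":99.2,
  "Geometry":{"BoundingBox":{"Left":0.1,"Top":0.05,"Width":0.3,"Height":0.02}}},
 {"BlockType":"WORD","Text":"Invoice","Confidence":99.5,
  "Geometry":{"BoundingBox":{"Left":0.1,"Top":0.05,"Width":0.2,"Height":0.02}}},
 {"BlockType":"WORD","Text":"42","Confidence":98.9,
  "Geometry":{"BoundingBox":{"Left":0.32,"Top":0.05,"Width":0.08,"Height":0.02}}}
]}`

// blocksOfType decodes a response and returns only the blocks of one type.
func blocksOfType(raw string, blockType string) ([]Block, error) {
	var resp struct{ Blocks []Block }
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, err
	}
	var out []Block
	for _, b := range resp.Blocks {
		if b.BlockType == blockType {
			out = append(out, b)
		}
	}
	return out, nil
}

func main() {
	words, err := blocksOfType(sampleResponse, "WORD")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, w := range words {
		box := w.Geometry.BoundingBox
		fmt.Printf("%q at left=%.2f top=%.2f (confidence %.1f)\n",
			w.Text, box.Left, box.Top, w.Confidence)
	}
}
```

Passing "LINE" instead of "WORD" returns whole lines, which is usually the more convenient unit when only the text is needed.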

Features

The main features of Amazon Textract are listed below.

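Text extraction: Extracts text from documents such as image files and PDFs, and recognizes both handwritten and printed text with high accuracy.
Table extraction: Extracts tables and grids from a document and returns their cells, rows, and columns in a structured form.
Form extraction: Automatically extracts form fields (name, address, phone number, and so on) as key-value pairs, and can process many fields in a single request.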
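Related-image extraction: Detecting and extracting images related to a document, and identifying objects or text inside them. Note that the Textract API does not document an operation that extracts images or identifies objects; Textract returns text, tables, and form data, and object detection in images is handled by a separate service, Amazon Rekognition.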
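As an illustration of the table structure described in the list above, the following Go sketch rebuilds a grid of cell text from a Textract-style response. In that format each CELL block carries a 1-based RowIndex and ColumnIndex and points to its WORD blocks through a CHILD relationship. The sample JSON is hand-written, the helper name tableGrid is hypothetical, and the sketch assumes the response holds a single table.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Relationship links a block to other blocks by ID.
type Relationship struct {
	Type string
	Ids  []string
}

// Block holds only the fields needed to rebuild a table.
type Block struct {
	Id            string
	BlockType     string
	Text          string
	RowIndex      int
	ColumnIndex   int
	Relationships []Relationship
}

// sampleTable is hand-written illustrative data, not real Textract output.
const sampleTable = `{"Blocks":[
 {"Id":"c1","BlockType":"CELL","RowIndex":1,"ColumnIndex":1,"Relationships":[{"Type":"CHILD","Ids":["w1"]}]},
 {"Id":"c2","BlockType":"CELL","RowIndex":1,"ColumnIndex":2,"Relationships":[{"Type":"CHILD","Ids":["w2"]}]},
 {"Id":"c3","BlockType":"CELL","RowIndex":2,"ColumnIndex":1,"Relationships":[{"Type":"CHILD","Ids":["w3","w4"]}]},
 {"Id":"c4","BlockType":"CELL","RowIndex":2,"ColumnIndex":2,"Relationships":[{"Type":"CHILD","Ids":["w5"]}]},
 {"Id":"w1","BlockType":"WORD","Text":"Item"},
 {"Id":"w2","BlockType":"WORD","Text":"Price"},
 {"Id":"w3","BlockType":"WORD","Text":"Blue"},
 {"Id":"w4","BlockType":"WORD","Text":"pen"},
 {"Id":"w5","BlockType":"WORD","Text":"1.50"}
]}`

// tableGrid returns rows of cell text built from CELL blocks and their child WORD blocks.
func tableGrid(raw string) ([][]string, error) {
	var resp struct{ Blocks []Block }
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, err
	}
	// Index every block by ID and find the table dimensions.
	byID := make(map[string]Block, len(resp.Blocks))
	rows, cols := 0, 0
	for _, b := range resp.Blocks {
		byID[b.Id] = b
		if b.BlockType == "CELL" {
			if b.RowIndex > rows {
				rows = b.RowIndex
			}
			if b.ColumnIndex > cols {
				cols = b.ColumnIndex
			}
		}
	}
	grid := make([][]string, rows)
	for i := range grid {
		grid[i] = make([]string, cols)
	}
	// Fill each cell with the text of its child words, in order.
	for _, b := range resp.Blocks {
		if b.BlockType != "CELL" || b.RowIndex < 1 || b.ColumnIndex < 1 {
			continue
		}
		var words []string
		for _, rel := range b.Relationships {
			if rel.Type != "CHILD" {
				continue
			}
			for _, id := range rel.Ids {
				if child, ok := byID[id]; ok && child.BlockType == "WORD" {
					words = append(words, child.Text)
				}
			}
		}
		grid[b.RowIndex-1][b.ColumnIndex-1] = strings.Join(words, " ")
	}
	return grid, nil
}

func main() {
	grid, err := tableGrid(sampleTable)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, row := range grid {
		fmt.Println(strings.Join(row, " | "))
	}
}
```

A response with several tables would need the cells grouped by their parent TABLE block first; that step is left out here to keep the sketch short.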

Sample code

Java:

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.textract.TextractClient;
import software.amazon.awssdk.services.textract.model.*;

public class TextractJavaSample {

    public static void main(String[] args) throws InterruptedException {

        // try-with-resources closes the client even if a call throws
        try (TextractClient textractClient = TextractClient.builder()
                .region(Region.US_WEST_2)
                .build()) {

            // Asynchronous text detection reads the document from S3.
            // StartDocumentTextDetection takes no feature types; extracting tables or forms
            // requires StartDocumentAnalysis with featureTypes(FeatureType.TABLES) instead.
            StartDocumentTextDetectionRequest request = StartDocumentTextDetectionRequest.builder()
                    .documentLocation(DocumentLocation.builder()
                            .s3Object(S3Object.builder().bucket("my-bucket").name("my-document.pdf").build())
                            .build())
                    .build();

            String jobId = textractClient.startDocumentTextDetection(request).jobId();

            GetDocumentTextDetectionRequest resultRequest = GetDocumentTextDetectionRequest.builder()
                    .jobId(jobId)
                    .build();

            // Wait for the job to finish. Textract has no Describe operation for this job type,
            // so poll GetDocumentTextDetection until the status leaves IN_PROGRESS.
            GetDocumentTextDetectionResponse result = textractClient.getDocumentTextDetection(resultRequest);
            while (result.jobStatus() == JobStatus.IN_PROGRESS) {
                Thread.sleep(5000);
                result = textractClient.getDocumentTextDetection(resultRequest);
            }

            // Stop on FAILED or PARTIAL_SUCCESS instead of looping forever
            if (result.jobStatus() != JobStatus.SUCCEEDED) {
                System.err.println("Job ended with status " + result.jobStatus() + ": " + result.statusMessage());
                return;
            }

            // Process the results: print each detected line.
            // Only the first page of results is read here; follow nextToken() for the rest.
            for (Block block : result.blocks()) {
                if (block.blockType() == BlockType.LINE) {
                    System.out.println(block.text());
                }
            }
        }
    }
}

Go:

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/textract"
	"github.com/aws/aws-sdk-go-v2/service/textract/types"
)

func main() {
	ctx := context.TODO()
	cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion("us-west-2"))
	if err != nil {
		fmt.Println("Configuration error", err)
		return
	}
	client := textract.NewFromConfig(cfg)

	// StartDocumentTextDetectionInput has no FeatureTypes field; extracting tables or
	// forms requires StartDocumentAnalysis with FeatureTypes set (for example
	// types.FeatureTypeTables).
	startResponse, err := client.StartDocumentTextDetection(ctx, &textract.StartDocumentTextDetectionInput{
		DocumentLocation: &types.DocumentLocation{
			S3Object: &types.S3Object{
				Bucket: aws.String("my-bucket"),
				Name:   aws.String("my-document.pdf"),
			},
		},
	})
	if err != nil {
		fmt.Println("Error starting text detection", err)
		return
	}
	jobId := startResponse.JobId
	fmt.Println("Job ID:", aws.ToString(jobId))

	// Wait for the job to finish. There is no Describe operation for this job type,
	// so poll GetDocumentTextDetection until the status leaves IN_PROGRESS.
	var getResponse *textract.GetDocumentTextDetectionOutput
	for {
		getResponse, err = client.GetDocumentTextDetection(ctx, &textract.GetDocumentTextDetectionInput{
			JobId: jobId,
		})
		if err != nil {
			fmt.Println("Error getting text detection results", err)
			return
		}
		if getResponse.JobStatus != types.JobStatusInProgress {
			break
		}
		// Wait 5 seconds between polls
		time.Sleep(5 * time.Second)
	}
	if getResponse.JobStatus != types.JobStatusSucceeded {
		fmt.Println("Job ended with status", getResponse.JobStatus, aws.ToString(getResponse.StatusMessage))
		return
	}

	// Process the results: print each detected line.
	// Only the first page of results is read here; follow NextToken for the rest.
	for _, block := range getResponse.Blocks {
		if block.BlockType == types.BlockTypeLine {
			fmt.Println(aws.ToString(block.Text))
		}
	}
}

C#:

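using System;
using System.Threading.Tasks;
using Amazon;
using Amazon.Textract;
using Amazon.Textract.Model;

namespace TextractSample
{
    class Program
    {
        static async Task Main(string[] args)
        {
            var config = new AmazonTextractConfig
            {
                RegionEndpoint = RegionEndpoint.USWest2,
                // Other configuration options
            };
            using var client = new AmazonTextractClient(config);

            // StartDocumentTextDetectionRequest has no FeatureTypes property; extracting tables
            // or forms requires StartDocumentAnalysisRequest with FeatureTypes set instead.
            var request = new StartDocumentTextDetectionRequest
            {
                DocumentLocation = new DocumentLocation
                {
                    S3Object = new S3Object
                    {
                        Bucket = "my-bucket",
                        Name = "my-document.pdf"
                    }
                }
            };

            var response = await client.StartDocumentTextDetectionAsync(request);
            var jobId = response.JobId;
            Console.WriteLine($"Job ID: {jobId}");

            // Wait for the job to finish. There is no Describe operation for this job type,
            // so poll GetDocumentTextDetectionAsync until the status leaves IN_PROGRESS.
            var resultRequest = new GetDocumentTextDetectionRequest { JobId = jobId };
            var result = await client.GetDocumentTextDetectionAsync(resultRequest);
            while (result.JobStatus == JobStatus.IN_PROGRESS)
            {
                await Task.Delay(5000);
                result = await client.GetDocumentTextDetectionAsync(resultRequest);
            }

            // Stop on FAILED or PARTIAL_SUCCESS instead of looping forever
            if (result.JobStatus != JobStatus.SUCCEEDED)
            {
                Console.WriteLine($"Job ended with status {result.JobStatus}: {result.StatusMessage}");
                return;
            }

            // Process the results: print each detected line.
            // Only the first page of results is read here; follow NextToken for the rest.
            foreach (var block in result.Blocks)
            {
                if (block.BlockType == BlockType.LINE)
                {
                    Console.WriteLine(block.Text);
                }
            }
        }
    }
}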

That concludes this overview of Amazon Textract on AWS, its features, and sample code in Java, Go, and C#.
