
AWS Diary 13 (Textract)

Last updated at 2022-03-04, posted at 2020-06-28

Introduction

In this post I try out the text extraction feature of Amazon Textract.

The page I built for this post to try Amazon Textract's text extraction

Preparation

[Set up Lambda and API Gateway.]
(https://qiita.com/tanaka_takurou/items/3f93295de6cff060ec09)

[Amazon Textract documentation]
Amazon Textract

Note: As of June 2020, Textract was not available in the Asia Pacific (Tokyo) region, so I tried it in the US West (Oregon) region.

Building the web page and API

I use aws-lambda-go, the AWS Lambda function handler library for Go, to write the code that returns HTML and JSON.
Textract itself is called through aws-sdk-go.
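As a rough illustration of a single handler that returns either HTML or JSON, here is a minimal sketch. To keep it runnable without dependencies, `proxyRequest` and `proxyResponse` are local stand-ins for `events.APIGatewayProxyRequest` and `events.APIGatewayProxyResponse` from aws-lambda-go, and the page body and JSON shape are assumptions made up for this example, not the code used in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// proxyRequest and proxyResponse are local stand-ins for
// events.APIGatewayProxyRequest and events.APIGatewayProxyResponse
// from aws-lambda-go, reduced to the fields this sketch needs.
type proxyRequest struct {
	HTTPMethod string
	Body       string
}

type proxyResponse struct {
	StatusCode int
	Headers    map[string]string
	Body       string
}

// handler returns an HTML page for GET and a JSON document for any
// other method. The page content and JSON shape are illustrative only.
func handler(req proxyRequest) (proxyResponse, error) {
	if req.HTTPMethod == http.MethodGet {
		return proxyResponse{
			StatusCode: http.StatusOK,
			Headers:    map[string]string{"Content-Type": "text/html"},
			Body:       "<html><body>Textract demo</body></html>",
		}, nil
	}
	// Report how many bytes were received, encoded as JSON.
	body, err := json.Marshal(map[string]int{"received": len(req.Body)})
	if err != nil {
		return proxyResponse{StatusCode: http.StatusInternalServerError}, err
	}
	return proxyResponse{
		StatusCode: http.StatusOK,
		Headers:    map[string]string{"Content-Type": "application/json"},
		Body:       string(body),
	}, nil
}

func main() {
	// In a real Lambda, main would pass a handler that uses the
	// aws-lambda-go event types to lambda.Start instead.
	res, _ := handler(proxyRequest{HTTPMethod: "POST", Body: "abc"})
	fmt.Println(res.StatusCode, res.Body) // prints: 200 {"received":3}
}
```

Setting the Content-Type header per branch is what lets one function serve both the page and the API.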

[References]
AWS SDK for Go API Reference
Trying out Amazon Textract
A summary of Amazon Textract (OCR)

To extract text, use AnalyzeDocument.
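AnalyzeDocument returns its result as a flat list of blocks, each tagged with a block type such as PAGE, LINE, or WORD. The sketch below shows one way to pull text out of such a list. The `block` struct is a local stand-in for `textract.Block` (which stores these fields as `*string`), `linesOnly` is a hypothetical helper, and the sample data is hand-written rather than real Textract output.

```go
package main

import (
	"fmt"
	"strings"
)

// block is a local stand-in for textract.Block, keeping only the two
// fields used here. The real type stores them as *string.
type block struct {
	BlockType string
	Text      string
}

// linesOnly collects the text of LINE blocks. A LINE block already
// contains the text of its child WORD blocks, so skipping WORD blocks
// avoids listing the same text twice.
func linesOnly(blocks []block) []string {
	var lines []string
	for _, b := range blocks {
		if b.BlockType == "LINE" {
			lines = append(lines, b.Text)
		}
	}
	return lines
}

func main() {
	// Hand-written sample data, not real Textract output.
	blocks := []block{
		{BlockType: "PAGE"},
		{BlockType: "LINE", Text: "Hello world"},
		{BlockType: "WORD", Text: "Hello"},
		{BlockType: "WORD", Text: "world"},
	}
	fmt.Println(strings.Join(linesOnly(blocks), "\n")) // prints "Hello world"
}
```

Whether to keep LINE blocks, WORD blocks, or both depends on what the caller wants to display. Keeping both gives every word twice.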

main.go

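// Imports required by analyzeDocument.
import (
        "encoding/base64"
        "encoding/json"
        "log"
        "strings"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/textract"
)

// analyzeDocument takes an image as a base64 string (optionally prefixed
// with a data URL header), sends it to Textract, and returns the detected
// text as a JSON array of strings.
func analyzeDocument(img string) (string, error) {
        // Drop the "data:...;base64," prefix if there is one.
        b64data := img[strings.IndexByte(img, ',')+1:]
        data, err := base64.StdEncoding.DecodeString(b64data)
        if err != nil {
                log.Print(err)
                return "", err
        }
        // session.New is deprecated; NewSession reports errors directly.
        sess, err := session.NewSession()
        if err != nil {
                return "", err
        }
        // Textract is called in us-west-2 (Oregon).
        svc := textract.New(sess, &aws.Config{
                Region: aws.String("us-west-2"),
        })

        input := &textract.AnalyzeDocumentInput{
                Document: &textract.Document{
                        Bytes: data,
                },
                FeatureTypes: []*string{aws.String("TABLES")},
        }
        res, err := svc.AnalyzeDocument(input)
        if err != nil {
                return "", err
        }
        if len(res.Blocks) < 1 {
                return "No Document", nil
        }
        // Both LINE and WORD blocks are collected, so each word appears
        // once inside its line and once on its own.
        var wordList []string
        for _, v := range res.Blocks {
                if aws.StringValue(v.BlockType) == "WORD" || aws.StringValue(v.BlockType) == "LINE" {
                        wordList = append(wordList, aws.StringValue(v.Text))
                }
        }
        results, err := json.Marshal(wordList)
        if err != nil {
                return "", err
        }
        return string(results), nil
}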
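The first step of the function, stripping a data URL prefix and decoding the base64 payload, can be tried on its own without any AWS dependency. The following is a minimal sketch. `decodeDataURL` is a hypothetical helper name, and the empty-payload check is an addition for illustration that the function above does not perform.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeDataURL is a hypothetical helper. It accepts either a data URL
// ("data:image/png;base64,....") or a bare base64 string and returns
// the decoded bytes.
func decodeDataURL(img string) ([]byte, error) {
	// Everything after the first comma is the payload. With no comma,
	// IndexByte returns -1 and the whole string is used.
	payload := img[strings.IndexByte(img, ',')+1:]
	if payload == "" {
		return nil, errors.New("empty image payload")
	}
	return base64.StdEncoding.DecodeString(payload)
}

func main() {
	data, err := decodeDataURL("data:text/plain;base64,aGVsbG8=")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(string(data)) // prints "hello"
}
```

Rejecting an empty payload early avoids sending a zero-length document to the service.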

Conclusion

In this post I tried out Amazon Textract.
