First Troubleshooting Step When a watsonx Text Extraction Job Fails

Posted at 2026-02-13

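As shown in the linked article, you can run text extraction from a Python program with watsonx.ai, but a job sometimes fails for one reason or another. This post covers the first troubleshooting step to take when that happens.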

Here is sample output from a failed job.
(Screenshot: image.png)

In this case, run the following cell to retrieve the job details (change the job ID before running it). Reference: watsonx.ai API Text Extractions

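import json

# ID of the failed job (replace with your own job ID)
extraction_job_id = "10a1ed50-e30e-4910-b20a-bed5d6d727cc"

# `extraction` is assumed to be the text extraction client object
# that was used to submit the job
details = extraction.get_job_details(extraction_job_id=extraction_job_id)

# Debug output (print the structure as-is)
print(json.dumps(details, indent=2, ensure_ascii=False))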

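The results section of that output often gives a clue about why the job failed. In this example, the error message shows that the job failed because the connection to ICOS (IBM Cloud Object Storage) was not using HMAC authentication, which is required for S3-compatible access.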

    "results": {
      "error": {
        "code": "file_download_error",
        "message": "failed to download file, EmptyStaticCreds: static credentials are empty"
      },
      "number_pages_processed": 0,
      "status": "failed"
    },
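
When you check several failed jobs, it can help to pull just the error out of the job details instead of reading the whole dump. The following is a minimal sketch, not part of the watsonx.ai SDK: `summarize_failure` is a hypothetical helper name, and it assumes the details dictionary has the same shape as the output in this post, with results nested under the top-level entity key. The sample dictionary is copied from the failed job above so that the sketch runs without a connection to watsonx.ai.

```python
from typing import Any, Optional


def summarize_failure(details: dict[str, Any]) -> Optional[str]:
    """Return "code: message" for a failed job, or None if it did not fail.

    Hypothetical helper; assumes the layout entity -> results -> error
    seen in the job details shown in this post.
    """
    results = details.get("entity", {}).get("results", {})
    if results.get("status") != "failed":
        return None
    error = results.get("error", {})
    code = error.get("code", "unknown_code")
    message = error.get("message", "no message")
    return f"{code}: {message}"


# Sample data copied from the failed job in this post
sample_details = {
    "entity": {
        "results": {
            "error": {
                "code": "file_download_error",
                "message": "failed to download file, EmptyStaticCreds: static credentials are empty",
            },
            "number_pages_processed": 0,
            "status": "failed",
        }
    }
}

print(summarize_failure(sample_details))
```

In a notebook you would pass the `details` value returned by `get_job_details` instead of `sample_details`.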

Reference: example of the full output

{
  "entity": {
    "document_reference": {
      "connection": {
        "id": "597d73f1-6b49-4971-b5ac-f5937dfa4787"
      },
      "location": {
        "bucket": "project1-donotdelete-pr-51e06amx1xzgoc",
        "file_name": "test.pdf"
      },
      "type": "connection_asset"
    },
    "parameters": {
      "auto_rotation_correction": true,
      "create_embedded_images": "enabled_verbalization_all",
      "kvp_mode": "generic_with_semantic",
      "languages": [
        "ja"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled",
      "output_dpi": 72,
      "output_tokens": true,
      "requested_outputs": [
        "assembly",
        "html",
        "page_images",
        "md"
      ],
      "semantic_config": {
        "enable_generic_kvp": true
      }
    },
    "results": {
      "error": {
        "code": "file_download_error",
        "message": "failed to download file, EmptyStaticCreds: static credentials are empty"
      },
      "number_pages_processed": 0,
      "status": "failed"
    },
    "results_reference": {
      "connection": {
        "id": "597d73f1-6b49-4971-b5ac-f5937dfa4787"
      },
      "location": {
        "bucket": "project1-donotdelete-pr-51e06amx1xzgoc",
        "file_name": "text_extraction_result/"
      },
      "type": "connection_asset"
    }
  },
  "metadata": {
    "created_at": "2026-02-04T02:07:44.815Z",
    "id": "10a1ed50-e30e-4910-b20a-bed5d6d727cc",
    "modified_at": "2026-02-04T02:07:47.943Z",
    "project_id": "b527eddf-0fca-4ebb-9e48-0278717d99eb"
  },
  "system": {
    "warnings": [
      {
        "message": "enable_generic_kvp,target_image_width,enable_text_hints at schema level have been deprecated",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "output_tokens_bbox has been deprecated and replaced with the param output_tokens",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "target_image_width under semantic_config has been deprecated.",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "kvp_mode - ubill and invoice have been deprecated.",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      }
    ]
  }
}
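
The full output above also carries deprecation warnings under system.warnings, which are easy to overlook in a long dump. As a rough sketch under the same assumption about the dictionary layout, the hypothetical helper `list_warnings` below collects only the warning messages. The sample entry is copied from the output above.

```python
from typing import Any


def list_warnings(details: dict[str, Any]) -> list[str]:
    """Return the warning messages found under system -> warnings.

    Hypothetical helper; assumes the layout seen in the full output
    example in this post. Returns an empty list if there are none.
    """
    warnings = details.get("system", {}).get("warnings", [])
    return [w.get("message", "") for w in warnings]


# One warning entry copied from the full output in this post
sample_with_warnings = {
    "system": {
        "warnings": [
            {
                "message": "output_tokens_bbox has been deprecated and replaced with the param output_tokens",
                "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx",
            }
        ]
    }
}

for message in list_warnings(sample_with_warnings):
    print("warning:", message)
```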