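As described in the linked article, watsonx.ai lets you run text extraction from a Python program, but a job can sometimes fail for one reason or another. This post covers the first troubleshooting step for that situation.
When a job fails, run the following cell (change the job ID first) to retrieve the job details. Reference: watsonx.ai API Text Extractions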
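import json

# ID of the failed job (replace with your own job ID)
extraction_job_id = "10a1ed50-e30e-4910-b20a-bed5d6d727cc"

# `extraction` is the text extraction client created earlier in the notebook
details = extraction.get_job_details(extraction_job_id=extraction_job_id)

# Debug output: print the structure as-is
print(json.dumps(details, indent=2, ensure_ascii=False))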
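The results section of that output often gives a clue as to why the job failed. In this example, the error message shows that the connection to IBM Cloud Object Storage (ICOS) was not using HMAC authentication, which S3-compatible access requires.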
"results": {
  "error": {
    "code": "file_download_error",
    "message": "failed to download file, EmptyStaticCreds: static credentials are empty"
  },
  "number_pages_processed": 0,
  "status": "failed"
},
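If you check failed jobs often, the lookup can be scripted instead of reading the whole dump. The following is a minimal sketch, assuming the details dictionary has the `entity` / `results` layout shown in this article. `summarize_failure` is a hypothetical helper name, not part of the watsonx.ai SDK.

```python
from typing import Any, Optional


def summarize_failure(details: dict) -> Optional[str]:
    """Return "code: message" for a failed job, or None if the job did not fail.

    Assumes the layout shown in this article: details["entity"]["results"].
    """
    results: Any = details.get("entity", {}).get("results", {})
    if results.get("status") != "failed":
        return None
    error = results.get("error", {})
    code = error.get("code", "unknown")
    message = error.get("message", "")
    return f"{code}: {message}"


# Sample input trimmed from the output shown in this article
sample_details = {
    "entity": {
        "results": {
            "error": {
                "code": "file_download_error",
                "message": "failed to download file, EmptyStaticCreds: static credentials are empty",
            },
            "number_pages_processed": 0,
            "status": "failed",
        }
    }
}

print(summarize_failure(sample_details))
```

In the notebook you would pass the `details` value returned by `get_job_details` instead of `sample_details`.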
For reference: example of the full output
{
  "entity": {
    "document_reference": {
      "connection": {
        "id": "597d73f1-6b49-4971-b5ac-f5937dfa4787"
      },
      "location": {
        "bucket": "project1-donotdelete-pr-51e06amx1xzgoc",
        "file_name": "test.pdf"
      },
      "type": "connection_asset"
    },
    "parameters": {
      "auto_rotation_correction": true,
      "create_embedded_images": "enabled_verbalization_all",
      "kvp_mode": "generic_with_semantic",
      "languages": [
        "ja"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled",
      "output_dpi": 72,
      "output_tokens": true,
      "requested_outputs": [
        "assembly",
        "html",
        "page_images",
        "md"
      ],
      "semantic_config": {
        "enable_generic_kvp": true
      }
    },
    "results": {
      "error": {
        "code": "file_download_error",
        "message": "failed to download file, EmptyStaticCreds: static credentials are empty"
      },
      "number_pages_processed": 0,
      "status": "failed"
    },
    "results_reference": {
      "connection": {
        "id": "597d73f1-6b49-4971-b5ac-f5937dfa4787"
      },
      "location": {
        "bucket": "project1-donotdelete-pr-51e06amx1xzgoc",
        "file_name": "text_extraction_result/"
      },
      "type": "connection_asset"
    }
  },
  "metadata": {
    "created_at": "2026-02-04T02:07:44.815Z",
    "id": "10a1ed50-e30e-4910-b20a-bed5d6d727cc",
    "modified_at": "2026-02-04T02:07:47.943Z",
    "project_id": "b527eddf-0fca-4ebb-9e48-0278717d99eb"
  },
  "system": {
    "warnings": [
      {
        "message": "enable_generic_kvp,target_image_width,enable_text_hints at schema level have been deprecated",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "output_tokens_bbox has been deprecated and replaced with the param output_tokens",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "target_image_width under semantic_config has been deprecated.",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      },
      {
        "message": "kvp_mode - ubill and invoice have been deprecated.",
        "more_info": "https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-text-extraction-params.html?context=wx"
      }
    ]
  }
}
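The full output also carries a `system.warnings` list. In this example the entries are deprecation notices, separate from the error reported under `results`, so it can help to list them on their own when reading a long dump. A minimal sketch, assuming the layout shown above; `list_warning_messages` is a hypothetical helper name, not an SDK function.

```python
def list_warning_messages(details: dict) -> list:
    """Collect the "message" text of each entry under details["system"]["warnings"].

    Returns an empty list when the key is missing.
    """
    warnings = details.get("system", {}).get("warnings", [])
    return [w.get("message", "") for w in warnings]


# Sample input trimmed from the full output shown in this article
sample_with_warnings = {
    "system": {
        "warnings": [
            {"message": "output_tokens_bbox has been deprecated and replaced with the param output_tokens"},
            {"message": "kvp_mode - ubill and invoice have been deprecated."},
        ]
    }
}

for message in list_warning_messages(sample_with_warnings):
    print("warning:", message)
```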
