
[AzureFunctions] Errors encountered with AzureAutoML + AzureFunctions and how I handled them [AzureAutoML]

Last updated at 2022-09-13, posted at 2022-09-06

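I plan to write a separate article covering the whole workflow, so if you are just getting started, please refer to that one.
Here I record the symptoms I ran into and what I did about them, for anyone who hits the same problems (or for my future self).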

python: 3.7.9
Local machine: win10

I deploy from VS Code.

The issues below occurred with the following entries in requirements.txt.

requirements.txt

azure-functions
azureml-core
azureml-sdk[automl]

1. Result: Failure Exception: AttributeError: module 'typing' has no attribute '_ClassVar'

Apparently a module called dataclasses is causing the trouble.
There is no way to uninstall it on Functions, so the approach is to export the modules locally and delete the ones that are not needed.

pip install -r requirements.txt -t .python_packages/lib/site-packages

After exporting, delete the folders with dataclasses in their names and dataclasses.py.
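The deletion step can be scripted so it is easy to repeat after every re-export. This is a minimal sketch, not the method used in the article: the function name `remove_dataclasses_backport` is hypothetical, and it assumes the package shows up as `dataclasses.py` plus metadata folders named `dataclasses-<version>...`. It deliberately matches more narrowly than "any folder with dataclasses in its name", so that unrelated packages such as `dataclasses_json` are left alone.

```python
import shutil
from pathlib import Path


def remove_dataclasses_backport(site_packages):
    """Delete dataclasses.py and dataclasses-<version> metadata folders.

    Returns the sorted names of the entries that were removed.
    """
    removed = []
    for entry in Path(site_packages).iterdir():
        # Assumption: only these two name shapes belong to the package.
        if entry.name == "dataclasses.py" or entry.name.startswith("dataclasses-"):
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Calling `remove_dataclasses_backport(".python_packages/lib/site-packages")` from the project root after the `pip install ... -t` command would apply it to the exported folder.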

2. cannot import name 'check_ansix923_padding' from 'cryptography.hazmat.bindings._rust' (unknown location

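The problem is probably that packages installed on a Windows machine are being uploaded to Functions, which runs on Linux.
I delete the cryptography module from the exported packages, list it in requirements.txt, and let it be installed on the Functions side.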

requirements.txt

cryptography

4. 503 error

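I did not keep the error message, so I cannot quote it, but the cause was missing permissions on AzureML.
In the Portal, open the Identity item on the function page and switch the status to On.
Then, from IAM on the same page, go through Add → Add role assignment → Owner → Managed identity → select the function and the workspace
to grant the permission.
I did the same for the resource group and the workspace.
Doing it for the workspace alone is probably enough.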

5. Failure Exception: OSError: [Errno 8] Exec format error: '/home/site/wwwroot/.python_packages/lib/site-packages/dotnetcore2/bin/dotnet.exe' Stack: File "/azure-functions-host/workers/python/3.7/LINUX/X64/azure_functions_worker/dispatcher.py"

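The problem appeared to be that the modules were installed on Windows.
I resolved it by bringing up Ubuntu on Vagrant, installing python3.7 there, exporting the modules in that environment, and using those.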
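Before deploying, it can help to check whether the exported folder still contains Windows-built artifacts such as the `dotnet.exe` from the error above. The following is a rough heuristic sketch, not something from the original workflow: the function name and the suffix list are assumptions, and a few packages ship `.exe` helper files even in their Linux builds, so a hit is a hint rather than proof.

```python
from pathlib import Path

# Assumption: these suffixes indicate files built for Windows.
WINDOWS_ONLY_SUFFIXES = {".exe", ".pyd"}


def find_windows_binaries(site_packages):
    """Return sorted POSIX-style relative paths of files that look Windows-only."""
    root = Path(site_packages)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in WINDOWS_ONLY_SUFFIXES
    )
```

An empty result for `.python_packages/lib/site-packages` suggests the packages were exported on Linux; a list that includes entries like `dotnetcore2/bin/dotnet.exe` suggests they came from Windows.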

6. Failure Exception: NotImplementedError: Linux distribution debian 11. does not have automatic support. Missing packages: {'liblttng-ust.so.0'} .NET Core 3.1 can still be used via `dotnetcore2` if the required dependencies are installed.

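The problem was that the runtime version had been set to 4.
Changing the value of FUNCTIONS_EXTENSION_VERSION in the function's "Configuration" tab to ~3 fixed it.
Also, when deploying from VS Code, set the value of azureFunctions.projectRuntime in .vscode/settings.json to ~3 as well.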
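The VS Code side of this change can be applied with a few lines of Python. This is only a sketch: the helper name `set_project_runtime` is hypothetical, and it assumes settings.json already exists and is strict JSON (VS Code tolerates comments in this file, which `json.loads` does not). The FUNCTIONS_EXTENSION_VERSION app setting still has to be changed separately on the Azure side.

```python
import json
from pathlib import Path


def set_project_runtime(settings_path, runtime="~3"):
    """Set azureFunctions.projectRuntime in a VS Code settings file."""
    path = Path(settings_path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    settings["azureFunctions.projectRuntime"] = runtime
    # Write back with indentation so the file stays readable in the editor.
    path.write_text(json.dumps(settings, indent=4) + "\n", encoding="utf-8")
    return settings
```

For example, `set_project_runtime(".vscode/settings.json")` keeps all other keys and only updates the runtime entry.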

7. Result: Failure Exception: OSError: [Errno 30] Read-only file system: '/home/site/wwwroot/automl.log'

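The library generates a log file, but on Functions everything outside of tmp is read-only, so it fails.
I tried commenting out the places in the module that write the log and ran it again.
Excerpt of the change:

The places where set_log_file is written.
This seems to be what outputs the log file, so I handled it by commenting it out. That resolved this error.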
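The same read-only constraint applies to any file the function code itself writes. As a small illustration (a hypothetical helper, not part of the fix above), the sketch below picks the preferred directory when it is writable and otherwise falls back to the system temporary directory, which is usually /tmp on Linux.

```python
import os
import tempfile


def writable_path(filename, preferred_dir="."):
    """Return a path for filename in preferred_dir, or in the temp dir if
    preferred_dir is missing or not writable."""
    if os.access(preferred_dir, os.W_OK):
        return os.path.join(preferred_dir, filename)
    return os.path.join(tempfile.gettempdir(), filename)
```

This does not change where a third-party library writes its own files; it only helps for paths that the function code controls.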

8. Result: Failure Exception: OSError: [Errno 38] Function not implemented: '/home/site/wwwroot/.azureml'

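Same kind of issue as above. These are temporary files, though, so their output cannot simply be disabled; instead I changed the code to place them under /tmp/.
/home/site/wwwroot/ appears to be held in a variable called USER_PATH, so I searched for it and rewrote it.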

.python_packages\lib\site-packages\azureml_base_sdk_common\common.py

# FILE LOCATIONS
if 'win32' in sys.platform:
    # USER_PATH = os.path.expanduser('~')
    # TODO Rename CREDENTIALS_PATH since there aren't credentials there anymore.
    # CREDENTIALS_PATH = os.path.join(USER_PATH, ".azureml")
    CREDENTIALS_PATH = "/tmp/.azureml"
else:
    # USER_PATH = os.path.join(os.getenv('HOME'), '.config')
    # CREDENTIALS_PATH = os.path.join(os.getenv('HOME'), '.azureml')
    CREDENTIALS_PATH = "/tmp/.azureml"

A similar error came up, so I dealt with it as well.
The error message contained
/home/site/wwwroot/.python_packages/lib/site-packages/azureml/train/automl/_experiment_drivers/experiment_driver.py", line 58
so I changed this path to point to a location under tmp.

experiment_driver.py

# Expand out the path because os.makedirs can't handle '..' properly
# aml_config_path = os.path.abspath(os.path.join(self.experiment_state.automl_settings.path, '.azureml'))
aml_config_path = "/tmp/.azureml"
os.makedirs(aml_config_path, exist_ok=True)
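Because these edits live inside the exported packages, they are lost whenever the packages are exported again. One way to make them repeatable is a small patch script. The sketch below is an assumption-laden illustration, not the procedure used in the article: `pin_assignment` and `patch_file` are hypothetical helpers, and they only handle assignments that fit on a single line, so the result should be inspected before deploying.

```python
import re
from pathlib import Path


def pin_assignment(source, variable, new_value):
    """Comment out single-line `variable = ...` statements and pin a constant."""
    pattern = re.compile(r"^(\s*)" + re.escape(variable) + r"\s*=(?!=).*$")
    pinned = variable + " = " + repr(new_value)
    patched = []
    for line in source.splitlines():
        match = pattern.match(line)
        if match is None or line.strip() == pinned:
            # Unrelated line, or a line already pinned by an earlier run.
            patched.append(line)
            continue
        indent = match.group(1)
        patched.append(indent + "# " + line.strip())  # keep the original for reference
        patched.append(indent + pinned)
    return "\n".join(patched) + "\n"


def patch_file(path, variable, new_value):
    """Apply pin_assignment to a file in place."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    target.write_text(pin_assignment(original, variable, new_value), encoding="utf-8")
```

For instance, `patch_file(common_py, "CREDENTIALS_PATH", "/tmp/.azureml")` and `patch_file(driver_py, "aml_config_path", "/tmp/.azureml")` would reproduce the two edits shown above, where `common_py` and `driver_py` stand for the two file paths mentioned in this section.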

After making these changes, it started working for me.
Here is the code that worked.

import json
import logging
import azure.functions as func
from azureml.core import VERSION
from azureml.core.experiment import Experiment
from azureml.core import Workspace, Datastore, Dataset
from azureml.core.compute import ComputeTarget
from azureml.train.automl import AutoMLConfig

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.basicConfig(level=logging.ERROR)
    ws = Workspace(
        subscription_id="<subscription ID>",
        resource_group="<resource group name>",
        workspace_name="<workspace name>"
    )

    # Prepare the cluster to use
    cluster_name = "<name of an existing cluster>"
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    # Register the data
    datastore_name = '<name of a registered datastore>'
    datastore = Datastore.get(ws, datastore_name)
    datastore_paths = [(datastore, '<file path within the datastore>')]
    titanic_dataset = Dataset.Tabular.from_delimited_files(path=datastore_paths)


    # Prepare the AutoML run
    automl_classifier=AutoMLConfig(
        task='classification',
        iterations=5,
        compute_target=cluster,
        primary_metric='AUC_weighted',
        experiment_timeout_minutes=60,
        blocked_models=['XGBoostClassifier'],
        training_data=titanic_dataset,
        test_size=0.3,
        label_column_name="Survived",
        n_cross_validations=2)


    # Run AutoML
    experiment_name = '<experiment name>'
    experiment = Experiment(ws, experiment_name)
    run = experiment.submit(automl_classifier)

    return func.HttpResponse(json.dumps({"job_id": run.id}))

The data is the Titanic survival prediction dataset.

Summary

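This was my first time modifying the internals of a library, so it was really scary.
If I don't keep a proper record of this, reproducing it will be absolutely impossible...
Later I will bundle this set of libraries into a zip and store it somewhere safe...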
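Bundling the patched packages into a zip can be done with the standard library. A minimal sketch, assuming the patched packages live in a single folder such as `.python_packages`; the function name `archive_packages` and the archive name are hypothetical.

```python
import shutil


def archive_packages(packages_dir, archive_base):
    """Zip the contents of packages_dir and return the path of the created archive.

    archive_base is the target path without the .zip extension.
    """
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(packages_dir))
```

For example, `archive_packages(".python_packages", "patched_python_packages")` would create `patched_python_packages.zip` in the current directory.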
