Simple, Easy Tips for "Adding Emotion to Alexa" #Alexa

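At the recent "JAWS PANKRATION 2021 -Up till Dawn-" event, Brian Tarbox gave a talk on "Adding Emotion to Alexa". This post summarizes it, together with an implementation.

Note: The session video and slides are published on the event site above, so please take a look.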

Key points

1. Use SSML
2. Randomize the speech
3. Use sounds
4. Choose the most suitable voice

Of these, this post covers 1 and 2.

Sample code

As the subject, let's consider a simple interaction like the following.

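U: アレクサ、「お天気のサンプル」を開いて。 (Alexa, open "Weather Sample".)
A: このスキルでは今日のお天気をお伝えします。お天気を教えて、と言ってみてください。 (This skill tells you today's weather. Try saying "Tell me the weather".)
U: お天気を教えて。 (Tell me the weather.)
A: 今日のお天気は晴れです。 (Today's weather is sunny.)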

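We will work on the very last turn of this exchange, the part where Alexa tells the user the weather.

The base interaction model and backend code look like this. With an Alexa-hosted skill, you can try it easily by copy and paste. The interaction model is simple, with just one custom intent, and the code is minimal and deliberately plain.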

Interaction model

Go to "Build" → "Interaction Model" → "JSON Editor", paste the following, then click "Save Model" and "Build Model".

{
    "interactionModel": {
        "languageModel": {
            "invocationName": "お天気のサンプル",
            "intents": [
                {
                    "name": "AMAZON.CancelIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.HelpIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.StopIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.NavigateHomeIntent",
                    "samples": []
                },
                {
                    "name": "WeatherIntent",
                    "slots": [],
                    "samples": [
                        "お天気を教えて"
                    ]
                }
            ],
            "types": []
        }
    }
}

Backend code

Open the "Code" editor, select "index.js" from the file list, paste in the code below, then click "Save" → "Deploy". (The only handler we actually work with is WeatherIntentHandler.)

const Alexa = require('ask-sdk-core');

const LaunchRequestHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'LaunchRequest';
    },
    handle(handlerInput) {
        const speakOutput = "このスキルでは今日のお天気をお伝えします。お天気を教えて、と言ってみてください。";

        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

const WeatherIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'WeatherIntent';
    },
    handle(handlerInput) {
        const speakOutput = "今日のお天気は晴れです。お天気を教えて、と言ってみてください。";
        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

const HelpIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'AMAZON.HelpIntent';
    },
    handle(handlerInput) {
        const speakOutput = "このスキルでは今日のお天気をお伝えします。お天気を教えて、と言ってみてください。";

        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

const CancelAndStopIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && (Alexa.getIntentName(handlerInput.requestEnvelope) === 'AMAZON.CancelIntent'
                || Alexa.getIntentName(handlerInput.requestEnvelope) === 'AMAZON.StopIntent');
    },
    handle(handlerInput) {
        const speakOutput = "バイバイ。";

        return handlerInput.responseBuilder
            .speak(speakOutput)
            .getResponse();
    }
};

const FallbackIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'AMAZON.FallbackIntent';
    },
    handle(handlerInput) {
        const speakOutput = "ごめんなさい、うまく聞き取れませんでした。お天気を教えて、と言ってみてください。";

        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

const SessionEndedRequestHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'SessionEndedRequest';
    },
    handle(handlerInput) {
        console.log(`~~~~ Session ended: ${JSON.stringify(handlerInput.requestEnvelope)}`);
        return handlerInput.responseBuilder.getResponse();
    }
};

const ErrorHandler = {
    canHandle() {
        return true;
    },
    handle(handlerInput, error) {
        const speakOutput = 'ごめんなさい、エラーが発生しました。時間をおいてまた試してみてください。';
        // JSON.stringify(error) prints "{}" for Error objects, so log the stack instead.
        console.log(`~~~~ Error handled: ${error.stack || error}`);

        return handlerInput.responseBuilder
            .speak(speakOutput)
            .getResponse();
    }
};

exports.handler = Alexa.SkillBuilders.custom()
    .addRequestHandlers(
        LaunchRequestHandler,
        WeatherIntentHandler,
        HelpIntentHandler,
        CancelAndStopIntentHandler,
        FallbackIntentHandler,
        SessionEndedRequestHandler
    )
    .addErrorHandlers(
        ErrorHandler
    )
    .lambda();

Here is what it sounds like when we actually run the code above.

It sounds very mechanical...

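1. Use SSML

SSML lets you adjust how Alexa speaks. This time, let's use amazon:emotion, which is easy to get started with.

The amazon:emotion tag causes Alexa to express emotion when speaking. This is useful for stories, games, news updates, and other narrative content. For example, in a game, you could use the "excited" emotion for correct answers and the "disappointed" emotion for incorrect answers.

Rewrite WeatherIntentHandler as follows.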

const WeatherIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'WeatherIntent';
    },
    handle(handlerInput) {
        const speakOutput = "<amazon:emotion name='excited' intensity='high'>今日のお天気は晴れです。お天気を教えて、と言ってみてください。</amazon:emotion>";
        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

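You can see that the speech text is wrapped in <amazon:emotion>. The attributes you can set on amazon:emotion are as follows.

name is the type of emotion. name='excited' sounds excited, and name='disappointed' sounds disappointed.
intensity is the strength of the emotion. You can choose from high/medium/low, with high being the strongest.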
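Since a typo in either attribute is easy to make, a small helper that checks the values before building the tag can be handy. The following is only a minimal sketch: withEmotion, EMOTION_NAMES, and EMOTION_INTENSITIES are hypothetical names of my own, not part of the ASK SDK, and the default of 'medium' is an assumption of this sketch.

```javascript
// Values described above for the amazon:emotion attributes.
const EMOTION_NAMES = ['excited', 'disappointed'];
const EMOTION_INTENSITIES = ['low', 'medium', 'high'];

// Hypothetical helper (not an ASK SDK function): wraps text in an
// amazon:emotion tag after checking both attribute values.
function withEmotion(text, name, intensity = 'medium') {
    if (!EMOTION_NAMES.includes(name)) {
        throw new Error(`Unsupported emotion name: ${name}`);
    }
    if (!EMOTION_INTENSITIES.includes(intensity)) {
        throw new Error(`Unsupported intensity: ${intensity}`);
    }
    return `<amazon:emotion name='${name}' intensity='${intensity}'>${text}</amazon:emotion>`;
}

console.log(withEmotion('今日のお天気は晴れです。', 'excited', 'high'));
```

The returned string can be passed to .speak() in the same way as the hand-written tag, and a misspelled name fails loudly at build time rather than reaching Alexa.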

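Here, I went with name='excited' intensity='high' for a really upbeat feel.

Let's try it.

All we did was wrap the text in a tag, but it sounds much brighter! With name='disappointed' it sounds gloomy, so it might be nice, for example, to speak brightly when it is sunny and gloomily when it is rainy.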
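The idea of switching the emotion by weather could be sketched roughly as below. The weather keys ('sunny', 'rainy'), the chosen intensities, and the names EMOTION_BY_WEATHER and buildWeatherSpeech are all assumptions for illustration; the sample skill in this post always answers "sunny" and has no real weather data.

```javascript
// Hypothetical mapping from a weather condition to amazon:emotion attributes.
const EMOTION_BY_WEATHER = new Map([
    ['sunny', { name: 'excited', intensity: 'high' }],
    ['rainy', { name: 'disappointed', intensity: 'medium' }]
]);

// Hypothetical helper: wraps the message in the emotion mapped to the
// weather, or returns it unchanged when no emotion is mapped.
function buildWeatherSpeech(weather, message) {
    const emotion = EMOTION_BY_WEATHER.get(weather);
    if (!emotion) {
        return message;
    }
    return `<amazon:emotion name='${emotion.name}' intensity='${emotion.intensity}'>${message}</amazon:emotion>`;
}

console.log(buildWeatherSpeech('sunny', '今日のお天気は晴れです。'));
console.log(buildWeatherSpeech('rainy', '今日のお天気は雨です。'));
```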

There are many other SSML tags as well. Check the documentation for details and try out a few of them.
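As one example of other tags, here is a small sketch using break (a pause) and prosody (speaking rate). The helper names pause and slowly are hypothetical, and tag support can vary by locale and voice, so please treat this as something to experiment with rather than a recommendation.

```javascript
// Hypothetical helper: an SSML pause of the given length in milliseconds.
function pause(milliseconds) {
    return `<break time='${milliseconds}ms'/>`;
}

// Hypothetical helper: speaks the given text at a slower rate.
function slowly(text) {
    return `<prosody rate='slow'>${text}</prosody>`;
}

// Pause briefly before the answer, then say it slowly.
const speakOutput = '今日のお天気は' + pause(500) + slowly('晴れです。');
console.log(speakOutput);
```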

2. Randomize the speech

If Alexa repeats exactly the same thing every time, it feels very mechanical to the user. Adding variety to Alexa's speech reduces that machine-like feel.

Rewrite WeatherIntentHandler as follows.

// Return one element of messages, chosen uniformly at random.
function getRandomMessage(messages) {
  const i = Math.floor(Math.random() * messages.length);
  return messages[i];
}

const WeatherMessages = [
  "今日のお天気は晴れです。",
  "今日はいちにち良いお天気でしょう。",
  "今日はいいお天気で、絶好の洗濯日和ですね。",
  "今日はいいお天気が続きそうです。傘もいらなさそうですね。"
];

const WeatherIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'WeatherIntent';
    },
    handle(handlerInput) {
        const speakOutput = getRandomMessage(WeatherMessages) + "お天気を教えて、と言ってみてください。";
        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

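The variations we want Alexa to say are prepared in the WeatherMessages array, and the getRandomMessage function picks one of them at random, which gives Alexa's speech some variety.

Let's actually try it.

You can hear that the wording changes every time. Just as users vary their utterances, giving Alexa's speech some variation makes it feel closer to a human.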
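One thing to note is that a purely random pick can still return the same message twice in a row. If that bothers you, a variation like the sketch below avoids immediate repeats by excluding the previous index. pickRandomIndexExcluding is a hypothetical name of mine, and the rng parameter exists only so the behavior can be checked deterministically.

```javascript
// Hypothetical helper: returns a random index in [0, length) that differs
// from lastIndex whenever that is possible.
// rng defaults to Math.random and is injectable for testing.
function pickRandomIndexExcluding(length, lastIndex, rng = Math.random) {
    if (length <= 0) {
        throw new Error('length must be at least 1');
    }
    const hasValidLast = Number.isInteger(lastIndex) && lastIndex >= 0 && lastIndex < length;
    if (length === 1 || !hasValidLast) {
        // Nothing to exclude, so fall back to a plain random pick.
        return Math.floor(rng() * length);
    }
    // Pick among the other (length - 1) slots, then skip over lastIndex.
    const i = Math.floor(rng() * (length - 1));
    return i >= lastIndex ? i + 1 : i;
}

const messages = ['A', 'B', 'C', 'D'];
let last = -1;
for (let n = 0; n < 5; n++) {
    last = pickRandomIndexExcluding(messages.length, last);
    console.log(messages[last]);
}
```

In a real skill the previous index would have to be remembered between turns, for example in session attributes; that part is left out here.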

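In fact, this is already easy to implement with a library called ask-utils (to be honest, tip 2 is pretty much the same as the article below, sorry).

It is a library by Alexa Champion Okamoto-san for developing skills efficiently, so please give it a try.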

Combining 1 and 2

So, using ask-utils as well, combining 1 and 2 looks like this.

For an Alexa-hosted skill, edit package.json to add the dependency.

(omitted)
  "dependencies": {
    "ask-sdk-core": "^2.7.0",
    "ask-sdk-model": "^1.19.0",
    "aws-sdk": "^2.326.0",
    "ask-utils": "^3.11.0",
    "moment": "^2.29.1"
  }
}

Rewrite WeatherIntentHandler as follows.

const { getRandomMessage } = require('ask-utils');

const WeatherMessages = [
  "今日のお天気は晴れです。",
  "今日はいちにち良いお天気でしょう。",
  "今日はいいお天気で、絶好の洗濯日和ですね。",
  "今日はいいお天気が続きそうです。傘もいらなさそうですね。"
];

const WeatherIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'WeatherIntent';
    },
    handle(handlerInput) {
        const speakOutput = "<amazon:emotion name='excited' intensity='high'>" + getRandomMessage(WeatherMessages) + "お天気を教えて、と言ってみてください。</amazon:emotion>";
        return handlerInput.responseBuilder
            .speak(speakOutput)
            .reprompt("お天気を教えて、と言ってみてください。")
            .getResponse();
    }
};

Now let's try it.

Compared with the first version, the atmosphere has changed quite a bit, hasn't it?

Summary

If you have experience developing skills, you may feel there is nothing particularly new here, but as the slides say, the following points are very important.

Making small changes to your skill can greatly increase user engagement

Most skill developers are not doing this

Your skill can stand out from the others

If your skill has "high engagement" Amazon will pay you!

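Note: The "pay" mentioned above presumably refers to the developer rewards program.

In a VUI, cookie-cutter responses feel very mechanical. As developers (myself included), we tend to be drawn to elaborate implementations and new features, but steady, modest improvements like these are also very important. They are also relatively easy to implement even if you have only just started developing skills, so I think these tips are both very practical and highly effective.

Let's build skills with high user engagement!

Note: I might cover 3 and 4 as well if I feel like it.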

Thanks!

Brian Tarbox(@btarbox)
Hidetaka Okamoto(@hide__dev)