I built a skill for playing an IoT-LT quiz on Amazon Echo, using the Alexa Skills Kit and AWS Lambda.

Posted at 2017-12-09, last updated at 2017-12-12

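Update, 2017-12-12

To my surprise, I was notified that the skill "IoT-LTクイズ" (IoT-LT Quiz) has passed the certification process. (It passed on the third submission.)
It should be visible from the Alexa app, but published skills only show up once an Echo has been set up, and since I don't own an Echo, I can't see it, lol.
Lambda's monitoring should at least tell me whether it has been used.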

This is the day 10 article of the IoT-LT Advent Calendar 2017!

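My name is 大熊元気.
I love IoT-LT and have attended almost every session since July 2016.
At my first LT in November 2016, I presented blinking an LED with the Raspberry Pi version of Amazon Echo. (The slides are here.)
A year has passed since then, and the Japanese version of Amazon Echo has finally been released, so I tried building a custom skill for the Japanese version.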

It is a skill that lets you play an IoT-LT quiz with Amazon Echo!
チャラ電Mitz-san kindly helped with the quiz content. Thank you!

1. Basics of building a skill

This official blog post is very easy to follow. Building its horoscope skill first gives you the feeling of having understood the basic structure.

2. Alexa Skills Kit

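On the Alexa Skills Kit side, you configure the skill's name, permissions, and other settings, create the interaction model that passes data to Lambda, and tune the speech recognition.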

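First, the intent schema of the interaction model looks like this. It is based on a tutorial from a year ago, so I was unsure whether it would work in Japanese, but it worked perfectly. Well done, Amazon.
There is also a built-in library, and just combining its parts looks like enough to build impressive skills. On the other hand, it probably shows how hard natural-language recognition is, in that you have to narrow things down by situation and context. Probably.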

{
  "intents": [
    {
      "slots": [
        {
          "name": "Answer",
          "type": "AMAZON.NUMBER"
        }
      ],
      "intent": "AnswerIntent"
    },
    {
      "slots": [
        {
          "name": "Answer",
          "type": "AMAZON.NUMBER"
        }
      ],
      "intent": "AnswerOnlyIntent"
    },
    {
      "intent": "DontKnowIntent"
    },
    {
      "intent": "AMAZON.StartOverIntent"
    },
    {
      "intent": "AMAZON.RepeatIntent"
    },
    {
      "intent": "AMAZON.HelpIntent"
    },
    {
      "intent": "AMAZON.YesIntent"
    },
    {
      "intent": "AMAZON.NoIntent"
    },
    {
      "intent": "AMAZON.StopIntent"
    },
    {
      "intent": "AMAZON.CancelIntent"
    }
  ]
}

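At first I followed the sample code and defined the user's answer as a custom slot with the values 1/2/3/4, but it was hardly recognized at all. Switching to the built-in slot "AMAZON.NUMBER", which Amazon provides ready-made, improved things somewhat. Perhaps short words are hard to recognize.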
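As a rough illustration of what the switch to "AMAZON.NUMBER" means on the Lambda side, here is a minimal sketch of turning the Answer slot into a choice number. It assumes the slot value arrives as a string of digits (the handler later in this article calls parseInt on it for the same reason). The function name parseAnswerSlot and the sample intent object are made up for this example.

```javascript
// Sketch: turn the Answer slot of an IntentRequest into a choice number.
// Returns an integer from 1 to answerCount, or null when the slot is unusable.
function parseAnswerSlot(intent, answerCount) {
  const slot = intent && intent.slots && intent.slots.Answer;
  if (!slot || typeof slot.value !== 'string') {
    return null; // slot missing or left empty
  }
  // Accept digits only, so values such as "?" or "3.5" are rejected.
  if (!/^\d+$/.test(slot.value)) {
    return null;
  }
  const choice = parseInt(slot.value, 10);
  return choice >= 1 && choice <= answerCount ? choice : null;
}

// Hypothetical intent object, shaped like request.intent in the Lambda event.
const sampleIntent = {
  name: 'AnswerIntent',
  slots: { Answer: { name: 'Answer', value: '3' } },
};
console.log(parseAnswerSlot(sampleIntent, 4)); // prints 3
```

Returning null for anything outside 1 to 4 lets the caller ask the user to answer again instead of silently scoring a wrong answer.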

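I wrote the sample utterances like this. Thinking that Japanese has so many ways of saying the same thing, I wrote a lot of them at first, but after repeated testing I decided that nobody would actually phrase things that way and cut some. More is presumably better, though.
When you say "〇〇番" (ban, "number 〇〇"), Alexa often mishears it as "〇〇万" (man, "〇〇 ten thousand"), so I threw 〇〇万 into the sample utterances as well. I suspect this is spoiling Alexa and not how sample utterances are meant to be used, lol.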

AnswerIntent 答えは {Answer}
AnswerIntent 答えは {Answer} です
AnswerIntent 正解は {Answer}
AnswerIntent {Answer} かな
AnswerIntent {Answer} だと思う
AnswerIntent {Answer} です
AnswerIntent {Answer} 番
AnswerIntent {Answer} ばん
AnswerIntent {Answer} 万
AnswerOnlyIntent {Answer}
DontKnowIntent 分かりません
DontKnowIntent 分からない
DontKnowIntent 次
DontKnowIntent 知りません
DontKnowIntent 無理です
DontKnowIntent ダメ
AMAZON.StartOverIntent 開始
AMAZON.StartOverIntent クイズ開始
AMAZON.StartOverIntent スタート
AMAZON.StartOverIntent クイズスタート
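Because the utterance list gets added to and pruned so often, one option is to generate it from a small table instead of editing the lines by hand. The following is only a sketch of that idea. The table utteranceTable and the function buildSampleUtterances are hypothetical, and the output simply follows the "IntentName utterance" line format shown above.

```javascript
// Hypothetical table: intent name -> phrases. Only a few entries are shown.
const utteranceTable = {
  AnswerIntent: ['答えは {Answer}', '{Answer} 番', '{Answer} 万'],
  DontKnowIntent: ['分かりません', '次'],
};

// Expand the table into "IntentName utterance" lines, one per phrase.
function buildSampleUtterances(table) {
  const lines = [];
  for (const [intentName, phrases] of Object.entries(table)) {
    // A Set drops accidental duplicates while keeping the original order.
    for (const phrase of new Set(phrases)) {
      lines.push(`${intentName} ${phrase}`);
    }
  }
  return lines;
}

console.log(buildSampleUtterances(utteranceTable).join('\n'));
```

With the phrases in one place, removing a phrasing that turned out to be unnatural is a one-line change.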

3. AWS Lambda

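The quiz content lives in Lambda and runs there. I wanted to use the SDK while editing the code inline, so instead of "一から作成" (Author from scratch) I chose "設計図" (Blueprints) and selected alexa-skill-kit-sdk-factskill. Or was it triviaskill? Either one may work the same. I think it is fine as long as the SDK gets included.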

Paste in the Node.js code.

It is based on the sample code from the tutorial I used a year ago, with the spoken parts rewritten in Japanese.

node.js

/**
 Copyright 2014-2015 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
 http://aws.amazon.com/apache2.0/
 or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 */

/**
 * This sample shows how to create a simple Trivia skill with a multiple choice format. The skill
 * supports 1 player at a time, and does not support games across sessions.
 */

'use strict';

/**
 * When editing your questions pay attention to your punctuation. Make sure you use question marks or periods.
 * Make sure the first answer is the correct one. Set at least 4 answers, any extras will be shuffled in.
 */
var questions = [
    {
        "IOT-LTの主催者は誰でしょう。": [
            "のびすけさんと土屋さん",
            "タカ＆トシ",
            "西野カナ",
            "日本IOT推進委員会",
            "経済産業省",
            "秘密結社の総統"
        ]
    },
    {
        "2017年最後の回は、第何回でしょう。": [
            "34回",
            "634回",
            "624回",
            "5回",
            "数えてないんかい"
        ]
    },
    {
        "IOT-LT開始前にかかるBGMは何でしょう。": [
            "ジブリ",
            "Get Wild",
            "EDM",
            "Hip Hop",
            "ラブライブ"
        ]
    },
    {
        "IOTという言葉が初めて使われたとされるのはいつでしょう。": [
            "1999年",
            "2003年",
            "2005年",
            "2007年"
        ]
    },
    {
        "IOT-LTの記念すべき第1回が開催されたのは2015年2月ですが、募集定員数は何名でしたでしょうか？": [
            "65名",
            "45名",
            "55名",
            "75名",
            "95名",
            "85名"
        ]
    },
    {
        "IOT入門と言えば「ラズベリーパイ」。実は色々な類似品がありますが、次のうち実在しないものはどれでしょう。": [
            "ピコパイ",
            "オレンジパイ",
            "バナナパイ",
            "ナノパイ"
        ]
    },
    {
        "IOTデータの活用と言えば「AI」。この「AI」という言葉が初めて登場したのはある会議でした。次のうちどれ？": [
            "ダートマス会議",
            "ダボス会議",
            "国際人工知能会議",
            "無駄な会議",
            "IOT-LT",
            "ニコニコ超会議"
        ]
    },
    {
        "増え続けるIOTデバイス。ガートナー調査では2020年には何台を超えると言われているでしょう。": [
            "200億台",
            "10億台",
            "80億台",
            "150億台",
            "500億台",
            "5000兆台"
        ]
    },
    {
        "IOT-LTクイズのロゴが今のものになったのはいつからでしょう。": [
            "第14回",
            "第5回",
            "第10回",
            "第12回",
            "第23回",
            "第5000兆回"
        ]
    },
];

// Route the incoming request based on type (LaunchRequest, IntentRequest,
// etc.) The JSON body of the request is provided in the event parameter.
exports.handler = function (event, context) {
    try {
        console.log("event.session.application.applicationId=" + event.session.application.applicationId);

        /**
         * Uncomment this if statement and populate with your skill's application ID to
         * prevent someone else from configuring a skill that sends requests to this function.
         */

//     if (event.session.application.applicationId !== "amzn1.echo-sdk-ams.app.05aecccb3-1461-48fb-a008-822ddrt6b516") {
//         context.fail("Invalid Application ID");
//      }

        if (event.session.new) {
            onSessionStarted({requestId: event.request.requestId}, event.session);
        }

        if (event.request.type === "LaunchRequest") {
            onLaunch(event.request,
                event.session,
                function callback(sessionAttributes, speechletResponse) {
                    context.succeed(buildResponse(sessionAttributes, speechletResponse));
                });
        } else if (event.request.type === "IntentRequest") {
            onIntent(event.request,
                event.session,
                function callback(sessionAttributes, speechletResponse) {
                    context.succeed(buildResponse(sessionAttributes, speechletResponse));
                });
        } else if (event.request.type === "SessionEndedRequest") {
            onSessionEnded(event.request, event.session);
            context.succeed();
        }
    } catch (e) {
        context.fail("Exception: " + e);
    }
};

/**
 * Called when the session starts.
 */
function onSessionStarted(sessionStartedRequest, session) {
    console.log("onSessionStarted requestId=" + sessionStartedRequest.requestId
        + ", sessionId=" + session.sessionId);

    // add any session init logic here
}

/**
 * Called when the user invokes the skill without specifying what they want.
 */
function onLaunch(launchRequest, session, callback) {
    console.log("onLaunch requestId=" + launchRequest.requestId
        + ", sessionId=" + session.sessionId);

    getWelcomeResponse(callback);
}

/**
 * Called when the user specifies an intent for this skill.
 */
function onIntent(intentRequest, session, callback) {
    console.log("onIntent requestId=" + intentRequest.requestId
        + ", sessionId=" + session.sessionId);

    var intent = intentRequest.intent,
        intentName = intentRequest.intent.name;

    // handle yes/no intent after the user has been prompted
    if (session.attributes && session.attributes.userPromptedToContinue) {
        delete session.attributes.userPromptedToContinue;
        if ("AMAZON.NoIntent" === intentName) {
            handleFinishSessionRequest(intent, session, callback);
            return; // already answered; skip the general dispatch below
        } else if ("AMAZON.YesIntent" === intentName) {
            handleRepeatRequest(intent, session, callback);
            return; // already answered; skip the general dispatch below
        }
    }

    // dispatch custom intents to handlers here
    if ("AnswerIntent" === intentName) {
        handleAnswerRequest(intent, session, callback);
    } else if ("AnswerOnlyIntent" === intentName) {
        handleAnswerRequest(intent, session, callback);
    } else if ("DontKnowIntent" === intentName) {
        handleAnswerRequest(intent, session, callback);
    } else if ("AMAZON.YesIntent" === intentName) {
        handleAnswerRequest(intent, session, callback);
    } else if ("AMAZON.NoIntent" === intentName) {
        handleAnswerRequest(intent, session, callback);
    } else if ("AMAZON.StartOverIntent" === intentName) {
        getWelcomeResponse(callback);
    } else if ("AMAZON.RepeatIntent" === intentName) {
        handleRepeatRequest(intent, session, callback);
    } else if ("AMAZON.HelpIntent" === intentName) {
        handleGetHelpRequest(intent, session, callback);
    } else if ("AMAZON.StopIntent" === intentName) {
        handleFinishSessionRequest(intent, session, callback);
    } else if ("AMAZON.CancelIntent" === intentName) {
        handleFinishSessionRequest(intent, session, callback);
    } else {
        throw "Invalid intent";
    }
}

/**
 * Called when the user ends the session.
 * Is not called when the skill returns shouldEndSession=true.
 */
function onSessionEnded(sessionEndedRequest, session) {
    console.log("onSessionEnded requestId=" + sessionEndedRequest.requestId
        + ", sessionId=" + session.sessionId);

    // Add any cleanup logic here
}

// ------- Skill specific business logic -------

var ANSWER_COUNT = 4;
var GAME_LENGTH = 5;
var CARD_TITLE = "IOT-LTクイズ"; // Be sure to change this for your skill.

function getWelcomeResponse(callback) {
    var sessionAttributes = {},
        speechOutput = "IOT-LTクイズを始めます。これから " + GAME_LENGTH.toString()
            + " 問のクイズを出題します。4択から選択して番号をお答えください。それでは始めましょう！",
        shouldEndSession = false,

        gameQuestions = populateGameQuestions(),
        correctAnswerIndex = Math.floor(Math.random() * (ANSWER_COUNT)), // Generate a random index for the correct answer, from 0 to 3
        roundAnswers = populateRoundAnswers(gameQuestions, 0, correctAnswerIndex),

        currentQuestionIndex = 0,
        spokenQuestion = Object.keys(questions[gameQuestions[currentQuestionIndex]])[0],
        repromptText = "第 1問です。" + spokenQuestion + " ",

        i, j;

    for (i = 0; i < ANSWER_COUNT; i++) {
        repromptText += (i+1).toString() + ". " + roundAnswers[i] + ". "
    }
    speechOutput += repromptText;
    sessionAttributes = {
        "speechOutput": repromptText,
        "repromptText": repromptText,
        "currentQuestionIndex": currentQuestionIndex,
        "correctAnswerIndex": correctAnswerIndex + 1,
        "questions": gameQuestions,
        "score": 0,
        "correctAnswerText":
            questions[gameQuestions[currentQuestionIndex]][Object.keys(questions[gameQuestions[currentQuestionIndex]])[0]][0]
    };
    callback(sessionAttributes,
        buildSpeechletResponse(CARD_TITLE, speechOutput, repromptText, shouldEndSession));
}

function populateGameQuestions() {
    var gameQuestions = [];
    var indexList = [];
    var index = questions.length;

    if (GAME_LENGTH > index){
        throw "Invalid Game Length.";
    }

    for (var i = 0; i < questions.length; i++){
        indexList.push(i);
    }

    // Pick GAME_LENGTH random questions from the list to ask the user, make sure there are no repeats.
    for (var j = 0; j < GAME_LENGTH; j++){
        var rand = Math.floor(Math.random() * index);
        index -= 1;

        var temp = indexList[index];
        indexList[index] = indexList[rand];
        indexList[rand] = temp;
        gameQuestions.push(indexList[index]);
    }

    return gameQuestions;
}

function populateRoundAnswers(gameQuestionIndexes, correctAnswerIndex, correctAnswerTargetLocation) {
    // Get the answers for a given question, and place the correct answer at the spot marked by the
    // correctAnswerTargetLocation variable. Note that you can have as many answers as you want but
    // only ANSWER_COUNT will be selected.
    var answers = [],
        answersCopy = questions[gameQuestionIndexes[correctAnswerIndex]][Object.keys(questions[gameQuestionIndexes[correctAnswerIndex]])[0]],
        temp, i;

    var index = answersCopy.length;

    if (index < ANSWER_COUNT){
        throw "Not enough answers for question.";
    }

    // Shuffle the answers, excluding the first element.
    for (var j = 1; j < answersCopy.length; j++){
        var rand = Math.floor(Math.random() * (index - 1)) + 1;
        index -= 1;

        var temp = answersCopy[index];
        answersCopy[index] = answersCopy[rand];
        answersCopy[rand] = temp;
    }

    // Swap the correct answer into the target location
    for (i = 0; i < ANSWER_COUNT; i++) {
        answers[i] = answersCopy[i];
    }
    temp = answers[0];
    answers[0] = answers[correctAnswerTargetLocation];
    answers[correctAnswerTargetLocation] = temp;
    return answers;
}

function handleAnswerRequest(intent, session, callback) {
    var speechOutput = "";
    var sessionAttributes = {};
    var gameInProgress = session.attributes && session.attributes.questions;
    var answerSlotValid = isAnswerSlotValid(intent);
    var userGaveUp = intent.name === "DontKnowIntent";

    if (!gameInProgress) {
        // If the user responded with an answer but there is no game in progress, ask the user
        // if they want to start a new game. Set a flag to track that we've prompted the user.
        sessionAttributes.userPromptedToContinue = true;
        speechOutput = "進行中のゲームはありません。新しいゲームを始めますか？ ";
        callback(sessionAttributes,
            buildSpeechletResponse(CARD_TITLE, speechOutput, speechOutput, false));
    } else if (!answerSlotValid && !userGaveUp) {
        // If the user provided answer isn't a number from 1 to ANSWER_COUNT,
        // return an error message to the user. Remember to guide the user into providing correct values.
        var reprompt = session.attributes.speechOutput;
        speechOutput = "回答は 1 から " + ANSWER_COUNT + "の番号でお願いします。 " + reprompt;
        callback(session.attributes,
            buildSpeechletResponse(CARD_TITLE, speechOutput, reprompt, false));
    } else {
        var gameQuestions = session.attributes.questions,
            correctAnswerIndex = parseInt(session.attributes.correctAnswerIndex),
            currentScore = parseInt(session.attributes.score),
            currentQuestionIndex = parseInt(session.attributes.currentQuestionIndex),
            correctAnswerText = session.attributes.correctAnswerText;

        var speechOutputAnalysis = "";

        if (answerSlotValid && parseInt(intent.slots.Answer.value) == correctAnswerIndex) {
            currentScore++;
            speechOutputAnalysis = "正解です！すごいねー。 ";
        } else {
            if (!userGaveUp) {
                speechOutputAnalysis = "ち、違うと、思う・・・。 "
            }
            speechOutputAnalysis += "正解は " + correctAnswerIndex + ": " + correctAnswerText + "でした。残念。 ";
        }
        // if currentQuestionIndex is 4, we've reached 5 questions (zero-indexed) and can exit the game session
        if (currentQuestionIndex == GAME_LENGTH - 1) {
            speechOutput = userGaveUp ? "" : " ";
            speechOutput += speechOutputAnalysis + "クイズの結果は " + GAME_LENGTH.toString() + "問中" + currentScore.toString() + " 問正解でした。またやってみてね。";
            callback(session.attributes,
                buildSpeechletResponse(CARD_TITLE, speechOutput, "", true));
        } else {
            currentQuestionIndex += 1;
            var spokenQuestion = Object.keys(questions[gameQuestions[currentQuestionIndex]])[0];
            // Generate a random index for the correct answer, from 0 to 3
            correctAnswerIndex = Math.floor(Math.random() * (ANSWER_COUNT));
            var roundAnswers = populateRoundAnswers(gameQuestions, currentQuestionIndex, correctAnswerIndex),

                questionIndexForSpeech = currentQuestionIndex + 1,
                repromptText = "第 " + questionIndexForSpeech.toString() + "問です。" + spokenQuestion + " ";
            for (var i = 0; i < ANSWER_COUNT; i++) {
                repromptText += (i+1).toString() + ". " + roundAnswers[i] + ". "
            }
            speechOutput += userGaveUp ? "" : " ";
            speechOutput += speechOutputAnalysis + "現在のあなたの得点は " + currentScore.toString() + "です。 " + repromptText;

            sessionAttributes = {
                "speechOutput": repromptText,
                "repromptText": repromptText,
                "currentQuestionIndex": currentQuestionIndex,
                "correctAnswerIndex": correctAnswerIndex + 1,
                "questions": gameQuestions,
                "score": currentScore,
                "correctAnswerText":
                    questions[gameQuestions[currentQuestionIndex]][Object.keys(questions[gameQuestions[currentQuestionIndex]])[0]][0]
            };
            callback(sessionAttributes,
                buildSpeechletResponse(CARD_TITLE, speechOutput, repromptText, false));
        }
    }
}

function handleRepeatRequest(intent, session, callback) {
    // Repeat the previous speechOutput and repromptText from the session attributes if available
    // else start a new game session
    if (!session.attributes || !session.attributes.speechOutput) {
        getWelcomeResponse(callback);
    } else {
        callback(session.attributes,
            buildSpeechletResponseWithoutCard(session.attributes.speechOutput, session.attributes.repromptText, false));
    }
}

function handleGetHelpRequest(intent, session, callback) {
    // Provide a help prompt for the user, explaining how the game is played. Then, continue the game
    // if there is one in progress, or provide the option to start another one.
    
    // Ensure that session.attributes has been initialized
    if (!session.attributes) {
        session.attributes = {};
    }

    // Set a flag to track that we're in the Help state.
    session.attributes.userPromptedToContinue = true;

    // Do not edit the help dialogue. This has been created by the Alexa team to demonstrate best practices.

    var speechOutput = "私が " + GAME_LENGTH + " 問の4択クイズを出題します。答えの番号を回答してください。 "
        + " 新しくゲームを始める場合は、クイズスタートと言ってください。 "
        + "問題をもう一度聞く場合はリピートと言ってください。 "
        + "ゲームを続けますか？",
        repromptText = "回答する場合は、答えの番号を言ってください。問題の途中でもOKです。 "
        + "ゲームを続けますか？";
    var shouldEndSession = false;
    callback(session.attributes,
        buildSpeechletResponseWithoutCard(speechOutput, repromptText, shouldEndSession));
}

function handleFinishSessionRequest(intent, session, callback) {
    // End the session with a "Good bye!" if the user wants to quit the game
    callback(session.attributes,
        buildSpeechletResponseWithoutCard("Good bye!", "", true));
}

function isAnswerSlotValid(intent) {
    var answerSlotFilled = intent.slots && intent.slots.Answer && intent.slots.Answer.value;
    var answerSlotIsInt = answerSlotFilled && !isNaN(parseInt(intent.slots.Answer.value));
    return answerSlotIsInt && parseInt(intent.slots.Answer.value) < (ANSWER_COUNT + 1) && parseInt(intent.slots.Answer.value) > 0;
}

// ------- Helper functions to build responses -------


function buildSpeechletResponse(title, output, repromptText, shouldEndSession) {
    return {
        outputSpeech: {
            type: "PlainText",
            text: output
        },
        card: {
            type: "Simple",
            title: title,
            content: output
        },
        reprompt: {
            outputSpeech: {
                type: "PlainText",
                text: repromptText
            }
        },
        shouldEndSession: shouldEndSession
    };
}

function buildSpeechletResponseWithoutCard(output, repromptText, shouldEndSession) {
    return {
        outputSpeech: {
            type: "PlainText",
            text: output
        },
        reprompt: {
            outputSpeech: {
                type: "PlainText",
                text: repromptText
            }
        },
        shouldEndSession: shouldEndSession
    };
}

function buildResponse(sessionAttributes, speechletResponse) {
    return {
        version: "1.0",
        sessionAttributes: sessionAttributes,
        response: speechletResponse
    };
}
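Before moving on to the simulator, the handler above can also be exercised locally with hand-made events. The following is a minimal sketch under some assumptions: the event only contains the fields the handler above actually reads, all IDs are placeholders, and makeIntentEvent, invoke, and echoHandler are names made up for this example.

```javascript
// Build a minimal IntentRequest event shaped like the one the handler reads.
// The IDs below are placeholders, not real values.
function makeIntentEvent(intentName, answer, attributes) {
  return {
    version: '1.0',
    session: {
      new: false,
      sessionId: 'session-local-test',
      application: { applicationId: 'app-local-test' },
      attributes: attributes || {},
    },
    request: {
      type: 'IntentRequest',
      requestId: 'request-local-test',
      intent: {
        name: intentName,
        // Leave slots empty when no answer is given (e.g. DontKnowIntent).
        slots: answer === undefined
          ? {}
          : { Answer: { name: 'Answer', value: String(answer) } },
      },
    },
  };
}

// Wrap the context.succeed / context.fail callbacks in a Promise.
function invoke(handler, event) {
  return new Promise(function (resolve, reject) {
    handler(event, { succeed: resolve, fail: reject });
  });
}

// Stand-in handler for demonstration only; swap in the real exports.handler.
function echoHandler(event, context) {
  context.succeed({ intent: event.request.intent.name });
}

invoke(echoHandler, makeIntentEvent('AnswerIntent', 2)).then(function (result) {
  console.log(result.intent);
});
```

To test the real skill code, pass its exports.handler to invoke in place of echoHandler and feed the sessionAttributes of each response back in as the attributes argument of the next event.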

4. Testing on a real device

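The tests in the Service Simulator went perfectly, so at last it was time to test on a real device... except that my invitation email to buy an Amazon Echo hasn't come! I applied full of enthusiasm, but no invitation arrived.
So instead I used "Echosim.io", an emulator that runs in the browser. Japanese can now be selected, which was not possible a year ago. Impressive.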

Here is a video, embarrassing as it is. The skill is in a terrible state in it, but I tuned and improved it afterward.
https://youtu.be/NIC6lEi6Gqk

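Finally,
I submitted a barely tested garbage skill for certification, and a reply from Amazon arrived the very next day. They tested it properly and pointed out the problems carefully, which recharged my motivation completely. Thank you, Amazon!!
I fixed the issues right away and resubmitted.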

Summary

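・The SDK and built-in features work properly in the Japanese version too, Japanese documentation is growing, and the development environment is excellent.
・The language model is said to be original to the Japanese version, and it speaks quite fluently.
・Its English pronunciation, however, has become really bad, lol. Since it can do both, I would like it to sound bilingual.
・It insists on pronouncing IoT as "イオット" (iotto). I wonder whether the Japanese AI will learn this over time.
・Building the interaction model and tuning the phrasings takes a lot of effort, and it made me realize how hard Japanese is.
・Why hasn't my invitation email come? Damn it.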

That's all. Thank you for reading.
The Advent Calendar continues with more fun engineers after this!
