kintone gets lifelike speech from text via Amazon Polly

Posted at 2016-12-01

Amazon Polly

Yesterday, “Amazon Polly”, a new service that turns text into lifelike speech, was announced at AWS re:Invent 2016. Amazon Polly lets us create applications that talk, enabling us to build entirely new categories of speech-enabled products (press release, official blog, official documentation for developers).

polly18.png

Simple use case of kintone & Amazon Polly

Here I try a simple use case: you enter the text you want to convert to voice, obtain the voice file in MP3 format, and play it on kintone.

Architecture

Here is the architecture.
Untitled (4).png

Record view of kintone

polly19.png

Configuration of kintone & Amazon Polly

kintone app.

Create a kintone application

Create a kintone application with the following fields.

| Field label & code | Summary | Field type |
| --- | --- | --- |
| Text | text field you want to convert to speech | SINGLE_LINE_TEXT |
| Attachment | file field to attach the MP3 voice file converted from the text | FILE |
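
As a rough illustration of how these two fields appear to the Lambda function later on: the kintone REST API returns each field keyed by its field code, with a `type` and a `value`. The sketch below assumes a response shaped that way; `SAMPLE_RESPONSE` and `extract_text` are hypothetical names used only for this example, not part of the original function.

```python
# Hypothetical sample of what GET /k/v1/record.json returns for this app.
# Only the two fields listed above are shown; a real response also carries
# built-in fields such as the record number.
SAMPLE_RESPONSE = {
    "record": {
        "Text": {"type": "SINGLE_LINE_TEXT", "value": "Hello from kintone"},
        "Attachment": {"type": "FILE", "value": []},
    }
}


def extract_text(response, default="Text field is empty."):
    """Return the value of the Text field, or a default when it is blank."""
    field = response.get("record", {}).get("Text", {})
    value = field.get("value") or ""
    return value if value.strip() else default


print(extract_text(SAMPLE_RESPONSE))
```

The field codes must match the strings used in the code exactly ("Text" and "Attachment"), so it is worth checking them in the app settings.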

スクリーンショット 2016-12-01 2.17.38.png

(AWS) IAM

We create a new role that lets Lambda access Polly and CloudWatch.

Create New Role

click "Create New Role".
polly1.png

input "lambda_polly_exec_role", and click "Next Step"
polly2.png

click "select" in the row of "AWS Lambda" and click "Next Step"
polly3.png

check "AmazonPollyFullAccess" and "CloudWatchFullAccess", and click "Next Step".
polly4.png

click "Create Role"
polly5.png
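
For readers who prefer to see the role as data rather than console clicks, the sketch below writes out what the steps above amount to: a trust policy that lets the Lambda service assume the role, plus the two managed policies. The variable names are hypothetical, and the snippet only builds and prints the documents; it does not call AWS.

```python
import json

# Role name used in the console steps above.
ROLE_NAME = "lambda_polly_exec_role"

# Trust policy for a role of type "AWS Lambda": it allows the Lambda
# service to assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policies checked in the console.
POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonPollyFullAccess",
    "arn:aws:iam::aws:policy/CloudWatchFullAccess",
]

# These values could be handed to the IAM API (for example boto3's
# create_role and attach_role_policy); here they are only printed.
print(ROLE_NAME)
print(json.dumps(TRUST_POLICY, indent=2))
for arn in POLICY_ARNS:
    print(arn)
```

The full-access policies keep the walkthrough short; a narrower policy would probably be preferable outside a demo.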

AWS Lambda

We create the Lambda function that accesses kintone and Polly.
First, we create a working directory named workspace like this.

$ mkdir workspace
$ cd workspace

Create Lambda function

Replace {kintone domain}, {app. id}, and {api token} with your own values (see also the kintone REST API).

lambda_function.py


from __future__ import print_function

import json
import boto3
from botocore.exceptions import BotoCoreError, ClientError
from contextlib import closing
import os
import sys
import subprocess
from tempfile import gettempdir
import requests

# parameters of kintone
KINTONE_DOMAIN = "{kintone domain}"
BASE_URL = "https://"+ KINTONE_DOMAIN
APP_ID = "{app. id}"
API_TOKEN = "{api token}"

# parameters of AWS
REGION = 'us-east-1'

# file name attached to kintone
MP3_FILE_NAME = 'speech.mp3'

print('Loading function')

# get a kintone record
def getRecord(id):
    query = 'app='+APP_ID+'&id='+id
    url = BASE_URL + "/k/v1/record.json?" + query
    headers = {"X-Cybozu-API-Token": API_TOKEN}
    # http request with requests
    res = requests.get(url, headers=headers, data={})
    print(res.text, res.status_code)
    return {"res":json.loads(res.text), "code":res.status_code}

# upload the MP3 file to kintone
def uploadMP3File(file):
    fileUrl = BASE_URL + "/k/v1/file.json"
    fileHeaders = {"X-Cybozu-API-Token": API_TOKEN}
    files = {'file': (MP3_FILE_NAME, open(file, 'rb'), 'audio/mp3', {'Expires': '0'})}
    res = requests.post(fileUrl, headers=fileHeaders, files=files)
    print(res.text, res.status_code)
    fileKey = json.loads(res.text)['fileKey']
    return fileKey

# update the kintone record by a fileKey
def updateRecordForFile(recordId, fileKey):
    fileKeys = [{"fileKey":fileKey}]
    record = {'Attachment':{'value':fileKeys}}
    request = {'app':APP_ID,'id': recordId,'record':record}
    requestJson = json.dumps(request)
    url = BASE_URL + "/k/v1/record.json"
    headers = {"X-Cybozu-API-Token": API_TOKEN, "Content-Type" : "application/json"}
    res = requests.put(url, headers=headers, data=requestJson)
    print(res.text, res.status_code)
    return {"res":json.loads(res.text), "code":res.status_code}

# build the MP3 file from text with Amazon Polly API
def buildMP3File(text):
    # Create a Polly client in REGION; credentials come from the Lambda execution role
    polly = boto3.client('polly', region_name=REGION)

    try:
        # Request speech synthesis
        response = polly.synthesize_speech(Text=text, OutputFormat="mp3",
                                            VoiceId="Joey")
    except (BotoCoreError, ClientError) as error:
        # The service returned an error, exit gracefully
        print(error)
        sys.exit(-1)

    # Access the audio stream from the response
    if "AudioStream" in response:
        # Note: Closing the stream is important as the service throttles on the
        # number of parallel connections. Here we are using contextlib.closing to
        # ensure the close method of the stream object will be called automatically
        # at the end of the with statement's scope.
        with closing(response["AudioStream"]) as stream:
            output = os.path.join(gettempdir(), MP3_FILE_NAME)
            try:
                # Open a file for writing the output as a binary stream
                with open(output, "wb") as file:
                    file.write(stream.read())
            except IOError as error:
                # Could not write to file, exit gracefully
                print(error)
                sys.exit(-1)

    else:
        # The response didn't contain audio data, exit gracefully
        print("Could not stream audio")
        sys.exit(-1)
    return output

# Lambda function
def lambda_handler(event, context):
    recordId = event['params']['querystring']['id'] # obtain the record id of kintone via Amazon API Gateway
    obj = getRecord(recordId)
    text = obj['res']['record']['Text']['value'] # obtain the text to convert
    if len(text) == 0:
        text = 'Text field is empty.'
    mp3 = buildMP3File(text) # convert text to MP3-voice
    fileKey = uploadMP3File(mp3) # upload the MP3 file to kintone and obtain the fileKey
    updateRecordForFile(recordId, fileKey) # update the kintone record with attachment of the MP3 file
    return 'Process is completed.'
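
The handler above reads the record id from `event['params']['querystring']['id']`, which matches the event shape produced by an API Gateway "Method Request Passthrough" style mapping template. As a minimal sketch under that assumption, the helper below validates the id before any request is made; `get_record_id` and `sample_event` are hypothetical names and not part of the original function.

```python
def get_record_id(event):
    """Pull the kintone record id out of an API Gateway event.

    Assumes query parameters are nested under event["params"]["querystring"],
    as in the handler above.
    """
    query = (event.get("params") or {}).get("querystring") or {}
    record_id = str(query.get("id", "")).strip()
    if not record_id.isdigit():
        raise ValueError("query parameter 'id' must be a numeric record id")
    return record_id


# Hypothetical event for a request such as GET {invoke URL}?id=12
sample_event = {"params": {"path": {}, "querystring": {"id": "12"}, "header": {}}}
print(get_record_id(sample_event))
```

Failing early with a clear message is easier to trace in the CloudWatch logs than a `KeyError` raised inside the handler.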

In workspace, save lambda_function.py, install the requests (http://docs.python-requests.org/en/master/) and boto3 libraries, and zip everything into a deployment package for the Lambda function.

$ pip install boto3 -t .
$ pip install requests -t .
$ zip -r upload.zip *
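
If the `zip` command is not available, the same package can be built with Python's standard `zipfile` module. This is a minimal Python 3 sketch, assuming it is run against the workspace directory; `build_package` is a hypothetical helper name.

```python
import os
import zipfile


def build_package(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir.

    Returns the number of files added.
    """
    zip_abs = os.path.abspath(zip_path)
    count = 0
    with zipfile.ZipFile(zip_abs, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # do not add the archive to itself
                zf.write(full, os.path.relpath(full, src_dir))
                count += 1
    return count


# Example call from inside the workspace directory:
# build_package(".", "upload.zip")
```

Either way, lambda_function.py has to sit at the top level of the archive so that Lambda can find the handler.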

choose "N. Virginia(us-east-1)" and click “Create a Lambda function”.
polly6.png

choose “Blank Function” to configure a Lambda function.
polly7.png

click "Next"
polly8.png

configure a Lambda function as below.
polly9.png

click "create function".
polly10.png

Amazon API Gateway

Create an API that invokes the Lambda function and passes kintone parameters to it via API Gateway.

Create API

click "Create API".
polly11.png

type "kintone-polly" as API name and click "Create API".
polly12.png

choose "Create Method" from "Actions".
polly13.png

choose "GET" and click the check mark.
polly14.png

configure GET method API as below.
polly15.png

click "OK".
polly16.png

configure only "Integration Request" as the minimum setup, although in general all four sections should be set up.
polly17.png

configure Integration Request as follows.
polly18.png

choose "Deploy API" from "Actions".
polly19.png

finish the API settings as below.
polly20.png

copy the invoke URL; the kintone JavaScript customization calls it later.
polly21.png
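
The request that kintone will send is simply the invoke URL with the record id as an `id` query parameter. The Python 3 sketch below builds such a URL, which can be handy for checking the API by hand before wiring up kintone; `build_polly_url` and the example host are hypothetical.

```python
from urllib.parse import urlencode


def build_polly_url(invoke_url, record_id):
    """Append the kintone record id to the invoke URL as the 'id' query parameter."""
    return invoke_url + "?" + urlencode({"id": record_id})


# Hypothetical invoke URL; use the one copied from the API Gateway console.
EXAMPLE_INVOKE_URL = "https://example.execute-api.us-east-1.amazonaws.com/prod"
print(build_polly_url(EXAMPLE_INVOKE_URL, 12))
```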

kintone JavaScript/CSS Customization

Finally, we set kintone JavaScript/CSS customization to call the API we configured.

Main file

We save this main file as "polly.js". Replace {api gw url} with the invoke URL you copied.

polly.js
jQuery.noConflict();
(function($) {
  'use strict';

  // API to convert text to speech, and to upload the converted MP3 file to kintone with Amazon Polly
  var POLLY_URL = '{api gw url}';

  // show spinner
  var showSpinner = function() {
    // initialization
    if ($('.kintone-spinner').length === 0) {
      // create elements for spinner and background
      var spin_div = $('<div id ="kintone-spin" class="kintone-spinner"></div>');
      var spin_bg_div = $('<div id ="kintone-spin-bg" class="kintone-spinner"></div>');

      // append spinner element to "body"
      $(document.body).append(spin_div, spin_bg_div);

      // style for spinner
      $(spin_div).css({
        'position': 'fixed',
        'top': '50%',
        'left': '50%',
        'z-index': '510',
        'background-color': '#fff',
        'padding': '26px',
        '-moz-border-radius': '4px',
        '-webkit-border-radius': '4px',
        'border-radius': '4px'
      });
      $(spin_bg_div).css({
        'position': 'absolute',
        'top': '0px',
        'z-index': '500',
        'width': '150%',
        'height': '150%',
        'background-color': '#000',
        'opacity': '0.5',
        'filter': 'alpha(opacity=50)',
        '-ms-filter': "alpha(opacity=50)"
      });

      // options for spinner
      var opts = {
        'color': '#000'
      };

      // invoke spinner
      new Spinner(opts).spin(document.getElementById('kintone-spin'));
    }

    // start(show) spinner
    $('.kintone-spinner').show();
  };

  // stop(hide) spinner
  var hideSpinner = function() {
    // hide spinner element
    $('.kintone-spinner').hide();
  };

  // download file from kintone
  var downloadFile = function(fileKey, callback, errback) {
    var url = kintone.api.url('/k/v1/file', false) + '?fileKey=' + fileKey;
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
    xhr.responseType = 'blob';
    xhr.onload = function() {
      if (xhr.status === 200) {
        // success
        var blob = new Blob([xhr.response]);
        return callback(blob);
      } else {
        // error
        var err = JSON.parse(xhr.responseText);
        if (errback) {
          return errback(err);
        }
      }
    };
    xhr.onerror = function(err) {
      // error
      if (errback) {
        return errback(err);
      }
    };
    xhr.send();
  };

  // convert Blob to Base64
  var convertBlobToBase64 = function(blob, callback, errback) {
    var reader = new window.FileReader();
    reader.readAsDataURL(blob);
    reader.onload = function() {
      // success
      var base64data = reader.result;
      return callback(base64data.split(',', 2)[1]);
    };
    reader.onerror = function(err) {
      // error
      if (errback) {
        return errback(err);
      }
    };
  };

  // submit success events
  kintone.events.on([
    'app.record.create.submit.success',
    'app.record.edit.submit.success',
  ], function(event) {
    // show spinner
    showSpinner();
    // call API with Amazon Polly
    var id = event.recordId;
    var query = 'id=' + id;
    var url = POLLY_URL + '?' + query;
    return kintone.proxy(url, 'GET', {}, {}).then(function(r){
      // hide spinner
      hideSpinner();
      return event;
    }).catch(function(e){
      // hide spinner
      hideSpinner();
      event.error = 'Error occurred.';
      return event;
    });
  });

  // after showing record details
  kintone.events.on(['app.record.detail.show'], function(event) {
    // show spinner
    showSpinner();
    var record = event.record;
    new kintone.Promise(function(resolve, reject) {
      // download the MP3 file
      var file = record['Attachment'].value[0];
      var fileKey = file.fileKey;
      downloadFile(fileKey, function(r) {
        return resolve(r);
      }, function(e) {
        return reject(e);
      });
    }).then(function(blob) {
      // convert the MP3 blob file to Base64 style
      return new kintone.Promise(function(resolve, reject) {
        convertBlobToBase64(blob, function(r) {
          return resolve(r);
        }, function(e) {
          return reject(e);
        });
      });
    }).then(function(base64data) {
      // hide spinner
      hideSpinner();
      // append the Base64-encoded MP3 file to "audio" element
      var audioUrl = 'data:audio/mp3;base64,' + base64data;
      $(document.body).append(
        $('<audio>').prop({
          'id': "voice",
          'preload': "auto"
        }).append(
          $('<source>').prop({
            'src': audioUrl,
            'type': "audio/mp3"
          })
        )
      );

      // append the button that starts to play the MP3 file
      var elAttachment = kintone.app.record.getFieldElement('Attachment');
      $(elAttachment).append(
        $('<button>').prop({
          id: 'speak'
        }).addClass('kintoneplugin-button-dialog-ok').text('Speak')
      );

      // attach click event to the "#speak" button for playing
      $('#speak').click(function() {
        var voice = $('#voice');
        voice[0].currentTime = 0;
        voice[0].play();
      });
      return;
    }).catch(function(e) {
      hideSpinner();
      console.log(e);
    });
  });
})(jQuery);

Customizing an App with JavaScript & CSS

Set the links and the file as follows in the JavaScript and CSS Customization view.
polly22.png

51-current-default.css can be downloaded from the "kintone Plug-in SDK".

Save text and convert

If you enter the text you want to convert to voice and save the record, you'll be able to obtain the voice file in MP3 format and play it on kintone.
polly19.png

Related Topics

Information

Please stop by AWS re:Invent Booth #2534 to see what our team developed with it! Let's experience it!
