Amazon Polly
Yesterday, Amazon Polly, a new service that turns text into lifelike speech, was announced at AWS re:Invent 2016. Amazon Polly lets us create applications that talk, enabling us to build entirely new categories of speech-enabled products (press release, official blog, official documentation for developers).
Simple use case of kintone & Amazon Polly
Here I try a simple use case: you enter the text you want to convert to speech in a kintone record, the text is converted to an MP3 voice file, and the file is attached to the record so you can play it in kintone.
Architecture
Record view of kintone
Configuration of kintone & Amazon Polly
kintone app.
Create a kintone application
Create a kintone app with the following fields.
Field label & code | Summary | Field type |
---|---|---|
Text | Text you want to convert to speech | SINGLE_LINE_TEXT |
Attachment | MP3 voice file converted from the text | FILE |
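For reference, the kintone REST API returns a record keyed by field code, and the Lambda code later in this post reads `record['Text']['value']`. The sketch below shows that shape with made-up sample values. `SAMPLE_RESPONSE` and `extract_text` are hypothetical names used only for illustration; they are not part of the original code.

```python
# Hypothetical sample of a record as returned by GET /k/v1/record.json.
# The field codes match the table above; the values are made up.
SAMPLE_RESPONSE = {
    "record": {
        "Text": {"type": "SINGLE_LINE_TEXT", "value": "Hello from kintone"},
        "Attachment": {"type": "FILE", "value": []},
    }
}


def extract_text(response, default="Text field is empty."):
    """Return the text to synthesize, or a default when the field is empty."""
    value = response["record"]["Text"]["value"]
    return value if value else default
```

Falling back to a default sentence mirrors what the Lambda handler does when the Text field is empty.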
(AWS) IAM
We create a new role that lets Lambda access Polly and CloudWatch.
Create New Role
Enter "lambda_polly_exec_role" as the role name and click "Next Step".
Click "Select" in the row of "AWS Lambda" and click "Next Step".
Check "AmazonPollyFullAccess" and "CloudWatchFullAccess", and click "Next Step".
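The same role could also be created from a script instead of the console. The following is a minimal sketch, assuming the two AWS managed policies named above and the standard trust policy that lets the Lambda service assume a role. `create_role` is a hypothetical helper name; calling it without an argument needs boto3 and AWS credentials that are allowed to manage IAM.

```python
import json

ROLE_NAME = "lambda_polly_exec_role"

# Trust policy that lets the Lambda service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policies selected in the console steps above.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonPollyFullAccess",
    "arn:aws:iam::aws:policy/CloudWatchFullAccess",
]


def create_role(iam_client=None):
    """Create the role and attach the managed policies."""
    if iam_client is None:
        import boto3  # imported lazily so the constants can be inspected offline
        iam_client = boto3.client("iam")
    iam_client.create_role(
        RoleName=ROLE_NAME,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    for arn in MANAGED_POLICY_ARNS:
        iam_client.attach_role_policy(RoleName=ROLE_NAME, PolicyArn=arn)
```

Full-access policies keep the walkthrough short; a production role would usually be narrowed to the actions the function really needs.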
AWS Lambda
We create the Lambda function that accesses kintone and Polly.
First, we create a working directory named workspace, like this.
$ mkdir workspace
$ cd workspace
Create Lambda function
We have to replace {kintone domain}, {app. id}, and {api token} with your own values (see also kintone REST API).
from __future__ import print_function
import json
import boto3
from botocore.exceptions import BotoCoreError, ClientError
from contextlib import closing
import os
import sys
from tempfile import gettempdir
import requests

# parameters of kintone
KINTONE_DOMAIN = "{kintone domain}"
BASE_URL = "https://" + KINTONE_DOMAIN
APP_ID = "{app. id}"
API_TOKEN = "{api token}"

# parameters of AWS
REGION = 'us-east-1'

# file name attached to kintone
MP3_FILE_NAME = 'speech.mp3'

print('Loading function')

# get a kintone record
def getRecord(id):
    query = 'app=' + APP_ID + '&id=' + id
    url = BASE_URL + "/k/v1/record.json?" + query
    headers = {"X-Cybozu-API-Token": API_TOKEN}
    # http request with requests
    res = requests.get(url, headers=headers, data={})
    print(res.text, res.status_code)
    return {"res": json.loads(res.text), "code": res.status_code}

# upload the MP3 file to kintone
def uploadMP3File(file):
    fileUrl = BASE_URL + "/k/v1/file.json"
    fileHeaders = {"X-Cybozu-API-Token": API_TOKEN}
    # open the file in a with block so the handle is closed after the upload
    with open(file, 'rb') as mp3:
        files = {'file': (MP3_FILE_NAME, mp3, 'audio/mp3', {'Expires': '0'})}
        res = requests.post(fileUrl, headers=fileHeaders, files=files)
    print(res.text, res.status_code)
    fileKey = json.loads(res.text)['fileKey']
    return fileKey

# update the kintone record by a fileKey
def updateRecordForFile(recordId, fileKey):
    fileKeys = [{"fileKey": fileKey}]
    record = {'Attachment': {'value': fileKeys}}
    request = {'app': APP_ID, 'id': recordId, 'record': record}
    requestJson = json.dumps(request)
    url = BASE_URL + "/k/v1/record.json"
    headers = {"X-Cybozu-API-Token": API_TOKEN, "Content-Type": "application/json"}
    res = requests.put(url, headers=headers, data=requestJson)
    print(res.text, res.status_code)
    return {"res": json.loads(res.text), "code": res.status_code}

# build the MP3 file from text with Amazon Polly API
def buildMP3File(text):
    # Create a Polly client; on Lambda the credentials come from the execution role
    polly = boto3.client('polly', region_name=REGION)
    try:
        # Request speech synthesis
        response = polly.synthesize_speech(Text=text, OutputFormat="mp3",
                                           VoiceId="Joey")
    except (BotoCoreError, ClientError) as error:
        # The service returned an error, exit gracefully
        print(error)
        sys.exit(-1)
    # Access the audio stream from the response
    if "AudioStream" in response:
        # Note: Closing the stream is important as the service throttles on the
        # number of parallel connections. Here we are using contextlib.closing to
        # ensure the close method of the stream object will be called automatically
        # at the end of the with statement's scope.
        with closing(response["AudioStream"]) as stream:
            output = os.path.join(gettempdir(), MP3_FILE_NAME)
            try:
                # Open a file for writing the output as a binary stream
                with open(output, "wb") as file:
                    file.write(stream.read())
            except IOError as error:
                # Could not write to file, exit gracefully
                print(error)
                sys.exit(-1)
    else:
        # The response didn't contain audio data, exit gracefully
        print("Could not stream audio")
        sys.exit(-1)
    return output

# Lambda function
def lambda_handler(event, context):
    recordId = event['params']['querystring']['id']  # obtain the record id of kintone via Amazon API Gateway
    obj = getRecord(recordId)
    text = obj['res']['record']['Text']['value']  # obtain the text to convert
    if len(text) == 0:
        text = 'Text field is empty.'
    mp3 = buildMP3File(text)  # convert text to MP3-voice
    fileKey = uploadMP3File(mp3)  # upload the MP3 file to kintone and obtain the fileKey
    updateRecordForFile(recordId, fileKey)  # update the kintone record with attachment of the MP3 file
    return 'Process is completed.'
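
The handler above expects the record ID at `event['params']['querystring']['id']`. The sketch below shows an event of that shape and a more defensive way to read the ID. It assumes the event comes from API Gateway's "Method Request passthrough" mapping template, which nests query string parameters this way. `SAMPLE_EVENT` and `get_record_id` are hypothetical names for illustration only.

```python
# Hypothetical event, shaped like the output of API Gateway's
# "Method Request passthrough" mapping template for GET ...?id=1.
SAMPLE_EVENT = {
    "body-json": {},
    "params": {"path": {}, "querystring": {"id": "1"}, "header": {}},
}


def get_record_id(event):
    """Return the kintone record ID from the event, or raise ValueError."""
    record_id = event.get("params", {}).get("querystring", {}).get("id")
    if not record_id or not str(record_id).isdigit():
        raise ValueError("query string parameter 'id' must be a record number")
    return str(record_id)
```

An event like `SAMPLE_EVENT` can also be pasted into the Lambda console as a test event, provided the ID refers to an existing record in your app.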
In workspace, save the code above as lambda_function.py, install the [requests](http://docs.python-requests.org/en/master/) and boto3 libraries into the same directory, and zip everything into a deployment package for the Lambda function.
$ pip install boto3 -t .
$ pip install requests -t .
$ zip -r upload.zip *
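If the `zip` command is not available, the package can also be built with Python's standard `zipfile` module. This is a minimal sketch, not the original author's method; `build_package` is a hypothetical helper name. Unlike `zip -r upload.zip *`, it also picks up hidden files.

```python
import os
import zipfile


def build_package(source_dir, zip_path):
    """Zip every file under source_dir, with paths relative to source_dir."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_abs, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # do not add the archive to itself
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_abs
```

Calling `build_package(".", "upload.zip")` from inside workspace keeps lambda_function.py at the root of the archive, which is where Lambda looks for the handler module.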
Choose "N. Virginia (us-east-1)" as the region and click “Create a Lambda function”.
Choose “Blank Function” to configure a Lambda function.
Configure the Lambda function as below.
Amazon API Gateway
We create an API that invokes the Lambda function and passes the kintone parameters to it via API Gateway.
Create API
Type "kintone-polly" as the API name and click "Create API".
Choose "Create Method" from "Actions".
Choose "GET" and click the check mark.
Configure the GET method as below.
Normally all four sections need to be set up, but to keep things minimal we configure only "Integration Request".
Configure Integration Request as follows.
Choose "Deploy API" from "Actions".
Complete the deployment settings as below.
Copy the invoke URL; we call it from the kintone JavaScript customization later.
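Before wiring the invoke URL into kintone, it can help to see the exact request the customization will send: the invoke URL with the record ID as an `id` query string parameter. The sketch below only builds that URL. `build_invoke_url` is a hypothetical helper, and the URL in the usage note is a placeholder, not a real endpoint.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def build_invoke_url(base_url, record_id):
    """Append the record ID as the 'id' query string parameter."""
    return base_url + "?" + urlencode({"id": record_id})
```

For example, `build_invoke_url("https://example.invalid/prod", 12)` yields a URL ending in `?id=12`, which can be opened with any HTTP client to test the API on its own.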
kintone JavaScript/CSS Customization
Finally, we set up the kintone JavaScript/CSS customization that calls the API we configured.
Main file
We save this main file as "polly.js". We have to replace {api gw url} with the invoke URL of your API.
jQuery.noConflict();
(function($) {
    'use strict';

    // API to convert text to speech, and to upload the converted MP3 file to kintone with Amazon Polly
    var POLLY_URL = '{api gw url}';

    // show spinner
    var showSpinner = function() {
        // initialization
        if ($('.kintone-spinner').length === 0) {
            // create elements for spinner and background
            var spin_div = $('<div id ="kintone-spin" class="kintone-spinner"></div>');
            var spin_bg_div = $('<div id ="kintone-spin-bg" class="kintone-spinner"></div>');
            // append spinner element to "body"
            $(document.body).append(spin_div, spin_bg_div);
            // style for spinner
            $(spin_div).css({
                'position': 'fixed',
                'top': '50%',
                'left': '50%',
                'z-index': '510',
                'background-color': '#fff',
                'padding': '26px',
                '-moz-border-radius': '4px',
                '-webkit-border-radius': '4px',
                'border-radius': '4px'
            });
            $(spin_bg_div).css({
                'position': 'absolute',
                'top': '0px',
                'z-index': '500',
                'width': '150%',
                'height': '150%',
                'background-color': '#000',
                'opacity': '0.5',
                'filter': 'alpha(opacity=50)',
                '-ms-filter': "alpha(opacity=50)"
            });
            // options for spinner
            var opts = {
                'color': '#000'
            };
            // invoke spinner
            new Spinner(opts).spin(document.getElementById('kintone-spin'));
        }
        // start(show) spinner
        $('.kintone-spinner').show();
    };

    // stop(hide) spinner
    var hideSpinner = function() {
        // hide spinner element
        $('.kintone-spinner').hide();
    };

    // download file from kintone
    var downloadFile = function(fileKey, callback, errback) {
        var url = kintone.api.url('/k/v1/file', false) + '?fileKey=' + fileKey;
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url);
        xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
        xhr.responseType = 'blob';
        xhr.onload = function() {
            if (xhr.status === 200) {
                // success
                var blob = new Blob([xhr.response]);
                return callback(blob);
            } else {
                // error
                var err = JSON.parse(xhr.responseText);
                if (errback) {
                    return errback(err);
                }
            }
        };
        xhr.onerror = function(err) {
            // error
            if (errback) {
                return errback(err);
            }
        };
        xhr.send();
    };

    // convert Blob to Base64
    var convertBlobToBase64 = function(blob, callback, errback) {
        var reader = new window.FileReader();
        reader.readAsDataURL(blob);
        reader.onload = function() {
            // success
            var base64data = reader.result;
            return callback(base64data.split(',', 2)[1]);
        };
        reader.onerror = function(err) {
            // error
            if (errback) {
                return errback(err);
            }
        };
    };

    // submit success events
    kintone.events.on([
        'app.record.create.submit.success',
        'app.record.edit.submit.success'
    ], function(event) {
        // show spinner
        showSpinner();
        // call API with Amazon Polly
        var id = event.recordId;
        var query = 'id=' + id;
        var url = POLLY_URL + '?' + query;
        return kintone.proxy(url, 'GET', {}, {}).then(function(r) {
            // hide spinner
            hideSpinner();
            return event;
        }).catch(function(e) {
            // hide spinner
            hideSpinner();
            event.error = 'Error occurred.';
            return event;
        });
    });

    // after showing record details
    kintone.events.on(['app.record.detail.show'], function(event) {
        // show spinner
        showSpinner();
        var record = event.record;
        new kintone.Promise(function(resolve, reject) {
            // download the MP3 file
            var file = record['Attachment'].value[0];
            var fileKey = file.fileKey;
            downloadFile(fileKey, function(r) {
                return resolve(r);
            }, function(e) {
                return reject(e);
            });
        }).then(function(blob) {
            // convert the MP3 blob file to Base64 style
            return new kintone.Promise(function(resolve, reject) {
                convertBlobToBase64(blob, function(r) {
                    return resolve(r);
                }, function(e) {
                    return reject(e);
                });
            });
        }).then(function(base64data) {
            // hide spinner
            hideSpinner();
            // append the Base64-encoded MP3 file to "audio" element
            var audioUrl = 'data:audio/mp3;base64,' + base64data;
            $(document.body).append(
                $('<audio>').prop({
                    'id': "voice",
                    'preload': "auto"
                }).append(
                    $('<source>').prop({
                        'src': audioUrl,
                        'type': "audio/mp3"
                    })
                )
            );
            // append the button that starts to play the MP3 file
            var elAttachment = kintone.app.record.getFieldElement('Attachment');
            $(elAttachment).append(
                $('<button>').prop({
                    id: 'speak'
                }).addClass('kintoneplugin-button-dialog-ok').text('Speak')
            );
            // attach click event to the "#speak" button for playing
            $('#speak').click(function() {
                var voice = $('#voice');
                voice[0].currentTime = 0;
                voice[0].play();
            });
            return;
        }).catch(function(e) {
            hideSpinner();
            console.log(e);
        });
    });
})(jQuery);
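The detail view plays the file by turning the downloaded MP3 into a Base64 data URL and using it as the source of an `audio` element. To make that encoding step concrete, here is the same transformation sketched in Python. `to_audio_data_url` is a hypothetical helper and is not part of the customization itself.

```python
import base64


def to_audio_data_url(mp3_bytes):
    """Encode raw MP3 bytes as a data URL usable as an <audio> source."""
    encoded = base64.b64encode(mp3_bytes).decode("ascii")
    return "data:audio/mp3;base64," + encoded
```

Because the whole file is embedded in the URL, this approach suits short clips such as a single line of synthesized text.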
Customizing an App with JavaScript & CSS
Set the following links and files in the JavaScript and CSS Customization view.
- Upload JavaScript for PC
  - https://js.cybozu.com/jquery/3.0.0/jquery.min.js [link]
  - https://js.cybozu.com/spinjs/2.3.2/spin.min.js [link]
  - polly.js [saved file]
- Upload CSS for PC
  - 51-current-default.css [downloaded file]
51-current-default.css is downloaded from the "kintone Plug-in SDK".
Save text and convert
When you save a record containing the text you want to convert, the text is converted to speech, the MP3 voice file is attached to the record, and you can play it in kintone.
Information
Please stop by AWS re:Invent Booth #2534 to see what our team developed with it! Let's experience it!