We wrote late last year about how Amazon Web Services (AWS) were about to put the final nail in the coffin of Dragon, the voice recognition dinosaur from Nuance, with their foray into the voice recognition world. Click here to read the post, which is titled Amazon AWS Transcribe, The Nuance Dragon Slayer?
The big deal for us with the AWS Transcribe technology was the ability to convert multiple individual voices in an audio file (or audio stream) to text. This is something that Nuance has never managed to achieve and that customers have wanted. It means that audio recordings of meetings, interviews and round tables, in fact any good-quality audio containing multiple speaking voices, can now be automatically transcribed from voice to text. Click here to see our blog post on AWS Transcribe, with a demo of how it converts voice to text for the audio of a two-person interview.
The excitement continues with Amazon’s text-to-speech capabilities. Easy-to-use plugins are popping up all over the internet which integrate not only text-to-speech but also foreign-language translated audio into applications and websites. We are using one on this blog, which we installed, activated and implemented in 2 minutes. This blog runs on WordPress and the plugin we are using for our text-to-speech is called Amazon AI. Amazon AI is powered by Amazon Polly, a service that turns text into lifelike speech.
We have dabbled with adding an audio read-aloud of our blog posts for some years. Originally we used Soundcloud to host our audio and used their code to embed it in our posts. With website visitors constantly connected to mobile devices, it makes sense to enable an audio read of your blog posts so that visitors can listen to, rather than read, your content. With Amazon AI, as you can hear from this blog post example, the audio is clear and accurate, and the quality of the read is excellent.
To use the Amazon AI plugin on your WordPress blog you will need an AWS account. Accounts are free to set up, but please note that there may be costs for using some AWS services, depending on how much you use them, as AWS employs a pay-as-you-go model. Having said that, AWS also have free tiers, usually enough to get you started and tested for free. You link your WordPress blog to your AWS account via two keys: an Access Key ID and a Secret Access Key. Both can be found in the “Your Security Credentials” settings of your AWS account. Never post these keys or give them to anyone.
With the Amazon AI WordPress plugin installed, the first step is to connect the plugin to your AWS account using your Access Keys.
To convert the written text of your blog posts to audio, Amazon AI requires some storage. This can be either on your own server, if your WordPress is self-hosted, or on Amazon’s cloud storage, known as S3.
The Text-To-Speech section of the Amazon AI plugin is where you select the voice and accent that you want to hear. A number of male and female voices are available, with various accents including Australian, English, US and even Welsh. This section also defines the audio playback speed, whether to include automated breaths for a more realistic read, the position of the audio playback (either before or after the blog post) and whether or not to autoplay the audio.
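For readers curious about what settings like these might look like when talking to Polly directly, here is a minimal Python sketch. It is not the plugin's own code, which we have not inspected. It assumes the boto3 AWS SDK is installed and that your credentials are already configured outside the script; the voice name "Amy", the function names and the output file name are our own illustrative choices. Speaking rate and automated breaths are expressed with Polly's SSML tags.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, rate_percent: int = 100, auto_breaths: bool = False) -> str:
    """Wrap plain text in SSML that sets the speaking rate and,
    optionally, asks Polly to add automated breaths."""
    # Escape &, < and > so the post text cannot break the SSML markup.
    body = escape(text)
    body = f'<prosody rate="{rate_percent}%">{body}</prosody>'
    if auto_breaths:
        body = f"<amazon:auto-breaths>{body}</amazon:auto-breaths>"
    return f"<speak>{body}</speak>"


def synthesize_to_file(text: str, voice_id: str = "Amy", out_path: str = "post.mp3") -> None:
    """Send the SSML to Polly and save the returned MP3 audio.

    Assumption: boto3 is installed and finds credentials on its own
    (for example from environment variables), so no keys appear here.
    """
    import boto3  # third-party AWS SDK, imported lazily

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=build_ssml(text, rate_percent=100, auto_breaths=True),
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    # Only builds the SSML locally; no AWS call is made here.
    print(build_ssml("Hello from our blog", rate_percent=90, auto_breaths=True))
```

Keeping the SSML builder separate from the AWS call means you can check the markup locally before spending any Polly characters.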
The Amazon AI WordPress plugin can take your audio even further, with options to translate your audio from one language to another and even publish the audio it has created as a podcast. Both of these features require you to use Amazon’s S3 storage for the audio files. Think about the reach your blog can now have with auto-translation and a read-out of your posts in a different language to the one you wrote them in.
The Amazon AI WordPress Plugin uses an Amazon service called Polly to convert your text to speech. There are costs for using the Polly service, which are detailed on the Amazon Polly website (click here). Polly, like many Amazon services, does have a free tier. In your first year, Polly will let you process 5 million characters per month for free. After that, you will be looking at a cost of $4 per one million characters processed by Polly.
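To get a feel for what those numbers mean for a blog, here is a rough Python sketch of the arithmetic. It uses only the figures quoted above (5 million free characters per month in the first year, $4 per million characters otherwise); the function name and the 6,000-character post length are our own assumptions, and we assume usage above the free allowance is charged at the same $4 rate. Check the Polly pricing page for the current figures before relying on it.

```python
FREE_CHARS_PER_MONTH = 5_000_000   # free tier quoted above, first year only
PRICE_PER_MILLION_USD = 4.00       # price quoted above per one million characters


def monthly_polly_cost(chars: int, in_free_tier: bool) -> float:
    """Estimate one month's Polly cost in US dollars for `chars` characters."""
    if chars < 0:
        raise ValueError("chars must be non-negative")
    # Inside the free tier only the characters above the allowance are billed
    # (an assumption about how overage is charged).
    billable = max(0, chars - FREE_CHARS_PER_MONTH) if in_free_tier else chars
    return billable / 1_000_000 * PRICE_PER_MILLION_USD


if __name__ == "__main__":
    post_chars = 6_000  # assumed length of one blog post, in characters
    print(f"One post, no free tier: ${monthly_polly_cost(post_chars, False):.3f}")
    print(f"One post, free tier:    ${monthly_polly_cost(post_chars, True):.3f}")
    print(f"6M chars, free tier:    ${monthly_polly_cost(6_000_000, True):.2f}")
```

On these assumptions a single post costs a couple of cents at most, so for a typical blog the free tier is unlikely to be exhausted.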
Watch this space. Amazon AWS are revolutionising the way we convert audio to text and text to speech. It is good now, and it will only get better.
Click here for the official Amazon AI WordPress Plugin page.