|Amazon AI Services|
|Written by Sue Gee|
|Friday, 02 December 2016|
News from re:Invent, the annual AWS developer conference, includes three new AI services that can be used within apps. There is a free tier for all three, bringing the prospect of applied AI ever nearer.
Over the past couple of months we've repeatedly heard Microsoft's message about democratizing AI. Now Amazon has provided three new SDKs that will allow all of us to incorporate deep learning into our apps and interfaces.
The newly launched Amazon AI Services portal gives access to three new products and the Amazon Machine Learning service, which we reported on in April 2015 when it was announced.
Amazon Rekognition is a fully managed service for image detection and recognition that uses deep neural network models, intended to make it easy for devs to add image analysis to apps. It integrates directly with Amazon S3 and AWS Lambda for building scalable, affordable, and reliable image analysis applications.
According to the Amazon press release:
Amazon Rekognition can locate faces within images and detect attributes, such as whether or not the face is smiling or the eyes are open. Amazon Rekognition also supports advanced facial analysis functionalities such as face comparison and facial search. Using Rekognition, developers can build an application that measures the likelihood that faces in two images are of the same person, thereby being able to verify a user against a reference photo in near real-time. Similarly, developers can create collections of millions of faces (detected in images) and can search for a face similar to their reference image in the collection. Amazon Rekognition removes the complexity and overhead required to develop and manage expensive image processing pipelines by making comprehensive image classification, detection, and management capabilities available in a simple, cost-effective, and reliable AWS service. There are no upfront costs for Amazon Rekognition, developers pay only for the images they analyze and the facial feature vectors they store.
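To give a flavor of what the face analysis described above might look like from code, here is a minimal sketch using boto3, the AWS SDK for Python. It is not taken from the press release: the helper names, the default region, and the bucket and key arguments are all assumptions made for illustration, and the two `*_in_s3`-style calls need AWS credentials and images already stored in S3.

```python
def summarize_faces(response):
    """Pull the smile and eyes-open attributes out of a DetectFaces response."""
    return [
        {
            "smiling": face["Smile"]["Value"],
            "eyes_open": face["EyesOpen"]["Value"],
            "confidence": face["Confidence"],
        }
        for face in response.get("FaceDetails", [])
    ]


def best_similarity(response):
    """Return the highest similarity score in a CompareFaces response, or None."""
    scores = [match["Similarity"] for match in response.get("FaceMatches", [])]
    return max(scores) if scores else None


def detect_faces_in_s3(bucket, key, region="us-east-1"):
    """Call Rekognition on an image in S3 (bucket and key are hypothetical)."""
    import boto3  # imported lazily so the helpers above work without the SDK

    client = boto3.client("rekognition", region_name=region)
    response = client.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Attributes=["ALL"],  # request Smile, EyesOpen and the other attributes
    )
    return summarize_faces(response)


def compare_s3_faces(bucket, reference_key, candidate_key, region="us-east-1"):
    """Compare a reference photo with a candidate photo, both stored in S3."""
    import boto3

    client = boto3.client("rekognition", region_name=region)
    response = client.compare_faces(
        SourceImage={"S3Object": {"Bucket": bucket, "Name": reference_key}},
        TargetImage={"S3Object": {"Bucket": bucket, "Name": candidate_key}},
        SimilarityThreshold=80,  # assumed cut-off; tune for your use case
    )
    return best_similarity(response)
```

Keeping the response parsing separate from the network calls means the parsing can be tried out on a saved response without an AWS account.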
There's a demo for trying it out and it looks promising - although not very novel given that Microsoft's Face API, originally part of Project Oxford and now among its Cognitive Services APIs, has had several public demos over the past 18 months.
These are the results given for one of the sample photos. But try uploading your own photos and the results might not be so impressive. Another of the sample photos, this time for Object and Scene Detection, is of a silky, golden-haired dog identified with 97.9% confidence, suggesting that it might outperform Microsoft's What-Dog.net.
The system might have been well-trained when it comes to dogs. It's not so great at cats:
OK, I can just about see why the pose, and the lack of any object to indicate the scale, might have fooled the app. On another, very similar photo, the results were 87.9% for both Animal and Cat and 79.2% for Siamese, and I always suspected my moggy had some Siamese ancestry.
But the system is easy to fool. My caption for this would be suburban garden in the snow with pond. Yes to Outdoors/Snow/Ice - but where is the crowd?
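Given how easily the labels can go astray, an app would probably want to act only on labels above some confidence level. The sketch below, again assuming boto3, filters a DetectLabels response by a threshold. The helper names and the 80% default are assumptions, and the sample response simply reuses the cat scores quoted above rather than real service output.

```python
def confident_labels(response, threshold=80.0):
    """Keep labels at or above the threshold, highest confidence first."""
    kept = [
        (label["Name"], label["Confidence"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= threshold
    ]
    # sorted() is stable, so labels with equal scores keep their original order
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def detect_labels_in_s3(bucket, key, threshold=80.0, region="us-east-1"):
    """Ask Rekognition for labels on an S3 image (bucket and key are hypothetical)."""
    import boto3  # imported lazily so confident_labels works without the SDK

    client = boto3.client("rekognition", region_name=region)
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=threshold,  # the service can also filter server-side
    )
    return confident_labels(response, threshold)


# Illustrative response built from the scores reported for the cat photo
sample = {
    "Labels": [
        {"Name": "Animal", "Confidence": 87.9},
        {"Name": "Cat", "Confidence": 87.9},
        {"Name": "Siamese", "Confidence": 79.2},
    ]
}
print(confident_labels(sample))
```

With an 80% threshold the Siamese guess is dropped while Animal and Cat survive. A threshold does nothing, of course, about a label that is confidently wrong.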
As part of AWS’s Free Tier, you can analyze 5,000 images per month and store metadata for up to 1,000 faces each month, free for the first 12 months.
Amazon Polly is a text-to-speech service and its free tier includes 5 million characters per month, for the first 12 months, starting from the first request for speech.
According to Amazon this new service:
makes it easy for developers to add natural-sounding speech capabilities to existing applications like newsreaders and e-learning platforms, or create entirely new categories of speech-enabled products - from mobile apps to devices and appliances. Amazon Polly is easy to use; developers can send text to Amazon Polly using the SDK or from within the AWS Management Console and Polly immediately returns an audio stream that can be played directly or stored in a standard audio file format. With 47 lifelike voices and support for 24 languages, developers can choose from both male and female voices with a variety of accents to make applications for users around the globe.
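The "send text, get an audio stream back" flow described above could be sketched with boto3 roughly as follows. This is an illustration under assumptions rather than Amazon's sample code: the helper names, the default voice and region, and the output path are all chosen for the example, and the call needs AWS credentials to run.

```python
def build_speech_request(text, voice_id="Joanna", output_format="mp3"):
    """Assemble the keyword arguments for Polly's SynthesizeSpeech call."""
    if not text:
        raise ValueError("text must not be empty")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text, path, region="us-east-1", **options):
    """Send text to Polly and store the returned audio stream at path."""
    import boto3  # imported lazily so build_speech_request works without the SDK

    client = boto3.client("polly", region_name=region)
    response = client.synthesize_speech(**build_speech_request(text, **options))
    with open(path, "wb") as audio_file:
        # AudioStream is a streaming body; read() returns the encoded audio bytes
        audio_file.write(response["AudioStream"].read())
    return path
```

A call such as `synthesize_to_file("Hello from Polly", "hello.mp3")` would then leave an MP3 file on disk, and passing a different `voice_id` selects another of the available voices.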
The third new service, Amazon Lex, uses the technology that powers Amazon Alexa - i.e. the combination of automatic speech recognition (ASR) and natural language understanding (NLU) that makes it possible to build bots that perform tasks like checking the weather or booking flights. Again this seems like catch-up with Microsoft - recall the demo of Cortana making hotel reservations from BUILD. But that is no bad thing - it serves to provide choice for developers - Azure or AWS. Amazon Lex is now available in a free but limited preview. You'll need an AWS account to apply, and you'll also need to indicate the platforms on which you plan to publish a Lex bot and the services you plan to integrate with it.
|Last Updated ( Friday, 21 April 2017 )|