|The Art and Science of Conversational AI|
|Written by Sue Gee|
|Sunday, 08 January 2023|
Nine teams have been selected for the fifth iteration of the Alexa Prize SocialBot Grand Challenge. Will one of them succeed this year in claiming the $1 million on offer for the first socialbot capable of holding an engaging and coherent conversation lasting 20 minutes?
Amazon's overall aim when it launched the Alexa Prize in 2016 was to make advances in conversational AI. To do this it challenged teams of university students to build a “socialbot” on Alexa to converse with people about popular topics and news events. The ultimate goal for competing teams is to meet the Grand Challenge: to earn a score of 4.0 or higher (out of 5) from a panel of judges while sustaining a coherent and engaging conversation for 20 minutes. The first team to achieve this will win a $1 million research grant for their university, but prizes also go to the top three teams in each iteration of the contest.
Each team selected for the challenge receives not only a research grant of up to $250,000 but also Alexa-enabled devices, AWS services and support from Alexa Science. As the competition has progressed, so have the tools provided to the teams for automated speech recognition and for neural detection and generation models. This means that each time the challenge runs, teams start further along the road to success.
The nine teams announced for SocialBot Grand Challenge 5 (SGC 5) include teams from five universities that have participated before, together with four new universities. Among the returning teams is Alquist from the Czech Technical University, Prague, which was the 2021 winner of SGC 4 and has been among the finalists in all previous competitions. Stanford University's Chirpy Cardinal, which came second in 2021, is participating for the third time. The other three returning teams come from University of California, Santa Cruz; Universidad Politécnica de Madrid and Carnegie Mellon University. All four of the new universities are US-based: University of Illinois, Urbana-Champaign; University of California, Santa Barbara; Virginia Tech and Stevens Institute of Technology.
As reported in Alexa Prize SocialBot Grand Challenge 5, this year teams will be competing for two sets of awards: one for overall performance and a new set for scientific impact. The new set seems a worthwhile innovation, given that:
In previous challenges, participating teams have improved the state of the art for open domain dialogue systems by developing improved natural language understanding (NLU) systems, neural response generation models, common sense knowledge modeling, and dialogue policies leading to smoother, and more engaging conversations.
The other new feature introduced into this year's challenge is that, in addition to verbal conversations, customers with Echo screen devices or a Fire TV may be presented with images or text on screen that enhance the conversational experience. This gives teams the opportunity to include additional text and images that provide more diverse and meaningful information.
Whether or not this feature will actually help with holding a sustained, meaningful conversation, time will tell. But over the four contests to date it has become increasingly apparent that this is difficult. Last year Alquist, the winning team and a finalist in every SocialBot challenge to date, had an average rating of only 3.28 from Alexa customers, and its average interaction duration with judges during the final was only 14 minutes and 14 seconds.
In the video of the finals, what often happens is that Alexa uses a question to initiate a conversation but then can't carry on the thread in a convincing manner and instead makes an abrupt change to another of its prepared topics. Despite the judges' indulgence in engaging the chatbot in its choice of films, music or sports, this seems pretty artificial. Is this due to the constraints of the challenge or to the limited conversational experience of the youthful university teams?
As an Alexa user I find "her" pretty knowledgeable about a wide range of topics I am interested in. For example, if I say "Alexa what do you know about Picasso/Andy Warhol/Stravinsky/Joni Mitchell/Charles Dickens/Hilary Mantel" the immediate responses are both correct and informative - it's an amazing combination of Google search and text-to-speech. Follow-up questions can elicit more information, and increasingly I ask Alexa when I come across a topic I want to know more about. But this isn't conversation, because it lacks the give and take of an exchange between a speaker and a listener who swap these roles in a process that can end up far from where it started. We sometimes refer to the "art" of conversation, indicating that it is something that requires skill and practice.
The difficulty is acknowledged by Reza Ghanadan, senior principal scientist with Alexa AI and head of Alexa Prize, who has admitted:
“Creating a socially adept AI is a hard problem. This is because human-like social conversation is remarkably delicate and complex, and the open domain nature of the SocialBot dialogues makes it extremely challenging. You need to provide relevant and deep responses to a wide range of topics users may ask, maintain a natural and coherent exchange throughout a potentially long conversation, and accurately interpret the intent of the user by correctly picking up on names, topics, places and products while taking into account the context of each conversation turn. You also need to make the interactions with users lively and engaging, which is challenging given the diversity of topics and users interacting with Alexa.”
If you are interested in how the 2023 teams are doing, from March Alexa customers will be able to engage in conversation with the teams' socialbots by saying, “Alexa, let’s chat.”
|Last Updated ( Sunday, 08 January 2023 )|