Spot + ChatGPT - It's Amazing
Written by Sue Gee   
Sunday, 03 December 2023

Boston Dynamics' quadruped robot Spot has been given the ability to hear questions and respond to them, thanks to integration with ChatGPT and other AI models. See Spot in action as a tour guide and as a proof of concept for the robotics applications of foundation models.

We have previously speculated about the outcome of combining a chatbot and a robot. Matt Klingensmith, Principal Robot Autonomy Engineer at Boston Dynamics, had the same idea and the result is really impressive, if a bit spooky. It raises new questions about what real-world problems Spot might solve in the future, but first watch the video:

In common with developers across the globe, engineers at Boston Dynamics were impressed and excited by the rapid progress being made by large Foundation Models (FMs) and wanted to explore how these models work and how they might impact robotics development. The team responsible for Spot put together some proof-of-concept demos using FMs for robotics applications and expanded on them during an internal hackathon. Inspired by the apparent ability of LLMs to roleplay, replicate culture and nuance, form plans, and maintain coherence over time, as well as by Visual Question Answering (VQA) models that can caption images and answer simple questions about them, they created the Robot Tour Guide using Spot's SDK.

The demo required some simple hardware integrations together with several software models:

[Image: robot tour guide]

 

As with earlier demos of Spot's abilities (locomotion, dance and, more recently, carrying items around a construction site; see Spot The Robot Dog Learns New Tricks), Spot's behavior isn't entirely autonomous. Instead it is scripted, at least in part. The robot is instructed to walk around, look at objects in the environment, use a VQA or captioning model to describe them, and then elaborate on those descriptions using an LLM. Additionally, the LLM can answer questions from the tour audience and plan what actions the robot should take next.

According to Klingensmith:

In this way, the LLM can be thought of as an improv actor—we provide a broad strokes script and the LLM fills in the blanks on the fly.
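
To make the "script plus improvisation" idea concrete, here is a minimal sketch of one step of such a tour loop. It is not Boston Dynamics' code and does not use the Spot SDK: the prompt wording, the `go_to:`/`say:` reply format and every function name are assumptions made up for illustration, and the captioning model and the LLM are passed in as plain callables so that any model could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical "broad strokes script" given to the LLM on every step.
SCRIPT = (
    "You are a robot tour guide. Describe what you see in an entertaining "
    "way, answer visitor questions, and finish with exactly one line of the "
    "form 'go_to: <location>' or 'say: <text>'."
)


@dataclass(frozen=True)
class Action:
    kind: str       # "go_to" or "say"
    argument: str   # a known location name, or the text to speak


def build_prompt(location: str, caption: str, question: Optional[str],
                 locations: Sequence[str]) -> str:
    """Combine the fixed script with what the robot currently sees and hears."""
    lines = [
        SCRIPT,
        f"Known locations: {', '.join(locations)}",
        f"Current location: {location}",
        f"Camera caption: {caption}",
    ]
    if question:
        lines.append(f"Visitor asked: {question}")
    return "\n".join(lines)


def parse_action(reply: str, locations: Sequence[str]) -> Action:
    """Read the last valid action line; fall back to speaking the whole reply."""
    known = {name.lower(): name for name in locations}
    for line in reversed(reply.strip().splitlines()):
        kind, sep, argument = line.partition(":")
        kind, argument = kind.strip().lower(), argument.strip()
        if sep and kind == "go_to" and argument.lower() in known:
            return Action("go_to", known[argument.lower()])
        if sep and kind == "say" and argument:
            return Action("say", argument)
    # Unknown location or free-form text: never move somewhere unmapped.
    return Action("say", reply.strip())


def tour_step(location: str, question: Optional[str],
              locations: Sequence[str],
              caption_image: Callable[[str], str],
              ask_llm: Callable[[str], str]) -> Action:
    """One iteration: caption the scene, prompt the LLM, decide what to do."""
    caption = caption_image(location)
    prompt = build_prompt(location, caption, question, locations)
    return parse_action(ask_llm(prompt), locations)


def demo() -> Action:
    """Run one step with stand-in models instead of real VQA and LLM calls."""
    locations = ["lobby", "IT help desk", "old Spots"]
    fake_caption = lambda place: f"a view of the {place}"
    fake_llm = lambda prompt: "I don't know. Let's ask!\ngo_to: IT help desk"
    return tour_step("lobby", "Who is Marc Raibert?", locations,
                     fake_caption, fake_llm)
```

In this sketch the only thing the program trusts from the model is the final action line, and a `go_to` is accepted only if it names a location the robot already knows about. Everything else the model produces is treated as text to be spoken.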

Commenting on the possibility that the LLM-derived information might be misleading, Klingensmith adds that the Robot Tour Guide might

hallucinate and add plausible-sounding details without fact checking; but in this case, we didn’t need the tour to be factually accurate, just entertaining, interactive, and nuanced.

From watching the video, I think Spot succeeded in being all three, and the haikus it invents are the icing on the cake.

Klingensmith also details some fascinating emergent behavior: 

For example, we asked the robot “who is Marc Raibert?”, and it responded “I don’t know. Let’s go to the IT help desk and ask!”, then proceeded to ask the staff at the IT help desk who Marc Raibert was. We didn’t prompt the LLM to ask for help. It drew the association between the location “IT help desk” and the action of asking for help independently. Another example: we asked the robot who its “parents” were— it went to the “old Spots” where Spot V1 and Big Dog are displayed in our office and told us that these were its “elders”.

To be clear, these anecdotes don’t suggest the LLM is conscious or even intelligent in a human sense—they just show the power of statistical association between the concepts of “help desk” and “asking a question,” and “parents” with “old.” But the smoke and mirrors the LLM puts up to seem intelligent can be quite convincing.

Compared to an earlier demo of a quadruped robot endowed with ChatGPT capability (see Should We Beware The Unitree Go2), Boston Dynamics has achieved a new level, and it is almost impossible not to think that this is a thinking, talking dog, ready to go into action in rescue situations, to guard premises and to join a site maintenance team.

[Image: talking Spot]

 

More Information

Robots that can chat

Related Articles

Spot The Robot Dog Learns New Tricks

Should We Beware The Unitree Go2


 



Last Updated ( Sunday, 03 December 2023 )