Deep Mimic - A Virtual Stuntman
Written by Mike James
Saturday, 14 April 2018
Although the title says "A Virtual Stuntman", it is quite difficult to know exactly what the use case for this research is. It clearly has an impact. Take a look at the video and see what you think.
This research, conducted by Xue Bin Peng, Pieter Abbeel and Sergey Levine from the University of California, Berkeley and Michiel van de Panne of the University of British Columbia, has been characterized in a number of ways. This is because it is a technique that has many potential uses, but none of them stands out as the most important. You could say that it is about creating better cartoon animation without the help of a human animator. It could be a way of getting virtual actors to do things that real actors would find difficult or dangerous. It could be an approach to the problem of robot locomotion. Probably, in time, it will be all of the above and more.
So what is DeepMimic?
Put simply, it is a neural network trained using reinforcement learning to reproduce motion-captured movement using a simulated humanoid, or other, agent. The idea may sound involved, and implementing it is a lot of work, but the fundamental idea really is simple enough. Set up a simulation of whatever it is you want to animate; there are lots of examples in the video. Get some motion capture of someone doing something that you want to imitate. Use the motion capture data to train a neural network using reinforcement learning. The input is the configuration of the arms and legs at each time point, and the reward is simply how close the simulation gets to the motion capture data at that time point: the smaller the difference between the real thing and the simulation, the larger the reward.
Reinforcement learning doesn't give any feedback on what the simulation is doing wrong. It simply gives a signal of how well the agent is achieving the goal and leaves it to figure out how to improve.
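To make the idea of a closeness reward concrete, here is a minimal Python sketch. It is not taken from the paper: the function name `imitation_reward`, the `scale` parameter and the choice of an exponential of the squared error are all assumptions made for illustration. It simply turns the difference between a simulated pose and a reference pose at one time point into a number that is largest when the two match.

```python
import math


def imitation_reward(sim_pose, ref_pose, scale=2.0):
    """Hypothetical closeness reward for a single time point.

    sim_pose and ref_pose are equal-length sequences of joint values.
    Returns 1.0 for a perfect match and falls towards 0 as the poses diverge.
    The exponential form and the scale value are illustrative assumptions.
    """
    if len(sim_pose) != len(ref_pose):
        raise ValueError("poses must have the same number of joint values")
    squared_error = sum((s - r) ** 2 for s, r in zip(sim_pose, ref_pose))
    return math.exp(-scale * squared_error)


# Made-up joint values, purely for demonstration.
reference = [0.10, -0.40, 0.75]
perfect = imitation_reward(reference, reference)
close = imitation_reward([0.12, -0.38, 0.70], reference)
far = imitation_reward([0.90, 0.50, -0.20], reference)
```

With these made-up values, `perfect` is the maximum reward, `close` is slightly lower and `far` is much lower, which is all the learner gets to see.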
In this case the reward is remarkably simple and almost seems not to be nuanced enough to guide the learning to a correct solution. My guess is that there are sufficient constraints from the physics to mean that getting closer to a set of points in the trajectory usually means closer later in the trajectory. Even so, starting from cold, i.e. the first position, is going to require a lot of trial and error learning to get to the final position. A clever trick, and one worth remembering in other situations, is that the simulation was started at different points in the trajectory in an attempt to complete the action. This seems to have worked well and it reinforces the idea that once the physics has started it constrains the possibilities sufficiently for the reinforcement learner to improve.
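The trick of starting at different points in the trajectory can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sample_start_frame` and `start_episode` are hypothetical, the reference clip is a made-up list of poses, and uniform random sampling is an assumed choice rather than something the article specifies.

```python
import random


def sample_start_frame(num_frames, rng):
    """Pick a random frame index of the reference clip (uniform sampling is an assumption)."""
    if num_frames < 1:
        raise ValueError("the reference clip must contain at least one frame")
    return rng.randrange(num_frames)


def start_episode(reference_clip, rng):
    """Return (start_index, initial_pose) for one training episode.

    The simulated character would be placed in the reference pose at the
    sampled frame instead of always beginning at frame 0.
    """
    start = sample_start_frame(len(reference_clip), rng)
    initial_pose = list(reference_clip[start])  # copy so the clip is not modified
    return start, initial_pose


# A made-up four-frame clip with two joint values per frame.
rng = random.Random(0)
reference_clip = [[0.0, 0.0], [0.1, -0.2], [0.3, -0.5], [0.6, -0.9]]
episodes = [start_episode(reference_clip, rng) for _ in range(200)]
starts = [start for start, _ in episodes]
```

Over many episodes the learner gets to practise the later parts of the motion directly, rather than only after it has mastered everything that comes before them.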
Take a look at the video:
You will notice that some of the animations were of imaginary entities such as the T-Rex. How do you get motion capture data for these? The answer is you don't. You get an animation artist to draw some keyframes and use those to train. Again, it suggests that the physics constrains the problem so that there are few solutions between keyframes.
There are lots of ways that DeepMimic could be used and lots of ways of extending the idea. The researchers have tried putting actions together, extending actions by putting them into a different terrain, perturbing them by throwing blocks and so on. You can find out more by reading the paper, which is being presented at this year's SIGGRAPH in August.
So what can we use it for?
You could use it to create animated cartoons on demand, but given one of the animations was of the Boston Dynamics Atlas robot I don't see why the learning couldn't be transferred to the real machine. We've already seen Atlas doing backflips and it obviously needed more practice.
Also, the same animations don't have to be applied to obvious cartoon renders. Why not simulated human-like actors? What else?
Last Updated ( Saturday, 14 April 2018 )