Music video crowd sourced training data

Written by Kay Ewbank

Saturday, 21 May 2011

What do you get if you put together a team of researchers, a music video from a progressive electro band and tens of thousands of its fans?

If you’re trying to teach a computer to recognize that an object is a person, you have to overcome the problem that changes in light level, background clutter or unusual clothing can all confuse the system. This is a tough and time-consuming task.

Using a music video might not spring to mind as a way around this problem, but a team of researchers at New York University's Courant Institute of Mathematical Sciences has done just that.

Progressive-electro band C-Mon & Kypski’s music video for the song “More or Less” is built from single frames contributed by thousands of fans (over 32,000 so far), each imitating the band in a webcam capture. The band’s video crowd-sourcing project One Frame of Fame shows fans a single frame from the video and asks them to imitate the pose, capture it on their webcam and upload it to the site. The contribution is then spliced into the video, and an updated version is shown each hour.


This means the video shows thousands of different people posing in the same way under many different lighting and background conditions, just what the researchers at the Courant Institute were looking for.

Graham Taylor, a post-doctoral fellow at the Courant Institute, explained:

"If we had many examples of people in similar pose, but under differing conditions, we could construct an algorithm that matches based on pose and ignores the distracting information - lighting, clothing, and background."

The researchers realised that the video would be ideal for their needs:

"This turned out to be the perfect data source for developing an algorithm that learns to compute similarity based on pose."
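The idea of learning similarity from pose can be illustrated with a small sketch. This is not the researchers' published method; it assumes a standard contrastive loss, in which two fan submissions that imitate the same video frame are treated as "same pose" and pulled together, while submissions for different frames are pushed at least a margin apart. The names `contrastive_loss` and `make_pairs`, the embedding vectors and the margin value are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_pose, margin=1.0):
    """Contrastive loss for one pair of embeddings (illustrative only).

    Same-pose pairs are penalized by their squared distance, so training
    pulls them together. Different-pose pairs are penalized only when
    they are closer than `margin`, so training pushes them apart.
    """
    d = euclidean(emb_a, emb_b)
    if same_pose:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def make_pairs(submissions):
    """Build labeled pairs from (frame_index, embedding) tuples.

    Assumption: two submissions imitating the same frame index show the
    same pose, whatever the lighting, clothing or background.
    """
    pairs = []
    for i in range(len(submissions)):
        for j in range(i + 1, len(submissions)):
            frame_a, emb_a = submissions[i]
            frame_b, emb_b = submissions[j]
            pairs.append((emb_a, emb_b, frame_a == frame_b))
    return pairs
```

In a real system the embeddings would come from a trainable model applied to the webcam images, and the loss would be minimized over many such pairs; here the vectors are simply given, to keep the sketch self-contained.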

The research team, which includes Courant Professors Chris Bregler and Rob Fergus along with doctoral student Ian Spiro, will present its findings at the 24th IEEE Conference on Computer Vision and Pattern Recognition (June 21-23) in Colorado Springs. The paper is available here: http://movement.nyu.edu/imitation.

