Are Your Pictures Memorable?
Written by Nikos Vaggalis   
Wednesday, 23 December 2015

Whenever you post a picture on social media, you are eager to know how well it will be received and how many retweets or likes it will attract. Now an algorithm called MemNet, coming out of the MIT labs, may reveal whether your picture will be forgotten in a snap or remembered over time.

This should in turn make you better at picking the picture that stands out from the rest and is best suited to online consumption.

Fun uses aside, this technology could have serious and widespread applications:

  • Advertising - use the picture most likely to stick in viewers' minds, increasing the chance of making a sale

  • Public relations - improve your chances of making an impact on networks like Facebook and Flickr

  • Educational - mnemonic aids for learning

  • Medical - helping people with memory disabilities

  • Computing/Research - improvements in other Computer Science disciplines such as computer vision and graphics, as well as a basis for further research

According to the official statement, the researchers introduced an experimental procedure to:

"objectively measure human memory, allowing us to build LaMem, the largest annotated image memorability dataset to date (containing 60,000 images from diverse sources). Using Convolutional Neural Networks (CNNs), we show that fine-tuned deep features outperform all other features by a large margin, reaching a rank correlation of 0.64, near human consistency (0.68). Analysis of the responses of the high-level CNN layers shows which objects and regions are positively, and negatively, correlated with memorability, allowing us to create memorability maps for each image and provide a concrete method to perform image memorability manipulation."

The real issue here is that although long-term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Further, memorability has been found to be a property of the image itself rather than of the individual viewer's brain, which means it can be quantified using machine learning algorithms.

In our everyday lives we are bombarded with a massive number of images that strain our memory, so the challenge here is how to aid human memory. Can we do it by making images more memorable, thus allowing people to consume information more efficiently?

But it can go both ways: the same algorithm can also work in reverse and identify why some parts of an image tend to be forgotten rather than remembered.

The algorithm grew out of computational architectures for visual processing, in particular convolutional neural networks (CNNs). So the science behind it already existed, but for the machine to have any kind of success another very important ingredient was needed: a large body of data.

In this case the dataset consisted of 60,000 images taken from a variety of sources, such as MIR Flickr, AVA, SUN, the image popularity dataset and more. Each image had been scored according to its memorability.

The dataset was not limited to faces of people. Rather than being purely anthropocentric, it was also scene-centric and object-centric, meaning that the algorithm was able to work on images containing landscapes and inanimate objects as well.

Once you have a suitable dataset, the next step is to teach the machine, beginning with the very basic principle of how to detect, classify and relate the objects that an image is composed of.

This happens by annotating the pictures, that is, adding metadata that guides the machine, for instance by notifying it of the presence or absence of an object class in the image, e.g.,

“there are cars in this image” but “there are no tigers,”

or by object-level annotation of a tight bounding box and class label around an object instance in the image, e.g.,

“there is a screwdriver centered at position (20,25)
with width of 50 pixels and height of 30 pixels”.
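
To make the two kinds of annotation concrete, here is a minimal sketch in Python of how they might be represented as data. The class and field names (`ImageLevelLabel`, `BoundingBox`, `corners`) are illustrative assumptions, not the format used by any particular dataset; the values come from the quoted examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageLevelLabel:
    """States whether an object class appears anywhere in the image."""
    class_name: str
    present: bool


@dataclass(frozen=True)
class BoundingBox:
    """A tight box around one object instance, given by center and size in pixels."""
    class_name: str
    center_x: int
    center_y: int
    width: int
    height: int

    def corners(self) -> tuple[float, float, float, float]:
        """Return (left, top, right, bottom) derived from the center and size."""
        half_w = self.width / 2
        half_h = self.height / 2
        return (self.center_x - half_w, self.center_y - half_h,
                self.center_x + half_w, self.center_y + half_h)


# Image-level annotation: "there are cars in this image" but "there are no tigers"
image_labels = [ImageLevelLabel("car", True), ImageLevelLabel("tiger", False)]

# Object-level annotation: a screwdriver centered at (20,25), 50 pixels wide, 30 high
screwdriver = BoundingBox("screwdriver", center_x=20, center_y=25, width=50, height=30)
print(screwdriver.corners())
```

Image-level labels only say what is or is not in the picture, while the bounding box also says where it is, which is why the second form is more expensive to collect.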

This can be achieved using Hybrid-CNN, a convolutional neural network used for classifying categories of objects and scenes, which was additionally fine-tuned to accommodate MemNet's needs. The tuning is necessary because, unlike in visual classification, images that are memorable, or forgettable, do not even look alike: an elephant, a kitchen, an abstract painting, a face and a billboard can all share the same level of memorability, but no visual recognition algorithm would cluster these images together.

The research used a crowd-sourcing approach in which workers on Amazon's Mechanical Turk (AMT) platform viewed a sequence of pictures and pressed a key whenever they encountered one they remembered having seen earlier. How often a repeated picture was recognized acted as the indicator of that image's memorability.
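
As a rough illustration of how such key presses can be turned into a score, the following sketch computes a per-image hit rate: the fraction of repeat showings on which a worker signaled recognition. This is a simplified assumption for illustration; the researchers' actual scoring may apply further corrections, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def memorability_scores(responses):
    """Compute a per-image hit rate from repeat-detection responses.

    `responses` is an iterable of (image_id, recognized) pairs, one per
    repeat showing of an image to a worker; `recognized` is True when
    the worker pressed the key on that repeat.
    """
    hits = defaultdict(int)
    shown = defaultdict(int)
    for image_id, recognized in responses:
        shown[image_id] += 1
        if recognized:
            hits[image_id] += 1
    return {image_id: hits[image_id] / shown[image_id] for image_id in shown}


# Made-up responses for two images
responses = [
    ("elephant.jpg", True), ("elephant.jpg", True), ("elephant.jpg", False),
    ("kitchen.jpg", False), ("kitchen.jpg", True),
]
scores = memorability_scores(responses)
for image_id, score in scores.items():
    print(image_id, round(score, 2))
```

A score near 1 means almost every worker recognized the repeat, while a score near 0 means the image was mostly forgotten.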

The data was then annotated in the manner described above, but with metadata and attributes such as aesthetics, popularity and emotions, and was subsequently fed to MemNet. MemNet managed an impressive rank correlation of 0.64 with human memorability scores, close to the human consistency of 0.68, demonstrating that predicting human cognitive abilities is within reach for the field of computer vision.
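
The 0.64 and 0.68 figures are rank correlations, which compare how two sets of scores order the same images rather than the raw values. A minimal sketch of the Spearman rank correlation in plain Python follows; the sample scores are made up for illustration and are not taken from LaMem.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one ranking is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical human scores and model predictions for five images
human = [0.91, 0.55, 0.72, 0.40, 0.83]
predicted = [0.85, 0.60, 0.58, 0.45, 0.90]
rho = spearman(human, predicted)
print(round(rho, 2))
```

A value of 1 means the model orders the images exactly as people do, 0 means no relationship, and -1 means the order is reversed. Human consistency is below 1 because different groups of people do not agree perfectly with each other either.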

The correlation between these attributes and memorability led to some very interesting observations, such as:

  • images that evoke disgust are statistically more memorable than images showing most other emotions, except for amusement

  • images portraying emotions like awe and contentment tend to be the least memorable

  • overall, images that evoke negative emotions such as anger and fear tend to be more memorable than those portraying positive ones

  • the aesthetics of an image and its memorability have little or no correlation!

The learned representations were then visualized in a memorability heat map that portrays the significance of the objects that make an image memorable or forgettable. Areas with hotter colors denote memorability, while cooler colors denote forgettability.
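
One simple way to build a map of this kind is to score overlapping regions of the image and average the scores that cover each pixel. The sketch below illustrates that idea only; it is not necessarily the researchers' method, and `toy_scorer` is a hypothetical stand-in (mean brightness) for a real memorability model.

```python
def memorability_map(image, score_region, window=2, stride=1):
    """Average the scores of overlapping windows into a per-pixel map.

    `image` is a list of rows of pixel values; `score_region` is a
    hypothetical callable that maps a window (list of rows) to a score.
    """
    height, width = len(image), len(image[0])
    totals = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            region = [row[left:left + window] for row in image[top:top + window]]
            score = score_region(region)
            # Credit the window's score to every pixel it covers
            for y in range(top, top + window):
                for x in range(left, left + window):
                    totals[y][x] += score
                    counts[y][x] += 1
    return [[totals[y][x] / counts[y][x] if counts[y][x] else 0.0
             for x in range(width)] for y in range(height)]


def toy_scorer(region):
    """Placeholder scorer: mean brightness, NOT a real memorability model."""
    pixels = [p for row in region for p in row]
    return sum(pixels) / len(pixels)


# A tiny 3x3 "image" with a bright object in the lower right corner
image = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
heat = memorability_map(image, toy_scorer)
for row in heat:
    print([round(v, 2) for v in row])
```

Pixels covered mostly by high-scoring windows end up "hot" and the rest stay "cool", which is the effect the heat map conveys.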

These maps could then be used in the learning and education field to create visual cues that reinforce the forgettable aspects of an image while maintaining the memorable ones.

If you want to try things out, visit the LaMem website, which contains samples of the algorithm's work and lets you upload pictures to see how they score on the memorability scale.

 

More Information

LaMem

 


 

Last Updated ( Sunday, 03 November 2019 )