Are Your Pictures Memorable?

Written by Nikos Vaggalis

Wednesday, 23 December 2015

Whenever you post a picture to the social media, you are eager to know how well it will be received and how many tweets or likes it will attract. Now, an algorithm called MemNet coming out of the MIT labs may reveal whether your picture will be forgotten in a snap or be remembered throughout time.

This will subsequently make you better at finding the right picture that will stand out from the rest and be ideal for online consumption

Fun use aside, this technology can have serious and widespread applications :

Advertizing - use the picture that guarantees commercial success, increasing the chance of making a sale
Public relations - improve your chaces of making an impact on networks like Facebook and Flickr!
Educational - mnemonic aids for learning
Medical - helping people with memory disabilities
Computing/Research - improvements in other Computer Science disciplines like Computer Vision and graphics, as well as the basis for further research

According to the official statement, MemNet is an algorithm that can:

"objectively measure human memory, allowing us to build LaMem, the largest annotated image memorability dataset to date (containing 60,000 images from diverse sources). Using Convolutional Neural Networks (CNNs), we show that fine-tuned deep features outperform all other features by a large margin, reaching a rank correlation of 0.64, near human consistency (0.68). Analysis of the responses of the high-level CNN layers shows which objects and regions are positively, and negatively, correlated with memorability, allowing us to create memorability maps for each image and provide a concrete method to perform image memorability manipulation."

The real issue here is that although long term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Further, it is found that image memorability is a property of an image and not of the human brain and can be quantified using machine learning algorithms.

In our every day lives we are bombarded with a massive amount of images that pose constraints on our memory, so the challenge here is how to aid human memory. Can we do it by making images more memorable thus allowing people to consume information more efﬁciently?

But it can go both ways and the same algorithm can also work in reverse and identify the reasons of why some parts of an image tend to be forgotten rather than remembered!

The algorithm has sprung into existence as part of the computational architectures for visual processing, and in particular that of convolutional neural networks, CNN. So the science behind it already existed, but for the machine to have any kind of success there is another very important ingredient to be taken into consideration - a large array of data.

In this case the dataset consisted of 60,000 images, taken from a variety of sources such as the MIR Flickr, AVA, SUN, the image popularity dataset and more. Each image had been scored according to its memorability.

The dataset did not only contain faces of people, being not only anthropocentric but also scene-centric and object-centric,meaning that the algorithm was able to work on images that also contain landscapes and other inanimate objects

Once you have a suitable dataset the next step is to teach the machine,beginning from the very basic principle of how to detect,classify and relate the objects that an image is composed from.

This happens by annotating or adding meta-data to the pictures that guide the machine i.e notify it for the presence or absence of an object class in the image, e.g.,

“there are cars in this image” but “there are no tigers,”

or by object-level annotation of a tight bounding box and class label around an object instance in the image, e.g.,

“there is a screwdriver centered at position (20,25) with width of 50 pixels and height of 30 pixels”.

This can be achieved using the Hybrid-CNN convolutional neural network used for classifying categories of objects and scenes, which was additionally tweaked for accommodating MemNet's needs as unlike visual classiﬁcation, images that are memorable, or forgettable, do not even look alike: an elephant, a kitchen, an abstract painting, a face and a billboard can all share the same level of memorability, but no visual recognition algorithms would cluster these images together.

The research used a crowd-sourcing architecture in that workers of the Amazon’s Mechanical Turk (AMT) platform would press a key when first encountering a picture and press it again when they encountered it again (that is, if they remembered seeing it in the first place). This acted as the indicator of the image's memorability.

Then the data was annotated as the example described above, but with meta-data and attributes such as aesthetics, popularity and emotions, and was subsequently fed to MemNet, which managed to score an impressive 0.64 which nears the human consistency rank correlation for memorability of 0.68, demonstrating that predicting human cognitive abilities is within reach for the ﬁeld of computer vision

The correlation between these attributes and memorability lead to some very interesting observations such as:

images that evoke disgust are statistically more memorable than images showing most other emotions, except for amusement
images portraying emotions like awe and contentment tend to be the least memorable
overall images that evoke negative emotions such as anger and fear tend to be more memorable than those portraying positive ones
the aesthetics of an image and its memorability have little or no correlation!

The learned representations were then visualized in a memorability heat map that portrays the significance of the objects that make an image memorable or forgettable. The areas with hotter colors denote memorability, the cooler colors denote forgetability.

These maps could then be used in the learning and education field, for creating visual cues which reinforce the forgettable aspects of an image while also maintain the memorable ones.

If you want to try things out then visit the LaMem website, which contains samples of the algorithm's work and allows you to upload pictures and check out how they perform against the memorability scale.

Lamemsq

More Information

LaMem

Microsoft Wins ImageNet Using Extremely Deep Neural Networks

Baidu AI Team Caught Cheating - Banned For A Year From ImageNet Competition

The Allen Institute's Semantic Scholar

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

Google Releases Gemini Code Assist Enterprise
16/10/2024

Google has released the enterprise version of Gemini Code Assist. This latest version adds the ability to train on internal polices and source code. The product was announced at the Google Cloud Summi [ ... ]

+ Full Story

CouchDB 3.4 Strengthens Password Hashes
03/10/2024

CouchDB 3.41 has been released with stronger password hashes, a Lucene-based full text search implementation, and QuickJS as a JavaScript option.

+ Full Story

More News

Comments

or email your comment to: comments@i-programmer.info

Last Updated ( Sunday, 03 November 2019 )

More Information

Related Articles

Comments