The need for better tools for analysing large, unstructured datasets has prompted IBM to announce an investment of $100 million in research and development for ways to manage and make use of unstructured data.
The announcement was made on May 20, alongside the launch of software and services that companies can use to analyse and manage Internet-scale data at multi-petabyte volumes.
The $100 million will be used for continuing research into technologies and services for managing and exploiting data as it continues to grow in diversity, speed and volume. A recent survey carried out for IBM, the 2011 IBM Global CIO Study, found that 83 percent of 3,000 CIOs surveyed said applying analytics and business intelligence to their IT operations is the most important element of their strategic growth plans over the next three to five years.
According to IBM, the new software launched last week will give organizations the tools to integrate and analyze tens of petabytes of data in its native format and to gain critical intelligence with sub-second response times.
"The volume and velocity of information is generated at a record pace. This is magnified by new forms of data coming from social networking and the explosion of mobile devices," said Steve Mills, Senior Vice President and Group Executive, IBM Software & Systems.
The software being launched can be used to analyze traditional structured data alongside unstructured data such as text, video, audio, images, social media, and click streams. The software is powered by Apache Hadoop, the open source data analytics platform, much of whose early development took place at Yahoo. One of Yahoo’s Hadoop clusters won the Terabyte Sort Benchmark in 2008, the first time that either a Java or an open source program had won. Hadoop is also used by a wide range of well-known companies, including Amazon (for its product search indices), Facebook and eBay.