While surfing the web, you find something really interesting. But is it of current interest, or is it already long out of date? One of the problems with the web is that dead material is rarely removed, and who ever adds an accurate date of posting? Now, however, we have a way to discover how old a webpage is.
It is often impossible to find a date for the origination of a webpage. And of course you cannot always trust a claim such as "serving the web since 1902". If such a claim were made, you would know it was an exaggeration, since the web is not much more than 20 years old.
Carbon Dating the Web is a utility that retrieves various items of evidence to provide the estimated creation date of any page. All the user has to do is enter the page's URL and, after a short delay, the utility produces a report.
The utility has been provided as part of a research project being undertaken by Hany SalahEldeen and Michael Nelson of the Department of Computer Science at Old Dominion University in Norfolk, Virginia.
According to the abstract of a paper presented at the WWW 2013 conference held in Rio de Janeiro, Brazil in May:
To establish a likely datetime, we poll Bitly for the first time someone shortened the URI, Topsy for the first time someone tweeted the URI, a Memento aggregator for the first time it appeared in a public web archive, Google’s time of last crawl, and the Last-Modified HTTP response header of the resource itself. We also examine the backlinks of the URI as reported by Google and apply the same techniques for the resources that link to the URI.
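The idea in the abstract can be illustrated with a minimal Python sketch. It is not the researchers' code. It assumes that each source yields at most one timestamp and that the estimate is simply the earliest one found. The function names `parse_last_modified` and `estimate_creation`, the source labels, and the sample dates are all hypothetical. No network calls are made; the Last-Modified value is passed in as a string.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional, Tuple


def parse_last_modified(header_value: Optional[str]) -> Optional[datetime]:
    """Parse an HTTP Last-Modified header value into an aware UTC datetime.

    Returns None when the header is missing or cannot be parsed.
    """
    if not header_value:
        return None
    try:
        parsed = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None
    if parsed.tzinfo is None:
        # Assumption: treat a value without zone information as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def estimate_creation(
    evidence: Mapping[str, Optional[datetime]],
) -> Optional[Tuple[str, datetime]]:
    """Return (source, timestamp) for the earliest known timestamp.

    Assumption: the oldest piece of evidence is the best available
    estimate of when the page was created. Sources with no data are
    passed as None and ignored.
    """
    known = {source: when for source, when in evidence.items() if when is not None}
    if not known:
        return None
    earliest_source = min(known, key=known.get)
    return earliest_source, known[earliest_source]


if __name__ == "__main__":
    # Made-up sample evidence, for illustration only.
    sample = {
        "first_shortened": None,  # no record found
        "first_tweeted": datetime(2011, 3, 5, 9, 30, tzinfo=timezone.utc),
        "first_archived": datetime(2011, 4, 1, 0, 0, tzinfo=timezone.utc),
        "last_modified": parse_last_modified("Tue, 01 Mar 2011 12:00:00 GMT"),
    }
    print(estimate_creation(sample))
```

Taking the minimum means that one bad early timestamp, such as a misconfigured server clock, would skew the result, so a real service would probably want to sanity-check each source before combining them.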
The paper also includes this timeline for the resources used for the process of carbon dating.
Hany SalahEldeen, writing on the Web Science and Digital Libraries Research Group blog, explained how the researchers tested the accuracy of the model on 1200 resources for which they were able to manually extract a creation date. The model was able to estimate a creation date in 75% of cases, with 33% being the exact creation date. The model was then used to build the utility shown above. The page we tested was for I Programmer's most popular book review, Beautiful Architecture.
This page was actually created on June 12, 2009 and the date shown on its page is April 10, 2010. The utility carbon dates it to 20 July 2009, which it also gives as the date the page was first tweeted. This fits in with I Programmer's history, which started to Tweet its articles in July 2010.
Dating web pages is something many developers would find useful and the code for the utility is available on GitHub. Anyone who registers with Bitly and Topsy to obtain API keys can set up a service.