|Book Stored On DNA - All Knowledge In Just 4gm of DNA|
|Written by Sue Gee|
|Wednesday, 22 August 2012|
Researchers have successfully encoded an entire book - including text, images, and a JavaScript program - in DNA at a density which means that you could store the total of the world's information in about 4 grams of DNA. This isn't science fiction - it is science and it changes the game for data storage.
Storing big data over long time periods is a current challenge, but a technique combining next-generation DNA sequencing technology with a novel encoding strategy offers an impressive solution, achieving a storage density of 5.5 petabits (5.5 million gigabits) per cubic millimeter.
According to Harvard geneticist George Church it means that:
"A device the size of your thumb could store as much information as the whole Internet,"
and even more incredibly, it means that we could theoretically store the total of the world's information (around 1.8 zettabytes) in about 4 grams of DNA.
As well as its density, DNA offers other advantages: it is stable at room temperature and has been proven to work over a timespan of billions of years. Church commented:
“You can drop it wherever you want, in the desert or your backyard, and it will be there 400,000 years later.”
So while those of us who want to read Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves (by Church and Regis, published by Basic Books, 2012) will have to wait until October, there are already 70 billion copies of a pre-print edition, which according to Kurzweil is "roughly triple the sum of the top 100 books of all time".
The encoding strategy, which was applied to the HTML version of the book, is summarized in a diagram included in Regenesis. The diagram is also reproduced, together with an explanatory caption, in Science Magazine's supplementary materials to the research report, Next-Generation Digital Information Storage in DNA, by George M. Church, Yuan Gao, and Sriram Kosuri, which was published online on August 16 in Science Express:
A 12-byte portion of a sentence within the encoded HTML book is converted to bits (blue) with a 19-bit barcode (red) that determines the location of the encoded bits within the overall book. The bit sequence is then encoded to DNA using a 1 bit per base encoding (A,C = 0; T,G = 1), while also avoiding 4 or more nucleotide repeats and balancing GC content. The entire 5.27 megabit HTML book used 54,898 oligonucleotides and was synthesized and eluted from a DNA microchip. After amplification (common primer sequences to all oligonucleotides are not shown), the oligonucleotide library was sequenced using next-generation sequencing. Individual reads with the correct barcode and length were screened for consensus, and then reconverted to bits obtaining the original book. In total, the writing, amplification, and reading resulted in 10 bit errors out of 5.27 megabits.
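The scheme in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual pipeline: the function names, the sample text, and the rule for choosing between the two bases available for each bit (pick whichever of the AT or GC class is under-represented so far, and never extend a run of three identical bases) are all hypothetical. The 19-bit barcode, the 12-byte payload, the A,C = 0 / T,G = 1 mapping, and the count of 54,898 oligonucleotides come from the caption. Primer sequences and the consensus step across multiple reads are left out.

```python
BARCODE_BITS = 19       # address length, from the caption
PAYLOAD_BYTES = 12      # data bytes per oligonucleotide, from the caption
TOTAL_OLIGOS = 54_898   # oligonucleotides used for the book, from the caption

ZERO_BASES = ("A", "C")  # (AT-class, GC-class) choices for a 0 bit
ONE_BASES = ("T", "G")   # (AT-class, GC-class) choices for a 1 bit


def to_bits(index: int, chunk: bytes) -> str:
    """Return the 19-bit barcode for `index` followed by the payload bits."""
    if len(chunk) != PAYLOAD_BYTES:
        raise ValueError("chunk must be exactly 12 bytes")
    if not 0 <= index < 2 ** BARCODE_BITS:
        raise ValueError("index does not fit in the barcode")
    barcode = format(index, f"0{BARCODE_BITS}b")
    payload = "".join(format(byte, "08b") for byte in chunk)
    return barcode + payload


def bits_to_dna(bits: str) -> str:
    """Encode one bit per base (assumed balancing rule, see lead-in)."""
    bases = []
    gc_count = 0
    for bit in bits:
        at_base, gc_base = ZERO_BASES if bit == "0" else ONE_BASES
        at_count = len(bases) - gc_count
        # Prefer the class (GC or AT) that is under-represented so far.
        if gc_count < at_count:
            choice, other = gc_base, at_base
        else:
            choice, other = at_base, gc_base
        # Guard: never turn a run of three identical bases into four.
        if bases[-3:] == [choice] * 3:
            choice = other
        bases.append(choice)
        if choice in "GC":
            gc_count += 1
    return "".join(bases)


def dna_to_bits(dna: str) -> str:
    """Decode bases back to bits: A or C is 0, T or G is 1."""
    return "".join("0" if base in "AC" else "1" for base in dna)


def from_bits(bits: str) -> tuple[int, bytes]:
    """Split a decoded bit string into (barcode index, payload bytes)."""
    index = int(bits[:BARCODE_BITS], 2)
    payload = bits[BARCODE_BITS:]
    chunk = bytes(int(payload[i:i + 8], 2) for i in range(0, len(payload), 8))
    return index, chunk


if __name__ == "__main__":
    sample = b"Hello, DNA!!"  # hypothetical 12-byte fragment
    oligo = bits_to_dna(to_bits(42, sample))
    print(oligo, len(oligo))
    print(from_bits(dna_to_bits(oligo)))
    # Payload capacity implied by the caption's figures, in bits.
    print(TOTAL_OLIGOS * PAYLOAD_BYTES * 8)
```

The figures in the caption are consistent with each other: 54,898 oligonucleotides carrying 96 payload bits each gives 5,270,208 bits, which is the 5.27 megabits quoted, and a 19-bit barcode can address 524,288 positions, comfortably more than the number of oligonucleotides used.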
A less technical explanation is provided by George Church and Sriram Kosuri in this video:
Next-Generation Digital Information Storage in DNA, by George M. Church, Yuan Gao, and Sriram Kosuri, published online on August 16 in Science Express
Supplementary Materials (pdf)
|Last Updated ( Wednesday, 22 August 2012 )|