Book Stored On DNA - All Knowledge In Just 4gm of DNA
Written by Sue Gee
Wednesday, 22 August 2012
Researchers have successfully encoded an entire book - including text, images, and a JavaScript program - in DNA, at a density high enough that the total of the world's information could be stored in about 4 grams of DNA. This isn't science fiction - it is science, and it changes the game for data storage.
We are currently challenged by the problems of storing big data over long time periods - but a technique combining next-generation DNA sequencing technology with a novel encoding strategy has come up with an impressive solution that results in a storage density of 5.5 petabits per cubic millimeter (1 petabit is 1 million gigabits).
According to Harvard geneticist George Church it means that:
"A device the size of your thumb could store as much information as the whole Internet,"
and even more incredibly, it means that we could theoretically store the total of the world's information (around 1.8 zettabytes) in about 4 grams of DNA.
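To put that figure in perspective, the per-gram capacity it implies can be worked out directly. The short sketch below uses only the two numbers quoted above (1.8 zettabytes and 4 grams) and assumes decimal SI prefixes (1 zettabyte = 10^21 bytes, 1 exabyte = 10^18 bytes); the variable names are illustrative only.

```python
# Back-of-the-envelope check of the "1.8 zettabytes in about 4 grams" claim.
# Assumption: decimal SI prefixes (zetta = 10**21, exa = 10**18).
ZETTABYTE = 10**21
EXABYTE = 10**18

world_bytes = 18 * ZETTABYTE // 10   # 1.8 zettabytes, kept as an exact integer
grams_of_dna = 4

bytes_per_gram = world_bytes // grams_of_dna
exabytes_per_gram = bytes_per_gram // EXABYTE
bits_per_gram = bytes_per_gram * 8

print(f"{exabytes_per_gram} exabytes per gram")
print(f"{bits_per_gram:.2e} bits per gram")
```

In other words, the claim corresponds to roughly 450 exabytes of information per gram of DNA.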
As well as its density, DNA offers other advantages: it is stable at room temperature and has been proven to work over a timespan of billions of years. Church commented:
“You can drop it wherever you want, in the desert or your backyard, and it will be there 400,000 years later.”
The encoding strategy, which was applied to the HTML version of the book, is summarized in a diagram included in Regenesis. The diagram is also reproduced, together with an explanatory caption, in Science Magazine's supplementary materials to the research report, Next-Generation Digital Information Storage in DNA, by George M. Church, Yuan Gao, and Sriram Kosuri, which was published online on August 16 in Science Express:
A 12-byte portion of a sentence within the encoded html book is converted to bits (blue) with a 19-bit barcode (red) that determines the location of the encoded bits within the overall book. The bit sequence is then encoded to DNA using a 1 bit per base encoding (A,C = 0; T,G = 1), while also avoiding 4 or more nucleotide repeats and balancing GC content. The entire 5.27 megabit html book used 54,898 oligonucleotides and was synthesized and eluted from a DNA microchip. After amplification (common primer sequences to all oligonucleotides are not shown), the oligonucleotide library was sequenced using next-generation sequencing. Individual reads with the correct barcode and length were screened for consensus, and then reconverted to bits obtaining the original book. In total, the writing, amplification, and reading resulted in 10 bit errors out of 5.27 megabits.
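The scheme described in the caption can be sketched in a few lines of Python. This is a minimal illustration rather than the researchers' actual code: the function names are hypothetical, and the rule for choosing between the two bases that represent each bit (a random pick that refuses to extend a run of three identical bases to four) is an assumption. Random choice keeps GC content near 50% only on average, and the common primer sequences mentioned in the caption are left out.

```python
import random
from typing import Tuple

BARCODE_BITS = 19   # address of the block within the book (from the caption)
BLOCK_BYTES = 12    # payload carried by each oligonucleotide (from the caption)


def encode_bits(bits: str, rng: random.Random) -> str:
    """Map each bit to a base: 0 -> A or C, 1 -> T or G (1 bit per base)."""
    bases = []
    for bit in bits:
        options = ["A", "C"] if bit == "0" else ["T", "G"]
        # Assumed rule: never let a run of three identical bases grow to four.
        if len(bases) >= 3 and bases[-1] == bases[-2] == bases[-3]:
            options = [b for b in options if b != bases[-1]]
        bases.append(rng.choice(options))
    return "".join(bases)


def decode_bases(seq: str) -> str:
    """Invert the mapping: A or C -> 0, T or G -> 1."""
    return "".join("0" if base in "AC" else "1" for base in seq)


def encode_block(index: int, block: bytes, rng: random.Random) -> str:
    """Prefix a 19-bit barcode to a 12-byte block and encode it as DNA."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("block must be exactly 12 bytes")
    if not 0 <= index < 2 ** BARCODE_BITS:
        raise ValueError("index does not fit in the 19-bit barcode")
    barcode = f"{index:0{BARCODE_BITS}b}"
    payload = "".join(f"{byte:08b}" for byte in block)
    return encode_bits(barcode + payload, rng)


def decode_block(seq: str) -> Tuple[int, bytes]:
    """Recover the block index and the original 12 bytes from a sequence."""
    bits = decode_bases(seq)
    index = int(bits[:BARCODE_BITS], 2)
    payload = bits[BARCODE_BITS:]
    data = bytes(int(payload[i:i + 8], 2) for i in range(0, len(payload), 8))
    return index, data


rng = random.Random(2012)  # seeded only to make the demo repeatable
oligo = encode_block(42, b"Hello, DNA!!", rng)
print(oligo)
print(decode_block(oligo))
```

Each encoded block is 19 + 96 = 115 bases long before primers are added. Because two bases stand for each bit value, the encoder always has an alternative when one choice would create a long repeat, and that freedom is what the 1 bit per base encoding buys at the cost of density. The numbers in the caption are also consistent with this layout: 54,898 blocks of 96 bits is 5,270,208 bits, or 5.27 megabits.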
A less technical explanation is provided by George Church and Sriram Kosuri in this video:
Before you get too excited, it will probably be a while before you can store your latest backup on a strand of DNA. The problem is that chemical reactions don't happen as fast as electronics, so while the storage density may be huge, the read/write times are on the long side. However, this doesn't mean that it isn't possible to build a DNA drive.
Next-Generation Digital Information Storage in DNA, by George M. Church, Yuan Gao, and Sriram Kosuri, published online on August 16 in Science Express