Concept

DNA digital data storage

Summary
DNA digital data storage is the process of encoding and decoding binary data to and from synthesized strands of DNA. While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of its high cost and very slow read and write times. In June 2019, scientists reported that all 16 GB of text from the English Wikipedia had been encoded into synthetic DNA. In 2021, scientists reported that a custom DNA data writer had been developed that was capable of writing data into DNA at 18 Mbps. Countless methods for encoding data in DNA are possible. The optimal methods are those that make economical use of DNA and protect against errors. If the message DNA is intended to be stored for a long period of time, for example, 1,000 years, it is also helpful if the sequence is obviously artificial and the reading frame is easy to identify. Several simple methods for encoding text have been proposed. Most of these involve translating each letter into a corresponding "codon", consisting of a unique small sequence of nucleotides in a lookup table. Some examples of these encoding schemes include Huffman codes, comma codes, and alternating codes. To encode arbitrary data in DNA, the data is typically first converted into ternary (base 3) data rather than binary (base 2) data. Each digit (or "trit") is then converted to a nucleotide using a lookup table. To prevent homopolymers (repeating nucleotides), which can cause problems with accurate sequencing, the result of the lookup also depends on the preceding nucleotide. Using the example lookup table below, if the previous nucleotide in the sequence is T (thymine), and the trit is 2, the next nucleotide will be G (guanine). Various systems may be incorporated to partition and address the data, as well as to protect it from errors. One approach to error correction is to regularly intersperse synchronization nucleotides between the information-encoding nucleotides.
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.