DNA-based information is a new interdisciplinary field linking information technology and biotechnology. The field is expected to meet the enormous need for long-term data storage by using DNA as an information storage medium. Despite DNA’s promise of strong stability, high storage density and low maintenance cost, researchers face problems in correctly rewriting the digital information encoded in DNA sequences.
Generally, DNA data storage technology has two modes: “in vitro hard disk mode” and “in vivo CD mode”. The primary advantage of the in vivo mode is the low-cost, reliable replication of chromosomal DNA through cell replication, which makes it suitable for rapid, low-cost copying and dissemination of data. However, because the DNA sequences that encode certain information contain a large number of repeats and homopolymers, such information can be “written” and “read” but cannot be “rewritten” accurately.
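To make the homopolymer issue concrete, the following minimal Python sketch measures the longest run of a single repeated base in a sequence. The function name and the example sequence are illustrative assumptions and are not taken from the study.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated base.

    Hypothetical helper for illustration only; not part of the published work.
    """
    # groupby yields one group per maximal run of identical characters.
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


if __name__ == "__main__":
    # The run "TTTTT" is the longest homopolymer in this made-up sequence.
    print(longest_homopolymer("ACGTTTTTGCA"))  # 5
```

Long runs of this kind, like repeated segments, give an editing tool several near-identical places to bind, which is one way to picture why accurate rewriting is difficult.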
To solve the rewriting problem, a research team led by Prof. Liu Cai from the Department of Chemistry of Tsinghua University, Prof. Li Jingjing from Changchun Institute of Applied Chemistry (CIAC) of the Chinese Academy of Sciences, and Prof. Chen Dong from Zhejiang University recently developed a dual-plasmid editing system to accurately process digital information in a microbial vector. Their findings were published in Science Advances.
The researchers established a dual plasmid system in vivo using a rationally designed coding algorithm and an information editing tool. This dual plasmid system is suitable for storing, reading and rewriting a wide variety of information, including text, codebooks and images. It fully explores the coding potential of DNA sequences without the need for any addressing index or backup sequence. It is also compatible with a variety of coding algorithms, thus enabling higher coding efficiency. For example, the coding efficiency of the current system reaches 4.0 bits per nucleotide.
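As a rough illustration of what a coding algorithm does, the sketch below maps binary data to nucleotides with a simple fixed scheme of two bits per base and then decodes it again. This is a textbook-style mapping chosen for clarity, not the rationally designed algorithm used by the researchers. The mapping table and function names are assumptions.

```python
# Assumed mapping for illustration: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, two bits per nucleotide."""
    out = []
    for byte in data:
        # Split each byte into four 2-bit groups, most significant first.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(seq: str) -> bytes:
    """Decode a DNA string produced by encode() back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # BASES.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    dna = encode(b"Hi")
    print(dna)          # CAGACGGC
    print(decode(dna))  # b'Hi'
```

Note that this sketch carries no addressing index or redundancy, so a single altered base changes the decoded data.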
To achieve both high efficiency and reliability in rewriting complex information stored in exogenous DNA sequences in vivo, the researchers used a variety of CRISPR-associated (Cas) proteins and recombinases. The Cas proteins were guided by their corresponding CRISPR RNA (crRNA) to cleave a target locus in the DNA sequence so that specific information could be addressed and rewritten. Owing to the high specificity of complementary base pairing between nucleic acid molecules, the information-encoding DNA sequences were precisely reconstructed by recombinases to encode new information. Optimizing the crRNA sequence made the rewriting tool highly adaptable to complex information, resulting in a rewrite reliability of up to 94%, which is comparable to that of existing gene-editing systems.
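The idea of addressing a target locus and rewriting only that segment can be pictured with a toy string model. The sketch below is a simplified analogy in Python, assuming the target is located by exact string matching. It ignores the biochemistry of Cas cleavage and recombination entirely, and the function names and sequences are hypothetical.

```python
def find_sites(seq: str, target: str) -> list[int]:
    """Return every start position where target occurs in seq (overlaps included)."""
    sites = []
    start = seq.find(target)
    while start != -1:
        sites.append(start)
        start = seq.find(target, start + 1)
    return sites


def rewrite(seq: str, target: str, replacement: str) -> str:
    """Replace the single occurrence of target with replacement.

    Raises ValueError if the target is missing or ambiguous, mirroring the
    need for a guide to match exactly one locus.
    """
    sites = find_sites(seq, target)
    if len(sites) != 1:
        raise ValueError(f"expected exactly one target site, found {len(sites)}")
    pos = sites[0]
    return seq[:pos] + replacement + seq[pos + len(target):]


if __name__ == "__main__":
    stored = "AAGGCCTTACGT"
    print(rewrite(stored, "CCTT", "GATC"))  # AAGGGATCACGT
```

In this model a sequence containing the same target twice, such as "ACGTACGT" with the target "ACGT", is rejected as ambiguous, which echoes why repeat-rich sequences are hard to rewrite accurately.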
The dual-plasmid system can serve as a universal platform for DNA-based information rewriting in vivo, thus offering a new strategy for information processing and target-specific rewriting of large and complex data at the molecular level.
“We believe this strategy can also be applied in live hosts with large genomes such as yeast, which will pave the way for practical applications with respect to big data storage.”
Prof. Liu Cai, Department of Chemistry, Tsinghua University
Chinese Academy of Sciences Headquarters
Liu, Y., et al. (2022) In vivo processing of molecularly digital information with targeted specificity and strong reliability. Science Advances. doi.org/10.1126/sciadv.abo7415.