Israeli scientists optimize DNA storage technology
Author: 小柯機器人 (Xiaoke Robot)
Published: 2019/9/10 15:01:51
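The research group of Zohar Yakhini and Leon Anavy at the Technion (Israel Institute of Technology) has optimized DNA storage technology by using composite DNA letters to reduce the number of synthesis cycles. The paper was published online in Nature Biotechnology on September 9, 2019.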
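The researchers developed encoding and decoding methods that use composite DNA letters to exploit the information redundancy of DNA storage. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio. Their methods encode data using fewer synthesis cycles. The researchers encoded 6.4 MB into composite DNA with distinguishable composition medians, using 20% fewer synthesis cycles per unit of data than previous reports. They also simulated encoding with larger composite alphabets, with distinguishable composition deciles, showing that 75% fewer synthesis cycles are potentially sufficient. They describe applicable error-correcting codes and inference methods, and investigate error patterns in the context of composite DNA letters.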
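The idea of a composite letter can be illustrated with a minimal Python sketch. It is not the authors' implementation. The six-letter alphabet below (four pure bases plus two 50:50 mixtures, named M and K here), the helper names, and the nearest-ratio decoding rule using L1 distance are all assumptions chosen for illustration. The sketch shows how a letter defined by a nucleotide ratio could be recovered from the many reads that cover one position.

```python
import math
import random
from collections import Counter

NUCLEOTIDES = "ACGT"

# Hypothetical six-letter composite alphabet (an assumption for illustration):
# each letter is a ratio over (A, C, G, T).
ALPHABET = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "T": (0.0, 0.0, 0.0, 1.0),
    "M": (0.5, 0.5, 0.0, 0.0),  # 50:50 mixture of A and C
    "K": (0.0, 0.0, 0.5, 0.5),  # 50:50 mixture of G and T
}


def sample_reads(letter, depth, rng):
    """Simulate sequencing `depth` molecules at one composite position."""
    return rng.choices(NUCLEOTIDES, weights=ALPHABET[letter], k=depth)


def infer_letter(reads):
    """Return the letter whose ratio is closest (L1 distance) to the
    observed nucleotide frequencies. A simple stand-in decoding rule."""
    counts = Counter(reads)
    total = len(reads)
    observed = [counts[n] / total for n in NUCLEOTIDES]

    def distance(ratio):
        return sum(abs(o - r) for o, r in zip(observed, ratio))

    return min(ALPHABET, key=lambda name: distance(ALPHABET[name]))


def bits_per_position(alphabet_size):
    """Upper bound on information carried by one position (one synthesis cycle)."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    rng = random.Random(0)
    reads = sample_reads("M", depth=100, rng=rng)
    print("inferred letter:", infer_letter(reads))
    print("bits per position, 4 letters:", bits_per_position(4))
    print("bits per position, 6 letters:", bits_per_position(len(ALPHABET)))
```

In this toy setting, a position drawn from six letters can carry up to log2(6), roughly 2.58 bits, compared with 2 bits for the standard four bases, which is why a larger alphabet needs fewer synthesis cycles per unit of data. The figure is only an upper bound for this sketch and does not reproduce the paper's reported savings, which also depend on error correction and sequencing depth.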
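By way of background, the density and long-term stability of DNA make it an appealing storage medium, particularly for long-term data archiving. Existing DNA storage technologies synthesize and sequence multiple nominally identical molecules in parallel, which results in information redundancy.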
Appendix: original English text
Title: Data storage in DNA with fewer synthesis cycles using composite DNA letters
Author: Leon Anavy, Inbal Vaknin, Orna Atar, Roee Amit, Zohar Yakhini
Issue&Volume: 2019-09-09
Abstract: The density and long-term stability of DNA make it an appealing storage medium, particularly for long-term data archiving. Existing DNA storage technologies involve the synthesis and sequencing of multiple nominally identical molecules in parallel, resulting in information redundancy. We report the development of encoding and decoding methods that exploit this redundancy using composite DNA letters. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio. Our methods encode data using fewer synthesis cycles. We encode 6.4MB into composite DNA, with distinguishable composition medians, using 20% fewer synthesis cycles per unit of data, as compared to previous reports. We also simulate encoding with larger composite alphabets, with distinguishable composition deciles, to show that 75% fewer synthesis cycles are potentially sufficient. We describe applicable error-correcting codes and inference methods, and investigate error patterns in the context of composite DNA letters. Toward more storage for less synthesis using a six-letter composite DNA alphabet.
DOI: 10.1038/s41587-019-0240-x
Source: https://www.nature.com/articles/s41587-019-0240-x