A Robust and Efficient Representation-based DNA Storage Architecture by Deep Learning

Small Methods. 2025 Mar;9(3):e2400959. doi: 10.1002/smtd.202400959. Epub 2024 Dec 29.

Abstract

Images are one of the main forms of multimedia data and play a critical role in many applications. This paper proposes a representation-based architecture that takes advantage of the outstanding representation and image-generation abilities of deep learning (DL). The architecture comprises two DL models, an autoencoder and a U-Net, which together perform the representation, construction, and refinement of images from the noisy reads in DNA storage. Simulation experiments demonstrate that it can reconstruct images of moderate quality when insertion-deletion-substitution (IDS) error rates are below 6%. Combined with feature quantization, it also offers a flexible way to balance compression ratio against image quality by selecting an appropriate number of representation channels. Additionally, image quality can be boosted by using multiple reads, which are commonly available in DNA storage. A wet-lab experiment that successfully reconstructs an image stored in 14 plasmids further demonstrates the feasibility of the proposed architecture. Rather than storing the original image information, the representation-based architecture provides a competitive solution that achieves robust and efficient DNA storage for large-scale image applications.
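To make the terms "feature quantization" and "IDS errors" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes 8-bit uniform quantization of feature values in [0, 1], a fixed 2-bits-per-nucleotide mapping, and an independent per-base error model. The abstract does not specify any of these choices, and all function names (`quantize`, `bytes_to_dna`, `dna_to_bytes`, `ids_channel`) are hypothetical.

```python
import random

BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def quantize(features, levels=256, lo=0.0, hi=1.0):
    """Uniformly quantize floats (clipped to [lo, hi]) to ints in [0, levels-1]."""
    out = []
    for x in features:
        x = min(max(x, lo), hi)  # clip out-of-range values
        out.append(round((x - lo) / (hi - lo) * (levels - 1)))
    return out


def bytes_to_dna(values):
    """Encode each 8-bit value as four nucleotides, most significant bits first."""
    seq = []
    for v in values:
        for shift in (6, 4, 2, 0):
            seq.append(BASES[(v >> shift) & 0b11])
    return "".join(seq)


def dna_to_bytes(seq):
    """Decode nucleotides back to 8-bit values; a trailing partial group is dropped."""
    values = []
    usable = len(seq) - len(seq) % 4
    for i in range(0, usable, 4):
        v = 0
        for base in seq[i:i + 4]:
            v = (v << 2) | BASES.index(base)
        values.append(v)
    return values


def ids_channel(seq, p_ins, p_del, p_sub, rng):
    """Simulate independent insertion, deletion, and substitution errors per base."""
    out = []
    for base in seq:
        if rng.random() < p_ins:       # insert a random base before this one
            out.append(rng.choice(BASES))
        r = rng.random()
        if r < p_del:                  # delete this base
            continue
        if r < p_del + p_sub:          # substitute with a different base
            out.append(rng.choice([b for b in BASES if b != base]))
        else:                          # base is read correctly
            out.append(base)
    return "".join(out)
```

A strand could be passed through the channel with, for example, `ids_channel(bytes_to_dna(quantize(features)), 0.02, 0.02, 0.02, random.Random(0))`, which corresponds to a 6% total error rate under this assumed model. Because insertions and deletions shift every subsequent base, naive decoding with `dna_to_bytes` degrades quickly, which is the kind of noise the abstract says the DL models are meant to tolerate.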

Keywords: deep learning; image DNA storage; image compression; representation features.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • DNA* / chemistry
  • DNA* / genetics
  • Deep Learning*
  • Image Processing, Computer-Assisted* / methods
  • Neural Networks, Computer

Substances

  • DNA