Information Rate Decomposition for Noisy Nanopore Channels with Geometric Duplication
arXiv:2606.06808v1 Announce Type: new
Abstract: This paper studies information rates of noisy duplication channels with memory, motivated by nanopore DNA sequencing. In nanopore sequencing, the measured signal is affected by both inter-symbol interference (ISI), caused by multiple DNA bases residing in the pore, and random sample duplications, where variable translocation speed causes each base to generate a random number of samples. These two effects make direct theoretical analysis difficult. To address this, we derive a new decomposition of the information rate into two interpretable terms: one associated with the intrinsic memory of an auxiliary ISI channel, and another that captures the uncertainty in the segment boundaries caused by random duplications. This decomposition separates the dominant channel distortions and replaces the direct analysis of the full channel with two more readily interpretable components. We then study the second term through a soft alignment functional closely related to Soft-DTW, which enables strong asymptotic equipartition property (AEP) results and an alternative proof of the Markov-constrained coding theorem based on strong information stability. Finally, we develop a lower bound on the information rate that depends on the distribution of jump distances between adjacent nanopore levels. This bound gives a simple geometric explanation of channel synchronisability and provides a tractable framework for computing achievable rates of Oxford Nanopore sequencers.
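To make the channel described in the abstract concrete, the following is a minimal simulation sketch, not the paper's model. It assumes that each pore state is a window of `memory` consecutive bases (the ISI), that each state emits a geometrically distributed number of samples (the duplication, as suggested by the title), and that the noise is additive Gaussian. The function name, the `levels` lookup table, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_nanopore_channel(bases, levels, memory=2, p_stay=0.5,
                              noise_std=0.1, seed=0):
    """Toy noisy duplication channel with ISI (illustrative assumptions only).

    bases     : sequence of input symbols, e.g. "ACGT..."
    levels    : dict mapping a tuple of `memory` bases to a noiseless level
                (hypothetical table, not taken from the paper)
    p_stay    : probability of emitting one more sample for the same state,
                so the sample count is geometric on {1, 2, ...}
    noise_std : standard deviation of the assumed additive Gaussian noise
    Returns (samples, boundaries), where boundaries[i] is the index of the
    first sample generated by the i-th pore state.
    """
    if not 0.0 <= p_stay < 1.0:
        raise ValueError("p_stay must lie in [0, 1)")
    rng = random.Random(seed)
    samples, boundaries = [], []
    for i in range(memory - 1, len(bases)):
        # ISI: the level depends on the last `memory` bases in the pore.
        kmer = tuple(bases[i - memory + 1:i + 1])
        level = levels[kmer]
        # Duplication: draw a geometric number of samples for this state.
        count = 1
        while rng.random() < p_stay:
            count += 1
        boundaries.append(len(samples))
        # Noise: each duplicated sample gets independent Gaussian noise.
        samples.extend(level + rng.gauss(0.0, noise_std) for _ in range(count))
    return samples, boundaries


# Hypothetical level table for a memory-2 pore over the DNA alphabet.
ALPHABET = "ACGT"
toy_levels = {(a, b): ALPHABET.index(a) + 0.25 * ALPHABET.index(b)
              for a in ALPHABET for b in ALPHABET}

samples, boundaries = simulate_nanopore_channel("ACGTTGCA", toy_levels)
```

A receiver observes only `samples`. The hidden `boundaries` list corresponds to the segment-boundary uncertainty that the abstract isolates in the second term of the decomposition. Under this sketch, the mean number of samples per state is `1 / (1 - p_stay)`.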