Abstract:
We study algorithms for constructing a text (a long string) from a dictionary (a sequence of short strings). The problem has an application in bioinformatics and is connected to the sequence assembly method for reconstructing a long DNA sequence from small fragments. The task is to construct a string $t$ of length $n$ from strings $s_1, \dots, s_m$, which may overlap. First, we provide a classical (randomized) algorithm with running time $O(n + L + m (\log n)^2) = \tilde{O}(n + L)$, where $L$ is the sum of the lengths of $s_1, \dots, s_m$. Second, we provide a quantum algorithm with running time $O(n + \log n \cdot (\log m + \log\log n) \cdot \sqrt{m \cdot L}) = \tilde{O}(n + \sqrt{m \cdot L})$. Additionally, we show that the lower bound for any classical (randomized or deterministic) algorithm is $\Omega(n + L)$. Thus, our classical algorithm is optimal up to a logarithmic factor, and our quantum algorithm achieves a speed-up over any classical (randomized or deterministic) algorithm when the strings in the dictionary have non-constant length.
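To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the classical or quantum algorithm of this paper. It assumes that "constructing the text with possible overlapping" means covering every position of the text by occurrences of dictionary strings, where consecutive occurrences may overlap or touch. The function name `cover_text` and the 0-based positions are hypothetical choices made for illustration, and the running time is far from the bounds stated above.

```python
def cover_text(t, dictionary):
    """Naive illustration (an assumption, not the paper's method).

    Returns a list of (position, dictionary_index) pairs whose occurrences
    cover every position of t, or None if no such cover exists.
    Positions are 0-based.
    """
    n = len(t)
    # best[q] = (length, index) of the longest dictionary string occurring at q.
    best = [(0, -1)] * n
    for idx, s in enumerate(dictionary):
        if not s:
            continue  # empty strings cover nothing
        start = t.find(s)
        while start != -1:
            if len(s) > best[start][0]:
                best[start] = (len(s), idx)
            start = t.find(s, start + 1)

    cover = []
    covered_to = 0  # t[:covered_to] is already covered
    q = 0           # next start position not yet examined
    while covered_to < n:
        reach, choice = covered_to, None
        # Among occurrences starting at or before the covered prefix,
        # pick the one that extends the cover furthest (greedy interval cover).
        while q <= covered_to:
            length, idx = best[q]
            if q + length > reach:
                reach, choice = q + length, (q, idx)
            q += 1
        if choice is None:
            return None  # position covered_to cannot be covered
        cover.append(choice)
        covered_to = reach
    return cover
```

For example, `cover_text("abacaba", ["abac", "caba"])` places the first string at position 0 and the second at position 3, overlapping in one character, while `cover_text("abcd", ["ab", "d"])` fails because no dictionary string covers the character `c`.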