Comparison of sequences generated by a hidden Markov model
Sequences comparison;Asymptotic behavior
Kerchev, George Georgiev ; Houdre, Christian Mathematics Damron, Michael Foley, Robert Koltchinskii, Vladimir Tikhomirov, Konstantin ; Houdre, Christian
The length $LC_n$ of the longest common subsequences of two strings $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_n)$ is way to measure the similarity between $X$ and $Y$. We study the asymptotic behavior of $LC_n$ when the two strings are generated by a hidden Markov model $(Z, (X, Y))$. The latent chain $Z$ is an aperiodic time-homogeneous and irreducible finite state Markov chain and the pair $(X_i, Y_i)$is generated according to a distribution depending of the state of $Z_i$ for every $i \geq 1$. The letters $X_i$ and $Y_i$ each take values in a finite alphabet $\mathcal{A}$.The goal of this work is to build upon asymptotic results for $LC_n$ obtained for sequences of iid random variables. Under some standard assumptions regarding the model we first prove convergence results with rates for $\mathbb{E}[LC_n]$. Then, versions of concentration inequalities for the transversal fluctuations of $LC_n$ are obtained. Finally, we have outlined a proof for a central limit theorem by building upon previous work and adapting a Stein's method estimate.
【 预 览 】
附件列表
Files
Size
Format
View
Comparison of sequences generated by a hidden Markov model