The study of the longest common subsequences (LCSs) of two random words/strings is classical in computer science and bioinformatics. A problem of particular probabilistic interest is to determine the limiting behavior of the expectation and variance of the length of the LCSs, as the length of the random words grows without bound. This dissertation studies this problem using both Monte-Carlo simulation and theoretical analysis. The specific questions studied here include estimating the growth order of the variance, LCSs based hypothesis testing methods for sequences similarity, theoretical upper bounds on the Chvátal-Sankoff constant of multiple sequences, and theoretical growth order of the variance when the two random words have asymmetrical distributions.
【 预 览 】
附件列表
Files
Size
Format
View
Topics on the longest common subsequences: Simulations, computations, and variance