Thesis details
Improvement of ab initio methods of gene prediction in genomic and metagenomic sequences
Zhu, Wenhan ; Biology
University: Georgia Institute of Technology
Department: Biology
Keywords: Codon usage; Hidden Markov model; Gene prediction; GeneMark; Metagenomics; Gene finding
Others: https://smartech.gatech.edu/bitstream/1853/33869/1/zhu_wenhan_201005_phd.pdf
United States | English
Source: SMARTech Repository
【 Abstract 】

A metagenome obtained by shotgun sequencing of a microbial community is a heterogeneous mixture of rather short sequences. The vast majority (99%) of microbial species in a given community are likely to be non-cultivable. Many protein-coding regions in a new metagenome are likely to code for barely detectable homologs of already known proteins. Therefore, an ab initio method that accurately identifies new genes is a vitally important tool for metagenomic sequence analysis. A heuristic method for building a model to find genes in short prokaryotic sequences of anonymous origin was proposed in 1999, before the advent of metagenomics. With hundreds of new prokaryotic genomes now available, it is possible to enhance the original approach and to use direct polynomial and logistic approximations of oligonucleotide frequencies. The idea was to bypass traditional ways of parameter estimation, such as supervised training on a set of validated genes or unsupervised training on an anonymous sequence assumed to contain a large enough number of genes. The codon frequencies, critical for model parameterization, could be derived from the frequencies of nucleotides observed in the short sequence. This method could also be applied to initialize algorithms for iterative parameter estimation in prokaryotic as well as eukaryotic gene finders.
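
To illustrate the general idea of deriving codon frequencies from the nucleotide frequencies of a short sequence, here is a minimal sketch. It is not the method of the thesis: the abstract refers to polynomial and logistic approximations fitted on known genomes, while this sketch substitutes a simple independence baseline in which each codon frequency is the product of its three nucleotide frequencies. The function names, the example fragment, and the choice of the standard genetic code for stop codons are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: standard genetic code, so these three codons are stops.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def nucleotide_frequencies(seq):
    """Return relative frequencies of A, C, G, T, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    return {base: counts[base] / total for base in "ACGT"}


def baseline_codon_frequencies(seq):
    """Estimate sense-codon frequencies as products of nucleotide frequencies.

    Independence baseline only (hypothetical helper): stop codons are
    excluded and the remaining 61 values are renormalized to sum to 1.
    """
    freq = nucleotide_frequencies(seq)
    raw = {}
    for codon in map("".join, product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue
        raw[codon] = freq[codon[0]] * freq[codon[1]] * freq[codon[2]]
    norm = sum(raw.values())
    return {codon: value / norm for codon, value in raw.items()}


# Example with a made-up short fragment.
fragment = "ATGGCGCTGAAAGGCTGCCGT"
codon_freq = baseline_codon_frequencies(fragment)
print(max(codon_freq, key=codon_freq.get))
```

A baseline like this ignores codon position effects and codon usage bias, which is the gap that approximations fitted on sequenced genomes are meant to close.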

【 Preview 】
Attachment list
Files Size Format View
Improvement of ab initio methods of gene prediction in genomic and metagenomic sequences 8352KB PDF download