Journal Article Details
BMC Genomics
Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote
Research Article
Lester Khoo [1], Jianguo Lu [2], Eric Peatman [2], Yu Zhang [2], Huseyin Kucuktas [2], Zunchun Zhou [2], Jiaren Zhang [2], Zhanjiang Liu [2], Hao Zhang [2], KV Rajendran [2], Fanyue Sun [2], Yanliang Jiang [2], Xiuli Wang [2], Shikai Liu [3], Geoff Waldbieser [4]
[1] College of Veterinary Medicine, Mississippi State University, 127 Experiment Station Road, 38776, Stoneville, Mississippi, USA
[2] The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, Auburn University, 36849, Auburn, AL, USA
[3] The Shellfish Genetics and Breeding Laboratory, Fisheries College, Ocean University of China, 266003, Qingdao, P.R. China
[4] USDA, ARS, Catfish Genetics Research Unit, 141 Experiment Station Road, 38776, Stoneville, Mississippi, USA
Keywords: Transcriptome Assembly; Iron Responsive Element; Protein Query Sequence; Average Contig Length; Catfish Genome
DOI: 10.1186/1471-2164-13-595
Received 2012-03-01, accepted 2012-08-09, published 2012
Source: Springer
【 Abstract 】

Background: Upon the completion of whole genome sequencing, thorough genome annotation that associates genome sequences with biological meanings is essential. Genome annotation depends on the availability of transcript information as well as orthology information. In teleost fish, genome annotation is seriously hindered by genome duplication. Because of gene duplications, one cannot establish orthologies simply by homology comparisons. Rather, intense phylogenetic analysis or structural analysis of orthologies is required for the identification of genes. To conduct phylogenetic analysis and orthology analysis, full-length transcripts are essential. Generating large numbers of full-length transcripts using traditional transcript sequencing is very difficult and extremely costly.

Results: In this work, we took advantage of a doubled haploid catfish, which has two identical sets of chromosomes and therefore, in theory, no allelic variation. As such, transcript sequences generated from next-generation sequencing can be favorably assembled into full-length transcripts. Deep sequencing of the doubled haploid channel catfish transcriptome was performed using the Illumina HiSeq 2000 platform, yielding over 300 million high-quality trimmed reads totaling 27 Gbp. Assembly of these reads generated 370,798 non-redundant transcript-derived contigs. Functional annotation of the assembly allowed identification of 25,144 unique protein-encoding genes. A total of 2,659 unique genes were identified as putative duplicated genes in the catfish genome because the assembly of the corresponding transcripts harbored paralogous sequence variants (PSVs) or multisite sequence variants (MSVs), which appear as pseudo-SNPs in the assembly. Of the 25,144 contigs with unique protein hits, around 20,000 contigs matched 50% of the length of reference proteins, and over 14,000 transcripts were identified as full-length with complete open reading frames. The characterization of consensus sequences surrounding the start codon and the stop codon confirmed the correct assembly of the full-length transcripts.

Conclusions: The large set of transcripts assembled in this study is the most comprehensive set of genome resources ever developed from catfish. It will provide much-needed resources for functional genome research in catfish, serving as a reference transcriptome for genome annotation, analysis of gene duplication, gene family structures, and digital gene expression analysis. The putative set of duplicated genes provides a starting point for genome-scale analysis of gene duplication in the catfish genome, and should be a valuable resource for comparative genome analysis, genome evolution, and genome function studies.

【 License 】

CC BY   
© Liu et al.; licensee BioMed Central Ltd. 2013
