BMC Genomics
RECLU: a pipeline to discover reproducible transcriptional start sites and their alternative regulation using capped analysis of gene expression (CAGE)
Timo Lassmann [1], Yoshihide Hayashizaki [2], Alistair RR Forrest [1], Piero Carninci [1], Masayoshi Itoh [2], Martin C Frith [3], Morana Vitezic [4], Hiroko Ohmiya [5]
[1] RIKEN Center for Life Science Technologies (CLST), Division of Genomic Technologies, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, 230-0045 Yokohama, Japan
[2] RIKEN Preventive Medicine and Diagnosis Innovation Program (PMI), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, 230-0045 Yokohama, Japan
[3] Sequence Analysis Team, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, 135-0064 Tokyo, Japan
[4] Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark
[5] RIKEN Advanced Center for Computing and Communication, Preventive Medicine and Applied Genomics Unit, 1-7-22 Suehiro-cho, Tsurumi-ku, 230-0045 Yokohama, Japan
Keywords: Hierarchical stability; Reproducibility; Peak finding; CAGE
DOI: 10.1186/1471-2164-15-269
Received 2014-01-16, accepted 2014-04-04, published 2014
【 Abstract 】

Background

Next-generation sequencing-based technologies are used extensively to study transcriptomes. Among these, cap analysis of gene expression (CAGE) specializes in detecting the 5′-most ends of RNA molecules. After the sequenced reads are mapped back to a reference genome, CAGE data highlight the transcriptional start sites (TSSs) and their usage at single-nucleotide resolution.

Results

We propose a pipeline that groups single-nucleotide TSSs into larger reproducible peaks and compares their usage across biological states. Importantly, our pipeline discovers broad peaks as well as the fine structure of individual transcriptional start sites embedded within them. We assess the performance of our approach on a large CAGE dataset including 156 primary cell types and two cell lines with biological replicates. We demonstrate that genes have complicated structures of transcription initiation events. In particular, we discover that narrow peaks embedded in broader regions of transcriptional activity can be differentially used even when the larger region is not.

Conclusions

By examining the reproducible fine-scale organization of TSSs, we can detect many differentially regulated peaks that went undetected by previous approaches.

【 License 】

   
2014 Ohmiya et al.; licensee BioMed Central Ltd.
