BMC Genomics, 2016
In Seok Yang, Hyeonju Son, Sora Kim, Sangwoo Kim
LicenseType: CC BY
Background: Alternative splicing events that result in the production of multiple gene isoforms reveal important molecular mechanisms. Gene isoforms are often differentially expressed across organs and tissues, developmental stages, and disease conditions. Specifically, recent studies show that aberrant regulation of alternative splicing frequently occurs in cancer to affect tumor cell transformation and growth. While analysis of isoform expression is important for discovering tumor-specific isoform signatures and interpreting relevant genomic mutations, there is currently no web-based, easy-to-use, and publicly available platform for this purpose.
Description: We developed ISOexpresso to provide information regarding isoform existence and expression, which can be grouped by cancer vs. normal conditions, cancer types, and tissue types. ISOexpresso implements two main functions. First, the Isoform Expression View function creates visualizations of condition-specific RNA/isoform expression patterns upon query of a gene of interest. With this function, users can easily determine the major isoform (the most expressed isoform in a sample) of a gene with respect to the condition and check whether it matches the known canonical isoform. ISOexpresso outputs the expression levels of all known transcripts so that users can check alterations in the expression landscape and find potential tumor-specific isoforms. Second, the User Data Annotation function supports annotation of genomic variants to determine the most plausible consequence of a variation (e.g., an amino acid change) among many possible interpretations. Because most coding-sequence mutations take effect through subsequent transcription and translation, ISOexpresso automatically prioritizes, by their relative expression, the transcripts that act as backbones for mutation effect prediction. By employing ISOexpresso, we could investigate the consistency between the most expressed and the known canonical/principal isoforms, as well as infer candidate tumor-specific isoforms based on their expression levels. In addition, we confirmed that ISOexpresso could easily reproduce previously known isoform expression patterns: recurrent observation of a major isoform across tissues, differential isoform expression patterns in a given tissue, and switching of the major isoform during tumorigenesis.
Conclusions: ISOexpresso serves as a web-based, easy-to-use platform for isoform expression and alteration analysis based on a large-scale cancer database. We anticipate that ISOexpresso will expedite the formulation and confirmation of novel hypotheses by providing isoform-level perspectives on cancer research. The ISOexpresso database is available online at http://wiki.tgilab.org/ISOexpresso/.
BMC Genomics, 2016
Pierre-Etienne Cholley, Marco-Antonio Mendoza-Parra, Matthias Blum, Hinrich Gronemeyer, Valeriya Malysheva
LicenseType: CC BY
Background: Proximity ligation-mediated methods are essential for studying the impact of three-dimensional chromatin organization on gene programming. Although significant progress has been made in the development of computational tools that assess long-range chromatin interactions, next to nothing is known about the quality of the generated datasets.
Method: We have developed LOGIQA (www.ngs-qc.org/logiqa), a database hosting quality scores for long-range genome interaction assays, accessible through a user-friendly web-based environment.
Results: Currently, LOGIQA harbors QC scores for >900 datasets, providing a global view of their relative quality and revealing the impact of genome size, coverage and other technical aspects. LOGIQA provides a user-friendly dataset query panel and a genome viewer to assess local genome-interaction maps at different resolutions and quality-assessment conditions.
Conclusions: LOGIQA is the first database hosting quality scores dedicated to long-range chromatin interaction assays; in addition, it provides a platform for visualizing genome interactions made available by the scientific community.
3 Gene evolution and gene expression after whole genome duplication in fish: the PhyloFish database [Journal article]
BMC Genomics, 2016
Pierre Pontarotti, Dany Severac, Laurent Journot, Thaovi Nguyen, Julien Bobe, Jeremy Pasquier, Yann Guiguen, Elodie Jouanno, Cédric Cabau, Christophe Klopp, John H. Postlethwait, Ingo Braasch
LicenseType: CC BY
With more than 30,000 species, ray-finned fish represent approximately half of vertebrates. The evolution of ray-finned fish was impacted by several whole genome duplication (WGD) events, including a teleost-specific WGD event (TGD) that occurred at the root of the teleost lineage about 350 million years ago (Mya) and more recent WGD events in salmonids, carps, suckers and others. In plants and animals, WGD events are associated with adaptive radiations and evolutionary innovations. WGD-spurred innovation may be especially relevant in the case of teleost fish, which colonized a wide diversity of habitats on Earth, including many extreme environments. Fish biodiversity, the use of fish models for human medicine and ecological studies, and the importance of fish in human nutrition fuel an important need for the characterization of gene expression repertoires and the corresponding evolutionary histories of ray-finned fish genes. To this end, we performed transcriptome analyses and developed the PhyloFish database to provide (i) de novo assembled gene repertoires in 23 different ray-finned fish species, including two holosteans (i.e. a group that diverged from teleosts before TGD) and 21 teleosts (including six salmonids), and (ii) gene expression levels in ten different tissues and organs (and embryos for many) in the same species. This resource was generated using a common deep RNA sequencing protocol to obtain the most exhaustive gene repertoire possible in each species, which allows between-species comparisons to study the evolution of gene expression in different lineages. The PhyloFish database described here can be accessed and searched using RNAbrowse, a simple and efficient solution for giving access to RNA-seq de novo assembled transcripts.
BMC Genomics, 2011
Javier Canales, Francisco R Cantón, David P Villalobos, Sara M Díaz-Moreno, Francisco M Cánovas, Concepción Avila, Noé Fernández-Pozo, Arantxa Flores-Monterroso, M Gonzalo Claros, M Ángeles Guevara, Carmen Collada, M Teresa Cervera, Álvaro Soto, Rocío Bautista, Darío Guerrero-Fernández, Pedro Perdiguero, Ricardo Ordás
LicenseType: Unknown
Background: Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases.
Description: EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be queried by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided.
Conclusions: The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome.
BMC Genomics, 2011
Ge Gao, He Zhang, Jun Wang, Liang Tang, Jingchu Luo, Lei Kong, Xiaocheng Gu, Zhe Li, Shuqi Zhao
LicenseType: Unknown
Background: The concurrent release of rice genome sequences for two subspecies (Oryza sativa L. ssp. japonica and Oryza sativa L. ssp. indica) facilitates rice studies at the whole genome level. Since the advent of high-throughput analysis, huge amounts of functional genomics data have been delivered rapidly, making an integrated online genome browser indispensable for scientists to visualize and analyze these data. Based on next-generation web technologies and high-throughput experimental data, we have developed Rice-Map, a novel genome browser for researchers to navigate, analyze and annotate the rice genome interactively.
Description: More than one hundred annotation tracks (81 for japonica and 82 for indica) have been compiled and loaded into Rice-Map. These pre-computed annotations cover gene models, transcript evidence, expression profiling, epigenetic modifications, inter-species and intra-species homologies, genetic markers and other genomic features. In addition to these pre-computed tracks, registered users can interactively add comments and research notes to Rice-Map as User-Defined Annotation entries. By smoothly scrolling, dragging and zooming, users can browse various genomic features simultaneously at multiple scales. On-the-fly analysis of selected entries can be performed through dedicated bioinformatic analysis platforms such as WebLab and Galaxy. Furthermore, a BioMart-powered data warehouse, "Rice Mart", is offered for advanced users to fetch bulk datasets based on complex criteria.
Conclusions: Rice-Map delivers abundant up-to-date japonica and indica annotations, providing a valuable resource for both computational and bench biologists. Rice-Map is publicly accessible at http://www.ricemap.org/, with all data available for free download.
6 RASOnD - A comprehensive resource and search tool for RAS superfamily oncogenes from various species [Journal article]
BMC Genomics, 2011
A Srinivasan, Umay Kulsum, Sujata Sharma, Punit Kaur, Tej P Singh, Vishwadeep Singh
LicenseType: Unknown
Background: The Ras superfamily plays an important role in the control of cell signalling and division. Mutations in the Ras genes convert them into active oncogenes. The Ras oncogenes form a major thrust of global cancer research as they are involved in the development and progression of tumors. This has resulted in the exponential growth of data on the Ras superfamily across different public databases and in the literature. However, no dedicated public resource is currently available for data mining and analysis on this family. The present database was developed to facilitate straightforward access, retrieval and analysis of the information available on Ras oncogenes from a single site.
Description: We have developed the RAS Oncogene Database (RASOnD) as a comprehensive knowledgebase that provides integrated and curated information on a single platform for oncogenes of the Ras superfamily. RASOnD encompasses exhaustive genomics and proteomics data existing across diverse publicly accessible databases. This resource presently includes a total of 199,046 entries from 101 different species. It provides a search tool to generate information about their nucleotide and amino acid sequences, single nucleotide polymorphisms, chromosome positions, orthologies, motifs, structures, related pathways and associated diseases. We have implemented a number of user-friendly search interfaces and sequence analysis tools. At present the user can (i) browse the data, (ii) search any field through a simple or advanced search interface, and (iii) perform a BLAST search and subsequently a CLUSTALW multiple sequence alignment by selecting sequences of Ras oncogenes. The generic genome browser GBrowse, JMOL for structural visualization and TREEVIEW for phylograms have been integrated for clear perception of the retrieved data. External links to related databases have been included in RASOnD.
Conclusions: This database is a resource and search tool dedicated to Ras oncogenes. It is useful to cancer biologists and cell and molecular biologists as a ready source for research, identification and elucidation of the role of these oncogenes. The data generated can be used for understanding the relationship between the Ras oncogenes and their association with cancer. The database, updated monthly, is freely accessible online at http://202.141.47.181/rasond/ and http://www.aiims.edu/RAS.html.