BMC Bioinformatics
MIA: Mutual Information Analyzer, a graphic user interface program that calculates entropy, vertical and horizontal mutual information of molecular sequence sets
Software
Fernando Antoneli [1], Flavio Lichtenstein [1], Marcelo R. S. Briones [2]
[1] Departamento de Informática em Saúde, Escola Paulista de Medicina, Universidade Federal de Sao Paulo, Rua Botucatu, 862, Ed. José Leal Prado, andar térreo, Vila Clementino, CEP 04023-062, Sao Paulo, SP, Brazil; Laboratory of Evolutionary Genomics and Biocomplexity, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Pedro de Toledo, 669, 4 andar L4E, CEP 04039-032, São Paulo, SP, Brazil; Departamento de Microbiologia, Immunologia and Parasitologia, Escola Paulista de Medicina, Universidade Federal de Sao Paulo, Rua Botucatu, 862, Ed. Ciências Biomédicas, 3 andar, Vila Clementino, CEP 04023-062, Sao Paulo, SP, Brazil; Laboratory of Evolutionary Genomics and Biocomplexity, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Pedro de Toledo, 669, 4 andar L4E, CEP 04039-032, São Paulo, SP, Brazil
Keywords: Software; Information theory; Entropy; Mutual information; DNA sequences; Species
DOI: 10.1186/s12859-015-0837-0
Received 2015-09-03, accepted 2015-12-02, published 2015
Source: Springer
【 Abstract 】
Background: Short- and long-range correlations in biological sequences are central to genomic studies of covariation. These correlations can be studied using mutual information, which measures the amount of information one random variable contains about another. Here we present MIA (Mutual Information Analyzer), a user-friendly graphical interface pipeline that calculates spectra of vertical entropy (VH), vertical mutual information (VMI) and horizontal mutual information (HMI), since there is currently no user-friendly integrated platform that performs all these calculations in a single package. MIA also calculates the Jensen-Shannon Divergence (JSD) between pairs of spectra from different species, herein called informational distances. The resulting distance matrices can be presented as distance histograms and informational dendrograms, supporting the discrimination of closely related species.
Results: To test MIA we analyzed sequences from the Drosophila Adh locus, because the taxonomy and evolutionary patterns of different Drosophila species are well established and the Adh gene has been extensively studied. The search retrieved 959 sequences from 291 species, of which 450 sequences from 17 species were selected. With this dataset MIA performed all tasks in less than three hours: gathering, storing and aligning FASTA files; calculating VH, VMI and HMI spectra; and calculating JSD between pairs of spectra from different species. For each task MIA saved tables and graphics to the local disk, where they remain easily accessible for future analysis.
Conclusions: Our tests revealed that the “informational model free” spectra may represent species signatures. Since JSD applied to horizontal mutual information spectra resulted in statistically significant distances between species, we could calculate the respective hierarchical clusters, herein called Informational Dendrograms (ID). When compared to phylogenetic trees, all Informational Dendrograms presented similar taxonomy and species clustering.
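The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not MIA's code: the function names, the toy alignment, and the reading of "vertical" as "computed over alignment columns" are assumptions made here for illustration only. It shows the standard definitions of Shannon entropy, mutual information (via I(X;Y) = H(X) + H(Y) - H(X,Y)) and Jensen-Shannon divergence, all in bits.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired observations."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def jsd(p, q):
    """Jensen-Shannon divergence (bits) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy alignment (hypothetical data, not taken from the paper).
alignment = ["ACGT", "ACGA", "TCGT", "TCGA"]
columns = list(zip(*alignment))  # one tuple of residues per alignment column

# Assumed reading of "vertical entropy": entropy of each alignment column.
col_entropy = [entropy(col) for col in columns]

# Assumed reading of "vertical mutual information": MI between two columns.
mi_0_3 = mutual_information(columns[0], columns[3])
```

Under the same assumptions, a horizontal variant would pair residues separated by a fixed distance along each sequence instead of pairing two columns, and `jsd` would be applied to two normalized spectra to obtain a distance between species.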
【 License 】
CC BY
© Lichtenstein et al. 2015