| BMC Bioinformatics | |
| Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia | |
| Research | |
| Ioannis Kavakiotis1  Grigorios Tsoumakas1  Ioannis Vlahavas1  Andreas Agathangelidis2  Anastasia Hadzidimitriou3  Kostas Stamatopoulos3  Aliki Xochelli3  Ioanna Chouvarda4  Nicos Maglaveras4  | |
| [1] Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece;Division of Molecular Oncology and Department of Onco-Hematology, San Raffaele Scientific Institute, Milan, Italy;Institute of Applied Biosciences, CERTH, Thessaloniki, Greece;Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden;Institute of Applied Biosciences, CERTH, Thessaloniki, Greece;Lab of Computing and Medical Informatics, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece; | |
| Keywords: Data integration; Feature extraction; List aggregation; Mutation patterns, somatic hypermutation; SHM; Chronic lymphocytic leukaemia; CLL; | |
| DOI : 10.1186/s12859-016-1044-3 | |
| Source: Springer | |
【 Abstract 】
Background: Somatic hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostic markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high-quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher-level integration procedure inspired by social choice theory. The procedure is applied in the Towards analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families through SHM.

Results: The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards analysis, performed on the integrated dataset using voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets with regard to SHM-related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudogenes was found in all CLL subsets.

Conclusions: This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions.
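The abstract does not say which voting rule drives the list aggregation step. As a rough illustration only, the sketch below uses the Borda count, one common rule from social choice theory, to merge several ranked lists into a consensus ranking. The function name `borda_aggregate` and the labels `gene_A`, `gene_B`, `gene_C` are hypothetical placeholders, not identifiers or data from the paper.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge ranked lists into one consensus ranking using the Borda count.

    Assumption: each ranking is a list ordered from most to least preferred.
    An item at position p in a list of length n receives n - 1 - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    # Highest total first; ties broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical input: three ranked lists of candidate target genes
# (placeholder labels, not values from the paper).
rankings = [
    ["gene_A", "gene_B", "gene_C"],
    ["gene_B", "gene_A", "gene_C"],
    ["gene_A", "gene_C", "gene_B"],
]
consensus = borda_aggregate(rankings)
print(consensus)
```

Other positional or pairwise rules could be substituted by changing only the scoring line; the Borda count is used here because it is simple to state and to check by hand.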
【 License 】
CC BY
© Kavakiotis et al. 2016