Journal article details
BMC Bioinformatics
MCA: Multiresolution Correlation Analysis, a graphical tool for subpopulation identification in single-cell gene expression data
Justin Feigelman1  Fabian J Theis1  Carsten Marr2 
[1] Department of Mathematics, Technische Universität München, Boltzmannstrasse, 3, 85747 Garching, Germany
[2] Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany
Keywords: qPCR analysis; Subpopulation identification; Correlation; Multiresolution
Others  :  1087553
DOI  :  10.1186/1471-2105-15-240
Received 2014-04-23, accepted 2014-07-04, published 2014
【 Abstract 】

Background

Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable.

Results

We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopulations based on the local pairwise correlation between covariates, without needing to define an a priori interaction scale. We demonstrate that MCA facilitates the identification of differentially regulated subpopulations in simulated data from a small gene regulatory network, followed by application to previously published single-cell qPCR data from mouse embryonic stem cells. We show that MCA recovers previously identified subpopulations, provides additional insight into the underlying correlation structure, reveals potentially spurious compartmentalizations, and provides insight into novel subpopulations.
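To make the idea of a local pairwise correlation without a fixed scale concrete, the following is a minimal sketch and not the authors' implementation. It assumes that cells are ordered by a third covariate and that the Pearson correlation of two covariates is computed in sliding windows of every width. The function name `multiresolution_correlation` and the parameters `min_width` and `step` are hypothetical and chosen only for illustration.

```python
import numpy as np


def multiresolution_correlation(x, y, z, min_width=10, step=5):
    """Pearson correlation of x and y in windows of cells ordered by z.

    Illustrative sketch only (assumed scheme, not the published code).
    Returns a list of (window_center, window_width, correlation) tuples,
    one per window position and width.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(z)          # order cells by the sorting covariate
    x, y = x[order], y[order]
    n = len(x)

    results = []
    # Scan all window widths, so no single scale has to be chosen up front.
    for width in range(min_width, n + 1, step):
        for start in range(0, n - width + 1, step):
            xs = x[start:start + width]
            ys = y[start:start + width]
            if xs.std() == 0 or ys.std() == 0:
                r = np.nan         # correlation undefined for constant data
            else:
                r = np.corrcoef(xs, ys)[0, 1]
            results.append((start + width / 2, width, r))
    return results
```

Plotting the correlation against window center and window width would then show regions of the sorted population where the two covariates are coupled differently, which is the kind of pattern one would inspect for candidate subpopulations.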

Conclusions

MCA is a useful method for the identification of subpopulations in low-dimensional expression data, such as those arising from qPCR or FACS measurements. With MCA it is possible to investigate the robustness of covariate correlations with respect to subpopulations, graphically identify outliers, and identify factors contributing to differential regulation between pairs of covariates. MCA thus provides a framework for investigating expression correlations for genes of interest and for generating biological hypotheses.

【 License 】

   
2014 Feigelman et al.; licensee BioMed Central Ltd.
