This dissertation centers on the modeling of heterogeneous data, which is ubiquitous in the digital information age. From a statistical point of view, heterogeneous data is composed of dissimilar components, where the objects within each component are themselves homogeneous. One real-world example is stock return data: stocks in the same industry segment tend to move closely together, while different segments tend to have distinct movement patterns.

Clustering is one of the most popular ways to characterize data heterogeneity, and it is a classical problem in unsupervised learning. We review the major clustering approaches in Chapter 1. In recent years, non-parametric Bayesian mixture models have attracted increasing attention in the clustering literature; because they are closely related to our work, we review the Mixture of Dirichlet Process Model in Chapter 2.

The main body of the dissertation consists of three generic statistical methods for modeling heterogeneity in different scenarios. As data become increasingly abundant, traditional clustering tasks are often accompanied by additional information about the objects to be clustered, known as side information. Side information has the potential to complement clustering algorithms and yield more accurate and meaningful results. In Chapter 3 we describe the Two-view Clustering method, a novel non-parametric clustering model that is capable of robustly incorporating noisy side information. We demonstrate the effectiveness of this new model with three real-world applications in Chapter 4.

Our second work is driven by market segmentation, which is key to the success of a modern business because it accurately identifies customer groups with varying needs. Market segmentation involves dividing a larger market into sub-markets based on a variety of factors, such as customers’ demographic information and product preferences.
In Chapter 5 we propose a multi-task learning framework to solve this problem.

Our third work, presented in Chapter 6, addresses a problem arising from citation analysis for research evaluation. In bibliometrics, one central task is to characterize the statistical distribution of citations. This problem is regarded as challenging for two reasons: (i) the citation distributions of almost all subject areas are highly right-skewed; and (ii) citation behavior can differ drastically across subject areas. We propose a mixture model to formally characterize the statistical distribution of citation data. Based on this model, we develop new criteria to evaluate the impact of journals and the performance of research institutes.