Machine Learning and Knowledge Extraction
Assessing the Robustness of Cluster Solutions in Emotionally-Annotated Pictures Using Monte-Carlo Simulation Stabilized K-Means Algorithm
Marko Horvat [1], Alan Jović [2], Kristijan Burnik [3]
[1] Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000 Zagreb, Croatia
[2] Department of Electronics, Microelectronics, Computer and Intelligent Systems, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000 Zagreb, Croatia
[3] Independent Researcher, HR-10000 Zagreb, Croatia
Keywords: multimedia; clustering; k-means; Monte-Carlo simulation; cluster distribution; emotion
DOI: 10.3390/make3020022
Source: DOAJ
【 Abstract 】
Clustering is a popular machine-learning technique that is often used in the exploration of continuous-variable data. Two problems are commonly encountered in clustering: (1) the selection of the optimal number of clusters, and (2) the undecidability of the affiliation of border data points to neighboring clusters. We address both problems and describe how to solve them in application to affective multimedia databases. In the experiment, we used the unsupervised learning algorithm k-means and the Nencki Affective Picture System (NAPS) dataset, which contains 1356 semantically and emotionally annotated pictures. The optimal number of centroids was estimated using the empirical elbow and silhouette rules and validated using a Monte-Carlo simulation approach. Clustering with k = 1–50 centroids is reported, along with dominant picture keywords and descriptive statistical parameters. Affective multimedia databases, such as the NAPS, have been specifically designed for emotion and attention experiments. By estimating the optimal cluster solutions, it was possible to gain deeper insight into the affective features of visual stimuli. Finally, a custom software application was developed for the study in the Python programming language. The tool uses the scikit-learn library for the implementation of machine-learning algorithms, data exploration, and visualization. The tool is freely available for scientific and non-commercial purposes.
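The general idea of combining the elbow and silhouette rules with repeated randomized runs can be illustrated with a minimal scikit-learn sketch. This is not the authors' tool: the synthetic two-dimensional data stands in for picture ratings, the helper name `evaluate_k_range` is hypothetical, and re-running k-means with different random seeds is only one assumed way of realizing a Monte-Carlo style stability check.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def evaluate_k_range(X, k_values, n_runs=20, base_seed=0):
    """Hypothetical helper: for each k, repeat k-means with different seeds
    and summarize inertia (elbow rule) and silhouette score."""
    results = {}
    for k in k_values:
        inertias, silhouettes = [], []
        for run in range(n_runs):
            # One random initialization per run, so the spread across runs
            # reflects sensitivity to the starting centroids.
            model = KMeans(n_clusters=k, n_init=1, random_state=base_seed + run)
            labels = model.fit_predict(X)
            inertias.append(model.inertia_)
            # The silhouette score is undefined for a single cluster.
            if k > 1:
                silhouettes.append(silhouette_score(X, labels))
        results[k] = {
            "inertia_mean": float(np.mean(inertias)),
            "inertia_std": float(np.std(inertias)),
            "silhouette_mean": float(np.mean(silhouettes)) if silhouettes else float("nan"),
        }
    return results


# Synthetic stand-in data (assumption): 300 points in two dimensions.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)
results = evaluate_k_range(X, k_values=range(1, 9))

# Pick the k with the highest mean silhouette; k = 1 is excluded by definition.
best_k = max((k for k in results if k > 1), key=lambda k: results[k]["silhouette_mean"])

for k, r in results.items():
    print(k, round(r["inertia_mean"], 1), round(r["inertia_std"], 1), round(r["silhouette_mean"], 3))
print("k with highest mean silhouette:", best_k)
```

Plotting `inertia_mean` against k gives the elbow curve, while a large `inertia_std` at a given k suggests that the solution depends strongly on initialization and may be less robust.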
【 License 】
Unknown