Document Type
Article
Major
Statistics
Mentor
Prof. Haim Bar, Dept. of Statistics
Disciplines
Physical Sciences and Mathematics | Statistics and Probability
Abstract
Multidimensional scaling (MDS), in which high-dimensional data is projected onto a lower-dimensional map, is often followed by clustering on the reduced map. To examine the effect of MDS on clustering, we simulate several data structures and apply a range of clustering methods, including topological data analysis. We first cluster the data in the original, high-dimensional space, then use MDS to scale the data down to a lower dimension, cluster the scaled data, and compare the two sets of results. We found that MDS often decreases clustering performance and cannot correctly represent data structures with unique shapes or noise. The shape and noise of the data also greatly affect clustering performance, and some clustering methods performed noticeably differently across data shapes. Topological data analysis in particular had greater success in clustering data with clear structure.
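The comparison described in the abstract (cluster in the original space, reduce with MDS, cluster again, compare) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the simulated data (two Gaussian clusters in 10 dimensions), classical (Torgerson) MDS as the MDS variant, a basic k-means as the clustering method, and the Rand index as the agreement measure. The helper names `classical_mds`, `kmeans`, and `rand_index` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: two Gaussian clusters in 10 dimensions.
n_per, dim = 50, 10
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per, dim)),
    rng.normal(6.0, 1.0, size=(n_per, dim)),
])
truth = np.repeat([0, 1], n_per)


def classical_mds(X, n_components=2):
    """Classical (Torgerson) MDS on Euclidean distances between rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    n = X.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n               # centering matrix
    B = -0.5 * J @ D2 @ J                             # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                    # eigenvalues ascending
    idx = np.argsort(vals)[::-1][:n_components]       # keep the largest ones
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))


def kmeans(X, k, n_iter=50, seed=0):
    """Basic Lloyd's k-means with random initial centers drawn from X."""
    local_rng = np.random.default_rng(seed)
    centers = X[local_rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):                   # skip empty clusters
                centers[j] = X[labels == j].mean(axis=0)
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same/different)."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)                 # each pair counted once
    return float(np.mean(same_a[iu] == same_b[iu]))


# Step 1: cluster in the original high-dimensional space.
labels_high = kmeans(X, k=2)
# Step 2: scale down to two dimensions with MDS, then cluster again.
Y = classical_mds(X, n_components=2)
labels_low = kmeans(Y, k=2)
# Step 3: compare each clustering with the known simulated labels.
ri_high = rand_index(truth, labels_high)
ri_low = rand_index(truth, labels_low)
print(f"Rand index, original space: {ri_high:.3f}")
print(f"Rand index, after MDS:      {ri_low:.3f}")
```

A gap between the two scores on a given data shape is the kind of difference the study examines. Well-separated Gaussian clusters are an easy case, so this sketch is not expected to reproduce the degradation reported for data with unusual shapes or noise.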
Recommended Citation
Liu, Lucy, "Measuring the Misplacement of Data from Multidimensional Scaling" (2024). Holster Scholar Projects. 55.
https://digitalcommons.lib.uconn.edu/srhonors_holster/55