t-SNE (t-distributed Stochastic Neighbor Embedding) is a popular nonlinear dimensionality-reduction technique for visualizing high-dimensional data, introduced by van der Maaten and Hinton (2008). It has been widely adopted in machine learning and data mining because it tends to preserve local neighborhood structure when embedding data into a low-dimensional space, which helps reveal clustering patterns and geometric relationships within the data.
Despite its practical success, the geometric properties of t-SNE remain poorly understood. In particular, how distances between points in the embedded space relate to k-distances in the original space is still an open question. In this work, we investigate how k-distance relationships behave under t-SNE mappings, establish theoretical results characterizing the resulting distortions, and support our findings with comprehensive simulation studies. Our contributions provide a more rigorous foundation for understanding t-SNE’s behavior in high-dimensional data visualization.