
de Souza, in Knowledge Discovery in Big Data from Astronomy and Earth Observation, 2020

12.3.7 Dimensionality Reduction

Dimensionality reduction, or variable reduction, refers to the process of reducing the number of dimensions (features) in a dataset. It is commonly used in the analysis of high-dimensional data (e.g., multipixel images of a face, texts from an article, astronomical catalogues, etc.). Many statistical and ML methods have been applied to high-dimensional data, such as vector quantization and mixture models, generative topographic mapping (Bishop et al., 1998), and principal component analysis (PCA), to list just a few. PCA is one of the most popular algorithms used for dimensionality reduction (Pearson, 1901; Wold et al., 1987; Dunteman, 1989; Jolliffe and Cadima, 2016). It is an unsupervised learning technique, also known as the Karhunen–Loève transform, generally applied for data compression, visualization, feature extraction, and dimensionality reduction (Bishop, 2000). It is defined as the orthogonal projection of the data onto a lower-dimensional linear space (called the principal subspace) that maximizes the variance of the projected data (Hotelling, 1933). Other common methods of dimensionality reduction worth mentioning are independent component analysis (Comon, 1994), nonnegative matrix factorization (Lee and Seung, 1999), self-organizing maps (Kohonen, 1982), isomaps (Tenenbaum et al., 2000), t-distributed stochastic neighbor embedding (van der Maaten and Hinton, 2008), Uniform Manifold Approximation and Projection for Dimension Reduction (McInnes et al., 2018), and autoencoders (Vincent et al., 2008).
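The PCA definition above (orthogonal projection onto a variance-maximizing principal subspace) can be sketched in a few lines of NumPy via the singular value decomposition of the centered data matrix. This is a minimal illustrative example, not code from the chapter; the toy data set and all variable names are assumptions made for the sketch.

```python
import numpy as np

# Hypothetical toy data: 200 samples in 3 dimensions, where the third
# coordinate is nearly a linear combination of the first two, so the
# data lie close to a 2-D plane (a 2-D principal subspace).
rng = np.random.default_rng(0)
X2 = rng.normal(size=(200, 2))
X = np.column_stack([X2, X2 @ np.array([0.5, -0.3]) + 0.01 * rng.normal(size=200)])

# PCA via SVD of the centered data matrix: the rows of Vt are the
# principal axes, ordered by the variance they capture.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Orthogonal projection onto the first two principal axes gives the
# maximum-variance 2-D representation of the data.
X_reduced = X_centered @ Vt[:2].T

# Fraction of the total variance explained by each component.
explained = S**2 / np.sum(S**2)
print(X_reduced.shape)      # (200, 2)
print(explained[:2].sum())  # close to 1.0 for this near-planar data
```

Because the toy data are nearly planar, the first two components account for almost all of the variance, which is exactly the situation in which projecting onto the principal subspace loses little information.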
