Unsupervised Learning Methods

Statistical and machine learning modeling is broadly divided into two categories: supervised learning and unsupervised learning. Supervised learning methods focus on understanding the association between a set of response variables and another set of variables (features) whose values and levels are thought to effect changes in the responses. For example, the thermal inertia on Mars is considered a possible indicator of whether crater ejecta layers result from volatiles resident in the near-surface material, entrained in the atmosphere, or a combination of both. Thermal inertia may then be modeled as a function of variables such as dust cover, crater depth, and latitude, which are anticipated to effect changes in its values. Examples of supervised models include normal linear regression, logistic regression, analysis of variance (ANOVA), neural networks, and random forests.
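As a rough illustration of the supervised setting, the sketch below fits an ordinary least squares line relating one response to one feature. The numbers are synthetic and invented purely for illustration (they are not Mars measurements), and the names `dust_cover`, `thermal_inertia`, and `fit_simple_linear_regression` are hypothetical. A real analysis would use several features and a statistics library.

```python
# Synthetic, made-up values for illustration only; NOT real Mars data.
dust_cover = [0.1, 0.2, 0.3, 0.4, 0.5]                  # hypothetical feature
thermal_inertia = [290.0, 280.0, 270.0, 260.0, 250.0]   # hypothetical response


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing squared error for y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_linear_regression(dust_cover, thermal_inertia)
print(f"thermal_inertia ~ {intercept:.1f} + ({slope:.1f}) * dust_cover")
```

The key point is that the response (`thermal_inertia`) is supplied to the fitting routine, so the model learns how the response changes with the feature.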

Unsupervised learning is not concerned with a set of features effecting change on a set of responses. Rather, the intent is to identify unknown categories, reduce a large number of variables to a smaller number of combinations that describe similarities, and/or relate latent (unmeasurable or unobservable) variables to measured or observed (manifest) variables. Unsupervised learning is not concerned with prediction; it provides visualizations and data reductions that enhance understanding of the information contained within a data set. Some examples of unsupervised learning methods are t-Distributed Stochastic Neighbor Embedding (t-SNE), Principal Components Analysis (PCA), Exploratory Factor Analysis (EFA, or just FA), Multidimensional Scaling (MDS), and cluster analysis.
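To make "identifying unknown categories" concrete, here is a minimal sketch of k-means, one common form of cluster analysis, written with only the Python standard library. It is an illustrative assumption rather than a method taken from this text: the data points are made up, the function name `k_means` is hypothetical, and the starting centers are chosen by hand so the result is deterministic.

```python
import math


def k_means(points, initial_centers, max_iter=100):
    """Group points into clusters; returns (labels, centers)."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Made-up 2-D observations forming two visibly separate groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = k_means(points, initial_centers=[points[0], points[3]])
print(labels)
```

No response variable is passed to `k_means`; the group labels are discovered from the structure of the observations alone, which is the defining trait of the unsupervised setting.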

The following topics are explained and demonstrated.