#### Consortium for Mathematics and its Applications

Product ID: 99819
Supplementary Print

# Linear Algebra and Optimization in Data Analysis (UMAP)

### Author: Paul Isihara and Student Team

TARGET AUDIENCE:

Students with a mathematical background who are interested in data analysis, and instructors who could tailor this material for Python Jupyter Notebook labs in courses such as applied linear algebra, mathematics for data science, mathematical modeling, or a mathematics capstone course.

ABSTRACT:

With linear algebra and optimization as a unifying framework, this module uses mathematical concepts including least-squares solutions, loss functions, covariance matrices, eigenvalues and eigenvectors, and separating hyperplanes to explain four techniques: least-squares linear fitting, unsupervised clustering using k-means, dimensionality reduction using principal components, and binary classification of labeled data using support vector machines. To illustrate how data analysis works in practice, Python Jupyter Notebooks are used to analyze a variety of data sets connected to the city of Chicago.

1. Introduction

2. OLS Linear Fitting
2.1 Least-Squares Solutions
2.2 Minimizing the OLS Loss Function via Normal Equations

3. Unsupervised Clustering by k-Means
3.1 Clustering of Data
3.3 Optimization by Coordinate and Block Descent
3.4 k-Means Clustering by Block Descent LS Minimization
3.4.1 Proof of Block Descent’s Stepwise Minimization of J

4. Dimension Reduction
4.1 Why Variance Matters in Reducing Dimensionality
4.2 Variance and Covariance for Mean-Centered Data
4.3 Covariance Matrix
4.4 Projected Variance
4.5 Maximization of Projected Variance

5. Binary Classification Via Support Vector Machines
5.1 Linearly Classifying Binary-labeled Data
5.2 Intuition Underlying SVM
5.3 Mathematical Formalism
5.4 Separating Hyperplanes
5.5 Signed Distance
5.6 Optimization for Linearly-Separable Data
5.7 Optimization for Non-Separable Data

6. Application to Reality

7. Conclusion

8. Solution to Selected Exercises

References

Acknowledgments

UMAP Module
40 pages

#### Mathematics Topics:

• Linear Algebra

#### Application Areas:

• Data Analysis

#### Prerequisites:

Multivariable Calculus and Linear Algebra

COMAP develops curriculum resources, professional development programs, and contest opportunities that are multidisciplinary, academically rigorous, and fun for educators and students. COMAP's educational philosophy is centered around mathematical modeling: using mathematical tools to explore real-world problems.
