Support Vector Machines (SVM) are supervised learning models with associated learning algorithms that analyze data for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
The training points that lie on the margin, and thus determine its position, are called "support vectors".
A hyperplane is a subspace of one dimension less than its ambient space. If a space is 3-dimensional, then its hyperplanes are the 2-dimensional planes, while if the space is 2-dimensional, its hyperplanes are the 1-dimensional lines.
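As a quick illustrative check (the values of w and b here are arbitrary choices), in 2-D a hyperplane wT x + b = 0 is a line, and the sign of wT x + b tells us which side of that line a point falls on:

```python
import numpy as np

# In 2-D, the hyperplane w.x + b = 0 is a 1-D line.
# Illustrative choice: w = (1, -1), b = 0 is the line y = x.
w = np.array([1.0, -1.0])
b = 0.0

on_line = np.array([2.0, 2.0])  # satisfies w.x + b = 0, so it lies on the line
above = np.array([1.0, 3.0])    # w.x + b < 0: one side of the line
below = np.array([3.0, 1.0])    # w.x + b > 0: the other side

for p in (on_line, above, below):
    print(p, w @ p + b)
```

An SVM uses exactly this sign of wT x + b to assign a new point to one class or the other.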
“The goal of support vector machines (SVM) is to find an optimal hyperplane that separates the data into classes.”
In SVM, we use training data to build a model that best predicts the labels of unseen test data. There are two approaches in SVM for fitting a model to training data – Hard Margin SVM and Soft Margin SVM.
In the case of a hard margin classifier, we find "w" and "b" such that
ø(w) = 2/||w|| is maximized, subject to yi(wT xi + b) >= 1 for every training pair (xi, yi).
Maximizing 2/||w|| (the margin width) is equivalent to minimizing (1/2)wTw, which is the form the soft margin objective builds on.
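This can be sketched with scikit-learn's `SVC` on a linearly separable toy dataset (the data points are illustrative choices). `SVC` has no exact hard-margin mode, but a very large C approximates one, after which every training point satisfies yi(wT xi + b) >= 1:

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable toy data (illustrative points).
X = np.array([[1, 1], [2, 2], [2, 0],
              [-1, -1], [-2, -2], [-2, 0]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates a hard-margin classifier.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Every training point satisfies y_i (w.x_i + b) >= 1 (up to tolerance).
margins = y * (X @ w + b)
print("min functional margin:", margins.min())

# The geometric margin width is 2 / ||w||.
print("margin width:", 2 / np.linalg.norm(w))
```

The minimum of `margins` being (approximately) 1 confirms the hard-margin constraints hold for this separable data.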
Soft margin, unlike hard margin, doesn't require all the training data to be classified correctly. As a result, a soft margin classifier may misclassify some of the training data. However, on average it has comparatively higher prediction accuracy than a hard margin classifier on test data.
The concept of a "Slack Variable" εi is introduced to allow misclassification, where εi measures how far the point falls on the wrong side of its margin boundary.
In the case of a soft margin classifier, we find "w" and "b" such that
ø(w) = (1/2)wTw + C∑εi is minimized, subject to yi(wT xi + b) >= 1 − εi and εi >= 0 for every training pair (xi, yi). The slack εi is positive only for points that violate the margin.
Parameter "C" can be viewed as a way to control over-fitting.
For a given point:
If εi = 0, the point is classified correctly and lies on or outside the margin.
If 0 < εi ≤ 1, the point is classified correctly but lies between the hyperplane and the margin, on the correct side of the hyperplane. This point exhibits a margin violation.
If εi > 1, the point is misclassified: it lies on the wrong side of the hyperplane, beyond the margin.
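The slack values can be recovered from a fitted soft margin model as εi = max(0, 1 − yi(wT xi + b)). A minimal sketch using scikit-learn's `SVC` (the data points and choice of C = 1 are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

# Toy data; the point (0.5, 0.5) sits close to the other class so that
# some slack is likely non-zero (illustrative placement).
X = np.array([[2, 2], [3, 1], [1, 3], [0.5, 0.5],
              [-2, -2], [-3, -1], [-1, -3]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Slack for each point: eps_i = max(0, 1 - y_i (w.x_i + b))
eps = np.maximum(0.0, 1.0 - y * (X @ w + b))
for e in eps:
    if e == 0:
        status = "on or outside the margin (correct)"
    elif e <= 1:
        status = "inside the margin (correct, margin violation)"
    else:
        status = "misclassified"
    print(round(e, 3), status)
```

The three branches of the `if` mirror the three εi cases listed above.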
C is a regularization parameter that controls the margin as follows:
A small value of C implies that the model is more tolerant of margin violations and hence has a larger margin.
A large value of C makes the constraints hard to ignore, and hence the model has a smaller margin.
As C approaches infinity, all the constraints are enforced, and the SVM model becomes a hard-margin classifier.
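The effect of C on the margin can be seen by fitting the same data with a small and a large C and comparing the margin width 2/||w|| (the synthetic two-blob data below is an illustrative choice):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two slightly overlapping Gaussian blobs (illustrative synthetic data).
X = np.vstack([rng.normal(2.0, 1.0, size=(50, 2)),
               rng.normal(-2.0, 1.0, size=(50, 2))])
y = np.array([1] * 50 + [-1] * 50)

widths = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    widths[C] = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C={C}: margin width = {widths[C]:.3f}")

# A small C yields a wider (more tolerant) margin;
# a large C yields a narrower margin, approaching hard margin behavior.
```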