Classification
Here we aim to understand how "Logistic Regression", a method that appears to be used only for classification problems, is in fact a regression algorithm.
As with the preceding pages, we will detail closed-form and approximate solutions to the loss function. We will also see how this type of regression remains a member of the GLM (generalised linear models) family, and we will derive the loss function by assuming our data is Bernoulli distributed.
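As a preview of that derivation (the notation here is mine, ahead of the full treatment): if each label $y_i \in \{0,1\}$ is assumed Bernoulli with success probability $\sigma(w^\top x_i)$, where $\sigma(z) = 1/(1+e^{-z})$, then the negative log-likelihood is the familiar cross-entropy loss

$$\mathcal{L}(w) = -\sum_{i=1}^{n} \left[ y_i \log \sigma(w^\top x_i) + (1 - y_i) \log\big(1 - \sigma(w^\top x_i)\big) \right].$$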
Origins
The perceptron learning algorithm is the simplest algorithm we have for binary classification.
It was introduced by Frank Rosenblatt in his seminal 1958 paper, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain". The history, however, dates back further to the theoretical foundations laid by Warren McCulloch and Walter Pitts in their 1943 paper "A Logical Calculus of the Ideas Immanent in Nervous Activity". The interested reader may visit these links for annotations and the original PDFs.
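To make the algorithm concrete, here is a minimal sketch of the perceptron learning rule (the function and variable names are mine, not from the papers above):

import numpy as np

def perceptron(X, y, epochs=100, lr=1.0):
    # Binary labels are assumed to be in {-1, +1}
    w = np.zeros(X.shape[1])  # weight vector
    b = 0.0                   # bias term
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # a non-positive signed margin means xi is misclassified
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi  # nudge the boundary towards xi
                b += lr * yi
    return w, b

If the data is linearly separable, this update is guaranteed to converge after finitely many mistakes (the perceptron convergence theorem).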
Embedded Notebook
History
Abstract
Fashion-MNIST is a modern drop-in replacement for MNIST. Released by Zalando Research in 2017, it packs 70,000 small grayscale images of apparel (sneakers, shirts, coats) into a lightweight benchmark. Its familiar format keeps setup trivial, while the richer visuals pose a tougher challenge.
Origins
Zalando’s quality-control cameras captured millions of 96 × 96 product shots. Han Xiao et al. down-sampled these to 28 × 28, grouped them into ten balanced classes, and open-sourced the result. The idea: upgrade MNIST's difficulty without touching loaders or evaluation scripts.
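Because the format mirrors MNIST exactly (60,000 training and 10,000 test images, 28 × 28 grayscale, ten classes), existing loaders work unchanged. A minimal loading sketch, assuming TensorFlow/Keras is installed:

from tensorflow.keras.datasets import fashion_mnist

# Same call signature as the classic MNIST loader
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()
print(X_train.shape, X_test.shape)  # (60000, 28, 28) (10000, 28, 28)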
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
iris = datasets.load_iris()
#X = iris.data[:, :2]
"""results:
SVC with linear kernel Accuracy: 0.80
LinearSVC (linear kernel) Accuracy: 0.78
SVC with RBF kernel Accuracy: 0.80
SVC with polynomial (degree 3) Accuracy: 0.78
SVC with Monster kernel Accuracy: 0.82
"""
X = iris.data[:, :3]
"""results:
SVC with linear kernel Accuracy: 1.00
LinearSVC (linear kernel) Accuracy: 0.98
SVC with RBF kernel Accuracy: 1.00
SVC with polynomial (degree 3) Accuracy: 0.96
SVC with Monster kernel Accuracy: 0.91
"""
#X = iris.data
#1.00 accuracy on all methods
y = iris.target
# train / test split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# random number generator
rng = np.random.RandomState(42)
D = 196883 # dimension of the smallest faithful representation of the Monster group, hence the kernel's name
W = rng.randn(X.shape[1], D) # random projection matrix of shape (n_features, D)
def monster_kernel(X1, X2): # computes pairwise inner products of all feature vectors
    X1_proj = np.dot(X1, W) # projects the 2, 3 or 4 features into 196,883 dimensions
    X2_proj = np.dot(X2, W) # same projection applied to the second set
    return np.dot(X1_proj, X2_proj.T) # returns the Gram matrix
# Regularization parameter
C = 1.0
# Define models
models = [
    # one-vs-one classifier, dual problem formulation; slower
    ("SVC with linear kernel", svm.SVC(kernel="linear", C=C)),
    # one-vs-rest, primal formulation; faster
    ("LinearSVC (linear kernel)", svm.LinearSVC(C=C, max_iter=10000)),
    ("SVC with RBF kernel", svm.SVC(kernel="rbf", gamma=0.7, C=C)),
    ("SVC with polynomial (degree 3)", svm.SVC(kernel="poly", degree=3, gamma="auto", C=C)),
    ("SVC with Monster kernel", svm.SVC(kernel=monster_kernel, C=C)),
]
# Train, predict, and print accuracy
print("Classification Accuracy:\n")
for name, clf in models:
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    print(f"{name:<40} Accuracy: {acc:.2f}")
An Embedded Notebook
History
Abstract
The MNIST dataset (Modified National Institute of Standards and Technology) has been very influential in machine learning and computer vision. It is an accessible and popular dataset that has been used as a benchmark for machine learning models since its inception in 1998. Historically, it advanced the evolution of OCR (Optical Character Recognition) and assisted in the emergence of neural networks.
This page is about finding a classifier for the KMNIST dataset, which is more challenging than the original MNIST dataset that I tackled previously.
The details of the dataset can be found in the associated paper.
In short, since the reform of the Japanese education system in 1868, the written script became standardised, and today most Japanese people cannot read texts from 150 years ago.
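As a starting point for that task, a minimal loading sketch, assuming torchvision (which ships a KMNIST loader) is installed:

from torchvision import datasets, transforms

# Downloads the 10-class Kuzushiji-MNIST training split (60,000 images, 28 x 28 grayscale)
train_set = datasets.KMNIST(root="data", train=True, download=True,
                            transform=transforms.ToTensor())
image, label = train_set[0]
print(image.shape, label)  # torch.Size([1, 28, 28]) and an integer class index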