MNIST

This repo is an experiment in non-ML approaches to building a digit classifier for the MNIST dataset. The aim is to better understand the advantages of Stochastic Gradient Descent and to practice PyTorch and tensor manipulation.

The inspiration for this project was taken from the fast.ai course.

Challenge

Classify the digits in the MNIST dataset as accurately as possible using a non-ML approach.

I started out with the clue that images can be represented as light/dark matrices.

Tackling the Problem

Approach 1: Platonic

This approach can be found in 01_identify_platonic.py.

My first approach was to 'draw' some numbers in matrix form and then do a simple comparison between my ideal (Platonic) digits and the images. The digits I drew resembled those on the LCD screen of a digital alarm clock. For example, here is 3:

[
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
],
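The comparison step could look something like the sketch below. It is not the code from 01_identify_platonic.py. It assumes the 7x7 template is stretched to the 28x28 MNIST image size by repeating each cell 4 times in each direction, that pixel values are scaled to the range 0 to 1, and that the "simple comparison" is a mean absolute difference. The names `upscale`, `classify`, and the `ONE` template are hypothetical and used only for illustration.

```python
import torch

# The 7x7 template for the digit 3, copied from the matrix above.
THREE = torch.tensor([
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
], dtype=torch.float32)

# Hypothetical template for the digit 1: a vertical bar down the middle.
ONE = torch.zeros(7, 7)
ONE[:, 3] = 1.0


def upscale(template, factor=4):
    """Stretch a 7x7 template to 28x28 by repeating every cell factor times."""
    rows = template.repeat_interleave(factor, dim=0)
    return rows.repeat_interleave(factor, dim=1)


def classify(image, templates):
    """Return the label whose upscaled template is closest to the image.

    image: 28x28 float tensor with values in [0, 1].
    templates: dict mapping a digit label to its 7x7 template.
    Distance is the mean absolute difference (an assumption).
    """
    distances = {
        label: (image - upscale(template)).abs().mean().item()
        for label, template in templates.items()
    }
    return min(distances, key=distances.get)


templates = {1: ONE, 3: THREE}
print(classify(upscale(THREE), templates))  # 3
```

A perfect copy of a template is trivially classified correctly. Real handwritten digits sit much further from every template, which is where this approach struggles.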

Approach 2: Averaging (cheating a bit)

This approach can be found in 02_identify_cheat.py.

When Approach 1 did not work very well, I peeked at the high-level solution in the fast.ai book and attempted to implement that. This approach involved averaging over the training data to create the ideal digits, rather than trying to hand-draw them.

It was interesting to me that, although this is certainly not ML, it is still statistical in nature.
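A minimal sketch of the averaging idea follows. It is not the code from 02_identify_cheat.py. It assumes the images arrive as a float tensor of shape (N, 28, 28) with integer labels, and it uses mean absolute difference as the distance, which is one reasonable choice rather than a confirmed detail of the script. The function names are hypothetical.

```python
import torch


def mean_digits(images, labels, num_classes=10):
    """Average all training images of each class into one 'ideal' digit.

    images: (N, 28, 28) float tensor; labels: (N,) integer tensor.
    Returns a (num_classes, 28, 28) tensor. A class with no examples
    would produce NaNs, so every class is assumed to be present.
    """
    return torch.stack(
        [images[labels == d].mean(dim=0) for d in range(num_classes)]
    )


def predict(images, means):
    """Assign each image to the class whose mean image is closest."""
    # Broadcasting: (N, 1, 28, 28) - (1, C, 28, 28) -> (N, C, 28, 28)
    diffs = (images[:, None] - means[None]).abs()
    distances = diffs.mean(dim=(-2, -1))  # (N, C)
    return distances.argmin(dim=1)


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return (preds == labels).float().mean().item()


# Synthetic stand-in for MNIST: 3 random 'prototypes' plus a little noise.
torch.manual_seed(0)
protos = torch.rand(3, 28, 28)
labels = torch.arange(3).repeat(20)
images = protos[labels] + 0.05 * torch.randn(60, 28, 28)

means = mean_digits(images, labels, num_classes=3)
print(accuracy(predict(images, means), labels))
```

Nothing here is fitted by gradient descent: the "model" is just ten averaged images, which is why the approach is statistical without being ML in the usual sense.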

Results

Approach 1: 0.1477 accuracy (about 14.8%)
Approach 2: 0.7118 accuracy (about 71.2%)

Approach 1 did not go very well. Upon inspecting the training data further, it seems that the hand-drawn characters simply did not match the LCD-like digits I had created. The real data was often clumped in the centre and curvy. My data was often around the edge and at right angles. The two exceptions to this were '1', which was a straight line down the middle, and '0', which had more of a curve (and also more 'mass'). These two digits were the only ones my script ever predicted. I experimented with modifying the digits to match but did not manage to obtain significant improvements.
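One way to spot this kind of collapse is to count how often each digit is predicted. The sketch below uses `torch.bincount` on a made-up prediction tensor. The values are illustrative only and are not results from this repo, and `prediction_counts` is a hypothetical helper name.

```python
import torch


def prediction_counts(preds, num_classes=10):
    """Count how many times each digit 0-9 appears in the predictions."""
    return torch.bincount(preds, minlength=num_classes)


# Made-up predictions in which only 0 and 1 ever appear.
preds = torch.tensor([0, 1, 1, 0, 1, 1, 0, 1])
print(prediction_counts(preds).tolist())  # [3, 5, 0, 0, 0, 0, 0, 0, 0, 0]
```

A row of zeros for digits 2 to 9 shows immediately that the classifier never chooses them, regardless of the input.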
