
nn

Neural network training and inference using only NumPy, vectorized.

  • AKA micrograd but w/ matrices
  • AKA a transparent torch.Tensor implementation (not at all faithful to PyTorch internals), assuming broadcasting and dot products are given as primitive operations.

Tips for others who want to re-implement backprop

  • Closely follow this extremely good video (you can probably skip the last 40 minutes). I first learned backprop through the math and didn't really appreciate its elegance. That's because reverse-mode autodiff is much more fruitfully thought about in terms of code: point to the object now, and update it later, once you know the one other thing you need to know (see the closure sketch after this list). It's also always useful to draw the computation graph.
  • Take broadcasting for granted until you can't anymore. The backward pass is where you can't: gradients have to be summed back over the axes that broadcasting stretched (see the un-broadcasting sketch after this list).
    • Because of this project, I moved broadcasting up to #1 in the top 5 algorithms I take for granted.
  • If you're having trouble thinking about the gradient of the dot product of matrices/vectors, start with a vector-vector dot product, then matrix-vector, then matrix-matrix (see the dot-product sketch after this list). Here's a sort of answer key. I'll write a different one describing how I thought about condensing derivatives into the right vector/matrix.
    • Maybe I'm doing something wrong, but I found that I had to fight NumPy's dot product a bit. It treats 1-D vectors pretty strictly.
  • For slicing/reshape operations, just directly code the output of the chain rule instead of relying on correct multiplication operations. It's computationally a bit better, and it's easier to code and think through: you're just passing the gradient on to the right place / directing traffic (see the slicing sketch after this list).
  • Just copy PyTorch's interface. It's a great interface, and it will make testing much easier (see the testing sketch after this list).
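
To make the "point to the object now, update it later" idea concrete, here's a minimal closure sketch of a single op (illustrative only; the names _parents and _backward are my own, not necessarily what this repo uses):

import numpy as np

class Tensor:
    def __init__(self, data, _parents=()):
        self._data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self._data)
        self._parents = _parents       # the objects we point to now
        self._backward = lambda: None  # the update we run later

    def relu(self):
        out = Tensor(np.maximum(self._data, 0), _parents=(self,))

        def _backward():
            # the "one other thing you need to know" is out.grad,
            # which is known by the time this closure runs
            self.grad += (self._data > 0) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # reverse topological order over the computation graph
        order, seen = [], set()

        def visit(t):
            if id(t) not in seen:
                seen.add(id(t))
                for parent in t._parents:
                    visit(parent)
                order.append(t)

        visit(self)
        self.grad = np.ones_like(self._data)
        for tensor in reversed(order):
            tensor._backward()

x = Tensor([-1.0, 2.0])
x.relu().backward()
print(x.grad)  # [0. 1.]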
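
The place you can't take broadcasting for granted anymore is the backward pass: if the forward pass broadcast something (say, a bias of shape (d,) added to activations of shape (n, d)), the backward pass has to sum the upstream gradient over the broadcast axes so the shapes line up again. A small sketch of the idea:

import numpy as np

H = np.zeros((100, 20))          # activations: (num_observations, hidden_size)
b = np.zeros(20)                 # bias, broadcast across rows in H + b
out = H + b

grad_out = np.ones_like(out)     # upstream gradient, shape (100, 20)

# backward: collapse the axis that broadcasting stretched
grad_b = grad_out.sum(axis=0)    # shape (20,), matches b
grad_H = grad_out                # H wasn't broadcast, nothing to collapse
print(grad_b.shape, grad_H.shape)   # (20,) (100, 20)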
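
For the matrix-matrix case, the gradients work out to: if C = A @ B and G = dL/dC, then dL/dA = G @ B.T and dL/dB = A.T @ G. A quick finite-difference sanity check of the dL/dA formula (just NumPy, nothing from this repo):

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))
B = rng.normal(size=(3, 5))
G = rng.normal(size=(4, 5))                # upstream gradient dL/dC, where C = A @ B

grad_A = G @ B.T                           # dL/dA, shape (4, 3)
grad_B = A.T @ G                           # dL/dB, shape (3, 5)

# check one entry of grad_A numerically; L = sum(C * G) has dL/dC = G exactly
loss = lambda A_: np.sum((A_ @ B) * G)
eps = 1e-6
A_pert = A.copy()
A_pert[1, 2] += eps
numeric = (loss(A_pert) - loss(A)) / eps
print(np.isclose(numeric, grad_A[1, 2], atol=1e-4))   # True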
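
For example, the backward pass of a slice just scatters the upstream gradient back into a zero array at the same indices; no multiplication needed (a sketch in plain NumPy):

import numpy as np

x = np.arange(10.0)
idx = slice(2, 5)
y = x[idx]                      # forward: a slice

grad_y = np.ones_like(y)        # upstream gradient for y
grad_x = np.zeros_like(x)       # backward: just direct traffic
grad_x[idx] += grad_y           # route the gradient back to where it came from
print(grad_x)                   # [0. 0. 1. 1. 1. 0. 0. 0. 0. 0.]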
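
For instance, a test can type out the same few lines in nn and in torch and compare gradients (a sketch; it assumes nn.Tensor exposes a .grad array and that nn's cross_entropy uses the same mean reduction as torch.nn.functional.cross_entropy):

import numpy as np
import torch
import nn

rng = np.random.default_rng(0)
X_np = rng.normal(size=(8, 3))
W_np = rng.normal(size=(3, 4))
y_np = rng.integers(0, 4, size=8)

# nn version
X, W = nn.Tensor(X_np), nn.Tensor(W_np)
nn_loss = (X @ W).cross_entropy(nn.Tensor(y_np))
nn_loss.backward()

# the same computation, typed out in torch
X_t = torch.tensor(X_np, requires_grad=True)
W_t = torch.tensor(W_np, requires_grad=True)
torch_loss = torch.nn.functional.cross_entropy(X_t @ W_t, torch.tensor(y_np))
torch_loss.backward()

print(np.allclose(W.grad, W_t.grad.numpy()))   # should print True if the reductions match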

Usage

Here's an NN for classification with 1 hidden layer:

import numpy as np
import nn

# input data parameters
num_observations = 100
input_dim = 10
num_classes = 3
rng_seed = abs(hash("waddup"))

# simulate input data
rng = np.random.default_rng(rng_seed)
y = nn.Tensor(rng.integers(0, num_classes, size=num_observations))
X = nn.Tensor(rng.normal(size=(num_observations, input_dim)))

# weights
hidden_size = 20
W1 = nn.Tensor(rng.normal(size=(input_dim, hidden_size)))
W2 = nn.Tensor(rng.normal(size=(hidden_size, num_classes)))

# forward pass
H1 = X @ W1
H1_relu = H1.relu()
H2 = H1_relu @ W2
loss = H2.cross_entropy(y)

# backward pass
loss.backward()
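
After backward(), gradients are available on each Tensor, so a plain gradient step can be written by hand (a minimal sketch; it leans on the .grad and ._data attributes mentioned in the Todo section, since an SGD optimizer isn't implemented yet):

# hypothetical manual SGD step; adjust attribute names to the actual Tensor API
learning_rate = 0.01
for W in (W1, W2):
    W._data -= learning_rate * W.grad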

Installation

python -m pip install git+https://github.com/kddubey/nn.git

Todo

  • actually support tensors, not just matrices lol
  • tests
    • explicitly check that (tensor._data - tensor.grad).shape == tensor.shape
    • clever way to test nn code just by typing out torch code
  • mimic torch SGD optimizer
  • basic NN framework
  • requires_grad / freezing functionality
  • don't retain grads for non-leaf tensors
