Micrograd overview - DL Notes

Main Reference

The following notes reproduce Andrej Karpathy’s lecture "The spelled-out intro to neural networks and backpropagation" step by step, adapting the original flow into a sequence of Markdown notes. The implementation is based on the official Micrograd repository.

More Than a Lecture Transcript

The explanations are expanded beyond the original lecture flow: each idea is unpacked gradually and in depth, with the goal of supporting a real, deep understanding of Micrograd, backpropagation, and automatic differentiation.

Pipeline

The pipeline is:

recap derivatives and their meaning as local sensitivities;

build the Value object, the core scalar object of Micrograd: it wraps a number, stores its gradient, and remembers which previous values and operation produced it;

visualize computation graphs built from Value objects;

perform manual backpropagation through simple expressions and a neuron;

automate the backward pass;

compare the result with PyTorch;

use Micrograd to define and train a small MLP with a loss function and manual gradient descent updates.
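The first steps of the pipeline can be previewed with a minimal sketch. It is a simplified stand-in for the Value object described above, not the final implementation developed in the notes: it supports only addition and multiplication. The attribute names `_prev`, `_op`, and `_backward`, the helper `f`, and the step size `h` are assumptions chosen for illustration. The sketch compares a finite-difference estimate of a derivative (the "local sensitivity") with the gradient obtained by a backward pass.

```python
class Value:
    """Simplified scalar node (illustrative sketch, only + and * are supported)."""

    def __init__(self, data, _children=(), _op=""):
        self.data = data                # the wrapped number
        self.grad = 0.0                 # d(output)/d(this value), filled in by backward()
        self._prev = set(_children)     # the values this one was computed from
        self._op = _op                  # the operation that produced this value
        self._backward = lambda: None   # local chain-rule step, set by each operation

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other), "+")

        def _backward():
            # Addition passes the output gradient through unchanged.
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other), "*")

        def _backward():
            # Each factor receives the other factor times the output gradient.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so every node is processed after all nodes that depend on it.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()


def f(a, b, c):
    """Plain-number version of the same expression, used for the numeric check."""
    return a * b + c


# Gradient from the backward pass.
a, b, c = Value(2.0), Value(-3.0), Value(10.0)
d = a * b + c
d.backward()

# Finite-difference estimate of how much d changes when a is nudged by h.
h = 1e-6
numeric_da = (f(2.0 + h, -3.0, 10.0) - f(2.0, -3.0, 10.0)) / h

print("d.data =", d.data)
print("a.grad =", a.grad, "| numeric estimate =", numeric_da)
```

The two numbers for `a` should agree up to floating-point error, which is the same kind of check the notes later perform against PyTorch.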

Didactic Approach

Understanding First

Micrograd is developed as a small scalar-valued autodiff engine, and a toy multi-layer perceptron (MLP) is then built on top of it. Both are introduced through a sequence of progressively refined versions. A deliberately didactic style is adopted: the same code is sometimes repeated multiple times, with small incremental changes introduced from one version to the next.

This is not the most efficient way to write or present code, but it makes the reasoning behind each step visible. Intermediate solutions, partial implementations, and implementation obstacles are kept in the notes because they are part of the learning process, not just noise before the final implementation.

How to Use These Notes

Reading Mode

These notes can be used in two ways:

read them as standalone Markdown pages, following the explanations without executing every code block;

recreate the original notebook-style workflow by following the notes in order and running each code block step by step in a single continuous Python or Jupyter session.

Continuity Across Notes

These Markdown notes are the distributed version of a single Jupyter notebook, whose code cells and explanatory Markdown comments have been split into multiple pages. For this reason, code snippets may use variables, functions, or classes introduced in previous notes. Readers should keep the sequence in mind; when reproducing the code, blocks should be executed sequentially in the same Python or Jupyter session.

Reproducing the Code

The original code was run in a Jupyter notebook using a custom Conda environment as its kernel. Conda was installed through Miniforge rather than the full Anaconda distribution, since it provides a smaller Conda-based setup centered around the conda-forge ecosystem.

Why Conda?

Micrograd itself does not require Conda. A Conda environment was used for the surrounding tooling, especially for managing non-Python dependencies such as Graphviz. Another valid option would be to recreate the notebook-style workflow in Google Colab, avoiding most local environment setup.
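Because Graphviz is a system-level program rather than a pure Python package, a missing installation usually shows up only when a graph is rendered. The following sketch checks for it up front. It assumes that graphs are rendered through Graphviz's `dot` executable and that this executable is expected on the `PATH`; the helper name `find_graphviz_dot` is hypothetical.

```python
import shutil


def find_graphviz_dot():
    """Return the path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_graphviz_dot()
if dot_path is None:
    print("Graphviz `dot` was not found on PATH; graph rendering will likely fail.")
else:
    print(f"Graphviz `dot` found at: {dot_path}")
```

Running this once inside the activated environment shows whether the non-Python part of the setup is in place before any visualization code is executed.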

About Conda Environment Setup

A full Conda environment setup guide is intentionally not included. The environment setup is left to the reader as part of the practical learning process: finding packages, resolving version conflicts, and working through small installation obstacles are useful skills in themselves.

One relevant example is PyTorch: official PyTorch Conda binaries are no longer published for the newest releases, so a Conda-based setup may require choosing an older official build, using pip inside the Conda environment, or relying on community-maintained packages.

Required Packages and Imports

When recreating the notebook-style workflow, it is preferable to place the required imports in a dedicated initial cell:

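# Standard library
import math
import random

# Third-party packages
import numpy as np
import torch
import matplotlib.pyplot as plt

# IPython/Jupyter magic: render plots inline (not valid in a plain .py script)
%matplotlib inline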
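Before running the import cell, it can help to confirm that the third-party packages are actually installed in the active environment. The sketch below is one possible check, not part of the original notebook: the list `REQUIRED` and the helper `missing_packages` are hypothetical names, and the check only tests whether each package can be located, without importing it.

```python
import importlib.util

# Third-party packages used by the import cell (assumed from the imports above).
REQUIRED = ["numpy", "torch", "matplotlib"]


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Any name reported as missing has to be installed in the environment used as the notebook kernel, not merely somewhere on the machine.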