# πŸ”₯ smoltorch β€’ [blog](https://blog.ifkash.dev/smoltorch)
**A tiny autograd engine and neural network library built from first principles**

[![PyPI version](https://badge.fury.io/py/smoltorch.svg)](https://badge.fury.io/py/smoltorch) [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

*Inspired by Andrej Karpathy's micrograd, built for learning*
---

## 🎯 What is smoltorch?

smoltorch is a minimalist deep learning library that implements automatic differentiation (autograd) and neural networks from scratch using only NumPy. It's designed to be:

- **Educational**: Understand how modern deep learning frameworks work under the hood
- **Transparent**: Every operation is visible and understandable
- **Functional**: Train real models on real datasets with competitive performance
- **Minimal**: ~500 lines of readable, well-documented Python code

### Why "smoltorch"?

"Smol" + PyTorch. It's a tiny implementation that captures the essence of modern deep learning frameworks.

---

## ✨ Features

### Core Engine

- βœ… **Automatic differentiation** with dynamic computational graphs
- βœ… **NumPy-backed tensors** for efficient numerical computing
- βœ… **Broadcasting support** with proper gradient handling
- βœ… **Topological sorting** for correct backpropagation

### Operations

- **Arithmetic**: `+`, `-`, `*`, `/`, `**`
- **Matrix operations**: `@` (matmul)
- **Activations**: ReLU, tanh, sigmoid
- **Reductions**: sum, mean
- **Element-wise**: log

### Neural Networks

- **Layers**: Linear (fully connected)
- **Models**: Multi-layer perceptron (MLP)
- **Loss functions**: MSE, Binary Cross-Entropy
- **Optimizers**: SGD (Stochastic Gradient Descent)

---

## πŸ“¦ Installation

### From PyPI (recommended)

```bash
uv add smoltorch
```

### From source

```bash
git clone https://github.com/kashifulhaque/smoltorch.git
cd smoltorch
uv pip install -e .
```

### Development installation

```bash
uv pip install -e ".[dev]"
```

---

## πŸš€ Quick Start

### Basic Tensor Operations

```python
from smoltorch import Tensor

# Create tensors
x = Tensor([1.0, 2.0, 3.0])
y = Tensor([4.0, 5.0, 6.0])

# Operations
z = x + y    # Element-wise addition
w = x * y    # Element-wise multiplication
a = x @ y.T  # Matrix multiplication

# Backward pass
a.backward()
print(x.grad)  # Gradients computed automatically!
```
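In the example above, `a = x @ y.T` is just the dot product of `x` and `y`, so calculus gives `βˆ‚a/βˆ‚x = y`. A quick, illustrative sanity check, assuming `.grad` comes back as a plain NumPy array (consistent with the NumPy-backed design):

```python
import numpy as np

# a = x Β· y, so the gradient of a with respect to x is simply y
print(np.allclose(x.grad, [4.0, 5.0, 6.0]))  # expected: True
```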
### Training a Neural Network (Regression)

```python
from smoltorch import Tensor, MLP, SGD
from sklearn.datasets import make_regression
import numpy as np

# Generate data
X, y = make_regression(n_samples=100, n_features=5, noise=10)
y = y.reshape(-1, 1)

# Create model
model = MLP([5, 16, 16, 1])  # 5 inputs -> 16 -> 16 -> 1 output
optimizer = SGD(model.parameters(), lr=0.001)

# Training loop
for epoch in range(100):
    # Forward pass
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)
    y_pred = model(X_tensor)

    # Compute loss (MSE)
    loss = ((y_pred - y_tensor) ** 2).mean()

    # Backward pass
    optimizer.zero_grad()
    loss.backward()

    # Update weights
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}")
```

### Binary Classification

```python
from smoltorch import Tensor, MLP, SGD, binary_cross_entropy
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

# Load and preprocess data
data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Create classifier with sigmoid output
class BinaryClassifier(MLP):
    def __call__(self, x):
        x = super().__call__(x)
        return x.sigmoid()  # Output probabilities

model = BinaryClassifier([30, 16, 8, 1])
optimizer = SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(200):
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)

    y_pred = model(X_tensor)
    loss = binary_cross_entropy(y_pred, y_tensor)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 20 == 0:
        accuracy = ((y_pred.data > 0.5) == y).mean()
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}, Acc: {accuracy:.4f}")

# Result: ~96% accuracy on the breast cancer dataset! πŸŽ‰
```

---

## πŸ“Š Real-World Performance

smoltorch achieves competitive results on standard benchmarks:

| Dataset | Task | Result | Epochs |
|---------|------|--------|--------|
| Breast Cancer | Binary Classification | 96.5% test accuracy | 200 |
| Synthetic Regression | Regression | MSE: 95.7 | 100 |

---

## πŸ—οΈ Architecture

### Computational Graph

smoltorch builds a dynamic computational graph during the forward pass:

```python
x = Tensor([2.0])
y = Tensor([3.0])
z = (x * y) + (x ** 2)  # Graph: z -> [+] -> [*, **] -> [x, y]

z.backward()   # Backpropagate through graph
print(x.grad)  # dz/dx = y + 2x = 3 + 4 = 7.0
```

### How Autograd Works

1. **Forward pass**: Build computational graph with operations as nodes
2. **Topological sort**: Order nodes for correct gradient flow
3. **Backward pass**: Apply chain rule in reverse topological order
4. **Gradient accumulation**: Sum gradients from multiple paths
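To make these four steps concrete, here is a stripped-down, scalar-only sketch in the spirit of micrograd. It is illustrative only and not smoltorch's actual `Tensor` implementation; the real engine applies the same idea to NumPy arrays with broadcasting:

```python
class Value:
    """Minimal scalar autograd node (illustrative sketch, not smoltorch's Tensor)."""

    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents        # nodes this value was computed from
        self._backward = lambda: None   # local chain-rule step, set by each op

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad       # d(out)/d(self) = 1
            other.grad += out.grad      # gradients from multiple paths accumulate
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Step 2: topological sort of the graph built during the forward pass
        order, visited = [], set()
        def visit(node):
            if node not in visited:
                visited.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)

        # Step 3: apply the chain rule in reverse topological order
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, y = Value(2.0), Value(3.0)
z = x * y + x * x   # same graph as the computational-graph example above
z.backward()
print(x.grad)       # 7.0 = y + 2x
print(y.grad)       # 2.0 = x
```

The key ingredients are the `_parents` links recorded during the forward pass, the depth-first topological sort, and the per-node `_backward` closures that apply the local chain rule while accumulating gradients across paths.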
Example with broadcasting:

```python
x = Tensor([[1, 2, 3]])  # shape (1, 3)
y = Tensor([[1], [2]])   # shape (2, 1)
z = x + y                # shape (2, 3) - broadcasting!

z.backward()
# x.grad sums over broadcast dimensions: shape (1, 3)
# y.grad sums over broadcast dimensions: shape (2, 1)
```

---

## 🧠 Supported Operations

### Element-wise Operations

```python
z = x + y   # Addition with broadcasting
z = x - y   # Subtraction
z = x * y   # Multiplication
z = x / y   # Division
z = x ** 2  # Power
```

### Matrix Operations

```python
z = x @ y  # Matrix multiplication (with batch support)
```

### Activation Functions

```python
z = x.relu()     # ReLU: max(0, x)
z = x.tanh()     # Tanh: (e^(2x) - 1) / (e^(2x) + 1)
z = x.sigmoid()  # Sigmoid: 1 / (1 + e^(-x))
```

### Reductions

```python
z = x.sum()         # Sum all elements
z = x.sum(axis=0)   # Sum along axis
z = x.mean()        # Mean of all elements
z = x.mean(axis=1)  # Mean along axis
```

### Other

```python
z = x.log()  # Natural logarithm
```
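Each operation also carries its derivative for the backward pass, for example σ′(x) = Οƒ(x)(1 βˆ’ Οƒ(x)) for sigmoid and 1 βˆ’ tanhΒ²(x) for tanh. A quick, hypothetical sanity check against the closed form, assuming `.data` and `.grad` are plain NumPy arrays as in the examples above:

```python
import numpy as np
from smoltorch import Tensor

x = Tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
s = x.sigmoid()
s.sum().backward()  # reduce to a scalar, then backprop

# d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
print(np.allclose(x.grad, s.data * (1 - s.data)))  # expected: True
```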
---

## πŸ“š Examples

Check out the `examples/` directory:

- [`train_regression.py`](examples/train_regression.py) - Train on synthetic regression data
- [`train_classification.py`](examples/train_classification.py) - Binary classification on breast cancer dataset

Run them:

```bash
uv run examples/train_regression.py
uv run examples/train_classification.py
```

---

## πŸ§ͺ Testing

Run the test suite:

```bash
uv run pytest
```

Tests cover:

- βœ… Addition with broadcasting
- βœ… Multiplication with broadcasting
- βœ… Matrix multiplication
- βœ… Activation functions (ReLU, tanh, sigmoid)
- βœ… Reductions (sum, mean)
- βœ… Linear layers
- βœ… Multi-layer perceptrons
- βœ… End-to-end training

---

## πŸ—ΊοΈ Roadmap

### Coming Soon

- [ ] **More optimizers**: Adam, RMSprop with momentum
- [ ] **More activations**: Leaky ReLU, ELU, Softmax
- [ ] **Regularization**: Dropout, L2 weight decay
- [ ] **Mini-batch training**: Efficient batch processing
- [ ] **Multi-class classification**: Softmax + Cross-Entropy loss

### Future

- [ ] **Convolutional layers**: CNN support for images
- [ ] **Model serialization**: Save/load weights in safetensors format
- [ ] **GPU acceleration**: Explore Metal Performance Shaders for Apple Silicon
- [ ] **Better initialization**: He initialization for ReLU networks
- [ ] **Learning rate scheduling**: Decay strategies

---

## πŸŽ“ Learning Resources

If you're learning from smoltorch, these resources complement it well:

- [Andrej Karpathy's micrograd](https://github.com/karpathy/micrograd) - The original inspiration
- [Neural Networks: Zero to Hero](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) - Video series by Andrej Karpathy
- [The Matrix Calculus You Need For Deep Learning](https://arxiv.org/abs/1802.01528) - Paper on backpropagation math

---

## 🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes:

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## πŸ“„ License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## πŸ™ Acknowledgments

- **Andrej Karpathy** for [micrograd](https://github.com/karpathy/micrograd) and the brilliant educational content
- **PyTorch team** for API design inspiration
- The deep learning community for making knowledge accessible

---

## πŸ“¬ Contact

Created by Kashif - feel free to reach out!

- GitHub: [@kashifulhaque](https://github.com/kashifulhaque)
- Twitter: [@notifkash](https://twitter.com/notifkash)

---

**⭐ Star this repo if you found it helpful!**

Built with ❀️ for learners and tinkerers