# 🔥 smoltorch • [blog](https://blog.ifkash.dev/smoltorch)
<div align="center">

**A tiny autograd engine and neural network library built from first principles**

[PyPI](https://badge.fury.io/py/smoltorch)
[Python](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)

*Inspired by Andrej Karpathy's micrograd, built for learning*

</div>

---

## 🎯 What is smoltorch?

smoltorch is a minimalist deep learning library that implements automatic differentiation (autograd) and neural networks from scratch using only NumPy. It's designed to be:

- **Educational**: Understand how modern deep learning frameworks work under the hood
- **Transparent**: Every operation is visible and understandable
- **Functional**: Train real models on real datasets with competitive performance
- **Minimal**: ~500 lines of readable, well-documented Python code

### Why "smoltorch"?

"Smol" + PyTorch. It's a tiny implementation that captures the essence of modern deep learning frameworks.

---

## ✨ Features

### Core Engine
- ✅ **Automatic differentiation** with dynamic computational graphs
- ✅ **NumPy-backed tensors** for efficient numerical computing
- ✅ **Broadcasting support** with proper gradient handling
- ✅ **Topological sorting** for correct backpropagation

### Operations
- **Arithmetic**: `+`, `-`, `*`, `/`, `**`
- **Matrix operations**: `@` (matmul)
- **Activations**: ReLU, tanh, sigmoid
- **Reductions**: sum, mean
- **Element-wise**: log

### Neural Networks
- **Layers**: Linear (fully connected)
- **Models**: Multi-layer perceptron (MLP)
- **Loss functions**: MSE, Binary Cross-Entropy
- **Optimizers**: SGD (Stochastic Gradient Descent)

---

## 📦 Installation

### From PyPI (recommended)
```bash
uv add smoltorch
```

### From source
```bash
git clone https://github.com/kashifulhaque/smoltorch.git
cd smoltorch
uv pip install -e .
```

### Development installation
```bash
uv pip install -e ".[dev]"
```

---

## 🚀 Quick Start

### Basic Tensor Operations
```python
from smoltorch import Tensor

# Create tensors
x = Tensor([1.0, 2.0, 3.0])
y = Tensor([4.0, 5.0, 6.0])

# Operations
z = x + y    # Element-wise addition
w = x * y    # Element-wise multiplication
a = x @ y.T  # Matrix multiplication

# Backward pass
a.backward()
print(x.grad)  # Gradients computed automatically!
```

### Training a Neural Network (Regression)
```python
from smoltorch import Tensor, MLP, SGD
from sklearn.datasets import make_regression
import numpy as np

# Generate data
X, y = make_regression(n_samples=100, n_features=5, noise=10)
y = y.reshape(-1, 1)

# Create model
model = MLP([5, 16, 16, 1])  # 5 inputs -> 16 -> 16 -> 1 output
optimizer = SGD(model.parameters(), lr=0.001)

# Training loop
for epoch in range(100):
    # Forward pass
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)
    y_pred = model(X_tensor)

    # Compute loss (MSE)
    loss = ((y_pred - y_tensor) ** 2).mean()

    # Backward pass
    optimizer.zero_grad()
    loss.backward()

    # Update weights
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}")
```

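The `optimizer.step()` call in the loop above applies a gradient descent update. As a rough illustration of that update rule, here is a pure-Python sketch that fits a single weight with MSE. The helper names `mse_and_grad` and `sgd_step` are hypothetical and are not part of smoltorch; the sketch only assumes the textbook rule `w <- w - lr * dL/dw`.

```python
# Hypothetical illustration of the SGD update rule: w <- w - lr * dL/dw.
# Fits y = w * x to data generated with w = 3, using the MSE loss.

def mse_and_grad(w, xs, ys):
    """Return the mean squared error and its derivative with respect to w."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    return loss, grad

def sgd_step(w, grad, lr):
    """One gradient descent update."""
    return w - lr * grad

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]

w = 0.0
losses = []
for epoch in range(50):
    loss, grad = mse_and_grad(w, xs, ys)
    losses.append(loss)
    w = sgd_step(w, grad, lr=0.05)

print(f"w = {w:.4f}, final loss = {losses[-1]:.6f}")
```
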
### Binary Classification
```python
from smoltorch import Tensor, MLP, SGD, binary_cross_entropy
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

# Load and preprocess data
data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Create classifier with sigmoid output
class BinaryClassifier(MLP):
    def __call__(self, x):
        x = super().__call__(x)
        return x.sigmoid()  # Output probabilities

model = BinaryClassifier([30, 16, 8, 1])
optimizer = SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(200):
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)

    y_pred = model(X_tensor)
    loss = binary_cross_entropy(y_pred, y_tensor)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 20 == 0:
        accuracy = ((y_pred.data > 0.5) == y).mean()
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}, Acc: {accuracy:.4f}")

# Result: ~96% test accuracy on breast cancer dataset! 🎉
```

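For reference, the quantity that a binary cross-entropy loss measures can be written in a few lines of plain Python. This is only a sketch of the standard formula; the function name `binary_cross_entropy_sketch` and the `eps` clipping are assumptions made here, and smoltorch's own `binary_cross_entropy` may handle numerical stability differently.

```python
import math

def binary_cross_entropy_sketch(probs, targets, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # assumed safeguard against log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

# Confident, correct predictions give a small loss...
confident_right = binary_cross_entropy_sketch([0.9, 0.1], [1, 0])
# ...while confident, wrong predictions are penalized heavily.
confident_wrong = binary_cross_entropy_sketch([0.1, 0.9], [1, 0])
print(confident_right, confident_wrong)
```
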
---

## 📊 Real-World Performance

smoltorch achieves competitive results on standard benchmarks:

| Dataset | Task | Test Metric | Epochs |
|---------|------|-------------|--------|
| Breast Cancer | Binary Classification | Accuracy: 96.5% | 200 |
| Synthetic Regression | Regression | MSE: 95.7 | 100 |

---

## 🏗️ Architecture

### Computational Graph

smoltorch builds a dynamic computational graph during the forward pass:
```python
from smoltorch import Tensor

x = Tensor([2.0])
y = Tensor([3.0])
z = (x * y) + (x ** 2)  # Graph: z -> [+] -> [*, **] -> [x, y]

z.backward()   # Backpropagate through graph
print(x.grad)  # dz/dx = y + 2x = 3 + 4 = 7.0
```

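The gradient in the example above can be sanity-checked without any autograd at all by using a central finite difference. The sketch below is plain Python and does not use smoltorch; the names `f` and `central_difference` are illustrative only.

```python
def f(x, y):
    """Same expression as the graph example: z = x * y + x ** 2."""
    return x * y + x ** 2

def central_difference(func, x, y, h=1e-5):
    """Approximate d(func)/dx at (x, y) with a symmetric finite difference."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

numeric = central_difference(f, 2.0, 3.0)
analytic = 3.0 + 2 * 2.0  # y + 2x
print(numeric, analytic)
```
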
### How Autograd Works

1. **Forward pass**: Build computational graph with operations as nodes
2. **Topological sort**: Order nodes for correct gradient flow
3. **Backward pass**: Apply chain rule in reverse topological order
4. **Gradient accumulation**: Sum gradients from multiple paths

Example with broadcasting:
```python
from smoltorch import Tensor

x = Tensor([[1, 2, 3]])  # shape (1, 3)
y = Tensor([[1], [2]])   # shape (2, 1)
z = x + y                # shape (2, 3) - broadcasting!

z.backward()
# x.grad sums over broadcast dimensions: shape (1, 3)
# y.grad sums over broadcast dimensions: shape (2, 1)
```

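The four autograd steps listed in this section can be sketched with a scalar-only class in the style of micrograd. This is an illustration of the general technique, not smoltorch's actual `Tensor` implementation; the class name `Value` and its private attributes are assumptions made for the example.

```python
class Value:
    """Scalar node in a computational graph (hypothetical, micrograd-style)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # replaced by the op that creates the node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))  # step 1: record the op
        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1; += accumulates across paths
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Step 2: topological sort via depth-first search
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        # Steps 3 and 4: chain rule in reverse order, accumulating gradients
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

x = Value(2.0)
y = Value(3.0)
z = x * y + x * x  # x feeds two paths, so its gradient is a sum
z.backward()
print(x.grad, y.grad)  # 7.0 2.0
```
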
---

## 🧠 Supported Operations

### Element-wise Operations
```python
z = x + y   # Addition with broadcasting
z = x - y   # Subtraction
z = x * y   # Multiplication
z = x / y   # Division
z = x ** 2  # Power
```

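When an element-wise operation broadcasts its operands, the gradient flowing back has the broadcast shape and has to be reduced to each operand's original shape by summing. A minimal NumPy sketch of that reduction follows; the helper name `unbroadcast` is hypothetical and not taken from smoltorch's source.

```python
import numpy as np

def unbroadcast(grad, shape):
    """Sum `grad` over the axes that broadcasting added or stretched."""
    # Remove leading axes that broadcasting prepended
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum over axes that were size 1 in the original operand
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

x = np.array([[1.0, 2.0, 3.0]])  # shape (1, 3)
y = np.array([[1.0], [2.0]])     # shape (2, 1)
upstream = np.ones((2, 3))       # gradient of z = x + y with respect to z

x_grad = unbroadcast(upstream, x.shape)
y_grad = unbroadcast(upstream, y.shape)
print(x_grad.shape, y_grad.shape)  # (1, 3) (2, 1)
```
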
### Matrix Operations
```python
z = x @ y  # Matrix multiplication (with batch support)
```

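For a 2-D product `Z = X @ W`, the standard backward rules are `dX = dZ @ W.T` and `dW = X.T @ dZ`. The NumPy sketch below checks one entry against a finite difference. It covers only the 2-D case (not batched matmul), and the variable names are illustrative rather than smoltorch's own.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2))

# Forward: Z = X @ W, then loss = Z.sum(), so the upstream gradient is all ones
Z = X @ W
upstream = np.ones_like(Z)

# Backward rules for matmul
X_grad = upstream @ W.T  # shape (4, 3), same as X
W_grad = X.T @ upstream  # shape (3, 2), same as W

# Check one entry of X_grad against a central finite difference
h = 1e-6
X_plus, X_minus = X.copy(), X.copy()
X_plus[0, 0] += h
X_minus[0, 0] -= h
numeric = ((X_plus @ W).sum() - (X_minus @ W).sum()) / (2 * h)
print(abs(numeric - X_grad[0, 0]) < 1e-6)
```
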
### Activation Functions
```python
z = x.relu()     # ReLU: max(0, x)
z = x.tanh()     # Tanh: (e^(2x) - 1) / (e^(2x) + 1)
z = x.sigmoid()  # Sigmoid: 1 / (1 + e^(-x))
```

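Each activation needs a derivative for the backward pass. The pure-Python sketch below writes out the textbook derivatives and compares them with finite differences. The function names are hypothetical, and taking the ReLU derivative at exactly 0 to be 0 is a convention assumed here, not something confirmed from smoltorch.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # assumed convention: derivative 0 at x == 0

def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def numeric_grad(func, x, h=1e-6):
    """Central finite difference used as a reference value."""
    return (func(x + h) - func(x - h)) / (2 * h)

checks = [("relu", relu, relu_grad),
          ("tanh", math.tanh, tanh_grad),
          ("sigmoid", sigmoid, sigmoid_grad)]
for name, func, grad in checks:
    print(name, grad(0.5), numeric_grad(func, 0.5))
```
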
### Reductions
```python
z = x.sum()         # Sum all elements
z = x.sum(axis=0)   # Sum along axis
z = x.mean()        # Mean of all elements
z = x.mean(axis=1)  # Mean along axis
```

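In the backward pass a reduction does the opposite of the forward pass: the upstream gradient is spread back over the elements that were reduced. The NumPy sketch below shows the resulting gradient arrays for the four calls above. It illustrates the math only; how smoltorch implements it internally is not shown here.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # shape (2, 3)

# sum(): every element contributes with weight 1
grad_sum_all = np.ones_like(x)

# mean(): every element contributes with weight 1 / x.size
grad_mean_all = np.full_like(x, 1.0 / x.size)

# sum(axis=0): output shape (3,); an upstream gradient g of that shape
# is copied to every row of the input
g = np.array([10.0, 20.0, 30.0])
grad_sum_axis0 = np.broadcast_to(g, x.shape)

# mean(axis=1): output shape (2,); each row shares its upstream gradient
# equally among its x.shape[1] elements
u = np.array([3.0, 6.0])
grad_mean_axis1 = np.broadcast_to(u[:, None] / x.shape[1], x.shape)

print(grad_sum_axis0)
print(grad_mean_axis1)
```
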
### Other
```python
z = x.log()  # Natural logarithm
```

---

## 📚 Examples

Check out the `examples/` directory:

- [`train_regression.py`](examples/train_regression.py) - Train on synthetic regression data
- [`train_classification.py`](examples/train_classification.py) - Binary classification on breast cancer dataset

Run them:
```bash
uv run examples/train_regression.py
uv run examples/train_classification.py
```

---

## 🧪 Testing

Run the test suite:
```bash
uv run pytest
```

Tests cover:
- ✅ Addition with broadcasting
- ✅ Multiplication with broadcasting
- ✅ Matrix multiplication
- ✅ Activation functions (ReLU, tanh, sigmoid)
- ✅ Reductions (sum, mean)
- ✅ Linear layers
- ✅ Multi-layer perceptrons
- ✅ End-to-end training

---

## 🗺️ Roadmap

### Coming Soon
- [ ] **More optimizers**: Adam, RMSprop with momentum
- [ ] **More activations**: Leaky ReLU, ELU, Softmax
- [ ] **Regularization**: Dropout, L2 weight decay
- [ ] **Mini-batch training**: Efficient batch processing
- [ ] **Multi-class classification**: Softmax + Cross-Entropy loss

### Future
- [ ] **Convolutional layers**: CNN support for images
- [ ] **Model serialization**: Save/load weights in safetensors format
- [ ] **GPU acceleration**: Explore Metal Performance Shaders for Apple Silicon
- [ ] **Better initialization**: He initialization for ReLU networks
- [ ] **Learning rate scheduling**: Decay strategies

---

## 🎓 Learning Resources

If you're learning from smoltorch, these resources complement it well:

- [Andrej Karpathy's micrograd](https://github.com/karpathy/micrograd) - The original inspiration
- [Neural Networks: Zero to Hero](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) - Video series by Andrej Karpathy
- [The Matrix Calculus You Need For Deep Learning](https://arxiv.org/abs/1802.01528) - Paper on backpropagation math

---

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes:

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- **Andrej Karpathy** for [micrograd](https://github.com/karpathy/micrograd) and the brilliant educational content
- **PyTorch team** for API design inspiration
- The deep learning community for making knowledge accessible

---

## 📬 Contact

Created by Kashif - feel free to reach out!

- GitHub: [@kashifulhaque](https://github.com/kashifulhaque)
- Twitter: [@notifkash](https://twitter.com/notifkash)

---

<div align="center">

**⭐ Star this repo if you found it helpful!**

Built with ❤️ for learners and tinkerers

</div>