AtomGPT

A chaotic, evolutionary, dependency-free GPT built from scratch.

AtomGPT is an educational and experimental project that implements a Generative Pre-trained Transformer (GPT) entirely in the Python standard library. No PyTorch, no NumPy, no TensorFlow. Just pure Python logic, from the autograd engine to the transformer blocks.

Beyond a simple implementation, AtomGPT introduces an Evolutionary Forge (forge.py), where models not only learn from data but also evolve their architecture over time—growing layers, adding heads, and pruning weights to survive.

Features

  • Zero Dependencies: Runs on pure Python. If you have Python 3, you can run AtomGPT.
  • Custom Autograd: A transparent backpropagation engine (Value) built from the ground up (see the sketch after this list).
  • Evolutionary Training: Models compete in a population. The fittest survive, clone, and mutate (add/remove layers, heads, etc.).
  • Educational Core: microgpt.py contains the entire logic in a single file for easy study.
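
To make the autograd idea concrete, here is a minimal, runnable sketch of a scalar autograd node in the spirit of Value. The class below is illustrative: the method names and internals are assumptions, not AtomGPT's exact API.

class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a, b = Value(2.0), Value(3.0)
loss = a * b + a
loss.backward()
print(a.grad)  # d(a*b + a)/da = b + 1 = 4.0

Every operation in the real model is built out of nodes like this, which is why no external math library is needed.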

Installation

git clone https://github.com/pronzzz/atomgpt.git
cd atomgpt

Optional: install graphviz if you want to visualize the computation graph (used in atomgpt/visualizer.py); the core model does not need it.

pip install graphviz

Usage

1. The Evolutionary Forge (Recommended)

Watch models evolve and generate fantasy names in real time.

python3 forge.py

This script will:

  1. Initialize a population of small GPT models.
  2. Train them on a dataset of fantasy names.
  3. Evolve the population (Select -> Clone -> Mutate), as sketched below.
  4. Generate new names periodically.
  5. Save the best names to generated_names.txt.
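
In code, one generation of that loop looks roughly like the following. The Model stub and the fitness formula are assumptions for illustration; forge.py's actual classes and scoring differ in detail.

import copy, random

class Model:
    """Stand-in for a GPT individual; real models carry weights, not summary stats."""
    def __init__(self):
        self.loss = random.uniform(1.0, 3.0)       # stand-in for trained loss
        self.n_params = random.randint(500, 5000)

    def fitness(self):
        # Reward low loss, lightly penalize parameter count (assumed form).
        return -self.loss - 1e-5 * self.n_params

    def mutate(self):
        # Stand-in for the grow/prune/sparsify operators described later.
        self.n_params += random.choice([-100, 100])

def step_generation(population):
    population.sort(key=lambda m: m.fitness(), reverse=True)
    survivors = population[: len(population) // 2]        # cull bottom 50%
    clones = [copy.deepcopy(random.choice(survivors)) for _ in survivors]
    for clone in clones:
        clone.mutate()                                    # random mutation
    return survivors + clones

population = [Model() for _ in range(8)]
population = step_generation(population)
print(len(population))  # population size is preserved: 8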

2. The Atomic Core

If you want to study the bare-metal implementation:

python3 microgpt.py

This script downloads a dataset (if missing), trains a model, and prints generated samples to the console.

Walkthrough: How It Works

Here is a step-by-step walkthrough of what happens when you run AtomGPT:

Step 1: The Spark (Initialization)

When forge.py starts, it initializes a Population of random GPT models. Each model starts small (e.g., 1 layer, 16 embedding dim) to survive the harsh environment of random initialization.
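
As a rough illustration of why the models start small, here is an assumed seed configuration and a back-of-the-envelope parameter count. The field names and the counting formula are guesses about the model shape, not forge.py's actual defaults.

cfg = {"n_layer": 1, "n_head": 1, "n_embd": 16, "vocab_size": 27}

def approx_params(c):
    e, v, L = c["n_embd"], c["vocab_size"], c["n_layer"]
    attn = 4 * e * e                 # Q, K, V, and output projections
    mlp = 2 * (e * 4 * e)            # up- and down-projections (4x hidden)
    return v * e + L * (attn + mlp) + e * v  # embeddings + blocks + head

print(approx_params(cfg))  # ~4k weights: cheap enough to breed in bulk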

Step 2: The Learning (Forward & Backward)

At each step, the models are fed a name (e.g., "Drakon").

  1. Tokenization: "Drakon" is broken down into characters.
  2. Forward Pass: The characters flow through the model's current GPT architecture.
    • Embeddings lookup.
    • Attention mechanisms weigh relationships between characters.
    • MLPs process the information.
  3. Loss Calculation: The model predicts the next character, and we compute the negative log-likelihood loss (see the sketch after this list).
  4. Backward Pass: The custom autograd engine traces the graph backwards, calculating gradients for every weight.
  5. Update: An Adam-inspired optimizer tweaks the weights to reduce error.
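
The sketch below runs the tokenize-predict-score part of that cycle with plain floats; the uniform logits stand in for a real forward pass, and all names here are illustrative. In the real model these operations go through Value nodes, so step 4 can trace them backwards.

import math

name = "Drakon"
vocab = sorted(set(name)) + ["<end>"]
stoi = {ch: i for i, ch in enumerate(vocab)}
tokens = [stoi[ch] for ch in name] + [stoi["<end>"]]  # step 1: tokenization

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Stand-in for step 2: a real model computes logits from embeddings,
# attention, and MLP layers; uniform logits mean "no knowledge yet".
logits = [0.0] * len(vocab)

# Step 3: negative log-likelihood of each next character.
nll = 0.0
for t in range(len(tokens) - 1):
    probs = softmax(logits)
    nll += -math.log(probs[tokens[t + 1]])
print(f"mean NLL: {nll / (len(tokens) - 1):.3f}")  # ln(7) ≈ 1.946 for 7 symbols

An untrained model scores about ln(vocab_size); training (and evolution) pushes the loss below that baseline.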

Step 3: The Evolution (Survival of the Fittest)

After a set number of steps (a generation), the forge pauses to judge the models.

  1. Evaluation: Models are scored based on their Loss (how well they predict) and Efficiency (parameter count).
  2. Culling: The bottom 50% of models are deleted.
  3. Reproduction: The survivors are cloned to refill the population.
  4. Mutation: The clones undergo random mutations (sketched after this list):
    • Growth: "I need more power!" -> Adds a layer or attention head.
    • Efficiency: "I am too heavy." -> Prunes small weights or shrinks embedding dimension.
    • Chaos: Randomly sparsifies a dense layer.
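
A hedged sketch of such a mutation operator, treating the architecture as a config dict; the branch probabilities and field names are invented for illustration, not forge.py's actual values.

import random

def mutate(config):
    roll = random.random()
    if roll < 0.4:                       # growth: add capacity
        config[random.choice(["n_layer", "n_head"])] += 1
    elif roll < 0.8:                     # efficiency: shrink the model
        config["n_embd"] = max(8, config["n_embd"] - 4)
    else:                                # chaos: sparsify a dense layer
        config["sparsity"] = min(0.9, config.get("sparsity", 0.0) + 0.1)
    return config

print(mutate({"n_layer": 1, "n_head": 1, "n_embd": 16}))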

Step 4: The Creation

Finally, the champion model is used to hallucinate new names. It samples character by character, following the statistical patterns it learned (and evolved to process efficiently).
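
Character-by-character sampling reduces to repeatedly drawing from the model's next-character distribution until an end token appears. The next_char_probs stub below is hypothetical; the champion model would return learned probabilities instead of uniform ones.

import random

VOCAB = list("abcdefghijklmnopqrstuvwxyz") + ["<end>"]

def next_char_probs(prefix):
    # Hypothetical stand-in: a trained model conditions on the prefix.
    return [1.0 / len(VOCAB)] * len(VOCAB)

def sample_name(max_len=12):
    name = ""
    while len(name) < max_len:
        probs = next_char_probs(name)
        ch = random.choices(VOCAB, weights=probs, k=1)[0]
        if ch == "<end>":
            break
        name += ch
    return name.capitalize()

print(sample_name())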

Roadmap

  • Implement more complex mutation operators (e.g., skip connection rewiring).
  • Add saving/loading of model "species" (checkpoints).
  • Visualization of the evolutionary tree.

License

MIT License. See LICENSE for details.
