NashChennc/sae_tools

SAE Analysis Toolkit

Python 3.10+ · License: MIT

A lightweight toolkit for analyzing Sparse Autoencoders (SAEs) that integrates statistical metrics, geometric analysis, and visualization dashboards.

Built on sae_lens and transformer_lens, the toolkit provides an Anthropic-style dashboard and an analysis workflow written entirely in Python and Jupyter Notebooks.

Because my experimental environment has restricted network access, I deliberately avoided network dependencies during development, such as port mapping (in sae_dashboard) and online model loading (in transformer_lens), both of which have given me a lot of trouble.

Screenshots: TUI, scatter plot, dashboard, cross-correlation, dimension reduction.

Start Here

  • Quick Start: install the environment, configure local paths, verify resources, and run a dry run.
  • Complete Usage: TUI, Snakemake, idle-GPU runner, registries, experiments, resources, and artifacts.
  • Developer Guide: architecture, code layout, extension points, tests, and release checks.
  • Agent Conventions: project rules for coding agents working in this repository.

Topic references:

  • Workflow: registry, Snakemake DAG, deterministic paths, and shared runtime helpers.
  • Experiments: prompt and response grids plus common edits.
  • GPU Runner: idle-GPU allocation and memory records.
  • Artifacts: activation, statistical, and geometric output contracts.
  • Compatibility: TL3, SAE, model loading, and hook conventions.

One-Minute Setup

conda create -n sae-tl3 python=3.11 -y
conda activate sae-tl3
pip install -e ".[dev,workflow,tui]"

Create .env in the repository root:

MODEL_ROOT=<model-root>
SAE_ROOT=<sae-checkpoint-root>
DATASET_ROOT=<dataset-root>
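
Before running the workflow, it can help to confirm that the three roots actually resolve to directories. The following is a minimal standalone sketch using only the standard library; it is not the toolkit's own loader, and the helper names `parse_env` and `check_roots` are hypothetical. It assumes `.env` contains plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path

# The three keys named in the .env template above.
REQUIRED_KEYS = ("MODEL_ROOT", "SAE_ROOT", "DATASET_ROOT")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_roots(values: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means every root is a directory."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is not set")
        elif not Path(os.path.expanduser(value)).is_dir():
            problems.append(f"{key} does not point to a directory: {value}")
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    for problem in check_roots(parse_env(text)):
        print(problem)
```

Run from the repository root, the script prints one line per missing or invalid root and prints nothing when all three are usable.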

Validate the install and registry:

python scripts/inspect_registry.py --strict
snakemake -n
python -m sae_tools_tui --help
python -m sae_tools.reporting --help
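
The four checks above can also be run in sequence, stopping at the first one that fails. This is a small sketch, not part of the toolkit: `first_failure` and `_exit_code` are hypothetical helper names, and the commands are copied verbatim from the list above. It assumes it is run from the repository root inside the activated environment.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Validation commands, copied from the setup steps above.
CHECKS: list[list[str]] = [
    ["python", "scripts/inspect_registry.py", "--strict"],
    ["snakemake", "-n"],
    ["python", "-m", "sae_tools_tui", "--help"],
    ["python", "-m", "sae_tools.reporting", "--help"],
]


def _exit_code(cmd: Sequence[str]) -> int:
    """Run one command quietly and return its exit code (127 if the program is missing)."""
    try:
        result = subprocess.run(
            list(cmd), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return 127
    return result.returncode


def first_failure(
    checks: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = _exit_code,
) -> Optional[tuple[list[str], int]]:
    """Return (command, exit_code) for the first failing check, or None if all pass."""
    for cmd in checks:
        code = run(cmd)
        if code != 0:
            return list(cmd), code
    return None


# Usage, from the repository root:
#   failure = first_failure(CHECKS)
#   print("all checks passed" if failure is None else f"failed: {failure}")
```

Passing the runner as an argument keeps the sequencing logic testable without invoking Snakemake or the toolkit itself.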

Launch the experiment-management TUI:

sae-tools-tui

If the console script has not been generated in the active environment yet, use:

python -m sae_tools_tui
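
The same fallback can be automated in a launcher script. The sketch below is illustrative only; `tui_command` is a hypothetical helper. It uses `shutil.which` to detect whether the `sae-tools-tui` console script is on `PATH` and otherwise falls back to the module form shown above.

```python
import shutil
import subprocess
import sys
from typing import Callable, Optional


def tui_command(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Prefer the console script; fall back to `python -m sae_tools_tui`."""
    script = which("sae-tools-tui")
    if script is not None:
        return [script]
    # sys.executable keeps the fallback inside the currently active interpreter.
    return [sys.executable, "-m", "sae_tools_tui"]


def launch() -> int:
    """Start the TUI with whichever form is available and return its exit code."""
    return subprocess.call(tui_command())
```

Using `sys.executable` rather than a bare `python` avoids accidentally launching the TUI from a different environment than the one that is active.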

Generate the HTML reports, then serve them together with the read-only dashboard:

python scripts/analyze_layer_trends.py --config configs/experiments/response_grid.yaml
sae-tools-report artifacts --config configs/experiments/response_grid.yaml
sae-tools-report serve --config configs/experiments/response_grid.yaml

Core Commands

# Snakemake preview for the default experiment.
snakemake -n

# Preview a non-default experiment.
snakemake -n --config experiment_config=configs/experiments/response_grid.yaml

# Run missing workflow artifacts on one process.
snakemake -j 1 --rerun-incomplete

# Run missing targets across idle GPUs.
python scripts/run_idle_gpu_workflow.py

# Run only downstream stages after activations exist.
python scripts/run_idle_gpu_workflow.py --stages stat,geometric

# Generate the HTML layer-trend report.
python scripts/analyze_layer_trends.py --config configs/experiments/response_grid.yaml

# Generate a self-contained artifact plot report.
sae-tools-report artifacts --config configs/experiments/response_grid.yaml

# Serve reports and the browser dashboard on http://127.0.0.1:8765/.
sae-tools-report serve --config configs/experiments/response_grid.yaml
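
Once `sae-tools-report serve` is running, a quick probe can confirm that something is answering on the address given in the comment above. This is a standard-library sketch, not a toolkit feature; `is_serving` is a hypothetical helper. It only checks that an HTTP server responds, not that the response is the dashboard. Proxies are disabled explicitly so that a configured `http_proxy` does not intercept the loopback request.

```python
import urllib.error
import urllib.request

# Address taken from the serve command's comment above.
REPORT_URL = "http://127.0.0.1:8765/"


def is_serving(url: str = REPORT_URL, timeout: float = 2.0) -> bool:
    """Return True if any HTTP server answers at `url`, False if nothing is reachable."""
    # An empty ProxyHandler bypasses environment proxy settings for this request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # A server replied, just with an error status (for example 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("reachable" if is_serving() else "not reachable")
```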

Generated artifacts, logs, Snakemake state, and per-run GPU memory logs are intentionally gitignored. Stable configuration and documentation are the source-controlled contract.

About

Nachuan Chen's interpretability toolset
