Add non-record 1xH100 auto precision budget experiment#431
Open
spatnala18 wants to merge 3 commits into openai:main from
Summary
This PR adds a non-record exploratory submission under `records/track_non_record_16mb`, based on the `2026-03-20_10L_Int5MLP_MuonWD04_SWA50` recipe. My idea is to replace fixed mixed-precision export exceptions with a calibration-driven precision allocator. After training, SWA, and pruning, the script evaluates a small set of candidate tensor promotions and greedily spends bytes where quantization appears most harmful, while staying under the 16,000,000-byte cap.
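The greedy byte-budget allocation described above can be sketched as follows. This is an illustrative outline, not the actual `train_gpt.py` implementation: `BYTE_CAP`, `greedy_promotions`, `measure_loss`, and the gain-per-byte selection rule are all assumptions about how such an allocator could work.

```python
BYTE_CAP = 16_000_000  # artifact size cap in bytes (from the track rules)

def greedy_promotions(base_bytes, candidates, measure_loss):
    """Greedily promote tensors to higher precision while under the cap.

    base_bytes:   artifact size before any promotions.
    candidates:   dict mapping tensor name -> extra bytes its promotion costs.
    measure_loss: callable(promoted: set) -> calibration loss with that set
                  of tensors exported at higher precision.
    """
    promoted = set()
    spent = base_bytes
    while True:
        current = measure_loss(promoted)
        best, best_density = None, 0.0
        for name, cost in candidates.items():
            if name in promoted or spent + cost > BYTE_CAP:
                continue
            # loss reduction per byte spent: promote where quantization hurts most
            gain = current - measure_loss(promoted | {name})
            if gain / cost > best_density:
                best, best_density = (name, cost), gain / cost
        if best is None:  # no affordable promotion still helps
            break
        promoted.add(best[0])
        spent += best[1]
    return promoted, spent
```

With per-candidate calibration evaluations this is quadratic in the number of candidates, which is why the candidate set would need to stay small (a handful of attention/MLP weights) for a cheap run like this one.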
What’s included
- `records/track_non_record_16mb/2026-03-22_AutoPrecisionBudget_10L_1xH100/train_gpt.py`
- `records/track_non_record_16mb/2026-03-22_AutoPrecisionBudget_10L_1xH100/train.log`
- `records/track_non_record_16mb/2026-03-22_AutoPrecisionBudget_10L_1xH100/submission.json`
- `records/track_non_record_16mb/2026-03-22_AutoPrecisionBudget_10L_1xH100/README.md`

Run details
This is a cheap 1xH100 run using free credits on the Modal platform, not a leaderboard attempt.
- 1xH100 × 1
- `MAX_WALLCLOCK_SECONDS=60`
- `ITERATIONS=150`
- `AUTO_CALIBRATION_WINDOWS=16`
- `FINAL_EVAL_MAX_WINDOWS=16`

Final exact metrics from `train.log`:

- `val_loss: 5.53668879`
- `val_bpb: 3.08435975`

Artifact size: 15,771,560 → 15,836,818 bytes

Selected promotions:

- `blocks.9.attn.c_k.weight`
- `blocks.9.attn.c_v.weight`

Why submit this
This is an in-progress non-record submission meant to document a concrete compression-aware direction rather than claim a strong score. The motivation is that current strong recipes already rely on hand-tuned mixed precision, and a sensitivity-driven allocator is a natural next step that may transfer better across future architecture changes.