Fix entropy collapse #93

Open
spktrm wants to merge 25 commits into main from fix_entropy_collapse
Owner

spktrm commented Apr 15, 2026

No description provided.

Owner Author

spktrm commented Apr 15, 2026

@copilot rewrite learner.py to remove every point where I manipulate on-device tensors/arrays outside of train_step (which is jitted). I don't want to waste any time moving these between device and host. Track these variables another way, without relying on the output of train_step.

Contributor

Copilot AI commented Apr 15, 2026

Done in 787e709 and 87a7b04. Here's what changed:

train_step (jitted): Removed step_count and frame_count increments from both player and builder state .replace() calls, and removed training_step/player_frame_count/builder_frame_count from the returned logs dict.

Learner class: Added self.step_count, self.player_frame_count, and self.builder_frame_count as plain Python ints (initialized once from the state in __init__). Frame increments are computed from the numpy batch before device_put. All host-side logic (_handle_periodic_tasks, _should_add_new_player, _create_params_container, league management) now reads these counters directly, so it no longer triggers device-to-host transfers during training.
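For illustration, the host-side counter pattern described above can be sketched roughly as follows. This is a minimal sketch, not the code from learner.py: the toy train_step, the HostCounters class, and the "valid" mask used to count frames are all hypothetical names chosen for the example.

```python
import jax
import jax.numpy as jnp
import numpy as np


@jax.jit
def train_step(params, batch):
    # Hypothetical stand-in for the real train_step: one SGD step on a toy loss.
    def loss_fn(p):
        return jnp.mean((batch["obs"] @ p - batch["target"]) ** 2)

    loss, grads = jax.value_and_grad(loss_fn)(params)
    # No counters are incremented or returned here; logs hold device arrays only.
    return params - 0.1 * grads, {"loss": loss}


class HostCounters:
    """Plain Python ints; reading them never touches the device."""

    def __init__(self, step_count=0, frame_count=0):
        self.step_count = int(step_count)
        self.frame_count = int(frame_count)

    def update(self, numpy_batch):
        # Assumption: a boolean "valid" mask marks the real frames in a batch.
        self.step_count += 1
        self.frame_count += int(numpy_batch["valid"].sum())


rng = np.random.default_rng(0)
counters = HostCounters()
params = jnp.zeros((4,))
periodic_hits = 0

for _ in range(3):
    batch = {
        "obs": rng.normal(size=(8, 4)).astype(np.float32),
        "target": rng.normal(size=(8,)).astype(np.float32),
        "valid": np.ones((8,), dtype=bool),
    }
    counters.update(batch)  # counted on the host, before the transfer
    device_batch = jax.device_put({"obs": batch["obs"], "target": batch["target"]})
    params, logs = train_step(params, device_batch)

    # Host-side periodic logic reads the Python int, not a device array.
    if counters.step_count % 2 == 0:
        periodic_hits += 1
```

Because the periodic check only reads a Python int, the loop never has to wait on the result of train_step to decide what to do next.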

Also removed the check_finite_loss block, which was calling jnp.isfinite().item() and forcing a device-to-host sync, and added a _save_checkpoint() helper that injects the host counters into the state only at save time.
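One way to picture the save-time injection is the sketch below. It is an assumption-laden illustration rather than the actual _save_checkpoint(): TrainState, save_checkpoint, and load_counters are hypothetical, and pickle stands in for whatever serialization the repository really uses.

```python
import dataclasses
import pickle


@dataclasses.dataclass(frozen=True)
class TrainState:
    # Hypothetical stand-in for the real train state.
    params: dict
    step_count: int = 0
    frame_count: int = 0


def save_checkpoint(state, step_count, frame_count):
    # Copy the host counters into a snapshot of the state only when saving;
    # the live state used during training is left unchanged.
    snapshot = dataclasses.replace(
        state, step_count=step_count, frame_count=frame_count
    )
    return pickle.dumps(dataclasses.asdict(snapshot))


def load_counters(blob):
    # On restore, the counters come back as plain Python ints.
    saved = pickle.loads(blob)
    return saved["step_count"], saved["frame_count"]


state = TrainState(params={"w": [0.0, 0.0]})
blob = save_checkpoint(state, step_count=120, frame_count=30720)
restored = load_counters(blob)
```

The design point is that the checkpoint remains the single source of truth across restarts, while the training loop itself only ever touches host-side ints.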

spktrm force-pushed the fix_entropy_collapse branch from 87a7b04 to af125f6 on April 15, 2026 03:22

2 participants