Fix/replay buffer nan loss#6

Merged
whtoo merged 5 commits into main from fix/replay-buffer-nan-loss
Jun 21, 2025
Conversation

@whtoo
Owner

@whtoo whtoo commented Jun 21, 2025

No description provided.

google-labs-jules bot and others added 5 commits June 21, 2025 03:03
…ReplayBuffer

Added epsilons to prevent division by zero during the calculation of probabilities and importance sampling weights in `PrioritizedReplayBuffer.sample()`.

This addresses RuntimeWarnings for division by zero and invalid values, which were causing the average loss to become NaN during training.
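The fix above can be sketched as follows. This is a minimal illustration of the epsilon guards, not the project's actual `PrioritizedReplayBuffer.sample()`; the function name `sample_indices` and the `alpha`/`beta` parameters are hypothetical stand-ins for the usual prioritized-replay hyperparameters:

```python
import random

EPS = 1e-6  # small constant added to priorities to avoid division by zero

def sample_indices(priorities, batch_size, alpha=0.6, beta=0.4):
    """Sample indices proportionally to (priority + EPS) ** alpha.

    The epsilon guard means that even an all-zero priority list yields
    strictly positive probabilities, so neither the probabilities nor the
    importance-sampling weights can become NaN or infinite.
    """
    scaled = [(p + EPS) ** alpha for p in priorities]
    total = sum(scaled)               # > 0 by construction
    probs = [s / total for s in scaled]
    indices = random.choices(range(len(priorities)), weights=probs, k=batch_size)
    n = len(priorities)
    # importance-sampling weights; probs[i] > 0, so no division by zero here
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)              # normalize so the largest weight is 1.0
    return indices, [w / max_w for w in weights]
```

With zero priorities (the case that previously produced the RuntimeWarnings), the weights come out finite and uniform rather than NaN.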
- Modified RainbowAgent.__init__ to prevent passing 'n_step' or other
  Rainbow-specific parameters to DQNAgent, resolving the TypeError.
- Updated tests in test_rainbow_components.py to use 'base_n_step'
  instead of the legacy 'n_step' parameter when instantiating RainbowAgent.
- Corrected assertion in test_rainbow_agent_initialization_standard to check
  agent.n_step_buffer.base_n_step instead of a non-existent agent.n_step.

These changes fix the agent initialization error and related test failures.
A pre-existing TypeError in test_rainbow_agent_update_model_call_order_noisy
(MagicMock issue) remains and is unrelated to these changes.
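The initialization fix amounts to partitioning keyword arguments so that Rainbow-specific ones (such as `base_n_step`) never reach `DQNAgent.__init__`. A minimal sketch of this pattern, with simplified constructors and a hypothetical `RAINBOW_ONLY` set standing in for the real parameter lists:

```python
class DQNAgent:
    """Simplified stand-in for the base agent; accepts only base parameters."""
    def __init__(self, state_dim, action_dim, lr=1e-3):
        self.state_dim = state_dim
        self.action_dim = action_dim
        self.lr = lr

class RainbowAgent(DQNAgent):
    # keyword arguments the base class must never see (hypothetical set)
    RAINBOW_ONLY = {"base_n_step", "n_atoms", "v_min", "v_max"}

    def __init__(self, state_dim, action_dim, **kwargs):
        rainbow_kwargs = {k: v for k, v in kwargs.items() if k in self.RAINBOW_ONLY}
        base_kwargs = {k: v for k, v in kwargs.items() if k not in self.RAINBOW_ONLY}
        # only base parameters are forwarded, avoiding the TypeError
        super().__init__(state_dim, action_dim, **base_kwargs)
        self.base_n_step = rainbow_kwargs.get("base_n_step", 3)
```

Passing `base_n_step=5, lr=1e-4` now works: `lr` is forwarded to `DQNAgent`, while `base_n_step` stays with the Rainbow layer, matching the renamed parameter the updated tests assert against.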
Changed direct script-path invocation to module-style execution to resolve relative-import errors.
- Added optimized configuration file optimized_config.py with the tuned hyperparameters
- Modified agent.py to support Huber loss and stricter gradient clipping
- Added train_optimized.py for the optimized training pipeline
- Added quick_test_optimization.py to quickly validate the optimization results
- Created optimization_plan.md documenting the optimization strategy and plan
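The two loss-stability changes in agent.py (Huber loss and gradient clipping) can be sketched in scalar form. These are generic reference implementations under assumed defaults (`delta=1.0`, `max_norm=1.0`), not the project's actual code, which presumably uses its framework's built-in equivalents:

```python
def huber_loss(td_error, delta=1.0):
    """Huber loss: quadratic for |error| <= delta, linear beyond.

    The linear tail bounds the gradient magnitude when TD errors spike,
    which helps keep the training loss from diverging to NaN.
    """
    a = abs(td_error)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def clip_grad_norm(grads, max_norm=1.0):
    """Rescale a list of scalar gradients so their global L2 norm
    is at most max_norm; gradients under the threshold pass through."""
    total = sum(g * g for g in grads) ** 0.5
    if total > max_norm:
        scale = max_norm / total
        return [g * scale for g in grads]
    return list(grads)
```

For example, a TD error of 3.0 yields a loss of 2.5 instead of the squared-error 4.5, and a gradient vector `[3.0, 4.0]` (norm 5.0) is scaled down to `[0.6, 0.8]` (norm 1.0).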
@whtoo whtoo merged commit e4500ce into main Jun 21, 2025
1 check failed
@whtoo whtoo deleted the fix/replay-buffer-nan-loss branch June 21, 2025 06:01