Reviewing and updating project documentation #22

Merged
theomgdev merged 1 commit into main from claude/align-expand-markdown-files
Apr 6, 2026
Conversation

Contributor

@Claude Claude AI commented Apr 6, 2026

Pull request created by AI Agent

…tools coverage

Agent-Logs-Url: https://github.com/theomgdev/OdyssNet/sessions/19256ec6-d27b-4833-a348-adf2f8504b57

Co-authored-by: theomgdev <29312699+theomgdev@users.noreply.github.com>
@theomgdev theomgdev marked this pull request as ready for review April 6, 2026 01:23
Copilot AI review requested due to automatic review settings April 6, 2026 01:23
@theomgdev theomgdev merged commit 20104ba into main Apr 6, 2026
1 of 2 checks passed
@theomgdev theomgdev deleted the claude/align-expand-markdown-files branch April 6, 2026 01:23
Contributor

Copilot AI left a comment

Pull request overview

This PR expands the project’s contributor documentation by adding a detailed troubleshooting guide focused on diagnosing and addressing non-converging/unstable training runs.

Changes:

  • Added a new “Training Not Converging” section with suggested diagnostics workflows (TrainingHistory, trainer/optimizer diagnostics, anomaly hooks).
  • Added practical mitigation tips for oscillating loss, stuck loss, performance slowdowns, and VRAM pressure.


Comment thread CONTRIBUTING.md
**Key metrics to monitor:**
- **frustration:** High values (>100) indicate the optimizer is struggling; may trigger plateau escape
Copilot AI Apr 6, 2026

The guidance for frustration value ranges appears incorrect. ChaosGrad._frustration is an EMA in the range ~[0, 1] with a burst threshold at _FRUST_THRESH = 0.75, so suggesting "High values (>100)" will mislead users. Update the text to reflect the actual scale/threshold (e.g., nearing/exceeding ~0.75).

Suggested change
- - **frustration:** High values (>100) indicate the optimizer is struggling; may trigger plateau escape
+ - **frustration:** This is an EMA typically in the ~[0, 1] range; values nearing or exceeding ~0.75 indicate the optimizer is struggling and may trigger plateau escape
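
To illustrate why a threshold of 100 cannot be reached on this scale, here is a minimal sketch of an exponential moving average over a 0/1 "no improvement" signal. This is not ChaosGrad's actual update rule, which is not shown in this thread; the function name `update_frustration` and the smoothing factor `beta` are assumptions made for illustration. Only the 0.75 threshold comes from the comment above.

```python
FRUST_THRESH = 0.75  # burst threshold quoted in the review comment


def update_frustration(frustration: float, improved: bool, beta: float = 0.9) -> float:
    """One EMA step over a 0/1 stall signal (hypothetical update rule).

    Because every input is 0 or 1, the average can never leave [0, 1].
    """
    stalled = 0.0 if improved else 1.0
    return beta * frustration + (1.0 - beta) * stalled


# Simulate a run in which the loss never improves.
frustration = 0.0
steps_to_threshold = None
for step in range(1, 101):
    frustration = update_frustration(frustration, improved=False)
    if steps_to_threshold is None and frustration >= FRUST_THRESH:
        steps_to_threshold = step

print(f"threshold crossed at step {steps_to_threshold}, final value {frustration:.4f}")
```

Under these assumptions the value approaches 1.0 from below and never exceeds it, so documentation should describe thresholds relative to the [0, 1] range.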

Copilot uses AI. Check for mistakes.
Comment thread CONTRIBUTING.md
Comment on lines +414 to +420
```python
if trainer._using_chaos_grad:
    from odyssnet import ChaosGrad
    chaos_opt = trainer.optimizer

    opt_diag = chaos_opt.get_diagnostics()
```

Copilot AI Apr 6, 2026

This example checks trainer._using_chaos_grad, which is a private implementation detail. Prefer using public surface area (e.g., trainer.get_diagnostics()['using_chaos_grad'] or isinstance(trainer.optimizer, ChaosGrad)) so the docs don’t encourage relying on underscored attributes.
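
A minimal sketch of the `isinstance` variant follows. So that it runs on its own, `ChaosGrad` and `Trainer` are hypothetical stand-ins defined locally rather than imported from `odyssnet`, and the diagnostics dictionary content is invented for illustration. In the real docs the class would come from `from odyssnet import ChaosGrad`.

```python
class ChaosGrad:
    """Hypothetical stand-in for odyssnet.ChaosGrad."""

    def get_diagnostics(self) -> dict:
        # Invented payload; the real keys are not shown in this thread.
        return {"frustration": 0.12}


class PlainOptimizer:
    """Hypothetical stand-in for any other optimizer."""


class Trainer:
    """Hypothetical stand-in exposing only a public `optimizer` attribute."""

    def __init__(self, optimizer):
        self.optimizer = optimizer


def optimizer_diagnostics(trainer):
    """Return ChaosGrad diagnostics, or None when another optimizer is in use.

    Only the public `optimizer` attribute is read; no underscored
    attributes are touched.
    """
    opt = trainer.optimizer
    if isinstance(opt, ChaosGrad):
        return opt.get_diagnostics()
    return None


print(optimizer_diagnostics(Trainer(ChaosGrad())))
print(optimizer_diagnostics(Trainer(PlainOptimizer())))
```

Returning `None` for other optimizers keeps the caller's branch explicit without relying on a private flag.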

Comment thread CONTRIBUTING.md

2. **Compile the model** (PyTorch 2.0+):
```python
model.compile()
```
Copilot AI Apr 6, 2026

OdyssNet.compile() returns a compiled model instance (see existing examples using model = model.compile()). Calling model.compile() without assignment here implies it compiles in-place, which is not generally true and may lead to users not actually using the compiled model.

Suggested change
- model.compile()
+ model = model.compile()
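
The pitfall can be sketched with a toy class whose `compile()` returns a new object instead of modifying the receiver. `TinyModel` and its `compiled` flag are hypothetical and exist only for this illustration; they mirror the return-a-new-instance behavior described above, not OdyssNet's implementation.

```python
class TinyModel:
    """Hypothetical stand-in whose compile() returns a new instance."""

    def __init__(self, compiled: bool = False):
        self.compiled = compiled

    def compile(self) -> "TinyModel":
        # The receiver is left untouched; a compiled copy is returned.
        return TinyModel(compiled=True)


model = TinyModel()

# Result discarded: `model` still refers to the uncompiled object.
model.compile()
discarded_still_eager = not model.compiled

# Result assigned: `model` now refers to the compiled object.
model = model.compile()

print(discarded_still_eager, model.compiled)
```

When an API returns the compiled object, only the assigned form guarantees that later calls use it.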

Comment thread CONTRIBUTING.md
Comment on lines +454 to +459
global patience_counter
patience_counter += 1
if patience_counter > 50:
    print(f"⛔ 50 consecutive increases. Early stopping.")
    raise KeyboardInterrupt

Copilot AI Apr 6, 2026

The anomaly hook example uses raise KeyboardInterrupt for early stopping. Raising KeyboardInterrupt programmatically is unconventional and can interfere with cleanup/exception handling; prefer breaking out of the loop, returning from the training function, or raising a dedicated exception type.
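
One way to follow this advice is sketched below with a dedicated exception type and a small counter object in place of the `global` variable. The names `EarlyStopping`, `IncreaseMonitor`, and `train` are hypothetical, and the loop iterates over a plain list of losses instead of the trainer's real hook interface, which is not shown in this thread.

```python
class EarlyStopping(Exception):
    """Dedicated signal for stopping training; unrelated to Ctrl+C."""


class IncreaseMonitor:
    """Counts consecutive loss increases without a global variable."""

    def __init__(self, patience: int = 50):
        self.patience = patience
        self.count = 0
        self.last_loss = None

    def observe(self, loss: float) -> None:
        # Increment on an increase, reset on any non-increase.
        if self.last_loss is not None and loss > self.last_loss:
            self.count += 1
        else:
            self.count = 0
        self.last_loss = loss
        if self.count > self.patience:
            raise EarlyStopping(f"{self.count} consecutive loss increases")


def train(losses, patience: int = 50):
    """Return the step at which training stopped early, or None."""
    monitor = IncreaseMonitor(patience)
    for step, loss in enumerate(losses):
        try:
            monitor.observe(loss)
        except EarlyStopping:
            # Cleanup (checkpointing, logging) can run here as usual.
            return step
    return None


print(train([1.0, 2.0, 3.0, 4.0], patience=2))
print(train([3.0, 2.0, 1.0], patience=2))
```

Because `EarlyStopping` derives from `Exception`, generic handlers and `finally` blocks behave normally, and a real keyboard interrupt stays distinguishable from a planned stop.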

3 participants