Skip to content

✨ Add RL truncation#697

Draft
flowerthrower wants to merge 32 commits into
mainfrom
678-add-rl-truncation
Draft

✨ Add RL truncation#697
flowerthrower wants to merge 32 commits into
mainfrom
678-add-rl-truncation

Conversation

@flowerthrower
Copy link
Copy Markdown
Collaborator

Description

tbd

Fixes #678

Assisted-by: GPT-5.5 via Codex

Checklist

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

If PR contains AI-assisted content:

  • I have disclosed the use of AI tools in the PR description as per our AI Usage Guidelines.
  • AI-assisted commits include an Assisted-by: [Model Name] via [Tool Name] footer.
  • I confirm that I have personally reviewed and understood all AI-generated content, and accept full responsibility for it.

flowerthrower and others added 29 commits March 11, 2026 12:24
## Description

This PR addresses critical bugs in the RL training process with the
following key changes:

**Structure Improvements:**
- **Redesigned action validation logic** (`predictorenv.py`): Rewrote
`determine_valid_actions_for_state()` with a more structured (but
equivalent) state machine that explicitly tracks three circuit states
(synthesized, laid_out, routed) and handles 6 different state
combinations.
- Added helper methods `is_circuit_laid_out()` and `is_circuit_routed()`
to replace the buggy `CheckMap` pass with more reliable state checking.
The new logic supports both the original restricted MDP and a flexible
general MDP mode.

- **Fixed type annotation** (`actions.py`): Corrected `do_while`
parameter type from `dict[str, Circuit]` to `PropertySet` and added
missing import for Qiskit's `PropertySet`.

- **Added reproducibility** (`predictor.py`): Set random seed for
non-test training runs to ensure reproducible results.

- **Improved VF2Layout error handling** (`predictorenv.py`): Replaced
assertion failures with warning logs when VF2Layout doesn't find a
solution, preventing crashes during training.

**Test Updates:**
- Suppressed deprecation warnings in tket routing test

---------

Signed-off-by: Patrick Hopf <81010725+flowerthrower@users.noreply.github.com>
Co-authored-by: flowerthrower <flowerthrower@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@flowerthrower flowerthrower changed the title 678 add rl truncation ✨ Add RL truncation Jun 2, 2026
@flowerthrower flowerthrower added the feature New feature or request label Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

✨ Add RL truncation

1 participant