Skip to content

Refactor module access to use PyTorch get/set_submodule API#1365

Open
scopophobic wants to merge 8 commits intointel:mainfrom
scopophobic:refactor/use-torch-submodule-api
Open

Refactor module access to use PyTorch get/set_submodule API#1365
scopophobic wants to merge 8 commits intointel:mainfrom
scopophobic:refactor/use-torch-submodule-api

Conversation

@scopophobic
Copy link
Contributor

@scopophobic scopophobic commented Jan 29, 2026

Summary

issue : #1362
This PR refactors module access and replacement to rely on PyTorch’s native
get_submodule / set_submodule APIs instead of maintaining custom traversal logic.

What changed

  • Removed legacy get_attr / set_attr helpers
  • Unified module access through get_module / set_module
    (thin wrappers over get_submodule / set_submodule)
  • Verified all call sites only operate on modules (no attribute access)

Why

PyTorch has provided a robust module replace API since 1.x.
Using the native API improves correctness, readability, and long-term maintainability,
and avoids subtle issues with manual attribute traversal.

Behavior

No functional changes — this is a pure refactor.

@yiliu30 yiliu30 requested review from xin3he and yiliu30 January 30, 2026 00:49
Copy link
Contributor

@xin3he xin3he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@yiliu30 yiliu30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@scopophobic
Copy link
Contributor Author

Thank you for the review @xin3he @yiliu30 . why these two test-cases failing ?

@yiliu30
Copy link
Contributor

yiliu30 commented Feb 3, 2026

@scopophobic
Copy link
Contributor Author

I'm now seeing test failures that appear to be related to this change:

Root Cause

The new fail-fast behavior is breaking code in test/helpers.py (lines 98-100):

for key in model._tied_weights_keys:
    weight = get_attr(model, key)
    set_attr(model, key, copy.deepcopy(weight))

The issue: model._tied_weights_keys may contain keys that don't exist in all model configurations. The old
get_attr silently returned None, but the new version raises AttributeError.

Question: Which approach should I take?

Option 1: Keep silent None behavior

  • Revert get_attr to return None for missing attributes
  • Maintains backward compatibility but Loses the fail-fast benefit.

Option 2: Improve the test case

Update test/helpers.py to handle exceptions properly:

for key in model._tied_weights_keys:
    try:
        weight = get_attr(model, key)
        set_attr(model, key, copy.deepcopy(weight))
    except AttributeError:
        # Skip tied weights that don't exist in this model configuration
        pass
  • Keeps the fail-fast behavior (better for catching bugs)
  • Makes the test code more explicit about handling missing keys

My recommendation: Option 2 (improve the test case), since fail-fast with clear errors is better for debugging and prevents silent bugs from propagating.

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants