Incomplete state_dict / load_state_dict coverage in InftyBaseOptimizer #9

Description

@Bishwarupjee

content: InftyBaseOptimizer.state_dict() and load_state_dict() only delegate to the wrapped base_optimizer. Wrapper-level state is silently dropped: rho, rho_scheduler, the adaptive-perturbation state stored in self.state[p], and self.forward_backward_func. As a result, checkpoints do not fully round-trip.
file: src/infty/optim/geometry_reshaping/base.py
code:

def state_dict(self):
    return self.base_optimizer.state_dict()

def load_state_dict(self, state_dict):
    self.base_optimizer.load_state_dict(state_dict)

description: A checkpoint should also serialize the wrapper's hyperparameters (rho, adaptive, perturb_eps, grad_reduce, schedulers), the self.state dict that carries the perturbation vectors, and any closure-related state. Add a minimal override that returns {"base": self.base_optimizer.state_dict(), "infty": ...}, with a matching load_state_dict, so that training with SAM, GSAM, and LookSAM is reproducible after save/load.
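
A minimal sketch of the proposed override follows. It is torch-free so it runs standalone, and it is not the repository's actual code. The assumptions are: `DummyBaseOptimizer` and `WrapperSketch` are hypothetical stand-ins for a `torch.optim` optimizer and for `InftyBaseOptimizer`; the nested `"hparams"` / `"state"` keys under `"infty"`, the constructor defaults, and the `"e_w"` entry are illustrative; `self.state` is keyed by plain integers here, whereas the real class keys it by parameter tensors and would need an index mapping. Schedulers and `forward_backward_func` are left out of the sketch.

```python
import copy


class DummyBaseOptimizer:
    """Hypothetical stand-in for a torch.optim optimizer.

    It only mimics the state_dict() / load_state_dict() pair.
    """

    def __init__(self, lr=0.1):
        self.lr = lr
        self.step_count = 0

    def state_dict(self):
        return {"lr": self.lr, "step_count": self.step_count}

    def load_state_dict(self, state_dict):
        self.lr = state_dict["lr"]
        self.step_count = state_dict["step_count"]


class WrapperSketch:
    """Hypothetical stand-in for InftyBaseOptimizer."""

    # Wrapper hyperparameters named in the issue; defaults below are assumed.
    _HPARAMS = ("rho", "adaptive", "perturb_eps", "grad_reduce")

    def __init__(self, base_optimizer, rho=0.05, adaptive=False,
                 perturb_eps=1e-12, grad_reduce="mean"):
        self.base_optimizer = base_optimizer
        self.rho = rho
        self.adaptive = adaptive
        self.perturb_eps = perturb_eps
        self.grad_reduce = grad_reduce
        # Per-parameter perturbation state, keyed by an integer index
        # in this sketch (the real class keys it by parameter tensor).
        self.state = {}

    def state_dict(self):
        # Nest the base optimizer's checkpoint next to the wrapper's own state.
        return {
            "base": self.base_optimizer.state_dict(),
            "infty": {
                "hparams": {k: getattr(self, k) for k in self._HPARAMS},
                "state": copy.deepcopy(self.state),
            },
        }

    def load_state_dict(self, state_dict):
        if "base" not in state_dict:
            # Old-format checkpoint: it holds only the base optimizer's state.
            self.base_optimizer.load_state_dict(state_dict)
            return
        self.base_optimizer.load_state_dict(state_dict["base"])
        infty = state_dict.get("infty", {})
        for name, value in infty.get("hparams", {}).items():
            setattr(self, name, value)
        self.state = copy.deepcopy(infty.get("state", {}))


# Round trip: save from one wrapper, restore into a freshly built one.
src = WrapperSketch(DummyBaseOptimizer(lr=0.1), rho=0.2, adaptive=True)
src.base_optimizer.step_count = 7
src.state[0] = {"e_w": [0.01, -0.02]}  # "e_w" is an illustrative key
ckpt = src.state_dict()

dst = WrapperSketch(DummyBaseOptimizer(lr=0.5))
dst.load_state_dict(ckpt)
```

Falling back to the base optimizer when the `"base"` key is absent keeps checkpoints written by the current implementation loadable; whether that backward compatibility is wanted is a design choice for the maintainers.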
