content: InftyBaseOptimizer.state_dict() and load_state_dict() only delegate to the wrapped base_optimizer, silently dropping the wrapper's own state such as rho, rho_scheduler, adaptive-perturbation state stored in self.state[p], and self.forward_backward_func. A checkpoint saved through the wrapper therefore does not fully round-trip: after load, the base optimizer is restored but the wrapper is not.
file: src/infty/optim/geometry_reshaping/base.py
code:
def state_dict(self):
    return self.base_optimizer.state_dict()

def load_state_dict(self, state_dict):
    self.base_optimizer.load_state_dict(state_dict)
description: A checkpoint should also serialize the wrapper's hyperparameters (rho, adaptive, perturb_eps, grad_reduce, schedulers), the self.state dict that carries perturbation vectors, and any closure-related state. Add a minimal override returning {"base": self.base_optimizer.state_dict(), "infty": ...} and a matching loader, so SAM, GSAM, and LookSAM all become reproducible after save/load.
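A minimal sketch of the proposed override, offered as an illustration and not as the library's actual implementation. To keep it runnable without PyTorch, it uses two hypothetical stand-ins: FakeBaseOptimizer (in place of a real torch optimizer) and WrapperSketch (in place of InftyBaseOptimizer). The field names rho, adaptive, and perturb_eps and the "base"/"infty" keys come from the description above; everything else is an assumption. In the real class, self.state is keyed by parameter tensors and would probably need to be remapped to stable indices before saving, rho_scheduler and grad_reduce would need their own handling, and forward_backward_func is a callable that the caller would likely have to re-attach after loading.

```python
import copy


class FakeBaseOptimizer:
    """Hypothetical stand-in for a torch optimizer; models only state round-tripping."""

    def __init__(self, lr):
        self.lr = lr
        self.step_count = 0

    def state_dict(self):
        return {"lr": self.lr, "step_count": self.step_count}

    def load_state_dict(self, state_dict):
        self.lr = state_dict["lr"]
        self.step_count = state_dict["step_count"]


class WrapperSketch:
    """Hypothetical wrapper carrying a subset of the fields named in this report."""

    def __init__(self, base_optimizer, rho=0.05, adaptive=False, perturb_eps=1e-12):
        self.base_optimizer = base_optimizer
        self.rho = rho
        self.adaptive = adaptive
        self.perturb_eps = perturb_eps
        # Per-parameter perturbation state; keyed by integer index in this sketch.
        self.state = {}

    def state_dict(self):
        # Nest the base optimizer's state and add the wrapper's own fields.
        return {
            "base": self.base_optimizer.state_dict(),
            "infty": {
                "rho": self.rho,
                "adaptive": self.adaptive,
                "perturb_eps": self.perturb_eps,
                "state": copy.deepcopy(self.state),
            },
        }

    def load_state_dict(self, state_dict):
        if "base" not in state_dict:
            # Assumed legacy layout: the checkpoint holds base-optimizer state only.
            self.base_optimizer.load_state_dict(state_dict)
            return
        self.base_optimizer.load_state_dict(state_dict["base"])
        infty = state_dict["infty"]
        self.rho = infty["rho"]
        self.adaptive = infty["adaptive"]
        self.perturb_eps = infty["perturb_eps"]
        self.state = copy.deepcopy(infty["state"])


# Usage: save from one wrapper, restore into a freshly constructed one.
src = WrapperSketch(FakeBaseOptimizer(lr=0.1), rho=0.2, adaptive=True)
src.base_optimizer.step_count = 7
src.state[0] = {"e_w": [0.01, -0.02]}
ckpt = src.state_dict()

dst = WrapperSketch(FakeBaseOptimizer(lr=1.0))
dst.load_state_dict(ckpt)
```

The fallback branch in load_state_dict is a design choice, not a requirement: it lets checkpoints written by the current delegate-only implementation keep loading, at the cost of leaving the wrapper fields at their constructor defaults.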