
Batch API#521

Closed
klamike wants to merge 18 commits into JuliaSmoothOptimizers:main from klamike:mk/batch

Conversation

@klamike (Collaborator) commented Nov 20, 2025

to be continued in #540...

@amontoison (Member)

cc @frapac

@klamike (Collaborator, Author) commented Nov 20, 2025

Docs are broken due to #520

@tmigot (Member) commented Nov 20, 2025

Hi @klamike ! Can you expand a bit more on the context for this and potential use of it?

@klamike (Collaborator, Author) commented Nov 20, 2025

@tmigot The idea is to define an API for talking about batches of models in a standardized way across JSO. Each solver could then define its own batch model types (up for discussion; maybe we should do that here instead?), which would mark the relevant assumptions (e.g. same jac/hess structure, same nvar/ncon, etc.) so that the shared structure can be exploited, importantly not just in the evaluation of the model but also in the solvers themselves. For example, if we assume the same jac/hess structure, we can use CUDSS's uniform batch API for solving the KKT systems instead of looping over single solves. This should bring a substantial speedup over something like madnlp.(nlps::Vector{<:AbstractNLPModel}).

To be honest, this kind of change is very big for me to think about all the consequences of different design decisions at once. So, we decided it might be best to just implement something dumb first to get some visibility into tradeoffs.
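To illustrate the idea above, here is a minimal hypothetical sketch of a "for-each" batch wrapper that falls back to looping over single evaluations. The names (`ForEachBatch`, `batch_obj`, `ToyModel`) are illustrative only; the real PR builds on NLPModels.jl's `AbstractNLPModel`, which this sketch avoids to stay self-contained.

```julia
# Hypothetical sketch, not the PR's actual implementation.
abstract type AbstractBatchModel end

struct ForEachBatch{M} <: AbstractBatchModel
    models::Vector{M}   # one model per batch element; structures may differ
end

# Batched objective: the generic fallback loops over single evaluations.
# Specialized batch types could exploit shared structure instead.
batch_obj(b::ForEachBatch, xs::Vector{<:AbstractVector}) =
    [obj(m, x) for (m, x) in zip(b.models, xs)]

# Toy "model" just to exercise the sketch: f(x) = sum(c .* x.^2)
struct ToyModel
    c::Vector{Float64}
end
obj(m::ToyModel, x) = sum(m.c .* x .^ 2)

batch = ForEachBatch([ToyModel([1.0, 2.0]), ToyModel([3.0, 4.0])])
fx = batch_obj(batch, [[1.0, 1.0], [1.0, 1.0]])
# fx == [3.0, 7.0]
```

A specialized batch type (e.g. one that assumes a shared Jacobian sparsity pattern) would override the fallback to call a batched kernel instead of the loop.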

isempty(updates) && error("Cannot create InplaceBatchNLPModel from empty collection.")
InplaceBatchNLPModel{M}(base_model, updates, Counters(), length(updates))
end
# TODO: counters?
@klamike (Collaborator, Author) commented on the diff:

Note that the counters are not properly wired up yet, both here and in foreach

]
@test hprods ≈ manual_hprods

if isa(bnlp, ForEachBatchNLPModel) # NOTE: excluding InplaceBatchNLPModel
@klamike (Collaborator, Author) commented on the diff:

Had to exclude the in-place version here, since the closures in the operators reference the model state, which we are mutating without their knowledge.

@klamike (Collaborator, Author) commented on the diff:

It should be doable to put the update function inside the operator closure.

@amontoison amontoison changed the base branch from main to 0.21.x December 3, 2025 11:23
@amontoison amontoison changed the base branch from 0.21.x to main December 3, 2025 11:24
@amontoison (Member)

@klamike Can you rebase your branch onto main? We should be able to test it more easily now.

@klamike (Collaborator, Author) commented Dec 9, 2025

@amontoison I've rebased, tests should pass now. Curious what you think of this approach, as well as the downstream klamike/MadIPM.jl#1 and klamike/MadNLP.jl#1 (which so far do not depend on this PR)

@amontoison (Member) commented Dec 11, 2025

@klamike @andrewrosemberg
The PR is quite big, and we can probably make some simplifications, but it is excellent for prototyping!
I have a few comments:

  • Since we are back to 0.21.x ( 💪 ), we can try it on all models like AMPLModel, CUTEstModel, MathOptNLPModel, or ExaModel.
    Can you try to run a batch of 5 ExaModels (CPU and GPU) using the recent parametric support you added in ExaModels.jl? Doing a DC-OPF with parametric QPLOAD / PLOAD could be a very nice proof-of-concept.
    Andrew, maybe this task is more relevant for you?

  • A BatchNLPModelMeta could be useful to specify what is in common and what is not. Inside, we could specify whether we have the same lcon, ucon, lvar, uvar, obj, ..., and avoid duplicating the arrays when not needed. This would allow a less redundant implementation.

  • Another important feature is the ability to evaluate only the obj / grad / jac / ... of a specific model in the batch. We should provide a variant of the function where we can specify an index of the problem, a subset of the problems, or all problems.

We don’t want to update all KKT systems if some of them have already converged.
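The two suggestions above (a batch meta recording what is shared, and subset evaluation so converged problems are skipped) could look roughly like the following sketch. All names here are hypothetical, not settled API.

```julia
# Hypothetical sketch of a BatchNLPModelMeta-style record: flags mark
# which quantities are shared across the batch, so shared arrays need
# not be duplicated.
struct BatchMetaSketch
    nbatch::Int
    same_structure::Bool   # same jac/hess sparsity across the batch
    same_bounds::Bool      # shared lvar/uvar/lcon/ucon
end

# Evaluating only a subset of the batch (e.g. the not-yet-converged
# problems). `fs` is one callable per problem here, just to keep the
# sketch self-contained.
batch_obj_subset(fs, xs, idx) = Dict(i => fs[i](xs[i]) for i in idx)

meta = BatchMetaSketch(2, true, false)
fs = [x -> sum(x), x -> sum(abs2, x)]
xs = [[1.0, 2.0], [3.0, 4.0]]
out = batch_obj_subset(fs, xs, [2])   # only problem 2 is evaluated
# out[2] == 25.0
```

With an index-based variant like this, a solver can restrict KKT updates to the active subset rather than re-evaluating the whole batch.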

@amontoison (Member) commented Feb 3, 2026

@klamike I worked a little bit on the batch support in ExaModels.jl and I propose to only provide the following API:

export AbstractBatchNLPModel

export batch_obj, batch_obj!
export batch_grad, batch_grad!
export batch_cons, batch_cons!
export batch_jac_structure!, batch_jac_structure
export batch_hess_structure!, batch_hess_structure
export batch_jac_coord!, batch_jac_coord
export batch_hess_coord!, batch_hess_coord
export batch_jprod, batch_jprod!
export batch_jtprod, batch_jtprod!
export batch_hprod, batch_hprod!

We don't need anything else, and I would prefer that we extend it in the future if needed.
I will open a PR to update your api.jl.
I also think that we should force the API to be strided, with everything stored in one long vector (x, fx, gx, Jval, hval, etc.).
We can do a reshape for free if we want a matrix where each column is a scenario.
What do you think if we also remove foreach.jl and inplace.jl?
It is a lot of new API that may not be needed; we could move it to NLPModelsTest.jl for testing the batch API.

What is still missing is a BatchNLPModelMeta where we store all lvar, uvar, lcon, ucon in a strided way, and also a field nlp.meta.nbatch.
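The strided layout proposed above can be sketched as follows: all batch inputs live in one long vector, and the matrix view (one column per scenario) is a free `reshape`. The function name `batch_obj_strided!` and the toy objective are illustrative assumptions, not the actual api.jl.

```julia
# Strided storage sketch: nvar and nbatch are illustrative values.
nvar, nbatch = 3, 4
x_long = collect(1.0:(nvar * nbatch))   # one long vector, length nvar*nbatch
X = reshape(x_long, nvar, nbatch)       # no copy: column j is scenario j

# Under these assumptions, an in-place strided batch objective writes one
# value per scenario into a preallocated output vector.
function batch_obj_strided!(fx, x_long, nvar, nbatch)
    X = reshape(x_long, nvar, nbatch)
    for j in 1:nbatch
        fx[j] = sum(abs2, @view X[:, j])  # toy objective: squared 2-norm
    end
    fx
end

fx = zeros(nbatch)
batch_obj_strided!(fx, x_long, nvar, nbatch)
# fx[1] == 1 + 4 + 9 == 14.0
```

Because `reshape` on a contiguous vector returns a view, the column-per-scenario layout costs nothing on CPU or GPU, which is what makes the single-long-vector convention attractive.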

@amontoison (Member) commented Feb 3, 2026

@klamike 784ccb7
I still need to do a pass on the docstrings.

@klamike (Collaborator, Author) commented Feb 3, 2026

784ccb7 LGTM, feel free to push it here! Thanks for spending some time on this.

Agreed on all your points, except that I'm not so sure we should force strided storage (for example, in a multi-GPU/multi-node setting we may not want it). It can always be changed or added separately later.

Also, it would be nice to have batch attributes similar to the operator-availability ones you added recently for NLPModelMeta, like is_strided_batch or is_uniform_batch. Then the MadIPM UniformBatch wouldn't have to check that all the structures match.
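A trait-style sketch of those attributes, under the assumption that they become queryable flags on the batch meta (the names is_strided_batch / is_uniform_batch come from the comment above and are not an existing API):

```julia
# Hypothetical batch attributes: a solver wrapper can branch once on the
# trait instead of comparing sparsity structures across the whole batch.
struct BatchTraits
    is_strided_batch::Bool
    is_uniform_batch::Bool   # all models share jac/hess structure
end

is_uniform_batch(t::BatchTraits) = t.is_uniform_batch

traits = BatchTraits(true, true)
kkt_strategy = is_uniform_batch(traits) ? :uniform_batched_factorization :
                                          :per_model_factorization
# kkt_strategy == :uniform_batched_factorization
```

Under this design, a uniform-batch solver path (e.g. one using a batched sparse factorization) is selected by a single flag check at construction time rather than an O(batch) structural comparison.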

@klamike
Copy link
Collaborator Author

klamike commented Feb 4, 2026

closing in favor of #540

@klamike klamike closed this Feb 4, 2026
