Use DynamicPPL adtype path for Turing models#16
Conversation
There was a problem hiding this comment.
Pull request overview
Updates ParallelMCMC’s Turing/DynamicPPL integration to use DynamicPPL’s adtype (DifferentiationInterface) path for gradients, and to evaluate in linked/unconstrained space by default—aligning with upstream recommendations from Issue #15.
Changes:
- Replace
LogDensityProblemsAD.ADgradient(...)usage withDynamicPPL.LogDensityFunction(...; adtype=...). - Ensure the Turing convenience constructor uses
LinkAll()(linked/unconstrained space) and add a regression test for constrained parameters (Beta). - Remove
LogDensityProblemsADfrom package deps/extension wiring and update docs/examples accordingly.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
ext/DynamicPPLExt.jl |
Switches the Turing convenience constructor to build a linked-space LogDensityFunction and enable gradients via adtype. |
ext/LogDensityProblemsExt.jl |
Updates docs and error messaging to reflect the new gradient-enablement path (no longer via LogDensityProblemsAD). |
test/test-Turing-Integration.jl |
Migrates tests to the adtype path and adds a constrained-model (Beta) linked-space regression test. |
test/test-DEER-Turing-Logistic.jl |
Removes now-unused LogDensityProblemsAD import. |
docs/src/10-getting-started.md |
Updates “manual LogDensityProblems path” documentation to use adtype + linked-space constructor. |
Project.toml |
Drops LogDensityProblemsAD dependency and removes it from extension/weakdep wiring. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ld = DynamicPPL.LogDensityFunction( | ||
| mymodel(obs), | ||
| DynamicPPL.getlogjoint_internal, | ||
| DynamicPPL.LinkAll(); | ||
| adtype=ADTypes.AutoEnzyme(), |
| caps isa LogDensityProblems.LogDensityOrder{0} && error( | ||
| "LogDensityProblems model must support gradients (LogDensityOrder{1} or higher). " * | ||
| "Wrap it with LogDensityProblemsAD.ADgradient first.", | ||
| "Construct it with gradient support enabled.", |
| using Turing, LogDensityProblems, ADTypes | ||
| using ParallelMCMC, MCMCChains | ||
|
|
||
| ld = DynamicPPL.LogDensityFunction(normal_model(1.5)) | ||
| ldg = LogDensityProblemsAD.ADgradient(ADTypes.AutoEnzyme(), ld) | ||
| ld = DynamicPPL.LogDensityFunction( | ||
| normal_model(1.5), |
Codecov Report✅ All modified and coverable lines are covered by tests. ❌ Your project status has failed because the head coverage (88.89%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #16 +/- ##
==========================================
- Coverage 88.91% 88.89% -0.02%
==========================================
Files 6 6
Lines 1046 1045 -1
==========================================
- Hits 930 929 -1
Misses 116 116 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
penelopeysm
left a comment
There was a problem hiding this comment.
It looks good to me :)
There are other things that I think I could help a bit with. In particular, I think (but haven't verified!) that when you sample with say MALA you will get back samples that are unconstrained. The same is true of the logjoint value returned, it will probably include the Jacobian term associated with the transformation. That's fine if you know what you are doing but can be a bit confusing for end users.
In Turing we fix both of these issues by rerunning the model once per iteration with the unconstrained values, to get back the original constrained values as well as the logjoint in the correct space. (The docs also explain how to do this – although this docs page is maybe only a couple of weeks old, so it may not have been around when you were writing this, sorry!) This does take a tiny bit of time, but compared to the time taken to calculate the gradient etc. during the sampling step, it's pretty negligible.
Usually we do this in the return value of AbstractMCMC.step (see e.g. https://github.com/TuringLang/Turing.jl/blob/18eedab4b81d53d32e773a0dd4f5deba24d915fe/src/mcmc/hmc.jl#L243-L252), but I think because of the way you have overloaded mcmcsample, you would have to do this altogether at the end of mcmcsample. That is, in here:
ParallelMCMC.jl/src/interface.jl
Lines 754 to 757 in 882b6b4
before calling bundle_samples, we could add an overload to convert each item of samples to ParamsWithStats. This is only possible if it came from a DPPL model, so the overload would live in the DPPL extension, and we would make it specific by dispatching on the type of model.logdensity (if it's a DPPL.LogDensityFunction then we are good).
Another benefit of this is that you will also get back the parameter names correctly, so you don't need to do the parameter name extraction things. That's basically how Turing ensures that when you call sample(model, NUTS(), ...), all the transformation stuff is completely hidden away from the user.
Finally, Turing is switching (or as of a few hours ago, has switched) to using FlexiChains.jl as the default output format (instead of MCMCChains). It should be very easy to get that to work here, in particular if you are willing to do the reevaluation above, then it will be very easy to hook up to FlexiChains because all the plumbing is already there.
I realise this is quite a lot of thought-dump. I can do a PR (not sure when, but definitely within this coming week) if that is useful, or if you'd like to try to do it yourself I'm also happy to help answer any questions on GitHub :)
|
Great, thank you @penelopeysm! You’ve raised some really helpful points, especially around unconstrained parameters, naming, and how best to handle the interface between the algorithms and model sampling. I wasn’t always sure what the right design was there, so this is incredibly useful. I also wasn’t aware of the move to I’ll be pretty busy this week with NeurIPS submissions, so I’d really appreciate it if you’d be open to tackling some of this, especially given your expertise and your perspective from the Turing side. That said, no pressure at all; I’m also happy to work through it as I have time. Thanks again for all the detailed feedback! |
|
Of course, no problem at all, I'll add it to my to-do list. I'm a bit busy this weekend but I think mid-week I should be able to get round to it. I hope I didn't come on too strong haha, I'm just excited to see more packages interfacing with Turing 😄 Good luck with NeurIPS! |
|
Not at all--I'm really excited that someone who knows Turing so well will help get this into shape so that this is a widely usable package. I'm really excited about this method and hope others will be too. Thanks again! |
Resolves issues pointed out in #15