diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..c051f5269
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,87 @@
+# AGENTS.md
+
+Repository guidance for coding agents. See @JULIA.md for general Julia practices and `docs/src/onboarding.md` for newcomer-oriented background.
+
+## Project Overview
+
+DynamicPPL.jl is the core probabilistic programming language backend for the Turing.jl ecosystem. It provides the `@model` macro for tilde (`~`) statements and infrastructure for evaluating, conditioning, fixing, transforming, and inspecting probabilistic models.
+
+DynamicPPL builds on AbstractPPL.jl for shared PPL interfaces such as `VarName`, contexts, conditioning/fixing, and evaluator protocols.
+
+## Tests And Formatting
+
+  - Tests are split into Group1/Group2 via `GROUP` in `test/runtests.jl`.
+
+  - CI also runs Aqua.jl quality checks and doctests.
+  - Test files are self-contained: use package imports, not relative imports or `include()`, so they run individually with TestPicker.jl.
+  - Formatting is JuliaFormatter v1 (Blue style), enforced by CI:
+
+    ```bash
+    julia --project -e 'using JuliaFormatter; format(".")'
+    ```
+
+## Architecture Pointers
+
+  - Docs: model evaluation, tilde pipeline, init strategies, transform strategies, accumulators, conditioning/fixing, and thread-safe accumulation.
+  - `Model` (`src/model.jl`): wraps model function, args, context; created by `@model` in `src/compiler.jl`.
+  - `AbstractVarInfo` (`src/abstract_varinfo.jl`): tracks random variables and accumulated quantities during evaluation.
+  - `VarName` (AbstractPPL): address for model variables, including nested fields/indices.
+  - `VarNamedTuple` (`src/varnamedtuple.jl`): named-tuple-like parameter storage keyed by `VarName`.
+  - `LogDensityFunction` (`src/logdensityfunction.jl`): bridge from named parameters to flat `AbstractVector{<:Real}` for samplers, optimisers, and AD via LogDensityProblems.jl.
+  - `ext/`: `DynamicPPLForwardDiffExt`, `DynamicPPLMooncakeExt`, `DynamicPPLReverseDiffExt`, `DynamicPPLEnzymeCoreExt`, `DynamicPPLComponentArraysExt`, `DynamicPPLMCMCChainsExt`, and `DynamicPPLMarginalLogDensitiesExt`.
+  - `DynamicPPL.TestUtils`: analytical test models (`logprior_true`, `loglikelihood_true`, etc.), `run_ad`, `ADResult`.
+
+## DynamicPPL Invariants
+
+Evaluator methods follow BangBang `!!` semantics (see JULIA.md). `VarInfo` and `AccumulatorTuple` are immutable, so discarding a `!!` return value is a silent bug.
+
+**`accumulate_assume!!`** — `val` is model-space (passed to `logpdf`); `tval` is transformed; `logjac` is the log-Jacobian of the forward link transform (zero if unlinked):
+
+```julia
+vi = accumulate_assume!!(vi, x, tval, logjac, vn, dist, template)
+```
+
+**`LogLikelihoodAccumulator`** uses `Distributions.loglikelihood`, not `logpdf` — array/product observations differ in shape and aggregation.
+
+**Dynamic transforms** — `DynamicLink`/`Unlink` re-derive bijections from `dist` because support can depend on earlier RVs (e.g. `y ~ truncated(Normal(); lower=x)`). Use `get_raw_value(tv, dist)`; the one-argument form only works for `NoTransform` and `FixedTransform`. Never cache a fixed bijection. Use `FixedTransform`/`WithTransforms` only when support is constant, and make sure the fixed transform exactly matches the target.
+
+**Log joint** — `getlogjoint_internal(vi) = getlogjoint(vi) - getlogjac(vi)`. Samplers in unconstrained space want `getlogjoint_internal`; constrained-space is `getlogjoint`.
+
+**ReverseDiff** — don't use `AutoReverseDiff(; compile=true)` when model control flow depends on parameter values (compiled tapes are input-dependent).
+
+## Review Focus
+
+  - Prefer `OnlyAccsVarInfo` + `init!!` for new evaluation code that needs only accumulators or a subset of `VarInfo` state.
+  - Avoid adding behaviour to `VarInfo` by default; it bundles values, transform state, metadata, and accumulators, but most fast paths need only part.
+  - Keep evaluator APIs split: structural prep vs AD-specific prep. Backend gradient code goes in extensions.
+  - Use `VarNamedTuple` as the canonical internal representation for named parameter collections in new code. Convert user-facing `NamedTuple` and `Dict{VarName}` inputs at boundaries.
+  - Preserve templates, shapes, and index structure when round-tripping between named values and flat vectors.
+  - Ensure `copy(acc)` does not share mutable internal state; aliased accumulator containers corrupt results when copied for `ThreadSafeVarInfo`.
+  - Use `@varname(x)`, not `:x` or `VarName(:x)`. Use subsumption for containment checks, e.g. `subsumes(@varname(x), @varname(x[1]))`. Conditioning on `@varname(x)` covers subindices; conditioning on `@varname(x[1])` only matches that index.
+
+## `@model` Compiler
+
+`@model` lowering must preserve ordinary Julia semantics, not only probabilistic statements.
+
+For compiler changes, test positional and keyword arguments, default values, splatting, closures, interpolation, return values, no-observation models, and data- or parameter-dependent control flow.
+
+Keep macro hygiene explicit. User variables, generated temporaries, and globals should not capture each other accidentally. Inspect expanded code when changing compiler paths. Preserve model return values; they are user-visible and distinct from accumulated random variables.
+
+## Threading
+
+Implement `promote_for_threadsafe_eval(acc, T)` for accumulators with concrete float fields; the default no-op leaves them unable to hold AD tracers like ForwardDiff `Dual`s. General threading guidance lives in JULIA.md.
+
+## Contributing Checklist
+
+  - Non-breaking changes target `main`; breaking changes target `breaking`.
+  - Julia `1.10.8` is the minimum supported version in `Project.toml`.
+  - CI runs Ubuntu/Windows/macOS, Julia stable/min/1.11, and both one- and two-thread configurations.
+  - Identify whether the change is user-facing, internal, or downstream-facing through Turing.jl.
+  - Add the smallest tests that exercise the behavior.
+  - Add nested-submodel tests for context, prefix, conditioning, or fixing changes.
+  - Add AD backend tests for log-density, transform, vector-parameter, or `run_ad` changes.
+  - Add round-trip tests for flattening and unflattening changes, including scalars, arrays, tuples, `NamedTuple`s, nested values, and mixed element types.
+  - Check type stability and allocations for hot paths.
+  - Check dependency placement and compat bounds when touching Project files, extensions, docs, or tests.
+  - Include benchmark numbers for performance-sensitive changes.
+  - Document and test new user-facing API.
diff --git a/CLAUDE.md b/CLAUDE.md
index ad60cb5f3..496e0e98e 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,109 +1,2 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Project Overview
-
-DynamicPPL.jl is the core probabilistic programming language and backend for the [Turing.jl](https://github.com/TuringLang/Turing.jl) ecosystem. It provides the `@model` macro for defining probabilistic models with tilde (`~`) statements, and infrastructure for evaluating, conditioning, and transforming those models.
-
-## Test Structure
-
-Tests are split into Group1 and Group2 for CI parallelism (controlled by the `GROUP` env var in `test/runtests.jl`). CI also runs Aqua.jl quality checks and doctests.
-
-**Important**: Each test file should be self-contained. All dependencies must come from package imports, not relative imports or `include()` statements. This enables running individual test files via [TestPicker.jl](https://github.com/theogf/TestPicker.jl).
-
-## Formatting
-
-Code formatting uses [JuliaFormatter.jl](https://github.com/domluna/JuliaFormatter.jl) v1 (not v2) with the **Blue style** (configured in `.JuliaFormatter.toml`). CI enforces formatting on all PRs.
-
-```bash
-julia --project -e 'using JuliaFormatter; format(".")'
-```
-
-## Architecture
-
-For how things work, see the [docs](https://turinglang.org/DynamicPPL.jl/stable/): [model evaluation](https://turinglang.org/DynamicPPL.jl/stable/evaluation/), [tilde pipeline](https://turinglang.org/DynamicPPL.jl/stable/tilde/), [init strategies](https://turinglang.org/DynamicPPL.jl/stable/init/), [transform strategies](https://turinglang.org/DynamicPPL.jl/stable/transforms/), [accumulators](https://turinglang.org/DynamicPPL.jl/stable/accs/overview/), [conditioning/fixing](https://turinglang.org/DynamicPPL.jl/stable/conditionfix/), [threading](https://turinglang.org/DynamicPPL.jl/stable/accs/threadsafe/).
-
-### Key Types
-
-  - **`Model`** (`src/model.jl`): Wraps a model function with its arguments and context. Created by the `@model` macro (`src/compiler.jl`).
-  - **`AbstractVarInfo`** (`src/abstract_varinfo.jl`): Interface for tracking random variables and accumulated quantities during model execution.
-  - **`VarNamedTuple`** (`src/varnamedtuple.jl`): A named-tuple-like structure keyed by `VarName`s (from AbstractPPL). Used as the primary representation for parameter values.
-  - **`LogDensityFunction`** (`src/logdensityfunction.jl`): Translation layer between named model parameters and flat `AbstractVector{<:Real}` for optimisers/samplers. Implements the `LogDensityProblems.jl` interface.
-
-### Extensions (`ext/`)
-
-Optional AD backends and integrations, loaded via Julia's package extension system:
-
-  - `DynamicPPLForwardDiffExt` — ForwardDiff AD
-  - `DynamicPPLMooncakeExt` — Mooncake AD (with precompilation workload)
-  - `DynamicPPLReverseDiffExt` — ReverseDiff AD
-  - `DynamicPPLEnzymeCoreExt` — Enzyme AD
-  - `DynamicPPLMCMCChainsExt` — MCMCChains integration
-  - `DynamicPPLMarginalLogDensitiesExt` — marginalization support
-
-### Testing Utilities (`src/test_utils/`)
-
-`DynamicPPL.TestUtils` provides test models with known analytical solutions (`logprior_true`, `loglikelihood_true`, etc.) and an AD testing framework (`run_ad`, `ADResult`) used across the Turing ecosystem.
-
-## Review Guidelines
-
-Common pitfalls and non-obvious constraints when writing or reviewing DynamicPPL code.
-
-### Prefer `OnlyAccsVarInfo` over `VarInfo`
-
-New code should use `OnlyAccsVarInfo` (OAVI) + `init!!`, not `VarInfo` + `evaluate!!`. VarInfo is being phased out ([#1376](https://github.com/TuringLang/DynamicPPL.jl/issues/1376)) — it carries redundant state (`vi.values` duplicates `VectorValueAccumulator`) and is slower. Don't add new features to VarInfo. The migration path: `evaluate!!(model, vi)` becomes `init!!(model, oavi, InitFromParams(vi.values), vi.transform_strategy)`.
-
-### BangBang (`!!`) Return Values
-
-Functions suffixed with `!!` (from BangBang.jl) attempt in-place mutation but may return a new object instead. **Always use the return value.** `VarInfo` and `AccumulatorTuple` are immutable structs, so `!!` functions unconditionally return new objects — discarding the return value is a silent bug with no warning.
-
-```julia
-# WRONG: mutation didn't happen, vi is unchanged
-accumulate_assume!!(vi, x, tval, logjac, vn, dist, template)
-
-# RIGHT
-vi = accumulate_assume!!(vi, x, tval, logjac, vn, dist, template)
-```
-
-This applies transitively: if your function calls a `!!` function, it must also return the updated state.
-
-### Accumulator Pitfalls
-
-See [accumulator docs](https://turinglang.org/DynamicPPL.jl/stable/accs/overview/) for the full protocol. Common mistakes:
-
-  - **`val` vs `tval` in `accumulate_assume!!`**: `val` is always in the original unlinked space (use it for `logpdf`). `tval` is the `TransformedValue` which may hold linked values. `logjac` is the log-Jacobian of the **forward** link transform (zero if unlinked). Confusing these is a common source of wrong log-densities.
-  - **`logpdf` vs `loglikelihood` for observations**: `LogLikelihoodAccumulator` uses `Distributions.loglikelihood`, not `logpdf`. For vector observations, `logpdf` returns a vector while `loglikelihood` returns a scalar sum. Using `logpdf` where `loglikelihood` is expected silently produces wrong types. See [JuliaStats/Distributions.jl#1972](https://github.com/JuliaStats/Distributions.jl/issues/1972).
-  - **Aliased `copy`**: `copy(acc)` must deep-copy all mutable internal state. Aliased containers (e.g. shared `Vector` fields) corrupt results when accumulators are copied for `ThreadSafeVarInfo`.
-
-### TransformedValue
-
-  - **`get_raw_value(tv)` errors for `DynamicLink` and `Unlink`.** These transforms are derived from the distribution, so you must use `get_raw_value(tv, dist)`. The one-argument form only works for `NoTransform` and `FixedTransform`.
-  - **`DynamicLink` re-derives the bijection from `dist` every evaluation.** This is necessary because the support of a variable can depend on other variables (e.g. `y ~ truncated(Normal(); lower=x)`), so the transform cannot be cached. When the support is known to be constant, [`FixedTransform` via `WithTransforms`](https://turinglang.org/DynamicPPL.jl/stable/fixed_transforms/) is an option.
-  - **`FixedTransform` must exactly match the target.** `apply_transform_strategy` errors if a `FixedTransform` doesn't match the expected `target_transform`. Fixed transforms don't compose with re-derived transforms.
-
-### LogDensityFunction
-
-  - **`getlogjoint_internal` vs `getlogjoint`**: `getlogjoint_internal(vi) = getlogjoint(vi) - getlogjac(vi)`. Samplers operating in unconstrained space need `getlogjoint_internal` (the default). `getlogjoint` gives the density in constrained space without the Jacobian correction — using it for HMC/NUTS is wrong.
-  - **Compiled ReverseDiff tapes are input-dependent.** If your model has control flow that depends on parameter values (e.g. `if x > 0`), compiled ReverseDiff will only give correct gradients for inputs that trigger the same branch as the compilation input. Don't use `AutoReverseDiff(; compile=true)` with parameter-dependent branching.
-
-### `VarNamedTuple` as Primary Data Structure
-
-`VarNamedTuple` is the canonical representation for named parameter collections throughout DynamicPPL. New code should use it everywhere — for conditioning, fixing, parameter storage, and accumulator values. `NamedTuple` and `Dict{VarName}` are accepted as user-facing input but only insofar as they are converted to `VarNamedTuple` at the boundary. Don't propagate them through internal code.
-
-See the [VarNamedTuple docs](https://turinglang.org/DynamicPPL.jl/stable/vnt/motivation/) for motivation — it is performant, general, and provides a single source of truth for named parameter collections.
-
-### VarName
-
-  - **Use `@varname(x)`, not `:x` or `VarName(:x)`.** The macro constructs the correct optic for indexed access. `@varname(x[1])` creates a VarName with an index lens — constructing this manually is error-prone.
-  - **Subsumption, not equality, for containment checks.** `subsumes(@varname(x), @varname(x[1]))` is `true`, but they are not `==`. Conditioning on `@varname(x)` matches all sub-indices; conditioning on `@varname(x[1])` only matches that index. Use `subsumes` when checking if a VarName is "covered by" another.
-
-### Threading
-
-See [threading docs](https://turinglang.org/DynamicPPL.jl/stable/accs/threadsafe/). Key edge case: `promote_for_threadsafe_eval(acc, T)` must be implemented if your accumulator stores typed containers that need to hold AD tracer types (e.g. ForwardDiff `Dual`s). The default is a no-op, which is wrong for accumulators with concrete float fields.
-
-## Contributing
-
-  - Non-breaking changes target `main`; breaking changes target the `breaking` branch.
-  - CI runs tests on Ubuntu/Windows/macOS, Julia stable/min/1.11, with 1 and 2 threads.
-  - Julia ≥ 1.10.8 required (see `[compat]` in `Project.toml`).
+@AGENTS.md
+@JULIA.md
diff --git a/JULIA.md b/JULIA.md
new file mode 100644
index 000000000..4067ee32c
--- /dev/null
+++ b/JULIA.md
@@ -0,0 +1,92 @@
+# JULIA.md
+
+Shared day-to-day Julia practices. DynamicPPL-specific review notes live in `AGENTS.md`; newcomer context lives in `docs/src/onboarding.md`.
+
+## Engineering
+
+  - Write generic numeric code unless the math or an external API forces a concrete type. Avoid `Float64`/`Int`/`Real`/`Array`/`Vector`/`Matrix` constraints that aren't load-bearing.
+  - Preserve caller types with `zero(x)`, `one(x)`, `oftype`, `promote`, `promote_type` — especially for `Float32`, `BigFloat`, AD numbers, units, and GPU scalars.
+  - Struct fields should be concrete via type parameters, not `field::Number` or `field::AbstractVector`.
+  - Julia doesn't specialize on `Type`, `Function`, or `Vararg` arguments. Use `f(x, ::Type{T}) where {T}` when the type itself must specialize.
+  - Check inference (`@inferred`, `@code_warntype`) when touching generated code, custom containers, accumulators, transforms, or log-density paths.
+  - Benchmark generated functions, macro output, and hot-path refactors before assuming a simpler form is equivalent.
+  - Prefer dispatch and small protocol functions over large conditional blocks.
+  - Avoid broad Base overloads — they create method ambiguities and accidental API.
+  - Backend-specific behaviour goes in package extensions or narrow integration layers.
+  - Provide accessors for values downstream packages need — direct field access from another package becomes accidental API.
+  - Prefer `Base.maybeview` over eager slicing when allocation matters but tuple/scalar indexing must still work.
+  - Allocate output containers from observed values rather than predicting element types up front.
+  - Doctests must be deterministic — use `StableRNGs` when examples print random values.
+
+```julia
+# Avoid: too concrete, inference-hostile.
+f(x::Float64) = x / 2
+buf = zeros(Float64, length(xs))
+struct Model
+    scale::Number
+end
+
+# Prefer: generic args, input-derived allocation, concrete fields.
+f(x) = x / 2
+buf = similar(xs, promote_type(eltype(xs), Float64), length(xs))
+struct Model{T}
+    scale::T
+end
+```
+
+## Idioms
+
+  - `!!` semantics (BangBang.jl): methods ending in `!!` may mutate or return a replacement. Always reassign: `x = f!!(x, ...)`.
+  - Returns from `!!` methods may alias internal state — copy before holding long-term or reusing across calls.
+  - `copy(x)` must not share mutable internal state with `x` unless intentional and documented.
+  - Don't index thread-owned storage by `Threads.threadid()` — task scheduling makes IDs unstable. Pass per-task buffers explicitly or use a thread-safe collection.
+
+## Public APIs
+
+Signatures:
+
+  - Dimension arguments use `dims=` (tuple-valued where natural).
+  - Data first; callable first when `do`-block syntax should work (`map(f, xs)`-style).
+  - Pair mutating and non-mutating versions when both make sense (`sort!`/`sort`).
+  - Configuration is keywords, not positional `Bool`/small `Int`/`Symbol` flags.
+  - Reductions take `init=`; sorting takes `lt=`/`by=`/`rev=`.
+  - Allocate output via `similar(x, ...)` or a destination buffer; don't hardcode `Vector{Float64}`.
+  - Wrappers forward `kwargs...`.
+  - Match argument order, keyword names, and return shape across related functions.
+
+Types:
+
+  - Provide protocol functions (accessors, traits) so downstream packages can extend without reaching for internals.
+  - Type parameters serve dispatch, storage, or invariants — not decoration.
+  - Define `hash` whenever you define `==`, consistent with `isequal`.
+  - Extend an existing Base method rather than introducing a parallel name (`Base.length`, not `mylength`).
+  - Pick one failure mode (throw, `nothing`, sentinel) and document it.
+
+Public constructors, keyword arguments, exported names, aliases, abstract supertypes, and traits are long-term commitments. A public concrete type commits to both `Foo(a, b)` and `Foo(; a, b)`. Mark internal names that downstream code already depends on as `public` rather than leaving them accidental.
+
+## Probability
+
+When writing distribution-aware code (accumulators, transforms, log-density paths):
+
+  - Separate sample type, mathematical support, and reference measure. Floating-point samples can still have atoms; `pdf` may be a density, a mass, or mixed for censored/truncated cases.
+  - Check domain boundaries and invalid parameters explicitly.
+  - Thread an explicit RNG; never reach for the global RNG implicitly.
+  - Consider parameter gradients, not just gradients with respect to observations.
+
+```julia
+@test isfinite(logpdf(d, x))
+@test logcdf(d, x) <= 0
+@test isapprox(f(Float64(x)), Float64(f(big(x))); rtol=1e-12)
+@test rand(StableRNG(1), d) == rand(StableRNG(1), d)
+```
+
+## Testing Generic Code
+
+Exercise type variety when the contract is "works for any number type":
+
+```julia
+@test f(Float32(1)) isa Float32
+@test f(big"1.0") isa BigFloat
+@test ForwardDiff.derivative(f, 1.0) isa Real
+@test f(SVector(1.0, 2.0)) isa SVector
+```
diff --git a/docs/make.jl b/docs/make.jl
index 71fbfb2c0..83380bddc 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -38,6 +38,7 @@ makedocs(;
     ],
     pages=[
         "Home" => "index.md",
+        "Contributor onboarding" => "onboarding.md",
         "Conditioning and fixing" => "conditionfix.md",
         "VarNamedTuple" => [
             "vnt/motivation.md",
diff --git a/docs/src/onboarding.md b/docs/src/onboarding.md
new file mode 100644
index 000000000..01f90f530
--- /dev/null
+++ b/docs/src/onboarding.md
@@ -0,0 +1,220 @@
+# Contributor onboarding
+
+This page summarizes recurring lessons from DynamicPPL and AbstractPPL history
+for contributors who are new to Julia, Turing.jl, or DynamicPPL internals.
+It is a starting point, not a checklist. For day-to-day Julia style, see `JULIA.md`; for coding-agent instructions, see `AGENTS.md`.
+
+The source pass covered GitHub history available on 2026-05-06. For
+DynamicPPL, that included 422 issues, 957 pull requests, 6,958 issue/PR
+comments, 3,726 PR reviews, and 5,176 inline review comments. For AbstractPPL,
+that included 46 issues, 101 pull requests, 654 issue/PR comments, 332 PR
+reviews, and 441 inline review comments. Linked issues and PRs are
+representative starting points, not current API documentation.
+
+## What DynamicPPL Does
+
+DynamicPPL is the modelling and evaluation layer under Turing.jl. It provides
+`@model`, tilde (`~`) statement handling, conditioning, fixing, parameter
+transforms, accumulators, and log-density interfaces for samplers and automatic
+differentiation. It uses AbstractPPL for shared interfaces such as `VarName`,
+contexts, and evaluator protocols.
+
+A useful mental model:
+
+  1. `@model` lowers user code into a model function.
+  2. Each ordinary `~` statement becomes an assume or observe statement.
+  3. Contexts and initialisation strategies decide where values come from.
+  4. Accumulators decide which quantities are collected.
+  5. `LogDensityFunction` maps named model parameters to flat vectors.
+
+Start with these docs:
+
+  - [Model evaluation](evaluation.md)
+  - [Tilde-statements](tilde.md)
+  - [Initialisation strategies](init.md)
+  - [Transform strategies](transforms.md)
+  - [Accumulators](accs/overview.md)
+  - [VarNamedTuple](vnt/motivation.md)
+  - [LogDensityFunction](ldf/overview.md)
+
+## Core Lessons
+
+### Prefer explicit evaluation state
+
+For new evaluation code, prefer explicit initialisation strategies and
+accumulators over adding more responsibilities to `VarInfo`. `VarInfo` remains
+important, but fast paths should carry only the state they need.
+
+A common migration shape is:
+
+```julia
+evaluate!!(model, varinfo)
+```
+
+to:
+
+```julia
+init!!(
+    model,
+    OnlyAccsVarInfo(accumulators...),
+    InitFromParams(varinfo.values),
+    varinfo.transform_strategy,
+)
+```
+
+The exact strategy and accumulator set depend on the caller.
+
+### Use names and shapes carefully
+
+Use `@varname(x)` and `@varname(x[1])`; avoid manual construction of indexed
+`VarName`s. Use subsumption for containment checks: `@varname(x)` can cover
+`@varname(x[1])`, but they are not equal.
+
+`VarName` display, sorting, prefixing, unprefixing, and serialization are
+downstream-facing interface behaviour. Test nested fields, indices, ranges,
+`Colon`, and non-standard indices when changing them. Avoid broad `Base`
+overloads such as generic `get(obj, vn)` unless the method is clearly owned.
+
+`VarNamedTuple` is the preferred internal container for named parameter values
+where supported. Convert user-facing `NamedTuple` or `Dict{VarName}` inputs at
+API boundaries. Preserve templates, shapes, and index structure so values can
+round-trip between named form and flat vectors. Avoid large mostly-empty shadow
+arrays and keep eltypes concrete in hot paths.
+
+### Keep `!!` return values
+
+DynamicPPL uses BangBang-style `!!` functions. They may mutate in place or
+return a replacement object. Always use the returned value.
+
+```julia
+vi = accumulate_assume!!(vi, value, tval, logjac, vn, dist, template)
+```
+
+If your function calls a `!!` function, it usually needs to return the updated
+state as well.
+
+### Treat `@model` as Julia code
+
+`@model` lowering must preserve ordinary Julia behaviour as well as PPL
+semantics. For compiler changes, test positional and keyword arguments,
+defaults, splatting, closures, interpolation, return values, no-observation
+models, and data- or parameter-dependent control flow.
+
+Macro hygiene matters. User variables, generated temporaries, and globals
+should not capture each other accidentally. Returned quantities are
+user-visible and are distinct from accumulated random variables.
+
+DynamicPPL tracks variables through tilde statements. A left-hand-side value can
+be treated as a model variable even when it was derived earlier in the model.
+
+```julia
+@model function f()
+    x ~ Normal()
+    y = x + 1
+    return y ~ Normal()
+end
+```
+
+If the intent is to add a likelihood term for a derived value, prefer
+`@addlogprob!` or a clearer model structure. Do not copy old `.~` examples; the
+dot-tilde pipeline was removed.
+
+Passing `missing` can affect whether a value is observed or latent. Add tests
+for the exact data shape you support, especially arrays with missing values,
+arrays of arrays, and mutable structs.
+
+### Test contexts with nested models
+
+Contexts change model evaluation without rewriting the model body. `condition`,
+`fix`, `decondition`, `unfix`, `to_submodel`, and prefixes all interact.
+
+Prefer `condition`, `fix`, and `to_submodel` over hardcoded special cases. Use
+the same `VarName` semantics as the tilde pipeline. Add nested-submodel tests
+when changing contexts, prefixes, conditioning, or fixing.
+
+### Know which space values live in
+
+DynamicPPL moves between constrained model space and unconstrained sampler
+space. Be explicit about which space each value lives in.
+
+  - `val`: constrained model-space value used for distribution densities.
+  - `tval`: `TransformedValue`, which may contain a linked value.
+  - `logjac`: log absolute Jacobian contribution from the link transform.
+  - `getlogjoint`: constrained-space log joint.
+  - `getlogjoint_internal`: internal log density for sampler-facing paths.
+  - `vi[:]`: internal stored vector; do not assume it is in distribution support.
+
+`LogDensityFunction` is the usual boundary for HMC/NUTS, optimisers, and AD.
+When changing log-density or transform code, test the relevant AD backends.
+Avoid compiled ReverseDiff tapes for models whose control flow depends on
+parameter values.
+
+Evaluator APIs should separate structural preparation from AD-specific
+preparation. `!!` evaluator and gradient APIs may reuse internal buffers, so
+copy results before storing them long term.
+
+## Working in Julia
+
+DynamicPPL code often sits on hot paths for inference and AD. Small edits can change inference, allocations, invalidation, or downstream package behaviour, so performance-sensitive changes need measurement rather than intuition.
+
+The general rules live in `JULIA.md`. The ones most likely to matter here are generic numeric code, concrete storage types, deterministic doctests, extension-based backend integrations, and type-stability checks for compiler output, `VarNamedTuple`s, accumulators, transforms, and log-density paths.
+
+## Copying, Accumulators, and Threading
+
+Be explicit about aliasing. Copy stored values when later mutation by model code
+would otherwise change accumulated results. Use the cheapest correct copy:
+`copy` or `collect` is often enough, while `deepcopy` can be much slower.
+
+Accumulators collect outputs from model execution, such as log probabilities,
+raw values, vector values, pointwise log densities, and returned values. Add
+only the accumulators you need. `copy(acc)` must not accidentally share mutable
+internal state.
+
+Avoid designs that depend on `Threads.threadid()` indexing. Promote accumulator
+storage when thread-safe evaluation must hold AD tracer types. Treat threaded
+assume support as subtle unless current docs and tests cover the exact case.
+
+## Getting a PR Ready
+
+For a first contribution, scope the change by deciding whether it is user-facing, internal, or downstream-facing through Turing.jl. Add the smallest tests that exercise the behaviour, then widen coverage only where the change touches shared machinery: nested submodels for contexts and prefixes, AD backends for log-density or transform paths, round trips for flattening and unflattening, and type-stability or allocation checks for hot paths.
+
+Run JuliaFormatter before submitting and treat docs, Aqua, JET, formatting, and extension-loading failures as part of the change. Put dependencies in the narrowest environment that owns them: runtime, extension, test, or docs.
+
+## Further Reading
+
+  - Evaluation state and `VarInfo`: [#1132](https://github.com/TuringLang/DynamicPPL.jl/pull/1132),
+    [#1252](https://github.com/TuringLang/DynamicPPL.jl/issues/1252),
+    [#1311](https://github.com/TuringLang/DynamicPPL.jl/pull/1311),
+    [#1376](https://github.com/TuringLang/DynamicPPL.jl/issues/1376).
+  - Named parameter storage: [#1150](https://github.com/TuringLang/DynamicPPL.jl/pull/1150),
+    [#1183](https://github.com/TuringLang/DynamicPPL.jl/pull/1183),
+    [#1204](https://github.com/TuringLang/DynamicPPL.jl/pull/1204),
+    [#1238](https://github.com/TuringLang/DynamicPPL.jl/pull/1238),
+    AbstractPPL [#117](https://github.com/TuringLang/AbstractPPL.jl/issues/117),
+    [#122](https://github.com/TuringLang/AbstractPPL.jl/issues/122),
+    [#136](https://github.com/TuringLang/AbstractPPL.jl/issues/136),
+    [#150](https://github.com/TuringLang/AbstractPPL.jl/pull/150).
+  - Tilde syntax and contexts: [#519](https://github.com/TuringLang/DynamicPPL.jl/issues/519),
+    [#804](https://github.com/TuringLang/DynamicPPL.jl/pull/804),
+    [#892](https://github.com/TuringLang/DynamicPPL.jl/pull/892),
+    [#1221](https://github.com/TuringLang/DynamicPPL.jl/issues/1221).
+  - Transforms, log densities, and AD:
+    [#575](https://github.com/TuringLang/DynamicPPL.jl/pull/575),
+    [#1303](https://github.com/TuringLang/DynamicPPL.jl/pull/1303),
+    [#1348](https://github.com/TuringLang/DynamicPPL.jl/pull/1348),
+    [#1354](https://github.com/TuringLang/DynamicPPL.jl/pull/1354),
+    AbstractPPL [#155](https://github.com/TuringLang/AbstractPPL.jl/pull/155),
+    [#157](https://github.com/TuringLang/AbstractPPL.jl/pull/157).
+  - Julia engineering and CI: [#50](https://github.com/TuringLang/DynamicPPL.jl/pull/50),
+    [#147](https://github.com/TuringLang/DynamicPPL.jl/pull/147),
+    [#242](https://github.com/TuringLang/DynamicPPL.jl/pull/242),
+    [#733](https://github.com/TuringLang/DynamicPPL.jl/pull/733),
+    [#777](https://github.com/TuringLang/DynamicPPL.jl/issues/777),
+    AbstractPPL [#25](https://github.com/TuringLang/AbstractPPL.jl/pull/25),
+    [#44](https://github.com/TuringLang/AbstractPPL.jl/pull/44),
+    [#120](https://github.com/TuringLang/AbstractPPL.jl/issues/120).
+  - Accumulators and threading: [#429](https://github.com/TuringLang/DynamicPPL.jl/issues/429),
+    [#885](https://github.com/TuringLang/DynamicPPL.jl/pull/885),
+    [#925](https://github.com/TuringLang/DynamicPPL.jl/pull/925),
+    [#1137](https://github.com/TuringLang/DynamicPPL.jl/pull/1137),
+    [#1340](https://github.com/TuringLang/DynamicPPL.jl/pull/1340).