SDM Refactor #3114

Merged
asalmgren merged 48 commits into erf-model:development from llnl:dg/sdm_refactor
Apr 11, 2026

@debog debog commented Apr 9, 2026

Refactors the warm SDM implementation to reduce per-tile boilerplate, consolidate GPU memory management, improve numerical precision in templated code, and add GPU-direct distribution sampling. No physics changes; all 8 SDM ctests pass with identical results.

New files

  • Source/Particles/ERF_SuperDropletPCProcess.H: Defines SDProcess::ProcessContext and SDProcess::ParticlePointers structs that bundle geometry/species metadata and SoA attribute pointers, respectively. These replace the repeated manual extraction of plo, dxi, num_species, num_aerosols, species/aerosol mass pointer arrays, etc. at the top of every particle kernel.
  • Source/Utils/ERF_InterpolationUtils.H: Generic multi-field cloud-in-cell (CIC) interpolation helper ERF::Interpolation::interpolateFields() templated on an enum of field indices. Replaces duplicated per-field interpolation code in advection, mass change, and boundary routines.
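The idea behind a multi-field CIC helper templated on a field-index enum can be sketched in plain C++. This is a minimal illustration under stated assumptions, not the code in ERF_InterpolationUtils.H: `DemoFields`, `Field3D`, and `interpolateFieldsSketch` are hypothetical names, the grid is uniform and cell-centered, and indices are clamped at the domain edge where a real implementation would presumably rely on ghost cells. The point is that the eight CIC weights are computed once per particle and reused for every field.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical field-index enum (similar in spirit to InterpFieldsAdv).
enum class DemoFields : int { Density = 0, Pressure, Temperature, NumFields };

// Hypothetical cell-centered field on a uniform grid; not an AMReX type.
struct Field3D {
    int nx, ny, nz;
    std::vector<double> data;
    Field3D(int a_nx, int a_ny, int a_nz, double v = 0.0)
        : nx(a_nx), ny(a_ny), nz(a_nz),
          data(static_cast<std::size_t>(a_nx) * a_ny * a_nz, v) {}
    std::size_t idx(int i, int j, int k) const {
        return (static_cast<std::size_t>(k) * ny + j) * nx + i;
    }
    double& operator()(int i, int j, int k) { return data[idx(i, j, k)]; }
    double operator()(int i, int j, int k) const { return data[idx(i, j, k)]; }
};

template <typename Enum>
constexpr std::size_t numFields() { return static_cast<std::size_t>(Enum::NumFields); }

// Interpolate all fields to one particle position in a single pass.
template <typename Enum>
std::array<double, numFields<Enum>()> interpolateFieldsSketch(
    const std::array<const Field3D*, numFields<Enum>()>& fields,
    const std::array<double, 3>& pos,   // particle position
    const std::array<double, 3>& plo,   // domain lower corner
    const std::array<double, 3>& dxi)   // inverse cell size
{
    int i0[3];
    double w[3][2];
    for (int d = 0; d < 3; ++d) {
        // Cell centers sit at (i + 0.5) * dx, hence the half-cell shift.
        const double l = (pos[d] - plo[d]) * dxi[d] - 0.5;
        i0[d] = static_cast<int>(std::floor(l));
        w[d][1] = l - i0[d];
        w[d][0] = 1.0 - w[d][1];
    }
    std::array<double, numFields<Enum>()> out{};
    const Field3D& ref = *fields[0];  // all fields are assumed to share one shape
    for (int kk = 0; kk < 2; ++kk)
    for (int jj = 0; jj < 2; ++jj)
    for (int ii = 0; ii < 2; ++ii) {
        // Clamp to the grid; this is a simplification of the sketch.
        const int i = std::min(std::max(i0[0] + ii, 0), ref.nx - 1);
        const int j = std::min(std::max(i0[1] + jj, 0), ref.ny - 1);
        const int k = std::min(std::max(i0[2] + kk, 0), ref.nz - 1);
        const double wt = w[0][ii] * w[1][jj] * w[2][kk];
        for (std::size_t f = 0; f < out.size(); ++f) {
            out[f] += wt * (*fields[f])(i, j, k);
        }
    }
    return out;
}
```

A caller would index the result with the enum, for example `out[static_cast<std::size_t>(DemoFields::Pressure)]`, so adding a field means adding one enumerator rather than another copy of the interpolation loop.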

Core infrastructure (ERF_SuperDropletPC.H, ERF_SuperDropletPCDefinitions.H)

  • forEachParticleTile(): Three overloads (with ProcessContext, lightweight, serial) that replace manual #pragma omp parallel + ParIter loops throughout the codebase. The serial variant (forEachParticleTileSerial) is used where the loop body is not thread-safe (e.g., DenseBins in coalescence).
  • buildProcessContext(lev): Consolidates extraction of geometry arrays, species/aerosol counts and indices, water density, and device property pointers into a single ProcessContext struct.
  • setupParticlePointers(): Consolidates extraction of SoA attribute pointers (velocity, mass, radius, multiplicity, terminal velocity, species masses, aerosol masses) into a ParticlePointers struct.
  • updateParticleAttributes(): Static AMREX_GPU_DEVICE method that recomputes effective radius and total mass from species/aerosol mass arrays. Replaces inline SD_effective_radius + SD_total_mass calls scattered across boundary, recycle, and add-particle routines.
  • Persistent device vectors for material properties (m_sp_density, m_sp_solubility, m_sp_ionization, m_sp_mol_weight, and corresponding m_ae_* arrays): allocated once via initializeDeviceProperties() and invalidated on setSpeciesMaterial/setAerosolMaterial. Replaces per-tile Gpu::DeviceVector allocations + host-to-device copies that previously occurred inside every particle loop.
  • particleToMeshHelper(): Template for particle-to-mesh deposition with support for non-uniform vertical grids (m_zlevels_d). Refactors SDNumberDensity, numberDensity, massDensity, massFlux, speciesMassDensity, and the new cloudRainDensity to delegate to this helper.
  • ridx_a/ridx_s refactoring: These runtime-offset index functions are now the base implementations (without SuperDropletsRealIdxSoA::ncomps offset); idx_a/idx_s are redefined to call them with the compile-time offset added.
  • SDTerminalVelocityType converted from enum struct to AMREX_ENUM, enabling direct ParmParse parsing and getEnumNameString() for terminal velocity model selection.
  • SuperDropletsIntIdxSoA typedef added for integer SoA component access.
  • m_term_vel_type renamed to m_term_vel_type_w; m_idx_w initialized to -1.
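The pattern of a tile-loop helper that receives a prebuilt context, combined with a single attribute-update routine, might look roughly like the following. This is a sketch under assumptions, not the PR's code: `DemoContext`, `DemoTile`, `forEachTileSketch`, and `updateAttributesSketch` are hypothetical stand-ins, the loop is serial where the real helper wraps OpenMP and ParIter, and the radius formula (volume-equivalent radius from the sum of species mass over material density) is an assumption about what "effective radius" means here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for ProcessContext: data gathered once per level.
struct DemoContext {
    std::vector<double> sp_density;  // material density per species [kg/m^3]
};

// Hypothetical particle tile in structure-of-arrays layout.
struct DemoTile {
    std::vector<std::vector<double>> sp_mass;  // indexed [species][particle]
    std::vector<double> radius;
    std::vector<double> mass;
};

// Recompute total mass and radius of particle ip from its species masses.
// ASSUMPTION: radius is that of a sphere whose volume is the sum of
// species volumes; the PR does not spell out the formula.
inline void updateAttributesSketch(const DemoContext& ctx, DemoTile& tile,
                                   std::size_t ip)
{
    constexpr double four_thirds_pi = 4.0 / 3.0 * 3.14159265358979323846;
    double m = 0.0;
    double vol = 0.0;
    for (std::size_t s = 0; s < ctx.sp_density.size(); ++s) {
        m += tile.sp_mass[s][ip];
        vol += tile.sp_mass[s][ip] / ctx.sp_density[s];
    }
    tile.mass[ip] = m;
    tile.radius[ip] = std::cbrt(vol / four_thirds_pi);
}

// Tile-loop helper: the caller supplies only the per-tile body.
// The real helper would also own the threading and iterator setup.
template <typename F>
void forEachTileSketch(std::vector<DemoTile>& tiles, const DemoContext& ctx,
                       F&& body)
{
    for (auto& tile : tiles) {
        body(ctx, tile);
    }
}
```

With this shape, each call site reduces to a lambda such as `[](const DemoContext& c, DemoTile& t) { for (std::size_t ip = 0; ip < t.mass.size(); ++ip) updateAttributesSketch(c, t, ip); }`, and the context is built once instead of at the top of every kernel.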

Distribution sampling (ERF_SDInitialization.H/.cpp, ERF_SuperDropletPCAddParticles.cpp)

  • SDDistributionType AMREX_ENUM replaces the SupDropInit::attrib_init_* string constants. All distribution type comparisons, ParmParse queries, and printing now use the enum directly instead of string matching.
  • SDDistributionParams struct: GPU-compatible POD holding distribution type, mass/radius bounds, pre-computed CDF bounds for truncated log-normal, and multiplicity parameters.
  • SD_erfinv_gpu(): AMREX_GPU_HOST_DEVICE inverse error function approximation for GPU-side log-normal inverse CDF sampling.
  • SD_sample_mass_gpu(): AMREX_GPU_HOST_DEVICE function that generates a mass sample from SDDistributionParams + a RandomEngine, supporting constant, exponential, and log-normal (including autorange) distributions with both sampled and constant multiplicity modes.
  • getSpeciesDistParams(), getAerosolDistParams(), makeDistributionParams(): Build SDDistributionParams from the existing member vectors, pre-computing log-space ranges and CDF bounds on the host.
  • addParticles(): Replaced host-side getSpeciesDistribution/getAerosolDistribution calls (which allocated O(num_species * np) host vectors, sampled on CPU, then copied to device) with a single ParallelForRNG kernel that calls SD_sample_mass_gpu directly into the particle SoA. Per-tile device vectors for material properties replaced with persistent device arrays.
  • setDefaults() now checks m_is_water to set species-specific defaults (water gets tiny initial mass; non-water species get zero).
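Inverse-CDF sampling of a truncated log-normal with CDF bounds precomputed on the host can be illustrated as follows. This is a sketch, not the PR's `SD_erfinv_gpu` or `SD_sample_mass_gpu`: the names `erfinvSketch`, `LogNormalSketch`, `makeLogNormalSketch`, and `sampleLogNormalSketch` are hypothetical, and the inverse error function here uses Winitzki's closed-form approximation refined by Newton steps, which may differ from the approximation the PR actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Inverse error function for |y| < 1: Winitzki's approximation as the
// initial guess, then three Newton steps on erf(x) - y = 0.
inline double erfinvSketch(double y)
{
    const double pi = 3.14159265358979323846;
    const double a = 0.147;
    const double ln1my2 = std::log(1.0 - y * y);
    const double t = 2.0 / (pi * a) + 0.5 * ln1my2;
    const double inner = std::sqrt(t * t - ln1my2 / a) - t;
    double x = std::copysign(std::sqrt(std::max(0.0, inner)), y);
    for (int it = 0; it < 3; ++it) {
        // d/dx erf(x) = 2/sqrt(pi) * exp(-x^2)
        x -= (std::erf(x) - y) / (2.0 / std::sqrt(pi) * std::exp(-x * x));
    }
    return x;
}

// Plain-data parameter block, small enough to pass to a kernel by value.
struct LogNormalSketch {
    double mu;      // mean of ln(x)
    double sigma;   // standard deviation of ln(x)
    double cdf_lo;  // CDF at the lower truncation bound
    double cdf_hi;  // CDF at the upper truncation bound
};

// Host-side setup: evaluate the CDF bounds once.
inline LogNormalSketch makeLogNormalSketch(double mu, double sigma,
                                           double x_lo, double x_hi)
{
    const double s2 = std::sqrt(2.0);
    LogNormalSketch p;
    p.mu = mu;
    p.sigma = sigma;
    p.cdf_lo = 0.5 * (1.0 + std::erf((std::log(x_lo) - mu) / (sigma * s2)));
    p.cdf_hi = 0.5 * (1.0 + std::erf((std::log(x_hi) - mu) / (sigma * s2)));
    return p;
}

// Per-particle sampling: map a uniform u in [0, 1] into the truncated
// CDF range, then invert the log-normal CDF.
inline double sampleLogNormalSketch(const LogNormalSketch& p, double u)
{
    const double prob = p.cdf_lo + u * (p.cdf_hi - p.cdf_lo);
    const double z = std::sqrt(2.0) * erfinvSketch(2.0 * prob - 1.0);
    return std::exp(p.mu + p.sigma * z);
}
```

Because the truncation is folded into the CDF range, every uniform draw yields an in-range sample and no rejection loop is needed, which suits a one-thread-per-particle kernel. The sketch assumes both truncation bounds are finite and positive, so the argument passed to `erfinvSketch` stays strictly inside (-1, 1).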

Per-file refactoring

  • ERF_SuperDropletPCAdvection.cpp: Rewritten with forEachParticleTile + buildProcessContext + interpolateFields. InterpFieldsAdv enum for density/pressure/temperature interpolation. Post-Redistribute k-index update loop added.
  • ERF_SuperDropletPCBoundaries.cpp: Rewritten with forEachParticleTile + buildProcessContext. Uses updateParticleAttributes instead of inline radius/mass recalculation.
  • ERF_SuperDropletPCMassChange.cpp: Rewritten with forEachParticleTile. InterpFieldsLV enum for saturation/pressure/temperature interpolation. Solver objects (drsqdt, newton_solver) moved outside the tile loop.
  • ERF_SuperDropletPCRecycle.cpp: Rewritten with forEachParticleTile for all three iteration blocks (recycle, post-redistribute location update, remove inactive). Division-by-zero guard added for deac_frac computation.
  • ERF_SuperDropletPCDiagnostics.cpp: Individual ReduceMin/ReduceMax/ReduceSum calls replaced with a single fused ReduceData<>/ReduceOps pass. Individual MPI reductions replaced with batched array ReduceRealMin/ReduceRealMax/ReduceRealSum calls.
  • ERF_SuperDropletPCInitializations.cpp: Terminal velocity parsing uses AMREX_ENUM directly instead of string comparison. AMREX_ASSERT changed to AMREX_ALWAYS_ASSERT for species/aerosol count checks. initializeDeviceProperties() called in define(). Z-levels reading added for non-uniform grids. SetAttributes and DensityScaling rewritten with forEachParticleTile.
  • ERF_SuperDropletPCUtils.cpp: computeMeshVar refactored with early returns and cleaner dispatch. SDNumberDensity, numberDensity, massDensity, massFlux delegate to particleToMeshHelper. cloudRainDensity added (previously a 4-argument overload of speciesMassDensity).
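The diagnostics change, one fused pass instead of separate min/max/sum reductions followed by batched cross-rank reductions, can be sketched without AMReX or MPI. The names `MinMaxSum`, `fusedReduceSketch`, and `packMinsSketch` are hypothetical, and the two attributes (radius and mass) are chosen only for illustration; the actual diagnostics may reduce a different set of quantities.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Accumulator holding all three reductions for one attribute.
struct MinMaxSum {
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();
    double sum = 0.0;
};

// One pass over the particles updates every reduction for both attributes,
// instead of six separate sweeps over the same arrays.
inline std::array<MinMaxSum, 2> fusedReduceSketch(
    const std::vector<double>& radius, const std::vector<double>& mass)
{
    std::array<MinMaxSum, 2> r;
    const std::size_t np = std::min(radius.size(), mass.size());
    for (std::size_t ip = 0; ip < np; ++ip) {
        const double vals[2] = {radius[ip], mass[ip]};
        for (std::size_t a = 0; a < 2; ++a) {
            r[a].min = std::min(r[a].min, vals[a]);
            r[a].max = std::max(r[a].max, vals[a]);
            r[a].sum += vals[a];
        }
    }
    return r;
}

// Pack the minima contiguously so that one array-valued min-reduction
// across ranks could replace one reduction call per quantity.
inline std::array<double, 2> packMinsSketch(const std::array<MinMaxSum, 2>& r)
{
    std::array<double, 2> out = {{r[0].min, r[1].min}};
    return out;
}
```

The same packing would apply to the maxima and sums, giving three cross-rank reductions in total regardless of how many attributes are tracked.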

Precision and constant fixes

  • ERF_SuperDropletPCMassChange.H: Include guard fixed (COALESCENCE -> MASSCHANGE). Namespace renamed SDMassChangeUtils -> SDMassChangeUtils_LV. Named constants (one, zero, myhalf, two, three) replaced with RT() casts throughout ODE functions. std::exp(-myhalf*std::log(R_sq)) replaced with RT(1.0)/std::sqrt(R_sq). Helper methods added to TimeIntegrator: computeTimestep, computeTau, limitTimestep, isTimestepTooSmall, evalRHS, printStepInfo, printStepInfoNewton.
  • ERF_SuperDropletPCCoalescence.H: amrex::Real(...) replaced with RT(...) throughout all data tables and kernel functions. Named constants replaced with RT() casts.
  • ERF_TerminalVelocity.H: m_rho renamed to m_rho_w. Viscosity calculation consolidated into viscCoeff(a_T) method (was inline in CloudRainShima). All amrex::Real(...) replaced with RT(...).
  • ERF_Constants.H: four_thirds_pi constant added. Used throughout in place of Real(4.0)/three*PI or (amrex::Real(4.0)/three)*PI.
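The precision changes can be illustrated with a small templated example. One plausible motivation (an assumption, not stated in the PR) is that named constants typed as amrex::Real silently promote or demote arithmetic when a function is instantiated with a different RT, whereas RT() casts keep every operand in the template's own type. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Old form described in the PR: R_sq^(-1/2) via exp and log.
template <typename RT>
RT invSqrtOldSketch(RT R_sq)
{
    const RT myhalf = RT(0.5);
    return std::exp(-myhalf * std::log(R_sq));
}

// New form: one square root and one division, all in type RT.
template <typename RT>
RT invSqrtNewSketch(RT R_sq)
{
    return RT(1.0) / std::sqrt(R_sq);
}

// Sphere volume with a single precomputed constant, in the spirit of
// the four_thirds_pi addition.
template <typename RT>
RT sphereVolumeSketch(RT r)
{
    const RT four_thirds_pi = RT(4.0 * 3.14159265358979323846 / 3.0);
    return four_thirds_pi * r * r * r;
}
```

The two inverse-square-root forms agree mathematically; the direct form avoids two transcendental calls and their accumulated rounding.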

Misc

  • ERF_SuperDropletsMoist.H: GetPlotVar refactored with try_copy helper lambda.
  • ERF_SuperDropletsMoistUtils.cpp: speciesMassDensity 4-argument calls updated to cloudRainDensity.
  • Backward-compatibility block removed from SDInitProperties::readInputs (the per-species keyed queries supersede the old condensate_* keys).

debog and others added 30 commits April 2, 2026 16:32
Instead of aborting when user-specified refinement box indices are not
divisible by ref_ratio, automatically snap lo indices down and hi
indices up to the nearest aligned value, with a diagnostic print.

ERF_InitCustomPertVels_ParticleTests.H accessed z_nd at the full
domain khi+1, which is out of bounds for partial-z L1 boxes. Use
geomdata.ProbHi()[2] for the domain top height instead.

ERF_InitCustomPert_ParticleTests.H had an assertion requiring the
box to span the full z-domain, which fails with partial-z AMR.
Removed the assertion and unused khi variable.

Add terrain-aware k-index fixing (FixKIndexAMR) and per-level
Redistribute for particles on refined levels. Add
ExtractAndRouteOORParticles to handle particles escaping the fine
level z-extent in partial-z refinement by recomputing k-indices for
the target level. Add compute_k_from_z for uniform and stretched
vertical grids. Add k-clamping in ERFParticlesAssignor and bounds
checks in AdvectWithFlow and ComputeTemperature. Use terrain-aware
Redistribute(z_phys_nd) in post_timestep and after regrid. Remove
premature Redistribute calls from MakeNewLevel functions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…l-by-level

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
debog and others added 18 commits April 8, 2026 17:40
…recision fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e m_zlevels_d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@debog debog marked this pull request as ready for review April 11, 2026 00:44
@debog debog changed the title [Draft] SDM Refactor SDM Refactor Apr 11, 2026
@asalmgren asalmgren merged commit 27f975d into erf-model:development Apr 11, 2026
26 checks passed