Skip to content

Slab force update#215

Open
The9Cat wants to merge 54 commits intodevelfrom
slabUpdate
Open

Slab force update#215
The9Cat wants to merge 54 commits intodevelfrom
slabUpdate

Conversation

@The9Cat
Copy link
Copy Markdown
Member

@The9Cat The9Cat commented Apr 5, 2026

Updates for consistency with the current conventions

  1. Density for isothermal slab is now proportional to sech(z/2h)^2 rather than sech(z/h)^2
  2. SlablSL now implements the self_consistent flag for enabling a frozen potential
  3. Added a version # to the slab cache along with a user warning about this change

One general development support feature

Added compile -time parsing of the EXP version string into major, minor, patch for internal version control logic. E.g.

#include <iostream>
#include "libvars.H"
using namespace __EXP__;

int main()
{
  std::cout << "Major=" << exp_build.major << '\n';
  std::cout << "Minor=" << exp_build.minor << '\n';
  std::cout << "Patch=" << exp_build.patch << '\n';
  return 0;
}

This will make future version-dependent toggles easier to implement. We have a usage example in both SLGridSlab and in utils/Test.

@The9Cat The9Cat added the enhancement New feature or request label Apr 5, 2026
@The9Cat The9Cat requested a review from Copilot April 5, 2026 19:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the slab force/basis implementation to match current conventions (isothermal slab profile normalization, cache versioning, and a self_consistent “frozen potential” mode) and adds compile-time parsing of the EXP version string for internal version-dependent logic.

Changes:

  • Switch isothermal slab model normalization to use a sech^2(z/2H) form and warn users when running older EXP versions.
  • Add a self_consistent configuration flag to SlabSL to support freezing the potential coefficients.
  • Add compile-time EXP version parsing (EXPversion.H, exp_build) and a small test utility; add slab cache version attribute.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
utils/Test/version_test.cc Adds a small example program that prints VERSION and parsed major/minor/patch.
utils/Test/CMakeLists.txt Builds the new vtest utility.
utils/ICs/bonnerebert.cc Removes the local VERSION macro and the -V version flag handling.
src/SlabSL.H Documents self_consistent and adds storage for frozen coefficients.
src/SlabSL.cc Adds YAML parsing for self_consistent and uses frozen coefficients during force evaluation.
src/cudaSlabSL.cu Uses frozen coefficients when copying coefficients to the GPU if self_consistent=false.
include/SLGridMP2.H Adds cache versioning for slab cache and improves internal grouping/comments.
include/libvars.H Exposes exp_build as a compile-time parsed {major,minor,patch} version triple.
include/EXPversion.H Introduces a constexpr version-string parser.
exputil/SLGridMP2.cc Updates isothermal slab profile functions and adds slab cache version checks/write.
exputil/libvars.cc Ensures config header is included in libvars translation unit.
expui/BiorthBasis.cc Allows self_consistent as a recognized YAML key for the slab basis config.
Comments suppressed due to low confidence (1)

utils/ICs/bonnerebert.cc:404

  • The -V/--version option was removed from getopt_long parsing, but long_options still defines a version entry and usage() still advertises -V. This makes the CLI help incorrect and --version will likely behave unexpectedly (depending on getopt_long behavior). Either restore handling for -V/--version or remove it consistently from long_options and the usage text.
			   "R:"	/* runit   */
			   "N:"	/* number  */
			   "S:"	/* seed    */
			   "h", /* help    */
			   long_options, (int *) 0)) != EOF)
    {
      switch (c)
	{
	case 'o':		/* --output */
	  oname = string(optarg);
	  break;
	case 'd':		/* --xstep */
	  h = atof(optarg);
	  break;
	case 'r':		/* --ratio */
	  Rratio = atof(optarg);
	  break;
	case 'R':		/* --runit */
	  R = atof(optarg);
	  break;
	case 'M':		/* --munit */
	  Munit = atof(optarg);
	  break;
	case 'm':		/* --mass */
	  M = atof(optarg);
	  break;
	case 'T':		/* --temp */
	  T = atof(optarg);
	  break;
	case 'N':		/* --number */
	  N = atoi(optarg);
	  break;
	case 'S':		/* --seed */
	  S = atoi(optarg);
	  break;
	case 'h':
	  usage (0);

	default:
	  usage (-1);
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/SlabSL.cc Outdated
Comment thread src/SlabSL.H Outdated
Comment thread src/SlabSL.H Outdated
Comment thread exputil/SLGridMP2.cc Outdated
Comment thread exputil/SLGridMP2.cc Outdated
Comment thread include/EXPversion.H Outdated
Comment thread utils/Test/version_test.cc
The9Cat and others added 7 commits April 5, 2026 16:33
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@The9Cat
Copy link
Copy Markdown
Member Author

The9Cat commented Apr 5, 2026

@michael-petersen I am probably the only one that uses the slab stuff and these changes implement some of the standard features from the other force classes. So no need for you to spend time on most of this. It's all tested and seems to work.

However, note that this PR adds a compile-time function that parses the CMake-generated VERSION string into a structure of three ints (major, minor, patch) that may be used to gate various features rather than reverting to some sort of on-the-fly parse of the VERSION for each case or using CPP macros (yuck). The structure, named EXPversion, is a global library variable compiled into exputil. E.g. this PR uses the EXPversion instance to alert the user to the scale height convention change for the the isothermal disk. Your comments on this bit will be helpful I think.

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 5, 2026

@michael-petersen I am probably the only one that uses the slab stuff and these changes implement some of the standard features from the other force classes. So no need for you to spend time on most ...

The version_test.cc file now consistently uses "integer triplet" in both the file comment and the runtime output (commit b7ed7eb). The other resolved review comments were already addressed in earlier commits on this branch.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 14 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyEXP/BasisWrappers.cc Outdated
Comment thread src/SlabSL.H
Comment on lines +25 to +55
@details **YAML configuration**

@param nmaxx is the maximum order of the expansion in x (default 6)

@param nmaxy is the maximum order of the expansion in y (default 6)

@param nmaxz is the maximum order of the expansion in z (default 6)

@param nminx is the minimum order of the expansion in x (default 0)

@param nminy is the minimum order of the expansion in y (default 0)

@param hslab is the scale height of the slab (default 0.2)

@param zmax is the maximum z for the slab (default 10.0)

@param ngrid is the number of grid points in z for the
Sturm-Liouville solver (default 1000)

@param cmap is the coordinate mapping for the SL solver (must be one of "tanh", "sech", or "linear", default "linear")

@param type is the type of slab to solve for (default
"isothermal", must be "isothermal", "parabolic", or "constant")

@param self_consistent set to true allows the particles to evolve
under the time-dependent basis expansion. For a basis fixed in time to
the initial time: set to false.

@param cachename is the name of the basis cache file. This is a
required parameter.
*/
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added nint and samplesz to the YAML configuration docstring in commit 7b4f5d7, including the constraint that samplesz must be greater than 0 when nint>0.

Comment thread utils/SL/orthochk.cc Outdated
Comment thread utils/SL/orthochk.cc Outdated
Comment thread utils/SL/orthochk.cc Outdated
Comment on lines +355 to +367
std::vector<Eigen::MatrixXd> orthoMat;
switch (Type) {
case Trig:
std::cout << "No internal orthogonality check for OneDTrig"
<< std::endl;
break;
case SL:
{
int knots;
std::cout << "Number of knots? ";
std::cin >> knots;
auto orthoMat = orthoSL->orthoCheck(knots);
std::cout << "Orthogonality check for SLGridSlab" << std::endl;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the unused outer std::vector<Eigen::MatrixXd> orthoMat; declaration in commit 5dbe30a. The inner auto orthoMat = orthoSL->orthoCheck(knots) in the SL block is the only declaration now.

Comment thread expui/BiorthBasis.H Outdated
Comment thread src/SlabSL.cc Outdated
Comment on lines 95 to 104
@@ -82,6 +103,7 @@ SlabSL::SlabSL(Component* c0, const YAML::Node& conf) : PotAccel(c0, conf)
}
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed if (true) to if (false) in commit 0cc1bc5, matching the deep-debug pattern from Cylinder.cc. The comment now reads "Set to 'true' only for deep debugging; 'false' for production."

Comment thread include/SLGridMP2.H
Comment thread src/cudaSlabSL.cu
Comment on lines +217 to 226
// Coordinate map
std::map<std::string, int> CoordMap =
{
{"tanh" , 0},
{"sech" , 1},
{"linear", 2}
};

int Cmap = CoordMap[cmap];

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced CoordMap[cmap] with CoordMap.find(cmap) in commit 0cc1bc5. An invalid cmap value now throws a std::runtime_error with a message listing the valid options.

Comment thread utils/ICs/bonnerebert.cc
Comment on lines 357 to 368
while ((c = getopt_long (argc, argv,
"o:" /* output */
"d:" /* xstep */
"r:" /* ratio */
"M:" /* munit */
"m:" /* mass */
"T:" /* temp */
"R:" /* runit */
"N:" /* number */
"S:" /* seed */
"h" /* help */
"V", /* version */
"h", /* help */
long_options, (int *) 0)) != EOF)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored -V/--version handling in commit 0cc1bc5: added 'V' to the getopt_long option string and a case 'V': that prints program_name + VERSION (from libvars.H) and exits cleanly.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The9Cat and others added 4 commits May 3, 2026 10:52
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The9Cat and others added 2 commits May 3, 2026 10:54
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@The9Cat
Copy link
Copy Markdown
Member Author

The9Cat commented May 3, 2026

@copilot The slab force is implemented in SlabSL.cc and the Cuda implementation is in cudaSlabSL.cu. There are two parts to this computation: the coefficient accumulation from particles and the force/acceleration computation from the coefficients. Are you able to study these implementations and flag any possible differences that would lead to significantly results?

Copy link
Copy Markdown
Contributor

Copilot AI commented May 3, 2026

@copilot The slab force is implemented in SlabSL.cc and the Cuda implementation is in cudaSlabSL.cu. There are two parts to this computation: the coefficient accumulation from particles and the f...

Here are the three differences I found that could lead to significantly different results:


1. Texture lookup: missing (kx, ky) swap when abs(jj) > abs(ii) — both coefficient and force kernels

The textures are initialized for the upper-triangular set (kx, ky) with kx ≥ ky (cudaSLGridMP2.cu:318-350), so the mapping formula 1 + (kx*(kx+1)/2 + ky)*slabNumZ is only correct when kx = abs(ii) ≥ ky = abs(jj). The CPU path explicitly swaps:

if (iix >= iiy) grid->get_pot(zpot, zz, iix, iiy);
else            grid->get_pot(zpot, zz, iiy, iix);

The CUDA kernels omit this swap. For every (ii, jj) pair where abs(jj) > abs(ii), both coefKernelSlab (line 386) and forceKernelSlab (line 504) fetch the wrong texture — one that was stored for a different transverse wavenumber . For example, (ii=0, jj=2) should use the k²=4 texture but instead fetches the k²=2 texture for (kx=1, ky=1). The fix is:

int kx = max(abs(ii), abs(jj));
int ky = min(abs(ii), abs(jj));
int offset = 1 + (kx*(kx+1)/2 + ky)*slabNumZ;

2. Opposite sign convention for the potential term at z < 0 — forceKernelSlab only

coefKernelSlab (line 394) correctly flips sign for odd modes (antisymmetric potential, φ(−z) = −φ(z)):

if (pos[2]<0 && 2*(n/2)!=n) sign = -1;   // odd n

forceKernelSlab (line 511) flips sign for even modes:

if (pos[2]<0 && 2*(n/2)==n) sign = -1;   // even n

That sign is correct for the z-force f (whose even-mode part is antisymmetric), but it is applied to the potential v as well via the same variable. For v, even modes are symmetric (no flip needed), and odd modes are antisymmetric (flip needed). So the potential term in forceKernelSlab has the sign backwards for all modes at z < 0. Since acc[0] and acc[1] are derived from fac = X*Y*v*coef, the transverse accelerations are also wrong for all particles at z < 0 (~half the particles in a slab simulation). The fix requires two separate sign variables:

int signV = 1, signF = 1;
if (pos[2]<0 && 2*(n/2)!=n) signV = -1;  // potential: flip odd modes
if (pos[2]<0 && 2*(n/2)==n) signF = -1;  // z-force:   flip even modes

and apply signV to v, signF to f.


3. Missing Adiabatic() factor in coefKernelSlab

CPU (determine_coefficients_thread, line 486):

double adb = cC->Adiabatic();
double mm  = -4.0*M_PI * cC->Mass(i) * adb;

CUDA (coefKernelSlab, line 323):

cuFP_t mm = - p.mass * 2.0*slabDfac;   // = -4π * mass, no Adiabatic()

For standard runs Adiabatic() == 1 so this has no effect, but for any adiabatic ramp the CUDA coefficients will differ from the CPU coefficients by the adiabatic factor — a global multiplicative error in every coefficient.


Minor additional discrepancies (bookkeeping/subsampling only):

  • sampleMasses in the CUDA path accumulates raw p.mass (line 847 via u_d) while the CPU path accumulates mm (which includes the sign and the adiabatic factor). Only relevant for the covariance/subsampling output.
  • cuda_initialize() unconditionally sets fullCovar = false, so the full covariance matrix is never computed on the GPU path.
  • The nminx/nminy low-wavenumber cut-off applied in the CPU force loop (line 681) is absent from forceKernelSlab. This is only relevant when nminx or nminy are set to non-zero values.

Copilot finished work on behalf of The9Cat May 3, 2026 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants