Submit rgpycrumbs for review

Submitting Author: Rohit Goswami (@HaoZeke)
All current maintainers: (@HaoZeke)
Package Name: `rgpycrumbs`
One-Line Description of Package: A dispatcher-managed analytical engine for computational chemistry -- kernel-based surface fitting, structural analysis, and reactive path projection.
Repository Link: https://github.com/HaoZeke/rgpycrumbs
Version submitted: v1.1.0
EiC: @kysolvik 
Editor: TBD
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD

---

## Code of Conduct & Commitment to Maintain Package

- [x] I agree to abide by [pyOpenSci's Code of Conduct][PyOpenSciCodeOfConduct] during the review process and in future interactions in spaces supported by pyOpenSci should it be accepted.
- [x] I have read and will commit to package maintenance after the review as per the [pyOpenSci Policies Guidelines][Commitment].

## Description

`rgpycrumbs` provides a modular analytical framework for chemical physics and molecular kinetics. The library implements:

- **Kernel-based surface fitting** (`rgpycrumbs.surfaces`): JAX-based Gaussian process regression with Matern 3/2, inverse multiquadric, rational quadratic, and squared exponential kernels. Includes gradient-enhanced and Nystrom-approximated variants for interpolating 2D energy/eigenvalue landscapes from sparse NEB data.
- **Structural analysis** (`rgpycrumbs.geom`): Fragment detection via Wiberg Bond Orders, Iterative Rotations and Assignments (IRA) for structure matching, and RMSD-based reactive path projection into 2D manifolds.
- **Hermite spline interpolation** (`rgpycrumbs.interpolation`): Force-consistent spline interpolation of energy profiles along reaction coordinates.
- **Shared data types** (`rgpycrumbs.basetypes`): Standardized representations for NEB paths and molecular structures.
- **PEP 723 dispatcher**: An execution "launchpad" that manages isolated environments for CLI scripts with conflicting binary dependencies (e.g. OVITO for defect analysis alongside tblite for electronic structure).

The companion library [chemparseplot](https://github.com/HaoZeke/chemparseplot) (`chemparseplot>=1.0.0`) handles parsing outputs from computational chemistry codes (eOn, ORCA, ChemGP) and delegates heavy computation to `rgpycrumbs` for publication-quality visualizations.
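
For readers unfamiliar with the kernel families listed for `rgpycrumbs.surfaces`, the sketch below writes out common textbook forms as functions of the distance `r` between two points. It uses plain Python rather than JAX for brevity, and the function names, parametrizations, and default hyperparameters are assumptions; the forms implemented in the package may differ.

```python
import math


def squared_exponential(r, length=1.0, sigma=1.0):
    """sigma^2 * exp(-r^2 / (2 * length^2))"""
    return sigma**2 * math.exp(-0.5 * (r / length) ** 2)


def matern32(r, length=1.0, sigma=1.0):
    """sigma^2 * (1 + sqrt(3) r / length) * exp(-sqrt(3) r / length)"""
    s = math.sqrt(3.0) * abs(r) / length
    return sigma**2 * (1.0 + s) * math.exp(-s)


def rational_quadratic(r, length=1.0, alpha=1.0, sigma=1.0):
    """sigma^2 * (1 + r^2 / (2 * alpha * length^2)) ** (-alpha)"""
    return sigma**2 * (1.0 + r**2 / (2.0 * alpha * length**2)) ** (-alpha)


def inverse_multiquadric(r, length=1.0, sigma=1.0):
    """sigma^2 / sqrt(1 + (r / length)^2), one common parametrization."""
    return sigma**2 / math.sqrt(1.0 + (r / length) ** 2)


if __name__ == "__main__":
    # Compare how quickly each kernel decays with distance.
    for r in (0.0, 0.5, 1.0, 2.0):
        print(
            r,
            squared_exponential(r),
            matern32(r),
            rational_quadratic(r),
            inverse_multiquadric(r),
        )
```

All four kernels equal `sigma**2` at `r = 0` and decay with distance; the rational quadratic and inverse multiquadric forms have heavier tails than the squared exponential.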

## Scope

- [ ] Data retrieval
- [ ] Data extraction
- [x] Data processing/munging
- [ ] Data deposition
- [ ] Data validation and testing
- [x] Data visualization[^1]
- [x] Workflow automation
- [ ] Citation management and bibliometrics
- [x] Scientific software wrappers
- [ ] Database interoperability

## Domain Specific

- [ ] Geospatial
- [ ] Education

## Community Partnerships

- [ ] Astropy: [My package adheres to Astropy community standards](https://www.pyopensci.org/software-peer-review/partners/astropy.html)
- [ ] Pangeo: My package adheres to the [Pangeo standards listed in the pyOpenSci peer review guidebook][PangeoCollaboration]

> [^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

- **For all submissions**, explain how and why the package falls under the categories you indicated above:

  **Who is the target audience and what are the scientific applications?**

  Computational chemists and physicists investigating reaction rates, saddle point searches, and long-timescale molecular dynamics. The software provides the implementation foundation for the doctoral dissertation "Efficient Exploration of Chemical Kinetics" (University of Iceland, 2025, arXiv: [2510.21368](https://arxiv.org/abs/2510.21368)) and supports GP-accelerated saddle point searches published in ChemPhysChem (doi: [10.1002/cphc.202500730](https://doi.org/10.1002/cphc.202500730)).

  **Are there other Python packages that accomplish the same thing?**

  General libraries like ASE or pymatgen provide atomic simulation environments but lack: (1) the 2D reactive path RMSD projection method, (2) gradient-enhanced kernel interpolation for sparse NEB landscapes, (3) environmental isolation for heterogeneous binary dependencies. `rgpycrumbs` complements these tools rather than replacing them -- it uses ASE's Atoms objects as its native data type.

  **Pre-submission inquiry:**

  [pyOpenSci/software-submission#275](https://github.com/pyOpenSci/software-submission/issues/275) -- accepted with `submission-requested` tag by @kysolvik.

  **External adoption:**

  - [eOn](https://github.com/TheochemUI/eOn) -- long-timescale molecular dynamics code; uses rgpycrumbs as its companion diagnostic and visualization suite (https://eondocs.org/).
  - [lab-cosmo cookbook](https://github.com/lab-cosmo/atomistic-cookbook) -- community cookbook for atomistic simulations; the example at https://atomistic-cookbook.org/examples/eon-pet-neb/eon-pet-neb.html imports `rgpycrumbs.eon.helpers` and `rgpycrumbs.run.jupyter`.
  - [metatensor](https://github.com/metatensor/ecosystem-article) -- uses rgpycrumbs for eOn NEB plotting and PLUMED FES reconstruction.
  - Five reproduction packages for published/submitted work.
  - [Carpentries-style](https://github.com/HaoZeke/python_dissemination_workbench) teaching materials (https://github.com/carpentries-lab/reviews/issues/35).

  Full list maintained at https://rgpycrumbs.rgoswami.me/used_by.html. Documentation uses an org-to-rst pipeline with Sphinx (https://rgpycrumbs.rgoswami.me).

## Technical checks

For details about the pyOpenSci packaging requirements, see our [packaging guide][PackagingGuide]. Confirm each of the following by checking the box. This package:

- [x] does not violate the Terms of Service of any service it interacts with.
- [x] uses an [OSI approved license][OsiApprovedLicense] (MIT).
- [x] contains a README with instructions for installing the development version.
- [x] includes documentation with examples for all functions.
- [x] contains a tutorial with examples of its essential functions and uses.
- [x] has a test suite.
- [x] has continuous integration set up, such as GitHub Actions, CircleCI, and/or others.

## Generative AI Disclosure

Generative AI (Claude, Anthropic) was used as an agentic coding assistant throughout development of rgpycrumbs and the companion library chemparseplot. Specific usage:

- **Components:** All modules received some degree of AI-assisted development, including surface fitting kernels, test suites, CLI scripts, and documentation.
- **Scale:** AI assistance ranged from line-level completions to full function implementations. The mathematical core (kernel definitions, gradient derivations, RMSD projections) was specified by the human author with AI implementing the numerical code. Test suites were largely AI-generated from human-specified test cases and reference values.
- **Methods:** Agentic workflows (Claude Code CLI) for multi-file refactoring, code generation from specifications, and test writing. Interactive queries for algorithm design and debugging.
- **Human review:** All generated material has been reviewed and edited for clarity, correctness, and completeness. Mathematical implementations were validated against published formulas and reference implementations (MATLAB, Julia). Test suites run in CI on every commit.


To clarify the development history: the core algorithms derive from https://github.com/TheochemUI/gpr_optim, a C++ port of Olli-Pekka Koistinen's original MATLAB GPR-dimer code, written by hand with Satish Kamath and Maxim Masterov (est. 2020). The kernel mathematics, gradient derivations, and RMSD projection methods were developed and validated against these references over the course of doctoral dissertation work (2020--2025). The reproduction packages (https://github.com/TheochemUI/gpr_sella_repro, https://github.com/TheochemUI/otgpd_repro, https://github.com/HaoZeke/nebmmf_repro, https://github.com/HaoZeke/nebviz_repro, https://github.com/HaoZeke/brms_idrot_repro) predate any AI-assisted development. The package architecture (PEP 723 dispatcher, org-to-rst documentation pipeline, Sphinx integration) is original design work.

## Publication Options

- [x] Do you wish to automatically submit to the [Journal of Open Source Software][JournalOfOpenSourceSoftware]? If so:

<details>
 <summary>JOSS Checks</summary>

- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor 'utility' packages, including 'thin' API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

</details>

## Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser text-based review. It will also allow you to demonstrate addressing the issue via PR links.

- [x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

- [x] I have read the [author guide](https://www.pyopensci.org/software-peer-review/how-to/author-guide.html).
- [x] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

## Please fill out our survey

- [x] [Last but not least please fill out our pre-review survey](https://forms.gle/F9mou7S3jhe8DMJ16). This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.

**P.S.** Have feedback/comments about our review process? Leave a comment [here][Comments]

## Editor and Review Templates

The [editor template can be found here][Editor Template].

The [review template can be found here][Review Template].

[PackagingGuide]: https://www.pyopensci.org/python-package-guide/

[PackageCategories]: https://www.pyopensci.org/software-peer-review/about/package-scope.html

[JournalOfOpenSourceSoftware]: http://joss.theoj.org/

[JossSubmissionRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements

[JossPaperRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain

[PyOpenSciCodeOfConduct]: https://www.pyopensci.org/handbook/CODE_OF_CONDUCT.html

[OsiApprovedLicense]: https://opensource.org/licenses

[Editor Template]: https://www.pyopensci.org/software-peer-review/appendices/templates.html#editor-s-template

[Review Template]: https://www.pyopensci.org/software-peer-review/appendices/templates.html#peer-review-template

[Comments]: https://pyopensci.discourse.group/

[PangeoCollaboration]: https://www.pyopensci.org/software-peer-review/partners/pangeo

[pangeoWebsite]: https://www.pangeo.io

[Commitment]: https://www.pyopensci.org/software-peer-review/our-process/policies.html#after-acceptance-package-ownership-and-maintenance
