Skip to content

[sham::Dump] versioning & unify all metadata in single json#1705

Draft
tdavidcl wants to merge 1 commit intoShamrock-code:mainfrom
tdavidcl:patch-2026-03-08-23-16
Draft

[sham::Dump] versioning & unify all metadata in single json#1705
tdavidcl wants to merge 1 commit intoShamrock-code:mainfrom
tdavidcl:patch-2026-03-08-23-16

Conversation

@tdavidcl
Copy link
Member

@tdavidcl tdavidcl commented Mar 8, 2026

No description provided.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2026

Thanks @tdavidcl for opening this PR!

You can do multiple things directly here:
1 - Comment pre-commit.ci run to run pre-commit checks.
2 - Comment pre-commit.ci autofix to apply fixes.
3 - Add label autofix.ci to fix authorship & pre-commit for every commit made.
4 - Add label light-ci to only trigger a reduced & faster version of the CI (need the full one before merge).
5 - Add label trigger-ci to create an empty commit to trigger the CI.

Once the workflow completes a message will appear displaying informations related to the run.

Also the PR gets automatically reviewed by gemini, you can:
1 - Comment /gemini review to trigger a review
2 - Comment /gemini summary for a summary
3 - Tag it using @gemini-code-assist either in the PR or in review comments on files

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the Shamrock dump mechanism by centralizing all metadata into a single JSON structure. This change simplifies the interface for saving and loading simulation states, enhances data consistency by combining various metadata types, and introduces a versioning system for dump files. The update improves the robustness and maintainability of the dumping process across different simulation models.

Highlights

  • Unified Metadata Handling: All metadata (user-defined, patch-specific, and internal dump information) is now consolidated into a single nlohmann::json object when writing and reading Shamrock dump files.
  • API Simplification: The write_shamrock_dump and load_shamrock_dump functions in ShamrockDump.hpp now directly accept and return nlohmann::json objects for metadata, removing the need for manual string serialization/deserialization.
  • Dump Versioning: A version tag (WRITER_VERSION_TAG "2.0") has been introduced and is stored within the unified JSON metadata, allowing for future compatibility management of dump file formats.
  • Model Integration: The GSPH, RAMSES, and SPH model dump/load functionalities have been updated to utilize the new unified JSON metadata approach.
Changelog
  • examples/sph/run_sph_rendering.py
    • Added a test case to verify the dump and load functionality of the SPH model.
  • src/shammodels/gsph/include/shammodels/gsph/Model.hpp
    • Updated load_from_dump to directly receive nlohmann::json metadata.
    • Modified dump to pass nlohmann::json metadata directly to write_shamrock_dump.
  • src/shammodels/ramses/include/shammodels/ramses/Model.hpp
    • Updated dump to pass nlohmann::json metadata directly.
    • Modified load_from_dump to directly receive nlohmann::json metadata.
  • src/shammodels/sph/include/shammodels/sph/Model.hpp
    • Updated load_from_dump to directly receive nlohmann::json metadata.
    • Modified dump to pass nlohmann::json metadata directly.
  • src/shamrock/include/shamrock/io/ShamrockDump.hpp
    • Included nlohmann/json.hpp header.
    • Changed write_shamrock_dump signature to accept const nlohmann::json& for metadata.
    • Changed load_shamrock_dump signature to accept nlohmann::json& for metadata.
  • src/shamrock/src/io/ShamrockDump.cpp
    • Defined WRITER_VERSION_TAG for dump versioning.
    • Modified write_shamrock_dump to combine patch metadata, user metadata, and versioning into a single nlohmann::json object before writing.
    • Removed separate header writes for user and patch metadata, now writing a single unified JSON header.
    • Modified load_shamrock_dump to read a single unified JSON header.
    • Extracted user metadata and patch metadata from the unified JSON object during loading.
    • Updated byte count and offset calculations for preallocation to reflect the single metadata header.
Activity
  • No specific activity (comments, reviews, or progress updates) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the Shamrock dump functionality to unify all metadata into a single JSON object and introduces versioning. While this simplifies the dump/load logic, a critical security vulnerability was identified in the loading logic: the code lacks validation for the sizes of arrays read from the JSON header, which could lead to out-of-bounds memory access when processing a malformed or malicious dump file. Explicit size checks are recommended before iterating over these arrays. Furthermore, the review identified several code quality issues, including test code that breaks an example script, opportunities to improve the new versioning implementation by actually using the version tag during loading to ensure compatibility, adherence to C++ best practices for constants, and the presence of unused code that should be removed.

Comment on lines +216 to +225
model.dump("tmp.sham")


ctx = shamrock.Context()
model = shamrock.get_Model_SPH(context=ctx, vector_type="f64_3", sph_kernel="M4")
model.load_from_dump("tmp.sham")

model.dump("tmp2.sham")

exit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This block of code appears to be for testing the dump/load functionality and includes an exit() call. This prevents the rest of the example script from running, effectively breaking it. This test code should be removed before merging.

Comment on lines +189 to +193
json jmeta = json::parse(metadata);

metadata_user = jmeta.at("metadata_user");

auto jmeta_patch = jmeta.at("metadata_patch");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The new dump format includes a versioning field, but this field is not checked during the loading process in load_shamrock_dump. This breaks backward compatibility with older dump files and misses an opportunity to validate the dump format.

The current implementation will fail when trying to load old dumps because it expects a single unified JSON header.

To make the loading process more robust, you should check for the presence and value of the versioning field. This would allow you to either support older formats or provide a clear error message if an unsupported format is detected.

For example:

json jmeta = json::parse(metadata);

if (jmeta.contains("versioning")) {
    // New format
    auto version = jmeta.at("versioning").get<std::string>();
    if (version != WRITER_VERSION_TAG) {
        logger::warn_ln("ShamrockDump", "Loading dump with version " + version + ", which differs from writer version " + WRITER_VERSION_TAG);
    }
    // ... proceed with loading new format
    metadata_user = jmeta.at("metadata_user");
    auto jmeta_patch = jmeta.at("metadata_patch");
    // ...
} else {
    // Old format or corrupted file
    shambase::throw_with_loc<std::runtime_error>(
        "Unsupported dump format: versioning information is missing.");
}


/// Currently this is the only valid tag as "1.0" of Shamrock dumps did not include any versioning
/// :'(
#define WRITER_VERSION_TAG "2.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a #define for a string constant is generally discouraged in modern C++. It's better to use a constexpr variable, which is type-safe and respects scope. This improves maintainability and prevents potential issues with macro expansion.

Suggested change
#define WRITER_VERSION_TAG "2.0"
static constexpr auto WRITER_VERSION_TAG = "2.0";

/// :'(
#define WRITER_VERSION_TAG "2.0"

bool fused_metadata = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The global variable fused_metadata is introduced but it's not used anywhere in the codebase. This unused variable should be removed to improve code clarity and avoid confusion.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2026

Workflow report

workflow report corresponding to commit fe2f52a
Commiter email is timothee.davidcleris@proton.me

Light CI is enabled. This will only run the basic tests and not the full tests.
Merging a PR require the job "on PR / all" to pass which is disabled in this case.

Pre-commit check report

Pre-commit check: ✅

trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check for merge conflicts................................................Passed
check that executables have shebangs.....................................Passed
check that scripts with shebangs are executable..........................Passed
check for added large files..............................................Passed
check for case conflicts.................................................Passed
check for broken symlinks................................................Passed
check yaml...............................................................Passed
detect private key.......................................................Passed
No-tabs checker..........................................................Passed
Tabs remover.............................................................Passed
Validate GitHub Workflows................................................Passed
clang-format.............................................................Passed
ruff check...............................................................Passed
ruff format..............................................................Passed
Check doxygen headers....................................................Passed
Check license headers....................................................Passed
Check #pragma once.......................................................Passed
Check SYCL #include......................................................Passed
No ssh in git submodules remote..........................................Passed
No UTF-8 in files (except for authors)...................................Passed

Test pipeline can run.

Doxygen diff with main

Removed warnings : 8
New warnings : 9
Warnings count : 8203 → 8204 (0.0%)

Detailed changes :
+ src/shammodels/gsph/include/shammodels/gsph/Model.hpp:364: warning: Member dump(std::string fname) (function) of class shammodels::gsph::Model is not documented.
- src/shammodels/gsph/include/shammodels/gsph/Model.hpp:365: warning: Member dump(std::string fname) (function) of class shammodels::gsph::Model is not documented.
+ src/shammodels/gsph/include/shammodels/gsph/Model.hpp:381: warning: Member timestep() (function) of class shammodels::gsph::Model is not documented.
+ src/shammodels/gsph/include/shammodels/gsph/Model.hpp:383: warning: Member evolve_once() (function) of class shammodels::gsph::Model is not documented.
- src/shammodels/gsph/include/shammodels/gsph/Model.hpp:383: warning: Member timestep() (function) of class shammodels::gsph::Model is not documented.
- src/shammodels/gsph/include/shammodels/gsph/Model.hpp:385: warning: Member evolve_once() (function) of class shammodels::gsph::Model is not documented.
+ src/shammodels/gsph/include/shammodels/gsph/Model.hpp:388: warning: Member evolve_until(Tscal target_time, i32 niter_max=-1) (function) of class shammodels::gsph::Model is not documented.
- src/shammodels/gsph/include/shammodels/gsph/Model.hpp:390: warning: Member evolve_until(Tscal target_time, i32 niter_max=-1) (function) of class shammodels::gsph::Model is not documented.
+ src/shammodels/sph/include/shammodels/sph/Model.hpp:895: warning: Member evolve_once_time_expl(f64 t_curr, f64 dt_input) (function) of class shammodels::sph::Model is not documented.
+ src/shammodels/sph/include/shammodels/sph/Model.hpp:897: warning: Member timestep() (function) of class shammodels::sph::Model is not documented.
- src/shammodels/sph/include/shammodels/sph/Model.hpp:898: warning: Member evolve_once_time_expl(f64 t_curr, f64 dt_input) (function) of class shammodels::sph::Model is not documented.
+ src/shammodels/sph/include/shammodels/sph/Model.hpp:899: warning: Member evolve_once() (function) of class shammodels::sph::Model is not documented.
- src/shammodels/sph/include/shammodels/sph/Model.hpp:900: warning: Member timestep() (function) of class shammodels::sph::Model is not documented.
- src/shammodels/sph/include/shammodels/sph/Model.hpp:902: warning: Member evolve_once() (function) of class shammodels::sph::Model is not documented.
+ src/shammodels/sph/include/shammodels/sph/Model.hpp:904: warning: Member evolve_until(Tscal target_time, i32 niter_max) (function) of class shammodels::sph::Model is not documented.
- src/shammodels/sph/include/shammodels/sph/Model.hpp:907: warning: Member evolve_until(Tscal target_time, i32 niter_max) (function) of class shammodels::sph::Model is not documented.
+ src/shamrock/src/io/ShamrockDump.cpp:26: warning: Member fused_metadata (variable) of file ShamrockDump.cpp is not documented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant