[sham::Dump] versioning & unify all metadata in single json#1705
[sham::Dump] versioning & unify all metadata in single json#1705tdavidcl wants to merge 1 commit intoShamrock-code:mainfrom
Conversation
|
Thanks @tdavidcl for opening this PR! You can do multiple things directly here: Once the workflow completes a message will appear displaying informations related to the run. Also the PR gets automatically reviewed by gemini, you can: |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the Shamrock dump mechanism by centralizing all metadata into a single JSON structure. This change simplifies the interface for saving and loading simulation states, enhances data consistency by combining various metadata types, and introduces a versioning system for dump files. The update improves the robustness and maintainability of the dumping process across different simulation models. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the Shamrock dump functionality to unify all metadata into a single JSON object and introduces versioning. While this simplifies the dump/load logic, a critical security vulnerability was identified in the loading logic: the code lacks validation for the sizes of arrays read from the JSON header, which could lead to out-of-bounds memory access when processing a malformed or malicious dump file. Explicit size checks are recommended before iterating over these arrays. Furthermore, the review identified several code quality issues, including test code that breaks an example script, opportunities to improve the new versioning implementation by actually using the version tag during loading to ensure compatibility, adherence to C++ best practices for constants, and the presence of unused code that should be removed.
| model.dump("tmp.sham") | ||
|
|
||
|
|
||
| ctx = shamrock.Context() | ||
| model = shamrock.get_Model_SPH(context=ctx, vector_type="f64_3", sph_kernel="M4") | ||
| model.load_from_dump("tmp.sham") | ||
|
|
||
| model.dump("tmp2.sham") | ||
|
|
||
| exit() |
| json jmeta = json::parse(metadata); | ||
|
|
||
| metadata_user = jmeta.at("metadata_user"); | ||
|
|
||
| auto jmeta_patch = jmeta.at("metadata_patch"); |
There was a problem hiding this comment.
The new dump format includes a versioning field, but this field is not checked during the loading process in load_shamrock_dump. This breaks backward compatibility with older dump files and misses an opportunity to validate the dump format.
The current implementation will fail when trying to load old dumps because it expects a single unified JSON header.
To make the loading process more robust, you should check for the presence and value of the versioning field. This would allow you to either support older formats or provide a clear error message if an unsupported format is detected.
For example:
json jmeta = json::parse(metadata);
if (jmeta.contains("versioning")) {
// New format
auto version = jmeta.at("versioning").get<std::string>();
if (version != WRITER_VERSION_TAG) {
logger::warn_ln("ShamrockDump", "Loading dump with version " + version + ", which differs from writer version " + WRITER_VERSION_TAG);
}
// ... proceed with loading new format
metadata_user = jmeta.at("metadata_user");
auto jmeta_patch = jmeta.at("metadata_patch");
// ...
} else {
// Old format or corrupted file
shambase::throw_with_loc<std::runtime_error>(
"Unsupported dump format: versioning information is missing.");
}|
|
||
| /// Currently this is the only valid tag as "1.0" of Shamrock dumps did not include any versioning | ||
| /// :'( | ||
| #define WRITER_VERSION_TAG "2.0" |
There was a problem hiding this comment.
Using a #define for a string constant is generally discouraged in modern C++. It's better to use a constexpr variable, which is type-safe and respects scope. This improves maintainability and prevents potential issues with macro expansion.
| #define WRITER_VERSION_TAG "2.0" | |
| static constexpr auto WRITER_VERSION_TAG = "2.0"; |
| /// :'( | ||
| #define WRITER_VERSION_TAG "2.0" | ||
|
|
||
| bool fused_metadata = true; |
Workflow reportworkflow report corresponding to commit fe2f52a Light CI is enabled. This will only run the basic tests and not the full tests. Pre-commit check reportPre-commit check: ✅ Test pipeline can run. Doxygen diff with
|
No description provided.