refactor: Introduce workspace, module, checkpoint#148
refactor: Introduce workspace, module, checkpoint#148zhanglei1949 wants to merge 3 commits intoalibaba:mainfrom
Conversation
ⓘ You are approaching your monthly quota for Qodo. Upgrade your plan Review Summary by QodoRefactor storage management with Module abstraction, Workspace/Checkpoint framework, and ModuleFactory registry
WalkthroughsDescription• Introduces comprehensive module abstraction layer with Module base class defining unified lifecycle interface (Open, Dump, Close, Fork) for all storage components • Implements ModuleFactory singleton with type-erased instantiation and registration mechanism via NEUG_REGISTER_MODULE macro • Introduces Workspace and Checkpoint abstractions for managing numbered checkpoint directories with snapshot/runtime/wal subdirectories • Implements ModuleDescriptor metadata structure with JSON serialization for module inventory tracking • Introduces SnapshotMeta module managing checkpoint's module inventory and persisting metadata as JSON • Refactors PropertyGraph, VertexTable, EdgeTable, Schema, CsrBase, ColumnBase, and LFIndexer to inherit from Module and implement the unified lifecycle interface • Registers 78+ template specializations (MutableCsr, ImmutableCsr, TypedColumn) with ModuleFactory covering all data types • Adds StorageTypeName template trait for C++ type to string mapping with macro-based specialization • Refactors NeugDB to use Workspace for checkpoint discovery and opens graphs from latest checkpoint • Updates all storage tests and benchmarks to use new Checkpoint-based API instead of legacy directory management • Creates new CMake target neug_storages_module for module factory and base classes Diagramflowchart LR
A["Module Base Class<br/>Open/Dump/Close/Fork"] --> B["ModuleFactory<br/>Registry & Creation"]
C["Workspace<br/>Checkpoint Management"] --> D["Checkpoint<br/>Numbered Directories"]
D --> E["SnapshotMeta<br/>Module Inventory"]
E --> F["Schema<br/>VertexTable<br/>EdgeTable"]
F --> G["CsrBase<br/>ColumnBase<br/>LFIndexer"]
G --> B
H["PropertyGraph"] --> C
H --> A
I["NeugDB"] --> H
File Changes1. include/neug/storages/module/module.h
|
Code Review by Qodo
1. Checkpoint restore drops data
|
516e96f to
5bc7c6d
Compare
b0c0529 to
382166b
Compare
ac894ec to
347749a
Compare
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
fd47ddd to
2e9e53a
Compare
46823b4 to
96eb6ed
Compare
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors persistent storage management to revolve around Workspace + Checkpoint + ModuleDescriptor (module-based Open/Dump/Fork), and updates graph/table/CSR codepaths and tests accordingly (Fix #133).
Changes:
- Introduce checkpointed workspace layout (
Workspace,Checkpoint,SnapshotMeta,ModuleDescriptor) and module registry (ModuleFactory). - Migrate
PropertyGraph,VertexTable,EdgeTable,Table/Column, and CSR implementations to the new module-driven Open/Dump/Fork API. - Update/extend unit & storage tests for the new checkpoint/module workflow and add new fork/dump matrix tests.
Reviewed changes
Copilot reviewed 67 out of 67 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Updates table tests to use Workspace/Checkpoint and descriptors instead of hardcoded dirs. |
| tests/unittest/utils.h | Adds standard library includes needed by updated unit test helpers. |
| tests/unittest/test_column.cc | New typed tests for column fork isolation + dump/open matrix. |
| tests/unittest/logical_delete_test.cc | Migrates PropertyGraph open to checkpoint-based API. |
| tests/unittest/CMakeLists.txt | Adds new test_column target. |
| tests/storage/vertex_table_benchmark.cc | Updates benchmark to use checkpoint/module descriptors. |
| tests/storage/test_vertex_table.cc | Migrates vertex table tests to checkpoint-based Open/Dump. |
| tests/storage/test_set_property.cc | Includes updated unittest utils header. |
| tests/storage/test_property_graph.cc | Migrates property graph test fixture to checkpoint-based Open. |
| tests/storage/test_open_graph.cc | Includes updated unittest utils header and tweaks logging. |
| tests/storage/test_mutable_csr.cc | Adds fork/dump tests and migrates CSR tests to checkpoint-based API. |
| tests/storage/test_immutable_csr.cc | Migrates immutable CSR tests to checkpoint-based API and adds workspace usage. |
| tests/storage/test_export.cc | Includes updated unittest utils header. |
| tests/storage/test_ddl.cc | Includes updated unittest utils header. |
| tests/storage/test_csr_stream_ops.cc | Migrates CSR stream tests to checkpoint-based API and workspace-managed dirs. |
| tests/storage/test_csr_batch_ops.cc | Migrates CSR batch tests to checkpoint-based Open/Dump. |
| tests/storage/alter_property_test.cc | Migrates alter-property test to open graph via checkpoint. |
| tests/execution/test_value.cc | Minor formatting cleanup for EXPECT_DEATH. |
| src/utils/uuid.cc | Adds UUID generator implementation. |
| src/utils/property/table.cc | Refactors Table to module-based Open/Dump/Fork and registers column module types. |
| src/utils/id_indexer.cc | Registers LF indexer module type. |
| src/storages/module/module_factory.cc | Adds module factory implementation and registry. |
| src/storages/module/CMakeLists.txt | Adds object library for module sources. |
| src/storages/loader/abstract_property_graph_loader.cc | Loader now creates a workspace+checkpoint and uses checkpoint meta schema. |
| src/storages/graph/workspace.cc | New workspace implementation that discovers/creates checkpoints. |
| src/storages/graph/vertex_timestamp.cc | Migrates vertex timestamp to module-based Open/Dump/Fork and registers module. |
| src/storages/graph/vertex_table.cc | Migrates vertex table to module-based Open/Dump and descriptor submodules. |
| src/storages/graph/snapshot_meta.cc | Adds JSON-persisted snapshot meta (schema + module inventory). |
| src/storages/graph/schema.cc | Adds Schema JSON (de)serialization via YAML<->JSON conversion and fixes edge-strategy labels. |
| src/storages/graph/property_graph.cc | Migrates graph Open/Dump to checkpoint/meta-driven module open. |
| src/storages/graph/module_descriptor.cc | Implements ModuleDescriptor JSON and submodule tree handling. |
| src/storages/graph/edge_table.cc | Migrates edge table to module-based Open/Dump and property mutations using checkpoint. |
| src/storages/graph/checkpoint.cc | New checkpoint implementation (dirs, meta persistence, commit semantics, cleanup). |
| src/storages/csr/mutable_csr.cc | Migrates mutable CSR variants to module API, implements descriptor-based dump/open, registers module types. |
| src/storages/csr/immutable_csr.cc | Migrates immutable CSR variants to module API, implements dump/open, registers module types. |
| src/storages/container/mmap_container.cc | Improves mmap resize error and adds Fork for containers. |
| src/storages/container/container_utils.cc | Improves error logging when tmp path is missing in disk-backed mode. |
| src/storages/CMakeLists.txt | Adds module/ subdir into storages build. |
| src/server/neug_db_service.cc | Updates session pool init after WAL path handling changes. |
| src/main/neug_db.cc | Migrates DB open/ingest/checkpoint flow to workspace+checkpoint model. |
| include/neug/utils/uuid.h | Declares UUIDGenerator. |
| include/neug/utils/property/table.h | Updates Table API to checkpoint/module descriptor and adds Fork. |
| include/neug/utils/property/column.h | Migrates columns to Module interface and implements Open/Dump/Fork/type names. |
| include/neug/utils/id_indexer.h | Migrates LFIndexer to Module interface and adds Open/Dump/Fork. |
| include/neug/storages/workspace.h | Declares Workspace API. |
| include/neug/storages/snapshot_meta.h | Declares SnapshotMeta (schema + modules) and key helpers. |
| include/neug/storages/module_descriptor.h | Declares ModuleDescriptor with JSON + submodules. |
| include/neug/storages/module/type_name.h | Adds StorageTypeName<> trait for module type strings. |
| include/neug/storages/module/module_factory.h | Declares ModuleFactory + registration macro. |
| include/neug/storages/module/module.h | Declares the module lifecycle interface. |
| include/neug/storages/loader/abstract_property_graph_loader.h | Loader now owns a Workspace and seeds checkpoint meta schema. |
| include/neug/storages/graph/vertex_timestamp.h | VertexTimestamp now implements Module. |
| include/neug/storages/graph/vertex_table.h | VertexTable migrated to descriptor-based Open/Dump; indexer is now a pointer. |
| include/neug/storages/graph/schema.h | Adds Schema JSON API declarations. |
| include/neug/storages/graph/property_graph.h | Switches graph open/dump API to checkpoint-based, stores Checkpoint*. |
| include/neug/storages/graph/edge_table.h | Migrates EdgeTable API to checkpoint-based Open/Dump/Close and property mutations. |
| include/neug/storages/file_names.h | Removes temp_checkpoint_dir helper (no longer used with checkpoint model). |
| include/neug/storages/csr/mutable_csr.h | Migrates CSR base implementations to module API and adds Fork/type names. |
| include/neug/storages/csr/immutable_csr.h | Migrates immutable CSR to module API and defines module type names. |
| include/neug/storages/csr/csr_base.h | Makes CSR base inherit Module and adds fork_vertex hook. |
| include/neug/storages/container/mmap_container.h | Declares container Fork. |
| include/neug/storages/container/i_container.h | Adds container Fork virtual method. |
| include/neug/storages/checkpoint.h | Declares Checkpoint class and helpers (commit/meta/containers). |
| include/neug/server/session_pool.h | SessionPool now derives WAL location from graph.checkpoint(). |
| include/neug/main/neug_db.h | NeugDB::work_dir() now returns workspace db dir; adds Workspace member. |
Comments suppressed due to low confidence (12)
src/storages/graph/property_graph.cc:1
SnapshotMeta::edge_table_keyis declared as(src_label, edge_label, dst_label)ininclude/neug/storages/snapshot_meta.h, but here it’s called as(src_label, dst_label, edge_label). This will break meta lookups (and likely won’t compile if overloads don’t match). Align the argument order either in the helper signature or at all call sites.
src/storages/graph/vertex_table.cc:1Tablewas refactored toOpen/Dump/Forkand no longer shows aclose()API in the updated header diff, sotable_->close()is very likely a compile error. Either reintroduce aTable::Close()method (closing all columns’ modules) and call that, or update all callers to the new lifecycle method name consistently.
src/storages/graph/edge_table.cc:1- Same issue as VertexTable:
table_->close()is inconsistent with the refactoredTableAPI and is likely missing now. Consider adding and usingTable::Close()consistently from both VertexTable and EdgeTable.
src/utils/property/table.cc:1 add_columnsignores thememory_levelargument and always opens new columns askInMemory. This changes behavior for disk-backed or hugepage configurations. Use the providedmemory_level(and a stable per-column sub-descriptor/path strategy) so added columns follow the table’s storage policy.
src/utils/property/table.cc:1add_columnsignores thememory_levelargument and always opens new columns askInMemory. This changes behavior for disk-backed or hugepage configurations. Use the providedmemory_level(and a stable per-column sub-descriptor/path strategy) so added columns follow the table’s storage policy.
src/storages/graph/vertex_timestamp.cc:1- When
descis default-constructed,desc.pathis empty;std::filesystem::exists(\"\")can resolve to the current directory on some platforms/libstdc++ versions, causing an unintendedload_tscall with an empty path. Guard explicitly with!desc.path.empty()before checkingexists().
tests/utils/test_table.cc:1 - This uses
::getpid()but the file diff doesn’t show an include for<unistd.h>(or an equivalent platform header). This can fail to compile on some toolchains. Add the appropriate include or replacegetpid()usage with a portable alternative.
tests/storage/vertex_table_benchmark.cc:1 VertexTable::Openexpects theModuleDescriptorto contain sub-modules (indexer,property_table,vertex_timestamp). Providing onlymodule_type+pathwill result in empty sub-descriptors and can cause runtime failures (e.g., LFIndexer needing key type). For a benchmark, either (1) create a new empty table by passing a default descriptor but ensuring the indexer is initialized with pk type and opened accordingly, or (2) load a real descriptor produced byVertexTable::Dump.
src/storages/graph/checkpoint.cc:1- This constructs a new
UUIDGenerator(and thus a newstd::random_device/PRNG) on every loop iteration. Consider reusing a static/shared generator (e.g.,generate_uuid()already uses a static generator) to reduce overhead, especially since this is called frequently for runtime object creation.
src/utils/uuid.cc:1 <regex>is included but not used in this translation unit. Removing unused includes reduces compile time and avoids confusion about intended validation logic.
src/storages/graph/checkpoint.cc:1- Checkpoint meta updates now drive deletion of orphaned snapshot files. This is correctness-critical and potentially destructive; please add focused tests that (1) write meta referencing a subset of UUID files, (2) call
UpdateMeta, and (3) verify only unreferenced UUID files are deleted while referenced ones remain.
src/storages/graph/checkpoint.cc:1 - Checkpoint meta updates now drive deletion of orphaned snapshot files. This is correctness-critical and potentially destructive; please add focused tests that (1) write meta referencing a subset of UUID files, (2) call
UpdateMeta, and (3) verify only unreferenced UUID files are deleted while referenced ones remain.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f54bdd6 to
2e6503d
Compare
There was a problem hiding this comment.
Pull request overview
Refactors the on-disk workspace layout by introducing first-class Workspace, Checkpoint, and ModuleDescriptor abstractions, migrating graph/storage modules and tests away from ad-hoc file_names.h path composition (Fix #133).
Changes:
- Introduce
Workspace+Checkpointdirectory model with JSONSnapshotMetaand hierarchicalModuleDescriptortrees for module open/dump. - Convert core storage modules (columns, tables, CSRs, indexer, vertex/edge tables, property graph) to a unified
Moduleinterface (Open/Dump/Close/Fork). - Update unit/integration tests and benchmarks to create checkpoints via
Workspaceand reopen modules via returned descriptors.
Reviewed changes
Copilot reviewed 68 out of 68 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Updates table tests to use Workspace/Checkpoint and descriptor-based open/dump. |
| tests/unittest/utils.h | Adds standard headers needed by updated unit-test utilities. |
| tests/unittest/test_column.cc | Adds typed-column fork/dump/open matrix tests using new checkpoint/descriptors. |
| tests/unittest/logical_delete_test.cc | Switches logical-delete tests to open graphs from a Checkpoint. |
| tests/unittest/CMakeLists.txt | Registers new test_column target. |
| tests/storage/vertex_table_benchmark.cc | Updates benchmark open path to use checkpoint + descriptor. |
| tests/storage/test_vertex_table.cc | Migrates vertex table tests to checkpoint-based open/dump/reopen. |
| tests/storage/test_set_property.cc | Includes common unit-test utilities header. |
| tests/storage/test_property_graph.cc | Uses Workspace + Checkpoint to open PropertyGraph. |
| tests/storage/test_open_graph.cc | Includes common unit-test utilities header and adjusts logging. |
| tests/storage/test_mutable_csr.cc | Adds fork/dump/open tests and migrates CSR open/dump APIs to checkpoint/descriptors. |
| tests/storage/test_immutable_csr.cc | Migrates immutable CSR tests to checkpoint/descriptors and adds workspace usage. |
| tests/storage/test_export.cc | Includes common unit-test utilities header. |
| tests/storage/test_ddl.cc | Includes common unit-test utilities header. |
| tests/storage/test_csr_stream_ops.cc | Replaces manual dir layout with Workspace + checkpoint-based CSR open. |
| tests/storage/test_csr_batch_ops.cc | Migrates batch CSR tests to checkpoint/descriptors and removes manual dir plumbing. |
| tests/storage/alter_property_test.cc | Uses Workspace + Checkpoint to open/modify graph instead of raw work dir. |
| tests/execution/test_value.cc | Minor formatting change to EXPECT_DEATH. |
| src/utils/uuid.cc | Adds UUID generation implementation. |
| src/utils/property/table.cc | Refactors Table to module-based Open/Dump/Fork integration. |
| src/utils/property/column.cc | Registers TypedColumn<T> template modules with ModuleFactory. |
| src/utils/id_indexer.cc | Registers LFIndexer as a template module. |
| src/storages/module/module_factory.cc | Implements ModuleFactory registry and creation. |
| src/storages/module/CMakeLists.txt | Adds object library for module factory sources. |
| src/storages/loader/abstract_property_graph_loader.cc | Uses Workspace and checkpoint meta for loader error reporting / initialization. |
| src/storages/graph/workspace.cc | Implements workspace directory scanning and checkpoint creation. |
| src/storages/graph/vertex_timestamp.cc | Converts VertexTimestamp to Module with descriptor-based open/dump/fork. |
| src/storages/graph/vertex_table.cc | Converts VertexTable to descriptor-based module open/dump and updated indexer usage. |
| src/storages/graph/snapshot_meta.cc | Introduces JSON-backed snapshot metadata for schema + module inventory. |
| src/storages/graph/schema.cc | Adds YAML↔JSON conversion helpers and schema JSON (de)serialization. |
| src/storages/graph/property_graph.cc | Refactors graph open/dump to operate on a Checkpoint + SnapshotMeta. |
| src/storages/graph/module_descriptor.cc | Implements hierarchical ModuleDescriptor with JSON serialization. |
| src/storages/graph/edge_table.cc | Converts EdgeTable open/dump and property mutation flows to checkpoint/descriptors. |
| src/storages/graph/checkpoint.cc | Implements checkpoint directory management, meta update, and orphan cleanup. |
| src/storages/csr/mutable_csr.cc | Refactors mutable CSR open/dump/close to checkpoint/descriptors and registers templates. |
| src/storages/csr/immutable_csr.cc | Refactors immutable CSR open/dump/close to checkpoint/descriptors and registers templates. |
| src/storages/container/mmap_container.cc | Adds IDataContainer::Fork support and improves mmap resize error detail. |
| src/storages/container/container_utils.cc | Strengthens validation/logging for disk-backed container creation. |
| src/storages/CMakeLists.txt | Adds new module subdirectory into storages object build. |
| src/server/neug_db_service.cc | Adjusts session pool initialization to derive WAL location from graph/checkpoint. |
| src/main/neug_db.cc | Uses Workspace checkpoints to open graph, ingest WALs, and create new checkpoints. |
| include/neug/utils/uuid.h | Declares UUIDGenerator. |
| include/neug/utils/property/table.h | Updates Table API to checkpoint/descriptors and adds Fork. |
| include/neug/utils/property/column.h | Converts columns to Module interface and adds fork/type naming. |
| include/neug/utils/id_indexer.h | Converts LFIndexer to Module interface with open/dump/fork + descriptor metadata. |
| include/neug/storages/workspace.h | Declares Workspace API for checkpoint management. |
| include/neug/storages/snapshot_meta.h | Declares SnapshotMeta for schema + module descriptors persisted as JSON. |
| include/neug/storages/module_descriptor.h | Declares hierarchical ModuleDescriptor with JSON and submodules. |
| include/neug/storages/module/type_name.h | Adds template type-name mapping for module registration keys. |
| include/neug/storages/module/module_factory.h | Declares ModuleFactory and registration macros. |
| include/neug/storages/module/module.h | Declares unified Module lifecycle interface. |
| include/neug/storages/loader/abstract_property_graph_loader.h | Refactors loader to create a workspace/checkpoint and seed meta schema. |
| include/neug/storages/graph/vertex_timestamp.h | Converts VertexTimestamp to Module interface. |
| include/neug/storages/graph/vertex_table.h | Refactors VertexTable open/dump APIs and indexer ownership. |
| include/neug/storages/graph/schema.h | Adds schema JSON conversion APIs. |
| include/neug/storages/graph/property_graph.h | Refactors graph open/dump to checkpoint-based model and exposes checkpoint accessor. |
| include/neug/storages/graph/edge_table.h | Refactors EdgeTable open/dump and property mutation APIs to include checkpoint. |
| include/neug/storages/file_names.h | Removes temp_checkpoint_dir helper (obsolete under new checkpoint model). |
| include/neug/storages/csr/mutable_csr.h | Updates CSR base to Module interface, adds fork/type naming, and shared nbr-list semantics. |
| include/neug/storages/csr/immutable_csr.h | Updates immutable CSR to Module interface and adds type naming. |
| include/neug/storages/csr/csr_base.h | Makes CsrBase a Module and introduces fork_vertex requirement. |
| include/neug/storages/container/mmap_container.h | Adds Fork API to MMapContainer. |
| include/neug/storages/container/i_container.h | Adds IDataContainer::Fork API. |
| include/neug/storages/checkpoint.h | Declares Checkpoint with runtime/snapshot/wal dirs and commit utilities. |
| include/neug/server/session_pool.h | Derives WAL writer URI from graph.checkpoint().wal_dir(). |
| include/neug/main/neug_db.h | Stores a Workspace and exposes work_dir() via ws_.db_dir(). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
00ce728 to
e47f9d1
Compare
|
TODO(@zhanglei1949 ) Avoid full dump if the file is not changes(not dirty) |
Implemented. TODO: validate on sf300 |
There was a problem hiding this comment.
Pull request overview
This PR refactors on-disk workspace management by introducing explicit Workspace, Checkpoint, and ModuleDescriptor/Module abstractions, replacing the previous ad-hoc file_names.h-driven layout (Fix #133).
Changes:
- Introduce
Workspace/Checkpointdirectory model plusSnapshotMetaJSON for persisted module inventory. - Convert core storage components (graph, tables, CSRs, columns, indexer) to
Module::Open/Dump/Close/ForkusingModuleDescriptor. - Update tests/benchmarks and build wiring to use the new checkpointed workspace layout; remove
file_names.husage across the codebase.
Reviewed changes
Copilot reviewed 72 out of 72 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Refactor table tests to use Workspace + checkpoint/module descriptors. |
| tests/unittest/utils.h | Add common stdlib includes for updated unittest utilities. |
| tests/unittest/test_column.cc | New typed tests for column fork/dump/open across memory levels. |
| tests/unittest/logical_delete_test.cc | Open graph via Workspace checkpoint instead of raw directory. |
| tests/unittest/CMakeLists.txt | Add test_column target. |
| tests/storage/vertex_table_benchmark.cc | Update benchmark to use Checkpoint/ModuleDescriptor open path. |
| tests/storage/test_set_property.cc | Remove file_names.h dependency; include unittest utils. |
| tests/storage/test_property_graph.cc | Open graph via Workspace checkpoint. |
| tests/storage/test_open_graph.cc | Remove file_names.h; adjust logging and test utilities usage. |
| tests/storage/test_mutable_csr.cc | Convert CSR tests to checkpoint/module descriptor APIs; add fork matrix tests. |
| tests/storage/test_immutable_csr.cc | Convert immutable CSR tests to checkpoint/module descriptor APIs. |
| tests/storage/test_export.cc | Remove file_names.h; include unittest utils. |
| tests/storage/test_ddl.cc | Remove file_names.h; include unittest utils. |
| tests/storage/test_csr_stream_ops.cc | Open CSRs via checkpoint/module descriptors; remove manual dir management. |
| tests/storage/test_csr_batch_ops.cc | Open/dump/open CSR flows via checkpoint/module descriptors. |
| tests/storage/test_checkpoint.cc | Add checkpoint optimization tests (hardlink fast path). |
| tests/storage/alter_property_test.cc | Open graph via checkpoint created from workspace. |
| tests/execution/test_value.cc | Minor formatting change to EXPECT_DEATH. |
| src/utils/uuid.cc | Add UUID generator implementation. |
| src/utils/property/table.cc | Convert Table open/dump/add-columns to module-descriptor-based APIs. |
| src/utils/property/column.cc | Register TypedColumn<T> module instantiations. |
| src/utils/id_indexer.cc | Register LFIndexer module instantiation. |
| src/transaction/wal/wal.cc | Remove file_names.h dependency. |
| src/transaction/update_transaction.cc | Remove file_names.h dependency. |
| src/storages/module/module_factory.cc | Add module factory registry implementation. |
| src/storages/module/CMakeLists.txt | Add storages module object library build. |
| src/storages/loader/abstract_property_graph_loader.cc | Use workspace db_dir() when reporting disk remaining. |
| src/storages/graph/workspace.cc | Add Workspace implementation for checkpoint discovery/creation. |
| src/storages/graph/vertex_timestamp.cc | Make VertexTimestamp a Module with descriptor-based open/dump/fork. |
| src/storages/graph/vertex_table.cc | Convert vertex table open/dump/close to module-descriptor APIs. |
| src/storages/graph/snapshot_meta.cc | Add JSON-backed SnapshotMeta for schema + module descriptors. |
| src/storages/graph/schema.cc | Add schema JSON <-> YAML conversion helpers for checkpoint meta. |
| src/storages/graph/property_graph.cc | Open/dump graph via checkpoint meta + module descriptors; remove schema file IO. |
| src/storages/graph/module_descriptor.cc | Implement ModuleDescriptor JSON serialization + sub-module tree. |
| src/storages/graph/edge_table.cc | Convert edge table open/dump/add/delete props to module-descriptor APIs. |
| src/storages/graph/checkpoint.cc | Add checkpoint implementation (runtime/snapshot dirs, meta update, cleanup, hardlinks). |
| src/storages/csr/mutable_csr.cc | Convert mutable CSRs to module-descriptor open/dump/close/fork model. |
| src/storages/csr/immutable_csr.cc | Convert immutable CSRs to module-descriptor open/dump/close model; add registration. |
| src/storages/container/mmap_container.cc | Improve error message; add IDataContainer::Fork implementation. |
| src/storages/container/container_utils.cc | Improve error logging for missing tmp path in disk-backed containers. |
| src/storages/CMakeLists.txt | Add module subdirectory and include in storages object files. |
| src/server/neug_db_service.cc | Session pool now derives WAL path from graph checkpoint instead of wal_dir(...). |
| src/main/neug_db.cc | Open DB via workspace checkpoints; checkpoint creation now creates new checkpoint dir + dumps into it. |
| include/neug/utils/uuid.h | Add UUID generator header. |
| include/neug/utils/property/table.h | Update table API to checkpoint/module descriptor pattern. |
| include/neug/utils/property/column.h | Make columns Modules with descriptor-based open/dump/fork. |
| include/neug/utils/id_indexer.h | Make LFIndexer a Module with descriptor-based open/dump/fork. |
| include/neug/storages/workspace.h | Add workspace API for managing checkpoint set. |
| include/neug/storages/snapshot_meta.h | Add snapshot meta interface (schema + module map). |
| include/neug/storages/module_descriptor.h | Add module descriptor structure with JSON/sub-module support. |
| include/neug/storages/module/type_name.h | Add storage-type-to-name mapping for template module registration. |
| include/neug/storages/module/module_factory.h | Add module factory + registration macros. |
| include/neug/storages/module/module.h | Define Module interface for open/dump/close/fork. |
| include/neug/storages/loader/abstract_property_graph_loader.h | Loader now creates/uses workspace + checkpoint meta schema. |
| include/neug/storages/graph/vertex_timestamp.h | Convert vertex timestamp tracker to Module interface. |
| include/neug/storages/graph/vertex_table.h | Refactor vertex table to depend on checkpoint/module descriptors and module indexer. |
| include/neug/storages/graph/schema.h | Add schema JSON serialization hooks. |
| include/neug/storages/graph/property_graph.h | Change graph open/dump API to checkpoint-based; expose checkpoint for WAL writers. |
| include/neug/storages/graph/edge_table.h | Update edge table interface to checkpoint/module descriptor APIs. |
| include/neug/storages/file_names.h | Remove legacy path helper header. |
| include/neug/storages/csr/mutable_csr.h | Update mutable CSR interfaces to Module open/dump/close/fork and fork-vertex API. |
| include/neug/storages/csr/immutable_csr.h | Update immutable CSR interfaces to Module open/dump/close. |
| include/neug/storages/csr/csr_base.h | Make CSR base class a Module and add fork_vertex contract. |
| include/neug/storages/container/mmap_container.h | Add container fork API surface. |
| include/neug/storages/container/i_container.h | Add IDataContainer::Fork interface. |
| include/neug/storages/checkpoint.h | Add checkpoint API (open/commit/runtime objects/meta). |
| include/neug/server/session_pool.h | WAL writers now use graph.checkpoint().wal_dir() instead of injected wal uri. |
| include/neug/main/neug_db.h | Expose db work dir via workspace; allocator init now takes allocator dir. |
| .github/workflows/neug-test.yml | Adjust ccache key hashing inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Refactors the database directory layout and persistence layer by introducing Workspace, Checkpoint, and ModuleDescriptor/SnapshotMeta, replacing the previous ad-hoc file_names.h-based path conventions (Fix #133).
Changes:
- Introduce checkpoint-scoped workspace management (
Workspace,Checkpoint) and JSON-based metadata (SnapshotMeta) to track module snapshots. - Convert core storage components (CSR, columns/indexers, vertex/edge tables, property graph) to a
Module-styleOpen/Dump/Close/Forklifecycle driven byModuleDescriptor. - Update and expand tests/benchmarks to open/dump via checkpoints and to cover fork + hardlink reuse behavior.
Reviewed changes
Copilot reviewed 72 out of 72 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Moves table tests to checkpoint/workspace-based open/dump. |
| tests/unittest/utils.h | Adds includes used by new test helpers. |
| tests/unittest/test_column.cc | Adds new typed-column fork/dump/open matrix tests. |
| tests/unittest/logical_delete_test.cc | Opens graph via workspace checkpoint. |
| tests/unittest/CMakeLists.txt | Registers new test_column target. |
| tests/storage/vertex_table_benchmark.cc | Updates benchmark to open vertex table via checkpoint + descriptor. |
| tests/storage/test_set_property.cc | Switches test utilities away from file_names.h. |
| tests/storage/test_property_graph.cc | Opens property graph via workspace checkpoint. |
| tests/storage/test_open_graph.cc | Updates test includes/utilities; minor logging change. |
| tests/storage/test_mutable_csr.cc | Updates CSR tests to checkpoint/module descriptor API; adds fork tests. |
| tests/storage/test_immutable_csr.cc | Updates immutable CSR tests to checkpoint/module descriptor API. |
| tests/storage/test_export.cc | Switches test utilities away from file_names.h. |
| tests/storage/test_ddl.cc | Switches test utilities away from file_names.h. |
| tests/storage/test_csr_stream_ops.cc | Updates streaming CSR tests to checkpoint/module API. |
| tests/storage/test_csr_batch_ops.cc | Updates batch CSR tests to checkpoint/module API. |
| tests/storage/test_checkpoint.cc | Adds hardlink optimization tests for unchanged dumps; removes file_names.h. |
| tests/storage/alter_property_test.cc | Opens graph via workspace checkpoint in alter-property flow. |
| tests/execution/test_value.cc | Formatting-only change in EXPECT_DEATH. |
| src/utils/uuid.cc | Adds UUID generator implementation. |
| src/utils/property/table.cc | Refactors Table to dump/open via Checkpoint + ModuleDescriptor. |
| src/utils/property/column.cc | Registers TypedColumn<T> module types. |
| src/utils/id_indexer.cc | Registers indexer template module types. |
| src/transaction/wal/wal.cc | Removes dependency on removed file_names.h. |
| src/transaction/update_transaction.cc | Removes dependency on removed file_names.h. |
| src/storages/module/module_factory.cc | Implements module factory registry. |
| src/storages/module/CMakeLists.txt | Adds storages module object library. |
| src/storages/loader/abstract_property_graph_loader.cc | Uses workspace db dir for disk remaining reporting. |
| src/storages/graph/workspace.cc | Implements workspace checkpoint enumeration + creation. |
| src/storages/graph/vertex_timestamp.cc | Converts vertex timestamp tracker into a Module. |
| src/storages/graph/vertex_table.cc | Converts vertex table to descriptor-driven open/dump. |
| src/storages/graph/snapshot_meta.cc | Implements JSON meta persistence for checkpoint inventories. |
| src/storages/graph/schema.cc | Adds Schema YAML↔JSON conversion for meta persistence. |
| src/storages/graph/property_graph.cc | Refactors graph open/dump to use Checkpoint + SnapshotMeta. |
| src/storages/graph/module_descriptor.cc | Implements descriptor JSON serialization + nested submodules. |
| src/storages/graph/edge_table.cc | Refactors edge table open/dump + property rebuild paths for checkpoint API. |
| src/storages/graph/checkpoint.cc | Implements checkpoint dirs, commit/hardlink/copy semantics, cleanup. |
| src/storages/csr/mutable_csr.cc | Converts mutable CSR to module lifecycle + descriptor dump/open. |
| src/storages/csr/immutable_csr.cc | Converts immutable CSR to module lifecycle + descriptor dump/open. |
| src/storages/container/mmap_container.cc | Adds container Fork() and improves resize error diagnostics. |
| src/storages/container/container_utils.cc | Adds explicit error logging for missing tmp path in disk-backed mode. |
| src/storages/CMakeLists.txt | Adds module/ subdirectory to storages build. |
| src/server/neug_db_service.cc | Updates session pool init now that WAL dir comes from checkpoint. |
| src/main/neug_db.cc | Refactors DB open/checkpoint to use workspace/head checkpoint and checkpoint-scoped dirs. |
| include/neug/utils/uuid.h | Declares UUID generator. |
| include/neug/utils/property/table.h | Updates Table API to checkpoint/module descriptor model. |
| include/neug/utils/property/column.h | Converts columns into Module implementations with dump/open/fork. |
| include/neug/utils/id_indexer.h | Converts LFIndexer into a Module with descriptor-based persistence/fork. |
| include/neug/storages/workspace.h | Declares Workspace API. |
| include/neug/storages/snapshot_meta.h | Declares checkpoint meta format and key scheme. |
| include/neug/storages/module_descriptor.h | Declares module descriptor structure and JSON support. |
| include/neug/storages/module/type_name.h | Adds type-name trait for module registration keys. |
| include/neug/storages/module/module_factory.h | Declares module factory + registration macros. |
| include/neug/storages/module/module.h | Introduces the core module lifecycle interface. |
| include/neug/storages/loader/abstract_property_graph_loader.h | Constructs loader graph through workspace/checkpoint + meta schema. |
| include/neug/storages/graph/vertex_timestamp.h | Updates vertex timestamp to implement Module. |
| include/neug/storages/graph/vertex_table.h | Updates vertex table API and internals for module-based persistence. |
| include/neug/storages/graph/schema.h | Declares Schema JSON conversion API. |
| include/neug/storages/graph/property_graph.h | Changes property graph open/dump API to checkpoint-based. |
| include/neug/storages/graph/edge_table.h | Changes edge table open/dump and property ops to checkpoint-based. |
| include/neug/storages/file_names.h | Removes legacy ad-hoc directory naming helpers. |
| include/neug/storages/csr/mutable_csr.h | Updates CSR interfaces to module lifecycle + fork support. |
| include/neug/storages/csr/immutable_csr.h | Updates immutable CSR interfaces to module lifecycle. |
| include/neug/storages/csr/csr_base.h | Makes CSR base a Module and adds fork_vertex hook. |
| include/neug/storages/container/mmap_container.h | Declares container Fork(). |
| include/neug/storages/container/i_container.h | Adds container Fork() to the interface. |
| include/neug/storages/checkpoint.h | Declares checkpoint API including commit semantics. |
| include/neug/server/session_pool.h | Constructs WAL writers from graph checkpoint WAL dir. |
| include/neug/main/neug_db.h | Uses workspace db dir as work dir and updates allocator init signature. |
| .github/workflows/neug-test.yml | Adjusts ccache key inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1059b20 to
ba65938
Compare
There was a problem hiding this comment.
Pull request overview
Refactors database directory/file management by introducing a structured Workspace → Checkpoint → ModuleDescriptor architecture, replacing the previous ad-hoc file_names.h-based layout (Fix #133).
Changes:
- Introduce
Workspace,Checkpoint,SnapshotMeta(JSON),ModuleDescriptor, andModuleFactoryto standardize persistence and restore of storage modules. - Refactor graph/table/CSR/indexer components (and many tests) to open/dump through
Checkpoint+ModuleDescriptorinstead of direct path conventions. - Add/extend tests for module fork/dump/open behavior and checkpoint hardlink optimization.
Reviewed changes
Copilot reviewed 72 out of 72 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Migrates table tests to Workspace/Checkpoint-based open/dump paths. |
| tests/unittest/utils.h | Expands test utilities includes (support for new filesystem usage). |
| tests/unittest/test_column.cc | New typed-column fork/dump/open matrix tests across memory levels. |
| tests/unittest/logical_delete_test.cc | Updates graph opening to go through Workspace checkpoint. |
| tests/unittest/CMakeLists.txt | Adds new test_column target. |
| tests/storage/vertex_table_benchmark.cc | Updates benchmark to open VertexTable via Checkpoint/ModuleDescriptor. |
| tests/storage/test_vertex_table.cc | Migrates vertex table tests to Workspace/Checkpoint descriptors. |
| tests/storage/test_set_property.cc | Removes legacy file_names dependency (uses unittest utils). |
| tests/storage/test_property_graph.cc | Migrates PropertyGraph test open path to Workspace checkpoint. |
| tests/storage/test_open_graph.cc | Removes file_names dependency; minor logging fix. |
| tests/storage/test_mutable_csr.cc | Adds CSR fork tests + migrates dump/open APIs to descriptors. |
| tests/storage/test_immutable_csr.cc | Migrates immutable CSR tests to descriptor-based open/dump/close. |
| tests/storage/test_export.cc | Removes legacy file_names dependency (uses unittest utils). |
| tests/storage/test_ddl.cc | Removes legacy file_names dependency (uses unittest utils). |
| tests/storage/test_csr_stream_ops.cc | Migrates stream CSR tests to Workspace/Checkpoint module open. |
| tests/storage/test_csr_batch_ops.cc | Migrates batch CSR tests to descriptor-based open/dump/open. |
| tests/storage/test_checkpoint.cc | Adds new checkpoint hardlink optimization tests. |
| tests/storage/alter_property_test.cc | Updates helper to open graph via Checkpoint instead of directory paths. |
| tests/execution/test_value.cc | Minor formatting change to death test. |
| src/utils/uuid.cc | Adds UUIDGenerator implementation used for runtime/snapshot object naming. |
| src/utils/property/table.cc | Refactors Table open/dump/add-columns to use Checkpoint + ModuleDescriptor. |
| src/utils/property/column.cc | Registers TypedColumn module types with ModuleFactory. |
| src/utils/id_indexer.cc | Registers LFIndexer module instantiation with ModuleFactory. |
| src/transaction/wal/wal.cc | Removes legacy file_names include usage. |
| src/transaction/update_transaction.cc | Removes legacy file_names include usage. |
| src/storages/module/module_factory.cc | Implements ModuleFactory registry. |
| src/storages/module/CMakeLists.txt | Adds module sub-library build definition. |
| src/storages/loader/abstract_property_graph_loader.cc | Uses Workspace/db_dir from new architecture for disk remaining logs. |
| src/storages/graph/workspace.cc | Implements Workspace directory manager for checkpoints. |
| src/storages/graph/vertex_timestamp.cc | Converts VertexTimestamp to Module (Open/Dump/Fork/Close) + registers it. |
| src/storages/graph/vertex_table.cc | Migrates VertexTable open/dump/close to module descriptors. |
| src/storages/graph/snapshot_meta.cc | Implements SnapshotMeta JSON read/write and schema/module inventory. |
| src/storages/graph/schema.cc | Adds Schema JSON (de)serialization via YAML↔JSON conversion helpers. |
| src/storages/graph/property_graph.cc | Refactors PropertyGraph open/dump to use SnapshotMeta + ModuleDescriptors. |
| src/storages/graph/module_descriptor.cc | Implements ModuleDescriptor JSON serialization + sub-module tree. |
| src/storages/graph/edge_table.cc | Refactors EdgeTable open/dump/add/delete-props to descriptor-based flow. |
| src/storages/graph/checkpoint.cc | Implements Checkpoint lifecycle, meta update, commit/link/copy semantics. |
| src/storages/csr/immutable_csr.cc | Refactors immutable CSR open/dump/close to Checkpoint+ModuleDescriptor. |
| src/storages/container/mmap_container.cc | Adds container Fork + improves mmap resize error detail. |
| src/storages/container/container_utils.cc | Improves error logging when tmp path is missing for disk-backed containers. |
| src/storages/CMakeLists.txt | Adds storages/module subdirectory to build objects. |
| src/server/neug_db_service.cc | Adjusts SessionPool init (WAL path derived from graph checkpoint). |
| src/main/neug_db.cc | Refactors DB open/checkpoint creation to use Workspace/Checkpoint and meta. |
| include/neug/utils/uuid.h | Adds UUIDGenerator API. |
| include/neug/utils/property/table.h | Updates Table API to Open/Dump via Checkpoint + ModuleDescriptor. |
| include/neug/utils/property/column.h | Converts ColumnBase to Module; adds descriptor-based Open/Dump/Fork. |
| include/neug/utils/id_indexer.h | Converts LFIndexer to Module; adds Open/Dump/Fork using descriptors. |
| include/neug/storages/workspace.h | Adds Workspace API for managing checkpoint directories. |
| include/neug/storages/snapshot_meta.h | Adds SnapshotMeta API (schema + module descriptor map). |
| include/neug/storages/module_descriptor.h | Adds ModuleDescriptor (path/type/size/extra + sub-modules) definition. |
| include/neug/storages/module/type_name.h | Adds type→string mapping helper for template module registration keys. |
| include/neug/storages/module/module_factory.h | Adds ModuleFactory + registration macros. |
| include/neug/storages/module/module.h | Introduces Module interface (Open/Dump/Close/Fork). |
| include/neug/storages/loader/abstract_property_graph_loader.h | Loader now initializes an empty Workspace + initial checkpoint meta schema. |
| include/neug/storages/graph/vertex_timestamp.h | Makes VertexTimestamp a Module and declares new lifecycle methods. |
| include/neug/storages/graph/vertex_table.h | Refactors VertexTable internals to use module-based indexer + dump descriptors. |
| include/neug/storages/graph/schema.h | Declares Schema ToJson/FromJson API. |
| include/neug/storages/graph/property_graph.h | Refactors PropertyGraph API to open/dump via Checkpoint and expose checkpoint(). |
| include/neug/storages/graph/edge_table.h | Refactors EdgeTable API to open/dump via Checkpoint + descriptor. |
| include/neug/storages/file_names.h | Removes legacy file/directory naming helpers (deleted). |
| include/neug/storages/csr/mutable_csr.h | Converts mutable CSRs to Module lifecycle + adds Fork/fork_vertex plumbing. |
| include/neug/storages/csr/immutable_csr.h | Converts immutable CSRs to Module lifecycle (descriptor-based open/dump). |
| include/neug/storages/csr/csr_base.h | Makes CsrBase inherit Module; adds fork_vertex hook. |
| include/neug/storages/container/mmap_container.h | Adds GetHeader and Fork API for mmap containers. |
| include/neug/storages/container/i_container.h | Adds IDataContainer::Fork interface. |
| include/neug/storages/checkpoint.h | Adds Checkpoint API for file/container open + commit + meta management. |
| include/neug/server/session_pool.h | WAL writer path now derived from graph.checkpoint().wal_dir(). |
| include/neug/main/neug_db.h | NeugDB now exposes work_dir via Workspace db_dir and updates allocator init signature. |
| .github/workflows/neug-test.yml | Adjusts ccache key inputs for CI caching. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ed23154 to
15d924f
Compare
refine module_store and module_desc refactor snapshot_meta/checkpoint, remove unncessary changes remove module store minor ref ref wal apply refine refine checkpoint manage fix ci minor refine fix minor refine refine module register macro fix format rm file_names.h minor refine minor refine supporing fast dump remove some cmments TBD: add dirty flag for MutableCSR, to avoid redumping the data minor fix remove method generate_uuid fix test revert is_dirty_ flag and compute md5 for mutable_csr's nbr_list before dump fix test check whether data_buffer is modified at column level refine remove unused field last_descriptor
15d924f to
99e3b74
Compare
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors on-disk database state management around Workspace / Checkpoint and a Module + ModuleDescriptor persistence API, aiming to speed up checkpoint/open/close flows (including hardlink-based snapshot reuse) and update tests accordingly.
Changes:
- Introduce
Workspace,Checkpoint,SnapshotMeta,ModuleDescriptor, andModuleFactoryto standardize module open/dump/fork/close and persist module inventories as JSON. - Migrate core storage components (graph, vertex/edge tables, CSR, columns/indexer, vertex timestamp) and many tests from path-based open/dump to descriptor-based checkpoint operations.
- Add new unit tests/benchmarks for fork isolation and checkpoint hardlink optimization; update CI cache key and gitignore.
Reviewed changes
Copilot reviewed 72 out of 73 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils/test_table.cc | Updates table tests to use Workspace/Checkpoint + ModuleDescriptor open/dump. |
| tests/unittest/utils.h | Adds standard headers used by new test utilities. |
| tests/unittest/test_column.cc | Adds typed column fork/dump/open matrix tests. |
| tests/unittest/logical_delete_test.cc | Switches graph open to checkpoint-based API. |
| tests/unittest/CMakeLists.txt | Adds new test_column target. |
| tests/storage/vertex_table_benchmark.cc | Updates benchmark to checkpoint + descriptor-based VertexTable::Open. |
| tests/storage/test_vertex_table.cc | Migrates vertex table tests to checkpoint-based open/dump/reopen. |
| tests/storage/test_set_property.cc | Replaces old file path utilities include with unittest utils. |
| tests/storage/test_property_graph.cc | Migrates property graph tests to Workspace/Checkpoint open. |
| tests/storage/test_open_graph.cc | Removes legacy file-names include; minor logging fix. |
| tests/storage/test_mutable_csr.cc | Adds CSR fork tests; migrates open/dump/close to checkpoint-based API. |
| tests/storage/test_immutable_csr.cc | Migrates immutable CSR tests to checkpoint-based API and module dump/open. |
| tests/storage/test_export.cc | Removes legacy file-names include; uses unittest utils. |
| tests/storage/test_ddl.cc | Removes legacy file-names include; uses unittest utils. |
| tests/storage/test_csr_stream_ops.cc | Migrates CSR stream tests to checkpoint-based API. |
| tests/storage/test_csr_batch_ops.cc | Migrates CSR batch tests to checkpoint-based open/dump. |
| tests/storage/test_checkpoint.cc | Adds hardlink optimization tests; removes legacy file-names include. |
| tests/storage/alter_property_test.cc | Opens graph via checkpoint rather than directory. |
| tests/execution/test_value.cc | Formatting tweak for EXPECT_DEATH. |
| src/utils/uuid.cc | Adds UUID generator implementation used by checkpoint runtime objects. |
| src/utils/property/table.cc | Refactors Table to ModuleDescriptor dump/open and checkpoint-based column management. |
| src/utils/property/column.cc | Registers TypedColumn<T> module types with ModuleFactory. |
| src/utils/id_indexer.cc | Registers LFIndexer module type with ModuleFactory. |
| src/transaction/wal/wal.cc | Removes dependency on legacy file naming utilities. |
| src/transaction/update_transaction.cc | Removes dependency on legacy file naming utilities. |
| src/storages/module/module_factory.cc | Implements ModuleFactory registry. |
| src/storages/module/CMakeLists.txt | Adds new storages module object library build. |
| src/storages/loader/abstract_property_graph_loader.cc | Uses Workspace db dir in disk remaining logging. |
| src/storages/graph/workspace.cc | Implements Workspace checkpoint discovery and creation. |
| src/storages/graph/vertex_timestamp.cc | Converts VertexTimestamp to Module supporting dump/open/fork via checkpoint. |
| src/storages/graph/vertex_table.cc | Converts VertexTable open/dump to module descriptors + checkpoint. |
| src/storages/graph/snapshot_meta.cc | Implements JSON meta read/write for schema + module inventory. |
| src/storages/graph/schema.cc | Adds YAML<->JSON conversion and schema JSON serialization hooks. |
| src/storages/graph/property_graph.cc | Refactors graph open/dump to checkpoint + SnapshotMeta inventory. |
| src/storages/graph/module_descriptor.cc | Implements ModuleDescriptor JSON serialization and sub-module handling. |
| src/storages/graph/edge_table.cc | Refactors edge tables to checkpoint/descriptors; updates property-alter rebuild paths. |
| src/storages/graph/checkpoint.cc | Implements checkpoint runtime/snapshot dirs, commit/link semantics, and cleanup. |
| src/storages/csr/immutable_csr.cc | Refactors immutable CSR to module descriptors + checkpoint commit. |
| src/storages/container/mmap_container.cc | Improves resize error; adds container Fork support. |
| src/storages/container/container_utils.cc | Improves error logging for missing tmp path on disk-backed containers. |
| src/storages/CMakeLists.txt | Adds module/ subdirectory and links module object library. |
| src/server/neug_db_service.cc | Updates SessionPool init signature (wal dir now derived elsewhere). |
| src/main/neug_db.cc | Refactors DB open/checkpoint to use Workspace/Checkpoint and checkpoint-local wal/allocator dirs. |
| include/neug/utils/uuid.h | Declares UUID generator. |
| include/neug/utils/property/table.h | Updates table API for checkpoint-based open/dump/add_columns. |
| include/neug/utils/property/column.h | Makes columns implement Module; adds dump/open/fork; string column hardlink fast path. |
| include/neug/utils/id_indexer.h | Makes LFIndexer implement Module with descriptor dump/open/fork. |
| include/neug/storages/workspace.h | Declares Workspace public API. |
| include/neug/storages/snapshot_meta.h | Declares SnapshotMeta JSON-persisted module inventory and schema. |
| include/neug/storages/module_descriptor.h | Declares ModuleDescriptor structure + JSON serialization/sub-modules. |
| include/neug/storages/module/type_name.h | Adds template type-name mapping used for module type strings. |
| include/neug/storages/module/module_factory.h | Declares ModuleFactory and registration macros. |
| include/neug/storages/module/module.h | Declares Module interface (Open/Dump/Close/Fork). |
| include/neug/storages/loader/abstract_property_graph_loader.h | Moves loader graph initialization to workspace/checkpoint + schema in meta. |
| include/neug/storages/graph/vertex_timestamp.h | Updates VertexTimestamp API to Module. |
| include/neug/storages/graph/vertex_table.h | Updates VertexTable open/dump/add-properties to checkpoint/descriptors. |
| include/neug/storages/graph/schema.h | Adds schema JSON conversion interface. |
| include/neug/storages/graph/property_graph.h | Updates graph open/dump to checkpoint-based API and exposes checkpoint. |
| include/neug/storages/graph/edge_table.h | Updates edge table open/dump/property ops to checkpoint-based API. |
| include/neug/storages/file_names.h | Removes legacy file naming helpers. |
| include/neug/storages/csr/mutable_csr.h | Refactors mutable CSR to Module API, adds fork semantics and fork_vertex. |
| include/neug/storages/csr/immutable_csr.h | Refactors immutable CSR to Module API. |
| include/neug/storages/csr/csr_base.h | Makes CsrBase a Module and adds fork_vertex requirement. |
| include/neug/storages/container/mmap_container.h | Exposes file header and container fork API. |
| include/neug/storages/container/i_container.h | Adds IDataContainer::Fork. |
| include/neug/storages/checkpoint.h | Declares checkpoint APIs for commit/link/open and meta management. |
| include/neug/server/session_pool.h | Removes explicit wal uri param; constructs wal writers from graph checkpoint. |
| include/neug/main/neug_db.h | Wires Workspace into DB state and allocator init signature change. |
| .gitignore | Adds .aone_copilot/ ignore entry. |
| .github/workflows/neug-test.yml | Adjusts ccache key hashing inputs. |
Comments suppressed due to low confidence (7)
include/neug/server/session_pool.h:1
- WAL writers are bound to
graph.checkpoint().wal_dir()atSessionPool::Init()time, but checkpoints can rotate during runtime (NeugDB::createCheckpoint()now creates a new checkpoint). Existing sessions would continue writing WAL to the old checkpoint directory, while recovery (openGraphAndIngestWals()) reads WAL from the latest checkpoint directory, risking missed WAL replay or split WAL history. Consider keeping WAL under a stable workspace-level directory (e.g.,<db_dir>/wal) or adding a mechanism to rebind/recreate per-session WAL writers on checkpoint rotation.
/** Copyright 2020 Alibaba Group Holding Limited.
src/storages/graph/property_graph.cc:1
- Deleting the checkpoint's WAL directory during
Dump()can break subsequent writes if WAL is expected to continue in that location (especially sinceSessionPoolcreates WAL writers targetinggraph.checkpoint().wal_dir()). If the intent is to truncate old WAL logs after a successful checkpoint, prefer deleting WAL files while keeping the directory, or recreating the directory immediately after cleanup. Also consider whether WAL should be workspace-scoped rather than checkpoint-scoped (to avoid coupling runtime logging to snapshot rotation).
src/storages/graph/workspace.cc:1 - Checkpoint directories are named without zero-padding (e.g.,
checkpoint-1,checkpoint-10). Any code that does lexicographic sorting on directory names (including the new tests) will mis-order checkpoints once ids reach 10+. Consider zero-padding to a fixed width (matching the header commentcheckpoint-00001) or ensuring any directory listing sorts by parsed numeric id rather than by string path.
tests/storage/test_checkpoint.cc:1 std::sort(dirs)sorts lexicographically by path, which becomes incorrect once checkpoint ids reach multiple digits (e.g.,checkpoint-10sorts beforecheckpoint-2). This can make the optimization tests flaky as the DB accumulates checkpoints. Prefer sorting by parsed numeric checkpoint id (e.g., extract the integer suffix and sort by that), or enforce zero-padded checkpoint directory names inWorkspace::CreateCheckpoint().
src/storages/container/mmap_container.cc:1errnois used here but this file doesn't show an explicit include for<cerrno>/<errno.h>. Depending on transitive includes, this can fail to compile. Add an explicit include (#include <cerrno>or#include <errno.h>) in this translation unit to makeerrnoavailable.
tests/storage/test_checkpoint.cc:1- The test uses a fixed
/tmp/...directory name. This can make CI flaky when tests run concurrently (or if a previous run crashed and left files with restrictive permissions). Prefer using a unique per-test directory (e.g.,std::filesystem::temp_directory_path() / (name + pid + timestamp)) similar to other updated fixtures in this PR.
src/utils/uuid.cc:1 <regex>is included but not used in this implementation unit. Removing it reduces compile time and avoids unnecessary dependencies.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Refactor the management of database files.
Fix #133
FIx #125
Fix #271
Fix #262
Open & Close LDBC SNB sf300 takes 13mins