Skip to content

fix: Maintain the edge count in the CSR#209

Open
liulx20 wants to merge 10 commits intoalibaba:mainfrom
liulx20:edge-num
Open

fix: Maintain the edge count in the CSR#209
liulx20 wants to merge 10 commits intoalibaba:mainfrom
liulx20:edge-num

Conversation

@liulx20
Copy link
Copy Markdown
Collaborator

@liulx20 liulx20 commented Apr 10, 2026

Fixes

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@qodo-code-review
Copy link
Copy Markdown

ⓘ You are approaching your monthly quota for Qodo. Upgrade your plan

Review Summary by Qodo

Maintain edge count with atomic counter for O(1) access

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Replace expensive edge counting logic with atomic counter
  - Removed O(n) iteration through all edges in edge_num() method
  - Replaced with O(1) atomic load from edge_num_ member
• Maintain edge count across all CSR operations
  - Add fetch_add/fetch_sub calls in batch operations and single edge modifications
  - Track edge count in batch_delete_vertices, batch_delete_edges, delete_edge,
  revert_delete_edge, batch_put_edges
• Persist edge count to metadata file
  - Update load_meta and dump_meta to read/write edge count
  - Replace C-style FILE I/O with C++ ifstream/ofstream
• Add validation checks for edge count consistency
  - Verify edge count matches degree list during open operations
  - Check consistency after compaction operations
Diagram
flowchart LR
  A["Edge Operations<br/>delete/put/revert"] -->|fetch_add/fetch_sub| B["Atomic Counter<br/>edge_num_"]
  B -->|O(1) load| C["edge_num() method"]
  D["load_meta()"] -->|read from file| B
  B -->|write to file| E["dump_meta()"]
  F["Validation"] -->|verify consistency| B
Loading

Grey Divider

File Changes

1. src/storages/csr/immutable_csr.cc 🐞 Bug fix +59/-8

Maintain atomic edge counter in immutable CSR

• Replace O(n) edge counting loop with O(1) atomic load in edge_num() method
• Add edge_num_.fetch_sub() calls in batch_delete_vertices, batch_delete_edges, and
 delete_edge methods
• Add edge_num_.fetch_add() call in revert_delete_edge and batch_put_edges methods
• Update load_meta() to read edge count from metadata file using ifstream
• Update dump_meta() to write edge count to metadata file using ofstream
• Add load_meta() calls in SingleImmutableCsr::open, open_in_memory, and open_with_hugepages
• Add dump_meta() call in SingleImmutableCsr::dump
• Implement load_meta() and dump_meta() for SingleImmutableCsr class

src/storages/csr/immutable_csr.cc


2. src/storages/csr/mutable_csr.cc 🐞 Bug fix +81/-7

Maintain atomic edge counter in mutable CSR

• Add edge count validation in open_internal() to verify consistency between metadata and degree
 list
• Add edge_num_.fetch_sub() calls in batch_delete_vertices, batch_delete_edges, and
 delete_edge methods
• Add edge_num_.fetch_add() calls in revert_delete_edge and batch_put_edges methods
• Add edge count consistency check in compact() method
• Update load_meta() to read edge count from metadata file using ifstream
• Update dump_meta() to write edge count to metadata file using ofstream
• Add load_meta() calls in SingleMutableCsr::open, open_in_memory, and open_with_hugepages
• Add dump_meta() call in SingleMutableCsr::dump
• Implement load_meta() and dump_meta() for SingleMutableCsr class
• Add edge count tracking in batch_put_edges with added_edge_num counter

src/storages/csr/mutable_csr.cc


3. include/neug/storages/csr/immutable_csr.h ✨ Enhancement +6/-36

Add atomic edge counter to immutable CSR headers

• Replace expensive edge_num() implementation with O(1) atomic load
• Add std::atomic<uint64_t> edge_num_{0} member variable to ImmutableCsr class
• Add std::atomic<uint64_t> edge_num_{0} member variable to SingleImmutableCsr class
• Add load_meta() and dump_meta() private method declarations to SingleImmutableCsr

include/neug/storages/csr/immutable_csr.h


View more (1)
4. include/neug/storages/csr/mutable_csr.h ✨ Enhancement +8/-27

Add atomic edge counter to mutable CSR headers

• Replace expensive edge_num() implementation with O(1) atomic load in MutableCsr
• Replace expensive edge_num() implementation with O(1) atomic load in SingleMutableCsr
• Add std::atomic<uint64_t> edge_num_{0} member variable to MutableCsr class
• Add std::atomic<uint64_t> edge_num_{0} member variable to SingleMutableCsr class
• Add load_meta() and dump_meta() private method declarations to SingleMutableCsr
• Add edge_num_.fetch_add(1) call in put_edge() method

include/neug/storages/csr/mutable_csr.h


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review bot commented Apr 10, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🖥 UI issues (0) 🎨 UX Issues (0)

Grey Divider


Action required

1. Missing atomic header🐞
Description
immutable_csr.h now uses std::atomic<uint64_t> edge_num_ but does not include <atomic>, which can
cause compilation failures depending on include order.
Code

include/neug/storages/csr/immutable_csr.h[122]

+  std::atomic<uint64_t> edge_num_{0};
Evidence
The header declares a std::atomic member but the header’s includes do not include <atomic>, so
compilation relies on non-guaranteed transitive includes.

include/neug/storages/csr/immutable_csr.h[15-22]
include/neug/storages/csr/immutable_csr.h[118-123]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`include/neug/storages/csr/immutable_csr.h` declares `std::atomic<uint64_t> edge_num_{0};` but does not include `<atomic>`, which can break builds.
### Issue Context
This PR introduced the atomic counter into the header.
### Fix Focus Areas
- include/neug/storages/csr/immutable_csr.h[15-27]
- include/neug/storages/csr/immutable_csr.h[118-123]
### Expected change
Add `#include <atomic>` near the other standard includes.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Missing fstream header🐞
Description
mutable_csr.cc and immutable_csr.cc use std::ifstream/std::ofstream in load_meta/dump_meta but do
not include <fstream>, which is not guaranteed to be available via other headers.
Code

src/storages/csr/immutable_csr.cc[R387-395]

+    std::ifstream meta_file(meta_file_path, std::ios::binary);
+    timestamp_t unsorted_since;
+    uint64_t edge_num;
+    meta_file.read(reinterpret_cast<char*>(&unsorted_since),
+                   sizeof(timestamp_t));
+    meta_file.read(reinterpret_cast<char*>(&edge_num), sizeof(uint64_t));
+    unsorted_since_ = unsorted_since;
+    edge_num_.store(edge_num);
+    meta_file.close();
Evidence
Both translation units use file stream types, but their include lists omit ``, risking compilation
failures in standard-compliant builds.

src/storages/csr/immutable_csr.cc[18-29]
src/storages/csr/immutable_csr.cc[383-396]
src/storages/csr/mutable_csr.cc[16-32]
src/storages/csr/mutable_csr.cc[571-598]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`src/storages/csr/immutable_csr.cc` and `src/storages/csr/mutable_csr.cc` use `std::ifstream` / `std::ofstream` but do not include `<fstream>`.
### Issue Context
This PR replaced C stdio meta handling with C++ streams.
### Fix Focus Areas
- src/storages/csr/immutable_csr.cc[18-35]
- src/storages/csr/mutable_csr.cc[16-42]
### Expected change
Add `#include <fstream>` to both .cc files (near other standard headers).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Meta reads not validated🐞
Description
load_meta reads edge_num from disk without checking stream open/read success, so
truncated/old-format *.meta files can set edge_num_ to garbage and break edge_num() correctness.
Code

src/storages/csr/immutable_csr.cc[R387-398]

+    std::ifstream meta_file(meta_file_path, std::ios::binary);
+    timestamp_t unsorted_since;
+    uint64_t edge_num;
+    meta_file.read(reinterpret_cast<char*>(&unsorted_since),
+                   sizeof(timestamp_t));
+    meta_file.read(reinterpret_cast<char*>(&edge_num), sizeof(uint64_t));
+    unsorted_since_ = unsorted_since;
+    edge_num_.store(edge_num);
+    meta_file.close();
} else {
unsorted_since_ = 0;
}
Evidence
The code reads binary fields into uninitialized locals and never checks meta_file.good()/fail()
or gcount(). If the meta file exists but is shorter than expected (e.g., pre-change format), the
second read may fail and edge_num remains indeterminate before being stored into edge_num_.

src/storages/csr/immutable_csr.cc[383-399]
src/storages/csr/mutable_csr.cc[571-587]
src/storages/csr/mutable_csr.cc[819-830]
src/storages/csr/immutable_csr.cc[590-601]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`load_meta()` uses `std::ifstream::read()` but does not validate that the file opened successfully or that the expected bytes were read. For truncated or old-format `.meta` files, `edge_num_` can be set incorrectly.
### Issue Context
This PR changed `.meta` layout to include edge count.
### Fix Focus Areas
- src/storages/csr/immutable_csr.cc[383-399]
- src/storages/csr/mutable_csr.cc[571-587]
- src/storages/csr/immutable_csr.cc[590-601]
- src/storages/csr/mutable_csr.cc[819-830]
### Expected change
- Check `meta_file.is_open()` and `meta_file.good()` after each read.
- If reads fail (or file too small), fall back to computing edge count from the loaded CSR data (or treat as legacy format and only read the fields that exist), then `edge_num_.store(computed)`.
- Consider throwing a clear exception on corruption instead of silently using garbage.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (3)
4. Open rejects deleted snapshots🐞
Description
MutableCsr::open_internal sums degrees to compute edge_count and throws when it differs from meta
edge_num, but degrees count slots (including deleted edges) while edge_num_ is decremented on
delete_edge without shrinking degrees.
Code

src/storages/csr/mutable_csr.cc[R91-108]

+  uint64_t edge_count = 0;
for (size_t i = 0; i < v_cap; ++i) {
int deg = degree_ptr[i];
+    edge_count += deg;
int cap = cap_ptr[i];
adj_lists_ptr[i] = nbr_list_ptr;
adj_list_size_ptr[i].store(deg);
adj_list_cap_ptr[i] = cap;
nbr_list_ptr += cap;
}
+  if (edge_num_.load() != edge_count) {
+    LOG(WARNING) << "Edge count from meta (" << edge_num_.load()
+                 << ") does not match count computed from degree list ("
+                 << edge_count << "). Using computed count.";
+    THROW_STORAGE_EXCEPTION(
+        "Edge count mismatch: meta has " + std::to_string(edge_num_.load()) +
+        " but degree list implies " + std::to_string(edge_count));
+  }
Evidence
open_internal() computes edge_count as the sum of degree entries and throws on mismatch.
However, delete_edge() marks edges deleted via timestamp and decrements edge_num_ without
decreasing adj_list_size_, and dump() persists degrees from adj_list_size_. After any
deletions (without compaction removing slots), a dumped snapshot will have degree-sum > live
edge_num, and reopening will always throw.

src/storages/csr/mutable_csr.cc[43-109]
src/storages/csr/mutable_csr.cc[137-156]
src/storages/csr/mutable_csr.cc[446-469]
src/storages/csr/mutable_csr.cc[571-587]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`MutableCsr::open_internal()` compares meta `edge_num_` to `sum(degree_list)` and throws. Degrees/`adj_list_size_` count allocated slots, not necessarily live edges, because deletions only set timestamps.
### Issue Context
This makes it impossible to reopen snapshots that contain deleted edges unless degrees were compacted to match live edges.
### Fix Focus Areas
- src/storages/csr/mutable_csr.cc[43-109]
- src/storages/csr/mutable_csr.cc[137-156]
- src/storages/csr/mutable_csr.cc[446-469]
### Expected change
- Either (A) compute the live edge count during open by scanning neighbor entries and checking `timestamp != max`, and set `edge_num_` to that, or
- (B) redefine persisted degrees to represent live edges and ensure all delete paths maintain degrees accordingly.
- If keeping a consistency check, ensure it compares like-with-like (live vs live) and either repairs or produces a clear error message (don’t log “Using computed count” and then throw).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Edge_num can underflow🐞
Description
Several delete paths decrement edge_num_ without verifying the edge was live (or is already
deleted), so repeated deletes or deleting already-deleted edges can wrap the uint64_t counter and
corrupt edge_num().
Code

src/storages/csr/mutable_csr.cc[R352-353]

+      edge_num_.fetch_sub(sz_arr[src].load(), std::memory_order_relaxed);
  sz_arr[src].store(0, std::memory_order_relaxed);
Evidence
batch_delete_vertices() subtracts sz_arr[src].load() even though sz_arr includes deleted slots
(delete_edge doesn’t shrink it), and batch delete APIs decrement edge_num_ without checking
current deletion marker (timestamp == max). Because edge_num_ is uint64_t, over-decrementing wraps
to a huge value.

src/storages/csr/mutable_csr.cc[345-383]
src/storages/csr/mutable_csr.cc[386-415]
src/storages/csr/mutable_csr.cc[417-444]
src/storages/csr/mutable_csr.cc[446-469]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new `edge_num_` counter is decremented in batch delete paths without checking whether an edge is currently live, and vertex-delete subtracts adjacency sizes that include already-deleted slots. This can over-decrement and underflow.
### Issue Context
In Mutable CSR, deletions typically set `timestamp = max` but do not reduce `adj_list_size_`.
### Fix Focus Areas
- src/storages/csr/mutable_csr.cc[345-383]
- src/storages/csr/mutable_csr.cc[386-415]
- src/storages/csr/mutable_csr.cc[417-444]
### Expected change
- In each delete path, check the prior state (e.g., `timestamp != max`) before decrementing.
- For `batch_delete_vertices`, subtract the number of live edges in the list (count timestamps != max) rather than `sz_arr[src]`.
- Ensure idempotent behavior (deleting an already-deleted edge should not change `edge_num_`).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Single put_edge misses increment🐞
Description
SingleMutableCsr::put_edge does not increment edge_num_, but edge_num() now returns
edge_num_.load(), so single-edge inserts leave the edge count incorrect.
Code

include/neug/storages/csr/mutable_csr.h[232]

+  size_t edge_num() const override { return edge_num_.load(); }
Evidence
SingleMutableCsr::edge_num() was changed to return the atomic counter, but
SingleMutableCsr::put_edge() never updates that counter. The unit test expects the edge count to
recover after deleting and re-putting an edge; without increment, it will remain low.

include/neug/storages/csr/mutable_csr.h[230-288]
tests/storage/test_mutable_csr.cc[561-584]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SingleMutableCsr::put_edge()` writes a new live edge (sets timestamp) but never increments `edge_num_`, even though `edge_num()` now returns `edge_num_.load()`.
### Issue Context
This PR switched `edge_num()` to rely on the maintained counter.
### Fix Focus Areas
- include/neug/storages/csr/mutable_csr.h[275-288]
- include/neug/storages/csr/mutable_csr.h[230-233]
### Expected change
After successfully inserting (i.e., transitioning from deleted to live), call `edge_num_.fetch_add(1, std::memory_order_relaxed)` (or stronger ordering if required). Also consider guarding against double-insert overwriting an existing live edge.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

It may be necessary to maintain the EdgeNum counter to improve the efficiency of statistics updates

1 participant