Skip to content

perf(sorted_set): O(n+m) co-iteration for set operations#3324

Open
bobzhang wants to merge 1 commit intomainfrom
hongbo/sorted-set-co-iteration
Open

perf(sorted_set): O(n+m) co-iteration for set operations#3324
bobzhang wants to merge 1 commit intomainfrom
hongbo/sorted-set-co-iteration

Conversation

@bobzhang
Copy link
Copy Markdown
Contributor

@bobzhang bobzhang commented Mar 21, 2026

Summary

Replace O(n*log m) implementations of subset, disjoint, intersection, difference, and symmetric_difference with O(n+m) co-iteration over both sorted iterators.

Before

Each operation iterated self and called contains() on src for each element — O(log m) per lookup, O(n*log m) total:

pub fn subset(self, src) -> Bool {
  self.iter().all(x => src.contains(x))  // O(n * log m)
}
pub fn intersection(self, src) -> SortedSet {
  let ret = new()
  self.each(x => if src.contains(x) { ret.add(x) })  // O(n * log m)
  ret
}

symmetric_difference was even worse — it created two intermediate sets via difference() then called union() (with a TODO acknowledging this).

After

Co-iterate both sorted iterators in lockstep using compare(), advancing the smaller side — O(n+m) total, zero allocations on the input sets:

pub fn subset(self, src) -> Bool {
  let it1 = self.iter()
  let it2 = src.iter()
  // advance in lockstep, O(n+m)
}

Complexity comparison

Operation Before After
subset O(n·log m) O(n+m)
disjoint O(n·log m) O(n+m)
intersection O(n·log m) O(n+m)
difference O(n·log m) O(n+m)
symmetric_difference O(n·log m + m·log n + n+m) O(n+m)

Codex review

I did not find a correctness issue in the co-iteration logic. The iterator advancement rules for subset, disjoint, intersection, difference, and symmetric_difference match the expected merge-style behavior on two sorted unique streams, and moon test sorted_set passes after the change.

Test plan

  • All 48 sorted_set tests pass

🤖 Generated with Claude Code


Open with Devin

Replace O(n*log m) implementations of subset, disjoint, intersection,
difference, and symmetric_difference with O(n+m) co-iteration over
both sorted iterators.

Before: each element of self was looked up via contains() at O(log m)
each, giving O(n*log m) total.

After: both iterators advance in lockstep using compare(), giving
O(n+m) total with zero tree allocations on the input sets.

Also resolves the TODO in symmetric_difference to avoid creating
two intermediate sets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coveralls
Copy link
Copy Markdown
Collaborator

Pull Request Test Coverage Report for Build 3103

Details

  • 61 of 71 (85.92%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.05%) to 95.636%

Changes Missing Coverage Covered Lines Changed/Added Lines %
sorted_set/set.mbt 61 71 85.92%
Totals Coverage Status
Change from base Build 3102: -0.05%
Covered Lines: 13850
Relevant Lines: 14482

💛 - Coveralls

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants