Skip to content

fix(qwp): empty arrays are distinct from NULL in the columnar codec#63

Merged
bluestreak01 merged 4 commits into
mainfrom
mt_cleanup-null-empty-array
Jun 17, 2026
Merged

fix(qwp): empty arrays are distinct from NULL in the columnar codec#63
bluestreak01 merged 4 commits into
mainfrom
mt_cleanup-null-empty-array

Conversation

@mtopolnik

@mtopolnik mtopolnik commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What

QuestDB stores a non-null empty array (cardinality 0) as a value distinct from a NULL array, and emits it on QWP egress inline with a 0-length dimension. The egress decoder rejected dl < 1, so reading an empty array back failed (ARRAY dim N must be >= 1); the 1D/2D/3D/ND array setters also collapsed a nil slice into a (malformed) empty array instead of NULL.

Changes

  • Decoder (parseArray): accept dl >= 0, keeping nDims >= 1 and a per-dimension cap so a 0 in one dimension can't slip an arbitrary value past the element-count cap. Matches the server's QwpArrayColumnCursor.
  • Array setters (Float64Array{1,2,3}D, Int64Array{1,2,3}D, Float64ArrayNDColumn): a nil slice is a NULL array (null bitmap); a non-nil empty slice is a distinct empty array (inline, 0 elements).
  • Tests: encode→decode round-trip for empty vs NULL and nil/empty setter semantics (unit), plus a real-server ingest→egress round-trip (TestQwpIntegrationEmptyAndNullArray). Flipped the stale H27b zero-dim hardening case.

Verification

go build, go vet, staticcheck, focused unit tests, fuzz seed corpora, and the real-server integration test all pass. End-to-end: the unfixed decoder rejects the server's empty-array frame with ARRAY dim 0 out of range [...]: 0; the fix round-trips it as a non-null 0-element array distinct from NULL.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced array handling to properly distinguish between NULL arrays and empty arrays (zero-element arrays).
    • Relaxed array dimension validation to allow zero-length dimensions.
  • Tests

    • Added decoder and sender tests for NULL vs empty array semantics.
    • Added integration test verifying end-to-end NULL vs empty array handling.
  • Documentation

    • Clarified nil slice vs empty slice behavior in array column methods.

QuestDB stores a non-null empty array (cardinality 0) as a value distinct
from a NULL array, and emits it on QWP egress inline with a 0-length
dimension. The egress decoder rejected dl < 1, so reading an empty array
back failed ("ARRAY dim N must be >= 1"); the 1D/2D/3D/ND array setters
also collapsed a nil slice into a (malformed) empty array instead of NULL.

- parseArray: accept dl >= 0, keeping nDims >= 1 and a per-dimension cap
  so a 0 in one dimension can't slip an arbitrary value past the
  element-count cap. Matches the server's QwpArrayColumnCursor.
- Array setters (Float64/Int64 1D/2D/3D + ND): a nil slice is a NULL
  array (null bitmap); a non-nil empty slice is a distinct empty array
  (inline, 0 elements).
- Tests: encode->decode round-trip for empty vs NULL and nil/empty setter
  semantics (unit), plus a real-server ingest->egress round-trip
  (TestQwpIntegrationEmptyAndNullArray). Flip the stale H27b zero-dim
  hardening case to assert the empty array now decodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@mtopolnik, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 54 minutes and 27 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bff3a206-6e4d-454c-b99d-8accb96dded4

📥 Commits

Reviewing files that changed from the base of the PR and between f96cf03 and 97284d3.

📒 Files selected for processing (3)
  • qwp_cursor_bounds_check_fuzz_test.go
  • qwp_query_decoder_test.go
  • qwp_query_integration_test.go
📝 Walkthrough

Walkthrough

Defines and enforces a nil-slice-is-NULL / empty-slice-is-empty-array distinction for all QWP typed array column methods (Float64Array*, Int64Array*). The decoder's parseArray is relaxed to accept zero-length dimensions. Documentation, sender unit tests, decoder round-trip tests, and an end-to-end integration test are added. A fuzz fixture gains a required JVM --add-exports flag.

Changes

QWP Array nil-vs-empty semantics

Layer / File(s) Summary
Interface and doc contracts for nil-vs-empty arrays
sender.go, qwp_sender.go
LineSender and QwpSender doc comments formally define nil slices as NULL arrays and non-nil empty slices as distinct non-null empty arrays.
Sender nil-branch implementation
qwp_sender.go
Adds explicit nil guards to all six typed array column methods (Float64Array{1D,2D,3D}Column, Int64Array{1D,2D,3D}Column) and Float64ArrayNDColumn, calling col.addNull() and returning early instead of falling through to the array-shape path.
Decoder: allow zero-length dimensions
qwp_query_decoder.go
Relaxes parseArray per-dimension validation from dl < 1 to dl < 0, permitting dimension value 0 as a valid non-null empty array; comments distinguish malformed inline-nDims-0 from a valid dimension of 0.
Sender unit test: nil vs empty encoding
qwp_sender_test.go
TestQwpSenderArrayNilIsNullEmptyIsEmpty inspects internal column buffers to assert nil produces nullCount=1 / zero arrayData while non-nil empty produces nullCount=0 / inline nDims+shape header bytes.
Decoder tests: round-trip and hardening
qwp_query_decoder_test.go
Adds TestQwpDecoderRoundTripEmptyVsNullArray with four row cases and updates the H27b_ArrayZeroDim hardening subtest to expect decoding success with a non-nil empty slice result.
Integration test: end-to-end empty vs NULL
qwp_query_integration_test.go
TestQwpIntegrationEmptyAndNullArray ingests regular, empty, and NULL float array rows via the line sender, queries them back, and validates nullability, dimensionality, and element values.
Fuzz fixture JVM arg fix
qwp_fuzz_fixture_test.go
Adds --add-exports=java.base/jdk.internal.vm=io.questdb to the fuzz server JVM launch arguments with explanatory comments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • questdb/go-questdb-client#60: Introduces the foundational QWP array buffering, encoding, and shape-validation machinery that this PR's nil-vs-empty changes build directly upon.
  • questdb/go-questdb-client#62: Introduces the QWP egress decoder (parseArray and related logic) that this PR modifies to accept zero-length dimensions.

Suggested reviewers

  • bluestreak01

Poem

🐇 A nil is not empty, an empty's not null,
The bunny insisted, and fixed it for all.
addNull() for nil slices, a header for empty,
The decoder now welcomes a zero-length plenty.
Hop hop, the arrays are distinct — what a ball! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main fix: distinguishing empty arrays from NULL in the QWP columnar codec.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mt_cleanup-null-empty-array

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

The qwp-fuzz CI job builds QuestDB master and launches the SNAPSHOT
jar as the named module io.questdb. Recent master adds
io.questdb.mp.continuation.WorkerContinuation, which reaches the
JDK-internal jdk.internal.vm.ContinuationScope. java.base does not
export jdk.internal.vm to io.questdb by default, so every worker
thread dies at class-init with IllegalAccessError /
NoClassDefFoundError and the HTTP service never comes up.

Pass --add-exports=java.base/jdk.internal.vm=io.questdb, the flag
QuestDB's own launcher (questdb.sh, docker-entrypoint.sh) and the
canonical c-questdb-client fixture.py both use. The fixture's launch
args were ported from fixture.py before WorkerContinuation existed,
so they predate this requirement.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mtopolnik

Copy link
Copy Markdown
Contributor Author

Review: PR #63

Reviewing at level 2 (Agents 1–8 + a fresh adversarial pass, batched source verification).

Verdict up front: the core fix is correct, safe, and well-reasoned. The decoder relaxation is sound, the encode/decode path is byte-exact for empty arrays, every one of the 12 array read-accessors is guarded against the newly-reachable non-null elems==0 state, and the setter nil→NULL mapping is consistent and creates the column on every path. No Critical or blocking issues. Findings are coverage and documentation.

Critical

None.

Moderate

M1 — LONG_ARRAY empty/NULL decode and the Float64ArrayNDColumn(nil) behavior change are untested. (in-diff, test files)
The decoder guard is shared by DOUBLE_ARRAY and LONG_ARRAY (dispatch at qwp_query_decoder.go:563), but only DOUBLE_ARRAY empty-array decode is exercised (TestQwpDecoderRoundTripEmptyVsNullArray, integration test). parseArray is genuinely type-agnostic (8 bytes/element either way), so the risk is low — but the "shared by both" claim is asserted for only one type. More importantly, Float64ArrayNDColumn(name, nil) is the most behaviorally significant setter change in the PR — it went from a silent no-op that never created the column (old return s before name validation) to create-column + explicit NULL — and no QWP test covers it. A regression back to no-op (or a panic in the new getOrCreateColumn) would go uncaught.
Fix: extend TestQwpSenderArrayNilIsNullEmptyIsEmpty with Float64ArrayNDColumn("ndnull", nil) (assert nullCount==1, rowCount==1) and add an Int64 empty+NULL case to TestQwpDecoderRoundTripEmptyVsNullArray via addLongArray(1, []int32{0}, nil) / addNull decoded through batch.Int64Array.

M2 — The exported array setters don't document the nil=NULL vs empty=empty-array contract. (out-of-diff, sender.go:153-209, qwp_sender.go:72-79)
This PR makes nil vs non-nil-empty a load-bearing, user-visible distinction, and silently changes QWP's nil semantics (previously nil→empty array; for ND, nil→nothing). The contract now lives only in implementation comments. The LineSender/QwpSender method docs say nothing about it.
Fix: one line per method, e.g. "A nil slice writes a NULL array; a non-nil empty slice writes a distinct, non-null empty array." Worth a release-note callout too — existing QWP callers passing nil and expecting an empty array now get NULL (this is the intended correction, but it is a wire-behavior change).

Minor

  • m1 — Now-incorrect comment introduced by the behavior change. (out-of-diff, qwp_cursor_bounds_check_fuzz_test.go:138-140) Reads "the Go decoder rejects a 0-length dim ('ARRAY dim 0 must be >= 1'), so a valid seed needs >= 1". The decoder no longer rejects 0-length dims and that error string is gone. The code (n := 1 + r.Intn(3)) is fine; the rationale is false and misleading. Update it.
  • m2 — Buffer test under-asserts the empty-array header. (in-diff, qwp_sender_test.go) TestQwpSenderArrayNilIsNullEmptyIsEmpty checks len(arrayData)==5 but not the contents; 5 garbage bytes or a nDims=0 header would still pass. Assert arrayData[0]==1 and the trailing int32 ==0. (The decoder round-trip test compensates at the decode layer, so this is belt-and-suspenders.)
  • m3 — QwpColumn handle accessors + *ArrayInto untested on a non-null empty array. (in-diff-adjacent) Float64ArrayInto/Int64ArrayInto correctly collapse empty and NULL to "return dst unchanged" (guarded by if elems==0), so the only way a *Into consumer distinguishes them is IsNull staying false on empty. Nothing pins that. A small test next to TestQwpColumnBatchEmptyArrayViaZeroShape would close it.
  • m4 — Stale helper comment. (in-diff, qwp_query_decoder_test.go:1918-1921) buildArrayHardeningFrame says it "rejects on the shape/nDims check before reading any element bytes" — still true for H27/H29 callers, now false for the flipped H27b. Reads oddly; tighten or scope the comment.
  • m5 — Changelog-style framing in test doc comments. qwp_query_decoder_test.go:894 and qwp_query_integration_test.go:127 are framed as "Before parseArray was relaxed from dl>=1 to dl>=0 … failed outright" — the contrast-with-old-state style your global convention discourages. They explain why the regression test exists, so they're borderline acceptable, but could be rephrased as a present-tense property ("an empty-array row decodes to a non-null 0-element value").
  • m6 — Unrelated change bundled. qwp_fuzz_fixture_test.go's --add-exports JVM flag (separate commit b3c8d69) is correct and is in fact a prerequisite for the new integration test to boot the server — but it's unrelated to the empty/NULL fix and unmentioned in the PR title/body. Scope nit only.

Summary

  • One-line verdict: Approve. Land it; the two test additions in M1 and the doc lines in M2 are worth doing as quick follow-ups (or in this PR), but nothing blocks merge.
  • Regressions/tradeoffs: none. The per-dimension cap is a strict defensive improvement over the old product-only check (it now bounds each shape entry even when a 0 zeroes the product). The only behavior change — QWP nil→NULL — is the intended fix and is correctly described in the PR body.

mtopolnik and others added 2 commits June 17, 2026 15:55
The empty-vs-NULL array fix landed with decode coverage only for
DOUBLE_ARRAY and with the nil/empty contract recorded only in
implementation comments. This fills both gaps.

- TestQwpDecoderRoundTripEmptyVsNullArray now carries a second
  LONG_ARRAY column alongside the existing DOUBLE_ARRAY one, decoding
  the same regular/empty-1D/NULL/empty-2D shapes via Int64Array. The
  double-array column precedes the long one in the frame, so the long
  column decoding correctly also pins that the empty and NULL rows of
  the first column consumed exactly their wire bytes (inter-column
  alignment), since parseArray reads columns sequentially with no
  length prefix.

- TestQwpSenderArrayNilIsNullEmptyIsEmpty now exercises
  Float64ArrayNDColumn(nil), pinning that a nil NdArray creates the
  column and marks the row NULL instead of being silently dropped.

- The LineSender Float64Array{1,2,3}D/ND and QwpSender
  Int64Array{1,2,3}D doc comments now state the nil=NULL versus
  non-nil-empty=empty-array contract.

No production logic changes; the doc comments are the only non-test
edits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The empty-vs-NULL array fix left three test comments describing the
old decoder behavior, where a 0-length dimension was rejected. They
are now wrong or misleading; this updates them to present-tense facts.

- qwp_cursor_bounds_check_fuzz_test.go: the array seed comment no
  longer claims the decoder rejects a 0-length dim. It now explains
  the real reason for 1-3 elements — the seed carries element-data
  bytes for the corruption and truncation passes to mutate, which
  still reach the 0-length-dim form.

- buildArrayHardeningFrame: the padding comment no longer asserts the
  decoder always rejects before reading element bytes (untrue for the
  flipped H27b case). It now covers all callers: each shape is either
  rejected at the shape/nDims check or accepted as a 0-element array,
  so no element bytes are consumed.

- TestQwpIntegrationEmptyAndNullArray: dropped the changelog-style
  framing for a present-tense property — an empty array reads back as
  a non-null, zero-element row distinct from NULL.

Comment-only; no test logic changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
qwp_sender_test.go (1)

1042-1065: ⚡ Quick win

Assert empty-array inline header bytes, not only length.

The check currently allows malformed encodings with the same arrayData length. Please also assert the exact header bytes for empty arrays (e.g. nDims=1, dim0=0) so wire-shape regressions are caught.

Suggested patch
@@
 import (
 	"bytes"
@@
 	"testing"
 	"time"
@@
 	const emptyArrayInlineLen = 5
+	empty1DHeader := []byte{1, 0, 0, 0, 0}
 	check := func(name string, wantNullCount, wantArrayDataLen int) {
@@
 		if len(col.arrayData) != wantArrayDataLen {
 			t.Fatalf("%s: len(arrayData) = %d, want %d", name, len(col.arrayData), wantArrayDataLen)
 		}
+		if wantArrayDataLen == emptyArrayInlineLen && !bytes.Equal(col.arrayData, empty1DHeader) {
+			t.Fatalf("%s: arrayData header = %v, want %v", name, col.arrayData, empty1DHeader)
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qwp_sender_test.go` around lines 1042 - 1065, The check function currently
only validates the length of col.arrayData but does not verify the actual header
bytes, which allows malformed encodings to pass if they have the correct length.
Enhance the check function to also assert the exact header byte values for empty
arrays when wantArrayDataLen equals emptyArrayInlineLen, specifically verifying
that the first byte (nDims) is 1 and the following 4 bytes (dim0) are 0 to catch
wire-shape regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@qwp_query_integration_test.go`:
- Around line 135-163: The sender variable created by newQwpLineSender in the
test is only closed on the happy path at the end, meaning any early t.Fatalf
call will exit without closing the resource. Add a defer statement immediately
after successfully creating the sender s to ensure s.Close(ctx) is called
regardless of whether subsequent operations fail, preventing resource leaks that
could cause test instability.

---

Nitpick comments:
In `@qwp_sender_test.go`:
- Around line 1042-1065: The check function currently only validates the length
of col.arrayData but does not verify the actual header bytes, which allows
malformed encodings to pass if they have the correct length. Enhance the check
function to also assert the exact header byte values for empty arrays when
wantArrayDataLen equals emptyArrayInlineLen, specifically verifying that the
first byte (nDims) is 1 and the following 4 bytes (dim0) are 0 to catch
wire-shape regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 72689d46-d357-40cd-bf45-b2f4fa52d109

📥 Commits

Reviewing files that changed from the base of the PR and between ffec03a and f96cf03.

📒 Files selected for processing (7)
  • qwp_fuzz_fixture_test.go
  • qwp_query_decoder.go
  • qwp_query_decoder_test.go
  • qwp_query_integration_test.go
  • qwp_sender.go
  • qwp_sender_test.go
  • sender.go

Comment thread qwp_query_integration_test.go

@bluestreak01 bluestreak01 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve.

Reviewed at level 3 (full mission-critical pass). The core fix is correct, safe, and internally consistent:

  • Decoder relaxation (parseArray) is a net hardening: the new per-dimension cap (dl < 0 || int64(dl) > qwpMaxArrayElements) closes the gap the old product-only check left when a 0 zeroes the running product. No int64 overflow (max intermediate max*max ≈ 7.2e16 < 2^63). nDims==0 and negative dims still rejected.
  • All 12 array accessors safely handle the newly-reachable non-null elems==0 state (make(...,0) non-nil empty; unsafe.Slice guarded behind elems>0; *Into short-circuits).
  • nil->NULL setter mapping is uniform across all six typed QWP setters + the ND path; array columns are always nullable so addNull takes the bitmap branch (the non-nullable-array panic is an unreachable internal-invariant guard). rowCount advances exactly once per call.
  • Interface doc holds for all transports (ILP buffer.go already mapped nil->omit (=server NULL) and non-nil-empty->inline empty array).
  • go build, go vet, staticcheck, and the new unit + decoder round-trip (DOUBLE+LONG) + hardening + fuzz-seed tests all pass.

Non-blocking follow-ups:

  • M1 (Moderate, visibility): ensure the release note explicitly calls out the QWP nil->NULL semantics change (the user-facing risk is the nil path, not the empty path).
  • m1 (Minor): integration test leaks the sender on early t.Fatalf paths — add defer s.Close(ctx) after creation.
  • m2 (Minor): TestQwpSenderArrayNilIsNullEmptyIsEmpty should assert the empty-array header bytes {1,0,0,0,0}, not only len==5.
  • m3 (Minor): add a QwpColumn.*ArrayInto test on a non-null empty array (only IsNull distinguishes empty from NULL there).

No Critical or Moderate-correctness issues.

@bluestreak01 bluestreak01 merged commit 87a1fb2 into main Jun 17, 2026
6 checks passed
@bluestreak01 bluestreak01 deleted the mt_cleanup-null-empty-array branch June 17, 2026 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants