Commit 23397d0

Update test examples and document virtualized windowed fetching
- Make DuckDB test Rmd chunks self-contained (inline data generation)
- Add Parquet + virtual scrolling test example
- Update virtual scrolling test to use backend API
- Document pre-render jank with virtual + pagination=FALSE
- Add virtualized windowed fetching as future Phase 5
1 parent b8fca1a commit 23397d0

4 files changed

Lines changed: 71 additions & 2 deletions

design/duckdb-wasm-engine/duckdb-wasm-engine-test.Rmd

Lines changed: 53 additions & 0 deletions
@@ -131,6 +131,14 @@ reactable(
 ### Large dataset: 1M rows (virtualized)
 
 ```{r}
+n <- 1000000
+big_data <- data.frame(
+  id = seq_len(n),
+  value = rnorm(n),
+  category = sample(LETTERS, n, replace = TRUE),
+  score = round(runif(n, 0, 100), 2)
+)
+
 reactable(
   big_data,
   backend = backendDuckDB(),
@@ -547,6 +555,15 @@ reactable(
 ### Large multi-level grouping: 100k rows grouped by category and region
 
 ```{r}
+n <- 100000
+large_data <- data.frame(
+  id = seq_len(n),
+  category = sample(LETTERS, n, replace = TRUE),
+  region = sample(c("East", "West", "North", "South"), n, replace = TRUE),
+  value = round(rnorm(n, mean = 100, sd = 50), 2),
+  score = round(runif(n, 0, 100), 2)
+)
+
 reactable(
   large_data,
   backend = backendDuckDB(),
@@ -890,6 +907,15 @@ reactable(
 Force embedded Arrow IPC even for large data.
 
 ```{r}
+n <- 1000000
+parquet_data <- data.frame(
+  id = seq_len(n),
+  value = rnorm(n),
+  category = sample(LETTERS, n, replace = TRUE),
+  score = round(runif(n, 0, 100), 2),
+  label = paste0("item-", seq_len(n))
+)
+
 reactable(
   parquet_data,
   backend = backendDuckDB(format = "arrow"),
@@ -899,6 +925,33 @@ reactable(
 )
 ```
 
+### Parquet with virtual scrolling
+
+Unpaginated virtual scrolling with Parquet sidecar. DuckDB sends all rows in a single
+query (pageSize = null), and virtual scrolling handles rendering.
+
+```{r}
+n <- 1000000
+parquet_data <- data.frame(
+  id = seq_len(n),
+  value = rnorm(n),
+  category = sample(LETTERS, n, replace = TRUE),
+  score = round(runif(n, 0, 100), 2),
+  label = paste0("item-", seq_len(n))
+)
+
+reactable(
+  parquet_data,
+  backend = backendDuckDB(format = "parquet"),
+  pagination = FALSE,
+  virtual = TRUE,
+  height = 500,
+  sortable = TRUE,
+  filterable = TRUE,
+  searchable = TRUE
+)
+```
+
 ### Parquet in Shiny (client mode)
 
 Test that Parquet sidecar files work in a Shiny app. In Shiny, `backendDuckDB()` defaults

design/duckdb-wasm-engine/duckdb-wasm-engine.md

Lines changed: 4 additions & 0 deletions
@@ -1099,6 +1099,10 @@ and memory than it saves on queries.
 - **Debounce tuning:** The POC debounces search input at 300ms. For large datasets, increasing the debounce or
   requiring a minimum query length (3+ chars) would reduce perceived lag.
 
+### Future: virtualized windowed fetching
+
+With `virtual = TRUE, pagination = FALSE`, DuckDB currently fetches all rows at once (`pageSize: null` omits LIMIT/OFFSET). For Parquet, this means downloading the entire file over HTTP before the table renders. A future enhancement would use scroll-position-driven queries to fetch only a sliding window of rows around the viewport, leveraging Parquet HTTP range requests for efficient partial reads. See Phase 5 in `design/server-side-data/server-side-data.md` for the full plan.
+
 ## End-to-end benchmark: DuckDB vs default backend
 
 Measured in Chrome (Windows), serving rendered R Markdown documents over HTTP. Both documents use the same dataset
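
The added paragraph in this file says that `pageSize: null` omits LIMIT/OFFSET, which is why the unpaginated case fetches every row. A minimal sketch of that behavior follows, assuming a query builder shaped roughly like this; `buildPageQuery`, `tableName`, and `pageIndex` are hypothetical names, not identifiers from the commit.

```javascript
// Hypothetical helper (not from the commit): builds the SELECT for one page.
// A null pageSize means "no pagination", so no LIMIT/OFFSET is appended
// and the query returns every row.
function buildPageQuery({ tableName, pageIndex = 0, pageSize = null }) {
  let sql = `SELECT * FROM "${tableName}"`;
  if (pageSize != null) {
    if (!Number.isInteger(pageSize) || pageSize <= 0) {
      throw new RangeError("pageSize must be a positive integer or null");
    }
    const offset = pageIndex * pageSize;
    sql += ` LIMIT ${pageSize} OFFSET ${offset}`;
  }
  return sql;
}

console.log(buildPageQuery({ tableName: "data", pageIndex: 2, pageSize: 10 }));
console.log(buildPageQuery({ tableName: "data", pageSize: null }));
```

With `pagination = FALSE` the second form applies, so the full result set (and, for Parquet, the full file) has to arrive before the table can render.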

design/server-side-data/server-side-data.md

Lines changed: 12 additions & 0 deletions
@@ -197,6 +197,8 @@ When server-side search returns zero results, pagination shows "1-10 of 0 rows"
 
 **Future simplification: consider removing pre-rendered first page.** The R-side pre-rendering of the first page (to avoid a blank flash while WASM loads) adds significant JS complexity: `canSkipInitialDuckDBQuery`, `duckdbQueryCount`, `stateMatchesPrerender` comparison against `defaultSorted`, the groupBy special case (pre-rendered data is flat so we must query immediately), and race conditions when users interact before DuckDB is ready. Without pre-rendering, the query effect fires unconditionally after init and the entire skip optimization disappears. The tradeoff is showing a loading/empty state during WASM init instead of instant first-page display.
 
+Pre-rendering is also problematic with **virtual scrolling + `pagination = FALSE`**: the pre-rendered `defaultPageSize` rows (e.g., 10) display immediately, then several seconds later the full dataset loads from DuckDB and the table jumps to show all rows. This creates a jarring partial-load effect. Deferring table readiness until all client-side data is fetched (showing a loading indicator instead of the partial pre-render) would give a smoother experience for this combination.
+
 Another issue: **floating point precision mismatch** between the two data paths. The pre-rendered page goes through `jsonlite::toJSON(digits = NA)` which uses C's `%.15g` format (15 significant digits), while DuckDB query results come through Arrow's `row.toJSON()` which uses JavaScript's `Number.toString()` (up to 17 significant digits for exact float64 round-trip). Since 15 significant digits isn't always enough to recover the exact float64 value, numbers with many decimal places can visibly change when the user first interacts and DuckDB takes over from the pre-rendered data. This is unsolvable without either (a) increasing jsonlite's digits to 17 for exact round-trip, (b) rounding DuckDB results to 15 significant digits to match jsonlite, or (c) removing pre-rendering so there's only one data path.
 
 **Option B: Full server-side implementation (future)**
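
The floating point mismatch described in the context paragraph of this hunk can be reproduced in a few lines. This is an illustrative sketch only: `toPrecision(15)` stands in for C's `%.15g` (it keeps trailing zeros that `%g` would drop, but parses to the same number), and the variable names are made up for the example.

```javascript
// A value whose exact float64 representation needs 17 significant digits.
const x = 0.1 + 0.2;

// Stand-in for the pre-rendered path: 15 significant digits.
const fifteenDigits = x.toPrecision(15);

// Stand-in for the DuckDB/Arrow path: JavaScript's shortest round-trip string.
const shortest = x.toString();

// Parse both strings back, as the browser would when reading each payload.
const fromPrerender = Number(fifteenDigits);
const fromDuckDB = Number(shortest);

console.log(fifteenDigits, fromPrerender === x); // 15 digits do not round-trip
console.log(shortest, fromDuckDB === x);         // shortest form does round-trip
console.log(Number(x.toPrecision(17)) === x);    // 17 digits always round-trip
```

Here the 15-digit string parses back to `0.3`, a different float64 than `x`, which is the kind of visible change the paragraph attributes to switching data paths.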
@@ -419,6 +421,16 @@ reactableServerData.duckdb_backend <- function(
    - Document current limitation first
    - Full implementation if user demand warrants
 
+5. **Phase 5: Virtualized windowed fetching** (future)
+   - Enable `virtual = TRUE, pagination = FALSE` with DuckDB/Parquet without loading all rows at once
+   - Watch `virtualizer.range` (debounced) to detect when visible rows change
+   - Fire DuckDB queries with `LIMIT bufferSize OFFSET scrollPosition` for a sliding window (~500 rows centered on viewport)
+   - Maintain a sparse data array of length `totalRowCount` with placeholder objects for unfetched rows
+   - Show loading skeleton/shimmer for placeholder rows while data is in-flight
+   - Invalidate entire buffer on sort/filter/search and re-fetch from current scroll position
+   - Key benefit for Parquet: HTTP range requests mean DuckDB reads only the byte ranges needed, not the full file
+   - This is bidirectional infinite scroll -- the main complexity is buffer management and debouncing queries during fast scrolling
+
 ## Verification
 
 ### Manual Testing
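
The Phase 5 bullets added in this hunk describe a window calculation plus a sparse buffer. A rough sketch of those two pieces is below, under several assumptions: `computeWindow` and `createRowBuffer` are hypothetical names, the visible range is assumed to have the shape `{ startIndex, endIndex }`, and a `Map` is swapped in for the "sparse data array" the plan mentions so that unfetched rows take no memory. Debouncing and the actual DuckDB call are left out.

```javascript
// Hypothetical sketch, not code from the commit.

// Center a fixed-size window on the visible range, clamped to the dataset,
// yielding the values for `LIMIT limit OFFSET offset`.
function computeWindow(range, totalRowCount, bufferSize = 500) {
  const limit = Math.min(bufferSize, totalRowCount);
  const center = Math.floor((range.startIndex + range.endIndex) / 2);
  const maxOffset = totalRowCount - limit;
  const offset = Math.min(Math.max(center - Math.floor(limit / 2), 0), maxOffset);
  return { offset, limit };
}

// Row buffer keyed by absolute row index.
function createRowBuffer(totalRowCount) {
  const rows = new Map();
  return {
    totalRowCount,
    // Fetched row, or a placeholder the renderer can draw as a skeleton.
    get(index) {
      return rows.has(index) ? rows.get(index) : { __placeholder: true };
    },
    // Store rows returned by a windowed query starting at `offset`.
    fill(offset, fetchedRows) {
      fetchedRows.forEach((row, i) => rows.set(offset + i, row));
    },
    // Sort/filter/search changes row order, so every cached row is stale.
    invalidate() {
      rows.clear();
    },
  };
}

// Usage: the viewport shows rows 1000-1020 of a 1M-row table.
const win = computeWindow({ startIndex: 1000, endIndex: 1020 }, 1000000);
const buffer = createRowBuffer(1000000);
buffer.fill(win.offset, [{ id: win.offset + 1 }]);
console.log(win, buffer.get(win.offset), buffer.get(0));
```

Clamping the offset keeps the window full-sized at both ends of the table, so scrolling to the first or last rows still fetches `bufferSize` rows rather than a half-empty window.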

design/virtual-scrolling/virtual-scrolling-test.Rmd

Lines changed: 2 additions & 2 deletions
@@ -228,7 +228,7 @@ server <- function(input, output) {
 
 reactable(
   data,
-  server = TRUE,
+  backend = backendV8(),
   virtual = TRUE,
   height = 500,
   defaultPageSize = 1000,
@@ -276,7 +276,7 @@ Hypothetical API:
 reactable(
   # Column schema only, no data
   data.frame(id = integer(), value = numeric(), category = character()),
-  server = TRUE,
+  backend = backendV8(),
   virtual = TRUE,
   pagination = FALSE, # No pagination - seamless scrolling
   height = 500,
