Add effective_root: NodePath to WD, enabling a sub-directory to act as the root for path resolution. This is the foundation for cross-pond sitegen where foreign ponds define site.yaml with absolute paths that must resolve within their own root, not the importing pond's root.

Key changes:
- WD struct gains effective_root field (always present, not Option)
- FS::root() sets effective_root to actual root; FS::wd() takes parameter
- resolve(): Component::RootDir resets stack to effective root
- resolve(): Component::ParentDir clamps at effective root boundary
- Symlink targets are contained within effective root scope
- Glob/visitor patterns honor effective root for leading / strip
- All child WD creation propagates effective root via child_wd()
- New API: as_root(), effective_root(), is_at_root_boundary()

9 new unit tests cover absolute paths, .. clamping, symlink containment, glob patterns, child WD inheritance, and backward compatibility.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
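The RootDir reset and ParentDir clamp described in this commit can be illustrated with a minimal sketch. It is not the tinyfs implementation: `resolve_scoped` is a hypothetical helper that models directories as plain name lists rather than NodePath values, and it ignores symlinks and globs.

```rust
use std::path::{Component, Path};

/// Resolve `path` starting from `cwd`, treating `root` as the effective root.
/// `root` and `cwd` are directory names relative to the real filesystem root;
/// `cwd` is assumed to lie inside `root`. (Hypothetical helper, for illustration.)
fn resolve_scoped(root: &[&str], cwd: &[&str], path: &str) -> Vec<String> {
    let mut stack: Vec<String> = cwd.iter().map(|s| s.to_string()).collect();
    for comp in Path::new(path).components() {
        match comp {
            // A leading "/" resets to the effective root, not the real root.
            Component::RootDir => {
                stack = root.iter().map(|s| s.to_string()).collect();
            }
            // ".." pops one level but is clamped at the effective root.
            Component::ParentDir => {
                if stack.len() > root.len() {
                    stack.pop();
                }
            }
            Component::Normal(name) => stack.push(name.to_string_lossy().into_owned()),
            // "." and platform prefixes do not change the position.
            Component::CurDir | Component::Prefix(_) => {}
        }
    }
    stack
}
```

With `root` set to `imports/producer`, an absolute path such as `/sensors/a` lands under the import mount, and any number of `..` components stops at the mount point.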
Remove stdin_open/tty from docker-compose.test.yaml: they caused docker compose run to hang in non-interactive shells. Interactive debugging still works via run-test.sh --interactive (uses docker run -it directly, not compose).

Add timeout --foreground to run-test.sh so timeout can detect child process exit when invoked from non-interactive shells (e.g., run-all.sh). Without --foreground, timeout creates a new process group and cannot detect when docker compose run exits.

S3/MinIO tests now complete in 7-25s instead of hitting the 300s timeout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Allow source_path "/" (and "/**") in cross-pond import config. When importing the foreign root, the mount point is a fresh local directory (not the foreign root UUID, which would collide with the local root). The foreign root's children are linked with their original foreign IDs via recursive import.

New testsuite test 532-cross-pond-path-boundaries.sh (12 checks):
- Imports producer pond by root into consumer at /imports/producer
- Positive: imported files accessible, data matches byte-for-byte
- Negative: consumer-only paths not resolvable in import context, same-name files in consumer vs import have distinct content
- Provenance: pond_ids are distinct

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…oot()

Add effective_root field to FactoryContext with root() method that returns a chrooted WD when set, or the global root otherwise. Factories should use context.root() instead of context.context.filesystem().root() for automatic cross-pond path scoping.

Sitegen updated to use context.root() at all 3 call sites. Struct literal constructions simplified to use FactoryContext::new() instead of spelling out all fields (removes field-listing duplication in 6 test helpers + 2 internal sites).

Next step: add pond_id to Node so tinyfs resolve() can auto-detect import boundaries and set effective_root during traversal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
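The fallback behaviour of root() can be sketched as follows. The types are simplified stand-ins: `Wd` here is a hypothetical struct holding only a path string, and the `global_root` field is an assumption made so the sketch is self-contained (the real context reaches the root through its filesystem handle).

```rust
/// Simplified stand-in for a working-directory handle (assumption).
#[derive(Clone, Debug, PartialEq)]
struct Wd {
    path: String,
}

/// Simplified stand-in for FactoryContext (assumption: the real struct
/// carries more state and obtains the global root from the filesystem).
struct FactoryContext {
    global_root: Wd,
    effective_root: Option<Wd>,
}

impl FactoryContext {
    /// Return the scoped root when one is set, otherwise the global root.
    fn root(&self) -> Wd {
        self.effective_root
            .clone()
            .unwrap_or_else(|| self.global_root.clone())
    }
}
```

Because callers always go through root(), a factory does not need to know whether it is running inside an import mount.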
FileID now carries pond_id: Uuid as part of identity (participates in Eq/Hash). This distinguishes nodes from different ponds, enabling cross-pond imports where the same root UUID exists in multiple ponds.

Key changes:
- FileID struct gains pond_id field, included in Eq/Hash/Serialize
- FileID::root_for(pond_id) creates pond-scoped root identity
- FileID::root() uses local_pond_uuid() default for backward compat
- All FileID constructors gain pond_id parameter
- new_child_id() and child_id() inherit pond_id from parent
- local_pond_uuid() constant for memory/hostmount/test contexts
- ~18 files updated across tinyfs, tlogfs, provider, remote, steward, cmd

All 181 tinyfs + 62 tlogfs + 184 provider tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
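A minimal sketch of why pond_id must take part in Eq/Hash. It assumes a simplified `FileId` in which `u128` stands in for Uuid and a fixed `ROOT_NODE` constant stands in for the shared root UUID; `distinct_count` is a hypothetical helper added only for the demonstration.

```rust
use std::collections::HashSet;

/// Assumed stand-in for the root node UUID shared by every pond.
const ROOT_NODE: u128 = 0;

/// Simplified identity: deriving Eq/Hash over both fields makes the
/// pond part of the identity, as the commit describes.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct FileId {
    pond_id: u128,
    node_id: u128,
}

impl FileId {
    /// Pond-scoped root identity.
    fn root_for(pond_id: u128) -> Self {
        FileId { pond_id, node_id: ROOT_NODE }
    }

    /// Children inherit the pond of their parent.
    fn child_id(&self, node_id: u128) -> Self {
        FileId { pond_id: self.pond_id, node_id }
    }
}

/// Count distinct identities, as a cache keyed by FileId would see them.
fn distinct_count(ids: &[FileId]) -> usize {
    ids.iter().collect::<HashSet<_>>().len()
}
```

Two roots with the same node_id but different pond_id values stay distinct, so an imported root no longer collides with the local one in hash-keyed structures.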
- journal-ingest: track per-file timestamp bounds, store temporal metadata and extended_attributes on FilePhysicalSeries writes
- tlogfs: persist temporal metadata for FilePhysicalSeries in both small and large file paths; relax new_large_file_series assertion to accept FilePhysicalSeries entry type
- export: refactor into export_series_to_parquet (wrapper) and export_table_provider_to_parquet (core) with configurable timestamp column name
- sitegen: detect URL-scheme patterns in export stages, route through UrlPatternMatcher and format provider registry; add timestamp_column config field to ExportStage
- sitegen: add log_viewer shortcode, logs layout, log-viewer.js client-side viewer with DuckDB-WASM, unit filter pills, pagination, and priority coloring
- linux: add site.yaml, site templates, and sitegen setup to setup.sh for local log viewing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When resolve() encounters a child directory with a different pond_id than its parent, it detects a cross-pond boundary. The parent directory (mount point) is automatically set as the effective root on the returned WD. This means absolute paths from that point resolve within the imported subtree, not the global root.

Detection is transparent: any code that resolves paths through an imported partition gets automatic chroot scoping. No explicit as_root() call needed for cross-pond imports.

Two new unit tests:
- test_auto_detect_pond_boundary: verifies effective_root is set to the mount point when crossing a pond boundary
- test_auto_detect_absolute_path_scoped_to_mount: verifies that after auto-detection, absolute paths resolve within the mount

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
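The boundary check can be sketched as a walk over the resolved steps. This is an illustration under assumptions, not the resolve() code: `Step` and `detect_effective_root` are hypothetical names, and pond IDs are small integers rather than UUIDs.

```rust
/// One directory step on a resolved path: its name and owning pond.
/// `pond_id` is a plain integer here; the real field is a UUID.
struct Step<'a> {
    name: &'a str,
    pond_id: u32,
}

/// Walk `steps` from the local root and return the path components of the
/// effective root. An empty result means the global root is still in effect.
fn detect_effective_root<'a>(local_pond: u32, steps: &[Step<'a>]) -> Vec<&'a str> {
    let mut parent_pond = local_pond;
    let mut root_len = 0;
    for (depth, step) in steps.iter().enumerate() {
        if step.pond_id != parent_pond {
            // The child belongs to another pond, so its parent (everything
            // walked so far) is the mount point and becomes the effective root.
            root_len = depth;
        }
        parent_pond = step.pond_id;
    }
    steps[..root_len].iter().map(|s| s.name).collect()
}
```

For a path like /imports/producer/sensors where only `sensors` carries the foreign pond_id, the sketch reports /imports/producer as the effective root.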
- chart.js/overlay.js: use import.meta.url for absolute vendor URLs (fixes blob Worker importScripts with relative paths, fixes Vite rewriting ./vendor/ to /vendor without trailing slash)
- noyo: remove hardcoded ROOT path, use cargo run --release, source deploy.env for S3 credentials, envsubst for backup.yaml
- noyo/export.sh: preview command uses correct BASE_PATH for subdir site
- septic/backup.yaml: fix envsubst syntax (remove shell default)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Major changes to cross-pond import infrastructure:

Persistence layer - pond_id query scoping:
- Add pond_uuid() method to PersistenceLayer trait with default fallback
- Implement pond_uuid() on State using real pond UUID from OpLogPersistence
- Forward pond_uuid() through CachingPersistence wrapper
- Filter committed records by pond_id in query_latest_record, query_latest_directory_record, and query_records
- FS::root() now uses FileID::root_for(persistence.pond_uuid())
- initialize_root_directory takes explicit pond_id parameter

Import architecture simplification:
- Remove create_child_dirs_recursive - one directory entry per mount point is sufficient; foreign partition data contains the full directory tree
- Extract foreign pond UUID from backup OpLog for correct FileID construction
- Include foreign root partition in import set for root imports
- Add collect_partitions_recursive for partition ID discovery (metadata only)
- Fix deep recursion bug in execute_import partition discovery

Bug fixes:
- Fix Ship::create_pond double PondMetadata::default() - was creating two different UUIDs for control table vs data persistence
- Thread pond_id through tinyfs_object_store path parsing
- Fix S3 endpoint hostname in deploy.env.example files

New cross-pond example (cross/):
- setup.sh discovers pond IDs from noyo, water, septic
- import.sh pulls data from all three source ponds
- generate.sh builds combined site via sitegen
- Combined site.yaml with exports from all three sources

WIP: Foreign root partition directory listing not yet resolving correctly through pond_id scoping - the cache lookup needs further debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DirectoryEntry did not store the child's pond_id. When OpLogDirectory::insert() added a foreign node to a local directory, the foreign pond_id was discarded. OpLogDirectory::get() then used the parent's pond_id to reconstruct the child FileID, causing queries to filter by the wrong pond_id and return the local root's directory record instead of the foreign one.

Added pond_id: Option<String> to DirectoryEntry:
- None = same pond as parent (default, backward compatible)
- Some(uuid) = child from a foreign pond (cross-pond import)

insert() now compares child vs parent pond_id and stores it when they differ. get() and remove() use the stored pond_id for child FileID construction. flush_directory_operations() preserves pond_id when recreating entries at flush time.

Backward compatible: existing Arrow IPC data without the pond_id column deserializes as None via #[serde(default)].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
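The insert/get rule above can be sketched without the persistence layer. The struct is reduced to two fields, and `make_entry` and `child_pond_id` are hypothetical helpers standing in for the logic inside insert() and get(); serde and Arrow are left out.

```rust
/// Reduced DirectoryEntry: only the fields needed for the illustration.
#[derive(Debug)]
struct DirectoryEntry {
    name: String,
    /// None = same pond as the parent; Some(id) = child from a foreign pond.
    pond_id: Option<String>,
}

/// On insert, store the child's pond only when it differs from the parent's.
fn make_entry(name: &str, child_pond: &str, parent_pond: &str) -> DirectoryEntry {
    DirectoryEntry {
        name: name.to_string(),
        pond_id: if child_pond == parent_pond {
            None
        } else {
            Some(child_pond.to_string())
        },
    }
}

/// On lookup, prefer the stored pond and fall back to the parent's pond.
fn child_pond_id<'a>(entry: &'a DirectoryEntry, parent_pond: &'a str) -> &'a str {
    entry.pond_id.as_deref().unwrap_or(parent_pond)
}
```

Entries written before the field existed behave like the None case, which is why old data keeps resolving to the parent's pond.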
sql_derived, temporal_reduce, and timeseries_pivot resolved absolute paths via fs.root() which always returns the global pond root. When these factories run inside a cross-pond import mount, absolute paths like /sensors/station_a need to resolve within the foreign tree, not the consumer's tree.

Changed all read-path factories to use context.root() which respects the effective_root set by cross-pond boundary detection. This is the same API sitegen already uses.

Write-path factories (remote, hydrovu, logfile_ingest, journal_ingest) are not changed: they legitimately need the global root to write to the local pond.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a dynamic node (dynamic-dir, timeseries-join, etc.) belongs to a foreign pond (cross-pond import), its FactoryContext needs an effective_root so that absolute paths in factory configs resolve within the imported tree, not the consumer's global root.

Three changes:
- create_dynamic_node_from_oplog_entry (persistence.rs): detects foreign nodes by comparing node pond_id vs persistence pond_uuid, loads the foreign root, and sets effective_root on the context
- DynamicDirDirectory::create_child_context (dynamic_dir.rs): propagates effective_root to child factory contexts
- FactoryContext::effective_root() getter (context.rs): added for propagation

Also adds test 533-cross-pond-factory-resolution.sh which imports a producer pond containing a dynamic-dir with timeseries-join and verifies the factory output matches in the consumer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Documents three fixes from this session:
- DirectoryEntry pond_id for cross-pond child identity
- Factory path resolution via context.root()
- effective_root threading into dynamic factory contexts

Updates architecture notes with factory path resolution convention and effective_root derivation model.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add subsites: directive to site.yaml config. When building a combined site, the top-level sitegen reads each imported pond's own sitegen config and generates it into a subdirectory, using WD::as_root() to scope path resolution to the import mount point.

Key changes:
- config.rs: SubsiteConfig struct (name, path, config, base_url)
- factory.rs: Extract build_site_from_root() from execute(), iterate subsites in execute() reading foreign configs and building scoped
- layouts.rs: Add root_base_url to LayoutContext for shared asset references; use base_url for per-site theme.css link
- factory.rs: Split write_builtin_assets into write_shared_assets (once at top level) and write_theme_css (per site)
- Refactor run_export_stages and run_content_stages to take WD root directly instead of FactoryContext

cross/ example simplified from 140-line manual duplication to 50-line config using subsites: directive. Per-system templates removed; each sub-site uses its own templates from the imported pond.

Integration test: testsuite/tests/540-recursive-sitegen.sh

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… resolution

Three bugs prevented cross-pond import from working end-to-end:
1. list_transaction_files() constructed bundle_id with today's date via transaction_bundle_id(), but stored bundles contain the original push date. Changed to LIKE prefix+suffix matching so the date component is not required to match exactly.
2. extract_foreign_pond_id() used a non-deterministic SQL query on the oplog. Replaced with RemoteTable::extract_pond_id() which reads the pond_id directly from FILE-META partition keys in the Delta table. Also removed the config_url parameter from execute_import() which parsed the URL to guess the pond_id.
3. Provider path resolution ignored effective_root. All four self.fs.root() calls in provider_api.rs bypassed cross-pond root scoping, so absolute paths inside imported factories resolved from the consumer pond's root instead of the foreign pond's root. Added Provider::with_root() and wired it through temporal_reduce and sql_derived factories.

Fixed WD::as_root() to reset display paths to "/" (chroot semantics) so collect_matches returns paths that round-trip correctly through resolve_path.

Infrastructure:
- Makefile: added site-noyo, site-septic, site-water, sites, site-cross
- noyo: added rm -rf to setup.sh, allow_http to backup config
- water: added backup.yaml and backup factory to setup-local.sh
- septic/water: skip rsync when local data already exists

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three issues prevented large files from being backed up and imported:
1. get_large_files() only matched 'sha256=' prefix but large files are now named with 'blake3=' prefix. Added blake3= to the match.
2. execute_push() returned early when all transactions were already backed up, skipping the large file backup section. Restructured to always run the large file check regardless of transaction state.
3. execute_import() never downloaded large files from the foreign backup. Added a post-partition step that scans for _large_files/ entries in the remote table and downloads them to the local pond.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
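The prefix fix in item 1 amounts to accepting either hash scheme. A minimal sketch, assuming the match is applied to the bare file name; `is_large_file_name` is a hypothetical helper, not the body of get_large_files().

```rust
/// Hash-scheme prefixes accepted for large-file names. Both values come from
/// the commit message; how the real code stores them is an assumption.
const LARGE_FILE_PREFIXES: [&str; 2] = ["sha256=", "blake3="];

/// True when `name` starts with any accepted hash prefix.
fn is_large_file_name(name: &str) -> bool {
    LARGE_FILE_PREFIXES.iter().any(|p| name.starts_with(*p))
}
```

Keeping the older sha256= prefix in the list means backups written before the rename are still picked up.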
Two issues prevented the cross-pond site from functioning:
1. TinyFS ObjectStore path format lacked pond_id. The tinyfs:// URL
format was part/{part_id}/node/{node_id}/version/ which meant
the ObjectStore always constructed FileIDs with the local pond's
UUID. For foreign mount files this caused list_file_versions to
return empty results (wrong pond_id filter). Added pond/{pond_id}
prefix to the URL format so the correct pond_id is preserved
through the DataFusion -> ObjectStore -> persistence round-trip.
Updated TinyFsPathBuilder, parse_tinyfs_path (with legacy
fallback), memory file as_table_provider, and sql_derived URL
construction.
2. Subsite sidebar links used the source site's original base_url
(e.g., /noyo-harbor/) instead of the cross site's mount point
(e.g., /noyo/). Added SiteConfig::rewrite_sidebar_urls() which
replaces the old base_url prefix with the new one in all sidebar
href values when a subsite is mounted at a different path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
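The path change in item 1 can be sketched as a builder plus a parser with a legacy fallback. This is an illustration only: the exact segment layout after the pond/{pond_id} prefix is inferred from the commit text, IDs are plain strings, and `build_tinyfs_path` and the `TinyFsPath` struct are hypothetical (only the name parse_tinyfs_path comes from the commit).

```rust
/// Parsed components of a tinyfs object-store path (hypothetical struct).
#[derive(Debug, PartialEq)]
struct TinyFsPath {
    pond_id: String,
    part_id: String,
    node_id: String,
}

/// Build the new-style path, with the pond prefix first.
fn build_tinyfs_path(pond_id: &str, part_id: &str, node_id: &str) -> String {
    format!("pond/{pond_id}/part/{part_id}/node/{node_id}/version/")
}

/// Parse a path; legacy paths without a pond prefix fall back to `local_pond`.
fn parse_tinyfs_path(path: &str, local_pond: &str) -> Option<TinyFsPath> {
    let segs: Vec<&str> = path.trim_matches('/').split('/').collect();
    match segs.as_slice() {
        // New format: the pond_id travels with the path.
        ["pond", pond, "part", part, "node", node, "version", ..] => Some(TinyFsPath {
            pond_id: pond.to_string(),
            part_id: part.to_string(),
            node_id: node.to_string(),
        }),
        // Legacy format: no pond segment, so assume the local pond.
        ["part", part, "node", node, "version", ..] => Some(TinyFsPath {
            pond_id: local_pond.to_string(),
            part_id: part.to_string(),
            node_id: node.to_string(),
        }),
        _ => None,
    }
}
```

Carrying the pond_id in the path is what lets a foreign file's identity survive the round trip, while the fallback keeps previously written paths readable.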
The sidebar.md had {{ content_nav }} (missing the / before the closing }}),
which was treated as a literal template variable instead of
a shortcode invocation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Use correct temporal-reduce config fields (in_pattern, out_pattern,
time_column, aggregations) instead of deprecated names
- Fix shortcode syntax: {{ chart /}} and {{ content_nav /}}
- Add missing mkdir -p /system/site for consumer portal templates
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shared assets (style.css, chart.js, overlay.js, log-viewer.js) must
always load from the web server root ('/'), not prefixed with base_url.
The root_base_url mechanism incorrectly prefixed them for subdir builds.
Removed root_base_url from LayoutContext and all call sites. Shared
asset references are now hardcoded to '/' in layouts, matching the
original design where style.css and chart.js are served from the
top-level output directory.
Updated test 209 to check theme.css instead of style.css for theme
overrides, since theme overrides are now written to a separate file.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three issues in the browser test (202):
1. Vendor files (DuckDB-WASM, Observable Plot, D3) were missing from the Docker test image. Added vendor/ copy to build-image.sh and Dockerfile so sitegen can bundle them into generated sites.
2. Test 201's synthetic data used 2025 dates which fall outside chart.js's default 3-month window (now April 2026). Updated to span 2025-06 through 2026-06 so charts render data.
3. Vite's base path rewriting doubled the base_url prefix in pre-rendered HTML hrefs (e.g., /myapp/myapp/theme.css). Changed the subdir browser test to nest the output under myapp/ and serve from the parent with base '/', matching how a real reverse proxy deployment works.

Also simplified DuckDB Worker creation (direct URL instead of blob+importScripts) and added @vite-ignore hints on vendor imports.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The browser test needs DuckDB-WASM and Observable Plot vendor files bundled into the Docker test image so sitegen can include them in generated sites. Added 'make vendor' step before build-image.sh. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Also a prototype logs viewer for jsonlogs:// files.
Attempting to run DuckDB-WASM from a vendored copy so that it can run offline.