Add telemetry for bundle config-remote-sync#5556
Merged
Merged
Conversation
Collaborator
Integration test reportCommit: 0786514
26 interesting tests: 15 SKIP, 7 KNOWN, 4 flaky
Top 26 slowest tests (at least 2 minutes):
|
Co-authored-by: Isaac
…unters Co-authored-by: Isaac
…ten_local_edits, drop raw_values_with_var_syntax Co-authored-by: Isaac
…nd unsupported) Co-authored-by: Isaac
fa19e19 to
d955798
Compare
…local_edits Co-authored-by: Isaac
…ounters The unit tests cover add_count/remove_count, recreate_forcing_changes, and the variable-restoration counts, but they were missing acceptance coverage. These scenarios all round-trip against the local mock testserver, so they don't need a cloud test. - job_fields: assert the telemetry event (add_count/remove_count + per-resource breakdown). The remote edits already add and remove keyed fields. - resolve_variables: assert refs_retargeted (env param re-targeted to a different variable) and refs_from_siblings (added params/tasks restored from siblings). - New telemetry/config-remote-sync-recreate: a remote change to the immutable ingestion_definition.connection_name makes recreate_forcing_changes non-zero. storage (the other immutable pipeline field) can't be used because configsync always skips it as a backend-generated default. The two enriched tests run on both engines, so the asserted event drops the engine name (every other counter is engine-agnostic); RecordRequests is enabled and out.requests.txt is removed in cleanup so it isn't compared.
ilyakuz-db
commented
Jun 16, 2026
| @@ -1,11 +1,25 @@ | |||
| package phases | |||
| package telemetry | |||
Contributor
Author
There was a problem hiding this comment.
Moved this to telemetry/ as we need to use it in config-remote-sync errors
shreyas-goenka
approved these changes
Jun 17, 2026
| } | ||
| if len(refs) == 1 { | ||
| for ref := range refs { | ||
| stats.FromSiblings++ |
Contributor
There was a problem hiding this comment.
nit: Add nil check to avoid panics?
Contributor
There was a problem hiding this comment.
We should not ever fail because of telemetry code paths.
Telemetry is best-effort and must not break config-remote-sync: - CollectChangeStats and LogTelemetry now defer a recover() that logs at debug and swallows any panic (e.g. the dresources.MustLoad* lifecycle-config loaders called from isRecreateForcing, or anything during proto build/upload). - RestoreStats counters are incremented through nil-safe incRetargeted / incFromSiblings methods, so threading the pointer through the deep restoration recursion can never nil-panic; the now-redundant nil guard in RestoreVariableReferences is removed.
…-tests # Conflicts: # bundle/phases/telemetry.go
pietern
approved these changes
Jun 17, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
Add a
bundle_config_remote_sync_eventtelemetry event, emitted on every execution of thebundle config-remote-synccommand (previously this command emitted no telemetry at all).The event records aggregate data only: change counts by operation and resource type, file counts, variable-reference restoration counts by mechanism, a count of remote string values containing
${, the engine, and an enumerated error category. No resource names, keys, paths, or configuration values are logged.Also wraps the missing-snapshot failure in a sentinel error (
configsync.ErrStateSnapshotNotFound) so it can be classified.Note: the server-side proto changes for this event are in flight; until they ship, the new event field is dropped at ingestion.
Why
The command is used by the in-workspace UI sync flow, and today its executions and failures are invisible in telemetry, which makes investigating sync-related issues unnecessarily hard.
Tests
Unit tests for stats collection and restoration counters; two new acceptance tests (happy path detecting a remote job rename, and the error path without deployed state reporting
STATE_NOT_FOUND).