Add experiment signals to fleet remote config#2872

Open
khewonc wants to merge 16 commits into main from khewonc/experiment-signals

Conversation

@khewonc
Collaborator

@khewonc khewonc commented Apr 7, 2026

What does this PR do?

Adds experiment signals: start, stop, and promote. This depends on #2838. The latest commit contains the relevant changes.

Motivation

https://datadoghq.atlassian.net/browse/CONTP-1424
https://datadoghq.atlassian.net/browse/CONTP-1425
https://datadoghq.atlassian.net/browse/CONTP-1426

Additional Notes

Merge after #2838

Minimum Agent Versions

Are there minimum versions of the Datadog Agent and/or Cluster Agent required?

  • Agent: vX.Y.Z
  • Cluster Agent: vX.Y.Z

Describe your test plan

TBD

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label
  • All commits are signed (see: signing commits)

@khewonc khewonc added this to the v1.26.0 milestone Apr 7, 2026
@khewonc khewonc added the enhancement New feature or request label Apr 7, 2026
@khewonc khewonc requested a review from a team April 7, 2026 13:31
@khewonc khewonc requested review from a team as code owners April 7, 2026 13:31

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2e153f00f4

Comment thread internal/controller/datadogagent/experiment.go
Comment thread internal/controller/datadogagent/controller_reconcile_v2.go
@codecov-commenter

codecov-commenter commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 72.84595% with 104 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.54%. Comparing base (a646370) to head (a816a2a).
⚠️ Report is 1 commits behind head on main.

Files with missing lines                        Patch %  Lines
pkg/fleet/daemon.go                             69.13%   58 Missing and 17 partials ⚠️
pkg/fleet/experiment.go                         83.67%   7 Missing and 1 partial ⚠️
internal/controller/datadogagent/experiment.go  82.50%   4 Missing and 3 partials ⚠️
internal/controller/datadogagent/revision.go    79.41%   4 Missing and 3 partials ⚠️
pkg/remoteconfig/updater.go                     0.00%    4 Missing ⚠️
cmd/main.go                                     0.00%    3 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2872      +/-   ##
==========================================
+ Coverage   40.03%   40.54%   +0.51%     
==========================================
  Files         319      321       +2     
  Lines       28066    28514     +448     
==========================================
+ Hits        11235    11560     +325     
- Misses      16008    16107      +99     
- Partials      823      847      +24     
Flag Coverage Δ
unittests 40.54% <72.84%> (+0.51%) ⬆️

Files with missing lines                        Coverage Δ
pkg/fleet/remote_config.go                      100.00% <100.00%> (ø)
cmd/main.go                                     6.66% <0.00%> (ø)
pkg/remoteconfig/updater.go                     0.00% <0.00%> (ø)
internal/controller/datadogagent/experiment.go  85.15% <82.50%> (+0.37%) ⬆️
internal/controller/datadogagent/revision.go    78.12% <79.41%> (-3.39%) ⬇️
pkg/fleet/experiment.go                         83.67% <83.67%> (ø)
pkg/fleet/daemon.go                             65.45% <69.13%> (+65.45%) ⬆️

... and 4 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a646370...a816a2a.
@khewonc khewonc force-pushed the khewonc/experiment-signals branch from e686774 to d0ff10a Compare April 7, 2026 20:03
@datadog-prod-us1-6

datadog-prod-us1-6 bot commented Apr 15, 2026

Code Coverage

🛑 Gate Violations

🎯 1 Code Coverage issue detected

A Patch coverage percentage gate may be blocking this PR.

Patch coverage: 71.83% (threshold: 80.00%)

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 71.83%
Overall Coverage: 40.61% (+0.50%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: a816a2a

Comment thread pkg/fleet/daemon.go

// Apply the spec patch.
if err := retryWithBackoff(ctx, func() error {
return d.client.Patch(ctx, dda, client.RawPatch(types.MergePatchType, op.Config))

❓ question: Is this a blocking call? If so, any UPDATER_TASK instruction will block every further UPDATER_TASK until the first one is done.
This means we can't stop a change from the backend while it is still being applied.

Collaborator Author


This is indeed a blocking call. However, both the spec patch and the status update are needed to start the experiment before it can be stopped. If we patch the spec and then cancel the start signal in order to stop, stopExperiment would see no active experiment (since the status hasn't been updated yet) and fail. The patched spec would still be applied, but with no experiment to track it. If we cancel the start signal before patching, there wouldn't be an experiment to stop either.

Comment thread pkg/fleet/daemon.go Outdated
Comment thread pkg/fleet/daemon.go Outdated
continue
}

seen[req.ID] = struct{}{}

❓ question: Should this be set before the call to h(req)?

Collaborator Author


I'm not entirely certain about this, but I think either would be fine.
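For reference, here is a minimal sketch of when the two placements could behave differently. It assumes that the `continue` in the excerpt skips the `seen` update after a handler error, and that the same request ID can be delivered more than once; neither is confirmed by the excerpt. The `request` and `handler` types and all function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// request and handler are hypothetical stand-ins for the daemon's types.
type request struct{ ID string }

type handler func(request) error

// processMarkAfter marks a request as seen only after the handler succeeds,
// so a failed request is handled again if the same ID is redelivered.
// It returns the number of handler calls.
func processMarkAfter(reqs []request, seen map[string]struct{}, h handler) int {
	calls := 0
	for _, req := range reqs {
		if _, ok := seen[req.ID]; ok {
			continue
		}
		calls++
		if err := h(req); err != nil {
			continue // not marked as seen: eligible for another attempt
		}
		seen[req.ID] = struct{}{}
	}
	return calls
}

// processMarkBefore marks a request as seen before calling the handler,
// so each ID is handled at most once even if the handler fails.
func processMarkBefore(reqs []request, seen map[string]struct{}, h handler) int {
	calls := 0
	for _, req := range reqs {
		if _, ok := seen[req.ID]; ok {
			continue
		}
		seen[req.ID] = struct{}{}
		calls++
		_ = h(req) // error ignored in this sketch
	}
	return calls
}

// failOnce returns a handler that fails the first time it sees each ID.
func failOnce() handler {
	failed := map[string]bool{}
	return func(r request) error {
		if !failed[r.ID] {
			failed[r.ID] = true
			return errors.New("transient failure")
		}
		return nil
	}
}

func main() {
	reqs := []request{{ID: "a"}, {ID: "a"}} // the same ID delivered twice
	fmt.Println(processMarkAfter(reqs, map[string]struct{}{}, failOnce()))  // 2
	fmt.Println(processMarkBefore(reqs, map[string]struct{}{}, failOnce())) // 1
}
```

If the handler never fails, or an ID is never redelivered, both variants make the same number of calls, which matches the view that either placement would be fine.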

Comment thread pkg/fleet/experiment.go
Comment thread pkg/fleet/daemon.go
Comment thread pkg/fleet/daemon.go
3 participants