Skip to content

refactor: move entrypoint to cmd/pop and convert root to library package#144

Open
linkdata wants to merge 3 commits into
dnstapir:mainfrom
linkdata:main
Open

refactor: move entrypoint to cmd/pop and convert root to library package#144
linkdata wants to merge 3 commits into
dnstapir:mainfrom
linkdata:main

Conversation

@linkdata
Copy link
Copy Markdown

@linkdata linkdata commented May 8, 2026

The motivation for this is making end-to-end integration testing of the entire DNS Tapir chain easier.
This was AI-generated and since this package has no tests it needs extra scrutiny before considering a merge.

Refactor dnstapir-pop Into cmd/pop + Root pop Package

Summary

Move the executable entrypoint to cmd/pop and convert the repo root into package pop. The root package exposes a library-style Run API, while cmd/pop/main.go owns process concerns such as signals, stderr reporting, and exit codes.

Public API And Layout

  • Keep functionality in the repo root as package pop; do not create a /pop subdirectory.
  • Add:
    type RunOptions struct {
        Name, Version, Commit string
        Args []string
        Stdout, Stderr io.Writer
        Reload <-chan struct{}
    }
    
    func Run(ctx context.Context, opts RunOptions) error
  • cmd/pop/main.go defines the ldflag-backed name, version, and commit vars, installs SIGINT/SIGTERM context cancellation, converts SIGHUP into RunOptions.Reload, calls pop.Run, and uses os.Exit(1) only on returned errors.
  • Remove root globals used only by the old executable path: Gconfig, mqttclientid, and POPExiter.

Implementation Changes

  • Rename all root Go files from package main to package pop; move the old startup logic out of main() into Run.
  • Update Makefile so make build runs go build ... -o out/dnstapir-pop ./cmd/pop; keep package/install artifacts and binary name unchanged.
  • Replace process exits in pop with returned errors:
    • SetupLogging, config validation/loading, MQTT setup/start, source parsing, output parsing, bootstrap setup, and policy parsing return contextual errors.
    • log.Fatal, os.Exit, panic, log.Panicf, and POPExiter disappear from package pop for normal failure paths.
  • Convert long-running workers to context/error style:
    • DnsEngine(ctx, *Config) error
    • APIhandler(ctx, *Config) error
    • ConfigUpdater(ctx, *Config) error
    • StatusUpdater(ctx, *Config) error
    • RefreshEngine(ctx, *Config) error
  • Run starts workers, watches worker errors, context cancellation, API stop requests, and reload events; it saves the RPZ serial before returning.
  • Remove MqttEngine.SetupInterruptHandler() from the library path; shutdown should flow through Run and StopEngine.
  • Update policy/RPZ helper signatures where needed so invalid list/zone formats return errors instead of exiting, and propagate those errors through RPZ generation.
  • Fix the current vet-blocking format string issues encountered in configupdater.go, refreshengine.go, and statusupdater.go as part of making go test ./... meaningful after the refactor.

Test Plan

  • Run gofmt on touched Go files.
  • Run go list ./... and confirm packages include dnstapir-pop and dnstapir-pop/cmd/pop.
  • Run go test ./...; expected target is green after the pre-existing vet issues are fixed.
  • Run make build and confirm out/dnstapir-pop is produced with ldflag metadata wired through cmd/pop.
  • Verify with search that POPExiter, root-package os.Exit, log.Fatal, and normal-error panic usages are gone.

Assumptions

  • The official library entrypoint is pop.Run; existing exported helper names can remain, but signatures may gain context.Context or error to satisfy the strict library API.
  • Config file paths, service names, package install paths, and runtime behavior stay the same unless needed to remove process exits.
  • Only cmd/pop may terminate the process; package pop reports failures to its caller.

Summary by CodeRabbit

  • New Features

    • Add reload-on-signal support so the running service can reload configuration without full restart.
  • Refactor

    • Split executable entrypoint from core runtime to provide a reusable library-style runtime (no user-facing behavior change).
  • Chores

    • Use a project-local Go build cache and update ignore rules for temporary artifacts.
  • Improvements

    • More graceful shutdowns and clearer error reporting during startup and runtime.

Refactor the repository to separate the executable entrypoint from the library
logic. The root package is renamed to `pop` and exposes a `Run` API, while
`cmd/pop/main.go` handles process concerns like signals and exit codes.

Key changes:
- Move `main` logic to `cmd/pop/main.go`
- Rename root package from `main` to `pop`
- Replace `os.Exit`, `log.Fatal`, and `panic` with error returns in library code
- Update `Makefile` to build from `./cmd/pop` and use local Go cache
- Add `LIBRARY-REFACTOR.md` documenting the changes and API
@linkdata linkdata requested a review from a team as a code owner May 8, 2026 10:33
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Review Change Stack

Warning

Rate limit exceeded

@linkdata has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 44 minutes and 7 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fdcadee1-3155-4b5e-917e-99c153d29a29

📥 Commits

Reviewing files that changed from the base of the PR and between 33b6a05 and 71a0b1e.

📒 Files selected for processing (1)
  • refreshengine.go
📝 Walkthrough

Walkthrough

Refactors the repo into a reusable library package pop with a new Run(ctx, RunOptions) API and a thin cmd/pop executable; converts workers to context-driven functions that return errors, replaces process-exit behaviors with error returns, and updates build/docs for the new layout.

Changes

Library Refactor

Layer / File(s) Summary
Refactor Plan & Build Configuration
LIBRARY-REFACTOR.md, .gitignore, Makefile
Adds refactor doc, sets project-local GOCACHE and builds ./cmd/pop, and ignores *~ and .gocache/.
Package Declarations & Types
config.go, structs.go, bootstrap.go, xfr.go, reaper.go, lists.go, logging.go, mqtt.go
Multiple files changed from package main to package pop; exported structs and some field placements were reflowed into the library namespace.
Library Entrypoint & cmd wrapper
main.go, cmd/pop/main.go
Adds RunOptions and Run(ctx, opts) in pop; new cmd/pop/main.go handles signals, reload channel, and process exit on error.
Worker Refactoring: Context-Driven Lifecycle
configupdater.go, refreshengine.go, statusupdater.go, apihandler.go, dnshandler.go
Workers converted from stop-channel + exit semantics to ctx context.Context + error returns; servers and goroutines coordinate shutdown via contexts, WaitGroups, and error channels.
Error Handling & Logging
config.go, logging.go, bootstrap.go, mqtt.go, policy.go
Functions that previously terminated the process now return wrapped errors (fmt.Errorf)—config validation, logging setup, bootstrap TLS, MQTT creation/start, and policy parsing updated.
Sources & RPZ Error Propagation
sources.go, policy.go, rpz.go
Per-source parsing aggregates errors via resultCh and errors.Join; ParseLocalFile/ParseRpzFeed return errors; ComputeRpzAction/ComputeRpzDoubtlistAction return (action, error) and callers propagate failures.
API Wiring & Misc
apihandler.go, main.go, Makefile
Route-walk errors surfaced, API bootstrap mutex usage fixed, reload handling wired to Run reload channel, SaveRpzSerial and cleanup aggregate errors instead of exiting.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • dnstapir/pop#81: Overlapping terminology and list-type updates (allowlist/denylist/doubtlist) across the same files.
  • dnstapir/pop#131: Overlaps with Makefile and build/packaging layout changes.
  • dnstapir/pop#71: Related changes to BootstrapMqttSource and MQTT/source handling.

Suggested reviewers

  • jschlyter
  • johanix
  • oej

Poem

🐰 From main to pop I softly hop,
Context in paw, no abrupt stop,
Errors returned, no fatal leap,
A tiny cmd to run and keep,
The rabbit hums — clean start, calm stop.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main refactoring goal: moving the executable entrypoint to cmd/pop and converting the root to a library package, which aligns with the comprehensive changes throughout the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sources.go (1)

243-264: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Data race on threads counter shared between goroutine log line and main-loop decrement.

Inside the xfr goroutine, threads is read (line 245) while the main goroutine concurrently decrements it (line 259). thread (the captured parameter) is the snapshot at launch and is what you actually want for the log message; remove threads from the format args (or pass the remaining count through sourceResult if you want it). go test -race will flag this.

🐛 Proposed fix
-			case "xfr":
-				err = pd.ParseRpzFeed(name, &newsource)
-				pd.Logger.Printf("Thread %d: source \"%s\" now returned from ParseRpzFeed(). %d remaining", thread, name, threads)
+			case "xfr":
+				err = pd.ParseRpzFeed(name, &newsource)
+				pd.Logger.Printf("Thread %d: source %q now returned from ParseRpzFeed()", thread, name)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sources.go` around lines 243 - 264, The goroutine launched to handle sources
reads the shared variable `threads` concurrently (in the log line after
`pd.ParseRpzFeed`) causing a data race; change the goroutine to not access the
shared `threads` variable—use the captured `thread` snapshot for logging (e.g.,
replace the `threads` arg in that log call with `thread`) or, if you need the
remaining count, send it through the `resultCh`/`sourceResult` returned by the
goroutine so only the main loop mutates and reads `threads` (affecting the code
paths around `ParseRpzFeed`, the anonymous goroutine, `thread`, `threads`, and
the `resultCh`/`sourceResult` handling).
🧹 Nitpick comments (2)
refreshengine.go (1)

263-294: ⚡ Quick win

A single missing RRParseFunc tears down the whole refresh engine.

Returning from the periodic refresh loop on rc.RRParseFunc == nil exits RefreshEngine entirely and (via Run) is likely to bubble up as a fatal error, even though every other zone in refreshCounters was healthy. Consider logging and skipping the offending zone (or removing it from refreshCounters) instead of taking down all refresh activity.

♻️ Proposed refactor
 			if rc.CurRefresh <= 0 {
 				upstream = rc.Upstream
 				if rc.RRParseFunc == nil {
-					return fmt.Errorf("RefreshEngine: parsefunc=nil")
+					log.Printf("RefreshEngine: zone %s has nil RRParseFunc, skipping", zone)
+					rc.CurRefresh = rc.SOARefresh
+					continue
 				}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@refreshengine.go` around lines 263 - 294, In RefreshEngine, inside the loop
over refreshCounters where you check rc.RRParseFunc, do not return an error from
RefreshEngine (which kills the whole engine); instead log a clear error
referencing the zone and rc (e.g., in the branch currently doing "return
fmt.Errorf(...)"), then skip that zone (continue) or remove it from
refreshCounters so other zones keep refreshing; update any state (like not
decrementing CurRefresh for removed zones) and ensure you still call downstream
notify/update logic only for successfully parsed/refreshed zones
(functions/variables to touch: RefreshEngine loop, refreshCounters map entries,
rc.RRParseFunc, pd.RpzSources[zone], and Run error handling).
configupdater.go (1)

90-93: ⚡ Quick win

Don't silently swallow RemoveTopic errors.

pd.MqttEngine.RemoveTopic(topic) returns an error that's currently dropped (errcheck flag). At a minimum, log the failure so that a stale subscription on the previous topic is observable; otherwise this update path can silently leave the engine subscribed to an obsolete topic in addition to the new one.

♻️ Proposed fix
 		for _, topic := range wbgl.MqttDetails.Topics {
-			pd.MqttEngine.RemoveTopic(topic)
+			if err := pd.MqttEngine.RemoveTopic(topic); err != nil {
+				pd.Logger.Printf("ProcessTapirGlobalConfig: error removing previous MQTT topic %q: %v", topic, err)
+			}
 			break // Only one topic
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@configupdater.go` around lines 90 - 93, The loop over wbgl.MqttDetails.Topics
calls pd.MqttEngine.RemoveTopic(topic) but discards its error; change this to
capture the returned error from RemoveTopic(topic), check if err != nil and log
the failure with the topic name and error (e.g., using the existing logger in
this package/context) so a stale subscription is observable, then keep the
existing break to only process the first topic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bootstrap.go`:
- Around line 28-37: The early guard on certs.certdir (variable cd from
viper.GetString("certs.certdir")) is now unused and causes bootstrap to fail;
remove the check that returns an error when cd == "" or instead use cd to
construct the key/cert/cacert paths before calling tapir.NewClientConfig (i.e.
derive key and cert by joining cd with relative filenames if your config uses
relative names). Update the code around tapir.NewClientConfig(...) so it no
longer rejects when certs.certdir is empty unless you actually build paths from
it.

In `@lists.go`:
- Around line 68-69: The current code logs unsupported formats using
pd.Logger.Printf and then does "return false", which aborts the entire doubtlist
scan; change this to skip only the offending source by replacing the "return
false" behavior with a continue/skip so the loop proceeds to remaining lists
(i.e., after logging the unsupported list.Format, do not return from the outer
function). Additionally, add validation during source/config loading to reject
or surface unsupported list.Format values early so misconfigured sources are
caught before scanning.

In `@main.go`:
- Around line 182-206: loadConfigFiles currently overwrites cfgFileUsed and
returns only the last merged file, so reloadConfig ends up re-reading only that
one file; change loadConfigFiles (and the callers) to collect and return the
full ordered list of config files used (e.g., []string) instead of a single
string, and update reloadConfig to iterate that list: call viper.SetConfigFile +
viper.ReadInConfig for the base file (tapir.DefaultPopCfgFile) and then
viper.SetConfigFile + viper.MergeInConfig for each of tapir.PopSourcesCfgFile,
tapir.PopOutputsCfgFile, tapir.PopPolicyCfgFile so each file is reloaded in the
same order as startup; ensure you stop treating viper.ConfigFileUsed() as a
single authoritative value and instead log each file as it is read in
loadConfigFiles and on reloadConfig.

In `@refreshengine.go`:
- Around line 113-127: The success status is being sent even when
pd.ProcessTapirUpdate(tm) fails; update the code in the RefreshEngine handler
around the call to pd.ProcessTapirUpdate so that the StatusOK publish only
happens in the success branch (make the second pd.ComponentStatusCh <-
tapir.ComponentStatusUpdate and subsequent log.Printf run in an else block), and
replace fmt.Sprintf("ProcessTapirUpdate: MQTT observation message received")
with the raw string literal to satisfy staticcheck (remove the unnecessary
fmt.Sprintf). Ensure the failing branch still sends the StatusFail update and
logs the error as before.

In `@statusupdater.go`:
- Around line 41-43: The worker currently reuses pd.MqttEngine and then calls
StartMqttEngine(me), which double-starts the same instance; modify the worker so
it does not start an already running engine: if pd.MqttEngine is non-nil, use it
as-is and remove the StartMqttEngine(me) call (or, if you truly need an
independent status-only engine, create a new instance variable instead of
aliasing pd.MqttEngine and call StartMqttEngine on that new instance); update
all occurrences where StartMqttEngine is invoked on pd.MqttEngine (including the
block referencing me and the similar block at lines 85-90) to follow this
approach, leaving Run as the sole initializer of the shared engine.

---

Outside diff comments:
In `@sources.go`:
- Around line 243-264: The goroutine launched to handle sources reads the shared
variable `threads` concurrently (in the log line after `pd.ParseRpzFeed`)
causing a data race; change the goroutine to not access the shared `threads`
variable—use the captured `thread` snapshot for logging (e.g., replace the
`threads` arg in that log call with `thread`) or, if you need the remaining
count, send it through the `resultCh`/`sourceResult` returned by the goroutine
so only the main loop mutates and reads `threads` (affecting the code paths
around `ParseRpzFeed`, the anonymous goroutine, `thread`, `threads`, and the
`resultCh`/`sourceResult` handling).

---

Nitpick comments:
In `@configupdater.go`:
- Around line 90-93: The loop over wbgl.MqttDetails.Topics calls
pd.MqttEngine.RemoveTopic(topic) but discards its error; change this to capture
the returned error from RemoveTopic(topic), check if err != nil and log the
failure with the topic name and error (e.g., using the existing logger in this
package/context) so a stale subscription is observable, then keep the existing
break to only process the first topic.

In `@refreshengine.go`:
- Around line 263-294: In RefreshEngine, inside the loop over refreshCounters
where you check rc.RRParseFunc, do not return an error from RefreshEngine (which
kills the whole engine); instead log a clear error referencing the zone and rc
(e.g., in the branch currently doing "return fmt.Errorf(...)"), then skip that
zone (continue) or remove it from refreshCounters so other zones keep
refreshing; update any state (like not decrementing CurRefresh for removed
zones) and ensure you still call downstream notify/update logic only for
successfully parsed/refreshed zones (functions/variables to touch: RefreshEngine
loop, refreshCounters map entries, rc.RRParseFunc, pd.RpzSources[zone], and Run
error handling).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3d5ef4e-977a-4121-951f-0cd2f11669eb

📥 Commits

Reviewing files that changed from the base of the PR and between c114078 and 0b1d6ac.

📒 Files selected for processing (21)
  • .gitignore
  • LIBRARY-REFACTOR.md
  • Makefile
  • apihandler.go
  • bootstrap.go
  • cmd/pop/main.go
  • config.go
  • configupdater.go
  • dnshandler.go
  • lists.go
  • logging.go
  • main.go
  • mqtt.go
  • policy.go
  • reaper.go
  • refreshengine.go
  • rpz.go
  • sources.go
  • statusupdater.go
  • structs.go
  • xfr.go

Comment thread bootstrap.go Outdated
Comment thread lists.go Outdated
Comment thread main.go Outdated
Comment thread refreshengine.go
Comment thread statusupdater.go
Reload all config files on SIGHUP, avoid double-starting the shared MQTT engine, and prevent successful observation status updates from masking processing failures.

Also skip malformed doubtlist sources, log stale MQTT topic removal failures, remove a stale certdir requirement, and avoid the source parser thread counter race.
@linkdata
Copy link
Copy Markdown
Author

linkdata commented May 8, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jschlyter jschlyter requested review from johanix and zluudg May 8, 2026 12:00
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
refreshengine.go (1)

436-480: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Blocking sends to pd.ComponentStatusCh inside NotifyDownstreams can deadlock RefreshEngine during shutdown.

NotifyDownstreams has no ctx parameter. Each send to pd.ComponentStatusCh (lines 456, 463, 470, 476) is an unconditional blocking send. If StatusUpdater exits due to initialization error or context cancellation without draining the channel, and RefreshEngine is mid-iteration through pd.Downstreams (called from select cases at lines 185, 291, 322), the function will block indefinitely once the 10-element buffer fills—ctx.Done() is not monitored inside NotifyDownstreams, so the goroutine cannot be interrupted.

The same risk applies to inline sends on lines 115, 122, and 132 inside RefreshEngine's select cases.

Recommended fix: pass ctx to NotifyDownstreams and wrap each send:

🛡️ Proposed fix
-func (pd *PopData) NotifyDownstreams() error {
+func (pd *PopData) NotifyDownstreams(ctx context.Context) error {

Replace each blocking send:

-    pd.ComponentStatusCh <- csu
+    select {
+    case pd.ComponentStatusCh <- csu:
+    case <-ctx.Done():
+        return ctx.Err()
+    }

Update call sites to pass ctx.

Alternatively, use non-blocking sends:

-    pd.ComponentStatusCh <- csu
+    select {
+    case pd.ComponentStatusCh <- csu:
+    default:
+        pd.Logger.Printf("NotifyDownstreams: ComponentStatusCh full, dropping status update")
+    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@refreshengine.go` around lines 436 - 480, NotifyDownstreams currently
performs unconditional blocking sends to pd.ComponentStatusCh (and RefreshEngine
has inline unconditional sends) which can deadlock during shutdown; change
NotifyDownstreams signature to accept a context (ctx context.Context) and update
all call sites (the RefreshEngine select cases that invoke NotifyDownstreams) to
pass the same ctx, then replace each direct send to pd.ComponentStatusCh inside
NotifyDownstreams and the inline sends in RefreshEngine with a select that
either sends the message or aborts on ctx.Done() (or uses a non-blocking default
branch if you prefer); this prevents blocking when the StatusUpdater has exited
or the context is cancelled while preserving existing status reporting behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@refreshengine.go`:
- Around line 70-78: The drain loop currently ignores rpzcmdch and never
fulfills the RpzCmdData's cmd.Result, leaking any senders blocked on cmd.Result;
modify the loop that selects on ctx.Done() and zonerefch to also select on
rpzcmdch, and for each received RpzCmdData (or pointer) write a terminal result
(e.g., an error value or a closed channel indication) to cmd.Result so callers
are unblocked; ensure you handle nil/closed cmd.Result safely and document the
chosen error/result value so callers can detect the inactive-engine case.

---

Outside diff comments:
In `@refreshengine.go`:
- Around line 436-480: NotifyDownstreams currently performs unconditional
blocking sends to pd.ComponentStatusCh (and RefreshEngine has inline
unconditional sends) which can deadlock during shutdown; change
NotifyDownstreams signature to accept a context (ctx context.Context) and update
all call sites (the RefreshEngine select cases that invoke NotifyDownstreams) to
pass the same ctx, then replace each direct send to pd.ComponentStatusCh inside
NotifyDownstreams and the inline sends in RefreshEngine with a select that
either sends the message or aborts on ctx.Done() (or uses a non-blocking default
branch if you prefer); this prevents blocking when the StatusUpdater has exited
or the context is cancelled while preserving existing status reporting behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8f2e2a7-2a34-4674-8a79-9ccf9b47d3e0

📥 Commits

Reviewing files that changed from the base of the PR and between 0b1d6ac and 33b6a05.

📒 Files selected for processing (8)
  • bootstrap.go
  • cmd/pop/main.go
  • configupdater.go
  • lists.go
  • main.go
  • refreshengine.go
  • sources.go
  • statusupdater.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • lists.go
  • bootstrap.go
  • main.go
  • cmd/pop/main.go
  • configupdater.go
  • sources.go

Comment thread refreshengine.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant