refactor: move entrypoint to cmd/pop and convert root to library package#144
refactor: move entrypoint to cmd/pop and convert root to library package#144linkdata wants to merge 3 commits into
Conversation
Refactor the repository to separate the executable entrypoint from the library logic. The root package is renamed to `pop` and exposes a `Run` API, while `cmd/pop/main.go` handles process concerns like signals and exit codes. Key changes: - Move `main` logic to `cmd/pop/main.go` - Rename root package from `main` to `pop` - Replace `os.Exit`, `log.Fatal`, and `panic` with error returns in library code - Update `Makefile` to build from `./cmd/pop` and use local Go cache - Add `LIBRARY-REFACTOR.md` documenting the changes and API
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughRefactors the repo into a reusable library package ChangesLibrary Refactor
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sources.go (1)
243-264:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winData race on
threadscounter shared between goroutine log line and main-loop decrement.Inside the
xfrgoroutine,threadsis read (line 245) while the main goroutine concurrently decrements it (line 259).thread(the captured parameter) is the snapshot at launch and is what you actually want for the log message; removethreadsfrom the format args (or pass the remaining count throughsourceResultif you want it).go test -racewill flag this.🐛 Proposed fix
- case "xfr": - err = pd.ParseRpzFeed(name, &newsource) - pd.Logger.Printf("Thread %d: source \"%s\" now returned from ParseRpzFeed(). %d remaining", thread, name, threads) + case "xfr": + err = pd.ParseRpzFeed(name, &newsource) + pd.Logger.Printf("Thread %d: source %q now returned from ParseRpzFeed()", thread, name)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sources.go` around lines 243 - 264, The goroutine launched to handle sources reads the shared variable `threads` concurrently (in the log line after `pd.ParseRpzFeed`) causing a data race; change the goroutine to not access the shared `threads` variable—use the captured `thread` snapshot for logging (e.g., replace the `threads` arg in that log call with `thread`) or, if you need the remaining count, send it through the `resultCh`/`sourceResult` returned by the goroutine so only the main loop mutates and reads `threads` (affecting the code paths around `ParseRpzFeed`, the anonymous goroutine, `thread`, `threads`, and the `resultCh`/`sourceResult` handling).
🧹 Nitpick comments (2)
refreshengine.go (1)
263-294: ⚡ Quick winA single missing
RRParseFunctears down the whole refresh engine.Returning from the periodic refresh loop on
rc.RRParseFunc == nilexitsRefreshEngineentirely and (viaRun) is likely to bubble up as a fatal error, even though every other zone inrefreshCounterswas healthy. Consider logging and skipping the offending zone (or removing it fromrefreshCounters) instead of taking down all refresh activity.♻️ Proposed refactor
if rc.CurRefresh <= 0 { upstream = rc.Upstream if rc.RRParseFunc == nil { - return fmt.Errorf("RefreshEngine: parsefunc=nil") + log.Printf("RefreshEngine: zone %s has nil RRParseFunc, skipping", zone) + rc.CurRefresh = rc.SOARefresh + continue }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@refreshengine.go` around lines 263 - 294, In RefreshEngine, inside the loop over refreshCounters where you check rc.RRParseFunc, do not return an error from RefreshEngine (which kills the whole engine); instead log a clear error referencing the zone and rc (e.g., in the branch currently doing "return fmt.Errorf(...)"), then skip that zone (continue) or remove it from refreshCounters so other zones keep refreshing; update any state (like not decrementing CurRefresh for removed zones) and ensure you still call downstream notify/update logic only for successfully parsed/refreshed zones (functions/variables to touch: RefreshEngine loop, refreshCounters map entries, rc.RRParseFunc, pd.RpzSources[zone], and Run error handling).configupdater.go (1)
90-93: ⚡ Quick winDon't silently swallow
RemoveTopicerrors.
pd.MqttEngine.RemoveTopic(topic)returns an error that's currently dropped (errcheck flag). At a minimum, log the failure so that a stale subscription on the previous topic is observable; otherwise this update path can silently leave the engine subscribed to an obsolete topic in addition to the new one.♻️ Proposed fix
for _, topic := range wbgl.MqttDetails.Topics { - pd.MqttEngine.RemoveTopic(topic) + if err := pd.MqttEngine.RemoveTopic(topic); err != nil { + pd.Logger.Printf("ProcessTapirGlobalConfig: error removing previous MQTT topic %q: %v", topic, err) + } break // Only one topic }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@configupdater.go` around lines 90 - 93, The loop over wbgl.MqttDetails.Topics calls pd.MqttEngine.RemoveTopic(topic) but discards its error; change this to capture the returned error from RemoveTopic(topic), check if err != nil and log the failure with the topic name and error (e.g., using the existing logger in this package/context) so a stale subscription is observable, then keep the existing break to only process the first topic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bootstrap.go`:
- Around line 28-37: The early guard on certs.certdir (variable cd from
viper.GetString("certs.certdir")) is now unused and causes bootstrap to fail;
remove the check that returns an error when cd == "" or instead use cd to
construct the key/cert/cacert paths before calling tapir.NewClientConfig (i.e.
derive key and cert by joining cd with relative filenames if your config uses
relative names). Update the code around tapir.NewClientConfig(...) so it no
longer rejects when certs.certdir is empty unless you actually build paths from
it.
In `@lists.go`:
- Around line 68-69: The current code logs unsupported formats using
pd.Logger.Printf and then does "return false", which aborts the entire doubtlist
scan; change this to skip only the offending source by replacing the "return
false" behavior with a continue/skip so the loop proceeds to remaining lists
(i.e., after logging the unsupported list.Format, do not return from the outer
function). Additionally, add validation during source/config loading to reject
or surface unsupported list.Format values early so misconfigured sources are
caught before scanning.
In `@main.go`:
- Around line 182-206: loadConfigFiles currently overwrites cfgFileUsed and
returns only the last merged file, so reloadConfig ends up re-reading only that
one file; change loadConfigFiles (and the callers) to collect and return the
full ordered list of config files used (e.g., []string) instead of a single
string, and update reloadConfig to iterate that list: call viper.SetConfigFile +
viper.ReadInConfig for the base file (tapir.DefaultPopCfgFile) and then
viper.SetConfigFile + viper.MergeInConfig for each of tapir.PopSourcesCfgFile,
tapir.PopOutputsCfgFile, tapir.PopPolicyCfgFile so each file is reloaded in the
same order as startup; ensure you stop treating viper.ConfigFileUsed() as a
single authoritative value and instead log each file as it is read in
loadConfigFiles and on reloadConfig.
In `@refreshengine.go`:
- Around line 113-127: The success status is being sent even when
pd.ProcessTapirUpdate(tm) fails; update the code in the RefreshEngine handler
around the call to pd.ProcessTapirUpdate so that the StatusOK publish only
happens in the success branch (make the second pd.ComponentStatusCh <-
tapir.ComponentStatusUpdate and subsequent log.Printf run in an else block), and
replace fmt.Sprintf("ProcessTapirUpdate: MQTT observation message received")
with the raw string literal to satisfy staticcheck (remove the unnecessary
fmt.Sprintf). Ensure the failing branch still sends the StatusFail update and
logs the error as before.
In `@statusupdater.go`:
- Around line 41-43: The worker currently reuses pd.MqttEngine and then calls
StartMqttEngine(me), which double-starts the same instance; modify the worker so
it does not start an already running engine: if pd.MqttEngine is non-nil, use it
as-is and remove the StartMqttEngine(me) call (or, if you truly need an
independent status-only engine, create a new instance variable instead of
aliasing pd.MqttEngine and call StartMqttEngine on that new instance); update
all occurrences where StartMqttEngine is invoked on pd.MqttEngine (including the
block referencing me and the similar block at lines 85-90) to follow this
approach, leaving Run as the sole initializer of the shared engine.
---
Outside diff comments:
In `@sources.go`:
- Around line 243-264: The goroutine launched to handle sources reads the shared
variable `threads` concurrently (in the log line after `pd.ParseRpzFeed`)
causing a data race; change the goroutine to not access the shared `threads`
variable—use the captured `thread` snapshot for logging (e.g., replace the
`threads` arg in that log call with `thread`) or, if you need the remaining
count, send it through the `resultCh`/`sourceResult` returned by the goroutine
so only the main loop mutates and reads `threads` (affecting the code paths
around `ParseRpzFeed`, the anonymous goroutine, `thread`, `threads`, and the
`resultCh`/`sourceResult` handling).
---
Nitpick comments:
In `@configupdater.go`:
- Around line 90-93: The loop over wbgl.MqttDetails.Topics calls
pd.MqttEngine.RemoveTopic(topic) but discards its error; change this to capture
the returned error from RemoveTopic(topic), check if err != nil and log the
failure with the topic name and error (e.g., using the existing logger in this
package/context) so a stale subscription is observable, then keep the existing
break to only process the first topic.
In `@refreshengine.go`:
- Around line 263-294: In RefreshEngine, inside the loop over refreshCounters
where you check rc.RRParseFunc, do not return an error from RefreshEngine (which
kills the whole engine); instead log a clear error referencing the zone and rc
(e.g., in the branch currently doing "return fmt.Errorf(...)"), then skip that
zone (continue) or remove it from refreshCounters so other zones keep
refreshing; update any state (like not decrementing CurRefresh for removed
zones) and ensure you still call downstream notify/update logic only for
successfully parsed/refreshed zones (functions/variables to touch: RefreshEngine
loop, refreshCounters map entries, rc.RRParseFunc, pd.RpzSources[zone], and Run
error handling).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d3d5ef4e-977a-4121-951f-0cd2f11669eb
📒 Files selected for processing (21)
.gitignoreLIBRARY-REFACTOR.mdMakefileapihandler.gobootstrap.gocmd/pop/main.goconfig.goconfigupdater.godnshandler.golists.gologging.gomain.gomqtt.gopolicy.goreaper.gorefreshengine.gorpz.gosources.gostatusupdater.gostructs.goxfr.go
Reload all config files on SIGHUP, avoid double-starting the shared MQTT engine, and prevent successful observation status updates from masking processing failures. Also skip malformed doubtlist sources, log stale MQTT topic removal failures, remove a stale certdir requirement, and avoid the source parser thread counter race.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
refreshengine.go (1)
436-480:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftBlocking sends to
pd.ComponentStatusChinsideNotifyDownstreamscan deadlockRefreshEngineduring shutdown.
NotifyDownstreamshas noctxparameter. Each send topd.ComponentStatusCh(lines 456, 463, 470, 476) is an unconditional blocking send. IfStatusUpdaterexits due to initialization error or context cancellation without draining the channel, andRefreshEngineis mid-iteration throughpd.Downstreams(called from select cases at lines 185, 291, 322), the function will block indefinitely once the 10-element buffer fills—ctx.Done()is not monitored insideNotifyDownstreams, so the goroutine cannot be interrupted.The same risk applies to inline sends on lines 115, 122, and 132 inside
RefreshEngine's select cases.Recommended fix: pass
ctxtoNotifyDownstreamsand wrap each send:🛡️ Proposed fix
-func (pd *PopData) NotifyDownstreams() error { +func (pd *PopData) NotifyDownstreams(ctx context.Context) error {Replace each blocking send:
- pd.ComponentStatusCh <- csu + select { + case pd.ComponentStatusCh <- csu: + case <-ctx.Done(): + return ctx.Err() + }Update call sites to pass
ctx.Alternatively, use non-blocking sends:
- pd.ComponentStatusCh <- csu + select { + case pd.ComponentStatusCh <- csu: + default: + pd.Logger.Printf("NotifyDownstreams: ComponentStatusCh full, dropping status update") + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@refreshengine.go` around lines 436 - 480, NotifyDownstreams currently performs unconditional blocking sends to pd.ComponentStatusCh (and RefreshEngine has inline unconditional sends) which can deadlock during shutdown; change NotifyDownstreams signature to accept a context (ctx context.Context) and update all call sites (the RefreshEngine select cases that invoke NotifyDownstreams) to pass the same ctx, then replace each direct send to pd.ComponentStatusCh inside NotifyDownstreams and the inline sends in RefreshEngine with a select that either sends the message or aborts on ctx.Done() (or uses a non-blocking default branch if you prefer); this prevents blocking when the StatusUpdater has exited or the context is cancelled while preserving existing status reporting behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@refreshengine.go`:
- Around line 70-78: The drain loop currently ignores rpzcmdch and never
fulfills the RpzCmdData's cmd.Result, leaking any senders blocked on cmd.Result;
modify the loop that selects on ctx.Done() and zonerefch to also select on
rpzcmdch, and for each received RpzCmdData (or pointer) write a terminal result
(e.g., an error value or a closed channel indication) to cmd.Result so callers
are unblocked; ensure you handle nil/closed cmd.Result safely and document the
chosen error/result value so callers can detect the inactive-engine case.
---
Outside diff comments:
In `@refreshengine.go`:
- Around line 436-480: NotifyDownstreams currently performs unconditional
blocking sends to pd.ComponentStatusCh (and RefreshEngine has inline
unconditional sends) which can deadlock during shutdown; change
NotifyDownstreams signature to accept a context (ctx context.Context) and update
all call sites (the RefreshEngine select cases that invoke NotifyDownstreams) to
pass the same ctx, then replace each direct send to pd.ComponentStatusCh inside
NotifyDownstreams and the inline sends in RefreshEngine with a select that
either sends the message or aborts on ctx.Done() (or uses a non-blocking default
branch if you prefer); this prevents blocking when the StatusUpdater has exited
or the context is cancelled while preserving existing status reporting behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c8f2e2a7-2a34-4674-8a79-9ccf9b47d3e0
📒 Files selected for processing (8)
bootstrap.gocmd/pop/main.goconfigupdater.golists.gomain.gorefreshengine.gosources.gostatusupdater.go
🚧 Files skipped from review as they are similar to previous changes (6)
- lists.go
- bootstrap.go
- main.go
- cmd/pop/main.go
- configupdater.go
- sources.go
The motivation for this is making end-to-end integration testing of the entire DNS Tapir chain easier.
This was AI-generated and since this package has no tests it needs extra scrutiny before considering a merge.
Refactor
dnstapir-popIntocmd/pop+ RootpopPackageSummary
Move the executable entrypoint to
cmd/popand convert the repo root into packagepop. The root package exposes a library-styleRunAPI, whilecmd/pop/main.goowns process concerns such as signals, stderr reporting, and exit codes.Public API And Layout
package pop; do not create a/popsubdirectory.cmd/pop/main.godefines the ldflag-backedname,version, andcommitvars, installs SIGINT/SIGTERM context cancellation, converts SIGHUP intoRunOptions.Reload, callspop.Run, and usesos.Exit(1)only on returned errors.Gconfig,mqttclientid, andPOPExiter.Implementation Changes
package maintopackage pop; move the old startup logic out ofmain()intoRun.Makefilesomake buildrunsgo build ... -o out/dnstapir-pop ./cmd/pop; keep package/install artifacts and binary name unchanged.popwith returned errors:SetupLogging, config validation/loading, MQTT setup/start, source parsing, output parsing, bootstrap setup, and policy parsing return contextual errors.log.Fatal,os.Exit,panic,log.Panicf, andPOPExiterdisappear from packagepopfor normal failure paths.DnsEngine(ctx, *Config) errorAPIhandler(ctx, *Config) errorConfigUpdater(ctx, *Config) errorStatusUpdater(ctx, *Config) errorRefreshEngine(ctx, *Config) errorRunstarts workers, watches worker errors, context cancellation, API stop requests, and reload events; it saves the RPZ serial before returning.MqttEngine.SetupInterruptHandler()from the library path; shutdown should flow throughRunandStopEngine.configupdater.go,refreshengine.go, andstatusupdater.goas part of makinggo test ./...meaningful after the refactor.Test Plan
gofmton touched Go files.go list ./...and confirm packages includednstapir-popanddnstapir-pop/cmd/pop.go test ./...; expected target is green after the pre-existing vet issues are fixed.make buildand confirmout/dnstapir-popis produced with ldflag metadata wired throughcmd/pop.POPExiter, root-packageos.Exit,log.Fatal, and normal-errorpanicusages are gone.Assumptions
pop.Run; existing exported helper names can remain, but signatures may gaincontext.Contextorerrorto satisfy the strict library API.cmd/popmay terminate the process; packagepopreports failures to its caller.Summary by CodeRabbit
New Features
Refactor
Chores
Improvements