Current Behavior
While deflaking t/plugin/log-rotate.t TEST 4 (#13536), I found that the flake masks a real problem: after log-rotate is disabled via PUT /apisix/admin/plugins/reload, the rotate timer in the privileged agent sometimes keeps running, apparently indefinitely. On unmodified master I observed rotation continuing for 55+ seconds after the reload returned, in roughly 1 of 6 runs on a loaded machine.
The original test never caught this because it counts *__error.log files and the count plateaus at max_kept: 3 while rotation keeps cycling (rotate one in, delete the oldest), so the assertion passes even when the disable never took effect.
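To illustrate why the count-based assertion cannot see this, here is a minimal simulation (not APISIX code) of "rotate one in, delete the oldest" with max_kept set to 3. The file names are hypothetical stand-ins for the timestamped rotated logs; only the *__error.log suffix comes from the test.

```python
from collections import deque


def simulate_rotation(rotations, max_kept=3):
    """Yield (file_count, sorted_name_signature) after each rotation.

    Names are hypothetical stand-ins for timestamped rotated log files.
    """
    kept = deque()
    for tick in range(rotations):
        kept.append(f"t{tick:03d}__error.log")  # rotate one in
        while len(kept) > max_kept:
            kept.popleft()  # delete the oldest
        yield len(kept), tuple(sorted(kept))


samples = list(simulate_rotation(6))
counts = [count for count, _ in samples]
signatures = [signature for _, signature in samples]

# The count stops moving once max_kept is reached, while the
# signature is different after every single rotation.
print(counts)
print(len(set(signatures)))
```

Once the count reaches max_kept it never changes again, so a count assertion passes whether or not the timer was stopped. The name set, by contrast, differs after every rotation.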
Two details point at event delivery rather than the unload path:
- comparing the timestamped rotated file-name signature (which changes on every rotation) shows rotation continuing across a 50+ second observation window after a successful-looking reload;
- re-sending the same reload request reliably stops the rotation, so _M.destroy() → timers.unregister_timer() works fine once the event actually arrives.
This looks like the same family as the plugins-reload.t TEST 1 timing flake (hardened in #13332): the reload event is broadcast asynchronously, and the privileged agent can apparently miss it entirely under load — not just receive it late. Any plugin that registers privileged-agent timers (log-rotate, server-info, etc.) would keep its timer alive after being removed from the config.
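The re-send observation can be sketched as a test-side workaround (it hides the missed event rather than fixing it). Both callables are hypothetical and injected by the caller: send_reload would issue PUT /apisix/admin/plugins/reload, and rotation_stopped would report whether the rotated file-name signature has stayed unchanged for the observation window.

```python
def reload_until_stopped(send_reload, rotation_stopped, max_attempts=3):
    """Re-send the reload until rotation is observed to stop.

    send_reload: hypothetical callable that issues the plugins reload request.
    rotation_stopped: hypothetical callable returning True once the rotated
        file-name signature no longer changes.
    Returns the number of reloads sent; raises if rotation never stopped.
    """
    for attempt in range(1, max_attempts + 1):
        send_reload()
        if rotation_stopped():
            return attempt
    raise RuntimeError(f"rotation still running after {max_attempts} reloads")
```

A return value greater than 1 means the first reload did not take effect in the privileged agent. That is the signal this issue is about, so it is worth logging rather than silently retrying.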
Expected Behavior
A plugins reload should reliably reach every worker, including the privileged agent, even if delivery is delayed, so that timers of removed plugins stop.
Steps to Reproduce
- Patch t/plugin/log-rotate.t TEST 4 to compare the sorted rotated file-name set instead of the count, with a 50-iteration (~55s) observation window after the reload (no retry).
- Run the file repeatedly on a busy machine with prove -Itest-nginx/lib -I./ t/plugin/log-rotate.t; every few runs, the signature keeps changing for the whole window.
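The signature comparison from the first step can be sketched outside the test harness as follows. This is an illustration, not the actual patch to TEST 4 (which lives in the .t file). The function names and the 1.1-second interval are assumptions, chosen so that 50 iterations give roughly the 55-second window.

```python
import fnmatch
import os
import time


def rotated_signature(log_dir, pattern="*__error.log"):
    """Return the sorted tuple of rotated log file names in log_dir."""
    names = os.listdir(log_dir)
    return tuple(sorted(n for n in names if fnmatch.fnmatch(n, pattern)))


def last_change_iteration(log_dir, iterations=50, interval=1.1, sleep=time.sleep):
    """Watch log_dir after a reload and report when rotation last happened.

    Returns the 1-based iteration at which the signature last changed,
    or 0 if it never changed during the window.
    """
    previous = rotated_signature(log_dir)
    last_change = 0
    for i in range(1, iterations + 1):
        sleep(interval)
        current = rotated_signature(log_dir)
        if current != previous:
            last_change = i
            previous = current
    return last_change
```

A result near 0 means rotation stopped shortly after the reload. A result close to the iteration count means the signature kept changing for the whole window, which is the failure described here.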
Environment
- APISIX version: master (cb72d0e / eb1af76)
- OS: Linux x86_64
- OpenResty: openresty/1.29.2.4