Enable dynamic changes in the router for non-kube environments by nluaces · Pull Request #2356 · skupperproject/skupper

nluaces · 2026-01-18T17:56:27Z

Implements #2337
This branch is based on #2334; the commits that actually implement this issue are bb8ab35 and those that follow it.

flowchart TD
   id1([new listener resource is created])-->|triggers|InputResourceHandler-->Q{site exists?}
   Q -->|Yes| A[Refresh Router Config]
   Q -->|No| B[Bootstrap]

flowchart LR
  SystemAdaptorHandler-->|has a callback|RouterStatusHandler
  SystemAdaptorHandler-->|spawns|processRouterConfig-->id2([reconciles the router config file with the router])

The InputResourceHandler no longer bootstraps when a site has already been created in the namespace.
The change also adds validations for the start, stop and reload commands, which should be disabled while dynamic changes are being applied.

fgiorgetti

My impression so far is that it is on the right track.
Really nice work @nluaces!

A few other comments I have so far:

The site name in the router config is changing (we need to check whether this will cause issues for the console or vanflow metrics).
As I already noted, we still need to figure out how to handle the system controller update and the update of the router container for the controlled namespaces.

internal/nonkube/common/site_state_renderer_common.go

internal/nonkube/compat/site_state_renderer.go

fgiorgetti · 2026-02-06T14:18:50Z

internal/nonkube/compat/site_state_renderer.go

+
+	// the container endpoint is mapped to the podman socket inside the container
+	if api.IsRunningInContainer() {
+		endpoint = "unix:///var/run/podman.sock"


If running in a container, the endpoint should be provided through:

os.Getenv("CONTAINER_ENDPOINT")

Then, if the endpoint is not set, you can use the defaults you have below.

Maybe we could have a function under pkg/nonkube/api/environment.go that takes the platform as an argument and returns the correct container endpoint, eliminating the duplicates we have (I should have done that earlier).
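A minimal sketch of what such a helper might look like. Everything here is an assumption made for illustration: the name `containerEndpoint`, its signature, the injected `getenv` function and the host-side fallback values are not taken from the Skupper code base. Only the `CONTAINER_ENDPOINT` variable and the `/var/run/<platform>.sock` destination come from this thread.

```go
package main

import (
	"fmt"
	"os"
)

// containerEndpoint resolves the container engine endpoint for a platform.
// Hypothetical helper: name, signature and fallback values are assumptions.
func containerEndpoint(platform string, inContainer bool, getenv func(string) string) string {
	// An explicitly configured endpoint always wins.
	if endpoint := getenv("CONTAINER_ENDPOINT"); endpoint != "" {
		return endpoint
	}
	if inContainer {
		// Mirrors the /var/run/<platform>.sock destination of the socket bind mount.
		return fmt.Sprintf("unix:///var/run/%s.sock", platform)
	}
	if platform == "docker" {
		return "unix:///var/run/docker.sock"
	}
	// Assumed rootless podman default; rootful setups usually differ.
	return fmt.Sprintf("unix:///run/user/%d/podman/podman.sock", os.Getuid())
}

func main() {
	fmt.Println(containerEndpoint("podman", true, os.Getenv))
	fmt.Println(containerEndpoint("docker", false, os.Getenv))
}
```

Passing `getenv` as a parameter keeps the function easy to unit test without touching the process environment.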

When skupper system install runs, it creates a bind mount between the endpoint given in the CONTAINER_ENDPOINT env variable and a volume destination that starts with /var/run/ (I believe the bootstrap script works the same way):

skupper/internal/nonkube/bootstrap/install.go

Lines 80 to 104 in ce10f83

	//To mount a volume as a bind, the host path must be specified in the Name field
	//instead of the Source field. If the values Name/Destination are empty, volumes will be ignored, not mounted,
	//and the system-controller container will fail to start.
	mounts := []container.Volume{}
	volumeDestination := fmt.Sprintf("/var/run/%s.sock", platform)
	if strings.HasPrefix(config.containerEndpoint, "unix://") {
		socketPath := strings.TrimPrefix(config.containerEndpoint, "unix://")
		mounts = append(mounts, container.Volume{
			Name:        socketPath,
			Destination: volumeDestination,
			Mode:        "z",
			RW:          true,
		})
	} else if strings.HasPrefix(config.containerEndpoint, "/") {
		mounts = append(mounts, container.Volume{
			Name:        config.containerEndpoint,
			Destination: volumeDestination,
			Mode:        "z",
			RW:          true,
		})
	}

If we use the env variable instead of the mapped endpoint when running inside the container, I think it is not going to work as expected.

Actually, you brought up a good point. For auto-reloads, the system-controller container must know all endpoints on the host machine beforehand. So when the system-controller container is created, it needs to know the PODMAN_CONTAINER_ENDPOINT as well as the DOCKER_CONTAINER_ENDPOINT.

This way, if the system controller is running with podman but a docker site is created, the system controller already knows which host endpoint to map into the target router container.
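One way to picture this is a small lookup that the controller fills in at startup. The sketch below is hypothetical: `hostEndpoints` and `endpointFor` are made-up names, and it assumes the two variables mentioned above are simply passed to the controller as environment variables.

```go
package main

import (
	"fmt"
	"os"
)

// hostEndpoints collects the host endpoints handed to the controller when it
// was created, keyed by platform. Platforms without a value are omitted.
func hostEndpoints(getenv func(string) string) map[string]string {
	vars := map[string]string{
		"podman": "PODMAN_CONTAINER_ENDPOINT",
		"docker": "DOCKER_CONTAINER_ENDPOINT",
	}
	endpoints := make(map[string]string)
	for platform, name := range vars {
		if value := getenv(name); value != "" {
			endpoints[platform] = value
		}
	}
	return endpoints
}

// endpointFor returns the host endpoint to map into a router container that
// runs on the given platform, or an error if none was configured.
func endpointFor(endpoints map[string]string, sitePlatform string) (string, error) {
	endpoint, ok := endpoints[sitePlatform]
	if !ok {
		return "", fmt.Errorf("no host endpoint configured for platform %q", sitePlatform)
	}
	return endpoint, nil
}

func main() {
	endpoints := hostEndpoints(os.Getenv)
	endpoint, err := endpointFor(endpoints, "docker")
	if err != nil {
		fmt.Println("cannot initialize site:", err)
		return
	}
	fmt.Println("mapping host endpoint", endpoint)
}
```

Returning an error for an unconfigured platform also covers the simpler alternative, where a podman controller just refuses to initialize docker sites.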

Or... we could simply argue that if the system controller is using podman, it will only initialize podman sites.

It would be really nice if a system controller could operate with different platforms. I would prefer to have that change implemented in a separate pull request, though. What do you think?

internal/nonkube/controller/input_resource_handler.go

internal/nonkube/controller/namespace_controller.go

fgiorgetti · 2026-02-06T19:05:02Z

internal/nonkube/controller/input_resource_handler.go

+	defer h.lock.Unlock()
+
+	_, err := os.Stat(api.GetInternalOutputPath(h.namespace, api.RuntimeSiteStatePath))
+	if err == nil {


I believe an extra validation is needed here.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image. If not, then we need to "reload" the site.

This will be needed in cases when the system controller is updated.

By the way, we still need to handle system controller updates through the CLI.
I am not sure if it could be part of the system install command, through a flag,
or if it should be a separate command like reinstall.

Anyway, I believe we need to think about it: it will be needed because existing sites
cannot be reloaded when automated reloads are in use.

If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image.

I need some clarification: what is considered the correct image? The one specified in the env variable? Or should we simply check that a router is running for that namespace?

Maybe we can handle updates separately from this initial PR, so it will be simpler to evaluate.
Let's just raise an enhancement issue for now, so we can think more about it.
Or we could do something similar to what the kube controller is doing, which consists of comparing
the version defined at the controller with the version defined for a given site/namespace. If the controller
has a newer version, then we could use the same router image we use when a new site is created.
But again, better to keep it separate. WDYT?

I agree with keeping this change separate. We also need to define the CLI improvement for it.
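For whenever that follow-up is picked up, the version comparison idea could look roughly like this. It is a sketch under assumptions: versions are plain `major.minor.patch` strings, and `needsRouterUpdate` is a hypothetical name. How the kube controller actually compares versions is not shown in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "2.1.0" or "v2.1.0" into numeric components.
// Pre-release suffixes such as "2.1.0-rc1" are rejected in this sketch.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	if len(parts) != 3 {
		return out, fmt.Errorf("version %q is not major.minor.patch", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// needsRouterUpdate reports whether the controller version is strictly newer
// than the version recorded for a site/namespace.
func needsRouterUpdate(controllerVersion, siteVersion string) (bool, error) {
	c, err := parseVersion(controllerVersion)
	if err != nil {
		return false, err
	}
	s, err := parseVersion(siteVersion)
	if err != nil {
		return false, err
	}
	// Compare major, then minor, then patch; the first difference decides.
	for i := range c {
		if c[i] != s[i] {
			return c[i] > s[i], nil
		}
	}
	return false, nil
}

func main() {
	update, err := needsRouterUpdate("2.1.0", "2.0.3")
	fmt.Println(update, err)
}
```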

internal/qdr/amqp_mgmt.go

nluaces · 2026-02-10T15:32:09Z

The site name in the router config is changing (we need to check if this will cause issues to the console or vanflow metrics)

I checked, and the site name does not change when the controller just refreshes (for example, when a listener has been created). Do you have an example that would help me understand this better?

fgiorgetti · 2026-02-12T11:31:57Z

internal/nonkube/controller/input_resource_handler.go

fgiorgetti · 2026-02-12T12:07:23Z

internal/nonkube/compat/site_state_renderer.go

fgiorgetti · 2026-02-12T12:48:05Z

internal/nonkube/controller/input_resource_handler.go

+
+	//If there is no site configured, the namespace needs to be removed
+	if err != nil || siteState == nil || siteState.Site == nil {
+		err = h.tearDownNamespace()


Here I believe we should not tear down the namespace if an error occurred.

For example, if you have a site named "mysite" running on the default namespace and you accidentally create another site named "mysite2", the site state loader will fail because 2 sites have been defined, which is acceptable.

But then, if you remove mysite2 to fix the situation, the running site is torn down and left in a bad state, which is not desirable.

nluaces self-assigned this Jan 18, 2026

nluaces changed the title ~~Enable dynamic changes in the router for non-kube environments (wip)~~ Enable dynamic changes in the router for non-kube environments Feb 1, 2026

nluaces marked this pull request as ready for review February 1, 2026 17:31

nluaces requested review from c-kruse and fgiorgetti as code owners February 1, 2026 17:31

fgiorgetti reviewed Feb 6, 2026

internal/qdr/amqp_mgmt.go

nluaces added 10 commits February 9, 2026 15:08

add watchers for system input files

b4ce05d

change system controller reload to manual by default

1963564

fix unit test given that the default value for SYSTEM_AUTO_RELOAD is no longer auto

039657e

make the reload type env variable manual by default

58a6230

change log level

f4a552f

tear down if there is no site configured

e754fd5

adapt tests

dab70d7

improve parameter readability

f99f1c6

use appropiate logger when checking valid platforms

303188b

add system reload type at the startup

c64192f

nluaces force-pushed the 2337-system-adaptor-process branch from 0eb07fd to d4bc626 Compare February 10, 2026 10:22

nluaces requested a review from fgiorgetti February 10, 2026 16:02

nluaces added 2 commits February 11, 2026 18:49

modify default container function to respond properly if it is running in a container

867d6ac

use the right container endpoint when removing a router

a90b119

fgiorgetti reviewed Feb 12, 2026

nluaces linked an issue Feb 12, 2026 that may be closed by this pull request

Create system-adaptor that process dynamic changes in the router #2337

Open

nluaces added 5 commits February 12, 2026 15:44

delete RemoveAll function as it is no longer necessary

352e7ea

system adaptor handler wip

6d3a0b9

fix sslProfiles synchronisation

71850d7

add some unit tests

7ef8d1f

add extra validation to start, stop and reload system commands in case there is automatic reloading configured

e93419b

nluaces added 5 commits February 13, 2026 18:12

only refresh the site with new resources if it is not a bundle

363a1b1

add safety check when recovering skupper-local-normal listener port

e7c1d58

fix format

d7bbe85

handle refresh error

8941183

fix wrong setup of callbacks for the router state handler

d63b383

nluaces force-pushed the 2337-system-adaptor-process branch from 9c2e556 to d63b383 Compare February 13, 2026 17:16

nluaces added 3 commits February 13, 2026 19:02

handle errors better when loading a site

5b88d34

manage several sites scenario

e0671e1

allow reload command when the reloading type is set to auto

e921cc6
