Enable dynamic changes in the router for non-kube environments #2356
Enable dynamic changes in the router for non-kube environments #2356nluaces wants to merge 25 commits intoskupperproject:mainfrom
Conversation
fgiorgetti
left a comment
There was a problem hiding this comment.
My impression so far is that it is on the right track.
Really nice work @nluaces!
A few other comments I have so far:
- The site name in the router config is changing (we need to check if this will cause issues to the console or vanflow metrics)
- As I already added, the system controller update and the update of the router container for the controlled namespaces is something we need to figure out still
|
|
||
| // the container endpoint is mapped to the podman socket inside the container | ||
| if api.IsRunningInContainer() { | ||
| endpoint = "unix:///var/run/podman.sock" |
There was a problem hiding this comment.
If running in a container, the endpoint should be provided through:
os.Getenv("CONTAINER_ENDPOINT")
Then, if the endpoint is not set, you can use the defaults you have below.
Maybe we could have a function under pkg/nonkube/api/environment.go that takes the platform as an argument and returns the correct container endpoint, eliminating the dups we have (I should have done that earlier).
There was a problem hiding this comment.
When skupper system install runs, it creates a volume bind between with CONTAINER_ENDPOINT env variable and a volume destination that starts with /var/run/ (I believe that the bootstrap script works like that as well):
skupper/internal/nonkube/bootstrap/install.go
Lines 80 to 104 in ce10f83
If running inside the container we use the env variable instead of the mapped endpoint, I think it is not going to work as expected.
There was a problem hiding this comment.
Actually, you brought up a good point. For auto-reloads, the system-controller container must be aware of all endpoints on the host machine, beforehands. So when the system-controller container is created, it needs to know the PODMAN_CONTAINER_ENDPOINT as well as the DOCKER_CONTAINER_ENDPOINT.
This way if the system controller is running with podman, but a docker site is created, the system controller already knows what is the endpoint to map, from the host machine into the target router container.
There was a problem hiding this comment.
Or... we could simply argue that if the system controller is using podman, it will only initialize podman sites.
There was a problem hiding this comment.
If a system controller can operate with different system platforms it would be really nice, I would prefer though to have this change implemented in a different pull request, what do you think?
| defer h.lock.Unlock() | ||
|
|
||
| _, err := os.Stat(api.GetInternalOutputPath(h.namespace, api.RuntimeSiteStatePath)) | ||
| if err == nil { |
There was a problem hiding this comment.
I believe an extra validation is needed here.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image. If not, then we need to "reload" the site.
This will be needed in cases when the system controller is updated.
By the way, we still need to handle system controller updates through the CLI.
I am not sure if it could be part of the system install command, through a flag,
or if it should be a separate command like reinstall.
Anyway, I believe we need to think about it, as it will be needed as existing sites
cannot be reloaded when using automated reloads.
There was a problem hiding this comment.
If the namespace has already been initialized, the controller needs to check if the
router container is using the correct image.
I need some clarification, what is considered the correct image? the one specified in the env variable? Or simply check that there is a router running for that namespace?
There was a problem hiding this comment.
Maybe we can handle updates separately from this initial PR, so it will be simpler to evaluate.
Let's just raise an enhancement issue for now, so we can think more about it.
Or we could do something similar to what the kube controller is doing, which consists of comparing
the version defined at the controller with the version defined to a given site/namespace. If controller
has a newer version, then we could use the same router image we use when a new site is created.
But again, better to keep it separate. WDYT?
There was a problem hiding this comment.
I agree with you on keeping this change separately, we need to define as well the CLI improvement for this.
0eb07fd to
d4bc626
Compare
I checked that the name of the site was not changing when the controller just refresh (for example, when a listener has been created); do you have any examples for me to understand this better? |
| defer h.lock.Unlock() | ||
|
|
||
| _, err := os.Stat(api.GetInternalOutputPath(h.namespace, api.RuntimeSiteStatePath)) | ||
| if err == nil { |
There was a problem hiding this comment.
Maybe we can handle updates separately from this initial PR, so it will be simpler to evaluate.
Let's just raise an enhancement issue for now, so we can think more about it.
Or we could do something similar to what the kube controller is doing, which consists of comparing
the version defined at the controller with the version defined to a given site/namespace. If controller
has a newer version, then we could use the same router image we use when a new site is created.
But again, better to keep it separate. WDYT?
|
|
||
| // the container endpoint is mapped to the podman socket inside the container | ||
| if api.IsRunningInContainer() { | ||
| endpoint = "unix:///var/run/podman.sock" |
There was a problem hiding this comment.
Actually, you brought up a good point. For auto-reloads, the system-controller container must be aware of all endpoints on the host machine, beforehands. So when the system-controller container is created, it needs to know the PODMAN_CONTAINER_ENDPOINT as well as the DOCKER_CONTAINER_ENDPOINT.
This way if the system controller is running with podman, but a docker site is created, the system controller already knows what is the endpoint to map, from the host machine into the target router container.
|
|
||
| //If there is no site configured, the namespace needs to be removed | ||
| if err != nil || siteState == nil || siteState.Site == nil { | ||
| err = h.tearDownNamespace() |
There was a problem hiding this comment.
Here I believe we should not teardown the namespace if an error occurred.
For example, if you have a site named "mysite" running on the default namespace and you accidentally create another site named "mysite2", site state loader will fail, as 2 sites have been defined, which is acceptable.
But then, if you remove the mysite2, to fix the situation, it will teardown this running site, which is not desirable, leaving it in a bad state.
9c2e556 to
d63b383
Compare
Implements #2337
Branch based on #2334, the proper commits that implement this issue are bb8ab35 and the following.
flowchart TD id1([new listener resource is created])-->|triggers|InputResourceHandler-->Q{site exists?} Q -->|Yes| A[Refresh Router Config] Q -->|No| B[Bootstrap]The InputResourceHandler was modified to not bootstrap in case a site is already created in the namespace.
It also includes validations for commands start, stop and reload, that should be disabled if the dynamic changes are happening.