Stale references to replica after worker node removal #7841

@nrauso

Description

Removing an offline worker node (the ex-leader) leaves stale references behind: neither openldap nor ldapproxy on the remaining nodes seems to detect that the node has been removed.

Steps to reproduce

  • Install an NS8 cluster with two nodes.
  • Install an OpenLDAP account provider with replicas enabled on both nodes.
  • Simulate a failure of the leader node (10.5.4.1).
  • Promote the worker node to the leader role using switch-leader.
  • Remove the offline ex-leader node (10.5.4.1) from the cluster.

Expected behavior

When an offline node is removed from the cluster:

  • OpenLDAP should update its replication configuration accordingly.
  • Stale or unreachable replication providers should be automatically removed or clearly reported.
  • The system should remain in a consistent state without requiring manual intervention.
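
As a rough illustration of the first expectation, the update to the replication configuration could take the form of an LDIF modify request that deletes the stale olcSyncrepl values. The sketch below only builds the LDIF text; the helper name build_syncrepl_delete_ldif is hypothetical, and it assumes that each value passed in matches the stored attribute value exactly (including its {N} ordering prefix). It is not taken from the NS8 code base.

```python
def build_syncrepl_delete_ldif(database_dn, stale_values):
    """Build an LDIF modify record that deletes the given olcSyncrepl values.

    Hypothetical helper: assumes every item in stale_values is the full
    attribute value as stored in cn=config.
    """
    lines = [f"dn: {database_dn}", "changetype: modify", "delete: olcSyncrepl"]
    for value in stale_values:
        lines.append(f"olcSyncrepl: {value}")
    lines.append("-")  # terminates the modification, as in standard LDIF
    return "\n".join(lines) + "\n"


# Shortened example value, for illustration only.
example_ldif = build_syncrepl_delete_ldif(
    "olcDatabase={2}mdb,cn=config",
    ["{0}rid=3 provider=ldap://10.5.4.1:20001"],
)
print(example_ldif)
```

Whether other attributes (for example olcMultiProvider) should change when only one provider remains is left open here.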

Actual behavior

After removing the offline ex-leader node:

  • The OpenLDAP replication configuration still contains references to the removed node (10.5.4.1).
  • The OpenLDAP replica for that node no longer exists, but the replication entry remains configured.
  • ldapproxy does not automatically reconfigure itself, but it does recover correctly after a restart.

OpenLDAP keeps the replication configuration pointing to a non-existent provider (10.5.4.1).
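
One way this condition could be detected is to compare the provider hosts found in the olcSyncrepl values against the list of nodes still in the cluster. The following is a minimal sketch under that assumption; the function name stale_syncrepl_providers and the way the active node list is obtained are hypothetical, and the sample values are shortened versions of the ones shown under "Configuration details".

```python
import re


def stale_syncrepl_providers(olc_syncrepl_values, active_hosts):
    """Return (rid, host) pairs whose provider host is not an active node."""
    stale = []
    for value in olc_syncrepl_values:
        rid = re.search(r"\brid=(\d+)", value)
        provider = re.search(r"\bprovider=ldaps?://([^:/\s]+)", value)
        if rid and provider and provider.group(1) not in active_hosts:
            stale.append((int(rid.group(1)), provider.group(1)))
    return stale


# Shortened olcSyncrepl values; only rid and provider matter for the check.
values = [
    '{0}rid=3 provider=ldap://10.5.4.1:20001 searchbase="dc=nicolatest,dc=it"',
    '{1}rid=5 provider=ldap://10.5.4.2:20001 searchbase="dc=nicolatest,dc=it"',
]

# Assumption: after the removal, 10.5.4.2 is the only node left.
print(stale_syncrepl_providers(values, {"10.5.4.2"}))
```

With the values from this report, the check flags rid=3 on 10.5.4.1 as stale.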

Configuration details

OpenLDAP configuration:

# {2}mdb, config
dn: olcDatabase={2}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {2}mdb
olcDbDirectory: /var/lib/openldap/openldap-data
olcSuffix: dc=nicolatest,dc=it
olcAccess: {0}to attrs=userPassword by dn.base="gidNumber=101+uidNumber=100,cn=peercred,cn=external,cn=auth" write by set="[cn=domain admins,ou=Groups,dc=nicolatest,dc=it]/memberUid & user/uid" write by self write by * auth
olcAccess: {1}to * by dn.base="gidNumber=101+uidNumber=100,cn=peercred,cn=external,cn=auth" manage by set="[cn=domain admins,ou=Groups,dc=nicolatest,dc=it]/memberUid & user/uid" write by * read
olcRootDN: cn=mdbsync,dc=nicolatest,dc=it
olcRootPW: ax5K9r3aciPE/TNwPrNBcxRi
olcSyncrepl: {0}rid=3 provider=ldap://10.5.4.1:20001 binddn="cn=mdbsync,dc=nicolatest,dc=it" bindmethod=simple credentials=ax5K9r3aciPE/TNwPrNBcxRi searchbase="dc=nicolatest,dc=it" type=refreshAndPersist retry="5 5 300 +" timeout=1
olcSyncrepl: {1}rid=5 provider=ldap://10.5.4.2:20001 binddn="cn=mdbsync,dc=nicolatest,dc=it" bindmethod=simple credentials=ax5K9r3aciPE/TNwPrNBcxRi searchbase="dc=nicolatest,dc=it" type=refreshAndPersist retry="5 5 300 +" timeout=1
olcMultiProvider: TRUE
olcDbIndex: objectClass eq
olcDbIndex: uid pres,eq
olcDbIndex: cn pres,eq,sub,subinitial
olcDbIndex: memberUid pres,eq

ldapproxy config:

~]# cat /home/ldapproxy2/.config/state/nginx/nginx.conf
user nginx;
worker_processes auto;

error_log /var/log/nginx/error.log info;
pid /var/run/nginx.pid;

events {
    worker_connections  1024;
}

# L4 proxy to LDAP account providers
stream {
    # Domain nicolatest.it
    server {
        proxy_pass nicolatest_it;
        listen 127.0.0.1:20000;

        proxy_ssl off;
    }
    upstream nicolatest_it {
        server 10.5.4.2:20001; # origin nicolatest.it
        server 10.5.4.1:20001 backup; # origin nicolatest.it
    }
}
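
The same kind of check can be sketched for the ldapproxy side: extract the upstream server entries from the generated nginx.conf and report those that point to a node no longer in the cluster. The function name stale_upstream_servers is hypothetical and the parsing is deliberately simplistic (it only recognizes "server host:port" lines); it is not how ldapproxy itself generates or validates its configuration.

```python
import re


def stale_upstream_servers(nginx_conf, active_hosts):
    """Return (host, port) pairs of upstream servers not in active_hosts."""
    stale = []
    # Matches "server 10.5.4.1:20001 ..." but not the "server {" block opener.
    pattern = re.compile(r"^\s*server\s+([0-9A-Za-z.\-]+):(\d+)", re.MULTILINE)
    for match in pattern.finditer(nginx_conf):
        host, port = match.group(1), int(match.group(2))
        if host not in active_hosts:
            stale.append((host, port))
    return stale


sample_conf = """
stream {
    server {
        proxy_pass nicolatest_it;
        listen 127.0.0.1:20000;
    }
    upstream nicolatest_it {
        server 10.5.4.2:20001; # origin nicolatest.it
        server 10.5.4.1:20001 backup; # origin nicolatest.it
    }
}
"""

# Assumption: 10.5.4.2 is the only node left in the cluster.
print(stale_upstream_servers(sample_conf, {"10.5.4.2"}))
```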

Components

  • Core 3.16
  • OpenLDAP 2.6.0
