Skip to content

Fix SSH integration tests on Resolute#1512

Draft
adombeck wants to merge 2 commits intomainfrom
1511-fix-ssh-integration-tests
Draft

Fix SSH integration tests on Resolute#1512
adombeck wants to merge 2 commits intomainfrom
1511-fix-ssh-integration-tests

Conversation

@adombeck
Copy link
Copy Markdown
Contributor

@adombeck adombeck commented May 7, 2026

This PR fixes two issues:

  • The issue that "Could not chdir to home directory" is added to the golden files (which we only experience locally)
  • Issues caused by PAM_USER validation added in OpenSSH 10.2. This doesn't affect CI because it's still running an older OpenSSH version.

See commit messages for details.

Closes #1511
Closes #1366

adombeck added 2 commits May 7, 2026 13:10
Three issues caused sshd to print 'Could not chdir to home directory':

1. The sshd_preloader only overrode getpwnam(), not getpwnam_r().
   pam_mkhomedir uses pam_modutil_getpwnam() which calls getpwnam_r()
   internally, so it could not find the fake test user and skipped home
   directory creation. Add a getpwnam_r() override.

2. When AUTHD_TESTS_SSH_USE_AUTHD_NSS was defined and the user was not
   yet in authd (before authentication), the preloader returned NULL.
   This caused sshd to create a fake passwd struct with pw_dir set to
   '/nonexistent'. Remove the early return so valid test users always get
   a fake entry, even before auth.

3. The preloader used a single fixed AUTHD_TEST_SSH_HOME env var for all
   users' home directories, which is wrong in the shared sshd case where
   multiple users connect. Replace it with AUTHD_TEST_SSH_HOME_BASE and
   construct per-user home paths (base + username). When recycling cached
   entries, refresh pw_dir from NSS if the user has since been registered.

Also create sshTestsHomeBase in prepareSSHTests (rather than only for
wantLocalGroups tests) and pass WithHomeBaseDir(sshTestsHomeBase) to all
authd daemon instances. This ensures the home path authd records matches
the preloader's getpwnam/getpwnam_r return value.
OpenSSH 10.2 added check_pam_user() which compares PAM_USER against
authctxt->pw->pw_name (from getpwnam). This caused three types of
failures:

1. Upper-case username tests: The sshd_preloader returned lowercase
   pw_name from getpwnam, but sshd initialized PAM with the original
   SSH login name (preserving case). check_pam_user() fatal'd on the
   case mismatch before PAM modules could even run.
   Fix: The preloader now preserves the original query case for
   pw_name, so it matches the SSH login name that sshd uses for
   PAM_USER. Home directories still use lowercase paths (matching
   authd's convention). The PAM module no longer changes PAM_USER for
   case-only differences (isDifferentUser check), while still using
   lowercase internally via Username().

2. Connection error test: On systems where common-auth includes
   authd's PAM module, the fallback auth fails with 'permission
   denied' and disconnects instead of showing a Password: prompt.
   Fix: Updated the VHS wait pattern to accept either 'Password:' or
   'Disconnected from', with an appropriate timeout for the fallback
   auth sequence.
Copy link
Copy Markdown
Contributor

@3v1n0 3v1n0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The sshd_preloader only overrode getpwnam(), not getpwnam_r().
    pam_mkhomedir uses pam_modutil_getpwnam() which calls getpwnam_r()
    internally, so it could not find the fake test user and skipped home
    directory creation. Add a getpwnam_r() override.

I knew this but at the date I did check that, but I remember that getpwnam_r was still redirected to the getpwnam by libc IIRC.

Comment on lines +99 to +104
// Only update PAM_USER when the user is genuinely different
// (not just a case variation). OpenSSH 10.2+ checks PAM_USER
// against pw_name from getpwnam() and rejects mismatches, so
// we must not change PAM_USER for case-only differences.
// The lowercase version is still used internally by authd
// via the Username() method.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok so this was due to me (openssh/openssh-portable@140bae1), right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also suspected your commit to openssh as the "culprit" :)

passwd_entity->pw_gecos = AUTHD_TEST_GECOS;
passwd_entity->pw_name = (char *) name;
{
MockPasswd *mock_passwd = &passwd_entities[entity_idx];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit just set this once and use also for setting the parent earlier

Comment on lines +236 to +246
for (char *p = lower_name; *p; p++)
*p = tolower ((unsigned char) *p);
free (mock_passwd->home_path);
if (asprintf (&mock_passwd->home_path, "%s/%s",
get_home_base_path (), lower_name) < 0)
mock_passwd->home_path = NULL;
free (lower_name);
passwd_entity->pw_dir = mock_passwd->home_path
? mock_passwd->home_path
: (char *) get_home_base_path ();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add vertical spaces after each relevant statement

Comment on lines -7 to +9
(user-integration-pre-check-ssh-error-if-cannot-connect-to-authd@example.com@localhost) Password:
could not get current available brokers: permission denied: only root can perform this operation
Received disconnect from ${SSH_HOST} port ${SSH_PORT} Too many authentication failures
Disconnected from ${SSH_HOST} port ${SSH_PORT}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mhmh 🤔

Comment on lines -385 to +387
vhsCommandFinalAuthWaitVariable: `Wait /Password:/`,
vhsCommandFinalAuthWaitVariable: fmt.Sprintf(
`Wait+Screen /Password:|Disconnected from/
Wait@%dms`, sshDefaultFinalWaitTimeout),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is worth another commit.

passwd_entity = &passwd_entities[i].parent;

if (!passwd_entity->pw_name || strcmp (passwd_entity->pw_name, name) != 0)
if (!passwd_entity->pw_name || strcasecmp (passwd_entity->pw_name, name) != 0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering, can't we still keep the strict comparison, assuming that the authd name is still matching or should not?

I mean, the goal here was still to provide to sshd what it asks for without mocking too much, while ideally the comparison case should be already handled by orig_getpwnam.

Maybe rather than comparing them by their name we could use the uid instead?

* "Could not chdir to home directory".
*/
int
getpwnam_r (const char *name, struct passwd *pwd, char *buf, size_t buflen,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite sure this was not neeeded, as it was doing the same, but maybe I did not check all the edge cases.

I did have a local branch where doing the same thing though that I never landed since there was no difference.

Comment on lines +283 to +296
if (!is_valid_test_user (name))
return orig_getpwnam_r (name, pwd, buf, buflen, result);

#ifdef AUTHD_TESTS_SSH_USE_AUTHD_NSS
/* Try the real NSS lookup first (e.g. via authd NSS module). */
int ret = orig_getpwnam_r (name, pwd, buf, buflen, result);
if (ret == 0 && *result != NULL)
{
/* Override uid/gid like getpwnam does. */
pwd->pw_uid = getuid ();
pwd->pw_gid = getgid ();
return 0;
}
#endif
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some logging here.

Comment on lines -163 to +170
else if (!is_supported_test_fake_user (name))
else
{
fprintf (stderr, "sshd_preloader[%d]: User %s is not handled by authd brokers\n",
getpid (), name);
return NULL;
fprintf (stderr, "sshd_preloader[%d]: User %s is not yet handled by authd brokers,"
" creating a fake entry\n", getpid (), name);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this logic was not accepted, we do not want to create fake users for non-allow-listed cases, no?

Comment on lines +199 to +220
passwd_entity->pw_dir = (char *) get_home_path ();

/* Construct a per-user home path from the base dir + username,
* so each user gets their own home directory even in the shared
* sshd case where AUTHD_TEST_SSH_USER is the accept-all user.
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the point here was to control the home path from the go side, so rather than making this tool to do even more than it should, I would try to make the runnner to do things per each user.

Although I feel that this changed when you refactored the SSH tests so that we run them always using a shared SSHd demon, rather than one per each test (as it was instead before and thus when this library was initially written).

// we must not change PAM_USER for case-only differences.
// The lowercase version is still used internally by authd
// via the Username() method.
if isDifferentUser {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a good solution. Commit a226ba1 specifically changed the code to also set PAM_USER when it only differs in case to avoid breaking other tools.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to accept that users must type their username in lowercase when trying to log in via ssh. It would be nice to at least show them a useful error message, but I'm afraid that's not under our control - or does sshd support displaying a custom error message to the client?

Copy link
Copy Markdown
Contributor Author

@adombeck adombeck May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, on Resolute, this is what users see when they use uppercase letters in their username with ssh:

ssh Test@ubudev1.onmicrosoft.com@localhost
Connection closed by 127.0.0.1 port 22

Relevant logs:

May 07 16:09:27 ubuntu26-04 authd[45307]: Pre-checking user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck: test@ubudev1.onmicrosoft.com
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck result: {"name":"test@ubudev1.onmicrosoft.com","uuid":"","dir":"/home/test@ubudev1.onmicrosoft.com","shell":"/usr/bin/bash","gecos":"test@ubudev1.onmicrosoft.com","groups":null}
May 07 16:09:27 ubuntu26-04 authd[45307]: Locking local entries
May 07 16:09:27 ubuntu26-04 authd[45307]: glibc lckpwdf: Local entries locked!
May 07 16:09:27 ubuntu26-04 authd[45307]: GetUserByName: user "test@ubudev1.onmicrosoft.com" not found
May 07 16:09:27 ubuntu26-04 authd[45307]: Parsing local passwd file: /etc/passwd
May 07 16:09:27 ubuntu26-04 authd[45307]: Reading groups from "/etc/group"
May 07 16:09:27 ubuntu26-04 authd[45307]: GetUserByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: GetGroupByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: GetGroupByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: Added temporary record for user "test@ubudev1.onmicrosoft.com" with UID 10000 as "authd-pre-auth-user-1f776b57f08c7c81c12264157996af910d7289cdd4a8f154b03bd0c55ad2508d"
May 07 16:09:27 ubuntu26-04 authd[45307]: Using new UID 10000 for temporary user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd[45307]: Unlocking local entries
May 07 16:09:27 ubuntu26-04 authd[45307]: glibc lckpwdf: Local entries unlocked!
May 07 16:09:27 ubuntu26-04 authd[45307]: Pre-checking user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck: test@ubudev1.onmicrosoft.com
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck result: {"name":"test@ubudev1.onmicrosoft.com","uuid":"","dir":"/home/test@ubudev1.onmicrosoft.com","shell":"/usr/bin/bash","gecos":"test@ubudev1.onmicrosoft.com","groups":null}
May 07 16:09:27 ubuntu26-04 authd[45307]: user "test@ubudev1.onmicrosoft.com" already pre-authenticated
May 07 16:09:27 ubuntu26-04 sshd-session[45406]: fatal: PAM user mismatch

If debug logs are not enabled, the only message that's logged is sshd-session[45406]: fatal: PAM user mismatch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SSH integration tests fail on Resolute Issue: SSH login fails for usernames containing uppercase letters (Resolute)

2 participants