Conversation
Three issues caused sshd to print 'Could not chdir to home directory': 1. The sshd_preloader only overrode getpwnam(), not getpwnam_r(). pam_mkhomedir uses pam_modutil_getpwnam() which calls getpwnam_r() internally, so it could not find the fake test user and skipped home directory creation. Add a getpwnam_r() override. 2. When AUTHD_TESTS_SSH_USE_AUTHD_NSS was defined and the user was not yet in authd (before authentication), the preloader returned NULL. This caused sshd to create a fake passwd struct with pw_dir set to '/nonexistent'. Remove the early return so valid test users always get a fake entry, even before auth. 3. The preloader used a single fixed AUTHD_TEST_SSH_HOME env var for all users' home directories, which is wrong in the shared sshd case where multiple users connect. Replace it with AUTHD_TEST_SSH_HOME_BASE and construct per-user home paths (base + username). When recycling cached entries, refresh pw_dir from NSS if the user has since been registered. Also create sshTestsHomeBase in prepareSSHTests (rather than only for wantLocalGroups tests) and pass WithHomeBaseDir(sshTestsHomeBase) to all authd daemon instances. This ensures the home path authd records matches the preloader's getpwnam/getpwnam_r return value.
OpenSSH 10.2 added check_pam_user() which compares PAM_USER against authctxt->pw->pw_name (from getpwnam). This caused three types of failures: 1. Upper-case username tests: The sshd_preloader returned lowercase pw_name from getpwnam, but sshd initialized PAM with the original SSH login name (preserving case). check_pam_user() fatal'd on the case mismatch before PAM modules could even run. Fix: The preloader now preserves the original query case for pw_name, so it matches the SSH login name that sshd uses for PAM_USER. Home directories still use lowercase paths (matching authd's convention). The PAM module no longer changes PAM_USER for case-only differences (isDifferentUser check), while still using lowercase internally via Username(). 2. Connection error test: On systems where common-auth includes authd's PAM module, the fallback auth fails with 'permission denied' and disconnects instead of showing a Password: prompt. Fix: Updated the VHS wait pattern to accept either 'Password:' or 'Disconnected from', with an appropriate timeout for the fallback auth sequence.
3v1n0
left a comment
There was a problem hiding this comment.
- The sshd_preloader only overrode getpwnam(), not getpwnam_r().
pam_mkhomedir uses pam_modutil_getpwnam() which calls getpwnam_r()
internally, so it could not find the fake test user and skipped home
directory creation. Add a getpwnam_r() override.
I knew this but at the date I did check that, but I remember that getpwnam_r was still redirected to the getpwnam by libc IIRC.
| // Only update PAM_USER when the user is genuinely different | ||
| // (not just a case variation). OpenSSH 10.2+ checks PAM_USER | ||
| // against pw_name from getpwnam() and rejects mismatches, so | ||
| // we must not change PAM_USER for case-only differences. | ||
| // The lowercase version is still used internally by authd | ||
| // via the Username() method. |
There was a problem hiding this comment.
Ah ok so this was due to me (openssh/openssh-portable@140bae1), right?
There was a problem hiding this comment.
Yes, I also suspected your commit to openssh as the "culprit" :)
| passwd_entity->pw_gecos = AUTHD_TEST_GECOS; | ||
| passwd_entity->pw_name = (char *) name; | ||
| { | ||
| MockPasswd *mock_passwd = &passwd_entities[entity_idx]; |
There was a problem hiding this comment.
Nit just set this once and use also for setting the parent earlier
| for (char *p = lower_name; *p; p++) | ||
| *p = tolower ((unsigned char) *p); | ||
| free (mock_passwd->home_path); | ||
| if (asprintf (&mock_passwd->home_path, "%s/%s", | ||
| get_home_base_path (), lower_name) < 0) | ||
| mock_passwd->home_path = NULL; | ||
| free (lower_name); | ||
| passwd_entity->pw_dir = mock_passwd->home_path | ||
| ? mock_passwd->home_path | ||
| : (char *) get_home_base_path (); | ||
| } |
There was a problem hiding this comment.
nit: add vertical spaces after each relevant statement
| (user-integration-pre-check-ssh-error-if-cannot-connect-to-authd@example.com@localhost) Password: | ||
| could not get current available brokers: permission denied: only root can perform this operation | ||
| Received disconnect from ${SSH_HOST} port ${SSH_PORT} Too many authentication failures | ||
| Disconnected from ${SSH_HOST} port ${SSH_PORT} |
| vhsCommandFinalAuthWaitVariable: `Wait /Password:/`, | ||
| vhsCommandFinalAuthWaitVariable: fmt.Sprintf( | ||
| `Wait+Screen /Password:|Disconnected from/ | ||
| Wait@%dms`, sshDefaultFinalWaitTimeout), |
There was a problem hiding this comment.
This is worth another commit.
| passwd_entity = &passwd_entities[i].parent; | ||
|
|
||
| if (!passwd_entity->pw_name || strcmp (passwd_entity->pw_name, name) != 0) | ||
| if (!passwd_entity->pw_name || strcasecmp (passwd_entity->pw_name, name) != 0) |
There was a problem hiding this comment.
Wondering, can't we still keep the strict comparison, assuming that the authd name is still matching or should not?
I mean, the goal here was still to provide to sshd what it asks for without mocking too much, while ideally the comparison case should be already handled by orig_getpwnam.
Maybe rather than comparing them by their name we could use the uid instead?
| * "Could not chdir to home directory". | ||
| */ | ||
| int | ||
| getpwnam_r (const char *name, struct passwd *pwd, char *buf, size_t buflen, |
There was a problem hiding this comment.
I'm quite sure this was not neeeded, as it was doing the same, but maybe I did not check all the edge cases.
I did have a local branch where doing the same thing though that I never landed since there was no difference.
| if (!is_valid_test_user (name)) | ||
| return orig_getpwnam_r (name, pwd, buf, buflen, result); | ||
|
|
||
| #ifdef AUTHD_TESTS_SSH_USE_AUTHD_NSS | ||
| /* Try the real NSS lookup first (e.g. via authd NSS module). */ | ||
| int ret = orig_getpwnam_r (name, pwd, buf, buflen, result); | ||
| if (ret == 0 && *result != NULL) | ||
| { | ||
| /* Override uid/gid like getpwnam does. */ | ||
| pwd->pw_uid = getuid (); | ||
| pwd->pw_gid = getgid (); | ||
| return 0; | ||
| } | ||
| #endif |
| else if (!is_supported_test_fake_user (name)) | ||
| else | ||
| { | ||
| fprintf (stderr, "sshd_preloader[%d]: User %s is not handled by authd brokers\n", | ||
| getpid (), name); | ||
| return NULL; | ||
| fprintf (stderr, "sshd_preloader[%d]: User %s is not yet handled by authd brokers," | ||
| " creating a fake entry\n", getpid (), name); |
There was a problem hiding this comment.
Not sure why this logic was not accepted, we do not want to create fake users for non-allow-listed cases, no?
| passwd_entity->pw_dir = (char *) get_home_path (); | ||
|
|
||
| /* Construct a per-user home path from the base dir + username, | ||
| * so each user gets their own home directory even in the shared | ||
| * sshd case where AUTHD_TEST_SSH_USER is the accept-all user. | ||
| */ |
There was a problem hiding this comment.
IIRC the point here was to control the home path from the go side, so rather than making this tool to do even more than it should, I would try to make the runnner to do things per each user.
Although I feel that this changed when you refactored the SSH tests so that we run them always using a shared SSHd demon, rather than one per each test (as it was instead before and thus when this library was initially written).
| // we must not change PAM_USER for case-only differences. | ||
| // The lowercase version is still used internally by authd | ||
| // via the Username() method. | ||
| if isDifferentUser { |
There was a problem hiding this comment.
I don't think this is a good solution. Commit a226ba1 specifically changed the code to also set PAM_USER when it only differs in case to avoid breaking other tools.
There was a problem hiding this comment.
I think we have to accept that users must type their username in lowercase when trying to log in via ssh. It would be nice to at least show them a useful error message, but I'm afraid that's not under our control - or does sshd support displaying a custom error message to the client?
There was a problem hiding this comment.
Currently, on Resolute, this is what users see when they use uppercase letters in their username with ssh:
ssh Test@ubudev1.onmicrosoft.com@localhost
Connection closed by 127.0.0.1 port 22
Relevant logs:
May 07 16:09:27 ubuntu26-04 authd[45307]: Pre-checking user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck: test@ubudev1.onmicrosoft.com
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck result: {"name":"test@ubudev1.onmicrosoft.com","uuid":"","dir":"/home/test@ubudev1.onmicrosoft.com","shell":"/usr/bin/bash","gecos":"test@ubudev1.onmicrosoft.com","groups":null}
May 07 16:09:27 ubuntu26-04 authd[45307]: Locking local entries
May 07 16:09:27 ubuntu26-04 authd[45307]: glibc lckpwdf: Local entries locked!
May 07 16:09:27 ubuntu26-04 authd[45307]: GetUserByName: user "test@ubudev1.onmicrosoft.com" not found
May 07 16:09:27 ubuntu26-04 authd[45307]: Parsing local passwd file: /etc/passwd
May 07 16:09:27 ubuntu26-04 authd[45307]: Reading groups from "/etc/group"
May 07 16:09:27 ubuntu26-04 authd[45307]: GetUserByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: GetGroupByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: GetGroupByID: user with UID 10000 not found
May 07 16:09:27 ubuntu26-04 authd[45307]: Added temporary record for user "test@ubudev1.onmicrosoft.com" with UID 10000 as "authd-pre-auth-user-1f776b57f08c7c81c12264157996af910d7289cdd4a8f154b03bd0c55ad2508d"
May 07 16:09:27 ubuntu26-04 authd[45307]: Using new UID 10000 for temporary user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd[45307]: Unlocking local entries
May 07 16:09:27 ubuntu26-04 authd[45307]: glibc lckpwdf: Local entries unlocked!
May 07 16:09:27 ubuntu26-04 authd[45307]: Pre-checking user "test@ubudev1.onmicrosoft.com"
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck: test@ubudev1.onmicrosoft.com
May 07 16:09:27 ubuntu26-04 authd-msentraid[41615]: UserPreCheck result: {"name":"test@ubudev1.onmicrosoft.com","uuid":"","dir":"/home/test@ubudev1.onmicrosoft.com","shell":"/usr/bin/bash","gecos":"test@ubudev1.onmicrosoft.com","groups":null}
May 07 16:09:27 ubuntu26-04 authd[45307]: user "test@ubudev1.onmicrosoft.com" already pre-authenticated
May 07 16:09:27 ubuntu26-04 sshd-session[45406]: fatal: PAM user mismatch
If debug logs are not enabled, the only message that's logged is sshd-session[45406]: fatal: PAM user mismatch.
This PR fixes two issues:
See commit messages for details.
Closes #1511
Closes #1366