feat(auth): OIDC device flow & API audience for library (sc-15741) #501

Open
jamadriz wants to merge 4 commits into main from jamadriz/sc-15741/make-validmind-library-work-with-oauth

Conversation

@jamadriz

@jamadriz jamadriz commented May 2, 2026

Pull Request Description

Backend PR - https://github.com/validmind/backend/pull/3075

What and why?

This PR adds OAuth 2.0 / OIDC device authorization flow support to the ValidMind Python library so notebooks can authenticate with a Bearer token instead of only API keys.

Behavior:

  • vm.init(..., issuer=..., client_id=..., scope=..., audience=...) runs RFC 8628 device login, caches tokens under ~/.validmind/credentials.json, and sends Authorization: Bearer ... to the tracking API.
  • audience (or env VM_OIDC_AUDIENCE) is passed to the device authorize and token endpoints so Auth0 (and similar IdPs) can issue RS256 API access tokens for the resource identifier that matches the backend api_audience.
  • normalize_issuer, normalize_client_id, and normalize_audience strip common wrapping quotes from copied .env / notebook values.
  • Credential cache keys are scoped by issuer, client id, and audience when audience is set so tokens for different APIs do not collide.
  • New modules: validmind/oidc_device.py, validmind/credentials_store.py. Auth failures raise ValidMindAuthError (extends existing error hierarchy).
  • Tutorial notebooks and OIDC plan doc updated with setup guidance.
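
The quote stripping and cache-key scoping described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in validmind/credentials_store.py: the names strip_wrapping_quotes and cache_key and the "|" separator are assumptions made for the example.

```python
from typing import Optional


def strip_wrapping_quotes(value: Optional[str]) -> Optional[str]:
    """Trim whitespace and one pair of matching wrapping quotes, if present."""
    if value is None:
        return None
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        value = value[1:-1].strip()
    # Treat an empty result as "not provided".
    return value or None


def cache_key(issuer: str, client_id: str, audience: Optional[str] = None) -> str:
    """Build a cache key scoped by issuer, client id and, when set, audience.

    Hypothetical key layout; the real store may use a different format.
    """
    parts = [issuer, client_id]
    if audience:
        parts.append(audience)
    return "|".join(parts)
```

With a layout like this, a token cached for one API audience is never returned for another, because the two lookups use different keys.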

Before: Library authentication required api_key / api_secret only.
After: Either API keys or OIDC device flow parameters can be used (mutually exclusive).
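
One way to enforce the mutually exclusive modes is a small check before any network call. This is a sketch under assumptions: choose_auth_mode is a hypothetical helper, and ValueError stands in for whatever error type the library actually raises.

```python
from typing import Optional


def choose_auth_mode(
    api_key: Optional[str] = None,
    api_secret: Optional[str] = None,
    issuer: Optional[str] = None,
    client_id: Optional[str] = None,
) -> str:
    """Return "api_key" or "oidc", rejecting mixed or incomplete input."""
    # Empty strings count as "not provided", so api_key="" does not block OIDC.
    has_keys = bool(api_key or api_secret)
    has_oidc = bool(issuer or client_id)
    if has_keys and has_oidc:
        raise ValueError("Pass either API keys or OIDC parameters, not both.")
    if has_oidc:
        if not (issuer and client_id):
            raise ValueError("OIDC device flow needs both issuer and client_id.")
        return "oidc"
    if not (api_key and api_secret):
        raise ValueError("API key auth needs both api_key and api_secret.")
    return "api_key"
```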

How to test

  1. pip install -e . from this repo; restart the Jupyter kernel.
  2. In Auth0: Native app with Device Code grant; API with Identifier equal to backend api_audience; authorize the app for that API.
  3. vm.init( api_host='', model='<your model id>', document="documentation", issuer='', client_id='', audience='', ) (empty strings prevent env API keys from mixing with OIDC during local tests).
  4. Run unit tests:
pytest tests/test_credentials_store.py tests/test_oidc_device.py tests/test_api_client.py::TestAPIClientOIDC -q

What needs special review?

  • OIDC parameter naming and env var (VM_OIDC_AUDIENCE) vs platform docs.
  • Notebook edits (large diff may include cell outputs — consider clearing outputs before merge if policy requires).

Dependencies, breaking changes, and deployment notes

  • Pairs with backend PR that accepts Bearer tokens on tracking routes and verifies OIDC JWTs (JWKS / HS256 as configured).
  • No breaking change for existing API-key users.
  • After release, PyPI consumers need a version that includes audience on vm.init for RS256 API-token flows.

Release notes

Enhancement: Optional OAuth device login for vm.init via issuer, client_id, optional scope, and optional audience (Auth0 API Identifier). Environment variable VM_OIDC_AUDIENCE supported. Tokens cached under ~/.validmind/credentials.json.
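
A possible resolution order for the audience value is sketched below. The precedence shown (an explicit argument wins over VM_OIDC_AUDIENCE, and an explicit empty string disables the environment fallback) is an assumption for illustration, and resolve_audience is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_audience(
    audience: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the audience from the argument, else from VM_OIDC_AUDIENCE."""
    env = os.environ if env is None else env
    # Only fall back to the environment when no argument was passed at all.
    candidate = audience if audience is not None else env.get("VM_OIDC_AUDIENCE")
    candidate = (candidate or "").strip()
    return candidate or None
```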

Suggested label: enhancement, environment-variables

Checklist

  • What and why
  • Screenshots or videos (Frontend)
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied
  • PR linked to Shortcut (sc-15741)
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required)

@CLAassistant

CLAassistant commented May 2, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ jamadriz
❌ mdeyell-valid-mind

Comment thread validmind/oidc_device.py Fixed
Comment thread validmind/oidc_device.py Fixed
@jamadriz jamadriz force-pushed the jamadriz/sc-15741/make-validmind-library-work-with-oauth branch from 13aa151 to 9bc78f0 Compare May 2, 2026 04:58
@jamadriz jamadriz marked this pull request as ready for review May 4, 2026 16:19
@github-actions
Contributor

github-actions Bot commented May 4, 2026

Pull requests must include at least one of the required labels: internal (no release notes required), highlight, enhancement, bug, deprecation, documentation. Except for internal, pull requests must also include a description in the release notes section.

@jamadriz jamadriz force-pushed the jamadriz/sc-15741/make-validmind-library-work-with-oauth branch from a628c1c to 91fe3fc Compare May 4, 2026 16:25
@jamadriz jamadriz added the enhancement New feature or request label May 4, 2026
- vm.init: issuer, client_id, scope, audience; env VM_OIDC_AUDIENCE
- Device authorize/token requests send audience for RS256 API tokens (e.g. Auth0)
- credentials_store + oidc_device; normalize issuer/client_id/audience; cache keys
- ValidMindAuthError; CodeQL-safe device prompt
- Unit tests; poetry.lock update
- fix(lint): reduce poll_device_token complexity (flake8 C901)
- fix(pandas): replace deprecated is_categorical_dtype checks (pandas 2.x)
- test: skip xgboost-dependent tests when extra not installed
- fix(async): use asyncio.run when no running loop in run_async (Python 3.9+ CI)

Co-authored-by: Cursor <cursoragent@cursor.com>
@jamadriz jamadriz force-pushed the jamadriz/sc-15741/make-validmind-library-work-with-oauth branch from 91fe3fc to 421e1ed Compare May 4, 2026 19:02
@jamadriz jamadriz requested a review from AnilSorathiya May 4, 2026 19:04
Comment thread validmind/api_client.py
api_key: Optional[str] = None,
api_secret: Optional[str] = None,
api_host: Optional[str] = None,
api_url: Optional[str] = None,
Contributor

Why not derive api_url from api_host?

There is no need to offer users a new argument if both accomplish the same thing.

Author

Sure. The story in the ticket specified api_url, which is why it was added.

Author

Right now the two are equivalent: the OIDC device flow still works when you pass api_host instead of api_url. api_url seems like the better name, though, because api_host has actually been holding the full tracking URL, including the tracking namespace route. I suggest we keep both for now and deprecate api_host in a follow-up PR.
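
A follow-up deprecation could look roughly like the sketch below. resolve_api_url is a hypothetical helper, and both the warning text and the conflict check are assumptions rather than agreed behavior.

```python
import warnings
from typing import Optional


def resolve_api_url(
    api_url: Optional[str] = None,
    api_host: Optional[str] = None,
) -> Optional[str]:
    """Prefer api_url; accept api_host as a deprecated alias."""
    if api_host is not None:
        warnings.warn(
            "api_host is deprecated; use api_url instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    if api_url and api_host and api_url != api_host:
        raise ValueError("api_url and api_host disagree; pass only api_url.")
    return api_url or api_host
```

Existing callers that pass api_host keep working and see a warning, while new code can move to api_url straight away.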

Contributor

@cachafla cachafla left a comment

Looking good 👍 I'd suggest an accompanying notebook to allow users to test this without changing an existing notebook.

Maybe we can copy quickstart_model_documentation.ipynb and create a new one called quickstart_model_documentation_oidc_device_flow.ipynb?

Only difference in content is an explanation at the beginning about what parameters to pass to vm.init() and how to complete the flow.

@mdeyell-valid-mind
Contributor

mdeyell-valid-mind commented May 4, 2026

I'll dig into this more tomorrow, but when I try to use an Entra token the backend returns the error below. I think the token is generated correctly, since the flow gives me a code to enter in a web browser.

2026-05-04T16:15:31.562445 [info] JWT validation error for m$: InvalidSignatureError filename=auth_oidc.py func_name=verify_auth_token lineno=139 method=GET module=auth.auth_oidc path=/api/v1/tracking/ping request_id=4af3cdbd-fc99-4637-826c-799298d51353
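
When chasing an InvalidSignatureError like the one above, it can help to look at the token's header and claims (alg, kid, iss, aud) before touching the verifier. The sketch below decodes a compact JWT without verifying it, using only the standard library. peek_jwt is a hypothetical debugging helper, not part of the library or backend, and its output must never be used for authorization.

```python
import base64
import json
from typing import Any, Dict, Tuple


def _b64url_decode(segment: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_jwt(token: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (header, claims) WITHOUT verifying the signature. Debugging only."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Not a compact JWT: expected three dot-separated segments.")
    header = json.loads(_b64url_decode(parts[0]))
    claims = json.loads(_b64url_decode(parts[1]))
    return header, claims
```

Comparing the decoded iss and aud with what the backend is configured to accept shows whether the token was issued for a different resource than the tracking API.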

- Add quickstart_model_documentation_oidc_device_flow.ipynb with vm.init(issuer, client_id, ...) and device-flow UX.
- Add 1-set_up_validmind_oidc_device_flow.ipynb variant with links to docs.
- Add docs/oidc-device-flow-release-notes.md (OIDC library release notes).

Co-authored-by: Cursor <cursoragent@cursor.com>
@jamadriz
Author

jamadriz commented May 5, 2026

> Looking good 👍 I'd suggest an accompanying notebook to allow users to test this without changing an existing notebook.
>
> Maybe we can copy quickstart_model_documentation.ipynb and create a new one called quickstart_model_documentation_oidc_device_flow.ipynb?
>
> Only difference in content is an explanation at the beginning about what parameters to pass to vm.init() and how to complete the flow.

Done.

@jamadriz jamadriz requested a review from cachafla May 5, 2026 01:11
Contributor

No need to provide a demo notebook for this one. The quickstart is good enough.

Contributor

No need to refer to a separate file here:

...see [OIDC device flow release notes](../../docs/oidc-device-flow-release-notes.md).

Better to prevent this kind of linking since that will force us to carry 2 files around: this one and the markdown. Better to link it to a ValidMind docs page if needed.

Contributor

This should also be removed in the #### Configure vm.init() for OIDC section.

Contributor

    issuer="https://login.microsoftonline.com/<tenant-id>/v2.0",  # your IdP issuer (OpenID discovery)

Why not use ValidMind's prod URL so it can be tested without looking up the prod login authority?

Author

@cachafla I didn't see a ValidMind URL in any notebook's vm.init call, so I assumed we didn't want to include one here. I will add the issuer as you requested, but leave client_id empty; maybe we want to surface that somewhere in the app instead.

@mdeyell-valid-mind
Contributor

mdeyell-valid-mind commented May 5, 2026

To get ping to work with Entra, I had to make this change:

+def _is_entra_issuer(issuer: str) -> bool:
+    return "login.microsoftonline.com" in issuer.lower()
+
+
+def _select_oidc_bearer_token(entry: Dict[str, Any]) -> str:
+    if _is_entra_issuer(entry.get("issuer", "")) and entry.get("id_token"):
+        return entry["id_token"]
+    return entry["access_token"]
+
+
 def init(
     api_key: Optional[str] = None,
     api_secret: Optional[str] = None,
@@ -380,7 +390,7 @@ def init(
         entry = _obtain_oidc_tokens(
             issuer, client_id, scope_val, audience=oidc_audience_opt
         )
-        _access_token = entry["access_token"]
+        _access_token = _select_oidc_bearer_token(entry)
         _oidc_login_context = {
             "issuer": entry["issuer"],
             "client_id": entry["client_id"],

I also verified that I can call log_input with Entra

Unit tests


    @patch("validmind.api_client._ping")
    @patch("validmind.api_client._obtain_oidc_tokens")
    def test_init_entra_oidc_uses_id_token(self, mock_obtain, mock_ping):
        mock_obtain.return_value = {
            "issuer": "https://login.microsoftonline.com/tenant-id/v2.0",
            "client_id": "cid",
            "access_token": "access-token",
            "expires_at": "2099-01-01T00:00:00+00:00",
            "refresh_token": None,
            "id_token": "id-token",
        }
        api_client.init(
            model="model-cuid",
            api_host="http://localhost/track/",
            api_key="",
            api_secret="",
            issuer="https://login.microsoftonline.com/tenant-id/v2.0",
            client_id="cid",
            document="documentation",
        )
        headers = api_client._get_api_headers()
        self.assertEqual(headers["Authorization"], "Bearer id-token")

    @patch("validmind.api_client._ping")
    @patch("validmind.api_client._obtain_oidc_tokens")
    def test_init_non_entra_oidc_prefers_access_token(self, mock_obtain, mock_ping):
        mock_obtain.return_value = {
            "issuer": "https://issuer.example.com/",
            "client_id": "cid",
            "access_token": "access-token",
            "expires_at": "2099-01-01T00:00:00+00:00",
            "refresh_token": None,
            "id_token": "id-token",
        }
        api_client.init(
            model="model-cuid",
            api_host="http://localhost/track/",
            api_key="",
            api_secret="",
            issuer="https://issuer.example.com/",
            client_id="cid",
            document="documentation",
        )
        headers = api_client._get_api_headers()
        self.assertEqual(headers["Authorization"], "Bearer access-token")
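
The _is_entra_issuer helper in the diff above matches on a substring, so an issuer such as https://evil.example.com/login.microsoftonline.com/ would also be treated as Entra. A stricter variant compares the parsed hostname instead. This is a sketch only: is_entra_issuer is a hypothetical name, and the exact-host match is an assumption, so any other Microsoft login hosts in use would need to be listed explicitly.

```python
from urllib.parse import urlparse


def is_entra_issuer(issuer: str) -> bool:
    """True only when the issuer URL's host is exactly login.microsoftonline.com."""
    # urlparse returns hostname=None for values without a scheme.
    host = urlparse(issuer.strip()).hostname or ""
    return host.lower() == "login.microsoftonline.com"
```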

@jamadriz
Author

jamadriz commented May 5, 2026

@mdeyell-valid-mind With these Entra changes, did you have a chance to confirm that it still works with Auth0?
Can you push them to the branch?
Thanks.

@mdeyell-valid-mind
Contributor

> @mdeyell-valid-mind with these changes for Entra, did you have a chance to ensure it keeps working with Auth0? can you push these to the branch? thanks

I haven't run this for Auth0 yet, but I'll check.

Comment thread validmind/api_client.py Dismissed
Contributor

@mdeyell-valid-mind mdeyell-valid-mind left a comment

works on my machine

@mdeyell-valid-mind
Contributor

I tested with Entra, Auth0, and API keys.

Contributor

@cachafla cachafla left a comment

Approving now but the quickstart notebook still needs to be updated to not refer to a separate file (docs/oidc-device-flow-release-notes.md) because we need to ensure this notebook is portable and does not have any extra dependencies.

…ickstart

Keep only the OIDC quickstart; avoid linking to repo markdown. Point readers to ValidMind Library docs and admins for audience/scope.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions
Contributor

github-actions Bot commented May 6, 2026

PR Summary

This PR adds comprehensive support for OIDC device flow authentication as an alternative to API key based authentication. The following changes and enhancements have been introduced:

  1. Documentation and Quickstart Updates:

    • A new release notes document in docs/oidc-device-flow-release-notes.md details the requirements from the identity provider, the necessary configuration (issuer, client_id, audience, and scopes), and outlines the device flow process according to RFC 8628.
    • A detailed interactive notebook (notebooks/quickstart/quickstart_model_documentation_oidc_device_flow.ipynb) has been added to guide users through model registration, dataset initialization, and running tests with OIDC authentication. The notebook explains how to set up credentials using environment variables and demonstrates end-to-end testing with ValidMind.
  2. API Client Enhancements:

    • The main API client initialization in validmind/api_client.py now supports OIDC parameters. Passing issuer and client_id triggers the device authorization flow; the existing API key parameters remain available for traditional authentication. The code enforces that API key and OIDC-based authentication cannot be mixed.
    • New helper functions have been added in the API client file to manage token retrieval, selection (between access token and id token based on issuer), and refreshing operations using device flow. The integration with the credentials store allows tokens to be cached securely under ~/.validmind/credentials.json.
  3. Credentials and Token Handling:

    • The validmind/credentials_store.py module now normalizes issuer, client_id, and audience values to ensure consistency in cache keys. It also implements safe file writes with restricted permissions (mode 600) to store tokens.
  4. OIDC Device Flow Module:

    • A new module validmind/oidc_device.py provides end-to-end support for the device authorization flow. It handles discovery of OIDC endpoints, requests for device authorization, polling for tokens, and token refresh operations. The implementation follows RFC 8628 standards with additional error handling and graceful handling of conditions such as slow_down, expired_token, and access_denied.
  5. Testing Enhancements:

    • Extensive tests have been added under tests/ to cover various scenarios, including API client validation for OIDC and API key exclusive modes, credential normalization, token caching and expiry checks, and the full device flow (including polling and refresh logic). These tests use mocks to simulate network calls and assert proper error responses where applicable.
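
The polling behavior summarized in item 4 can be sketched as below, following RFC 8628 section 3.5. This is not the code in validmind/oidc_device.py: poll_for_token is a hypothetical function, the HTTP call is injected as post_token so the loop can run without a network, and RuntimeError and TimeoutError are placeholders for the library's ValidMindAuthError.

```python
import time
from typing import Any, Callable, Dict

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(
    post_token: Callable[[Dict[str, str]], Dict[str, Any]],
    device_code: str,
    client_id: str,
    interval: float = 5.0,
    expires_in: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the token endpoint until the user approves, denies, or the code expires."""
    waited = 0.0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        body = post_token(
            {
                "grant_type": GRANT_TYPE,
                "device_code": device_code,
                "client_id": client_id,
            }
        )
        error = body.get("error")
        if error is None:
            return body  # success: the response carries the tokens
        if error == "authorization_pending":
            continue  # user has not finished in the browser yet
        if error == "slow_down":
            interval += 5  # RFC 8628: add 5 seconds to the polling interval
            continue
        # access_denied, expired_token, or any other error is terminal.
        raise RuntimeError(f"Device login failed: {error}")
    raise TimeoutError("Device code expired before the login was approved.")
```

Injecting post_token and sleep keeps the loop easy to unit test with canned responses, which is one way to cover the slow_down and access_denied paths.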

Overall, the PR functionally enhances authentication flexibility, provides thorough documentation and guidance for users, and includes an extensive suite of tests to ensure reliability and correctness.

Test Suggestions

  • Manually test the interactive device flow by running the quickstart notebook in a Jupyter environment to confirm that the verification URL and user code are correctly displayed and that tokens are cached properly.
  • Simulate error conditions (e.g., expired token, slow_down response, and access_denied) using a mocked OIDC provider to ensure proper exception handling and error messages.
  • Write integration tests to verify that after successful OIDC login the Authorization header is correctly populated in API requests.
  • Check that environment variables override parameters as expected and that mixing API keys with OIDC parameters raises the proper error.
  • Perform tests to validate file permissions for the cached credentials file and confirm that no sensitive token information is leaked.
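
For the file-permission suggestion above, a test could write a credentials file in a temporary directory and assert on its mode. The sketch below shows one owner-only, atomic write on POSIX. write_private_json and file_mode are hypothetical helpers, not the functions in validmind/credentials_store.py.

```python
import json
import os
import stat
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_private_json(path: Path, data: Dict[str, Any]) -> None:
    """Write JSON so the file is readable and writable by its owner only (POSIX)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates the temporary file with owner-only access on POSIX.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        os.chmod(tmp_name, 0o600)
        os.replace(tmp_name, path)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def file_mode(path: Path) -> int:
    """Return only the permission bits of the file, e.g. 0o600."""
    return stat.S_IMODE(path.stat().st_mode)
```

A test can then call write_private_json on a path under tempfile.TemporaryDirectory() and check that file_mode returns 0o600 on POSIX systems.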

Labels

enhancement New feature or request

5 participants