Skip to content

Fix conf.has_option not respects default provider metadata#64209

Open
jason810496 wants to merge 16 commits intoapache:mainfrom
jason810496:task-sdk/fix-regression-for-make-conf-respect-provider-metadata
Open

Fix conf.has_option not respects default provider metadata#64209
jason810496 wants to merge 16 commits intoapache:mainfrom
jason810496:task-sdk/fix-regression-for-make-conf-respect-provider-metadata

Conversation

@jason810496
Copy link
Member

@jason810496 jason810496 commented Mar 25, 2026

Noted that all the ProviderManger includes RuntimeProviderManager, I just shorten as ProviderManger to refer both of them.

What

TL;DR;

Cleanup for deprecated path for the new "make conf respect provider metadata" mechanism.

The write(include_providers=True), with conf.make_sure_configuration_loaded(with_providers=False) or any conf.as_dict are the caller that introduce the recursive call between conf module and Provider Manager, so it would be better to clean them up in case they cause more regression just because we need to maintain those interfaces.

Before the clean up: configuration_description as stateful attributes

  • we need load_providers_configuration, restore_core_default_configuration to change the state of configuration_description just to reflect the sources of the conf
  • the configuration_description before the cleanup will not respect _provider_cfg_config_fallback_default_values or _provider_metadata_config_fallback_default_values.

After the current clean up: configuration_description as view

  • configuration_description is now a cache_property that can reflect _provider_cfg_config_fallback_default_values and _provider_metadata_config_fallback_default_values config sources
Was generative AI tooling used to co-author this PR?
  • Yes, Claude Code for fixing the unit tests

@jason810496 jason810496 changed the title Fix regression of respecting provider metadata for conf Fix has_option not respects provider metadata defaults Mar 25, 2026
@jason810496 jason810496 changed the title Fix has_option not respects provider metadata defaults Fix conf.has_option not respects default provider metadata Mar 25, 2026
@jason810496 jason810496 force-pushed the task-sdk/fix-regression-for-make-conf-respect-provider-metadata branch from c4c72ef to ab2a76f Compare March 26, 2026 04:11
@jason810496 jason810496 added the full tests needed We need to run full set of tests for this PR to merge label Mar 26, 2026
@jason810496 jason810496 force-pushed the task-sdk/fix-regression-for-make-conf-respect-provider-metadata branch from 6fe99cb to 191986a Compare March 26, 2026 07:25
@eladkal eladkal added this to the Airflow 3.2.0 milestone Mar 26, 2026
@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label Mar 26, 2026
@jason810496 jason810496 requested a review from Copilot March 26, 2026 08:29
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors Airflow’s shared configuration parser so provider metadata/cfg fallback defaults are respected by conf.has_option/conf.get_default_value, while removing deprecated “load/restore provider configuration” pathways that created conf↔providers-manager recursion.

Changes:

  • Reworks shared AirflowConfigParser provider-default handling by introducing a cached configuration_description “view” and adjusting fallback/sentinel behavior for has_option and provider metadata lookup.
  • Removes deprecated provider-config load/restore hooks from core + SDK providers managers and updates unit tests accordingly.
  • Adds new test coverage (core + task SDK) asserting provider metadata vs cfg fallback precedence, and tweaks default config lint script diagnostics.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
shared/configuration/src/airflow_shared/configuration/parser.py Core logic changes for provider fallbacks, configuration_description, has_option, get_default_value, and write/as_dict behavior.
shared/configuration/tests/configuration/test_parser.py Updates shared parser tests for the new provider-fallback behavior.
airflow-core/src/airflow/configuration.py Adapts core config wrapper to new internal _configuration_description and cache invalidation.
airflow-core/src/airflow/providers_manager.py Removes deprecated conf-loading side effect from provider initialization.
airflow-core/tests/unit/core/test_configuration.py Adds/updates tests for provider fallback precedence and provider toggle behavior.
task-sdk/src/airflow/sdk/configuration.py Mirrors core configuration changes for SDK config wrapper.
task-sdk/src/airflow/sdk/providers_manager_runtime.py Removes deprecated conf-loading side effect and unused property.
task-sdk/tests/task_sdk/test_configuration.py New SDK tests validating provider metadata/cfg fallback priority and has_option.
task-sdk/tests/task_sdk/test_providers_manager_runtime.py Updates runtime providers manager tests to use cache invalidation.
devel-common/src/tests_common/test_utils/config.py Adds shared test data for provider-metadata vs cfg-fallback defaults.
airflow-core/src/airflow/config_templates/provider_config_fallback_defaults.cfg Updates a celery default import path in cfg fallback defaults.
shared/providers_discovery/src/airflow_shared/providers_discovery/providers_discovery.py Switches logging from structlog to stdlib logging.
scripts/in_container/run_check_default_configuration.py Adjusts env/logging and failure diagnostics for default config linting.

@jason810496 jason810496 marked this pull request as ready for review March 26, 2026 09:48
Copy link
Member Author

@jason810496 jason810496 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI just passed with full test needed label (when the PR still marked as draft) and I add more test cases to harden the conf compatibility to avoid the regression.

Comment on lines +329 to +331
# For cfg-only options with no provider metadata, infer sensitivity from name.
if "sensitive" not in opt_dict and option.endswith(("password", "secret")):
opt_dict["sensitive"] = True
Copy link
Member Author

@jason810496 jason810496 Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure do we need _secrets_masker().should_hide_value_for_key(option) or not?

We have secret masker for CLI, API, etc (anyplace that might expose the sensitive conf to user). The secret masker will handle the masking for us at higher level so perhaps we don't even need to handle the sensitive case here.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But secrets masker has to know if its sensitive? Some field names might not be deemed sensitive from name, like remote_task_handler_kwargs

Copy link
Member Author

@jason810496 jason810496 Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case here only handle provider_default_fallback_values.cfg as we couldn't specify which [section/option] is sensitive or not in .cfg.

We have sensitive: bool field in provider.yaml though, so I'm trying to make the .cfg case here compatible with the sensitive field of provider.yaml by inferring from the option name.

But maybe it's not worthwhile to handle at all as you mention even we use secret masker here is still not accurate enough to make sure the option is sensitive or not.

@jason810496 jason810496 added the affected_version:3.2.0beta Use for reporting issues with 3.2.0beta label Mar 26, 2026
@amoghrajesh
Copy link
Contributor

I want to take a look at this one tomorrow.

@jason810496
Copy link
Member Author

I want to take a look at this one tomorrow.

Thank you Amogh!

Lee-W
Lee-W previously approved these changes Mar 27, 2026
Copy link
Member

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits, but overall looks good.

Simplifies and centralizes provider configuration loading by removing redundant methods and properties, ensuring consistent state management. Updates fallback logic to use a sentinel value instead of None, improving accuracy in config lookups. Improves test coverage for provider metadata fallbacks and clarifies lazy initialization behavior. Streamlines provider manager interfaces to reduce duplication and potential import cycles.
Ensures provider configuration sections do not overlap with core sections
and gives priority to provider metadata when merging configuration options.
Refactors tests to verify correct toggling of provider configuration and
proper separation of provider and core config sections. Updates fallback
lookup order and cleans up unused properties for clarity and maintainability.
Root cause: providers_discovery.py used structlog.getLogger() directly. Before structlog.configure() is called (which happens later in settings.py:726), structlog's default PrintLogger writes to stdout with no level filtering. So debug logs during early provider
  discovery pollute the stdout of airflow config list --default, corrupting the generated config file.

Fix: Switched to logging.getLogger() (stdlib). stdlib logging defaults to WARNING level and writes to stderr, so debug logs are suppressed and stdout stays clean. This is also the correct pattern for shared library code — structlog configuration is the application's
  responsibility.
@jason810496 jason810496 force-pushed the task-sdk/fix-regression-for-make-conf-respect-provider-metadata branch from 89206ea to 00e6a84 Compare March 27, 2026 04:44
@jason810496
Copy link
Member Author

Thanks @Lee-W for the review and all the CI pass after addressing the comments. It's ready for next review round, thanks!

Copy link
Contributor

@amoghrajesh amoghrajesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, nothing too major but worth bringing to your notice

Comment on lines +329 to +331
# For cfg-only options with no provider metadata, infer sensitivity from name.
if "sensitive" not in opt_dict and option.endswith(("password", "secret")):
opt_dict["sensitive"] = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But secrets masker has to know if its sensitive? Some field names might not be deemed sensitive from name, like remote_task_handler_kwargs

Comment on lines +393 to 395
if value is not VALUE_NOT_FOUND_SENTINEL:
return value
return VALUE_NOT_FOUND_SENTINEL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed this, shouldn't it be inverted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VALUE_NOT_FOUND_SENTINEL here is more accurate than existing None if I understand correct.
The whole method looks like the following now:

def _get_option_from_provider_metadata_config_fallbacks(
        self,
        deprecated_key: str | None,
        deprecated_section: str | None,
        key: str,
        section: str,
        issue_warning: bool = True,
        extra_stacklevel: int = 0,
        **kwargs,
    ) -> str | ValueNotFound:
        """Get config option from provider metadata fallback defaults."""
        value = self.get_from_provider_metadata_config_fallback_defaults(section, key, **kwargs)
        if value is not VALUE_NOT_FOUND_SENTINEL:
            return value
        return VALUE_NOT_FOUND_SENTINEL

@jason810496 jason810496 marked this pull request as draft March 27, 2026 12:08
restore_core_default_configuration instead of directly remove
@jason810496
Copy link
Member Author

Thanks @amoghrajesh for the review and good catch on compatibility, thread-safe issue for test cases!

@jason810496 jason810496 marked this pull request as ready for review March 27, 2026 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affected_version:3.2.0beta Use for reporting issues with 3.2.0beta area:task-sdk full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SDK conf does not resolve provider metadata config defaults (has_option returns False)

5 participants