Skip to content

NASA_VIIRSActiveFiresEvents_Fixes_corrected config path as well as added skip_validation_config#2013

Open
balit-raibot wants to merge 3 commits into
datacommonsorg:masterfrom
balit-raibot:NASA_VIIRSActiveFiresEvents_Fixes
Open

NASA_VIIRSActiveFiresEvents_Fixes_corrected config path as well as added skip_validation_config#2013
balit-raibot wants to merge 3 commits into
datacommonsorg:masterfrom
balit-raibot:NASA_VIIRSActiveFiresEvents_Fixes

Conversation

@balit-raibot
Copy link
Copy Markdown
Contributor

@balit-raibot balit-raibot commented May 15, 2026

As discussed in the https://b.corp.google.com/issues/498452602, following changes have been done:

  • added config to skip validation and differ
  • modified path in config to avoid 'events/events' folder and have just 'events' folder

The changes have been validated in the test environment

https://pantheon.corp.google.com/batch/jobsDetail/regions/us-central1/jobs/nasa-viirsactivefiresevents-balit-20260515-131022/logs?e=13803378&mods=-monitoring_api_staging&project=datcom-infosys-dev

@balit-raibot balit-raibot changed the title corrected config path as well as added skip_validation_config NASA_VIIRSActiveFiresEvents_Fixes_corrected config path as well as added skip_validation_config May 15, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the GCS output directory path in the fire events pipeline configuration and updates the manifest to include a configuration override that disables several tools, including the import tool, while also removing the user script timeout. Feedback suggests that disabling the import tool prevents data from being imported and that removing the script timeout could lead to premature job termination. It is recommended to restore the timeout and enable the import tool to ensure the pipeline functions as intended.

Comment thread scripts/fires/firms/manifest.json
@balit-raibot
Copy link
Copy Markdown
Contributor Author

@gemini-code-assist review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the fire events pipeline configuration by modifying the GCS output directory and adds a config_override section to the manifest while removing a timeout. Feedback indicates that the config_override field is not yet supported by the ImportExecutor and that enabling skip_gcs_upload would disrupt version tracking. Furthermore, the change to the output directory path may incorrectly remove the 'events' subfolder, potentially violating directory structure guidelines.

Comment thread scripts/fires/firms/manifest.json Outdated
Comment thread scripts/fires/firms/fire_events_pipeline_config.py
@balit-raibot
Copy link
Copy Markdown
Contributor Author

@gemini-code-assist review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the GCS output directory path and updates the manifest configuration by removing the script timeout and adding several tool overrides. Feedback suggests that removing the user_script_timeout could cause long-running jobs to fail and that disabling the import tool and GCS upload might be unintended. Additionally, the change to the output directory path removes the stage subfolder entirely, which may conflict with the goal of maintaining a specific folder structure.

Comment on lines +19 to +25
"cron_schedule": "0 5 1 * *",
"config_override": {
"skip_gcs_upload": true,
"invoke_differ_tool": false,
"invoke_import_validation": false,
"invoke_import_tool": false
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The changes in the manifest appear to disable the automated import and remove safety timeouts:

  1. Timeout Removal: user_script_timeout was removed. Given that this pipeline processes global fire events for a full year, it likely requires a significant amount of time. Removing this may cause the automation to kill the job prematurely.
  2. Import Disabled: Setting "invoke_import_tool": false and "skip_gcs_upload": true prevents the generated data from being loaded into the database. The PR description only mentions skipping validation and the differ.

If the goal is to fix the production import, these settings should likely be adjusted to only skip the intended steps.

            "user_script_timeout": 86400,
            "cron_schedule": "0 5 1 * *",
            "config_override": {
                "invoke_differ_tool": false,
                "invoke_import_validation": false
            }

# as the import automation can copy all files in a single folder.
"output_dir":
"gs://{gcs_bucket}/{gcs_folder}/{stage}/{import_name}-{stage}-{year}-without-usa-",
"gs://{gcs_bucket}/{gcs_folder}/{import_name}-{stage}-{year}-without-usa-",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The PR description mentions a goal to "avoid 'events/events' folder and have just 'events' folder". However, the current change removes the {stage}/ directory (which resolves to events/) from the path entirely, while keeping the {stage} suffix in the filename prefix. This results in the output being placed in the parent directory (scripts/fires/firms/) with no events/ subfolder.

If the intention was to have a single events/ folder and remove the redundancy from the filename, consider keeping {stage}/ in the path but removing {stage}- from the filename prefix.

Suggested change
"gs://{gcs_bucket}/{gcs_folder}/{import_name}-{stage}-{year}-without-usa-",
"gs://{gcs_bucket}/{gcs_folder}/{stage}/{import_name}-{year}-without-usa-",

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant