Updated key extraction aliasing to preserve per-view space constraints and solve run time issues #220
Draft
mostafasafdarnejadCognite wants to merge 60 commits into
Co-authored-by: Cursor <cursoragent@cursor.com>
…nhance documentation - Changed 'rule_name' to 'rule_id' in key extraction logic for consistency. - Added 'Passthrough' extraction method to README and documentation, detailing its use cases and configuration. - Updated README to reflect the addition of the new extraction method, increasing the total to five. - Adjusted paths in main.py for clarity regarding the package structure. - Removed outdated configuration examples and pipelines to streamline the module. Co-authored-by: Cursor <cursoragent@cursor.com>
…d improve documentation - Updated `_load_configs` to return `alias_writeback_property`, allowing configuration of the property name for alias persistence. - Enhanced documentation to clarify the alias write-back process, including default behavior and configuration options. - Adjusted README and guides to reflect changes in extraction methods and aliasing functionality. - Improved error handling and logging for alias persistence operations. Co-authored-by: Cursor <cursoragent@cursor.com>
…pport and update documentation - Added support for alias mapping table rules that load data from a Cognite RAW catalog, enhancing aliasing capabilities. - Updated the AliasingEngine to accept a Cognite client for loading alias mapping data. - Enhanced README and configuration guide to include details on the new alias mapping table feature and its configuration. - Improved error handling in the aliasing engine for better logging and user feedback. Co-authored-by: Cursor <cursoragent@cursor.com>
Make fn_dm_key_extraction runnable locally against CDF-fetched entities by adding a local runner with caching, fixing handler/pipeline import behavior, and correcting rule conversion/extraction metadata so results include applied rule and source-field provenance. Made-with: Cursor
…so added the initial mapping between legacy and new architecture for key extraction
… workflow with config file coming from a workflow config (instead of an extraction pipeline)
…lso, added functionality to read from and write to RAW in the workflow so that key extraction and aliasing results are easy to debug
… to CDF functionality (since they were using extraction pipeline approach)
…action and aliasing functions. Adjusted the repo root for local runs and modified import statements to use relative paths. Updated usage documentation for the config path in the local run script.
…ons for writing foreign key references and specifying a write-back property. Updated configuration and documentation to reflect these changes, including new CLI arguments and environment variable support. Improved key extraction to deduplicate foreign key references and ensure proper integration with alias persistence.
… in key extraction aliasing module. Expanded transformation types from 10 to 11, detailing the new catalog-driven aliasing method. Enhanced README and configuration guide to clarify usage and integration with Cognite RAW.
…date method call in PatternRecognitionHandler, and add new unit tests for various transformation handlers including bidirectional and cascade substitutions, context-based suffix and prefix handling, and pattern recognition enhancements.
…rt paths in aliasing functions. Introduced a new job in the build workflow to validate the cdf_key_extraction_aliasing site workflow against the generator. Updated import statements to streamline dependencies and enhance code organization across various modules.
… introducing CLI arguments for scope and config path. Updated main entry point and documentation to reflect new configuration loading mechanisms. Removed deprecated pipeline examples and streamlined YAML structure for clarity and usability.
…_PLANT_A__AREA_01__COOLING_SYS, streamlining the module by eliminating unused configurations. This change supports ongoing efforts to enhance clarity and maintainability in the key extraction aliasing module.
… extraction aliasing module, streamlining the codebase and eliminating unused documentation. This change supports ongoing efforts to enhance clarity and maintainability in the module.
…removing deprecated YAML files and updating README documentation. Enhanced clarity by consolidating example paths and improving descriptions for configuration entries. This change supports ongoing efforts to streamline the module and improve maintainability.
… Introduced a new function to match source views based on instance space and filters, enhancing the ability to process relevant views. Updated main entry point and documentation to reflect these changes, including improved help descriptions for CLI arguments. Removed deprecated example README files to streamline the configuration examples.
… validation rules for confidence matching. Updated YAML files to include detailed confidence match rules for blacklist and ISA compliance, improving the robustness of key extraction processes. Removed deprecated parameters and streamlined configurations for better clarity and usability.
… scope rescan in key extraction aliasing module. Updated configuration to enable incremental processing and added support for processing all views when incremental mode is active. Enhanced documentation to reflect these changes and improved handling of YAML configurations for key extraction. Additionally, refactored functions to streamline the processing logic and improve maintainability.
…dule. Added support for skipping unchanged source inputs based on a SHA-256 hash of extraction inputs, enhancing efficiency in incremental updates. Updated README documentation to reflect new processing behavior and configuration options. Refactored related functions to improve maintainability and clarity in the incremental state update logic.
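The hash-based skip described in this commit can be sketched roughly as follows. This is a minimal illustration, not the module's code: `extraction_input_hash` and `should_skip` are hypothetical names, and the canonical JSON serialization is an assumption about how the inputs might be normalized before hashing.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def extraction_input_hash(inputs: Dict[str, Any]) -> str:
    """Return a stable SHA-256 hex digest for a dict of extraction inputs."""
    # sort_keys makes the serialization independent of dict insertion order;
    # default=str is a fallback for values json cannot encode (e.g. datetimes).
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_skip(inputs: Dict[str, Any], previous_hash: Optional[str]) -> bool:
    """Skip an instance when its stored hash matches the hash of the current inputs."""
    return previous_hash is not None and previous_hash == extraction_input_hash(inputs)
```

Hashing only the fields that feed extraction (rather than the whole instance) would mean that unrelated property updates do not trigger a re-run; whether the module does this is not stated in the commit message.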
…y extraction configuration. The `verify cdf_key_extraction_aliasing site workflow` job was removed to streamline the build process. Additionally, new parameters for `full_rescan` were introduced in the key extraction configuration, replacing the `overwrite` parameter in various YAML examples to enhance clarity and usability. Documentation was updated accordingly to reflect these changes.
…e semantic expansion support. Updated `type_mappings` in YAML files to include comprehensive mappings for various equipment types, allowing for hierarchical tag patterns with optional leading segments. Improved documentation to reflect the new configuration options, including the ability to merge user-defined mappings with built-in ISA presets. Added tests to validate the new functionality and ensure correct behavior of semantic expansion logic.
…handling and improve documentation. Updated YAML files to replace `scope_document` with `configuration` for workflow inputs, enhancing clarity and consistency. Adjusted scope hierarchy definitions and improved multi-site support in the configuration. Enhanced README and related documentation to reflect these changes, ensuring better usability and understanding of the module's functionality.
…ew definitions and enhance validation rules. Introduced a `source_views` section in YAML files for better clarity and organization, allowing for more flexible data extraction configurations. Improved documentation to reflect these changes, including detailed descriptions of new parameters and examples for better usability. Adjusted existing configurations to align with the updated structure, ensuring consistency across the module.
…ations for multiple sites. This includes the deletion of YAML files for site-specific workflows and their corresponding triggers, streamlining the module and eliminating outdated definitions. The changes enhance maintainability and clarity in the configuration structure.
Carry over non-import behavior changes for RAW upload, workflow status transitions, scope filter handling, and function handlers/pipelines while excluding import-only local-run path adjustments. Made-with: Cursor
This commit introduces a new module that serves as a local CLI entry point for fetching CDF instances from data model views, executing key extraction and aliasing, and writing the results to JSON files. The module includes comprehensive configuration options, supports environment-based credential loading, and provides functionality for building and cleaning workflow artifacts. Enhanced documentation is included to guide users on usage and configuration.
…le.py` as the local CLI entry point. Adjusted documentation and configuration references throughout the codebase to reflect this change, ensuring consistency in usage instructions. Enhanced YAML configuration files to clarify the scope hierarchy and workflow management, including updates to the `default.config.yaml` and related README files for improved usability and understanding.
…les in the key extraction and aliasing functionality. This change enhances module organization and ensures consistency in import practices.
…ast_updated_time' error
…y and configuration clarity This commit modifies the local CLI entry point to use `module.py run` instead of `module.py`, ensuring consistent usage instructions across documentation. It updates the `default.config.yaml` to change the `scope_build_mode` from `trigger_only` to `full`, improving workflow artifact generation. Additionally, the README and other documentation files have been revised to reflect these changes, including clearer instructions for local runs and configuration examples. The validation rules for aliasing have also been refined to enhance data integrity.
…gnitedata/library into feature/key-extraction-aliasing
…liasing across the module. This update includes changes to documentation, CLI descriptions, and internal comments to reflect the new terminology, enhancing clarity and consistency throughout the codebase.
…flow samples Add Vite/React local operator UI with FastAPI backend for default.config and scope YAML; source views, key discovery, and aliasing editors with Rules/Settings subtabs and i18n. Extend scope build with workflow config copy, trigger path helpers, document limits, and tests; update orchestration, hierarchy, and patch logic. Refresh module docs and configuration guide; add per-site workflow YAML samples. Align local DM key extraction runner with pipeline filter operators where needed. Made-with: Cursor
…updates - Add structured discovery/aliasing/validation editors, source view filters, scope hierarchy and artifact tree UX; i18n (en/es/ja/nb/pt) and utilities. - Key extraction: source_view filter build, rule_utils/DataStructures/engine and adapter updates; parameters→config for dict rules; composite strategy dropdown in discovery rules UI; example and workflow YAML cleanup. - Tests: conftest, integration/unit coverage for filters, extraction, aliasing, local runner; new test_source_view_filter_build. - Docs: configuration_guide; migration plan for token reassembly simplification. Made-with: Cursor
…orkflow cleanup Replace legacy per-method handlers with field-rule and heuristic handlers, drop fixed-width and token-reassembly examples, remove duplicate site workflow YAML in favor of templates, add run_all and incremental hashing updates, refresh config/examples/docs and operator UI (including i18n). Made-with: Cursor
- Source Views: ValidationStructuredEditor for view.validation, advanced YAML, and button to clear overlay (global Key Discovery validation only) - Extract withoutRegexpMatch to ui/src/utils/validationConfig.ts for reuse - i18n: sourceViews.perViewValidation* across locales - Document cdf_access_control dev ports in howto and vite proxy comment - default.config: add CLK scope leaf; workflow YAMLs: validation rule fields and sample discovery rule ordering Made-with: Cursor
…low canvas - Add aliasing and confidence-match rule reference resolution and pipeline I/O helpers. - Extend scope document DM, aliasing/key-extraction engines, and pipelines for extraction-scoped behavior. - UI: match definition panels, validation refs editor, React Flow workflow canvas and scope sync; i18n and styling. - Tests for aliasing pathways, extraction pipeline, rule refs, pipeline_io; configuration guide and workflow templates. - Add workflow.local.canvas.yaml and update local/template workflow config. Made-with: Cursor
…pipeline sync, rule definitions - Add workflow.execution.graph.yaml and workflow_channel_contracts.md as SSOT for fn_dm_* DAG; validate against WorkflowVersion in scope build and validate script. - Add workflow_execution_graph.py, KahnRunContext, kahn_workflow_executor with parallel reference index and aliasing after key extraction; refactor run.py. - Sync canvas to key_extraction extraction_rules[].aliasing_pipeline from extraction to aliasing data edges and composition edges; seed per-rule pipelines from YAML. - Restore aliasing_rule_definitions for pipeline refs (cogniteasset_explicit_aliases_from_raw and related rules) in workflow.local.config.yaml. Made-with: Cursor
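For readers unfamiliar with the technique behind `kahn_workflow_executor`, the following is a generic sketch of Kahn's algorithm that groups DAG nodes into levels whose members could run in parallel. It is not the PR's implementation: `kahn_levels` and the node names in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def kahn_levels(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group DAG nodes into levels; every node in a level has all its inputs done."""
    node_list = list(nodes)
    indegree: Dict[str, int] = {n: 0 for n in node_list}
    children: Dict[str, List[str]] = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Start with nodes that have no incoming edges.
    ready = sorted(n for n in node_list if indegree[n] == 0)
    levels: List[List[str]] = []
    seen = 0
    while ready:
        levels.append(ready)
        seen += len(ready)
        next_ready: List[str] = []
        for node in ready:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = sorted(next_ready)

    # Any node never reaching indegree 0 is part of a cycle.
    if seen != len(node_list):
        raise ValueError("workflow graph contains a cycle")
    return levels
```

With hypothetical nodes `key_extraction`, `reference_index`, `aliasing`, and `alias_persistence`, and edges from `key_extraction` to both `reference_index` and `aliasing`, the second level contains `aliasing` and `reference_index` together, which corresponds to running them in parallel after key extraction.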
…names - On match-validation node drop, append confidence_match_rule_definitions stub and set node label/name; prefixes match_<validation_rule_context>_n. - On aliasing handler drop, append aliasing_rules row and ref.aliasing_rule_name; names use handler id prefix (e.g. prefix_suffix_1). - Add ensureMatchDefinitionStub, ensureAliasingRuleForPaletteDrop helpers; sanitizeRuleNamePrefix in ruleNaming. - Add Vitest config and unit tests for helpers. Made-with: Cursor
- Adjust extraction aliasing_pipeline nesting for several rules. - Reorder/move aliasing rule descriptions; update strip_numeric_unit_prefix block. Made-with: Cursor
…nvas, and pipeline updates Add workflow scope association helpers and compile script; extend flow UI with subgraph/subflow editing, alignment, and scope wiring. Refresh key extraction and aliasing confidence matching, examples, local runner, tests, i18n, and workflow templates. Made-with: Cursor
…and UI - Add workflow_compile package (canvas DAG, codegen, legacy IR) and scope_canvas_merge; extend execution graph and DM handlers. - Add CDF deploy/workflow scripts, scope document load, template promotion, and example site workflow YAMLs; update workflow templates and docs. - Local runner: ui_progress, kahn executor and payload updates. - UI: WorkflowCompileToolbar, compile mode utils, flow canvas and i18n. - Tests for compile, scripts, local runner, and scope suffix filtering. - Replace Cursor static KEA rule with workflow-validation rule and skill. Made-with: Cursor
- Reorder workflow graph so incremental_state feeds ext_asset_tag_candidate in the correct edge order. - Extend CDF workflow scripts (io/run) and deploy_scope_cdf; add deploy_kea_functions_cdf_api with unit tests. - Add fn_dm_incremental_state_update requirements.txt. - Update DM function handlers and shared utils (key extraction, aliasing, reference index, alias persistence). - UI: App and server API hooks; new i18n keys across locales. Made-with: Cursor
Updated key extraction and aliasing to address run time issues, especially to preserve per-view space constraints (`instance_space` or a `node.space` filter) so that CogniteFile stays scoped to its intended space. Each view now uses only its own configured space restriction, with no cross-space leakage.
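The intended behavior can be illustrated with a small in-memory sketch. It assumes each source view is a dict with an optional `instance_space` key and each instance carries a `space` field; `instances_for_view` and the space names are hypothetical and do not reflect the module's actual query code.

```python
from typing import Any, Dict, List, Optional


def instances_for_view(
    view: Dict[str, Any], instances: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return only the instances allowed by this view's own space restriction."""
    # The restriction is read from the view itself, never shared between views.
    space: Optional[str] = view.get("instance_space")
    if space is None:
        # No restriction configured: every instance is in scope for this view.
        return list(instances)
    return [inst for inst in instances if inst.get("space") == space]


views = [
    {"view_external_id": "CogniteFile", "instance_space": "sp_files"},
    {"view_external_id": "CogniteAsset", "instance_space": "sp_assets"},
]
instances = [
    {"external_id": "file-1", "space": "sp_files"},
    {"external_id": "asset-1", "space": "sp_assets"},
]
scoped = {v["view_external_id"]: instances_for_view(v, instances) for v in views}
```

Because the filter is derived per view, the CogniteFile result contains only `file-1` even though another view in the same run is configured for a different space.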