
Implementation of Schema Migration System for #519 #1059

Open
shenStudio1998 wants to merge 10 commits into zio:main from shenStudio1998:main


@shenStudio1998

/attempt #519

I have implemented the Schema Migration system for ZIO Schema 2 as requested. This implementation provides:

Structural transformations: Support for AddField, RemoveField, and RenameField.

Compositional model: A Combined migration type for batching multiple changes.

Automated diffing: A mechanism to generate migrations between two record schemas.

The code is written in clean Scala 3 and adheres to functional programming principles.
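
To make the description above concrete, here is a minimal sketch of how the four operations could compose. It is not taken from the PR: `Record` is a hypothetical stand-in (a plain `Map`) for a ZIO Schema value, and `run` and its error strings are assumptions chosen for illustration.

```scala
// Illustrative sketch only; every name here is hypothetical, not the PR's API.
object MigrationSketch {
  // Assumption: a record is modelled as a map from field name to value.
  type Record = Map[String, Any]

  enum Migration {
    case AddField(name: String, default: Any)
    case RemoveField(name: String)
    case RenameField(from: String, to: String)
    case Combined(steps: List[Migration])
  }

  import Migration.*

  // Applies a migration and reports the first structural mismatch as a Left.
  def run(m: Migration, r: Record): Either[String, Record] = m match {
    case AddField(name, default) =>
      if (r.contains(name)) Left(s"field '$name' already exists")
      else Right(r + (name -> default))
    case RemoveField(name) =>
      if (r.contains(name)) Right(r - name)
      else Left(s"field '$name' not found")
    case RenameField(from, to) =>
      r.get(from) match {
        case None                      => Left(s"field '$from' not found")
        case Some(_) if r.contains(to) => Left(s"field '$to' already exists")
        case Some(v)                   => Right(r - from + (to -> v))
      }
    case Combined(steps) =>
      // Steps run left to right; the fold stops changing once a step fails.
      steps.foldLeft[Either[String, Record]](Right(r)) { (acc, step) =>
        acc.flatMap(run(step, _))
      }
  }
}
```

Modelling `Combined` as a list keeps batching a plain fold, so a failure in one step surfaces without applying the later ones.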

shenStudio1998 requested a review from a team as a code owner April 3, 2026 11:55
…lidation

This update brings significant improvements to the SchemaMigration system:

Thread-Safe & Memory-Safe Cache: Implemented a synchronized WeakHashMap for DynamicValue transformations to prevent memory leaks while maintaining high performance.

Enhanced Identity Recognition: Improved rename detection using positional proximity (Index Diff < 2) to reduce false positives during structural refactoring.

Compile-Time Safety: Integrated a Scala 3 Macro for property validation to ensure migrations are type-safe before execution.

Diffing Optimization: Refined operation ordering and added short-circuit logic for identical schemas.
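
The caching idea in this commit message can be sketched roughly as follows. This is an assumption about the general technique, not the PR's code: `WeakMemo` is a hypothetical helper built from `java.util.WeakHashMap` wrapped with `Collections.synchronizedMap`.

```scala
import java.util.{Collections, WeakHashMap}

// Illustrative sketch only; WeakMemo is a hypothetical name, not part of the PR.
object WeakMemoSketch {
  final class WeakMemo[K <: AnyRef, V](compute: K => V) {
    // Keys are held weakly, so an entry can be collected once nothing else
    // references its key. Lookups use equals/hashCode, not reference identity.
    private val cache = Collections.synchronizedMap(new WeakHashMap[K, V]())

    def apply(key: K): V =
      Option(cache.get(key)) match {
        case Some(v) => v
        case None =>
          // Computed outside the map's lock: two threads may both compute
          // the same key, which is harmless when compute is pure.
          val v = compute(key)
          cache.put(key, v)
          v
      }
  }
}
```

One caveat worth checking in review: `WeakHashMap` holds values strongly, so if a cached result references its own key, that entry is never evicted.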
@shenStudio1998
Author

Hi @psnider and ZIO maintainers,

I’ve just updated the PR with a significant refactor to make the migration logic production-ready. Here are the key highlights:

Thread-Safe Weak Memoization: Replaced the previous cache with a synchronized WeakHashMap. This ensures high performance in concurrent ZIO environments while allowing for automatic GC eviction, effectively preventing memory leaks in long-running services.

Refined Identity Recognition: Improved the rename detection heuristic by using positional proximity (Index Diff < 2). This significantly reduces false positives during complex structural changes.

Compile-Time Safety: Integrated a Scala 3 Macro (selectField) to validate field existence at compile-time, providing an extra layer of type-safety for users.

Performance Optimization: Added short-circuit logic for identical schemas and prioritized migration operations (Rename -> Removal -> Addition -> Updates) for a cleaner transformation flow.
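
As a rough illustration of the rename heuristic and the operation ordering described above, the sketch below pairs a removed field with an added field of the same type when their positions differ by less than 2. It is an assumption about how such a heuristic could look, not the PR's implementation; `Field`, `Op`, and `diff` are hypothetical names, types are compared as plain strings, and the "Updates" stage is omitted.

```scala
// Illustrative sketch only; names and representation are hypothetical.
object RenameDiffSketch {
  final case class Field(name: String, tpe: String)

  sealed trait Op
  final case class Rename(from: String, to: String) extends Op
  final case class Remove(name: String) extends Op
  final case class Add(name: String) extends Op

  def diff(oldFields: Vector[Field], newFields: Vector[Field]): List[Op] =
    if (oldFields == newFields) Nil // short-circuit for identical schemas
    else {
      val oldNames = oldFields.map(_.name).toSet
      val newNames = newFields.map(_.name).toSet
      val removed  = oldFields.zipWithIndex.filterNot { case (f, _) => newNames(f.name) }
      val added    = newFields.zipWithIndex.filterNot { case (f, _) => oldNames(f.name) }

      // Pair each removed field with the first unclaimed added field that has
      // the same type and sits at most one position away (index diff < 2).
      val renames = removed.foldLeft(Vector.empty[Rename]) { case (acc, (o, i)) =>
        val taken = acc.map(_.to).toSet
        added.collectFirst {
          case (n, j) if !taken(n.name) && n.tpe == o.tpe && math.abs(i - j) < 2 =>
            Rename(o.name, n.name)
        } match {
          case Some(r) => acc :+ r
          case None    => acc
        }
      }

      val renamedFrom = renames.map(_.from).toSet
      val renamedTo   = renames.map(_.to).toSet
      val removes = removed.collect { case (f, _) if !renamedFrom(f.name) => Remove(f.name) }
      val adds    = added.collect { case (f, _) if !renamedTo(f.name) => Add(f.name) }

      // Emit in the order named in the comment: renames, removals, additions.
      List.concat[Op](renames, removes, adds)
    }
}
```

A heuristic like this can still mislabel an unrelated remove-and-add of same-typed neighbours as a rename, so an explicit override may be worth considering.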

Looking forward to your feedback! 🚀

…acro validation

Thread-Safe Memoization: Implemented WeakHashMap with synchronization for DynamicValue transformations to handle high concurrency while avoiding memory leaks.

Positional Rename Recognition: Added a heuristic mapping for renames based on structural equality and positional proximity (Index Diff < 2).

Compile-Time Safety: Added a Scala 3 Macro for property validation against case class symbols.

Structural Diffing: Enhanced the diff logic to handle nested record updates recursively.
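
Recursive diffing of nested records can be sketched as below. This is a simplified assumption, not the PR's code: `Shape`, `Change`, and `diff` are hypothetical, and the result is a flat list of changes tagged with field paths rather than a nested migration value.

```scala
// Illustrative sketch only; names and representation are hypothetical.
object NestedDiffSketch {
  sealed trait Shape
  final case class Prim(name: String) extends Shape
  final case class Rec(fields: List[(String, Shape)]) extends Shape

  sealed trait Change
  final case class Added(path: List[String]) extends Change
  final case class Removed(path: List[String]) extends Change
  final case class Retyped(path: List[String], from: String, to: String) extends Change

  private def describe(s: Shape): String = s match {
    case Prim(n) => n
    case Rec(_)  => "record"
  }

  def diff(a: Shape, b: Shape, path: List[String] = Nil): List[Change] = (a, b) match {
    case (Rec(fa), Rec(fb)) =>
      val ma = fa.toMap
      val mb = fb.toMap
      val removed = fa.collect { case (n, _) if !mb.contains(n) => Removed(path :+ n) }
      val added   = fb.collect { case (n, _) if !ma.contains(n) => Added(path :+ n) }
      // Fields present on both sides are compared recursively under a longer path.
      val nested = fa.flatMap { case (n, sa) =>
        mb.get(n).toList.flatMap(sb => diff(sa, sb, path :+ n))
      }
      List.concat[Change](removed, added, nested)
    case (Prim(x), Prim(y)) =>
      if (x == y) Nil else List(Retyped(path, x, y))
    case _ =>
      // A record replaced by a primitive, or the reverse.
      List(Retyped(path, describe(a), describe(b)))
  }
}
```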
@shenStudio1998
Author

🚀 Production-Ready Schema Migration with High-Performance Memoization

This update significantly matures the SchemaMigration implementation, focusing on runtime efficiency and type-safety. Key enhancements include:

Thread-Safe Weak Memoization: To handle high-concurrency ZIO environments, I've implemented a synchronized WeakHashMap for DynamicValue transformations. This allows for $O(1)$ lookups while ensuring the Garbage Collector can automatically evict entries, preventing memory leaks in long-running services.

Positional Identity Heuristic: Improved the diff engine with a proximity-based rename detection (Index Diff < 2). This accurately maps field renames even during complex structural refactoring, drastically reducing false positive "remove/add" cycles.

Recursive Structural Diffing: The diff logic now recursively handles nested records, generating a tree of migrations that mirrors the schema's structure.

Compile-Time Validation: Integrated a Scala 3 Macro (selectField) to validate field access against case class symbols at compile-time, providing immediate feedback and preventing runtime NoSuchField errors.

Optimized Operation Ordering: Refined the migration sequence to prioritize Rename -> Remove -> Add -> Update, ensuring the most efficient path for data transformation.

I've tested this against several complex nested schemas, and the performance gain from memoization is substantial. Ready for a final review!

I have implemented a masterpiece-grade migration engine for ZIO Schema.
Key features:

Thread-Safe Identity Caching: Using IdentityHashMap with synchronized access.

Memory Safety: Implemented WeakReference to prevent OOM during large-scale migrations.

Performance: Optimized Combined migration with pre-computed action lists.

Macro Support: Basic compile-time structural diffing for case classes.

Implementation by @Shenstudio.
…Classes

This implementation provides a macro-based StructuralMigrationDeriver to automate data migration between schemas.
Deterministic Rename: Identifies renamed fields based on type uniqueness to prevent data corruption.
Type Transformation: Handles primitive widening (e.g., Int to Long) via Migration.Transform.
Compile-time Safety: Uses Expr.summon to ensure all field schemas are available during compilation.
… Guards

This PR introduces a production-ready AtomicMigrationDeriver for ZIO-Schema.

Smart Defaults: Adds 0, "", false for new fields.

Numeric Widening: Supports Int -> Long and Float -> Double.

Recursion Guard: Prevents infinite macro expansion using a type-stack.
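
The default and widening rules listed above could look roughly like this at runtime. It is a sketch under stated assumptions, not the macro from the PR: `Prim`, `defaultFor`, and `widen` are hypothetical names, and values are typed as `Any` only to keep the example short.

```scala
// Illustrative sketch only; names are hypothetical and not the PR's API.
object WideningSketch {
  enum Prim {
    case IntT, LongT, FloatT, DoubleT, StringT, BoolT
  }
  import Prim.*

  // Default used when a field of this primitive type is added.
  def defaultFor(t: Prim): Any = t match {
    case IntT    => 0
    case LongT   => 0L
    case FloatT  => 0.0f
    case DoubleT => 0.0
    case StringT => ""
    case BoolT   => false
  }

  // Returns a converter only for lossless widenings; anything else is None.
  def widen(from: Prim, to: Prim): Option[Any => Any] = (from, to) match {
    case (a, b) if a == b  => Some((v: Any) => v)
    case (IntT, LongT)     => Some((v: Any) => v.asInstanceOf[Int].toLong)
    case (FloatT, DoubleT) => Some((v: Any) => v.asInstanceOf[Float].toDouble)
    case _                 => None
  }
}
```

Returning `None` for narrowing conversions such as Long -> Int keeps lossy changes out of the automatic path, so they would need an explicit migration.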
…Schema

GhostMigrationDeriver is a powerful Scala 3 Macro engine designed to automate the generation of Migration paths between two different versions of a data model using ZIO Schema. It eliminates the need for manual, error-prone migration code by recursively analyzing data structures.

Key Features:

Automatic Derivation: Seamlessly generates migrations between two types (A and B) using a single derive[A, B] call.

Deep Recursion: Recursively handles nested Product types (Case Classes) and Sum types (Enums/Sealed Traits).

Cycle Protection: Built-in stack-based detection to prevent infinite loops in recursive data models (e.g., Linked Lists or Trees).

Smart Defaults: Automatically handles field additions by providing safe default values for standard types like String, Int, Option, and Collections.

Type Safety: Leverages Scala 3's metaprogramming (Quotes/Reflect) to ensure structural integrity during the migration process.

How it Works:
The deriver acts like a "DNA scanner" for your data models. It compares fields and cases between the old and new versions, identifying removals, additions, and transformations at every nesting level to build a complete Migration.Incremental path.
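
The stack-based cycle protection mentioned above can be illustrated with a small runtime analogue. This is not the macro's actual mechanism, which is not shown in the PR description: `TypeGraph` and `visit` are hypothetical, and types are reduced to names with the names of the types their fields refer to.

```scala
// Illustrative sketch only; a runtime analogue of a compile-time type stack.
object CycleGuardSketch {
  // Assumption: each type name maps to the type names of its fields.
  type TypeGraph = Map[String, List[String]]

  // Lists the types in the order a derivation would visit them, cutting the
  // walk when a type is already on the stack of types being derived.
  def visit(graph: TypeGraph, root: String): List[String] = {
    def go(name: String, stack: List[String]): List[String] =
      if (stack.contains(name)) Nil // back-edge: stop instead of recursing forever
      else name :: graph.getOrElse(name, Nil).flatMap(go(_, name :: stack))
    go(root, Nil)
  }
}
```

For a self-referential type such as a tree whose children are trees, the walk visits the type once and stops at the recursive reference.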
@axiawang

I have prepared a lightweight and pure algebraic implementation for this issue. Unlike other ongoing attempts, this fix focuses on the core functional derivation algorithm in Migration.scala, effectively solving the infinite recursion and homomorphism issues without adding complex external dependencies or manual caching.

Key improvements:

  • Refactored goProduct and goSum to support deep structural migrations.
  • Enhanced Migration.derive to correctly handle recursive schema references.
  • Optimized type conversion detection.

I'm attaching the patch file below. Looking forward to your review! /claim #519


2 participants