[Feature] Personalization Integration for CUGA #108

@gaodan-fang

📋 What We Want and Why

Implement comprehensive personalization capabilities in CUGA to enable context-aware, user-specific task execution. This feature builds upon the existing personal information (PI) infrastructure and memory system to deliver tailored experiences based on user preferences, historical interactions, and contextual data.

Primary Goals

  1. User-Centric Execution: Adapt agent behavior based on individual user preferences and context
  2. Contextual Awareness: Leverage user profile data to make informed decisions
  3. Preference Learning: Automatically learn and apply user preferences over time
  4. Seamless Integration: Build on existing PI and memory infrastructure without breaking changes

Success Metrics

  • Personalization Accuracy: 90% of tasks correctly apply user preferences
  • User Satisfaction: 25% improvement in user satisfaction scores
  • Preference Recall: 95% accuracy in retrieving and applying stored preferences
  • Context Utilization: 80% of tasks leverage available user context

🏗️ How It Could Work

Architecture Overview

The personalization system consists of three key components:

1. Profile Manager

  • Manages user profile data (name, contact info, preferences)
  • Integrates with existing PI (Personal Information) field
  • Supports profile CRUD operations
  • Handles profile versioning and updates

2. Preference Engine

  • Stores and retrieves user preferences via Kaizen entities
  • Learns preferences from user interactions
  • Applies preferences to agent decisions
  • Supports preference hierarchies (global, app-specific, task-specific)

3. Context Resolver

  • Resolves contextual information for tasks
  • Integrates with memory system for historical context
  • Provides context-aware recommendations
  • Handles multi-tenant context isolation

Current Implementation Status

✅ Existing Infrastructure (kaizen-integration branch)

  • User Preferences Context Module (src/cuga/backend/cuga_graph/state/user_preferences_context.py)

    • Structured fact extraction with category/key/value organization
    • Query-based relevance scoring for context selection
    • Compact context formatting for agent prompts
    • Support for legacy and structured preference formats
  • Kaizen-Based Storage

    • Preferences stored as Kaizen entities in Milvus
    • Category-based organization
    • Metadata filtering support
    • In-process memory access (no separate service)
  • Personal Information (PI) Field

    • PI field exists in agent state
    • Injected into first user message in CugaLite
    • Used for user identification in memory system
    • Parsed from markdown instructions
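
The query-based relevance scoring and compact formatting listed above could work roughly as follows. This is a minimal sketch and not the code in user_preferences_context.py: the `Fact` dataclass, `score_fact`, `select_context`, and the token-overlap scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical structured preference fact (category/key/value)."""
    category: str
    key: str
    value: str


def _tokens(text: str) -> set[str]:
    # Lowercase the text and split it on anything that is not alphanumeric.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return set(cleaned.split())


def score_fact(fact: Fact, query: str) -> float:
    """Fraction of query tokens that also appear in the fact (assumed rule)."""
    query_tokens = _tokens(query)
    if not query_tokens:
        return 0.0
    fact_tokens = _tokens(f"{fact.category} {fact.key} {fact.value}")
    return len(query_tokens & fact_tokens) / len(query_tokens)


def select_context(facts: list[Fact], query: str, top_k: int = 3) -> str:
    """Return the top_k relevant facts as compact 'category.key=value' lines."""
    scored = [(score_fact(f, query), f) for f in facts]
    relevant = [(s, f) for s, f in scored if s > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return "\n".join(f"{f.category}.{f.key}={f.value}" for _, f in relevant[:top_k])
```

A production version would more likely score with Milvus embeddings than with token overlap; the sketch only shows the select-then-format shape.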

🚧 Gaps & Remaining Work

  1. User Profile Schema (P0 - Critical)

    • Define structured Pydantic models for user profiles
    • Implement validation for required/optional fields
    • Create migration path from free-form PI to structured profiles
    • Add backward compatibility support
  2. Preference Management System (P0 - Critical)

    • Complete CRUD operations for preferences
    • Implement preference hierarchy resolution (global → app → task)
    • Add preference expiration support
    • Create preference versioning system
    • Build bulk import/export functionality
  3. Agent Integration (P1 - High)

    • Task Analyzer: Use preferences for task interpretation
    • Task Decomposition: Leverage preferred workflows
    • API Planner: Apply default filters and preferred endpoints
    • Code Agent: Apply coding style preferences
    • Browser Agent: Use navigation preferences
  4. Preference Learning Engine (P1 - High)

    • Explicit learning from user commands
    • Implicit learning from repeated patterns (3+ occurrences)
    • Confidence scoring for learned preferences
    • User review/approval interface
    • Preference conflict resolution
  5. Privacy & Data Controls (P1 - High)

    • User consent management
    • Configurable data retention periods
    • Complete data export functionality
    • Full data deletion capability (right to be forgotten)
    • GDPR/CCPA compliance
  6. Personalization API (P2 - Medium)

    • RESTful API for profile/preference management
    • OpenAPI specification
    • Authentication and authorization
    • Rate limiting
    • Comprehensive API documentation
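
The implicit learning item in gap 4 (a pattern repeated 3+ times becomes a candidate preference with a confidence score) might be sketched like this. The function name, the observation format, and the confidence formula are assumptions for illustration; only the 3-occurrence threshold comes from this issue.

```python
from collections import Counter, defaultdict

MIN_OCCURRENCES = 3  # threshold named in the issue ("3+ occurrences")


def learn_implicit_preferences(observations):
    """Turn repeated (key, value) observations into candidate preferences.

    observations: iterable of (key, value) pairs with hashable values.
    Returns a dict mapping key -> {"value", "confidence", "source"}.
    Confidence is the share of a key's observations that carry the most
    common value (an assumed formula, not one defined by the issue).
    """
    counts_by_key = defaultdict(Counter)
    for key, value in observations:
        counts_by_key[key][value] += 1

    learned = {}
    for key, counts in counts_by_key.items():
        value, count = counts.most_common(1)[0]
        if count >= MIN_OCCURRENCES:
            learned[key] = {
                "value": value,
                "confidence": count / sum(counts.values()),
                "source": "implicit",
            }
    return learned
```

Candidates produced this way would still pass through the user review/approval step before being stored.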

Key Technical Specifications

Proposed Data Models:

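from datetime import datetime
from typing import Any, Dict, Literal, Optional

from pydantic import BaseModel, Field


class UserProfile(BaseModel):
    # Identity and contact details
    user_id: str
    email: str
    phone_number: Optional[str] = None
    first_name: str
    last_name: str
    display_name: Optional[str] = None
    # Localization defaults
    timezone: str = "UTC"
    locale: str = "en-US"
    # Free-form preferences; default_factory avoids a shared mutable default
    preferences: Dict[str, Any] = Field(default_factory=dict)
    # Lifecycle timestamps
    created_at: datetime
    updated_at: datetime
    last_active: Optional[datetime] = None
    # Privacy controls
    data_retention_days: int = 90
    analytics_enabled: bool = True


class UserPreference(BaseModel):
    preference_id: str
    user_id: str
    # Hierarchy level; app_name / task_type qualify the "app" and "task" scopes
    scope: Literal["global", "app", "task"]
    app_name: Optional[str] = None
    task_type: Optional[str] = None
    key: str
    value: Any
    priority: int = 0
    # Confidence in the preference (1.0 by default)
    confidence: float = 1.0
    # How the preference was obtained
    source: Literal["explicit", "implicit", "collaborative"]
    created_at: datetime
    updated_at: datetime
    # None means the preference never expires
    expires_at: Optional[datetime] = None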
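
To illustrate how the `scope`, `priority`, and `expires_at` fields could drive hierarchy resolution (global → app → task), here is a minimal sketch. `Pref` is a plain dataclass stand-in for a subset of the proposed `UserPreference` model, and `resolve`, `SCOPE_RANK`, and the matching rules are assumptions, not an agreed design. It also assumes timezone-aware datetimes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional

# More specific scopes win over broader ones (global → app → task).
SCOPE_RANK = {"global": 0, "app": 1, "task": 2}


@dataclass
class Pref:
    """Stand-in for the proposed UserPreference model (subset of fields)."""
    scope: str
    key: str
    value: Any
    app_name: Optional[str] = None
    task_type: Optional[str] = None
    priority: int = 0
    expires_at: Optional[datetime] = None  # assumed timezone-aware


def resolve(prefs, key, app_name=None, task_type=None, now=None):
    """Return the value of the most specific unexpired preference, or None."""
    now = now or datetime.now(timezone.utc)

    def applies(p):
        if p.key != key:
            return False
        if p.expires_at is not None and p.expires_at <= now:
            return False  # expired preferences are ignored
        if p.scope == "app":
            return p.app_name == app_name
        if p.scope == "task":
            return p.task_type == task_type
        return p.scope == "global"

    candidates = [p for p in prefs if applies(p)]
    if not candidates:
        return None
    # Scope specificity decides first; priority breaks ties within a scope.
    best = max(candidates, key=lambda p: (SCOPE_RANK[p.scope], p.priority))
    return best.value
```

Ranking by scope before `priority` is one possible reading of the hierarchy; letting `priority` override scope would be an equally valid choice that the issue leaves open.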

Storage Backend:

  • Milvus: Vector embeddings for semantic search
  • SQLite: Profile and preference metadata
  • Kaizen: Unified interface for memory operations
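
As a rough illustration of the SQLite side of this split, the sketch below stores preference metadata with the standard-library `sqlite3` module. The table name, columns, and helper functions are hypothetical, and the key is simplified to (user_id, scope, key), leaving out `app_name` and `task_type`.

```python
import json
import sqlite3

# Hypothetical schema; values are stored as JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS preferences (
    user_id TEXT NOT NULL,
    scope   TEXT NOT NULL CHECK (scope IN ('global', 'app', 'task')),
    key     TEXT NOT NULL,
    value   TEXT NOT NULL,
    PRIMARY KEY (user_id, scope, key)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def upsert_preference(conn, user_id, scope, key, value):
    """Create or overwrite one preference."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO preferences (user_id, scope, key, value) "
            "VALUES (?, ?, ?, ?)",
            (user_id, scope, key, json.dumps(value)),
        )


def get_preference(conn, user_id, scope, key):
    """Return the decoded value, or None if the preference does not exist."""
    row = conn.execute(
        "SELECT value FROM preferences WHERE user_id = ? AND scope = ? AND key = ?",
        (user_id, scope, key),
    ).fetchone()
    return json.loads(row[0]) if row else None


def delete_preference(conn, user_id, scope, key):
    """Delete one preference and return the number of rows removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM preferences WHERE user_id = ? AND scope = ? AND key = ?",
            (user_id, scope, key),
        )
    return cur.rowcount
```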

Performance Targets:

  • Profile retrieval: <50ms (p95)
  • Preference lookup: <30ms (p95)
  • Preference update: <100ms (p95)
  • Context resolution: <100ms (p95)
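
One way to check these targets in a test suite is to time an operation repeatedly and compare its 95th percentile to the limit. The sketch below uses a nearest-rank p95; the operation names and helper functions are assumptions, and only the millisecond limits come from the list above.

```python
import time

# p95 targets from the issue, in milliseconds (operation names are assumed).
TARGETS_MS = {
    "profile_retrieval": 50,
    "preference_lookup": 30,
    "preference_update": 100,
    "context_resolution": 100,
}


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil of 0.95 * n, 1-based
    return ordered[rank - 1]


def measure_ms(fn, runs=100):
    """Call fn `runs` times and return per-call latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def meets_target(operation, samples_ms):
    """True if the measured p95 is strictly below the operation's target."""
    return p95(samples_ms) < TARGETS_MS[operation]
```

For example, `meets_target("preference_lookup", measure_ms(lookup_fn))` would gate a benchmark test, where `lookup_fn` is whatever callable wraps the real lookup.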

🔗 Links and Context

Related Documentation

  • Feature Document: docs/features/FEATURE-001-Personalization-Integration.md
  • Related Epic: EPIC-001: Memory Integration
  • Related Feature: docs/features/FEATURE-002-Learning-From-Experience.md
  • Memory README: docs/memory/README.md

Key Branches

  • kaizen-integration: Kaizen memory backend with user preferences context module
  • PR #85 (Kaizen integration): Kaizen Integration implementation

Implementation Files

  • User preferences context: src/cuga/backend/cuga_graph/state/user_preferences_context.py
  • Memory client: src/cuga/backend/memory/memory.py
  • Agent state: src/cuga/backend/cuga_graph/state/agent_state.py
  • Markdown parser: src/cuga/configurations/set_from_one_file.py
  • AppWorld auth: src/cuga/backend/tools_env/registry/registry/authentication/appworld_auth_manager.py

Dependencies

  • Kaizen Library: External memory/entity management system
    • Install: uv sync --extra memory
    • Configuration: src/cuga/configurations/memory/kaizen.settings.toml

📊 Implementation Roadmap

Phase 1: Foundation (Q1 2026) - ~15% Complete

Completed:

  • ✅ User preferences context module
  • ✅ Structured fact extraction with categories
  • ✅ Query-based relevance scoring
  • ✅ Kaizen-based entity storage

Remaining:

  • Complete user profile schema
  • Dedicated profile storage API
  • Full preference CRUD operations
  • Migration utilities
  • Comprehensive unit tests

Phase 2: Agent Integration (Q2 2026)

  • Task Analyzer personalization
  • Task Decomposition personalization
  • API Planner personalization
  • Code Agent personalization
  • Browser Agent personalization
  • Integration tests

Phase 3: Preference Learning (Q2-Q3 2026)

  • Explicit preference capture
  • Implicit preference learning
  • Confidence scoring system
  • User review interface
  • Learning pipeline tests

Phase 4: Privacy & API (Q3 2026)

  • Privacy controls implementation
  • Data export/import functionality
  • REST API implementation
  • API documentation
  • Compliance validation

Phase 5: Production Hardening (Q4 2026)

  • Performance optimization
  • Scalability testing
  • Security audit
  • User documentation
  • Production deployment

🎯 Acceptance Criteria

This feature will be complete when:

  1. ✅ User profile schema implemented and tested
  2. ✅ Preference management system operational
  3. ✅ All 5 core agents support personalization
  4. ✅ Preference learning engine functional
  5. ✅ Privacy controls implemented and compliant
  6. ✅ REST API documented and deployed
  7. ✅ Performance targets met (p95: profile retrieval <50ms, preference lookup <30ms, preference update and context resolution <100ms)
  8. ✅ Test coverage >85%
  9. ✅ Documentation complete
  10. ✅ Production deployment successful

Current Progress: ~15% (Foundation partially implemented)

📝 Next Steps

  1. Complete user profile schema with Pydantic models
  2. Implement full preference CRUD operations
  3. Integrate personalization into Task Analyzer agent
  4. Build preference learning pipeline
  5. Add privacy controls and compliance features
  6. Create REST API for external access
  7. Optimize performance and add caching
  8. Write comprehensive documentation

Reference: This issue tracks the implementation of FEATURE-001 as documented in docs/features/FEATURE-001-Personalization-Integration.md

Metadata

Labels

  • component: agent (Core agent loop, DynamicAgentGraph, LLM node, tool execution, CugaLite)
  • component: memory (Memory management, conversation history, and state persistence)
  • enhancement (New feature or request)
  • priority: high (Important, address soon)
