fix: Scheduler architecture is monolithic and tightly coupled (hard to test/extend) #1003

@T800-2048

Pre-submission checklist

  • I have searched existing issues and this hasn't been mentioned before
  • I have read the project documentation and confirmed this issue doesn't already exist
  • This issue is specific to MemOS and not a general software issue

Bug Description

The current scheduler design is structurally hard to maintain and test. Message ingestion, routing, memory operations, logging, and external integrations all live inside a few very large classes, which tightens coupling and makes any future change risky.

Why this matters

  • Difficult to add features or fix bugs without touching huge files and multiple responsibilities at once.
  • Hard to unit‑test in isolation due to direct coupling with queues, logs, and external services.
  • Increased risk of inconsistent behavior between scheduler flows and API flows.

Suggested direction

  • Extract dedicated handler classes (one per task label) with clear dependency injection.
  • Split search/retrieval into smaller units (search + enhancement + post‑processing).
  • Centralize search logic so scheduler and API share the same implementation.
  • Move logging and external integrations into thin adapters so they can be mocked in tests.
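As a rough illustration of the first and last bullets, the sketch below shows one handler class per task label, with its dependencies injected and logging hidden behind a thin adapter. All names here (`Message`, `MemoryStore`, `EventLogger`, `AddMessageHandler`, `HandlerRegistry`, the `"add"` label) are hypothetical and are not existing MemOS classes; this is only one possible shape for the refactor.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class Message:
    """Hypothetical scheduler message; real MemOS messages carry more fields."""
    label: str
    user_id: str
    content: str


class MemoryStore(Protocol):
    """Assumed interface for whatever performs the memory write."""
    def add(self, user_id: str, content: str) -> None: ...


class EventLogger(Protocol):
    """Thin logging adapter so tests can swap in a fake."""
    def log(self, event: str, payload: dict) -> None: ...


@dataclass
class AddMessageHandler:
    """Handles exactly one task label; dependencies arrive via the constructor."""
    store: MemoryStore
    logger: EventLogger
    label: str = "add"

    def handle(self, messages: list[Message]) -> None:
        for msg in messages:
            if not msg.content:  # validation stays local to the handler
                self.logger.log("skipped_empty", {"user_id": msg.user_id})
                continue
            self.store.add(msg.user_id, msg.content)
            self.logger.log("added", {"user_id": msg.user_id})


class HandlerRegistry:
    """Routes messages to the handler registered for their label."""

    def __init__(self) -> None:
        self._handlers: dict[str, AddMessageHandler] = {}

    def register(self, handler) -> None:
        self._handlers[handler.label] = handler

    def dispatch(self, messages: list[Message]) -> None:
        grouped: dict[str, list[Message]] = {}
        for msg in messages:
            grouped.setdefault(msg.label, []).append(msg)
        for label, batch in grouped.items():
            if label not in self._handlers:
                raise KeyError(f"no handler registered for label {label!r}")
            self._handlers[label].handle(batch)


# In-memory fakes: no queue, log backend, or external service is needed.
class FakeStore:
    def __init__(self) -> None:
        self.items: list[tuple[str, str]] = []

    def add(self, user_id: str, content: str) -> None:
        self.items.append((user_id, content))


class FakeLogger:
    def __init__(self) -> None:
        self.events: list[tuple[str, dict]] = []

    def log(self, event: str, payload: dict) -> None:
        self.events.append((event, payload))
```

With this shape, a unit test can build an `AddMessageHandler` from `FakeStore` and `FakeLogger`, dispatch a small batch through `HandlerRegistry`, and assert on `store.items` and `logger.events` without starting the scheduler.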

How to Reproduce

  • Oversized core classes
    • GeneralScheduler is ~1500 lines (src/memos/mem_scheduler/general_scheduler.py).
    • BaseScheduler is ~1300 lines (src/memos/mem_scheduler/base_scheduler.py).
    • These files contain business logic, queueing, logging, state tracking, and side‑effects in one place.
  • Handlers are not modular
    • GeneralScheduler implements the handlers directly (_query_message_consumer, _memory_update_consumer, _add_message_consumer, _mem_read_message_consumer, _mem_reorganize_message_consumer, _pref_add_message_consumer, _mem_feedback_message_consumer).
    • Each handler mixes validation, grouping, business logic, and logging, which makes it hard to test or reuse.
  • Retrieval pipeline is a “god object”
    • SchedulerRetriever (src/memos/mem_scheduler/memory_manage_modules/retriever.py) handles retrieval + enhancement + filtering + reranking + evaluation, collapsing several responsibilities into one class.
  • Search logic duplicated and inconsistent
    • OptimizedScheduler.search_memories has a comment “copied from server_router to avoid circular import” and directly calls mem_cube.text_mem.search.
    • This duplicates logic and makes it easy for API search to diverge from scheduler search behavior.
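One way to address the last two points together is a single search service that both the scheduler and the API call, built from small post-processing stages instead of one large retriever class. The sketch below assumes a hypothetical `TextSearcher` interface standing in for the raw search call; `Hit`, `SearchService`, `drop_low_scores`, and `rerank_by_score` are illustrative names, not existing MemOS code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Hit:
    """Hypothetical search result."""
    text: str
    score: float


class TextSearcher(Protocol):
    """Assumed backend interface; stands in for the raw memory search call."""
    def search(self, query: str, top_k: int) -> list[Hit]: ...


# A stage takes the query and the current hits and returns new hits.
Stage = Callable[[str, list[Hit]], list[Hit]]


def drop_low_scores(threshold: float) -> Stage:
    """Filtering stage: keep hits whose score is at least `threshold`."""
    def stage(query: str, hits: list[Hit]) -> list[Hit]:
        return [h for h in hits if h.score >= threshold]
    return stage


def rerank_by_score(query: str, hits: list[Hit]) -> list[Hit]:
    """Reranking stage: highest score first."""
    return sorted(hits, key=lambda h: h.score, reverse=True)


class SearchService:
    """Single entry point intended to be shared by scheduler and API code."""

    def __init__(self, searcher: TextSearcher, stages: list[Stage] | None = None) -> None:
        self._searcher = searcher
        self._stages = list(stages or [])

    def search(self, query: str, top_k: int = 5) -> list[Hit]:
        hits = self._searcher.search(query, top_k)
        for stage in self._stages:  # each stage is independently testable
            hits = stage(query, hits)
        return hits[:top_k]


class FakeSearcher:
    """Test double that returns canned hits regardless of the query."""

    def __init__(self, hits: list[Hit]) -> None:
        self._hits = hits

    def search(self, query: str, top_k: int) -> list[Hit]:
        return list(self._hits)
```

Because both callers would depend on the same `SearchService`, the scheduler would not need a copied search method, and the two paths could not drift apart. If the service lives in a module that neither the router nor the scheduler owns, both can import it, which may also sidestep the circular import mentioned in the comment.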

Environment

dev-2.0.4

Additional Context

No response

Willingness to Implement

  • I'm willing to implement this myself
  • I would like someone else to implement this

Labels: bug (Something isn't working), pending (Pending items to be addressed)