feat: skill quality overhaul — 95% faster E2EE, 28 evals, enhanced harness#17

Merged
CybotTM merged 5 commits into main from improve/skill-quality
Mar 26, 2026

Conversation

@CybotTM
Member

@CybotTM CybotTM commented Mar 26, 2026

Summary

  • Performance: E2EE send/read reduced from ~50s to ~3s by resolving room names post-sync from client.rooms instead of 193 HTTP calls via find_room_by_name()
  • SKILL.md: Rewritten for CSO compliance (818→479 words), expanded error handling and common mistakes, explicit Quick Reference examples
  • Evals: 3→28 scenarios covering all operations, error recovery, and performance
  • Scripts: PEP 723 metadata and line buffering added to all 15 scripts, E402 lint compliance
  • AGENTS.md: Full rewrite with complete command reference, 11 rules, updated performance notes
  • Harness: 10→18 checks with CI-compatible PR template detection
  • plugin.json: Keywords expanded for discoverability
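The Scripts item above can be illustrated with a minimal sketch of a script header. The PEP 723 block format and `reconfigure(line_buffering=True)` are standard; the empty dependency list and the `enable_line_buffering` helper are assumptions for illustration, not the repository's actual code.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []  # placeholder: a real script would list its packages here
# ///
import io
import sys


def enable_line_buffering(*streams):
    """Switch text streams to line buffering where supported (hypothetical helper)."""
    for stream in streams:
        # reconfigure() exists on io.TextIOWrapper (Python 3.7+); skip
        # replaced streams that do not offer it.
        if hasattr(stream, "reconfigure"):
            stream.reconfigure(line_buffering=True)


# Called after all imports so linters do not flag E402
# (module-level import not at top of file).
enable_line_buffering(sys.stdout, sys.stderr)

# Demonstration on a throwaway stream: buffering is off by default, on afterwards.
demo = io.TextIOWrapper(io.BytesIO())
enable_line_buffering(demo)
print(demo.line_buffering)
```

With line buffering enabled, each completed line is flushed immediately, which is what makes output appear in real time when the script is piped or run non-interactively.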

Performance

| Operation | Before | After |
|---|---|---|
| Send (E2EE) | ~50s | ~3s |
| Read (E2EE) | ~48s | ~2s |

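Root cause: find_room_by_name() made 2 HTTP requests for each of the 96 joined rooms before every operation (2 × 96 = 192; the reported total of 193 implies one additional request). Fix: sync first (timeout=0), then resolve room names from client.rooms, which is already in memory.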
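A rough sketch of the post-sync lookup idea follows. It is not the PR's `find_room_in_nio_client()`; `RoomStub` and `find_room_in_memory` are hypothetical names, and the stub stands in for whatever room objects the synced client holds. In the real scripts the lookup would presumably run against `client.rooms` after `await client.sync(timeout=0, full_state=True)`.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RoomStub:
    """Hypothetical stand-in for a synced room object (not the nio class)."""

    room_id: str
    display_name: str
    canonical_alias: Optional[str] = None


def find_room_in_memory(rooms: Mapping[str, RoomStub], query: str) -> Optional[str]:
    """Resolve a room ID, alias, or display name from already-synced state.

    No HTTP requests are made; everything is read from the mapping.
    """
    # A raw room ID can be checked directly against the mapping keys.
    if query in rooms:
        return query
    needle = query.lstrip("#").casefold()
    if not needle:
        return None
    for room_id, room in rooms.items():
        alias = (room.canonical_alias or "").lstrip("#").casefold()
        # Match the full alias or only its localpart (the part before ':').
        if alias == needle or alias.split(":", 1)[0] == needle:
            return room_id
        if room.display_name.casefold() == needle:
            return room_id
    return None


rooms = {
    "!abc:example.org": RoomStub("!abc:example.org", "Test", "#test:example.org"),
    "!def:example.org": RoomStub("!def:example.org", "Ops"),
}
print(find_room_in_memory(rooms, "#test"))
```

The cost of a lookup like this is a single pass over a dictionary, so it stays negligible next to the sync itself no matter how many rooms are joined.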

Test plan

  • Send/read/react verified against #test room (5-cycle avg: 2.8s send, 2.7s read)
  • Thread reply, emote, --json, --search, --no-prefix flags verified
  • Harness Level 3 (18/18 checks, 0 errors, 0 warnings)
  • 28 evals valid JSON, all script refs resolve
  • uvx ruff check . and uvx ruff format --check . pass clean
  • All 15 scripts pass syntax check
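The last item in the test plan can be approximated with the standard library alone. The sketch below is one possible way to run such a syntax check, not necessarily what verify-harness.sh does; `find_syntax_errors` and the demo files are made up for illustration.

```python
import ast
import tempfile
from pathlib import Path


def find_syntax_errors(script_dir: Path) -> dict[str, str]:
    """Parse every *.py file in script_dir and report the ones that fail."""
    errors = {}
    for path in sorted(script_dir.glob("*.py")):
        try:
            # ast.parse only compiles to a syntax tree; nothing is executed.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors[path.name] = f"line {exc.lineno}: {exc.msg}"
    return errors


# Demonstration with one valid and one broken file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "good.py").write_text("print('ok')\n", encoding="utf-8")
    (demo_dir / "bad.py").write_text("def broken(:\n", encoding="utf-8")
    result = find_syntax_errors(demo_dir)

print(sorted(result))
```

Parsing rather than importing keeps the check safe to run in CI, since no script code is executed and no dependencies need to be installed.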

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers substantial improvements to the Matrix communication skill, focusing on both user experience and developer maintainability. The core change dramatically boosts E2EE message performance by optimizing room resolution, making interactions significantly faster. Concurrently, the documentation has been overhauled to be more comprehensive and compliant, while the testing harness and evaluation suite have been expanded to ensure higher quality and reliability across all operations. These changes collectively enhance the skill's efficiency, robustness, and ease of use.

Highlights

  • Performance Optimization: Significantly reduced E2EE send/read times from ~50 seconds to ~3 seconds by resolving Matrix room names post-sync from the client's in-memory room list, eliminating 193 HTTP calls per operation.
  • Enhanced Skill Documentation: The SKILL.md and AGENTS.md files have been comprehensively rewritten to improve clarity, compliance with CSO standards, and provide detailed command references and development rules. SKILL.md also includes expanded error handling and common mistakes.
  • Expanded Evaluation Scenarios: The number of evaluation scenarios (evals/evals.json) has been increased from 3 to 28, covering a broader range of operations, error recovery, and performance aspects to ensure robust skill functionality.
  • Improved Harness Checks: The verify-harness.sh script has been enhanced from 10 to 18 checks, adding new categories for skill frontmatter, plugin consistency, version synchronization, reference documentation, evaluation coverage, and script health.
  • Script Quality Improvements: All 15 Python scripts now include PEP 723 metadata and line buffering for better non-interactive execution, alongside updates to room resolution logic in E2EE scripts for performance.
  • Plugin Discoverability: Expanded keywords in plugin.json to improve the discoverability of the Matrix communication skill.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/harness-verify.yml

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request significantly enhances the Matrix skill by updating its documentation, improving the harness verification script, and refining the Python scripts for Matrix operations. Key changes include adding new keywords to plugin.json, extensively updating AGENTS.md and SKILL.md with more detailed commands, rules, error handling, and common mistakes, and greatly expanding the evaluation coverage in evals.json. The verify-harness.sh script is now more robust, checking for skill-specific consistency, versions, and script health. Performance notes in e2ee-guide.md and script usage notes were updated to reflect faster E2EE sync times. Crucially, the Python scripts for E2EE operations (matrix-edit-e2ee.py, matrix-read-e2ee.py, matrix-send-e2ee.py) were refactored to use a new find_room_in_nio_client function, resolving room names more efficiently by leveraging already synced client data, thereby reducing HTTP calls. Additionally, line buffering was enabled for real-time output in most Python scripts. There are two feedback items: one regarding an outdated E2EE performance note in AGENTS.md that needs to be consistent with other documentation, and another suggesting a simplification and efficiency improvement for the room finding logic in skills/matrix-communication/scripts/_lib/rooms.py.

CybotTM added 5 commits March 26, 2026 16:10
- Rewrite description to pure triggering conditions (no capability list)
- Reduce word count from 818 to 479 (under 500 CI limit)
- Add all send/read/edit flags as explicit Quick Reference examples
- Add Common Mistakes section, expand Error Handling table (10 entries)
- Add set +H && to first send example for bash ! safety
- Restore source link in References
Root cause: find_room_by_name() made 193 HTTP requests (2 per room x 96
joined rooms) before every operation. The sync itself only took ~0.8s.

Fix: E2EE scripts now sync first (timeout=0, full_state=True), then
resolve room names from client.rooms (in-memory, zero HTTP calls).

- Add find_room_in_nio_client() to _lib/rooms.py for post-sync lookup
- Refactor send/read/edit to resolve rooms after sync
- Set sync timeout=0 (no long-poll) for immediate return
- Update e2ee-guide.md performance notes (~2-3s, not ~5-10s)
- Add # /// script blocks with dependencies to 8 non-E2EE scripts
- Add sys.stdout/stderr.reconfigure(line_buffering=True) after imports
- Place reconfigure calls after all imports to avoid E402 lint errors
- Ensures real-time output in non-interactive/piped contexts
AGENTS.md:
- Full-path commands for all 15 scripts (no shorthand)
- 11 development rules (E2EE first, bash !, key backup, etc.)
- Updated performance notes (~2-5s, not ~5-10s)

Evals: expanded from 3 to 28 covering all operations, error recovery,
room search, key requests, device verification, and performance.

plugin.json: added e2ee, encryption, messaging, notification, nio.
verify-harness.sh:
- L1: Add SKILL.md existence check
- L2: Frontmatter validation, plugin.json consistency, version sync,
  reference docs presence
- L3: Eval coverage (min 5), eval script references, script syntax
- Fix PR template check to handle missing API access in CI

harness-verify.yml: delegate to script instead of inline checks.
@CybotTM CybotTM force-pushed the improve/skill-quality branch from d7abacf to 6128148 Compare March 26, 2026 15:12
@CybotTM CybotTM changed the title feat: comprehensive skill quality improvement and 95% performance optimization feat: skill quality overhaul — 95% faster E2EE, 28 evals, enhanced harness Mar 26, 2026
@CybotTM CybotTM merged commit b78dd40 into main Mar 26, 2026
6 checks passed
@CybotTM CybotTM deleted the improve/skill-quality branch March 26, 2026 15:15