Sync browser agent features from ConnectOnion CLI #4

@wu-changxing

Sync Browser Agent Features from ConnectOnion CLI Version

Summary

The standalone browser-agent and ConnectOnion's CLI browser agent (@connectonion/cli/browser-agent) have diverged significantly. This issue tracks features from the CLI version that should be adopted to improve UX and reliability.

Current State

  • Standalone: Multi-agent architecture, deep research capabilities, advanced features
  • CLI Version: Better UX, defensive error handling, platform optimizations
  • Last Sync: Never systematically synced

Features to Adopt from CLI Version

1. Home Directory Profile Management

Priority: High

Current: Profile at .co/chrome_profile (project-specific)
Proposed: Profile at ~/.co/browser_profile (persistent across projects)

Benefits:

  • Sessions persist across different projects
  • User logs in once; cookies persist until they expire or the profile is deleted
  • More intuitive behavior for users

Implementation:

# At the top of tools/web_automation.py (if not already imported)
from pathlib import Path

# In WebAutomation.__init__()
if profile_path:
    self.chrome_profile_path = str(profile_path)
else:
    self.chrome_profile_path = str(Path.home() / ".co" / "browser_profile")

Files to modify:

  • tools/web_automation.py (lines 40-43)
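
The difference between the two locations can be sketched as follows. This is only an illustration: `project_profile` and `home_profile` are hypothetical helper names, not functions from either codebase.

```python
from pathlib import Path
from typing import Optional, Union


def project_profile(cwd: Path) -> Path:
    """Current behavior: the profile lives under the project directory."""
    return cwd / ".co" / "chrome_profile"


def home_profile(profile_path: Optional[Union[str, Path]] = None) -> str:
    """Proposed behavior: an explicit path wins, else a shared home-directory profile."""
    if profile_path:
        return str(profile_path)
    return str(Path.home() / ".co" / "browser_profile")
```

With the project-relative layout, two projects get two different profiles, so a login in one is invisible to the other. The home-directory default resolves to the same path regardless of the working directory.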

2. macOS Chrome Binary Detection

Priority: Medium

Problem: Playwright's bundled Chromium is unsigned and crashes on macOS in non-headless mode

Solution:

import os
import sys

def open_browser(self, headless: bool = None) -> str:
    launch_kwargs = dict(
        headless=headless,
        args=['--disable-blink-features=AutomationControlled'],
        ignore_default_args=['--enable-automation'],
        timeout=120000,  # milliseconds
    )

    # macOS fix: use the system Chrome for non-headless mode, if it is installed
    if not headless and sys.platform == 'darwin':
        chrome_path = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
        if os.path.exists(chrome_path):
            launch_kwargs['executable_path'] = chrome_path

    self.browser = self.playwright.chromium.launch_persistent_context(
        str(profile_dir),
        **launch_kwargs,
    )

Files to modify:

  • tools/web_automation.py (open_browser method)
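
To make the platform check testable without a Mac, the decision could be pulled into a small pure function. This is a sketch, not code from the CLI version: `system_chrome_path` and its `platform` and `exists` parameters are hypothetical and exist only so tests can inject values.

```python
import os
import sys
from typing import Callable, Optional

MAC_CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


def system_chrome_path(
    headless: bool,
    platform: str = sys.platform,
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the system Chrome path to use, or None to keep Playwright's default."""
    # Only override the executable for headed runs on macOS.
    if headless or platform != "darwin":
        return None
    # Fall back to the bundled Chromium when Chrome is not installed.
    return MAC_CHROME if exists(MAC_CHROME) else None
```

`open_browser` could then set `launch_kwargs['executable_path']` only when this returns a non-`None` value, which keeps the behavior on Linux and Windows unchanged.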

3. Smart URL Handling

Priority: Low

Enhancement: Auto-add a scheme to partial URLs (https:// when the input contains a dot, http:// otherwise, e.g. for localhost)

Current:

def go_to(self, url: str) -> str:
    self.page.goto(url, wait_until="load")

Proposed:

def go_to(self, url: str) -> str:
    if not url.startswith(('http://', 'https://')):
        url = f'https://{url}' if '.' in url else f'http://{url}'

    self.page.goto(url, wait_until='domcontentloaded', timeout=30000)
    self.page.wait_for_timeout(2000)  # Wait for dynamic content
    self.current_url = self.page.url
    return f"Navigated to {self.current_url}"

Benefits:

  • Users can type example.com instead of https://example.com
  • More forgiving UX
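
The scheme rule from the proposed `go_to` can be isolated and checked on its own. A minimal sketch, assuming the same rule as above; `normalize_url` is a hypothetical helper name.

```python
def normalize_url(url: str) -> str:
    """Add a scheme to a partial URL using the rule from the proposed go_to()."""
    if url.startswith(('http://', 'https://')):
        return url  # already has a scheme, leave untouched
    # Inputs with a dot are treated as public hosts; dotless hosts get plain http.
    return f'https://{url}' if '.' in url else f'http://{url}'


print(normalize_url("example.com"))     # https://example.com
print(normalize_url("localhost:3000"))  # http://localhost:3000
```

One edge case to decide on: a local IP address such as `127.0.0.1:8000` contains dots, so this rule gives it `https://`, which a local dev server may not serve.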

Files to modify:

  • tools/web_automation.py (go_to method, lines 80-84)

4. Defensive Null Checking

Priority: Medium

Issue: Methods don't check whether the browser is open, so calling one before the browser is opened fails with a cryptic error instead of a clear message

Pattern to adopt:

def method_name(self, ...) -> str:
    if not self.page:
        return "Browser not open"

    # ... rest of implementation

Apply to all methods in:

  • tools/web_automation.py: get_text, select_option, check_checkbox, wait_for_element, etc.
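
Because the same guard is repeated in every method, one option is a decorator. This is an alternative sketch rather than what the CLI version does: `requires_page`, `FakePage`, and `WebAutomationSketch` are hypothetical names used to keep the example runnable without Playwright.

```python
import functools


def requires_page(method):
    """Return "Browser not open" instead of calling the method when self.page is unset."""
    @functools.wraps(method)  # keep the original name and docstring
    def wrapper(self, *args, **kwargs):
        if not self.page:
            return "Browser not open"
        return method(self, *args, **kwargs)
    return wrapper


class FakePage:
    """Stand-in for a Playwright page, for illustration only."""
    def inner_text(self, selector: str) -> str:
        return f"text of {selector}"


class WebAutomationSketch:
    def __init__(self, page=None):
        self.page = page

    @requires_page
    def get_text(self, selector: str) -> str:
        """Get the text of the first element matching the selector."""
        return self.page.inner_text(selector)
```

The inline `if not self.page` check is more explicit and allows a different return value per method (for example `[]` in a method that returns a list), so the decorator only fits methods that return a string.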

5. Additional Helper Methods

Priority: Low

Add convenience methods from CLI version:

from typing import List  # needed by get_urls, if not already imported

def get_current_url(self) -> str:
    """Get the current page URL."""
    if not self.page:
        return "Browser not open"
    return self.page.url

def get_current_page_html(self) -> str:
    """Get the HTML content of the current page."""
    if not self.page:
        return "Browser not open"
    return self.page.content()

def get_urls(self, domain_filter: str = "") -> List[str]:
    """Extract all unique URLs from the current page.

    Args:
        domain_filter: Only return URLs containing this string
    """
    if not self.page:
        return []

    urls = self.page.evaluate("""
        (filter) => {
            const seen = new Set();
            const result = [];
            for (const a of document.querySelectorAll('a[href]')) {
                const href = a.href;
                if (href && !seen.has(href) && (!filter || href.includes(filter))) {
                    seen.add(href);
                    result.push(href);
                }
            }
            return result;
        }
    """, domain_filter)
    return urls or []

def set_viewport(self, width: int, height: int) -> str:
    """Set the browser viewport size."""
    if not self.page:
        return "Browser not open"
    self.page.set_viewport_size({"width": width, "height": height})
    return f"Viewport set to {width}x{height}"

def wait(self, seconds: float) -> str:
    """Wait for a specified number of seconds."""
    if not self.page:
        return "Browser not open"
    self.page.wait_for_timeout(seconds * 1000)
    return f"Waited for {seconds} seconds"

Files to modify:

  • tools/web_automation.py (add after existing methods)

6. Enhanced Screenshot Method

Priority: Low

Current signature:

def take_screenshot(self, filename: str = None) -> str:

Proposed signature:

def take_screenshot(self, url: str = None, path: str = "",
                   width: int = 1920, height: int = 1080,
                   full_page: bool = False) -> str:

New features:

  • Navigate to URL before screenshot
  • Control viewport size
  • Full-page capture option
  • Auto-generate timestamped filenames

Files to modify:

  • tools/web_automation.py (take_screenshot method)
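
The issue gives only the proposed signature, so the following is a guess at how the listed features could fit together, not the CLI implementation. The `screenshot_path` helper, the `screenshots` directory, and the filename pattern are assumptions. The function takes `page` as an argument so that it runs standalone.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def screenshot_path(path: str = "", directory: str = "screenshots",
                    now: Optional[datetime] = None) -> Path:
    """Return `path` if given, else a timestamped filename under `directory`."""
    if path:
        return Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"screenshot_{stamp}.png"


def take_screenshot(page, url: Optional[str] = None, path: str = "",
                    width: int = 1920, height: int = 1080,
                    full_page: bool = False) -> str:
    """Optionally navigate, size the viewport, then capture the page."""
    if not page:
        return "Browser not open"
    page.set_viewport_size({"width": width, "height": height})
    if url:
        page.goto(url, wait_until="domcontentloaded")
    target = screenshot_path(path)
    target.parent.mkdir(parents=True, exist_ok=True)  # ensure the folder exists
    page.screenshot(path=str(target), full_page=full_page)
    return f"Screenshot saved: {target}"
```

Note that the first parameter changes from `filename` to `url`, so existing callers that pass a filename positionally or as `filename=` would need updating.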

7. Better Manual Login UX

Priority: Medium

Current:

def wait_for_manual_login(self, site_name: str = "the website") -> str:
    print(f"\n{'='*60}\n⏸️  MANUAL LOGIN REQUIRED\n{'='*60}")
    print(f"Please login to {site_name} in the browser window.")
    input("Press Enter to continue...")
    return f"User confirmed login to {site_name}"

Proposed:

def wait_for_manual_login(self, site_name: str = "the website") -> str:
    if not self.page:
        return "Browser not open"

    print(f"\n{'='*60}")
    print("  MANUAL LOGIN REQUIRED")
    print('=' * 60)
    print(f"Please log in to {site_name} in the browser window.")
    print("Once you're logged in and ready to continue:")
    print("  Type 'yes' or 'Y' and press Enter")
    print(f"{'='*60}\n")

    while True:
        response = input("Ready to continue? (yes/Y): ").strip().lower()
        if response in ['yes', 'y']:
            print("Continuing automation...\n")
            return f"User confirmed login to {site_name} - continuing"
        else:
            print("Please type 'yes' or 'Y' when ready.")

Files to modify:

  • tools/web_automation.py (wait_for_manual_login method, lines 218-223)
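
To cover the "invalid inputs" item in the testing checklist without a real terminal, the confirmation loop could accept its input and output functions as parameters. A sketch under that assumption; `wait_for_confirmation`, `read`, and `echo` are hypothetical names.

```python
from typing import Callable


def wait_for_confirmation(
    prompt: str = "Ready to continue? (yes/Y): ",
    read: Callable[[str], str] = input,
    echo: Callable[[str], None] = print,
) -> int:
    """Loop until the user answers 'yes' or 'y'; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        response = read(prompt).strip().lower()
        if response in ("yes", "y"):
            return attempts
        echo("Please type 'yes' or 'Y' when ready.")
```

In a test, `read` can be replaced by a function that returns scripted answers, such as an empty string, `no`, and then ` Y `, to confirm that only the last one ends the loop.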

Implementation Plan

Phase 1: Critical UX Improvements

Phase 2: Platform Support

Phase 3: Nice-to-Have

Testing Checklist

After implementing:

  • Test on macOS in non-headless mode
  • Test profile persistence across different projects
  • Test with browser not opened (defensive checks)
  • Test manual login flow with invalid inputs
  • Verify all helper methods work as expected

Related Issues

  • This issue was created as part of the browser agent sync effort
  • Related: ConnectOnion CLI browser agent issue (link TBD)

Files to Modify

  • tools/web_automation.py - Main implementation file
  • README.md - Update documentation for new features
  • CLAUDE.md - Update development guidance

Migration Notes

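Most of these changes are additive, so existing code should continue to work. Two are not strictly backward compatible as proposed. First, the default profile location moves from .co/chrome_profile to ~/.co/browser_profile, so sessions saved in a project-level profile are not picked up unless profile_path is passed. Second, the first parameter of take_screenshot changes from filename to url, which affects callers that pass a filename.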

Labels: enhancement
