[Performance] Optimize CLI startup time via lazy loading and deferred initialization #25

@albatrosary

Description

Startup Performance Improvement Ideas

The CLI does not feel critically slow, but its startup has a slight delay.
At this stage, rewriting the whole project in Go would likely be excessive.
A better approach is to optimize the Python startup path first.

The main idea is simple:

Do not prepare everything at process startup.
Only prepare what is actually needed for the current command.

Goal

Reduce startup latency for multi-ai-cli, especially for lightweight commands and single-shot CLI usage.

Examples:

  • --help
  • --version
  • simple one-agent runs
  • shell-style usage with stdin/stdout
  • editor mode

1. Use lazy imports for heavy modules

Heavy SDK imports should not run unless the selected engine actually needs them.

Candidates:

  • google.generativeai
  • openai
  • anthropic
  • dotenv

Current problem

These modules may be imported at process startup even when they are not used.

Proposed change

Move imports into engine-specific methods or factory functions.

Example

class OpenAIEngine(BaseEngine):
    def _get_client(self):
        # Defer the SDK import to first use instead of module import time.
        if self.client is None:
            from openai import OpenAI
            self.client = OpenAI(api_key=self.api_key)
        return self.client

Expected benefit

  • Faster startup for commands that do not use that engine
  • Faster --help / --version
  • Less unnecessary initialization work

2. Lazily initialize SDK clients

Do not create API clients in __init__() if they are not immediately needed.

Current problem

Client creation may trigger unnecessary internal setup during startup.

Proposed change

Store configuration only in __init__(), and instantiate the client only at first use.

Example

class ClaudeEngine(BaseEngine):
    def __init__(self, api_key):
        # Store configuration only; no SDK work happens here.
        self.api_key = api_key
        self.client = None

    def _get_client(self):
        # Import and construct the client on first use only.
        if self.client is None:
            from anthropic import Anthropic
            self.client = Anthropic(api_key=self.api_key)
        return self.client

Expected benefit

  • Lower startup overhead
  • Engine creation becomes cheaper
  • Only the active engine pays the initialization cost

3. Avoid unconditional .env loading at startup

Loading .env on every process launch may add avoidable overhead.

Proposed change

  • Load .env only when needed
  • Or make it optional
  • Or load it once inside a dedicated helper

Example

_ENV_LOADED = False

def ensure_env_loaded():
    global _ENV_LOADED
    if _ENV_LOADED:
        return
    try:
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # dotenv is optional; shell env vars still work
    _ENV_LOADED = True  # never retry, even when dotenv is absent

Expected benefit

  • Faster startup in environments where shell env vars are already set
  • Less unnecessary file-system work

4. Keep logging initialization minimal

Heavy logging setup at startup can slow down short-lived CLI runs.

Current problem

File handlers, rotating log files, directory creation, and verbose formatters may run even for trivial commands.

Proposed change

  • Use minimal logging by default
  • Enable file-based logging only in debug / verbose mode
  • Delay expensive handler setup until necessary

Example

logger = logging.getLogger("multi_ai")
logger.addHandler(logging.NullHandler())

Expected benefit

  • Faster startup for normal usage
  • Cleaner behavior for shell pipelines
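As a sketch of the deferred-handler idea (the helper name and log format are hypothetical, not part of the current codebase), the file handler is attached only when the user opts into verbose logging:

```python
import logging

# Silent by default: short-lived runs pay no handler-setup cost.
logger = logging.getLogger("multi_ai")
logger.addHandler(logging.NullHandler())

def enable_file_logging(path="multi_ai.log"):
    # Hypothetical helper: attach the comparatively expensive file
    # handler only when --verbose / --debug is requested.
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
```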

5. Parse arguments before expensive initialization

Do not initialize configs, SDKs, or logging before knowing what command was actually requested.

Proposed change

Move expensive setup after parse_args().

Bad

load_config()
setup_logging()
init_all_engines()
args = parse_args()

Better

args = parse_args()

if args.version:
    print(VERSION)
    return

setup_logging_if_needed(args)
config = load_config_if_needed(args)
engine = create_engine_if_needed(args, config)

Expected benefit

  • Lightweight commands return faster
  • Avoids wasted initialization work

6. Instantiate only the selected engine

There is no need to create all engine objects on every launch.

Proposed change

Keep a registry of engine classes and instantiate only the requested one.

Example

ENGINE_MAP = {
    "gpt": OpenAIEngine,
    "claude": ClaudeEngine,
    "gemini": GeminiEngine,
}
engine_cls = ENGINE_MAP[engine_name]
engine = engine_cls(...)

Expected benefit

  • Lower startup cost
  • Easier scaling when more engines are added later

7. Delay config, history, and persona loading

Files should be read only when required for the current command.

Proposed change

  • Load history only for the active engine
  • Load persona only when specified
  • Skip config reads for commands that do not need engine execution

Expected benefit

  • Less file-system overhead
  • Better startup for lightweight commands
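One possible shape for this, assuming JSON files and using `functools.cached_property` so each file is read at most once and only on first access (the `Session` class and file layout are illustrative, not the project's actual structure):

```python
import json
from functools import cached_property
from pathlib import Path

class Session:
    # Hypothetical sketch: __init__ only records paths; commands that
    # never touch config or history skip the file reads entirely.
    def __init__(self, config_path, history_path):
        self.config_path = Path(config_path)
        self.history_path = Path(history_path)

    @cached_property
    def config(self):
        # Read once, on first access.
        return json.loads(self.config_path.read_text())

    @cached_property
    def history(self):
        # Missing history is normal for a fresh session.
        if not self.history_path.exists():
            return []
        return json.loads(self.history_path.read_text())
```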

8. Reduce startup-time file checks and scans

Repeated file existence checks and repeated reads can add unnecessary overhead.

Proposed change

  • Avoid repeated Path.exists() checks
  • Do not scan all related files up front
  • Cache paths or decisions where appropriate

Expected benefit

  • Lower I/O overhead at startup
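A minimal way to cache these decisions is an `lru_cache`-wrapped resolver, so each candidate path is stat-ed at most once per process (the function name is hypothetical):

```python
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=None)
def resolved_path(name):
    # Hypothetical sketch: resolve and check each candidate file once,
    # instead of repeating Path.exists() on every call site.
    p = Path(name).expanduser()
    return p if p.exists() else None
```

Note the trade-off: a cached `None` will not notice a file created later in the same process, which is usually acceptable for startup-time lookups.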

9. Avoid compiling unnecessary regex patterns at startup

If some regex patterns are only used in specific modes, they should not be compiled eagerly.

Proposed change

Compile rarely used regexes lazily.

Example

_CODE_FENCE_RE = None

def get_code_fence_re():
    global _CODE_FENCE_RE
    if _CODE_FENCE_RE is None:
        _CODE_FENCE_RE = re.compile(...)
    return _CODE_FENCE_RE

Expected benefit

  • Slightly lower startup overhead
  • Keeps hot path smaller

10. Reduce startup-time output and formatting work

Verbose banners and repeated startup messages may affect perceived responsiveness.

Proposed change

  • Keep normal startup quiet
  • Print detailed status only in debug / verbose mode
  • Avoid expensive preview formatting unless needed

Expected benefit

  • Better perceived performance
  • Cleaner stdout behavior for piping
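A small sketch of the quiet-by-default pattern: status chatter goes to stderr and only when verbose mode is on, so stdout stays reserved for pipeline output (the helper name is hypothetical):

```python
import sys

def emit_status(message, verbose=False):
    # Hypothetical helper: banners and progress notes are printed to
    # stderr, and only when the user opted into verbose mode.
    if verbose:
        print(message, file=sys.stderr)
```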

11. Isolate editor / shell / agent paths

Different modes should not pay for each other’s initialization costs.

Proposed change

Keep main() as a lightweight dispatcher and isolate mode-specific setup inside dedicated functions.

Example

def main():
    args = parse_args()

    if args.version:
        return show_version()

    if args.editor:
        return run_editor_mode(args)

    return run_agent_mode(args)

Expected benefit

  • Editor mode does not need to import AI SDKs
  • Shell-only workflows stay lightweight

12. Measure import cost explicitly

Before rewriting structure aggressively, measure which imports are actually expensive.

Commands

python -X importtime multi_ai_cli.py --version
python -X importtime -c "import multi_ai_cli"

Expected benefit

  • Optimization becomes evidence-based
  • Avoids premature or irrelevant refactoring

13. Add simple startup profiling

Import cost is only part of the story. Initialization code may also be slow.

Candidates to measure

  • config loading
  • history loading
  • persona loading
  • log setup
  • temp file setup
  • engine creation

Example

import time

t0 = time.perf_counter()
config = load_config()
t1 = time.perf_counter()
logger.debug("load_config: %.3f ms", (t1 - t0) * 1000)

Expected benefit

  • Makes bottlenecks visible
  • Helps prioritize the highest-impact fixes

14. Consider a lightweight startup mode

For shell-style usage, a dedicated fast path may be useful.

Possible options

  • --fast
  • --no-history
  • --no-log-file
  • --no-dotenv

Expected benefit

  • Better performance for pipeline-oriented or short-lived runs
  • More UNIX-like behavior when desired
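Wiring these up with argparse could look like the following (the flag names come from the list above but are still a proposal, not the shipped interface):

```python
import argparse

def build_parser():
    # Hypothetical flags from the proposal; the real CLI may differ.
    p = argparse.ArgumentParser(prog="multi-ai-cli")
    p.add_argument("--fast", action="store_true",
                   help="skip history, file logging, and .env loading")
    p.add_argument("--no-history", action="store_true")
    p.add_argument("--no-log-file", action="store_true")
    p.add_argument("--no-dotenv", action="store_true")
    return p

args = build_parser().parse_args(["--fast", "--no-history"])
```

`--fast` could simply imply the three narrower flags, keeping the common case a single switch.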

Recommended implementation order

Highest priority

  1. Lazy imports for heavy SDKs
  2. Lazy client initialization
  3. Parse args before expensive setup
  4. Minimal logging by default

Next

  1. Delay config / history / persona loading
  2. Instantiate only the selected engine
  3. Reduce startup-time messages

Later if needed

  1. Lazy regex compilation
  2. Mode isolation improvements
  3. Lightweight startup mode

Summary

The key optimization is not “rewrite everything for speed.”
It is this:

Stop preparing everything up front.
Prepare only what the current command actually needs.

That approach should improve startup latency while preserving the current Python codebase and keeping the design maintainable.
