Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
__pycache__/
/dist/
/result
install-local.sh
108 changes: 108 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,14 @@ The main control script for the Caelestia dotfiles.
- [`cliphist`](https://github.com/sentriz/cliphist) - clipboard history
- [`fuzzel`](https://codeberg.org/dnkl/fuzzel) - clipboard history/emoji picker

### Optional dependencies for OCR click-to-copy (`clicktodo` command)

- [`grim`](https://gitlab.freedesktop.org/emersion/grim) - taking screenshots (already listed above)
- [`wl-clipboard`](https://github.com/bugaevc/wl-clipboard) - copying to clipboard (already listed above)
- Python packages: `rapidocr-onnxruntime`, `onnxruntime`, `PyQt6`, `numpy`, `threadpoolctl` (install via `pip install caelestia[ocr]`)

**Performance Note:** The OCR feature uses RapidOCR with ONNXRuntime for optimal CPU performance (5-15x faster than EasyOCR). For best results on high-resolution displays, run the setup script to configure the persistent daemon:

</details>

## Installation
Expand Down Expand Up @@ -120,8 +128,108 @@ subcommands:
emoji emoji/glyph utilities
wallpaper manage the wallpaper
resizer window resizer daemon
clicktodo OCR-based click-to-copy from screen
```

### OCR Click-to-Copy (`clicktodo`)

The `clicktodo` command provides an OCR-based workflow for extracting and copying text from anywhere on your screen:

1. Captures a fullscreen screenshot
2. Runs OCR to detect all text on screen (via persistent daemon for speed)
3. Shows an interactive overlay with detected text regions highlighted
4. Click any text region to copy it to clipboard
5. Press `ESC` or right-click to cancel

**Performance:** Uses RapidOCR + ONNXRuntime for 5-15x faster processing than traditional OCR engines. Typical latency: 300-600ms on a 2880x1800 display.

**Setup:**

1. Install OCR dependencies:
```sh
pip install caelestia[ocr]
# Or manually: pip install rapidocr-onnxruntime onnxruntime PyQt6 numpy
```

2. Run the setup script to configure the OCR daemon:
```sh
./setup-ocr.sh
```

This will:
- Install dependencies if missing
- Set up a systemd user service for the OCR daemon
- Create default configuration at `~/.config/caelestia/ocr.json`
- Start the daemon (models stay hot in memory for instant responses)

**Requirements:**
- Requires `grim` and `wl-clipboard` (already needed for other features)
- Python 3.13+ with pip

**Hyprland keybinding example:**

Add to your `hyprland.conf`:
```
# Standard mode
bind = SUPER, O, exec, caelestia clicktodo

# Fast mode (more aggressive optimizations)
bind = SUPER SHIFT, O, exec, caelestia clicktodo --fast
```

**Usage:**
```sh
# Standard mode
caelestia clicktodo

# Fast mode (downscales more aggressively, limits max boxes)
caelestia clicktodo --fast --live
```

**Configuration:**

Edit `~/.config/caelestia/ocr.json` to customize:
```json
{
"provider": "cpu-ort", // cpu-ort, gpu-rocm, npu-xdna (future)
"downscale": 0.6, // Detection downscale factor (0.5-1.0)
"tiles": 1, // Parallel tiles (future feature)
"max_boxes": 300, // Maximum text boxes to detect
"use_gpu": false, // Enable GPU (experimental on AMD)
"warm_start": true, // Run warm-up on daemon start
"performance": {
"idle_threads": 1, // Background thread budget when idle
"standard_threads": 4, // Default thread budget during normal OCR
"fast_threads": 0, // 0 = auto, otherwise specific thread count
"idle_cores": 1, // CPU cores kept active when idle
"standard_cores": 0, // 0 = auto mid-range core count
"fast_cores": 0 // 0 = all available cores during bursts
}
}
```

Set any value to `0` (or omit the key) to allow the daemon to auto-detect from the host CPU. Leave the entire `performance` block out to use adaptive defaults.

**Daemon Management:**
```sh
# Check status
systemctl --user status caelestia-ocrd

# Restart daemon
systemctl --user restart caelestia-ocrd

# Stop daemon
systemctl --user stop caelestia-ocrd

# View logs
journalctl --user -u caelestia-ocrd -f
```

**Future Optimizations:**
- NPU acceleration via AMD XDNA (when ONNX Runtime EP is stable on Linux)
- GPU acceleration via ROCm (when Radeon 890M iGPU is officially supported)
- Parallel tile processing for ultra-high-resolution displays

## Configuring

All configuration options are in `~/.config/caelestia/cli.json`.
Expand Down
8 changes: 7 additions & 1 deletion completions/caelestia.fish
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
set -l seen '__fish_seen_subcommand_from'
set -l has_opt '__fish_contains_opt'

set -l commands shell toggle scheme screenshot record clipboard emoji-picker wallpaper resizer
set -l commands shell toggle scheme screenshot record clipboard emoji wallpaper resizer clicktodo
set -l not_seen "not $seen $commands"

# Disable file completions
Expand All @@ -20,6 +20,7 @@ complete -c caelestia -n $not_seen -a 'clipboard' -d 'Open clipboard history'
complete -c caelestia -n $not_seen -a 'emoji' -d 'Emoji/glyph utilities'
complete -c caelestia -n $not_seen -a 'wallpaper' -d 'Manage the wallpaper'
complete -c caelestia -n $not_seen -a 'resizer' -d 'Window resizer'
complete -c caelestia -n $not_seen -a 'clicktodo' -d 'OCR-based click-to-copy from screen'

# Shell
set -l commands mpris drawers wallpaper notifs
Expand Down Expand Up @@ -125,3 +126,8 @@ complete -c caelestia -n "$seen emoji" -s 'f' -l 'fetch' -d 'Fetch emoji/glyph d
complete -c caelestia -n "$seen resizer" -s 'd' -l 'daemon' -d 'Start in daemon mode'
complete -c caelestia -n "$seen resizer" -a 'pip' -d 'Quick pip mode'
complete -c caelestia -n "$seen resizer" -a 'active' -d 'Select the active window'

# Clicktodo
complete -c caelestia -n "$seen clicktodo" -s 'f' -l 'fast' -d 'Enable fast mode with aggressive optimizations'
complete -c caelestia -n "$seen clicktodo" -l 'debug' -d 'Show verbose debug output for troubleshooting'
complete -c caelestia -n "$seen clicktodo" -l 'live' -d 'Stream OCR results as they are recognized'
9 changes: 9 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,15 @@ dependencies = [
"materialyoucolor"
]

[project.optional-dependencies]
ocr = [
"rapidocr-onnxruntime>=1.3.0",
"onnxruntime>=1.16.0",
"PyQt6>=6.4.0",
"numpy>=1.24.0",
"threadpoolctl>=3.1.0"
]

[project.scripts]
caelestia = "caelestia:main"

Expand Down
111 changes: 111 additions & 0 deletions setup-ocr.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
#!/usr/bin/env bash

set -euo pipefail

SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)

find_python() {
if command -v python3 >/dev/null 2>&1; then
echo "python3"
elif command -v python >/dev/null 2>&1; then
echo "python"
else
exit 1
fi
}

PYTHON_BIN=$(find_python)

if ! "${PYTHON_BIN}" -m pip --version >/dev/null 2>&1; then
exit 1
fi

missing_pkgs=$("${PYTHON_BIN}" - <<'PY'
import importlib
modules = {
"rapidocr_onnxruntime": "rapidocr-onnxruntime",
"onnxruntime": "onnxruntime",
"numpy": "numpy",
"PyQt6": "PyQt6",
"threadpoolctl": "threadpoolctl",
}
missing = []
for module, pkg in modules.items():
try:
importlib.import_module(module)
except Exception:
missing.append(pkg)

if missing:
print(" ".join(missing))
PY
)

if [[ -n "${missing_pkgs}" ]]; then
"${PYTHON_BIN}" -m pip install --user ${missing_pkgs}
fi

SYSTEMD_USER_DIR="$HOME/.config/systemd/user"
mkdir -p "$SYSTEMD_USER_DIR"

SERVICE_FILE="${SCRIPT_DIR}/systemd/caelestia-ocrd.service"
if [[ -f "${SERVICE_FILE}" ]]; then
cp "${SERVICE_FILE}" "${SYSTEMD_USER_DIR}/"
systemctl --user daemon-reload
systemctl --user enable caelestia-ocrd.service >/dev/null 2>&1 || true
systemctl --user restart caelestia-ocrd.service || systemctl --user start caelestia-ocrd.service
fi

CONFIG_DIR="$HOME/.config/caelestia"
mkdir -p "$CONFIG_DIR"

OCR_CONFIG="$CONFIG_DIR/ocr.json"

"${PYTHON_BIN}" - <<'PY'
import json
import os
from pathlib import Path

config_path = Path(os.path.expanduser("~/.config/caelestia/ocr.json"))

DEFAULT = {
"provider": "cpu-ort",
"downscale": 0.6,
"tiles": 1,
"max_boxes": 300,
"use_gpu": False,
"warm_start": True,
"performance": {
"idle_threads": 1,
"standard_threads": 0,
"fast_threads": 0,
"idle_cores": 1,
"standard_cores": 0,
"fast_cores": 0,
},
}

if config_path.exists():
try:
data = json.loads(config_path.read_text())
except Exception:
data = {}
else:
data = {}

def deep_fill(default, target):
for key, value in default.items():
if isinstance(value, dict):
existing = target.get(key)
if not isinstance(existing, dict):
existing = {}
target[key] = existing
deep_fill(value, existing)
else:
target.setdefault(key, value)

deep_fill(DEFAULT, data)

config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(data, indent=2))
PY
Loading