Use Codex or Cursor agents from Python as easily as calling a function, using your CLI auth instead of the API.
Note: this project is not affiliated with OpenAI in any way. Thanks for the awesome tools and models though!
- Codex CLI installed and authenticated (`codex` must be on your PATH), or
- Cursor Agent CLI installed and authenticated (`cursor` must be on your PATH).
- Python 3.8+.
```bash
pip install codexapi
```

```python
from codexapi import agent, Agent, Task

# Run one-shot tasks as a function call
print(agent("Say hello"))

# Run a multi-turn conversation as a session
session = Agent(cwd="/path/to/project")
print(session("Summarize this repo."))
print(session("Now list any risks."))

# Save and resume a session later
thread_id = session.thread_id
session2 = Agent(cwd="/path/to/project", thread_id=thread_id)
print(session2("Continue from where we left off."))

# Define a task with a checker
class RepoTask(Task):
    def check(self):
        # Return an error string if something is wrong, or None/"" if OK
        return None

task = RepoTask("Summarize this repo.", cwd="/path/to/project")
result = task()
print(result.success, result.summary)
```

Use `backend="cursor"` (or set `CODEXAPI_BACKEND=cursor`) to switch to the
Cursor agent backend.
After installing, use the `codexapi` command:

```bash
codexapi run "Summarize this repo."
codexapi run --cwd /path/to/project "Fix the failing tests."
echo "Say hello." | codexapi run
codexapi run --backend cursor "Summarize this repo."
```

`codexapi task` exits with code 0 on success and 1 on failure.
```bash
codexapi task "Fix the failing tests." --max-iterations 5
codexapi task -f task.yaml
codexapi task -f task.yaml -i README.md
```

Create a new task file template:
```bash
codexapi create task.yaml
codexapi create my_task  # adds .yaml
```

Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
When using --item, the task file must include at least one {{item}} placeholder.
Task files default to using the standard check prompt for the task. Set check: "None" to skip verification.
Use max_iterations in the task file to override the default iteration cap (0 means unlimited).
Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with success/reason.
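
The `{{item}}` rule above can be illustrated with a minimal sketch. The helper name `render_prompt` is hypothetical and not part of codexapi; it assumes plain string replacement and only shows the kind of validation the rule describes.

```python
PLACEHOLDER = "{{item}}"


def render_prompt(template: str, item: str) -> str:
    """Substitute every {{item}} placeholder in a task prompt (hypothetical helper)."""
    if PLACEHOLDER not in template:
        # Mirrors the documented rule: using --item requires at least one placeholder.
        raise ValueError("task prompt must include at least one {{item}} placeholder")
    return template.replace(PLACEHOLDER, item)


print(render_prompt("Review {{item}} and fix any typos in {{item}}.", "README.md"))
```

Failing early on a missing placeholder avoids running an agent turn that silently ignores the item.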
Take tasks from a GitHub Project (requires gh-task):

```bash
codexapi task -p owner/projects/3 -n "Your Name" -s Ready task_a.yaml task_b.yaml
```

Filter project issues by title before taking them:

```bash
codexapi task -p owner/projects/3 -n "Your Name" --only-matching "/n300/" task_a.yaml task_b.yaml
```

Reset owned tasks on a GitHub Project back to Ready:
```bash
codexapi reset -p owner/projects/3
codexapi reset -p owner/projects/3 -d  # also removes the Progress section
```

Task labels are derived from task filenames (basename without extension). The
issue title/body become `{{item}}` after removing any existing `## Progress`
section.
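
As a rough illustration of the two rules above, the sketch below derives a label from a task filename and strips a `## Progress` section from an issue body. Both function names are hypothetical, and the section boundary (from the `## Progress` heading up to the next `## ` heading or the end of the text) is an assumption, not codexapi's documented behavior.

```python
import re
from pathlib import Path

# Assumption: the section runs from its heading to the next level-2 heading or end of text.
_PROGRESS_RE = re.compile(
    r"^## Progress[ \t]*(?:\n|\Z).*?(?=^## |\Z)",
    re.MULTILINE | re.DOTALL,
)


def task_label(task_file: str) -> str:
    """Basename without extension, e.g. 'tasks/task_a.yaml' -> 'task_a'."""
    return Path(task_file).stem


def strip_progress_section(body: str) -> str:
    """Remove an existing '## Progress' section from an issue body (hypothetical helper)."""
    return _PROGRESS_RE.sub("", body).rstrip()


body = "Fix the parser.\n\n## Progress\n- step 1 done\n\n## Notes\nKeep the API stable.\n"
print(task_label("tasks/task_a.yaml"))
print(strip_progress_section(body))
```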
Example task progress run:

```bash
./examples/example_task_progress.sh
```

Show running sessions and their latest activity:

```bash
codexapi top
```

Press `h` for keys.
codexapi top and codexapi limit are Codex-only.
Resume a session and print the thread/session id to stderr:

```bash
codexapi run --thread-id THREAD_ID --print-thread-id "Continue where we left off."
```

Use `--no-yolo` to disable `--yolo` (Codex uses `--full-auto`).
Use --include-thinking to return all agent messages joined together for codexapi run (Codex only).
Lead mode periodically checks in on a long-running agent session with the
current time and prints JSON status updates. The agent controls the loop by
setting continue to true/false in its JSON response. Each check-in expects
JSON keys:
status (one line), continue (bool), and optional comments (string). If the
JSON is invalid, lead asks the agent once to retry before stopping with an
error. When ~/.pushover is configured, lead sends a notification when it
stops.
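
The check-in contract described above can be sketched as a small validator. This is not codexapi's implementation; `parse_checkin` is a hypothetical name, and rejecting a multi-line `status` is an assumption based on the "one line" wording.

```python
import json
from typing import Optional, Tuple


def parse_checkin(raw: str) -> Tuple[str, bool, Optional[str]]:
    """Validate one check-in reply and return (status, continue, comments)."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("check-in must be a JSON object")
    status = data.get("status")
    if not isinstance(status, str) or "\n" in status.strip():
        raise ValueError("'status' must be a one-line string")
    keep_going = data.get("continue")
    if not isinstance(keep_going, bool):
        raise ValueError("'continue' must be true or false")
    comments = data.get("comments")
    if comments is not None and not isinstance(comments, str):
        raise ValueError("'comments' must be a string when present")
    return status.strip(), keep_going, comments


print(parse_checkin('{"status": "benchmark running", "continue": true}'))
```

A caller following the documented behavior would catch `ValueError` once, ask the agent to retry, and stop with an error on a second failure.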
Lead mode also uses a leadbook file as the agent's working page. By default this
is LEADBOOK.md in the working directory. The leadbook content is injected into
each check-in prompt and must be updated before the agent responds. Use
--leadbook PATH to point at a different file, or --no-leadbook to disable.
If the leadbook does not exist, lead creates it with a template.
Use `-f`/`--prompt-file` to read the prompt from a file.
```bash
codexapi lead 5 "Run the benchmark and wait for results."
```

Run without waiting between check-ins:

```bash
codexapi lead 0 "Do a rapid triage pass and report."
```

Ralph loop mode repeats the same prompt until a completion promise or a max
iteration cap is hit (0 means unlimited). Cancel by deleting
`.codexapi/ralph-loop.local.md` or running `codexapi ralph --cancel`.
By default each iteration starts with a fresh Agent context; use
`--ralph-reuse` to keep a single shared context across iterations.
The agent may also stop early by outputting `MAKE IT STOP` as the first
non-empty line of its message.

```bash
codexapi ralph "Fix the bug." --completion-promise DONE --max-iterations 5
codexapi ralph --ralph-reuse "Try again from the same context." --max-iterations 3
codexapi ralph --cancel --cwd /path/to/project
```
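
The early-stop rule for the Ralph loop (`MAKE IT STOP` as the first non-empty line) can be sketched as follows. The function name `wants_to_stop` is hypothetical, and matching the phrase exactly after trimming whitespace is an assumption about how strict the real check is.

```python
STOP_PHRASE = "MAKE IT STOP"


def wants_to_stop(message: str) -> bool:
    """True when the first non-empty line of the agent message is the stop phrase."""
    for line in message.splitlines():
        if line.strip():
            # Only the first non-empty line counts; later mentions are ignored.
            return line.strip() == STOP_PHRASE
    return False


print(wants_to_stop("\nMAKE IT STOP\nI cannot make further progress."))
print(wants_to_stop("Done.\nMAKE IT STOP"))
```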
Science mode wraps a short task in a science prompt and runs it through the
Ralph loop. It defaults to --yolo and expects progress notes in SCIENCE.md.
Each iteration appends the agent output to LOGBOOK.md and the runner extracts
any improved figures of merit for optional notifications. You can also set
--max-duration to stop after the current iteration once a time limit is hit.
The default science wrapper also tells the agent to create/use a local git
branch when in a repo and make local commits for worthwhile improvements, while
never committing or resetting LOGBOOK.md or SCIENCE.md.
```bash
codexapi science "hyper-optimize the kernel cycles"
codexapi science --no-yolo "hyper-optimize the kernel cycles" --max-iterations 3
codexapi science "hyper-optimize the kernel cycles" --max-duration 90m
```

Optional Pushover notifications: create `~/.pushover` with two non-empty lines.
Line 1 is your user or group key, line 2 is the app API token. When this file
exists, Science will send a notification whenever it detects a new best result,
including the metric values and percent improvement, plus a final run-end status.
Task runs will also send a
✅/❌ notification with the task summary. Lead runs send a notification when the
loop stops.
Run a task file across a list file:

```bash
codexapi foreach list.txt task.yaml
codexapi foreach list.txt task.yaml -n 4
codexapi foreach list.txt task.yaml --retry-failed
codexapi foreach list.txt task.yaml --retry-all
```

`agent()` runs a single agent turn and returns only the agent's message. Any reasoning items are filtered out.
- `prompt` (str): prompt to send to the agent backend.
- `cwd` (str | PathLike | None): working directory for the agent session.
- `yolo` (bool): pass `--yolo` when true (defaults to true).
- `flags` (str | None): extra CLI flags to pass to the agent backend.
- `include_thinking` (bool): when true, return all agent messages joined.
- `backend` (str | None): `codex` or `cursor` (defaults to `CODEXAPI_BACKEND` or `codex`).
Agent(cwd=None, yolo=True, thread_id=None, flags=None, welfare=False, include_thinking=False, backend=None)
Creates a stateful session wrapper. Calling the instance sends the prompt into the same conversation and returns only the agent's message.
- `__call__(prompt) -> str`: send a prompt to the agent backend and return the message.
- `thread_id -> str | None`: expose the underlying session id once created.
- `yolo` (bool): pass `--yolo` when true (defaults to true).
- `flags` (str | None): extra CLI flags to pass to the agent backend.
- `welfare` (bool): when true, append welfare stop instructions to each prompt and raise `WelfareStop` if the agent outputs `MAKE IT STOP`.
- `include_thinking` (bool): when true, return all agent messages joined.
- `backend` (str | None): `codex` or `cursor` (defaults to `CODEXAPI_BACKEND` or `codex`). For Cursor, `thread_id` corresponds to the `session_id` returned by the agent.
Lead runs a long-lived agent session and periodically checks in with the current
local time and a reminder of `prompt`. Each check-in expects JSON with keys:
status (one line), continue (bool), and optional comments (string). If the
JSON is invalid, lead asks the agent once to retry. The loop stops when
continue is false and sends a Pushover notification (when configured).
Lead also injects the leadbook content into each prompt. By default it uses
LEADBOOK.md in the working directory. Pass leadbook=False to disable or a
path string to override the location.
Set backend="cursor" (or CODEXAPI_BACKEND=cursor) to use Cursor.
task(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None, backend=None) -> str
Runs a task with checker-driven retries and returns the success summary.
Raises TaskFailed when the maximum iterations are reached.
- `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
- `max_iterations` (int): maximum number of task iterations (0 means unlimited).
- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
- `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
- `backend` (str | None): `codex` or `cursor` (defaults to `CODEXAPI_BACKEND` or `codex`).
task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None, backend=None) -> TaskResult
Runs a task with checker-driven retries and returns a TaskResult without
raising TaskFailed.
Arguments mirror task() (including hooks).
`Task` runs an agent task with checker-driven retries. Subclass it and implement
check() to return an error string when the task is incomplete, or return
None/"" when the task passes.
If you do not override check(), the default verifier wrapper runs with the
default check prompt and includes the agent output.
- `__call__(debug=False, progress=False) -> TaskResult`: run the task.
- `set_up()`: optional setup hook.
- `tear_down()`: optional cleanup hook.
- `check(output=None) -> str | None`: return an error description or `None`/`""`. `output` is the last agent response.
- `on_success(result)`: optional success hook.
- `on_failure(result)`: optional failure hook.
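
To make the `check()` contract concrete (an error string when something is wrong, `None`/`""` when the task passes), here is a minimal sketch of a standalone checker that a subclass could delegate to. The function name and the `SUMMARY.md` filename are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def check_summary_exists(cwd: str, filename: str = "SUMMARY.md") -> Optional[str]:
    """Return an error string when the expected file is missing or empty, else None."""
    target = Path(cwd) / filename
    if not target.is_file():
        return f"{filename} was not created in {cwd}"
    if not target.read_text(encoding="utf-8").strip():
        return f"{filename} exists but is empty"
    return None


# Demo in a temporary directory: first the failing case, then the passing one.
with tempfile.TemporaryDirectory() as tmp:
    print(check_summary_exists(tmp))
    (Path(tmp) / "SUMMARY.md").write_text("Repo overview.\n", encoding="utf-8")
    print(check_summary_exists(tmp))

# Wiring it into a subclass (requires codexapi and an authenticated CLI):
#
#     from codexapi import Task
#
#     class SummaryTask(Task):
#         def check(self, output=None):
#             return check_summary_exists("/path/to/project")
```

Keeping the check as a plain function makes it testable without running an agent.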
`TaskResult` is a simple result object returned by `Task.__call__`.
- `success` (bool): whether the task completed successfully.
- `summary` (str): agent summary of what happened.
- `iterations` (int): how many iterations were used.
- `errors` (str | None): last checker error, if any.
- `thread_id` (str | None): thread/session id for the session.
`TaskFailed` is the exception raised by `task()` when iterations are exhausted.
- `summary` (str): failure summary text.
- `iterations` (int | None): iterations made when the task failed.
- `errors` (str | None): last checker error, if any.
foreach(list_file, task_file, n=None, cwd=None, yolo=True, flags=None, backend=None) -> ForeachResult
Runs a task file over a list of items, updating the list file in place.
- `list_file` (str | PathLike): path to the list file to process.
- `task_file` (str | PathLike): YAML task file (must include `prompt`).
- `n` (int | None): limit parallelism to N (default: run all items in parallel).
- `cwd` (str | PathLike | None): working directory for the agent session.
- `yolo` (bool): pass `--yolo` when true (defaults to true).
- `flags` (str | None): extra CLI flags to pass to the agent backend.
- `backend` (str | None): `codex` or `cursor` (defaults to `CODEXAPI_BACKEND` or `codex`).
`ForeachResult` is a simple result object returned by `foreach()`.
- `succeeded` (int): number of successful items.
- `failed` (int): number of failed items.
- `skipped` (int): number of items skipped (already marked in the list file).
- `results` (list[tuple]): `(item, success, summary)` entries for items that ran.
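
One way to consume these fields is sketched below. `ForeachResultSketch` is a stand-in with the documented attribute names, not the real codexapi class, and the sample data is made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class ForeachResultSketch(NamedTuple):
    """Stand-in mirroring the documented fields of the foreach() result."""
    succeeded: int
    failed: int
    skipped: int
    results: List[Tuple[str, bool, str]]  # (item, success, summary)


def failure_report(result: ForeachResultSketch) -> str:
    """Build a one-line tally followed by one line per failed item."""
    lines = [f"{result.succeeded} ok, {result.failed} failed, {result.skipped} skipped"]
    for item, success, summary in result.results:
        if not success:
            lines.append(f"FAILED {item}: {summary}")
    return "\n".join(lines)


sample = ForeachResultSketch(
    succeeded=1,
    failed=1,
    skipped=0,
    results=[("a.py", True, "Refactored."), ("b.py", False, "Tests still fail.")],
)
print(failure_report(sample))
```

The same function should work on any object exposing those four attributes.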
- Codex backend uses `codex exec --json` and parses JSONL `agent_message` items.
- Codex backend passes `--skip-git-repo-check` so it can run outside a git repo.
- Cursor backend uses `cursor agent --print --output-format json --trust` and parses the JSON result.
- `include_thinking=True` only affects Codex; Cursor returns a single result string.
- Passes `--yolo` by default (Codex uses `--full-auto` when disabled).
- Raises `RuntimeError` if the backend exits non-zero or returns no agent message.
Set the default backend:

```bash
export CODEXAPI_BACKEND=cursor
```

Set `CODEX_BIN` to point at a non-default Codex binary:

```bash
export CODEX_BIN=/path/to/codex
```

Set `CURSOR_BIN` to point at a non-default Cursor binary:

```bash
export CURSOR_BIN=/path/to/cursor
```
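
How these variables might combine with the `backend` argument can be sketched as follows. This is an illustration under stated assumptions, not codexapi's code: `resolve_backend` is a hypothetical name, an explicit argument is assumed to win over `CODEXAPI_BACKEND`, and the default binaries are assumed to be `codex` and `cursor` as found on your PATH.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_BINARIES = {"codex": "codex", "cursor": "cursor"}
BINARY_ENV_VARS = {"codex": "CODEX_BIN", "cursor": "CURSOR_BIN"}


def resolve_backend(
    backend: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (backend_name, binary) from the argument, then the environment, then defaults."""
    name = backend or env.get("CODEXAPI_BACKEND") or "codex"
    if name not in DEFAULT_BINARIES:
        raise ValueError(f"unknown backend: {name!r}")
    # A per-backend override such as CODEX_BIN replaces the default binary name.
    binary = env.get(BINARY_ENV_VARS[name]) or DEFAULT_BINARIES[name]
    return name, binary


print(resolve_backend(env={"CODEXAPI_BACKEND": "cursor", "CURSOR_BIN": "/path/to/cursor"}))
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.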