
API Reference

X-GPT's web server exposes REST API endpoints for all major operations. The API is built with Elysia and returns HTML (for HTMX) or JSON based on the Accept header.

Base URL

http://localhost:3002/api

xgpt serve and bun dev default to port 3002. If you pass --port or call createServer() directly (default 3000), adjust the base URL.

Response Formats

Success (HTML)

Most endpoints return HTML for HTMX consumption:

<div class="result success">
  <strong>Success!</strong><br>
  Collected 150 tweets<br>
  Session ID: 42
</div>

Error (HTML)

<div class="result error">
  <div style="text-align: left;">
    <strong style="color: #ef4444;">Rate Limited</strong>
    <p>You've made too many requests...</p>
    <div>
      <strong>Suggestions:</strong>
      <ul>
        <li>Wait 15-30 minutes before trying again</li>
      </ul>
    </div>
  </div>
</div>

JSON Response

Send the request with Accept: application/json to receive JSON instead:

{
  "error": {
    "code": "RATE_LIMIT",
    "message": "Too many requests",
    "details": null,
    "timestamp": "2024-01-15T10:30:00.000Z"
  },
  "status": 429
}
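A client can branch on this envelope to decide whether a request is worth retrying. A minimal sketch (the classifyError helper and its retry policy are illustrative, not part of the API):

```javascript
// Illustrative helper: inspect the JSON error envelope shown above.
// The function name and retry policy are not defined by X-GPT.
function classifyError(body) {
  return {
    code: body.error.code,
    message: body.error.message,
    // 429 and 503 are transient (see the Error Codes table); retry those.
    retryable: body.status === 429 || body.status === 503,
  };
}

const envelope = {
  error: {
    code: "RATE_LIMIT",
    message: "Too many requests",
    details: null,
    timestamp: "2024-01-15T10:30:00.000Z",
  },
  status: 429,
};

console.log(classifyError(envelope).retryable); // true
```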

Request Format

The web UI sends JSON payloads via the HTMX json-enc extension. Use Content-Type: application/json as shown in the examples below.
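For programmatic use outside the web UI, a small helper can assemble the headers and body that every POST endpoint expects. A sketch (jsonRequest is a hypothetical name, not part of X-GPT):

```javascript
// Hypothetical helper: build fetch() options for the API's POST endpoints —
// a JSON body plus an Accept header requesting JSON instead of the HTML default.
function jsonRequest(payload) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Accept": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

// Usage, assuming the server from "Base URL" is running:
// const res = await fetch("http://localhost:3002/api/scrape",
//   jsonRequest({ username: "elonmusk", maxTweets: 500 }));
```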

Endpoints

Scrape Tweets

Scrape tweets from a Twitter user's profile.

POST /api/scrape
Content-Type: application/json

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| username | string | Yes | - | Twitter username (without @) |
| includeReplies | boolean | No | false | Include replies |
| includeRetweets | boolean | No | false | Include retweets |
| maxTweets | number | No | 100 | Maximum tweets to fetch |

Example:

curl -X POST http://localhost:3002/api/scrape \
  -H "Content-Type: application/json" \
  -d '{"username":"elonmusk","maxTweets":500,"includeReplies":true}'

Response Data:

  • tweetsCollected - Number of tweets scraped
  • sessionId - Session ID for tracking

Search Tweets

Search Twitter for tweets matching a query.

POST /api/search
Content-Type: application/json

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | - | Search query (comma-separated terms) |
| maxTweets | number | No | 100 | Maximum tweets to fetch |
| days | number | No | - | Search tweets from last N days (no date filter if omitted) |
| mode | string | No | "latest" | Search mode: "latest" or "top" |
| embed | boolean | No | false | Generate embeddings after search |

If days, since, and until are all omitted, the API applies no date filter. (The CLI defaults to 7 days when no date range is provided.)

Example:

curl -X POST http://localhost:3002/api/search \
  -H "Content-Type: application/json" \
  -d '{"query":"AI,machine learning","maxTweets":200,"days":30,"mode":"top","embed":true}'

Response Data:

  • tweetsCollected - Number of tweets found
  • usersCreated - Number of new users added
  • sessionId - Session ID
  • embeddingsGenerated - Whether embeddings were generated (boolean)

Discover Users

Find Twitter profiles by bio, name, or keywords.

POST /api/discover
Content-Type: application/json

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | - | Search query for profiles |
| maxResults | number | No | 20 | Maximum profiles to find |
| save | boolean | No | true | Save profiles to database |

Example:

curl -X POST http://localhost:3002/api/discover \
  -H "Content-Type: application/json" \
  -d '{"query":"AI researcher","maxResults":50,"save":true}'

Response Data:

  • profiles - Array of discovered profiles
  • savedCount - Number saved to database

Ask Question

Ask a question using semantic search over embedded tweets.

POST /api/ask
Content-Type: application/json

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| question | string | Yes | - | Question to answer |
| topK | number | No | 5 | Number of relevant tweets |
| model | string | No | "gpt-4o-mini" | OpenAI model for answering |

Example:

curl -X POST http://localhost:3002/api/ask \
  -H "Content-Type: application/json" \
  -d '{"question":"What does this person think about AI?","topK":10,"model":"gpt-4o"}'

Response Data:

  • answer - AI-generated answer
  • relevantTweets - Array of relevant tweets with similarity scores

Generate Embeddings

Generate vector embeddings for tweets without embeddings.

POST /api/embed
Content-Type: application/json

Parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | No | "text-embedding-3-small" | OpenAI embedding model |
| batchSize | number | No | 1000 | Batch size for processing |

Example:

curl -X POST http://localhost:3002/api/embed \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","batchSize":500}'

Response Data:

  • tweetsEmbedded - Number of tweets embedded
  • model - Model used

Initialize Database

Initialize or reset the database.

POST /api/db/init

Example:

curl -X POST http://localhost:3002/api/db/init

Set Configuration

Update a configuration value.

POST /api/config/set
Content-Type: application/json

Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | Config key path (e.g., "scraping.maxTweets") |
| value | string | Yes | New value |

Example:

curl -X POST http://localhost:3002/api/config/set \
  -H "Content-Type: application/json" \
  -d '{"key":"scraping.rateLimitProfile","value":"moderate"}'

Job Management

Get All Jobs

Get current job status for the taskbar.

GET /api/jobs

Returns HTML for the job taskbar.


Cancel Job

Cancel a running job.

POST /api/jobs/:id/cancel

Parameters:

| Param | Type | Description |
| --- | --- | --- |
| id | string | Job ID to cancel |

Example:

curl -X POST http://localhost:3002/api/jobs/scrape-1704067200000/cancel

Responses:

  • 200 - Job cancelled successfully
  • 404 - Job not found or already completed
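In a client, the two documented outcomes can be distinguished from the status code alone. A minimal sketch (the helper name is illustrative):

```javascript
// Illustrative mapping of the documented cancel responses to messages.
function describeCancelResult(status) {
  if (status === 200) return "Job cancelled successfully";
  if (status === 404) return "Job not found or already completed";
  return `Unexpected status: ${status}`;
}

console.log(describeCancelResult(200)); // Job cancelled successfully
```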

Job Updates Stream (SSE)

Server-Sent Events stream for real-time job updates.

GET /api/jobs/stream

Event Format:

event: jobs
data: <html content for taskbar>
data: <continued html>

: heartbeat

Client Usage:

const eventSource = new EventSource('/api/jobs/stream');

eventSource.addEventListener('jobs', (event) => {
  document.getElementById('taskbar').innerHTML = event.data;
});

eventSource.onerror = () => {
  // Reconnect on error
  setTimeout(() => location.reload(), 5000);
};

Error Codes

| HTTP Status | Error Code | Description |
| --- | --- | --- |
| 400 | BAD_REQUEST | Invalid request parameters |
| 401 | UNAUTHORIZED | Authentication required |
| 403 | FORBIDDEN | Access denied |
| 404 | NOT_FOUND | Resource not found |
| 422 | VALIDATION_ERROR | Input validation failed |
| 429 | RATE_LIMIT | Rate limit exceeded |
| 500 | INTERNAL_ERROR | Server error |
| 503 | SERVICE_UNAVAILABLE | Service temporarily unavailable |
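Clients that only see the numeric status can recover the error code from a lookup mirroring the table above. A sketch (the ERROR_CODES object is illustrative, not exported by X-GPT):

```javascript
// Status → error code lookup, mirroring the Error Codes table above.
const ERROR_CODES = {
  400: "BAD_REQUEST",
  401: "UNAUTHORIZED",
  403: "FORBIDDEN",
  404: "NOT_FOUND",
  422: "VALIDATION_ERROR",
  429: "RATE_LIMIT",
  500: "INTERNAL_ERROR",
  503: "SERVICE_UNAVAILABLE",
};

console.log(ERROR_CODES[429]); // RATE_LIMIT
```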

HTMX Integration

All endpoints are designed for HTMX integration:

<!-- Scrape form -->
<form hx-post="/api/scrape" hx-target="#results">
  <input name="username" required>
  <input name="maxTweets" type="number" value="100">
  <button type="submit">Scrape</button>
</form>
<div id="results"></div>

<!-- Job taskbar with SSE -->
<div id="taskbar"
     hx-ext="sse"
     sse-connect="/api/jobs/stream"
     sse-swap="jobs">
</div>

Rate Limiting

The API inherits Twitter's rate limits. Best practices:

  • Use the conservative rate limit profile for safety
  • Wait 15-30 minutes after rate limit errors
  • Start with small maxTweets values
  • Avoid concurrent requests
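The wait guidance above can be expressed as a simple backoff schedule. A sketch (the function name and linear ramp are illustrative; only the 15-30 minute window comes from the list above):

```javascript
// Illustrative backoff: 15 minutes after the first rate-limit error,
// capped at 30 minutes thereafter (the "wait 15-30 minutes" guidance).
function rateLimitBackoffMs(attempt) {
  const FIFTEEN_MIN = 15 * 60 * 1000;
  return Math.min((attempt + 1) * FIFTEEN_MIN, 2 * FIFTEEN_MIN);
}

console.log(rateLimitBackoffMs(0) / 60000); // 15
console.log(rateLimitBackoffMs(3) / 60000); // 30
```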

Authentication

The API uses environment variables for authentication:

  • OPENAI_KEY - Required for embedding and ask endpoints
  • AUTH_TOKEN - Twitter auth token for scraping
  • CT0 - Twitter CSRF token for scraping

These are configured in .env or via the config system.

Related Documentation