The Query Engine is a unified interface layer that provides declarative, SQL-like control over both the Browser Engine and Proxy Engine in BrowserX.
Instead of imperatively calling individual APIs, express your intent through queries that the engine translates into coordinated actions across browser and proxy subsystems.
-- Extract product price from a website
SELECT price, title FROM "https://example.com/product/123"
-- Navigate with proxy configuration
NAVIGATE TO "https://api.example.com"
WITH {
  proxy: { cache: true, headers: {"Authorization": "Bearer token"} }
}
CAPTURE response.body, dom.title

The Query Engine follows a multi-stage compilation and execution pipeline:
Query String → Lexer → Parser → Semantic Analyzer → Optimizer → Planner → Executor → Formatter → Result
- Lexer - Tokenizes query string into token stream
- Parser - Builds Abstract Syntax Tree (AST) from tokens
- Semantic Analyzer - Type checking and validation
- Optimizer - Query transformation and optimization
- Planner - Physical execution plan generation
- Executor - Step-by-step query execution
- Formatter - Result formatting (JSON, table, CSV, etc.)
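To make the first pipeline stage concrete, here is a minimal sketch of how a lexer might turn a query string into a token stream. It is an illustration only, not the engine's actual lexer: the `Token` shape, the five token kinds, the short `KEYWORDS` list, and the case-insensitive keyword matching are all assumptions made for this example.

```typescript
// Hypothetical token model for illustration; the real lexer's types may differ.
type TokenKind = "keyword" | "identifier" | "string" | "number" | "punct";

interface Token {
  kind: TokenKind;
  text: string;
}

// A small subset of keywords, chosen only for this sketch.
const KEYWORDS = new Set(["SELECT", "FROM", "NAVIGATE", "TO", "WITH", "CAPTURE"]);

function tokenize(query: string): Token[] {
  const tokens: Token[] = [];
  // Alternatives: whitespace, double-quoted string, number, word, single punctuation.
  // The sticky flag (y) forces each match to start exactly at lastIndex.
  const pattern =
    /\s+|"([^"]*)"|(\d+(?:\.\d+)?)|([A-Za-z_][A-Za-z0-9_]*)|([^\sA-Za-z0-9_"])/y;
  let pos = 0;
  while (pos < query.length) {
    pattern.lastIndex = pos;
    const m = pattern.exec(query);
    if (m === null) {
      // Reached for input such as an unterminated string literal.
      throw new SyntaxError(`Unexpected character at offset ${pos}`);
    }
    pos = pattern.lastIndex;
    if (m[1] !== undefined) {
      tokens.push({ kind: "string", text: m[1] });
    } else if (m[2] !== undefined) {
      tokens.push({ kind: "number", text: m[2] });
    } else if (m[3] !== undefined) {
      const upper = m[3].toUpperCase();
      if (KEYWORDS.has(upper)) {
        tokens.push({ kind: "keyword", text: upper });
      } else {
        tokens.push({ kind: "identifier", text: m[3] });
      }
    } else if (m[4] !== undefined) {
      tokens.push({ kind: "punct", text: m[4] });
    }
    // Otherwise the match was whitespace, which is skipped.
  }
  return tokens;
}
```

Under these assumptions, tokenizing `SELECT price, title FROM "https://example.com/product/123"` yields six tokens: two keywords, two identifiers, one comma, and one string. A parser would then consume this stream to build the AST.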
import { QueryEngine } from "./query-engine/mod.ts";

// Create and initialize engine
const engine = new QueryEngine({
  browser: {
    headless: true,
    defaultTimeout: 30000,
  },
  proxy: {
    enabled: true,
    defaultCache: true,
  },
  security: {
    permissions: ["NAVIGATE_PUBLIC", "READ_COOKIES", "SCREENSHOT"],
  },
});
await engine.initialize();

// Execute a query
const result = await engine.execute(
  `SELECT title, description FROM "https://example.com"`,
);
console.log(result.data);
// { title: "Example Domain", description: "..." }

const result = await engine.execute(`
  NAVIGATE TO "https://api.example.com/users"
  WITH {
    proxy: {
      headers: {"Authorization": "Bearer token123"},
      cache: false
    },
    browser: {
      viewport: {width: 1920, height: 1080}
    }
  }
  CAPTURE response.status, response.body
`);

const result = await engine.execute(`
  IF EXISTS("#login-form") THEN
    INSERT "user@example.com" INTO "#email"
    INSERT "password" INTO "#password"
    CLICK "#submit"
  ELSE
    SELECT "Already logged in"
`);

const urls = ["https://example1.com", "https://example2.com", "https://example3.com"];
const result = await engine.execute(`
  FOR EACH url IN ${JSON.stringify(urls)}
    SELECT title, description FROM url
`);

// Start query asynchronously
const queryId = await engine.executeAsync(`
  SELECT * FROM "https://slow-site.com"
`);

// Check status
const status = await engine.getQueryStatus(queryId);
console.log(`Progress: ${status.progress}%`);

// Cancel if needed
await engine.cancelQuery(queryId);

- SELECT - Extract data from sources
- NAVIGATE - Navigate to URLs
- SET - Configure engine settings
- SHOW - Display state (cache, cookies, headers, etc.)
- FOR - Iteration
- IF - Conditional execution
- INSERT - Insert values into form fields
- UPDATE - Update element properties
- DELETE - Delete elements
- WITH - Common Table Expressions (CTEs)
- Comparison: =, !=, >, >=, <, <=
- Logical: AND, OR, NOT
- Arithmetic: +, -, *, /, %
- String: || (concatenation), LIKE, MATCHES
- Collection: IN, CONTAINS
String Functions:
- UPPER(text), LOWER(text), TRIM(text)
- SUBSTRING(text, start, length)
- REPLACE(text, pattern, replacement)
- SPLIT(text, delimiter)

DOM Functions:
- TEXT(selector) - Extract text content
- HTML(selector) - Extract HTML
- ATTR(selector, name) - Extract attribute
- COUNT(selector) - Count matching elements
- EXISTS(selector) - Check if element exists

Network Functions:
- HEADER(request, name) - Get header value
- STATUS(response) - Get status code
- BODY(response) - Get response body
- CACHED(url) - Check if URL is cached

Utility Functions:
- PARSE_JSON(text) - Parse JSON string
- PARSE_HTML(text) - Parse HTML string
- WAIT(duration) - Wait for duration
- SCREENSHOT() - Capture screenshot
- PDF() - Generate PDF
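As a rough illustration of how built-in functions like the string functions listed above could be dispatched by name, the sketch below keeps them in a lookup table. It is not the engine's implementation. The `Value` type, the `callFunction` helper, and the table itself are hypothetical, and two semantics are assumed because the list does not specify them: `SUBSTRING` uses a zero-based `start`, and `REPLACE` treats `pattern` as a literal string and replaces every occurrence.

```typescript
// Hypothetical value and function types for this sketch.
type Value = string | number | string[];
type BuiltinFn = (...args: Value[]) => Value;

function asString(v: Value): string {
  if (typeof v !== "string") throw new TypeError(`expected string, got ${typeof v}`);
  return v;
}

function asNumber(v: Value): number {
  if (typeof v !== "number") throw new TypeError(`expected number, got ${typeof v}`);
  return v;
}

const STRING_FUNCTIONS = new Map<string, BuiltinFn>([
  ["UPPER", (text) => asString(text).toUpperCase()],
  ["LOWER", (text) => asString(text).toLowerCase()],
  ["TRIM", (text) => asString(text).trim()],
  // Assumption: start is zero-based.
  ["SUBSTRING", (text, start, length) => {
    const from = asNumber(start);
    return asString(text).slice(from, from + asNumber(length));
  }],
  // Assumption: pattern is a literal string; all occurrences are replaced.
  ["REPLACE", (text, pattern, replacement) =>
    asString(text).split(asString(pattern)).join(asString(replacement))],
  ["SPLIT", (text, delimiter) => asString(text).split(asString(delimiter))],
]);

// Look up a function by name (case-insensitively) and apply it.
function callFunction(name: string, args: Value[]): Value {
  const fn = STRING_FUNCTIONS.get(name.toUpperCase());
  if (fn === undefined) throw new ReferenceError(`Unknown function: ${name}`);
  return fn(...args);
}
```

With this table, `callFunction("SUBSTRING", ["BrowserX", 0, 7])` returns `"Browser"`. Missing or mistyped arguments fail in `asString` or `asNumber` rather than producing silent `undefined` results.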
interface QueryEngineConfig {
  browser?: {
    headless?: boolean;
    defaultViewport?: { width: number; height: number };
    defaultTimeout?: number;
  };
  proxy?: {
    enabled?: boolean;
    defaultCache?: boolean;
    defaultTimeout?: number;
  };
  resources?: {
    browsers?: { min?: number; max?: number; idleTimeout?: number };
    pages?: { max?: number; idleTimeout?: number };
    connections?: { max?: number; idleTimeout?: number };
  };
  security?: {
    permissions?: Permission[];
    sandbox?: { enabled?: boolean; timeout?: number };
    rateLimit?: { perSecond?: number; perMinute?: number; perHour?: number };
  };
  metrics?: {
    enabled?: boolean;
    tracing?: boolean;
    exportFormat?: "prometheus" | "json";
  };
}

The Query Engine enforces multiple security layers:
- Permissions - Require explicit permissions for sensitive operations
- Sandboxing - Execute queries in isolated V8 contexts
- Rate Limiting - Prevent abuse through rate limits
- URL Validation - Whitelist/blacklist domains and protocols
- Resource Limits - Enforce limits on memory, duration, navigations
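The URL validation layer can be pictured with a small sketch like the one below. It is only an illustration of allow-list and block-list checking, not the engine's validator: the `UrlPolicy` shape, the `isUrlAllowed` function, and the `*.example.com` wildcard convention are assumptions introduced here.

```typescript
// Hypothetical policy shape for this sketch.
interface UrlPolicy {
  allowedProtocols: string[]; // e.g. ["https:"], as reported by URL.protocol
  allowedHosts: string[]; // exact hostnames or "*.example.com" wildcards
  blockedHosts: string[]; // same pattern syntax; takes precedence over allowedHosts
}

function hostMatches(hostname: string, pattern: string): boolean {
  if (pattern.startsWith("*.")) {
    // "*.example.com" matches "api.example.com" but not "example.com" itself.
    return hostname.endsWith(pattern.slice(1));
  }
  return hostname === pattern;
}

function isUrlAllowed(rawUrl: string, policy: UrlPolicy): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // Not a parseable absolute URL.
  }
  if (!policy.allowedProtocols.includes(url.protocol)) return false;
  const host = url.hostname.toLowerCase();
  if (policy.blockedHosts.some((p) => hostMatches(host, p))) return false;
  return policy.allowedHosts.some((p) => hostMatches(host, p));
}
```

In this sketch the block list is checked before the allow list, so a blocked host stays blocked even when a wildcard would otherwise admit it. Parsing with the standard `URL` class, rather than matching on the raw string, avoids being fooled by look-alike hosts such as `evilexample.com`.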
The engine applies several performance optimizations:
- Constant Folding - Evaluate constant expressions at compile time
- Predicate Pushdown - Move filters closer to data source
- Cache Utilization - Use cached data when possible
- Parallel Execution - Execute independent queries in parallel
- Connection Pooling - Reuse browser instances and proxy connections
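Constant folding is the easiest of these to show in a few lines. The sketch below folds arithmetic on a tiny expression tree. The `Expr` node shapes are hypothetical and much simpler than the engine's real AST, and the choice to leave division by zero unfolded is an assumption of this example.

```typescript
type ArithmeticOp = "+" | "-" | "*" | "/" | "%";

// Hypothetical, minimal AST for this sketch.
type Expr =
  | { type: "literal"; value: number }
  | { type: "identifier"; name: string }
  | { type: "binary"; op: ArithmeticOp; left: Expr; right: Expr };

function apply(op: ArithmeticOp, a: number, b: number): number {
  switch (op) {
    case "+": return a + b;
    case "-": return a - b;
    case "*": return a * b;
    case "/": return a / b;
    case "%": return a % b;
  }
}

// Recursively replace binary nodes whose operands are both literals
// with a single literal holding the computed value.
function foldConstants(expr: Expr): Expr {
  if (expr.type !== "binary") return expr;
  const left = foldConstants(expr.left);
  const right = foldConstants(expr.right);
  if (left.type === "literal" && right.type === "literal") {
    // Assumption: division or modulo by zero is left for run time to report.
    if ((expr.op === "/" || expr.op === "%") && right.value === 0) {
      return { ...expr, left, right };
    }
    return { type: "literal", value: apply(expr.op, left.value, right.value) };
  }
  return { ...expr, left, right };
}
```

For an expression such as `(2 * 3) + x`, the left operand folds to the literal `6` while the addition survives because `x` is only known at run time.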
// Get metrics
const metrics = engine.getMetrics();
console.log(metrics);
/*
{
  queries: { total: 100, successful: 95, failed: 5 },
  performance: { averageExecutionTime: 523, p50: 450, p95: 1200, p99: 2000 },
  resources: { browsers: 3, pages: 10, connections: 15, memoryUsage: 524288000 },
  errors: { byType: { NetworkError: 3, TimeoutError: 2 }, total: 5 }
}
*/

- Folder structure and type definitions
- Lexer with 80+ token types
- Parser with recursive descent parsing
- AST node definitions for all statement types
- Main QueryEngine class
- Semantic Analyzer: Type checking and validation
- Query Optimizer: Constant folding, predicate pushdown, cache utilization
- Execution Planner: Physical execution plan generation with dependency graph (GraphX-backed)
- Query Executor: Step-by-step query execution with async browser controller integration
- Browser Controller: DOM functions (CLICK, TYPE, TEXT, HTML, ATTR, COUNT, EXISTS) all async with isAsync: true
- Proxy Controller: Integration with proxy engine for network/cache queries
- Result Formatter: Multiple output formats (JSON, CSV, table)
- Security Validator: Permission enforcement and sandboxing
- Dependency Graph: DependencyGraphBuilder backed by GraphX DiGraph with topological sort and cycle detection
- Browser wiring: setCurrentBrowserController() called at execute(), clearBrowserContext() in finally block
- Error handling: Comprehensive error types and recovery
- State Manager: Cross-query state persistence
- Metrics Collector: Query performance tracking and reporting
- Error Recovery Manager: Advanced error recovery strategies
- Resource Manager: Browser instance and connection lifecycle
- Advanced SQL features: Subqueries, joins, aggregations, window functions
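The dependency graph with topological sort and cycle detection mentioned above can be illustrated with a standalone sketch of Kahn's algorithm. This does not use GraphX or `DependencyGraphBuilder`, whose APIs are not shown here. The `topologicalOrder` function and the step names in the usage note are hypothetical.

```typescript
// Input: each plan step mapped to the steps it depends on.
// Output: an order in which every step comes after its dependencies.
function topologicalOrder(dependencies: Map<string, string[]>): string[] {
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();

  // Register every step, including ones that only appear as a dependency.
  for (const [step, deps] of dependencies) {
    if (!inDegree.has(step)) inDegree.set(step, 0);
    for (const dep of deps) {
      if (!inDegree.has(dep)) inDegree.set(dep, 0);
    }
  }

  // Count incoming edges and record the reverse edges.
  for (const [step, deps] of dependencies) {
    for (const dep of deps) {
      inDegree.set(step, (inDegree.get(step) ?? 0) + 1);
      dependents.set(dep, [...(dependents.get(dep) ?? []), step]);
    }
  }

  // Kahn's algorithm: repeatedly emit steps with no unmet dependencies.
  const ready = [...inDegree].filter(([, n]) => n === 0).map(([id]) => id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any step never emitted is part of (or downstream of) a cycle.
  if (order.length !== inDegree.size) {
    throw new Error("Cycle detected in execution plan");
  }
  return order;
}
```

Given a plan where `extractTitle` and `extractPrice` both depend on `navigate`, and `format` depends on both extractions, the function returns `navigate` first and `format` last. Steps that become ready at the same time, like the two extractions, are the ones a planner could schedule in parallel.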
# Unit tests
deno test --allow-all tests/unit/
# Integration tests
deno test --allow-all tests/integration/
# All tests
deno test --allow-all tests/
deno check mod.ts
deno lint

- QueryEngine.md - High-level architecture and use cases
- QueryEngineAbstraction.md - Low-level implementation reference
- FOLDER_STRUCTURE.md - Complete folder structure outline
See the /examples directory for complete working examples:
- Web scraping with pagination
- API testing through proxy
- Visual regression testing
- Performance monitoring
- Login flow automation
- Data collection for AI/ML
Part of the BrowserX project.