Complete API reference for the LessTokens Python SDK.

## LessTokensSDK

Main SDK class for interacting with the LessTokens API and LLM providers.
### Constructor

```python
LessTokensSDK(config: LessTokensConfig)
```

**Parameters:**

- `config` (`LessTokensConfig`): SDK configuration
  - `api_key` (str, required): LessTokens API key
  - `provider` (str, required): LLM provider name (`'openai'`, `'anthropic'`, `'google'`, `'deepseek'`)
  - `base_url` (str, optional): Base URL for the LessTokens API (default: `'https://lesstokens.hive-hub.ai'`)
  - `timeout` (int, optional): Request timeout in milliseconds (default: `30000`)
**Raises:**

- `LessTokensError`: If the configuration is invalid
**Example:**

```python
sdk = LessTokensSDK(
    api_key="your-api-key",
    provider="openai",
    base_url="https://lesstokens.hive-hub.ai",
    timeout=30000,
)
```

### process_prompt

Process a prompt through LessTokens compression and send it to the LLM.
**Parameters:**

- `options` (`ProcessPromptOptions`): Processing options
  - `prompt` (str, required): The prompt to compress and send
  - `llm_config` (`LLMConfig`, required): LLM provider configuration
  - `compression_options` (`CompressionOptions`, optional): Compression settings
  - `message_role` (str, optional): Custom message role (default: `'user'`)
  - `message_content` (`str | Callable`, optional): Custom message content
  - `messages` (`List[Dict[str, str]]`, optional): Additional messages for multi-turn conversations
**Returns:**

- `LLMResponse`: LLM response with compression metrics

**Raises:**

- `LessTokensError`: If compression or the LLM request fails
**Example:**

```python
response = await sdk.process_prompt({
    "prompt": "Explain quantum computing",
    "llm_config": {
        "api_key": "sk-...",
        "model": "gpt-4",
        "temperature": 0.7,
    },
})
```

### process_prompt_stream

Process a prompt with a streaming response.
**Parameters:**

- Same as `process_prompt`
**Returns:**

- `AsyncIterator[StreamChunk]`: Async iterable of stream chunks

**Raises:**

- `LessTokensError`: If compression or the LLM request fails
**Example:**

```python
async for chunk in sdk.process_prompt_stream({
    "prompt": "Tell a story",
    "llm_config": {
        "api_key": "sk-...",
        "model": "gpt-4",
    },
}):
    if chunk.done:
        print(f"Usage: {chunk.usage}")
    else:
        print(chunk.content, end="", flush=True)
```

### compress_prompt

Compress a prompt without sending it to an LLM.
**Parameters:**

- `prompt` (str): The prompt to compress
- `options` (`CompressionOptions`, optional): Compression settings
**Returns:**

- `CompressedPrompt`: Compression results

**Raises:**

- `LessTokensError`: If compression fails
**Example:**

```python
compressed = await sdk.compress_prompt(
    "Very long prompt...",
    {
        "target_ratio": 0.3,
        "aggressive": True,
    },
)
```

## Types

### LessTokensConfig

Configuration for initializing the LessTokens SDK.
```python
{
    "api_key": str,             # Required
    "provider": str,            # Required: 'openai', 'anthropic', 'google', 'deepseek'
    "base_url": Optional[str],  # Optional
    "timeout": Optional[int],   # Optional, milliseconds
}
```

### ProcessPromptOptions

Options for processing a prompt.
```python
{
    "prompt": str,                  # Required
    "llm_config": LLMConfig,        # Required
    "compression_options": Optional[CompressionOptions],
    "message_role": Optional[str],  # Default: 'user'
    "message_content": Optional[str | Callable[[CompressedPrompt], str]],
    "messages": Optional[List[Dict[str, str]]],  # For multi-turn conversations
}
```

### LLMConfig

LLM API configuration. Supports all provider-specific options.
```python
{
    "api_key": str,                  # Required
    "model": str,                    # Required
    "temperature": Optional[float],  # 0.0 to 2.0
    "max_tokens": Optional[int],
    "top_p": Optional[float],
    "frequency_penalty": Optional[float],
    "presence_penalty": Optional[float],
    "stop": Optional[List[str]],
    # ... all provider-specific options
}
```

### CompressionOptions

Compression options.
```python
{
    "target_ratio": Optional[float],     # 0.0 to 1.0, default: 0.5
    "preserve_context": Optional[bool],  # Default: True
    "aggressive": Optional[bool],        # Default: False
}
```

### LLMResponse

LLM response with usage metrics.
```python
@dataclass
class LLMResponse:
    content: str                          # Response content
    usage: TokenUsage                     # Token usage information
    metadata: Optional[ResponseMetadata]  # Response metadata
```

### TokenUsage

Token usage metrics.
```python
@dataclass
class TokenUsage:
    prompt_tokens: int                # Original prompt tokens
    completion_tokens: int            # Completion tokens
    total_tokens: int                 # Total tokens
    compressed_tokens: Optional[int]  # Compressed tokens (if compression was used)
    savings: Optional[float]          # Savings percentage (0-100)
```

### ResponseMetadata

Response metadata.
```python
@dataclass
class ResponseMetadata:
    model: Optional[str]                # Model used
    provider: Optional[str]             # Provider name
    timestamp: Optional[str]            # ISO timestamp
    compression_ratio: Optional[float]  # Compression ratio (if compression was used)
```

### CompressedPrompt

Compressed prompt result.
```python
@dataclass
class CompressedPrompt:
    compressed: str         # Compressed prompt text
    original_tokens: int    # Original token count
    compressed_tokens: int  # Compressed token count
    savings: float          # Savings percentage (0-100)
    ratio: float            # Compression ratio (compressed_tokens / original_tokens)
```

### StreamChunk

Streaming chunk.
```python
@dataclass
class StreamChunk:
    content: str                 # Chunk content
    done: bool                   # Whether this is the final chunk
    usage: Optional[TokenUsage]  # Usage information (available when done is True)
```

## Errors

### LessTokensError

Base error class for all LessTokens SDK errors.
**Attributes:**

- `message` (str): Error message
- `code` (str): Error code
- `status_code` (`Optional[int]`): HTTP status code (if applicable)
- `details` (`Optional[Any]`): Additional error details
**Example:**

```python
try:
    response = await sdk.process_prompt(...)
except LessTokensError as e:
    print(f"Error {e.code}: {e.message}")
```

### ErrorCodes

Error code constants.
- `INVALID_API_KEY`: Invalid API key
- `INVALID_PROVIDER`: Invalid provider
- `COMPRESSION_FAILED`: Compression failed
- `LLM_API_ERROR`: LLM API error
- `TIMEOUT`: Request timeout
- `NETWORK_ERROR`: Network error
- `VALIDATION_ERROR`: Validation error
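These constants map naturally onto retry-or-fail decisions in application code. The sketch below shows one way a caller might branch on `e.code`; the `LessTokensError` class is stubbed locally so the snippet is self-contained, and the `classify` helper is illustrative, not part of the SDK:

```python
# Stand-in for the SDK's LessTokensError, defined here only so this
# snippet runs on its own. In real code, import it from the SDK package.
class LessTokensError(Exception):
    def __init__(self, message, code, status_code=None, details=None):
        super().__init__(message)
        self.message = message
        self.code = code
        self.status_code = status_code
        self.details = details

# Transient failures that are worth retrying.
RETRYABLE = {"TIMEOUT", "NETWORK_ERROR"}
# Failures caused by bad input or credentials; retrying won't help.
CONFIG_ERRORS = {"INVALID_API_KEY", "INVALID_PROVIDER", "VALIDATION_ERROR"}

def classify(error: LessTokensError) -> str:
    """Suggest a caller reaction for each error code."""
    if error.code in RETRYABLE:
        return "retry"
    if error.code in CONFIG_ERRORS:
        return "fix-config"
    return "fail"

print(classify(LessTokensError("request timed out", "TIMEOUT")))  # retry
```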
## Retry Utilities

Retry utilities with exponential backoff.

Retry a function with exponential backoff.

**Parameters:**

- `fn`: Async function to retry
- `config`: Optional retry configuration
  - `max_retries` (int, default: `3`)
  - `initial_delay` (float, default: `1.0` seconds)
  - `max_delay` (float, default: `10.0` seconds)
  - `retryable_errors` (`List[str]`, default: `["TIMEOUT", "NETWORK_ERROR", "RATE_LIMIT"]`)
**Returns:**

- The result of the function call
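The behavior described above can be approximated as follows. This is a minimal sketch using the documented defaults; the name `retry_with_backoff`, the flat keyword-argument signature, and the use of an exception's `code` attribute are assumptions for illustration, not the SDK's actual API:

```python
import asyncio

async def retry_with_backoff(fn, max_retries=3, initial_delay=1.0,
                             max_delay=10.0,
                             retryable_errors=("TIMEOUT", "NETWORK_ERROR", "RATE_LIMIT")):
    """Call `fn` until it succeeds, doubling the delay after each failure."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except Exception as exc:
            # Give up on the last attempt, or when the error is not retryable.
            if attempt == max_retries or getattr(exc, "code", None) not in retryable_errors:
                raise
            await asyncio.sleep(min(delay, max_delay))  # cap at max_delay
            delay *= 2  # exponential backoff
```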
## Validation Utilities

Input validation utilities:

- Validate LessTokens configuration.
- Validate prompt.
- Validate process prompt options.
- Validate LLM configuration.
- Validate compression options.
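As an illustration of what the configuration validator might enforce, based on the `LessTokensConfig` fields documented above: the function name `validate_config` and the specific error messages are hypothetical, since this reference does not show the validators' actual names or behavior.

```python
# Hypothetical sketch of config validation against the documented
# LessTokensConfig fields. Not the SDK's actual implementation.
VALID_PROVIDERS = {"openai", "anthropic", "google", "deepseek"}

def validate_config(config: dict) -> None:
    """Raise ValueError if required fields are missing or malformed."""
    if not config.get("api_key"):
        raise ValueError("api_key is required")
    if config.get("provider") not in VALID_PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(VALID_PROVIDERS)}")
    timeout = config.get("timeout")
    if timeout is not None and (not isinstance(timeout, int) or timeout <= 0):
        raise ValueError("timeout must be a positive integer (milliseconds)")

validate_config({"api_key": "your-api-key", "provider": "openai"})  # passes silently
```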