
edgeFlow.js

Browser ML inference framework with task scheduling and smart caching.


Documentation · Examples · API Reference · English | 中文


✨ Features

  • 📋 Task Scheduler - Priority queue, concurrency control, task cancellation
  • 🔄 Batch Processing - Efficient batch inference out of the box
  • 💾 Memory Management - Automatic memory tracking and cleanup with scopes
  • 📥 Smart Model Loading - Preloading, sharding, resume download support
  • 💿 Offline Caching - IndexedDB-based model caching for offline use
  • ⚡ Multi-Backend - WebGPU, WebNN, WASM with automatic fallback
  • 🤗 HuggingFace Hub - Direct model download with one line
  • 🔤 Real Tokenizers - BPE & WordPiece tokenizers, load tokenizer.json directly
  • 👷 Web Worker Support - Run inference in background threads
  • 📦 Batteries Included - ONNX Runtime bundled, zero configuration needed
  • 🎯 TypeScript First - Full type support with intuitive APIs

📦 Installation

npm install edgeflowjs
yarn add edgeflowjs
pnpm add edgeflowjs

Note: ONNX Runtime is included as a dependency. No additional setup required.

🚀 Quick Start

Try the Demo

Run the interactive demo locally to test all features:

# Clone and install
git clone https://github.com/jasonsss-ops/edgeFlow.js.git
cd edgeFlow.js
npm install

# Build and start demo server
npm run demo

Open http://localhost:3000 in your browser:

  1. Load Model - Enter a Hugging Face ONNX model URL and click "Load Model"

    https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/onnx/model_quantized.onnx
    
  2. Test Features:

    • 🧮 Tensor Operations - Test tensor creation, math ops, softmax, relu
    • 📝 Text Classification - Run sentiment analysis on text
    • 🔍 Feature Extraction - Extract embeddings from text
    • 📋 Task Scheduler - Test priority-based task scheduling
    • 💾 Memory Management - Test allocation and cleanup

Basic Usage

import { pipeline } from 'edgeflowjs';

// Create a sentiment analysis pipeline
const sentiment = await pipeline('sentiment-analysis');

// Run inference
const result = await sentiment.run('I love this product!');
console.log(result);
// { label: 'positive', score: 0.98, processingTime: 12.5 }

Batch Processing

// Native batch processing support
const results = await sentiment.run([
  'This is amazing!',
  'This is terrible.',
  'It\'s okay I guess.'
]);

console.log(results);
// [
//   { label: 'positive', score: 0.95 },
//   { label: 'negative', score: 0.92 },
//   { label: 'neutral', score: 0.68 }
// ]

Multiple Pipelines

import { pipeline } from 'edgeflowjs';

// Create multiple pipelines
const classifier = await pipeline('text-classification');
const extractor = await pipeline('feature-extraction');

// Run in parallel with Promise.all
const [classification, features] = await Promise.all([
  classifier.run('Sample text'),
  extractor.run('Sample text')
]);

Image Classification

import { pipeline } from 'edgeflowjs';

const classifier = await pipeline('image-classification');

// From URL
const result = await classifier.run('https://example.com/image.jpg');

// From an HTMLImageElement
const img = document.getElementById('myImage');
const imgResult = await classifier.run(img);

// Batch
const results = await classifier.run([img1, img2, img3]);

Text Generation (Streaming)

import { pipeline } from 'edgeflowjs';

const generator = await pipeline('text-generation');

// Simple generation
const result = await generator.run('Once upon a time', {
  maxNewTokens: 50,
  temperature: 0.8,
});
console.log(result.generatedText);

// Streaming output (process.stdout is Node-only; accumulate tokens in the browser)
let text = '';
for await (const event of generator.stream('Hello, ')) {
  text += event.token;
  if (event.done) break;
}
console.log(text);

Zero-shot Classification

import { pipeline } from 'edgeflowjs';

const classifier = await pipeline('zero-shot-classification');

const result = await classifier.classify(
  'I love playing soccer on weekends',
  ['sports', 'politics', 'technology', 'entertainment']
);

console.log(result.labels[0], result.scores[0]);
// 'sports', 0.92

Question Answering

import { pipeline } from 'edgeflowjs';

const qa = await pipeline('question-answering');

const result = await qa.run({
  question: 'What is the capital of France?',
  context: 'Paris is the capital and largest city of France.'
});

console.log(result.answer); // 'Paris'

Load from HuggingFace Hub

import { fromHub, fromTask } from 'edgeflowjs';

// Load by model ID (auto-downloads model, tokenizer, config)
const bundle = await fromHub('Xenova/distilbert-base-uncased-finetuned-sst-2-english');
console.log(bundle.tokenizer); // Tokenizer instance
console.log(bundle.config);    // Model config

// Load by task name (uses recommended model)
const sentimentBundle = await fromTask('sentiment-analysis');

Web Workers (Background Inference)

import { runInWorker, WorkerPool, isWorkerSupported } from 'edgeflowjs';

// Simple: run inference in background thread
if (isWorkerSupported()) {
  const outputs = await runInWorker(modelUrl, inputs);
}

// Advanced: use worker pool for parallel processing
const pool = new WorkerPool({ numWorkers: 4 });
await pool.init();

const modelId = await pool.loadModel(modelUrl);
const results = await pool.runBatch(modelId, batchInputs);

pool.terminate();

🎯 Supported Tasks

| Task | Pipeline |
| --- | --- |
| Text Classification | `text-classification` |
| Sentiment Analysis | `sentiment-analysis` |
| Feature Extraction | `feature-extraction` |
| Image Classification | `image-classification` |
| Text Generation | `text-generation` |
| Object Detection | `object-detection` |
| Speech Recognition | `automatic-speech-recognition` |
| Zero-shot Classification | `zero-shot-classification` |
| Question Answering | `question-answering` |

⚡ Key Differentiators

Comparison with transformers.js

| Feature | transformers.js | edgeFlow.js |
| --- | --- | --- |
| Task Scheduler | ❌ None | ✅ Priority queue with limits |
| Task Cancellation | ❌ None | ✅ Cancel pending tasks |
| Batch Processing | ⚠️ Manual | ✅ Built-in batching |
| Memory Scopes | ❌ None | ✅ Auto cleanup with scopes |
| Model Preloading | ❌ None | ✅ Background loading |
| Resume Download | ❌ None | ✅ Chunked with resume |
| Model Caching | ⚠️ Basic | ✅ IndexedDB with stats |
| TypeScript | ✅ Full | ✅ Full |

🔧 Configuration

Runtime Selection

import { pipeline } from 'edgeflowjs';

// Automatic (recommended)
const model = await pipeline('text-classification');

// Specify a runtime explicitly
const gpuModel = await pipeline('text-classification', {
  runtime: 'webgpu' // or 'webnn', 'wasm', 'auto'
});

Memory Management

import { pipeline, getMemoryStats, gc } from 'edgeflowjs';

const model = await pipeline('text-classification');

// Use the model
await model.run('text');

// Check memory usage
console.log(getMemoryStats());
// { allocated: 50MB, used: 45MB, peak: 52MB, tensorCount: 12 }

// Explicit cleanup
model.dispose();

// Force garbage collection
gc();

Scheduler Configuration

import { configureScheduler } from 'edgeflowjs';

configureScheduler({
  maxConcurrentTasks: 4,
  maxConcurrentPerModel: 1,
  defaultTimeout: 30000,
  enableBatching: true,
  maxBatchSize: 32,
});
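
Task cancellation is a headline scheduler feature but has no example in this README. Below is a hypothetical sketch, assuming the scheduler returned by getScheduler() exposes a handle-based schedule/cancel pair; the method names are assumptions, not confirmed API:

import { getScheduler } from 'edgeflowjs';

const scheduler = getScheduler();

// Hypothetical: assume scheduling returns a task handle...
const task = scheduler.schedule(() => model.run('long input'), { priority: 1 });

// ...which can be used to cancel the task while it is still pending (assumed method)
scheduler.cancel(task);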

Caching

import { pipeline, Cache } from 'edgeflowjs';

// Create a standalone cache (the Cache utility can be used on its own)
const cache = new Cache({
  strategy: 'lru',
  maxSize: 100 * 1024 * 1024, // 100 MB
  persistent: true, // back the cache with IndexedDB
});

// Pipelines enable built-in model caching with a flag
const model = await pipeline('text-classification', {
  cache: true
});

🛠️ Advanced Usage

Custom Model Loading

import { loadModel, runInference } from 'edgeflowjs';

// Load from URL with caching, sharding, and resume support
const model = await loadModel('https://example.com/model.bin', {
  runtime: 'webgpu',
  quantization: 'int8',
  cache: true,           // Enable IndexedDB caching (default: true)
  resumable: true,       // Enable resume download (default: true)
  chunkSize: 5 * 1024 * 1024, // 5MB chunks for large models
  onProgress: (progress) => console.log(`Loading: ${progress * 100}%`)
});

// Run inference
const outputs = await runInference(model, inputs);

// Cleanup
model.dispose();

Preloading Models

import { preloadModel, preloadModels, getPreloadStatus } from 'edgeflowjs';

// Preload a single model in background (with priority)
preloadModel('https://example.com/model1.onnx', { priority: 10 });

// Preload multiple models
preloadModels([
  { url: 'https://example.com/model1.onnx', priority: 10 },
  { url: 'https://example.com/model2.onnx', priority: 5 },
]);

// Check preload status
const status = getPreloadStatus('https://example.com/model1.onnx');
// 'pending' | 'loading' | 'complete' | 'error' | 'not_found'

Model Caching

import { 
  isModelCached, 
  getCachedModel, 
  deleteCachedModel, 
  clearModelCache,
  getModelCacheStats 
} from 'edgeflowjs';

// Check if model is cached
if (await isModelCached('https://example.com/model.onnx')) {
  console.log('Model is cached!');
}

// Get cached model data directly
const modelData = await getCachedModel('https://example.com/model.onnx');

// Delete a specific cached model
await deleteCachedModel('https://example.com/model.onnx');

// Clear all cached models
await clearModelCache();

// Get cache statistics
const stats = await getModelCacheStats();
console.log(`${stats.models} models cached, ${stats.totalSize} bytes total`);

Resume Downloads

Large model downloads automatically support resuming from where they left off:

import { loadModelData } from 'edgeflowjs';

// Download with progress and resume support
const modelData = await loadModelData('https://example.com/large-model.onnx', {
  resumable: true,
  chunkSize: 10 * 1024 * 1024, // 10MB chunks
  parallelConnections: 4,      // Download 4 chunks in parallel
  onProgress: (progress) => {
    console.log(`${progress.percent.toFixed(1)}% downloaded`);
    console.log(`Speed: ${(progress.speed / 1024 / 1024).toFixed(2)} MB/s`);
    console.log(`ETA: ${(progress.eta / 1000).toFixed(0)}s`);
    console.log(`Chunk ${progress.currentChunk}/${progress.totalChunks}`);
  }
});

Model Quantization

import { quantize } from 'edgeflowjs/tools';

const quantized = await quantize(model, {
  method: 'int8',
  calibrationData: samples,
});

console.log(`Compression: ${quantized.compressionRatio}x`);
// Compression: 3.8x

Benchmarking

import { benchmark } from 'edgeflowjs/tools';

const result = await benchmark(
  () => model.run('sample text'),
  { warmupRuns: 5, runs: 100 }
);

console.log(result);
// {
//   avgTime: 12.5,
//   minTime: 10.2,
//   maxTime: 18.3,
//   throughput: 80 // inferences/sec
// }

Memory Scope

import { withMemoryScope, tensor } from 'edgeflowjs';

const result = await withMemoryScope(async (scope) => {
  // Tensors tracked in scope
  const a = scope.track(tensor([1, 2, 3]));
  const b = scope.track(tensor([4, 5, 6]));
  
  // Process...
  const output = process(a, b);
  
  // Keep result, dispose others
  return scope.keep(output);
});
// a and b automatically disposed

🔌 Tensor Operations

import { tensor, zeros, ones, matmul, softmax, relu } from 'edgeflowjs';

// Create tensors
const a = tensor([[1, 2], [3, 4]]);
const b = zeros([2, 2]);
const c = ones([2, 2]);

// Operations
const d = matmul(a, c);
const probs = softmax(d);
const activated = relu(d);

// Cleanup
a.dispose();
b.dispose();
c.dispose();

🌐 Browser Support

| Browser | WebGPU |
| --- | --- |
| Chrome | 113+ |
| Edge | 113+ |
| Firefox | 118+ (⚠️ behind a flag) |
| Safari | 17+ (⚠️ Technology Preview) |

WASM runs in every modern browser and serves as the universal fallback; WebNN availability is still experimental, so automatic backend selection falls through to WASM where WebGPU and WebNN are unavailable.
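
To pick a backend at load time, you can feature-detect WebGPU with the standard navigator.gpu check before creating a pipeline. This is a plain web API, not edgeFlow.js-specific; a minimal sketch:

import { pipeline } from 'edgeflowjs';

// navigator.gpu exists only in WebGPU-capable browsers;
// fall back to the universally supported WASM backend otherwise
const runtime = 'gpu' in navigator ? 'webgpu' : 'wasm';
const model = await pipeline('text-classification', { runtime });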

📖 API Reference

Core

  • pipeline(task, options?) - Create a pipeline for a task
  • loadModel(url, options?) - Load a model from URL
  • runInference(model, inputs) - Run model inference
  • getScheduler() - Get the global scheduler
  • getMemoryManager() - Get the memory manager
  • runInWorker(url, inputs) - Run inference in a Web Worker
  • WorkerPool - Manage multiple workers for parallel inference

Pipelines

  • TextClassificationPipeline - Text/sentiment classification
  • SentimentAnalysisPipeline - Sentiment analysis
  • FeatureExtractionPipeline - Text embeddings
  • ImageClassificationPipeline - Image classification
  • TextGenerationPipeline - Text generation with streaming
  • ObjectDetectionPipeline - Object detection with bounding boxes
  • AutomaticSpeechRecognitionPipeline - Speech to text
  • ZeroShotClassificationPipeline - Classify without training
  • QuestionAnsweringPipeline - Extractive QA

HuggingFace Hub

  • fromHub(modelId, options?) - Load model bundle from HuggingFace
  • fromTask(task, options?) - Load recommended model for task
  • downloadTokenizer(modelId) - Download tokenizer only
  • downloadConfig(modelId) - Download config only
  • POPULAR_MODELS - Registry of popular models by task
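
When only part of a bundle is needed, the tokenizer and config helpers above can be called individually. A minimal sketch, assuming the returned objects match what fromHub exposes as bundle.tokenizer and bundle.config:

import { downloadTokenizer, downloadConfig } from 'edgeflowjs';

const modelId = 'Xenova/distilbert-base-uncased-finetuned-sst-2-english';

// Fetch only the tokenizer and config, skipping the model weights
const tokenizer = await downloadTokenizer(modelId);
const config = await downloadConfig(modelId);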

Utilities

  • Tokenizer - BPE/WordPiece tokenization with HuggingFace support
  • ImagePreprocessor - Image preprocessing with HuggingFace config support
  • AudioPreprocessor - Audio preprocessing for Whisper/wav2vec
  • Cache - LRU caching utilities
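
The features list says tokenizer.json files load directly; the sketch below shows what that could look like. The loader and encode method names are assumptions, not confirmed API:

import { Tokenizer } from 'edgeflowjs';

// Hypothetical sketch: fromJson/encode are assumed names
const url = 'https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/tokenizer.json';
const tokenizerJson = await fetch(url).then((r) => r.json());
const tokenizer = Tokenizer.fromJson(tokenizerJson); // assumed loader
const ids = tokenizer.encode('Hello world');         // assumed method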

Tools

  • quantize(model, options) - Quantize a model
  • prune(model, options) - Prune model weights
  • benchmark(fn, options) - Benchmark inference
  • analyzeModel(model) - Analyze model structure
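
prune and analyzeModel are not demonstrated elsewhere in this README; here is a sketch in the spirit of the quantize example above, with the option names and report shape assumed:

import { prune, analyzeModel } from 'edgeflowjs/tools';

// Option names and report contents are assumptions
const pruned = await prune(model, { sparsity: 0.5 });
console.log(analyzeModel(pruned));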

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

MIT © edgeFlow.js Contributors


Get Started · API Docs · Examples

Made with ❤️ for the edge AI community
