Sub-millisecond HNSW search, background ONNX embedding, and OPFS-cached knowledge cartridges. Zero server dependencies. 172 KB of WASM.
This runs entirely in your browser — WASM vector search + ONNX embeddings. No API calls, no server.
Type a query above to search HuggingFace documentation with sub-millisecond vector search — running 100% in your browser.
From zero to a searchable knowledge base in seconds.
Load the 172 KB WASM engine and spawn the embedding worker
Slot in a pre-built knowledge cartridge — cached in OPFS for instant reload
Embed your query in the background worker, then run the HNSW search in under 1 ms
Format results for LLM injection with token budgeting and source citations
The entire SDK is one import and five methods.
Mount a vROM knowledge cartridge, search with natural language, and inject context into your LLM — all client-side, all under 50 lines of code.
```js
import { AgentMemory } from 'vrom.js';

// Initialize — loads WASM + spawns worker
const memory = new AgentMemory();
await memory.init();

// Mount a knowledge base (auto-cached)
await memory.mount('hf-transformers-docs');

// Search with natural language
const results = await memory.search(
  'how to fine-tune with LoRA',
  { topK: 5, expandContext: true }
);

// Format for your LLM
const context = memory.formatContext(
  results,
  { maxTokens: 2000 }
);
// → Ready for system prompt injection
```
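And one way the formatted context can be used downstream (a sketch: the chat-message shape below is the common chat-completions convention, not part of the vrom.js API):

```js
// Hedged example of "system prompt injection": drop the budgeted,
// cited context from formatContext() into a chat-style message array.
const messages = [
  {
    role: 'system',
    content: `Answer using only the documentation below.\n\n${context}`,
  },
  { role: 'user', content: 'How do I fine-tune with LoRA?' },
];
```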
Everything you need for client-side RAG, nothing you don't.
A Rust-compiled WASM engine runs HNSW approximate-nearest-neighbor search over 1,000+ vectors in under 1 ms.
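For context, this is the exact-search baseline that HNSW approximates (an illustrative sketch, not vrom.js internals):

```js
// Illustrative baseline only: the exact top-k scan that HNSW approximates.
// HNSW swaps this O(n) pass for a layered graph walk, which is how
// 1,000+ vectors stay under 1 ms.
function exactTopK(query, vectors, k) {
  return vectors
    .map((v, id) => {
      let dot = 0;
      for (let i = 0; i < v.length; i++) dot += v[i] * query[i];
      // Dot product = cosine similarity when vectors are L2-normalized.
      return { id, score: dot };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```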
ONNX models run in a Web Worker via transformers.js. The UI never freezes, even during inference.
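A minimal sketch of that pattern, assuming transformers.js and an illustrative model name (vrom.js's actual worker may differ):

```js
// embed.worker.js — a sketch of the pattern, not vrom.js's shipped worker.
// The model name and message shape here are assumptions.
import { pipeline } from '@xenova/transformers';

// Load the model once; every later message reuses it.
const embedderPromise = pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

self.onmessage = async (e) => {
  const embedder = await embedderPromise;
  // Mean-pooled, normalized sentence embedding (Float32Array under .data).
  const output = await embedder(e.data.text, { pooling: 'mean', normalize: true });
  const vector = output.data;
  // Transfer the buffer back to the main thread without copying it.
  self.postMessage({ id: e.data.id, vector }, [vector.buffer]);
};
```

On the main thread, `new Worker(new URL('./embed.worker.js', import.meta.url), { type: 'module' })` spawns it, so inference never blocks rendering.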
Pre-computed HNSW indexes you slot in like game cartridges. One-click load, instant search, offline-first.
Indexes and models are cached in the Origin Private File System. Reload the page — everything is still there.
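The caching pattern looks roughly like this, using the standard OPFS API (file layout and names are illustrative):

```js
// A sketch of the OPFS cache pattern, not vrom.js's actual implementation.
async function loadOrFetchIndex(name) {
  const root = await navigator.storage.getDirectory();
  try {
    // Cache hit: the index survived the last page reload.
    const handle = await root.getFileHandle(`${name}.bin`);
    return await (await handle.getFile()).arrayBuffer();
  } catch {
    // Cache miss: fetch once, persist for next time, return.
    const bytes = await (await fetch(`/vroms/${name}.bin`)).arrayBuffer();
    const handle = await root.getFileHandle(`${name}.bin`, { create: true });
    const writable = await handle.createWritable();
    await writable.write(bytes);
    await writable.close();
    return bytes;
  }
}
```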
Switch between vROMs without reloading the embedding model. Same-model swaps complete in under 500ms.
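In practice a swap is just another `mount()` call (the second cartridge name below is hypothetical):

```js
// Hot-swap sketch: new index, same embedding model.
await memory.mount('hf-transformers-docs');
// ...later: no model re-download or re-init, just a new vROM.
await memory.mount('hf-datasets-docs');
```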
The full package including WASM binary, TypeScript declarations, and embed worker fits in 178 KB.
Measured in Chrome on an M1 MacBook Air.
| Metric | Value | Notes |
|---|---|---|
| HNSW Search | < 1 ms | 1,356 vectors, top-5 |
| Embedding | ~50 ms | Per sentence, q8 model |
| vROM Mount (cached) | < 500 ms | OPFS → WASM load |
| Hot-Swap | < 500 ms | Same model, different vROM |
| WASM Binary | 172 KB | Gzipped: ~80 KB |
| npm Package | 178 KB | ESM + CJS + types + worker |
Rust → WASM for speed. Web Workers for non-blocking UI. OPFS for persistence.
Install vrom.js, mount a knowledge base, and start searching — in five lines of code.