Disaggregated AI Search: Architectural Blueprint & Experimental Roadmap

Status: Finalized for Experimental Phase

Framework Name: Seek.js (@seekjs - @seekjs/core)

SaaS Platform: Vaan | Vantage | Koor (placeholder: vaan.ai, or a cheaper .ai domain)

Objective: Deliver an "AI-search-as-a-service" toolchain that eliminates the "Vector Database Tax" by shifting index generation to build time and search execution to the client's browser. This document serves as the engineering roadmap for building the four core SDK modules of the framework, outlining API contracts, data flows, and critical research paths.

Executive Summary

The Problem: The "Vector Database Tax"

In 2026, adding generative AI search (RAG) to a website is fundamentally broken for the modern frontend developer. To give users the ability to "Ask AI" about documentation or product catalogs, developers are forced to regress to legacy backend architectures. They must provision expensive managed vector databases (Pinecone, Qdrant), write fragile web-scraping ingestion scripts, and pay heavy LLM inference costs for every single user keystroke. We call this the "Vector Database Tax."

The Mission: Frontend-Native AI Search

This project is a framework designed to eliminate that tax. We are redefining AI search not as a backend database challenge, but as a static asset delivery and edge compute challenge. Our mission is to give developers enterprise-grade AI search with the same developer experience (DX) as deploying a static website: zero provisioning, zero configuration, and sub-15ms local query latency.

The Innovation: Disaggregated RAG

We achieve this by fundamentally disaggregating the RAG (Retrieval-Augmented Generation) pipeline:

Shift Indexing to Build-Time: Instead of a live database, we hook directly into the developer's framework (Next.js, Astro, Vite). We extract text using a WASM-based parser, vectorize it, and compile it into a highly-compressed binary file (.msp).
Shift Search to the Browser: That binary file is deployed to a global CDN. The user's browser downloads it, caches it in IndexedDB, and executes Hybrid Search (BM25 + Vector) entirely in local memory.
Shift Reasoning to the Edge: Server-side compute is only invoked when a user asks for an AI summary. The browser sends local context to our Edge LLMs (Cloudflare Workers AI), which stream back cited answers grounded in that context.
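
One way to picture the "Hybrid Search (BM25 + Vector)" step is a weighted blend of a lexical score and a cosine similarity. The sketch below is an illustration under that assumption, not the Seek.js or Orama ranking logic; `Candidate`, `hybridRank`, and the `alpha` weight are hypothetical names.

```typescript
// Hypothetical sketch: blend a lexical (BM25-style) score with vector similarity.
// Not the Seek.js or Orama implementation; names and weighting are assumptions.

interface Candidate {
  id: string;
  bm25: number;        // raw lexical score on any non-negative scale
  embedding: number[]; // document vector
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Rank by alpha * (BM25 / max BM25) + (1 - alpha) * cosine, highest first.
function hybridRank(
  query: number[],
  docs: Candidate[],
  alpha = 0.5
): { id: string; score: number }[] {
  const maxBm25 = Math.max(...docs.map((d) => d.bm25), 0);
  return docs
    .map((d) => ({
      id: d.id,
      score:
        alpha * (maxBm25 > 0 ? d.bm25 / maxBm25 : 0) +
        (1 - alpha) * cosine(query, d.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```

Setting `alpha` to 1 reduces this to purely lexical ranking and 0 to purely semantic ranking, which makes the blend easy to tune during experiments.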

1. The Business Model: Open-Source Foundation & SaaS Monetization

The Open-Source Framework (Seek.js): A free, modular SDK. Developers can install bundler plugins to extract content, generate indexes using local models, and serve search from their own static hosting. Zero vendor lock-in.
The SaaS Abstraction (Vaan.ai): A managed, zero-config cloud platform.
- Automated Pipeline: We intercept the build, vectorize chunks on our Edge GPUs, and host sharded .msp files on our global CDN.
- Managed Reasoning: We securely manage the Edge LLM endpoints required for the generative RAG summaries.
- Revenue: Scalable, usage-based subscription for Edge AI compute and managed infrastructure.

2. The Competitive Advantage

By disaggregating the database, we drastically alter the performance and cost metrics for the end-user.

Competitive Matrix (2026)

| Category | Competitors | Search Model | Architecture | Pricing (Avg) |
| --- | --- | --- | --- | --- |
| Static Search | Pagefind, Stork | Lexical | Local-First | $0 (OS) |
| Vector DBs | Pinecone, Upstash | Vector-only | Centralized DB | $500/mo (Prod) |
| AI Chat SaaS | Mendable, Kapa.ai | RAG Chat | Centralized API | $200+/mo |
| Search Engines | Algolia AI, Orama | Neural/Hybrid | Centralized SaaS | $100 - $1,500/mo |
| Seek.js | N/A | Hybrid | Disaggregated | $0 (OS) / $19 (SaaS) |

Why Seek.js Wins

Against Pagefind: Pagefind is "AI-blind." Seek.js brings semantic intent to the browser.
Against Mendable/Kapa.ai: These are "Black Boxes" that charge for data storage and message credits. Seek.js keeps the context in the browser: you pay $0 for storage and only pennies for Edge reasoning.
Against Pinecone: No 24/7 database instance required. Your DB is a static file on a CDN.
The MCP Advantage: Seek.js natively supports the Model Context Protocol (MCP), allowing your documentation to be instantly "read" by AI agents like Claude or ChatGPT.

3. The Core SDK Modules (The Pipeline)

Module 1: Parsing & Extraction (`@seekjs/parser`)

The Goal: Safely extract semantic text from HTML files and bind them to source URLs for citation.

Proposed API Contract

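import { extractHtml } from '@seekjs/parser';

// Walk the built site and stream extracted chunks in batches.
const chunkStream = extractHtml({
  inputDir: './dist',                      // build output to scan
  urlBase: 'https://mysite.com',           // origin used when binding chunks to URLs
  selectors: ['article', 'main .content'], // elements that hold indexable content
  ignorePaths: ['/404'],                   // routes to skip
  chunkSize: 50                            // batch size (the data flow below loops every 50 files)
});

// Top-level `for await` requires an ES module context.
for await (const batch of chunkStream) {
  // batch: [{ text: "...", url: "/docs/auth", hash: "#setup" }]
  console.log(`Extracted ${batch.length} chunks...`);
}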
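
The batching behaviour implied by `chunkSize` and the `for await` loop can be sketched with a small async generator. This is a minimal illustration, not the `@seekjs/parser` internals; `inBatches` and the `Chunk` type are hypothetical names modelled on the batch shape in the comment above.

```typescript
// Hypothetical helper: group a stream of items into fixed-size batches.
// Not part of @seekjs/parser; the Chunk shape mirrors the example batch above.

interface Chunk {
  text: string;
  url: string;
  hash: string;
}

async function* inBatches<T>(
  source: AsyncIterable<T> | Iterable<T>,
  size: number
): AsyncGenerator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  // Flush the final, possibly partial, batch.
  if (batch.length > 0) yield batch;
}
```

Yielding batches instead of single chunks keeps memory bounded while still letting the consumer report progress per batch.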

Data Flow (Build-Time)

sequenceDiagram
    participant CI as CI/CD Runner (Node/Bun/Deno)
    participant FS as File System
    participant WASM as @seekjs/parser (WASM)
    
    CI->>FS: Scan /dist directory
    FS-->>CI: List of .html files
    
    loop Every 50 files
        CI->>FS: Open ReadStreams
        FS->>WASM: Pipe raw HTML bytes
        WASM->>WASM: Parse tags & bind URL
        WASM-->>CI: Yield JSON batch
    end

Module 2: Vectorization & Compilation (`@seekjs/compiler`)

The Goal: Vectorize chunks and compile them into a binary MessagePack (.msp) database.

Proposed API Contract

import { compileIndex } from '@seekjs/compiler';
import { cloudflareEmbedder } from '@seekjs/embedders/cloudflare';

// chunkBatch: one batch of { text, url, hash } records yielded by the parser (Module 1).
const mspBuffer = await compileIndex({
  chunks: chunkBatch,
  embedder: cloudflareEmbedder({ 
    apiKey: process.env.CF_API_TOKEN,
    model: '@cf/baai/bge-small-en-v1.5'
  }),
  schema: {
    text: 'string',
    url: 'string',
    hash: 'string'
  }
});

Compilation Flow

flowchart TD
    A[Raw Chunk Batch] --> B{Schema Validation}
    B -->|Valid| C[Orama Instance]
    
    C --> D[Lexical Engine]
    D -->|BM25| F[In-Memory DB]
    
    C --> E[Embedder Function]
    E -->|Float32Array Vectors| F
    
    F --> G[MessagePack Serializer]
    G --> H[search_index.msp]
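
The "Schema Validation" gate at the top of the flow can be sketched as a plain type check against the `schema` object from the API contract. This is an assumption about how such a check might look, not the `@seekjs/compiler` or Orama validation code; `validateRecord` and `Schema` are hypothetical names.

```typescript
// Hypothetical sketch of the "Schema Validation" step.
// Not the @seekjs/compiler API; it only checks primitive field types.

type FieldType = 'string' | 'number' | 'boolean';
type Schema = Record<string, FieldType>;

// Returns human-readable problems; an empty array means the record is valid.
function validateRecord(record: Record<string, unknown>, schema: Schema): string[] {
  const problems: string[] = [];
  for (const [field, expected] of Object.entries(schema)) {
    const actual = typeof record[field];
    if (actual !== expected) {
      problems.push(`${field}: expected ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// Same shape as the schema passed to compileIndex above.
const chunkSchema: Schema = { text: 'string', url: 'string', hash: 'string' };
```

Rejecting malformed chunks before embedding avoids spending embedder calls on records that would fail at insert time.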

Module 3: Website Hydration & Search (`@seekjs/client`)

The Goal: Deliver the index to the browser and execute <15ms hybrid queries locally.

Proposed API Contract (React)

import { useAiSearch } from '@seekjs/react';

function SearchWidget() {
  const { search, results, status } = useAiSearch({
    indexUrl: '/search_index.msp',
    storageStrategy: 'indexedDB'
  });

  return (
    <input 
      onMouseEnter={() => search.preload()} 
      onChange={(e) => search.execute(e.target.value)} 
    />
  );
}

Background Sync & Cache Flow

sequenceDiagram
    participant UI as Search Widget
    participant IDB as IndexedDB (Local)
    participant CDN as Edge CDN (R2)
    
    UI->>UI: User hovers input (Intent)
    UI->>IDB: Check cached .msp & ETag
    UI->>CDN: HEAD /search_index.msp
    CDN-->>UI: Returns remote ETag
    
    alt Local ETag == Remote ETag
        UI->>IDB: Load from cache
    else Local ETag != Remote ETag
        UI->>CDN: GET /search_index.msp (Brotli)
        CDN-->>UI: 600KB Payload
        UI->>IDB: Overwrite Cache & ETag
    end
    
    UI->>UI: Hydrate Orama index
    UI-->>UI: Ready for <15ms searches
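
The ETag comparison in this flow can be sketched with the cache and the CDN hidden behind two small interfaces, so the decision logic stays testable outside a browser. Everything here is an assumption for illustration: `IndexCache`, `IndexRemote`, and `loadIndex` are hypothetical names, and a real client would back them with IndexedDB and `fetch`.

```typescript
// Hypothetical sketch of the cache-or-download decision.
// Interfaces are assumptions; real implementations would wrap IndexedDB and fetch.

interface CachedIndex {
  etag: string;
  bytes: Uint8Array;
}

interface IndexCache {
  read(): Promise<CachedIndex | undefined>;
  write(entry: CachedIndex): Promise<void>;
}

interface IndexRemote {
  headEtag(): Promise<string | null>; // HEAD request; null if no ETag is returned
  download(): Promise<CachedIndex>;   // GET request for the full .msp payload
}

async function loadIndex(
  cache: IndexCache,
  remote: IndexRemote
): Promise<{ bytes: Uint8Array; source: 'cache' | 'network' }> {
  const local = await cache.read();
  const remoteEtag = await remote.headEtag();

  // Reuse the cached copy only when both ETags exist and match.
  if (local && remoteEtag !== null && local.etag === remoteEtag) {
    return { bytes: local.bytes, source: 'cache' };
  }

  // Otherwise download, then overwrite the cached payload and ETag together.
  const fresh = await remote.download();
  await cache.write(fresh);
  return { bytes: fresh.bytes, source: 'network' };
}
```

Treating a missing remote ETag as a cache miss is a conservative choice: it costs an extra download but avoids serving a stale index.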

Module 4: The AI Generative Flow (`@seekjs/ai-edge`)

The Goal: Stream synthesized answers with clickable citations from the Edge.

Proposed API Contract (Cloudflare Worker)

import { streamAiResponse } from '@seekjs/ai-edge';

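export default {
  // Cloudflare Workers (module syntax) route requests through a default-exported fetch handler.
  async fetch(req) {
    if (req.method !== 'POST') {
      return new Response('Method Not Allowed', { status: 405 });
    }
    const { query, chunks } = await req.json();

    const stream = await streamAiResponse({
      query,
      context: chunks,
      provider: 'cloudflare',
      model: '@cf/meta/llama-3-8b-instruct'
    });

    return new Response(stream, { headers: { 'Content-Type': 'text/event-stream' } });
  }
};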

Generative RAG Loop

sequenceDiagram
    participant Browser as User Browser
    participant API as Edge API (Cloudflare)
    participant LLM as Workers AI
    
    Browser->>Browser: Local Orama Search
    Browser->>API: POST Query + Top 5 Chunks
    API->>API: Assemble Context + Prompt
    API->>LLM: Dispatch Inference Request
    LLM-->>API: Stream tokens (SSE)
    API-->>Browser: Relay SSE Stream
    Browser->>Browser: Render Markdown Citations
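
The "Assemble Context + Prompt" step can be sketched as a pure function that numbers each chunk so the model can cite it. This is a minimal illustration, not the `@seekjs/ai-edge` prompt; `buildPrompt`, `ContextChunk`, and the instruction wording are assumptions.

```typescript
// Hypothetical sketch of "Assemble Context + Prompt".
// Not the @seekjs/ai-edge implementation; the wording of the instructions is an assumption.

interface ContextChunk {
  text: string;
  url: string;
  hash: string;
}

function buildPrompt(query: string, chunks: ContextChunk[], maxChunks = 5): string {
  // Number each source so the answer can reference it as [n].
  const sources = chunks
    .slice(0, maxChunks)
    .map((c, i) => `[${i + 1}] ${c.url}${c.hash}\n${c.text}`)
    .join('\n\n');

  return [
    'Answer using only the numbered sources below.',
    'Cite sources as [n]. If the sources do not contain the answer, say so.',
    '',
    sources,
    '',
    `Question: ${query}`,
  ].join('\n');
}
```

Numbered sources let the browser map each `[n]` in the streamed answer back to a known URL instead of trusting links the model writes out itself.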

4. Technical Complexities & Mitigation Strategies

The "Index Bloat" Problem: An index for 5,000 pages can exceed 15MB. We plan to use int8 quantization and Brotli compression, with a target of under 1.5MB.
Abuse Prevention: Use Cloudflare Turnstile and aggressive Edge semantic caching to prevent LLM endpoint spam.
Post-Build Stability: We act as a Vite/Rollup plugin to hook into the build lifecycle before obfuscation.
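
To make the int8 quantization idea concrete, the sketch below applies symmetric scalar quantization to one embedding: each float32 component (4 bytes) becomes one signed byte plus a shared per-vector scale. For a 384-dimensional vector, the output size of bge-small-en-v1.5, that is 1,536 bytes reduced to 384 bytes plus the scale. The scheme and the names `quantize` and `dequantize` are assumptions for illustration, not the format Seek.js has committed to.

```typescript
// Hypothetical sketch of symmetric int8 scalar quantization for one embedding.
// The per-vector scale scheme is an assumption, not the Seek.js .msp format.

interface QuantizedVector {
  scale: number;    // multiply stored values by this to approximate the originals
  values: Int8Array;
}

function quantize(vector: Float32Array): QuantizedVector {
  let maxAbs = 0;
  for (let i = 0; i < vector.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(vector[i]));
  }
  // Map the largest magnitude to 127; an all-zero vector keeps scale 1.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const values = new Int8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    values[i] = Math.round(vector[i] / scale);
  }
  return { scale, values };
}

function dequantize(q: QuantizedVector): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}
```

Quantization alone gives roughly a 4x reduction on the vector payload; how much Brotli adds on top depends on the data and is one of the things the experiments would need to measure.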

5. Experimental Roadmap

Experiment 1: Vector Sharding: Test sharding a 50MB .msp file into 1MB fragments for incremental hydration on mobile devices.
Experiment 2: LLM Citation Drift: Measure Llama 3 8B hallucination rates on links. If >5%, fall back to manual JSON mapping of citations.
Experiment 3: Cache Versioning: Ensure schema updates gracefully wipe old IndexedDB versions.
Experiment 4: Runtime Limits: Stress-test the WASM parser against 10,000 files in Bun/Node to find the EMFILE break point.
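
For Experiment 2, one possible way to measure citation drift is to extract every Markdown link from a generated answer and count how many point outside the URLs that were supplied as context. The function below is a sketch under that assumption; `citationDriftRate` is a hypothetical name, and the regular expression only handles simple inline links of the form `[label](url)`.

```typescript
// Hypothetical sketch for Experiment 2: fraction of cited links not present in the context.
// Only simple inline Markdown links are recognised.

function citationDriftRate(answerMarkdown: string, allowedUrls: string[]): number {
  const allowed = new Set(allowedUrls);
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;

  let total = 0;
  let drifted = 0;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(answerMarkdown)) !== null) {
    total++;
    if (!allowed.has(match[1])) drifted++;
  }
  // An answer with no links has nothing to drift.
  return total === 0 ? 0 : drifted / total;
}
```

Averaging this rate over a fixed set of test questions would give a number to compare against the 5% threshold.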

6. Pricing & Infrastructure Guesstimate (Cloudflare Stack)

| Component | Cost (Seek.js / Vaan) | Cost (Pinecone + OpenAI) |
| --- | --- | --- |
| Storage (R2) | $0.00 (within 10GB free tier) | $160.00 |
| Search (Local) | $0.00 | $150.00 |
| AI (10k Summaries) | $2.50 (Workers AI) | $5.00 |
| Total Monthly | ~$2.54 | ~$315.00+ |

Monetization Strategy: Offer a $19/mo Pro Tier. At a COGS of ~$2.54, we keep a gross margin of about 87% while saving the customer roughly $300/mo in "Vector Tax" against the ~$315 comparison stack above.
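
The margin figure follows from the usual gross-margin formula, (price - COGS) / price, applied to the $19 price and ~$2.54 COGS quoted above. The helper name `grossMargin` is hypothetical.

```typescript
// Gross margin as a fraction of price: (price - cogs) / price.
function grossMargin(price: number, cogs: number): number {
  if (price <= 0) throw new RangeError('price must be positive');
  return (price - cogs) / price;
}

// Pro Tier figures from the estimate above.
const proTierMargin = grossMargin(19, 2.54);
console.log(`${(proTierMargin * 100).toFixed(1)}%`); // prints "86.6%"
```

Because COGS here is dominated by per-summary inference, the margin moves with how many AI summaries each customer actually requests.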

7. Final Strategy: The "Indie" Advantage

Because our architecture is disaggregated, we have zero server idle costs. We scale exactly with the user's traffic via Cloudflare's serverless edge.

Our Motto: "Search that pays for itself."

AchuAshwath/AIChatWidget-research.md

aanushh commented Apr 10, 2026

Challenges in build time

1. Sending build files to SaaS

2. i18n

Optimizing performance

web-worker.js

client.js
