AI-NATIVE SEARCH ai-infra Scan 2026-05-20 to 2026-05-20 Run 20260521000114

Control plane that routes, caches, and governs web search for production AI agents before retrieval spend and failures explode.

Once AI products move from pilot to production, web retrieval becomes a hidden failure domain: query volume spikes, search bills become unpredictable, and one stale or low-quality source can poison downstream agent actions. Platform teams can buy a search API, but they still lack a control layer that decides when to use fast head search, deep long-tail search, cached evidence, or a browser fallback.

By Bizidea Research, May 21, 2026 (UTC)

Overall rating 3.9 / 5.0

Market · 4/5
$0.9B TAM, $180.0M SAM, 43.57% CAGR, and four mapped vendors point to a large, fast-growing market that is already competitive.
Differentiation · 4/5
Provider-agnostic routing, policy controls, and replay logs address gaps across Exa, Tavily, Google, and Browserbase, though the wedge is still copyable.
Execution · 3/5
Plan is clear and unit economics are healthy at 72% gross margin, 6.9x LTV/CAC, and 11.6-month payback, but three model flags keep risk real.
Timeliness · 5/5
Four fresh signals in a one-day scan window (major funding, strong adoption, and 1,000x query growth) make retrieval governance feel urgent now.

Why now

Agent traffic is about to overwhelm ad hoc retrieval logic because search volume is expected to grow by orders of magnitude as agents replace humans in research workflows.
Legacy search wrappers are no longer enough, creating room for a policy and routing layer that decides how agent queries should be executed.
Adoption has already crossed from experimentation into deployment, so buyers now need reliability and governance rather than a demo-quality API.
The size of Exa's round and valuation shows retrieval is being funded like core infrastructure, which gives startups permission to buy tooling around it.

Catalyst. Exa's rise to default search infrastructure, its 400,000+ developer adoption, and the explicit expectation that agents will search 1,000x more than humans have shifted the problem from "can agents search?" to "how do we govern search at production scale?"

The idea

The product sits between agent frameworks and retrieval providers as an API proxy plus dashboard. It predicts query intent and business value, then chooses the cheapest path that still meets freshness and confidence targets: cached evidence packet, fast head search, deep long-tail search, or browser execution. Every result is normalized into a provenance graph with source metadata, recency checks, and replay logs so teams can debug bad answers and satisfy enterprise buyers asking for citations. Customers also get policy controls for approved domains, geo restrictions, budget caps, and workflow SLAs, plus observability on cost per agent action and failure mode.
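
The routing decision described above can be sketched as a cheapest-eligible-path selection. This is a minimal illustration, not the product's actual logic: the `RetrievalPath` type, the `choose_path` function, and every cost, age, and confidence number are hypothetical placeholders, and a real router would presumably estimate confidence per query rather than per path.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RetrievalPath:
    name: str
    cost_usd: float       # assumed cost per query (placeholder value)
    evidence_age_s: int   # worst-case age of returned evidence; 0 means live
    confidence: float     # assumed prior confidence in [0, 1]


# Placeholder numbers for illustration; not real provider prices or benchmarks.
PATHS = (
    RetrievalPath("cached_evidence", 0.0001, 86_400, 0.70),
    RetrievalPath("fast_head_search", 0.007, 0, 0.80),
    RetrievalPath("deep_long_tail_search", 0.012, 0, 0.90),
    RetrievalPath("browser_execution", 0.050, 0, 0.95),
)


def choose_path(max_age_s: int, min_confidence: float,
                paths: Sequence[RetrievalPath] = PATHS) -> Optional[RetrievalPath]:
    """Return the cheapest path that meets both targets, or None if none does."""
    eligible = [
        p for p in paths
        if p.evidence_age_s <= max_age_s and p.confidence >= min_confidence
    ]
    return min(eligible, key=lambda p: p.cost_usd, default=None)
```

Under these placeholder values, a query that tolerates day-old evidence and modest confidence resolves to the cache, while a stricter freshness or confidence target pushes it to a more expensive path. Returning `None` leaves the "do not search" or escalation decision to the caller.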

What's different. Exa proves agent-native search demand exists, but this company wins by being provider-agnostic and policy-first rather than another monolithic index. Its defensibility comes from the query-routing data, source-policy configs, and replay logs it accumulates across customer workflows — especially the decision of when not to search and when to reuse prior evidence. That makes it sticky with buyers who care about margin, provenance, and enterprise controls more than raw search endpoints.

Startup thesis
Beachhead	Series B-D AI-native SaaS vendors with 20-150 engineers whose production agents perform 500,000+ external web searches per day for sales-intel, market-research, or compliance workflows
Wedge	A retrieval control plane that classifies each query, routes it across fast search, deep search, cache, or browser fallback, and returns replayable citations with cost and freshness policies enforced.
Non-obvious insight	The next valuable layer is not another search engine; it is the control plane above search engines that decides when an agent should search, which retrieval depth to buy, which sources are admissible, and when cached evidence is good enough. As agent traffic explodes, those policy and routing decisions matter more to customers than raw index ownership alone.
Venture-scale path	Starting as routing and policy middleware for external-web retrieval, the company can expand into retrieval evals, source licensing, benchmark data, enterprise governance, and multi-provider knowledge access across every agent workflow that touches external information.

Target user
Primary user	ML infrastructure and agent-platform engineers at Series B-D AI-native SaaS companies shipping research, sales-intel, or compliance agents that depend on live web retrieval
Secondary user	Product leaders responsible for enterprise-grade answer quality and gross margin in AI copilots that cite external web sources
Economic buyer	VP Engineering or Head of AI Platform

Go-to-market seed
First customer	Head of platform engineering at a 50-200 person AI-native SaaS company that already uses Exa or SerpAPI inside a research or sales-intel product and is spending more than $25,000 per month on external search
Buying trigger	A product GA launch or major enterprise rollout causes search spend to jump or surfaces a customer escalation about stale, uncited, or inconsistent answers
Current alternative	Direct calls to Exa or SerpAPI plus ad hoc caching, prompt heuristics, and internal dashboards
Switching reason	The wedge reduces retrieval spend and answer failures without forcing a model swap or full agent rewrite, while adding auditability enterprise customers already ask for.
Pricing hypothesis	Platform subscription plus usage-based pricing tied to managed query volume, routed spend, and retained evidence traces

Jobs to be done

Job	Current alternative	Success metric
When our agent product starts generating thousands of external queries per minute, help our platform team control retrieval cost and reliability so they can scale usage without margin surprises or bad citations.	Hand-built routing logic and dashboarding around one search API	30%+ reduction in search cost per successful agent task
When an enterprise customer questions an answer sourced from the web, help our team replay and audit the retrieval chain so they can defend the output and fix failures quickly.	Manual log inspection across prompts, API calls, and app telemetry	Under 10 minutes to trace a bad answer to query, source, and routing decision

Agent search control loop

flowchart LR
  Buyer[AI platform team] --> Pain[Uncontrolled search cost and weak provenance]
  Pain --> Product[Retrieval control plane]
  Product --> Outcome[Lower cost, cited answers, governed agent search]

Idea scorecard: average 4.6 / 5 · 5 axes

Signal · 5/5: The cluster combines major funding, rapid valuation expansion, named customer adoption, and explicit claims that agents will search 1,000x more than humans.
Pain · 4/5: The pain is not a one-time crisis, but once an agent product scales, cost, latency, and provenance failures directly hit gross margin and enterprise trust.
Wedge · 5/5: A retrieval routing and governance layer for search-heavy agent companies is a crisp first product with obvious insertion point and measurable ROI.
Defense · 4/5: Workflow-specific routing data, provenance graphs, and embedded source policies create stickiness, though providers could try to move upward over time.
Scale · 5/5: If every production agent needs governed external knowledge access, this can grow from middleware into the standard control plane for retrieval across the agent economy.

Business model canvas

Key partners

Search API providers
Browser automation and crawling vendors
LLM observability platforms

Key activities

Routing and caching optimization
Provenance graph generation
Retrieval evaluation and policy tooling

Key resources

Query routing models
Retrieval trace corpus
Integrations with search and browser providers

Value propositions

Reduce search spend without degrading answer quality
Enforce provenance, freshness, and source-policy controls for agents
Give platform teams replayable retrieval traces and SLA observability

Customer relationships

High-touch implementation for first design partners
Usage reviews tied to spend savings and answer-quality metrics
Shared eval dashboards with platform teams

Channels

Founder-led sales into AI product and platform teams
Partnerships with agent-framework and observability vendors
Technical content targeted at retrieval-heavy AI builders

Customer segments

AI-native SaaS companies running production agents on live web data
Enterprise internal AI platform teams standardizing retrieval across apps

Cost structure

Cloud inference and storage
Provider usage costs
Engineering for routing, evals, and integrations

Revenue streams

Annual platform subscription
Usage-based fee on managed search requests
Enterprise add-ons for retention, compliance, and private deployment

Market

Market sizing

Market sizing overview
TAM	$0.9B	Bottom-up estimate: assume ~15,000 global retrieval-heavy teams ultimately need multi-provider search governance (modeled from Exa reporting 5,000+ companies already using its stack, multiplied by ~3x to account for broader multi-vendor adoption), at roughly $60k annual control-plane spend per team. Cross-check: roughly 15% of the 2026 enterprise-search market.
SAM	$180.0M	Apply the beachhead constraint to ~3,000 Series B-D AI-native SaaS teams with retrieval-heavy production agents (about 20% of the modeled TAM universe) at the same ~$60k annual control-plane spend.
SOM	$5.4M	Reachable year-3 share modeled as ~60 customers at roughly $90k blended annual contract value once pilots prove spend savings and citation reliability.
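
The sizing arithmetic above can be reproduced in a few lines. The sketch below only restates the report's own inputs (team counts, the ~3x multiplier, $60k spend, 60 customers at $90k); the function name `market_sizing` and the dictionary keys are illustrative.

```python
def market_sizing() -> dict:
    """Recompute TAM, SAM, and SOM from the report's stated assumptions."""
    spend_per_team = 60_000      # ~$60k annual control-plane spend per team
    tam_teams = 5_000 * 3        # Exa's 5,000+ companies times the ~3x multiplier
    sam_teams = 3_000            # Series B-D AI-native SaaS beachhead teams
    som_customers = 60           # modeled year-3 customer count
    blended_acv = 90_000         # ~$90k blended annual contract value
    return {
        "tam_usd": tam_teams * spend_per_team,
        "sam_usd": sam_teams * spend_per_team,
        "som_usd": som_customers * blended_acv,
        "sam_share_of_tam_teams": sam_teams / tam_teams,
    }


if __name__ == "__main__":
    for key, value in market_sizing().items():
        print(key, value)
```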

Executive takeaways

Agent-native retrieval has clearly become infrastructure: Exa says agents will search 1,000x more than humans and Tavily has raised capital around the same thesis, which validates a real workload shift rather than a speculative feature add-on. [1][3]
The wedge is not another index alone; it is orchestration across search depth, extraction, browser fallback, caching, and traceability. Public docs from Exa, Tavily, Cloudflare, and browser-runtime vendors show buyers already assemble these primitives by hand. [48][63][24][22][23]
Budget exists because providers already charge on a visible per-query or per-run basis: Exa, Tavily, Google Agent Search, Browserbase, and Browserless all publish usage-linked pricing that turns search reliability into an operating-cost line item. [45][61][37][22][23]
The strategic gap is provider-agnostic governance. Search APIs optimize their own surfaces, clouds optimize their own platforms, and browser runtimes optimize execution; none of them naturally optimize when not to search, when to reuse evidence, and how to keep multi-provider policy consistent. [47][39][28][77]
Adoption friction will center on enterprise controls, not raw API access. NIST, the EU AI Act, Anthropic computer-use guidance, and browser-runtime security docs all point to logging, oversight, retention, and prompt-injection controls as table stakes. [11][12][20][78]
The category should grow with agent adoption faster than classic enterprise search alone: Precedence pegs AI agents at 43.57% CAGR through 2035 versus 9.05% for enterprise search, implying outsized growth for the subset of spend tied to live external retrieval. [10][9]

Market definition

Provider-agnostic control software for production AI-agent retrieval. The category sits above search APIs, grounding services, caches, and browser runtimes to decide whether a query should hit fast search, deep search, extraction, cached evidence, or browser fallback while preserving citations, freshness policies, and cost telemetry.

Customer and buyer

Primary users are ML infrastructure and agent-platform engineers running retrieval-heavy research, sales-intel, or compliance agents. The economic buyer is typically a VP Engineering, Head of AI Platform, or platform GM who owns margin, answer quality, and enterprise trust for agent products.

Buying triggers

A product GA launch or enterprise rollout causes external-search spend to jump and exposes that direct provider calls lack cost guardrails. [45][61][37]
Bad or stale citations trigger customer escalations, making replayability and source-policy controls more urgent than raw retrieval breadth. [15][16][51]
Teams add browser execution or multi-provider fallbacks to recover failed searches, which raises operational complexity and creates a clear insertion point for a routing control plane. [28][22][23]

Willingness to pay

Public pricing already trains buyers to pay for retrieval infrastructure as metered production spend: Exa charges $7 per 1,000 search requests, Tavily charges per credit, Google Agent Search charges per 1,000 queries, and browser runtimes bill for search, fetch, browser hours, or units. A control plane can therefore anchor ROI around reduced routed spend and fewer failed tasks rather than inventing a new budget category. [45][61][37][22][23]
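
To make the ROI anchor concrete, the sketch below estimates monthly search spend from a daily query volume and a per-1,000 price. The 500,000 queries per day and the $7 per 1,000 figure come from this report; the 30-day month, the 30% cache-hit rate, and the function name `monthly_search_spend` are assumptions for illustration, and the model ignores the cost of serving cached evidence.

```python
def monthly_search_spend(queries_per_day: int, price_per_1k_usd: float,
                         cache_hit_rate: float = 0.0, days: int = 30) -> float:
    """Estimate monthly spend when a share of queries is served from cache."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    # Only queries that miss the cache are billed by the provider.
    billable = round(queries_per_day * days * (1.0 - cache_hit_rate))
    return billable * price_per_1k_usd / 1000


if __name__ == "__main__":
    baseline = monthly_search_spend(500_000, 7.0)
    with_cache = monthly_search_spend(500_000, 7.0, cache_hit_rate=0.30)  # hypothetical rate
    print(f"baseline: ${baseline:,.0f}  with cache: ${with_cache:,.0f}")
```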

Category dynamics

Growth signal 43.57% CAGR

Tailwinds

The AI-agent market is scaling much faster than classic enterprise search, creating a fast-growing demand layer for live external retrieval.
Providers now expose search, extraction, research, and browser primitives as composable APIs, which lowers integration friction for a control-plane layer.
Funding into Exa and Tavily signals that investors and customers view agent retrieval as strategic infrastructure.

Headwinds

Provider bundling could compress the value of an independent orchestration layer if search vendors or clouds add enough governance features themselves.
Browser fallback remains operationally messy and can create prompt-injection, retention, and security review burdens that slow sales cycles.

Validation signals

Exa’s funding and adoption narrative shows that large AI builders already treat web search as core infrastructure.
Tavily’s funding and LangChain integration indicate that agent developers actively choose specialized search APIs rather than defaulting to generic web search.
Public per-query pricing from Exa, Google, and browser runtimes makes retrieval cost concrete enough for a cost-optimization wedge.
Independent comparisons increasingly treat provider choice, latency, and response structure as first-order engineering decisions for agent teams.

Regulatory & technical constraints

The EU AI Act raises expectations for documentation, logging, traceability, and human oversight in higher-risk deployments.
NIST frames trustworthy AI deployment around structured risk-management practices, which strengthens the case for auditable retrieval policies.
Anthropic warns that browser-capable agents face prompt injection and recommends domain restrictions, minimal privileges, and user confirmation for high-impact actions.
Google documents a one-million-queries-per-day limit for grounding with Google Search, underscoring provider limits and the need for routing/fallback controls.
Enterprise buyers will ask for retention and security controls over logs and recordings when browser fallback is part of the workflow.
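
A domain restriction of the kind mentioned above could be enforced with a simple allowlist check before any source is admitted as evidence. This is a minimal sketch under stated assumptions: `is_admissible` and the example domains are hypothetical, and a production policy would likely also handle internationalized hostnames, redirects, and public-suffix rules.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_admissible(url: str, approved_domains: Iterable[str]) -> bool:
    """True if the URL's host is an approved domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False
    for domain in approved_domains:
        domain = domain.lower()
        # Match on a label boundary so "notsec.gov" does not pass for "sec.gov".
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Matching on the parsed hostname rather than on the raw URL string avoids admitting look-alike hosts such as `sec.gov.evil.example`.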

Agent retrieval stack market map

Competition

Competition comes from four directions: agent-native search APIs such as Exa and Tavily; cloud grounding and enterprise-search stacks led by Google; browser runtimes such as Browserbase and Browserless that own execution fallback; and in-house glue code built on framework tools. The startup only wins if it becomes the cross-provider policy and spend layer rather than another point solution within one provider stack.

Competitor	Stage	Wedge	Pricing	Strength	Weakness vs. us
Exa	scale-up	Agent-native search engine with configurable latency, deep search, contents extraction, and async agent workflows.	$7/1k search requests; $12 deep search; $15 deep-reasoning; $0.025-$2.00 per agent run.	Strong semantic retrieval, broad search modes, and public evidence that leading AI companies already use it.	Still optimizes the Exa stack itself rather than neutral routing, spend policy, or evidence reuse across several providers.
Tavily	scale-up	LLM-oriented search, extract, crawl, map, and research endpoints tuned for agent workflows with simple integration.	1,000 free API credits per month; pay-as-you-go at $0.008 per credit; enterprise custom.	Developer-friendly search surface with broad framework adoption and a clear product around agent retrieval.	Like Exa, Tavily is still a provider surface, not a control plane that arbitrates among providers and fallback modes.
Google Agent Search	incumbent	Enterprise search and grounded answers inside the Google Cloud stack, with grounding from Google Search and configurable search apps.	$1.50/1,000 standard queries; $4.00/1,000 enterprise queries; +$4.00/1,000 advanced generative-answer queries.	Distribution, enterprise packaging, and direct access to Google search grounding make it credible for large-company buyers.	Best aligned with Google-native deployments and internal-search use cases, not neutral routing across several external retrieval providers.
Browserbase	scale-up	Browser agent infrastructure with search, fetch, browser hours, captcha solving, and enterprise security controls for browser fallback.	Developer and startup plans meter browser hours plus $7/1k search calls and $0.5-$1/1k fetch calls, with enterprise custom.	Owns the authenticated browser-execution layer that search providers often hand off to when plain search fails.	It is a runtime for execution and capture, not a retrieval-routing layer that decides when to search, cache, or browse.

Why incumbents do not win by default

Search APIs. Exa and Tavily already optimize search quality, extraction, and research depth, but as single-provider products they do not prioritize multi-provider routing, evidence reuse, or neutral spend governance by default.
Cloud platforms. Google bundles grounding, enterprise search, and governance features, yet its economic and technical center of gravity is still the Google stack rather than a neutral control plane spanning several retrieval vendors and browser fallbacks.
Browser runtimes. Browserbase and Browserless are compelling for authenticated fallback and extraction, but they monetize execution primitives rather than query classification, evidence admissibility, or provider-agnostic search governance.
Gateway and observability vendors. Cloudflare proves that teams want analytics, fallbacks, custom cost accounting, and key management, but its scope is model-gateway infrastructure rather than a retrieval-specific orchestration layer across search, cache, and browser actions.
In-house. Tool-use frameworks and vendor SDKs make custom orchestration possible, but they push routing policy, failure handling, and auditability work back onto the platform team.

Business plan

This company should start as a provider-agnostic retrieval control plane for AI-native SaaS teams that already run production agents on live web data and feel search cost, answer-quality, and provenance pain at the same time. The first customer is a Series B-D AI-native SaaS company with 20-150 engineers, at least one retrieval-heavy research, sales-intel, or compliance workflow, and more than $25,000 per month of external search spend spread across Exa, Tavily, Google, or browser fallback. The initial product should sit as an API proxy and policy layer above those providers, deciding when to use fast search, deep search, cached evidence, or browser fallback while returning replayable citations and cost telemetry. The wedge is attractive because buyers already see retrieval as metered infrastructure spend, so a pilot can be sold around measurable savings per successful agent task and faster debugging of stale or uncited answers. Research supports a meaningful market with an estimated $0.9B TAM, $180.0M beachhead SAM, and $5.4M reachable year-three SOM if the company can land roughly 60 customers at about $90k blended ACV. The deliberate tradeoff is not to build another search engine, broad enterprise search suite, or autonomous browser stack; the company should first win one control point where multi-provider routing and governance are more valuable than raw index ownership. The biggest disconfirming risks are that providers bundle enough native governance to collapse the wedge, or that too few teams have enough routed-query spend to justify a standalone line item. The first 12 months should therefore focus on 3-5 paid design partners, proof of at least 25% retrieval-spend reduction on covered workflows, and evidence that browser fallback remains the exception rather than the margin-destroying default.

Problem

Retrieval-heavy agent products accumulate unpredictable search spend, stale or weak citations, and inconsistent failure handling once usage moves from pilot to production.
Platform teams can buy search APIs and browser runtimes, but they still lack a neutral control layer for routing, caching, provenance, budget caps, and source-policy enforcement across providers.

Solution

Provide an API proxy and dashboard that classifies each query and routes it to cached evidence, fast search, deep search, or constrained browser fallback based on freshness, confidence, and cost policy.
Normalize every retrieval step into replayable evidence traces with source metadata, policy logs, and per-task cost analytics so teams can debug bad answers and satisfy enterprise citation requirements.
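
One way to picture a replayable evidence trace is as a serializable record that captures the query, the route taken, why it was taken, what it cost, and which sources came back. The sketch below is an assumption about shape only: `EvidenceTrace`, `SourceRecord`, and all field names are hypothetical and not a documented schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SourceRecord:
    url: str
    retrieved_at: str                   # ISO 8601 timestamp of retrieval
    published_at: Optional[str] = None  # source date, if known, for recency checks


@dataclass
class EvidenceTrace:
    task_id: str
    query: str
    route: str          # e.g. "cached_evidence" or "fast_head_search"
    route_reason: str   # policy log entry explaining the routing decision
    cost_usd: float
    sources: List[SourceRecord] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the trace so it can be stored and replayed later."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "EvidenceTrace":
        """Rebuild a trace, including nested source records, from stored JSON."""
        data = json.loads(raw)
        data["sources"] = [SourceRecord(**s) for s in data["sources"]]
        return cls(**data)
```

A round trip through `to_json` and `from_json` should reproduce an equal object, which is the property an audit or replay tool would depend on.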

Why we win

The product is narrower than building another search engine and more neutral than any single provider stack, which makes the ROI case legible for buyers already juggling multiple retrieval vendors.
Defensibility compounds from routing-performance data, evidence-reuse patterns, and embedded source-policy configurations that improve with each covered workflow.
The first sale can be tied to a single painful workflow and a visible provider bill, which is faster to prove than a broader enterprise-search replacement motion.

Strategic choices
Beachhead	Series B-D AI-native SaaS companies with 20-150 engineers, one or more production research, sales-intel, or compliance agents, and at least 500,000 external web searches per day across Exa, Tavily, Google, or similar providers.
Wedge rationale	The narrow entry point is a retrieval-governance pilot for one live workflow where budget spikes and citation failures are already visible to the platform team; this creates faster proof than selling horizontal search governance to every enterprise app team at once.
Sequencing	Start with a thin proxy, provider analytics, replayable citations, and policy controls for the existing search stack so deployment is low-friction and savings are measurable. Add deeper caching, evals, browser fallback orchestration, and partner-led distribution only after pilots show that customers trust the routing layer and will pay for it as infrastructure rather than custom services.
Not yet	Building a proprietary web index or crawler network · Selling to low-volume teams that have not yet reached budget pain · Broad internal enterprise search across HR, support, and knowledge management · Full autonomous browser execution for sensitive workflows without approval gates

Go-to-market
Wedge	Sell a paid retrieval-governance pilot for one production workflow, then convert the pilot into an always-on control plane once the customer trusts the routing rules and sees lower cost per successful agent task.
Channels	Founder-led outbound into teams already using Exa, Tavily, Google Agent Search, or browser-based fallbacks · Co-selling and integration partnerships with search providers, browser-runtime vendors, and AI observability platforms · Technical content and benchmark reports aimed at platform engineers building retrieval-heavy agents
Funnel targets	lead→qualified pilot 15-25%, qualified pilot→paid pilot 40-50%, paid pilot→production 50%+, production→second workflow expansion 30%+ in year one
Pricing	Charge an annual platform subscription plus usage-based fees tied to managed query volume and retained evidence traces, because buyers already understand retrieval as metered infrastructure spend. Price the first paid pilot as a scoped savings and governance engagement, then convert to roughly $60k-$120k annual ACV for the first production workflow if the control plane reduces spend and improves citation replay speed.
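
The funnel targets above imply a top-of-funnel volume for any production goal. The sketch below multiplies the three stage rates and rounds up; the rates are the report's own ranges, the goal of 2 production conversions mirrors the 0–12 month milestone, and the function name `leads_needed` is illustrative.

```python
import math


def leads_needed(target_production: int, lead_to_qualified: float,
                 qualified_to_paid: float, paid_to_production: float) -> int:
    """Leads required to reach a production-customer target through three stages."""
    overall = lead_to_qualified * qualified_to_paid * paid_to_production
    if overall <= 0:
        raise ValueError("stage conversion rates must be positive")
    return math.ceil(target_production / overall)


if __name__ == "__main__":
    # Low end of each funnel range, then high end (paid-to-production floor of 50%).
    print(leads_needed(2, 0.15, 0.40, 0.50))
    print(leads_needed(2, 0.25, 0.50, 0.50))
```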

Product roadmap
MVP	A provider-agnostic proxy for Exa, Tavily, Google grounding, and one browser-fallback partner with query classification, cost accounting, cache controls, approved-domain policies, and replayable evidence traces. It must show which path was taken for each task, why it was chosen, and whether freshness and citation policy were met.
6 months	Ship production pilots with routing policies by query class, shared cache and evidence reuse, provider-level observability, and constrained browser fallback for the small subset of queries that cannot be resolved through search alone.
12 months	Add customer-specific eval harnesses, policy templates for regulated and customer-facing workflows, approval gates for risky fallback actions, and support for more provider mixes beyond the initial Exa-Tavily-Google stack.
24 months	Expand from one workflow wedge into a broader external-knowledge control plane with multi-team governance, benchmark data, enterprise retention controls, and adjacent products for retrieval evals and licensed-source orchestration.
Key bets	Most target buyers can realize meaningful savings through better routing and evidence reuse before demanding proprietary search quality improvements. · Browser fallback is necessary often enough to matter strategically, but not so often that it destroys gross margin in the first wedge. · Replayable evidence traces and domain policies are strong enough enterprise differentiators to beat hand-built orchestration. · A thin proxy integration is easier for early customers to adopt than a full agent-stack rewrite.

Business model
Revenue streams	Annual subscription for the retrieval control plane · Usage-based overage on managed query volume, evidence retention, and advanced policy controls · Enterprise add-ons for private deployment, retention controls, audit packaging, and advanced evaluation modules
Unit of value	Covered production workflow with included routed-query volume and evidence retention
Target gross margin	72%
Expansion levers	Add more retrieval-heavy workflows within the same customer after the first pilot converts · Expand provider coverage and policy depth across search, cache, and browser fallback paths · Sell higher-assurance governance, eval, and retention modules to larger enterprise deployments

Strategy map
North-star metric	Percent of covered agent tasks that meet freshness and citation policy at or below target retrieval cost
Input metrics	Paid pilot to production conversion rate · Retrieval spend reduction per successful agent task on covered workflows · Median time to replay the retrieval chain behind a bad answer · Cache-hit rate on repeated high-value queries · Share of routed queries requiring browser fallback
Moats to build	Historical routing graph showing which provider, depth, and fallback path works best by query class · Replayable evidence-trace corpus tied to customer outcomes and failure modes · Embedded domain, geography, and retention policies that become sticky inside customer workflows
Kill criteria	Fewer than 3 paid design partners after 30 qualified platform-team conversations · Covered-workflow spend savings below 15% in the first 3 pilots · Paid pilot to production conversion below 40% after the first 5 pilots · Browser fallback required on more than 30% of covered queries without clear gross-margin offset
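
The kill criteria above are concrete enough to check mechanically against pilot data. The sketch below is one possible encoding: the `PilotSnapshot` fields and the `triggered_kill_criteria` function are hypothetical, "below 15% in the first 3 pilots" is read as the mean of those pilots, and "clear gross-margin offset" is reduced to a boolean flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PilotSnapshot:
    qualified_conversations: int
    paid_design_partners: int
    pilot_savings: List[float]      # spend savings per pilot as fractions, in start order
    pilots_finished: int
    pilots_in_production: int
    browser_fallback_share: float   # share of covered queries needing browser fallback
    margin_offset_in_place: bool    # assumed flag for a clear gross-margin offset


def triggered_kill_criteria(s: PilotSnapshot) -> List[str]:
    """Return the names of the kill criteria that the snapshot currently trips."""
    hits = []
    if s.qualified_conversations >= 30 and s.paid_design_partners < 3:
        hits.append("design_partners")
    first3 = s.pilot_savings[:3]
    # Assumption: the 15% threshold applies to the mean of the first three pilots.
    if len(first3) == 3 and sum(first3) / 3 < 0.15:
        hits.append("spend_savings")
    if s.pilots_finished >= 5 and s.pilots_in_production / s.pilots_finished < 0.40:
        hits.append("pilot_conversion")
    if s.browser_fallback_share > 0.30 and not s.margin_offset_in_place:
        hits.append("browser_fallback")
    return hits
```

Each criterion stays silent until its sample-size condition is met, so an early-stage snapshot with too few conversations or pilots returns an empty list rather than a premature stop signal.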

Milestones

0–12 months

Sign 3-5 paid design partners in the AI-native SaaS beachhead
Show at least 25% retrieval-spend reduction on one covered workflow in the first 3 pilots
Launch a production-ready proxy with replayable evidence traces and approved-domain policies
Convert at least 2 paid pilots into annual production contracts

12–24 months

Expand from the first workflow into at least 2 additional retrieval-heavy workflows per customer cohort
Add broader provider support, eval tooling, and higher-assurance governance modules
Establish 3 active ecosystem partners that source or accelerate pilots
Reach repeatable annual ACV in the modeled $60k-$120k range without services-heavy deployment

24–36 months

Standardize as the external-knowledge control plane for multi-team agent deployments
Launch enterprise retention and regional-policy packaging for broader international sales
Build a routing and evidence dataset large enough to support benchmark products and expansion modules
Reach the modeled year-three SOM path of roughly $5.4M in annualized revenue

Strategy map

flowchart LR
  Wedge[Retrieval-governance pilot] --> MVP[Provider-agnostic routing proxy]
  MVP --> Proof[Lower spend and replayable citations]
  Proof --> Expansion[More workflows and enterprise governance modules]

Founding team

Role	Start timing	Rationale
Founder CEO	Month 0	Founder-led sales is required because the buyer is senior, the category needs education, and early pricing depends on quantified ROI.
Founding engineer	Month 0	The company needs a strong technical lead to build the proxy, routing logic, telemetry, and early integrations.
Applied ML and product engineer	Month 2	Query classification, cache strategy, and retrieval evaluation are core product risks and need dedicated ownership early.
Solutions engineer	Month 6	Early enterprise pilots will hinge on fast integration, workflow mapping, and proving value from customer traces.
Security and platform engineer	Month 9	Retention controls, policy logging, and deployment hardening become critical once pilots move into procurement.
GTM lead	Month 12	Add repeatable pipeline ownership only after pilot packaging and conversion evidence are clear.

Experiment roadmap

Horizon	Experiment	Hypothesis	Success metric	Owner
0–90 days	ICP and budget-threshold discovery	Retrieval-heavy AI-native SaaS teams with visible provider bills and customer-facing citations will describe a funded platform problem rather than a nice-to-have optimization.	15 discovery interviews completed, 8 matching the beachhead profile, and 5 sharing credible spend or failure baselines.	Founder CEO
0–90 days	Concierge routing benchmark	Manually tuned routing across cache, fast search, deep search, and browser fallback can reduce cost per successful task by at least 25% on one workflow.	2 design partners benchmark at least 100 tasks each and both show 25%+ spend reduction without lower answer acceptance.	Founding engineer
90–180 days	Thin-proxy pilot deployment	Customers will deploy a provider-agnostic proxy faster than they would replace their retrieval stack outright.	3 paid pilots launch with time to first routed production task under 30 days.	Founding engineer
90–180 days	Pricing and packaging test	Platform subscription plus managed-query overage pricing converts better than pure consumption pricing.	Preferred package wins in at least 4 of 6 budget-owner conversations and appears in 2 signed pilot scopes.	Founder CEO
6–12 months	Evidence-trace procurement validation	Audit logs, approved-domain policies, and retention controls are sufficient to unblock security review for customer-facing AI use cases.	At least 2 pilots complete security review without requiring customer-hosted deployment or custom one-off controls.	Security and platform engineer
12–18 months	Partner-sourced expansion motion	Search-provider, browser-runtime, and observability partners can source qualified pilots with conversion comparable to founder-led outbound.	25% of qualified pipeline comes from 3 active partners and pilot conversion is no worse than direct outbound.	GTM lead

Risk assessment

Business plan risks: 4 mapped on an impact (vertical axis) by likelihood (horizontal axis) matrix

R1Providers bundle routing, caching, or governance features fast enough to compress the standalone wedge. · Highlikelihood / Highimpact — Differentiate on multi-provider neutrality, workflow-specific policy controls, and evidence reuse across several backends rather than features tied to one provider.
R2Too few target teams have enough routed-query spend to support a new infrastructure vendor. · Mediumlikelihood / Highimpact — Focus on accounts already spending materially on retrieval, require a measurable spend baseline before pilots, and avoid low-volume segments.
R3Browser fallback becomes more common than expected and damages gross margin and deployment simplicity. · Mediumlikelihood / Highimpact — Partner for browser execution, meter fallback separately, and prioritize better routing and caching before expanding browser-heavy workflows.
R4Security and compliance reviews slow deals because logs, artifacts, and fallback paths feel risky. · Mediumlikelihood / Mediumimpact — Ship retention controls, domain restrictions, approval gates, and auditable routing logs in the MVP rather than as later enterprise features.

Risk	Likelihood	Impact	Mitigation
Providers bundle routing, caching, or governance features fast enough to compress the standalone wedge.	High	High	Differentiate on multi-provider neutrality, workflow-specific policy controls, and evidence reuse across several backends rather than features tied to one provider.
Too few target teams have enough routed-query spend to support a new infrastructure vendor.	Medium	High	Focus on accounts already spending materially on retrieval, require a measurable spend baseline before pilots, and avoid low-volume segments.
Browser fallback becomes more common than expected and damages gross margin and deployment simplicity.	Medium	High	Partner for browser execution, meter fallback separately, and prioritize better routing and caching before expanding browser-heavy workflows.
Security and compliance reviews slow deals because logs, artifacts, and fallback paths feel risky.	Medium	Medium	Ship retention controls, domain restrictions, approval gates, and auditable routing logs in the MVP rather than as later enterprise features.

First customer
Title	Head of AI Platform at an AI-native SaaS company
Profile	A Series B-D software company with 50-200 employees, 20-150 engineers, and a production research, sales-intel, or compliance agent that already uses one or more external search providers.
Trigger	Product GA, enterprise rollout, or a customer escalation exposes rising retrieval bills and stale or weakly cited answers.
Buyer	VP Engineering or Head of AI Platform
Initial contract	$20k-$40k paid pilot for one workflow and baseline measurement, converting to roughly $60k-$120k annual ACV once routing policies and evidence traces are trusted in production.

What must be true

At least a meaningful subset of target buyers already spends enough on external retrieval to justify a standalone control-plane line item.
A thin proxy can cut covered-workflow retrieval spend by at least 25% without reducing answer acceptance or freshness.
Buyers care enough about replayable citations and policy controls to choose a neutral layer over deeper commitment to one provider.
Browser fallback remains a minority of routed queries in the first wedge so gross margin stays above software-infrastructure norms.
Provider and framework churn does not force so much integration work that the business becomes services-heavy.

Open diligence questions

What monthly routed-query spend or failure rate actually unlocks budget for a dedicated control plane?
In the first ten real prospects, which KPI matters most: spend saved, citation replay speed, or lower bad-answer incidence?
How often do target teams need browser fallback after query routing and caching are improved?
Which provider combinations dominate the beachhead accounts and therefore determine MVP integration scope?
What native governance features are Exa, Tavily, Google, and gateway vendors already shipping into the same workflow?

Investor verdict
Call	Meet / investigate further
Conviction	Strong wedge and market timing, but conviction depends on proving standalone budget thresholds and keeping browser-heavy workloads from eroding margins.
Why believe	The plan targets a real production bottleneck where buyers already see metered retrieval costs and lack a neutral cross-provider governance layer.
Why doubt	Search providers, clouds, and browser vendors can all bundle adjacent controls, so the startup must prove that neutrality and workflow data create enough value before incumbents close the gap.
Next diligence	Confirm 3 paid pilots with customers already spending meaningfully on external retrieval and show at least 25% spend reduction plus sub-10-minute citation replay on one workflow.

Section

Financial model

3-year totals
Year 1	Revenue $54K · EBITDA -$928K · Cash EOP $2.37M
Year 2	Revenue $520K · EBITDA -$1.28M · Cash EOP $1.10M
Year 3	Revenue $1.81M · EBITDA -$733K · Cash EOP $363K
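
The cash figures above follow from the model's stated simplification that ending cash equals opening cash plus cumulative EBITDA (assumption A18), starting from the $3.3M opening cash in A2. A minimal sketch of that roll-forward, using the rounded EBITDA figures shown here; the function and variable names are illustrative, not part of the model:

```python
OPENING_CASH_K = 3300.0  # A2: $3.3M pre-seed, in USD thousands

# Rounded annual EBITDA from the 3-year totals, in USD thousands.
EBITDA_K = {"Y1": -928.0, "Y2": -1280.0, "Y3": -733.0}


def cash_end_of_period(opening_cash_k: float, ebitda_by_year: dict) -> dict:
    """Roll cash forward: each year's ending cash is prior cash plus EBITDA."""
    cash = opening_cash_k
    result = {}
    for year, ebitda in ebitda_by_year.items():
        cash += ebitda
        result[year] = cash
    return result


CASH_EOP_K = cash_end_of_period(OPENING_CASH_K, EBITDA_K)
print(CASH_EOP_K)  # {'Y1': 2372.0, 'Y2': 1092.0, 'Y3': 359.0}
```

Because the published EBITDA values are rounded, the Y3 result lands a few thousand dollars below the stated $363K; Y1 and Y2 match the stated $2.37M and $1.10M after rounding.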

Unit economics
ARPU (annual)	$110K
Gross margin	72%
CAC	$76K · Payback 11.6 months
LTV / CAC	6.9x · LTV $528K
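
The dossier does not spell out how LTV and payback are derived. One common formulation reproduces the stated figures from the model's own inputs: monthly gross profit per customer divided by monthly churn for LTV, and CAC divided by monthly gross profit for payback. The sketch below assumes that formulation, with churn of 1.25% per month (A15) and CAC of $76.32K (A16); the function name is illustrative.

```python
def unit_economics(arpu_k: float, gross_margin: float,
                   cac_k: float, monthly_churn: float) -> dict:
    """Assumed formulas: LTV = monthly gross profit / monthly churn,
    payback = CAC / monthly gross profit. All money in USD thousands."""
    monthly_gross_profit_k = arpu_k * gross_margin / 12
    ltv_k = monthly_gross_profit_k / monthly_churn
    return {
        "monthly_gross_profit_k": monthly_gross_profit_k,
        "ltv_k": ltv_k,
        "payback_months": cac_k / monthly_gross_profit_k,
        "ltv_to_cac": ltv_k / cac_k,
    }


UE = unit_economics(arpu_k=110.0, gross_margin=0.72,
                    cac_k=76.32, monthly_churn=0.0125)
print({name: round(value, 1) for name, value in UE.items()})
```

Under these assumptions LTV rounds to $528K, payback to 11.6 months, and LTV / CAC to 6.9x, in line with the table above.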

Funding ask
Round	pre-seed · $3.3M
Runway	36 months
Milestone	Reach 16 paying customers by Q2Y3, prove blended production ACV above $100K with 72% gross margin, and have 3 active ecosystem partners while still holding roughly 6 months of buffer into Q4Y3.

Model sanity

Revenue engine. Base-case revenue comes from growing from 4 paying accounts in Y1 to 24 by Q4Y3 while blended ACV rises from $36K pilot pricing to $110K production contracts.
Must go right. Founder-led pilots need to convert into production before the GTM team scales, because the sales-cycle sensitivity is the single biggest revenue and runway swing factor.
Model breaks if. The downside case turns cash negative if buyers will not support a $60K-$120K annual control-plane budget or if browser fallback keeps gross margin below about 68%.
Next-round proof. The strongest next financing story is 16 paying customers by Q2Y3, 72% gross margin, and 3 active ecosystem partners that show the motion is no longer purely founder-sourced.

Chart: Revenue, cash, and EBITDA (12-month Y1 + 8-quarter Y2/Y3). Series: Revenue (line, area), Cash EOP (dashed), EBITDA (bars, gray = loss).

Use of funds — $3.3M pre-seed

Headcount build by role (peak 9 FTE)

Founder / CEO
Founding engineer
Applied ML / product engineer
Solutions engineer
Security / platform engineer
GTM lead
Customer success / solutions architect
Account executive
Platform engineer

Year-3 scenarios — base / downside / upside

Scenario	Y3 revenue	Y3 EBITDA	Cash low point	Description
Downside	$1.25M	-$1.19M	-$156K	Slower pilot conversion and weaker standalone budget proof keep the company below the intended production-customer ramp.
Base	$1.81M	-$733K	$363K	Four paid pilots in Y1 convert into a measured but repeatable production ramp, and revenue grows through higher ACV plus modest customer count expansion.
Upside	$2.46M	-$220K	$879K	Design partners convert faster, partner channels contribute earlier, and production pricing expands with stronger usage and governance attach rates.

Sensitivity — Y3 cash and revenue impact, sorted by magnitude

Variable	Downside	Upside	Cash impact	Revenue impact
CAC	$95K CAC per new customer	$65K CAC per new customer	-$375K	$0K
Sales cycle	Pilot-to-production close slips by one quarter	Partner-sourced deals close in under two months after pilot proof	-$290K	-$385K
Hiring pace	Pull the platform engineer and a second field-facing hire forward by two quarters	Delay the Y3 platform hire by one quarter and cover overflow with partners	-$140K	-$60K
ARPU	$100K blended annual ACV in Y3	$120K blended annual ACV in Y3	-$118K	-$165K
Churn	2.0% monthly churn after the first contract term	1.0% monthly churn	-$90K	-$120K
Gross margin	68%	74%	-$73K	$0K

Scenarios

Scenario	Y3 revenue	Y3 EBITDA	Cash low point	Description	Key changes
Downside	$1.25M	-$1.19M	-$156K	Slower pilot conversion and weaker standalone budget proof keep the company below the intended production-customer ramp.	Y2 exits at 8 paying customers and Y3 exits at 18 instead of 24; blended annual ACV reaches only $75K in Y2 and $100K in Y3; gross margin tops out around 68% because browser fallback and provider costs stay elevated.
Base	$1.81M	-$733K	$363K	Four paid pilots in Y1 convert into a measured but repeatable production ramp, and revenue grows through higher ACV plus modest customer count expansion.	Y2 exits at 10 paying customers and Y3 exits at 24; blended annual ACV rises from $36K pilot pricing to $80K in Y2 and $110K in Y3; gross margin reaches the 72% business-plan target once routing and caching mature.
Upside	$2.46M	-$220K	$879K	Design partners convert faster, partner channels contribute earlier, and production pricing expands with stronger usage and governance attach rates.	Y2 exits at 12 paying customers and Y3 exits at 30; blended annual ACV reaches $85K in Y2 and $120K in Y3; gross margin improves to 74% as cache hit rates and query routing outperform plan.

Sensitivity

Variable	Downside	Base	Upside
ARPU	$100K blended annual ACV in Y3	$110K blended annual ACV in Y3	$120K blended annual ACV in Y3
CAC	$95K CAC per new customer	$76.32K CAC per new customer	$65K CAC per new customer
Churn	2.0% monthly churn after the first contract term	1.25% monthly churn	1.0% monthly churn
Sales cycle	Pilot-to-production close slips by one quarter	Paid pilots convert inside two quarters	Partner-sourced deals close in under two months after pilot proof
Gross margin	68%	72%	74%
Hiring pace	Pull the platform engineer and a second field-facing hire forward by two quarters	Add only one platform engineer in Y3 after Y2 production proof	Delay the Y3 platform hire by one quarter and cover overflow with partners

Key assumptions (18)

ID	Name	Value	Unit	Source
A1	Model start month	2026-06	YYYY-MM	[BP date 2026-05-21] the model starts the month after the business-plan date so the round closes before the hiring ramp.
A2	Opening cash	3300.0	USDK	[BP fundingAsk targetFundingRangeUsd $2-4M] base case uses a $3.3M pre-seed, inside the stated range and large enough to reach the next milestone plus six months of buffer.
A3	Customer unit in the model	active paying customer account	definition	[BP market.som 60 customers at about $90k blended ACV; BP businessModel.unitOfValue covered production workflow] customersEop tracks paying accounts, with expansion reflected through rising ARPU.
A4	Starting customers (M1)	0	count	[BP milestones 0-12 months] the company starts pre-revenue before signing paid design partners.
A5	Y1 new paying customers by month	[0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]	count	[BP milestones sign 3-5 paid design partners and convert at least 2 pilots] base case closes four paid pilots across the back half of Y1.
A6	Y2 new paying customers by quarter	[1, 1, 2, 2]	count	[BP milestones 12-24 months] assumes a modest founder-led ramp from pilot conversions plus the first GTM support hire.
A7	Y3 new paying customers by quarter	[3, 3, 4, 4]	count	[BP market.som 60 customers at about $90k ACV] base case ends Y3 at 24 customers, deliberately below the SOM path to stay conservative for a founder-led infrastructure sale.
A8	Annual price ladder	Y1 36.0; Y2 80.0; Y3 110.0	annual USDK per customer	[BP investorMemo.initialContract $20k-$40k paid pilot; BP gtm.pricing $60k-$120k annual ACV plus usage fees] pilots start near the midpoint, then production pricing moves into the target range with usage and governance add-ons.
A9	Revenue recognition policy	average active customers in period multiplied by annualized contract value	formula	Startup-finance heuristic: new enterprise customers usually activate mid-period on average, so recognized revenue uses ((BoP + EoP) / 2) x annualized price.
A10	Gross margin ramp	Y1 50-60 on paid months; Y2 62, 65, 68, 70 by quarter; Y3 71, 72, 72, 72 by quarter	percent	[BP businessModel.targetGrossMarginPct 72; BP risks on browser fallback; Research sensitivityCases browser fallback] margin climbs only as routing and caching reduce expensive browser and provider overuse.
A11	Loaded salary bands	Founder 180; founding eng 180; applied ML/product eng 170; solutions eng 145; security/platform eng 170; GTM lead 170; customer success 130; AE 155; platform eng 160	annual USDK per FTE	[BP team roles] plus startup-finance heuristic for seed-stage U.S.-centric infrastructure startups including payroll tax and benefits load.
A12	Hiring schedule	founder and founding eng in M1; applied ML/product eng in M2; solutions eng in M6; security/platform eng in M9; GTM lead in M12; customer success/solutions architect in M15; AE in M18; platform engineer in M27	timing	[BP team startTiming through Month 12] plus startup-finance heuristic that later hires wait for pilot proof and partner-sourced pipeline.
A13	Payroll allocation policy	founder 70% S&M and 30% G&A; applied ML/product eng 80% R&D and 20% S&M; solutions eng 50% S&M, 30% R&D, 20% G&A; security/platform eng 85% R&D and 15% G&A; customer success 65% S&M, 20% R&D, 15% G&A; engineering fully R&D; GTM and AE fully S&M	policy	[BP team rationales; BP gtm; BP operations] reflects founder-led selling, implementation-heavy pilots, and an initially product-weighted org.
A14	Non-payroll operating expense ramp	S&M 5-20 monthly; R&D 10-22 monthly; G&A 7-15 monthly	USDK per month	[BP operations; BP fundingAsk.useOfFundsSummary] plus startup-finance heuristic for cloud spend, eval runs, travel, legal, and compliance tooling in a lean pre-seed plan.
A15	Steady-state monthly churn	1.25	percent	Startup-finance heuristic: annual infrastructure contracts with observable ROI should retain well, but the early category and provider bundling risk keep modeled churn above best-in-class infra retention.
A16	Blended CAC per customer	76.32	USDK	Calculated from modeled Y2-Y3 sales and marketing spend of $1.526M divided by 20 new customers; consistent with founder-led, technical enterprise sales.
A17	Funding sizing rule	reach the next fundable milestone and hold six months of buffer	policy	Developer instruction plus [BP fundingAsk runwayMonths 18] base case sizes the round to reach mid-Y3 proof points with H2Y3 buffer, not just to launch the first pilots.
A18	Cash flow simplification	ending cash equals opening cash plus cumulative EBITDA	formula	Startup-finance heuristic: the model assumes an asset-light software business with negligible capex, debt, and working-capital distortion.
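
The revenue policy in A9 can be checked against the customer schedule (A4 to A7) and the price ladder (A8). The sketch below is a simplified reading, not the model itself: it ignores churn, applies each year's ladder price to every active customer in that year, and treats Y1 as monthly periods and Y2 and Y3 as quarterly. Function and variable names are illustrative.

```python
def recognized_revenue(start_customers: int, new_per_period: list,
                       annual_price_k: float, periods_per_year: int):
    """A9: revenue per period = ((BoP + EoP) / 2) x annualized price,
    scaled to the period length. Returns (revenue in USDK, ending customers)."""
    bop = start_customers
    total_k = 0.0
    for adds in new_per_period:
        eop = bop + adds
        total_k += (bop + eop) / 2 * annual_price_k / periods_per_year
        bop = eop
    return total_k, bop


# (new customers per period, annual price in USDK, periods per year)
SCHEDULE = {
    "Y1": ([0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0], 36.0, 12),  # A5, monthly
    "Y2": ([1, 1, 2, 2], 80.0, 4),                            # A6, quarterly
    "Y3": ([3, 3, 4, 4], 110.0, 4),                           # A7, quarterly
}

customers = 0  # A4: the company starts with no paying customers
REVENUE_K = {}
for year, (adds, price_k, periods) in SCHEDULE.items():
    REVENUE_K[year], customers = recognized_revenue(customers, adds, price_k, periods)

print(REVENUE_K, customers)  # {'Y1': 54.0, 'Y2': 520.0, 'Y3': 1815.0} 24
```

Even with churn left out, this reading matches the headline revenue of $54K, $520K, and roughly $1.81M, and ends Y3 at the base-case 24 customers.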

Unit economics flow

flowchart LR
  QualifiedTeams[Qualified platform teams] --> PaidPilots[Paid pilots]
  PaidPilots --> ProductionAccounts[Production accounts]
  ProductionAccounts --> Revenue[Subscription and usage revenue]
  Revenue --> GrossProfit[Gross profit after provider and browser costs]
  GrossProfit --> Cash[Cash runway]

Flags: The model still exits Y3 EBITDA-negative, so it assumes the company raises again off proof of efficient growth rather than self-funding. · Revenue per FTE only reaches the low end of software benchmarks, which leaves little room for extra services work or a faster support ramp. · Gross margin assumes browser fallback remains a minority path; if it stays above the business-plan risk threshold, the funding ask would need to move higher.

Section

Top risks

Provider bundling. Search providers could add their own routing, caching, and observability features and squeeze the middleware layer. Mitigation: Focus on multi-provider governance, replayability, and source-policy controls that customers need across several backends, not just one API.
Weak early ROI. Teams below meaningful query volume may not feel enough pain to buy a dedicated retrieval control plane. Mitigation: Target customers already spending $25,000+ per month on search and sell a rapid pilot around hard dollar savings and citation reliability.
Fast-moving agent stacks. Framework churn could make deeply embedded integrations expensive to maintain. Mitigation: Integrate at the API proxy and telemetry layer so the product remains useful across changing models, frameworks, and search vendors.

Section

Evidence

Cited sources (40)

Andreessen Horowitz. Investing in Exa · https://a16z.com/announcement/investing-in-exa
SiliconANGLE. Exa Labs raises $250M at $2.2B valuation for its AI search tools · https://siliconangle.com/2026/05/20/exa-labs-raises-250m-2-2b-valuation-ai-search-tools
TechCrunch. Tavily raises $25M to connect AI agents to the web · https://techcrunch.com/2025/08/06/tavily-raises-25m-to-connect-ai-agents-to-the-web
Data4AI. Exa.ai vs. Tavily - AI Semantic Search API for LLM · https://data4ai.com/blog/tool-comparisons/exa-ai-vs-tavily
Rhumb. Exa vs Tavily vs Serper vs Brave Search for AI Agents · https://rhumb.dev/blog/exa-vs-tavily-vs-serper-vs-brave-search
WebSearchAPI.ai. Compare Tavily, Perplexity API, Google Search Grounding, Exa with LLM-as-Judge in LangSmith · https://websearchapi.ai/blog/compare-tavily-google-search-exa-perplexity
Precedence Research. Enterprise Search Market Size to Hit USD 12.71 Billion by 2035 · https://www.precedenceresearch.com/enterprise-search-market
Precedence Research. AI Agents Market Size to Hit USD 294.66 Billion by 2035 · https://www.precedenceresearch.com/ai-agents-market
NIST. AI Risk Management Framework · https://www.nist.gov/itl/ai-risk-management-framework
EUR-Lex. Regulation - EU - 2024/1689 - EN - EUR-Lex · https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng
European Commission. AI Act · https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
IBM. What is RAG (Retrieval Augmented Generation)? · https://www.ibm.com/think/topics/retrieval-augmented-generation
AWS. What is RAG? - Retrieval-Augmented Generation AI Explained · https://aws.amazon.com/what-is/retrieval-augmented-generation
Anthropic. Tool use with Claude · https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview
Anthropic. Computer use tool · https://platform.claude.com/docs/en/agents-and-tools/tool-use/computer-use-tool
LangChain. Tavily search integration · https://docs.langchain.com/oss/python/integrations/tools/tavily_search
Browserbase. Browserbase Pricing: Free, $20, $99, or Custom · https://www.browserbase.com/pricing
Browserless. Pricing · https://www.browserless.io/pricing
Cloudflare. Cloudflare AI Gateway · https://developers.cloudflare.com/ai-gateway
Cloudflare. Caching · https://developers.cloudflare.com/ai-gateway/features/caching
Cloudflare. Rate limiting · https://developers.cloudflare.com/ai-gateway/features/rate-limiting
Cloudflare. Analytics · https://developers.cloudflare.com/ai-gateway/observability/analytics
Cloudflare. Fallbacks · https://developers.cloudflare.com/ai-gateway/configuration/fallbacks
Cloudflare. Custom costs · https://developers.cloudflare.com/ai-gateway/configuration/custom-costs
Google Cloud. Grounding with Google Search · https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/grounding-with-google-search
Google Cloud. Agent Search pricing · https://cloud.google.com/generative-ai-app-builder/pricing
Google Cloud. Introduction to custom search · https://docs.cloud.google.com/generative-ai-app-builder/docs/about-generic-search
Exa. API Pricing · https://exa.ai/pricing
Exa. Exa vs Tavily: AI Search API Comparison 2026 · https://exa.ai/versus/tavily
Exa. Exa Search API · https://exa.ai/docs/reference/search-api-guide
Exa. Contents API · https://exa.ai/docs/reference/contents-api-guide
Exa. Crawling Subpages · https://exa.ai/docs/reference/crawling-subpages
Tavily. Tavily · https://www.tavily.com/pricing
Tavily. Tavily Search · https://docs.tavily.com/documentation/api-reference/endpoint/search
Tavily. Tavily Extract · https://docs.tavily.com/documentation/api-reference/endpoint/extract
Tavily. Tavily Crawl · https://docs.tavily.com/documentation/api-reference/endpoint/crawl
Tavily. Create Research Task · https://docs.tavily.com/documentation/api-reference/endpoint/research
Browserbase. Enterprise security · https://docs.browserbase.com/account/enterprise/security
Browserbase. Zero data retention (ZDR) · https://docs.browserbase.com/account/enterprise/zero-data-retention
Browserbase. This week we fixed the worst part of Browserbase · https://www.browserbase.com/blog/session-recordings
