BizIdea

AI-NATIVE SEARCH · ai-infra · Scan 2026-05-20 to 2026-05-20 · Run 20260521000114

Control plane that routes, caches, and governs web search for production AI agents before retrieval spend and failures explode.

Once AI products move from pilot to production, web retrieval becomes a hidden failure domain: query volume spikes, search bills become unpredictable, and one stale or low-quality source can poison downstream agent actions. Platform teams can buy a search API, but they still lack a control layer that decides when to use fast head search, deep long-tail search, cached evidence, or a browser fallback.

Overall rating 3.9 / 5.0
  1. Market · 4/5
    $0.9B TAM, $180.0M SAM, 43.57% CAGR, and four mapped vendors point to a large, fast-growing market that is already competitive.

  2. Differentiation · 4/5
    Provider-agnostic routing, policy controls, and replay logs address gaps across Exa, Tavily, Google, and Browserbase, though the wedge is still copyable.

  3. Execution · 3/5
    Plan is clear and unit economics are healthy at 72% gross margin, 6.9x LTV/CAC, and 11.6-month payback, but three model flags keep risk real.

  4. Timeliness · 5/5
    Four fresh signals in a one-day scan window (major funding, strong adoption, and 1,000x query growth) make retrieval governance feel urgent now.

Why now

  1. Agent traffic is about to overwhelm ad hoc retrieval logic because search volume is expected to grow by orders of magnitude as agents replace humans in research workflows.
  2. Legacy search wrappers are no longer enough, creating room for a policy and routing layer that decides how agent queries should be executed.
  3. Adoption has already crossed from experimentation into deployment, so buyers now need reliability and governance rather than a demo-quality API.
  4. The size of Exa's round and valuation shows retrieval is being funded like core infrastructure, which gives startups permission to buy tooling around it.

Catalyst. Exa's rise to default search infrastructure, its 400,000+ developer adoption, and the explicit expectation that agents will search 1,000x more than humans have shifted the problem from "can agents search?" to "how do we govern search at production scale?"

The idea

The product sits between agent frameworks and retrieval providers as an API proxy plus dashboard. It predicts query intent and business value, then chooses the cheapest path that still meets freshness and confidence targets: cached evidence packet, fast head search, deep long-tail search, or browser execution. Every result is normalized into a provenance graph with source metadata, recency checks, and replay logs so teams can debug bad answers and satisfy enterprise buyers asking for citations. Customers also get policy controls for approved domains, geo restrictions, budget caps, and workflow SLAs, plus observability on cost per agent action and failure mode.
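
The routing decision described above can be sketched as a cheapest-eligible-path selection. This is a minimal illustration, not the product's actual logic: the `Path` type, the `choose_path` function, and every cost, staleness, and confidence number are hypothetical placeholders, not figures from any provider.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    name: str
    cost_usd: float        # hypothetical cost per query
    max_staleness_s: int   # worst-case age of the evidence this path returns
    confidence: float      # hypothetical expected answer confidence, 0..1


# Illustrative numbers only; they are assumptions, not provider pricing.
PATHS = [
    Path("cached_evidence", 0.0001, 86_400, 0.70),
    Path("fast_head_search", 0.007, 3_600, 0.80),
    Path("deep_long_tail_search", 0.015, 3_600, 0.90),
    Path("browser_execution", 0.050, 0, 0.95),
]


def choose_path(max_staleness_s: int, min_confidence: float,
                budget_usd: float) -> Optional[Path]:
    """Return the cheapest path that meets freshness, confidence, and budget targets."""
    eligible = [
        p for p in PATHS
        if p.max_staleness_s <= max_staleness_s
        and p.confidence >= min_confidence
        and p.cost_usd <= budget_usd
    ]
    # None signals that no path satisfies the policy, so the caller must escalate or skip.
    return min(eligible, key=lambda p: p.cost_usd) if eligible else None


if __name__ == "__main__":
    print(choose_path(max_staleness_s=3_600, min_confidence=0.85, budget_usd=1.0))
```

In a real deployment the per-path estimates would presumably come from observed routing data rather than a static table, which is where the routing-data moat described in this report would enter.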

What's different. Exa proves agent-native search demand exists, but this company wins by being provider-agnostic and policy-first rather than another monolithic index. Its defensibility comes from the query-routing data, source-policy configs, and replay logs it accumulates across customer workflows — especially the decision of when not to search and when to reuse prior evidence. That makes it sticky with buyers who care about margin, provenance, and enterprise controls more than raw search endpoints.

Startup thesis
Beachhead: Series B-D AI-native SaaS vendors with 20-150 engineers whose production agents perform 500,000+ external web searches per day for sales-intel, market-research, or compliance workflows.
Wedge: A retrieval control plane that classifies each query, routes it across fast search, deep search, cache, or browser fallback, and returns replayable citations with cost and freshness policies enforced.
Non-obvious insight: The next valuable layer is not another search engine; it is the control plane above search engines that decides when an agent should search, which retrieval depth to buy, which sources are admissible, and when cached evidence is good enough. As agent traffic explodes, those policy and routing decisions matter more to customers than raw index ownership alone.
Venture-scale path: Starting as routing and policy middleware for external-web retrieval, the company can expand into retrieval evals, source licensing, benchmark data, enterprise governance, and multi-provider knowledge access across every agent workflow that touches external information.

Target user
Primary user: ML infrastructure and agent-platform engineers at Series B-D AI-native SaaS companies shipping research, sales-intel, or compliance agents that depend on live web retrieval.
Secondary user: Product leaders responsible for enterprise-grade answer quality and gross margin in AI copilots that cite external web sources.
Economic buyer: VP Engineering or Head of AI Platform.

Go-to-market seed
First customer: Head of platform engineering at a 50-200 person AI-native SaaS company that already uses Exa or SerpAPI inside a research or sales-intel product and is spending more than $25,000 per month on external search.
Buying trigger: A product GA launch or major enterprise rollout causes search spend to jump or surfaces a customer escalation about stale, uncited, or inconsistent answers.
Current alternative: Direct calls to Exa or SerpAPI plus ad hoc caching, prompt heuristics, and internal dashboards.
Switching reason: The wedge reduces retrieval spend and answer failures without forcing a model swap or full agent rewrite, while adding auditability enterprise customers already ask for.
Pricing hypothesis: Platform subscription plus usage-based pricing tied to managed query volume, routed spend, and retained evidence traces.

Jobs to be done

Job | Current alternative | Success metric
When our agent product starts generating thousands of external queries per minute, help our platform team control retrieval cost and reliability so they can scale usage without margin surprises or bad citations. | Hand-built routing logic and dashboarding around one search API | 30%+ reduction in search cost per successful agent task
When an enterprise customer questions an answer sourced from the web, help our team replay and audit the retrieval chain so they can defend the output and fix failures quickly. | Manual log inspection across prompts, API calls, and app telemetry | Under 10 minutes to trace a bad answer to query, source, and routing decision

Agent search control loop
flowchart LR
  Buyer[AI platform team] --> Pain[Uncontrolled search cost and weak provenance]
  Pain --> Product[Retrieval control plane]
  Product --> Outcome[Lower cost, cited answers, governed agent search]
Idea scorecard: average 4.6 / 5 · 5 axes
Signal 5/5 · Pain 4/5 · Wedge 5/5 · Defense 4/5 · Scale 5/5
  • Signal · 5/5: The cluster combines major funding, rapid valuation expansion, named customer adoption, and explicit claims that agents will search 1,000x more than humans.
  • Pain · 4/5: The pain is not a one-time crisis, but once an agent product scales, cost, latency, and provenance failures directly hit gross margin and enterprise trust.
  • Wedge · 5/5: A retrieval routing and governance layer for search-heavy agent companies is a crisp first product with obvious insertion point and measurable ROI.
  • Defense · 4/5: Workflow-specific routing data, provenance graphs, and embedded source policies create stickiness, though providers could try to move upward over time.
  • Scale · 5/5: If every production agent needs governed external knowledge access, this can grow from middleware into the standard control plane for retrieval across the agent economy.
Business model canvas
Key partners
  • Search API providers
  • Browser automation and crawling vendors
  • LLM observability platforms
Key activities
  • Routing and caching optimization
  • Provenance graph generation
  • Retrieval evaluation and policy tooling
Key resources
  • Query routing models
  • Retrieval trace corpus
  • Integrations with search and browser providers
Value propositions
  • Reduce search spend without degrading answer quality
  • Enforce provenance, freshness, and source-policy controls for agents
  • Give platform teams replayable retrieval traces and SLA observability
Customer relationships
  • High-touch implementation for first design partners
  • Usage reviews tied to spend savings and answer-quality metrics
  • Shared eval dashboards with platform teams
Channels
  • Founder-led sales into AI product and platform teams
  • Partnerships with agent-framework and observability vendors
  • Technical content targeted at retrieval-heavy AI builders
Customer segments
  • AI-native SaaS companies running production agents on live web data
  • Enterprise internal AI platform teams standardizing retrieval across apps
Cost structure
  • Cloud inference and storage
  • Provider usage costs
  • Engineering for routing, evals, and integrations
Revenue streams
  • Annual platform subscription
  • Usage-based fee on managed search requests
  • Enterprise add-ons for retention, compliance, and private deployment

Market

Market sizing
TAM · Total addressable: $0.9B
SAM · Serviceable available: $180.0M
SOM · Serviceable obtainable: $5.4M
Market sizing overview
TAM $0.9B: Bottom-up estimate. Assume ~15,000 global retrieval-heavy teams ultimately need multi-provider search governance (modeled from Exa reporting 5,000+ companies already using its stack, multiplied by ~3x to account for broader multi-vendor adoption), at roughly $60k annual control-plane spend per team. Cross-check: roughly 15% of the 2026 enterprise-search market.
SAM $180.0M: Apply the beachhead constraint to ~3,000 Series B-D AI-native SaaS teams with retrieval-heavy production agents (about 20% of the modeled TAM universe) at the same ~$60k annual control-plane spend.
SOM $5.4M: Reachable year-3 share modeled as ~60 customers at roughly $90k blended annual contract value once pilots prove spend savings and citation reliability.
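
The sizing arithmetic above can be reproduced in a few lines. This sketch uses only the inputs stated in the sizing notes (team counts, per-team spend, blended ACV); the variable names are illustrative, and the approximate ("~") inputs are treated as exact for the calculation.

```python
# Inputs as stated in the market sizing notes; "~" values are treated as exact here.
exa_companies = 5_000          # companies Exa reports on its stack
multi_vendor_multiplier = 3    # modeled uplift for broader multi-vendor adoption
spend_per_team = 60_000        # annual control-plane spend per team, USD
sam_teams = 3_000              # Series B-D AI-native SaaS teams in the beachhead
som_customers = 60             # modeled year-3 customers
som_acv = 90_000               # blended annual contract value, USD

tam_teams = exa_companies * multi_vendor_multiplier
tam = tam_teams * spend_per_team
sam = sam_teams * spend_per_team
som = som_customers * som_acv

print(f"TAM ${tam / 1e9:.1f}B, SAM ${sam / 1e6:.1f}M, SOM ${som / 1e6:.1f}M")
print(f"SAM teams as share of TAM universe: {sam_teams / tam_teams:.0%}")
print(f"SOM as share of SAM: {som / sam:.0%}")
```

The figures reconcile with the headline numbers: $0.9B, $180.0M, and $5.4M, with the SOM equal to 3% of the SAM.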

Executive takeaways

  • Agent-native retrieval has clearly become infrastructure: Exa says agents will search 1,000x more than humans and Tavily has raised capital around the same thesis, which validates a real workload shift rather than a speculative feature add-on. [1][3]
  • The wedge is not another index alone; it is orchestration across search depth, extraction, browser fallback, caching, and traceability. Public docs from Exa, Tavily, Cloudflare, and browser-runtime vendors show buyers already assemble these primitives by hand. [48][63][24][22][23]
  • Budget exists because providers already charge on a visible per-query or per-run basis: Exa, Tavily, Google Agent Search, Browserbase, and Browserless all publish usage-linked pricing that turns search reliability into an operating-cost line item. [45][61][37][22][23]
  • The strategic gap is provider-agnostic governance. Search APIs optimize their own surfaces, clouds optimize their own platforms, and browser runtimes optimize execution; none of them naturally optimize when not to search, when to reuse evidence, and how to keep multi-provider policy consistent. [47][39][28][77]
  • Adoption friction will center on enterprise controls, not raw API access. NIST, the EU AI Act, Anthropic computer-use guidance, and browser-runtime security docs all point to logging, oversight, retention, and prompt-injection controls as table stakes. [11][12][20][78]
  • The category should grow with agent adoption faster than classic enterprise search alone: Precedence pegs AI agents at 43.57% CAGR through 2035 versus 9.05% for enterprise search, implying outsized growth for the subset of spend tied to live external retrieval. [10][9]

Market definition

Provider-agnostic control software for production AI-agent retrieval. The category sits above search APIs, grounding services, caches, and browser runtimes to decide whether a query should hit fast search, deep search, extraction, cached evidence, or browser fallback while preserving citations, freshness policies, and cost telemetry.

Customer and buyer

Primary users are ML infrastructure and agent-platform engineers running retrieval-heavy research, sales-intel, or compliance agents. The economic buyer is typically a VP Engineering, Head of AI Platform, or platform GM who owns margin, answer quality, and enterprise trust for agent products.

Buying triggers

  • A product GA launch or enterprise rollout causes external-search spend to jump and exposes that direct provider calls lack cost guardrails. [45][61][37]
  • Bad or stale citations trigger customer escalations, making replayability and source-policy controls more urgent than raw retrieval breadth. [15][16][51]
  • Teams add browser execution or multi-provider fallbacks to recover failed searches, which raises operational complexity and creates a clear insertion point for a routing control plane. [28][22][23]

Willingness to pay

Public pricing already trains buyers to pay for retrieval infrastructure as metered production spend: Exa charges $7 per 1,000 search requests, Tavily charges per credit, Google Agent Search charges per 1,000 queries, and browser runtimes bill for search, fetch, browser hours, or units. A control plane can therefore anchor ROI around reduced routed spend and fewer failed tasks rather than inventing a new budget category. [45][61][37][22][23]
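
To make the ROI anchor concrete, the sketch below turns a daily query volume into a monthly bill. It assumes every query is billed at the $7 per 1,000 search-request rate quoted above and a 30-day month; real workloads mix cheaper and more expensive paths, so treat the result as an order-of-magnitude illustration. The function names are hypothetical, and the 25% reduction is this plan's pilot target, not a measured result.

```python
def monthly_search_spend(searches_per_day: int, usd_per_1k: float, days: int = 30) -> float:
    """Monthly bill if every search is charged at a single per-1,000 rate."""
    return searches_per_day * days / 1_000 * usd_per_1k


def searches_per_day_for_spend(monthly_usd: float, usd_per_1k: float, days: int = 30) -> float:
    """Inverse: daily volume implied by a monthly bill at a single per-1,000 rate."""
    return monthly_usd / usd_per_1k * 1_000 / days


baseline = monthly_search_spend(500_000, 7.0)         # beachhead volume at the $7/1k rate
annual_savings = baseline * 0.25 * 12                 # plan's 25% reduction target
implied_volume = searches_per_day_for_spend(25_000, 7.0)

print(f"Baseline monthly spend: ${baseline:,.0f}")
print(f"Annual savings at 25%: ${annual_savings:,.0f}")
print(f"Daily searches implied by $25,000/month: {implied_volume:,.0f}")
```

Under these assumptions the 500,000-searches-per-day beachhead implies about $105,000 per month, and a 25% reduction is worth about $315,000 per year. The $25,000-per-month first-customer threshold corresponds to roughly 119,000 searches per day at the same rate, so the two thresholds used in this report describe different volumes unless the blended price per query is well below $7 per 1,000.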

Category dynamics

Growth signal: 43.57% CAGR

Tailwinds

  • The AI-agent market is scaling much faster than classic enterprise search, creating a fast-growing demand layer for live external retrieval.
  • Providers now expose search, extraction, research, and browser primitives as composable APIs, which lowers integration friction for a control-plane layer.
  • Funding into Exa and Tavily signals that investors and customers view agent retrieval as strategic infrastructure.

Headwinds

  • Provider bundling could compress the value of an independent orchestration layer if search vendors or clouds add enough governance features themselves.
  • Browser fallback remains operationally messy and can create prompt-injection, retention, and security review burdens that slow sales cycles.

Validation signals

  • Exa’s funding and adoption narrative shows that large AI builders already treat web search as core infrastructure.
  • Tavily’s funding and LangChain integration indicate that agent developers actively choose specialized search APIs rather than defaulting to generic web search.
  • Public per-query pricing from Exa, Google, and browser runtimes makes retrieval cost concrete enough for a cost-optimization wedge.
  • Independent comparisons increasingly treat provider choice, latency, and response structure as first-order engineering decisions for agent teams.

Regulatory & technical constraints

  • The EU AI Act raises expectations for documentation, logging, traceability, and human oversight in higher-risk deployments.
  • NIST frames trustworthy AI deployment around structured risk-management practices, which strengthens the case for auditable retrieval policies.
  • Anthropic warns that browser-capable agents face prompt injection and recommends domain restrictions, minimal privileges, and user confirmation for high-impact actions.
  • Google documents a one-million-queries-per-day limit for grounding with Google Search, underscoring provider limits and the need for routing/fallback controls.
  • Enterprise buyers will ask for retention and security controls over logs and recordings when browser fallback is part of the workflow.
Agent retrieval stack market map
[Quadrant chart. Axes: single-provider primitive to cross-provider orchestration; lower workflow urgency to higher production urgency. Q1 is the winning zone. Plotted: Proposed startup, Exa, Tavily, Google Agent Search, Browserbase.]

Competition

Competition comes from four directions: agent-native search APIs such as Exa and Tavily; cloud grounding and enterprise-search stacks led by Google; browser runtimes such as Browserbase and Browserless that own execution fallback; and in-house glue code built on framework tools. The startup only wins if it becomes the cross-provider policy and spend layer rather than another point solution within one provider stack.

Competitor | Stage | Wedge | Pricing | Strength | Weakness vs. us
Exa | scale-up | Agent-native search engine with configurable latency, deep search, contents extraction, and async agent workflows. | $7/1k search requests; $12 deep search; $15 deep-reasoning; $0.025-$2.00 per agent run. | Strong semantic retrieval, broad search modes, and public evidence that leading AI companies already use it. | Still optimizes the Exa stack itself rather than neutral routing, spend policy, or evidence reuse across several providers.
Tavily | scale-up | LLM-oriented search, extract, crawl, map, and research endpoints tuned for agent workflows with simple integration. | 1,000 free API credits per month; pay-as-you-go at $0.008 per credit; enterprise custom. | Developer-friendly search surface with broad framework adoption and a clear product around agent retrieval. | Like Exa, Tavily is still a provider surface, not a control plane that arbitrates among providers and fallback modes.
Google Agent Search | incumbent | Enterprise search and grounded answers inside the Google Cloud stack, with grounding from Google Search and configurable search apps. | $1.50/1,000 standard queries; $4.00/1,000 enterprise queries; +$4.00/1,000 advanced generative-answer queries. | Distribution, enterprise packaging, and direct access to Google search grounding make it credible for large-company buyers. | Best aligned with Google-native deployments and internal-search use cases, not neutral routing across several external retrieval providers.
Browserbase | scale-up | Browser agent infrastructure with search, fetch, browser hours, captcha solving, and enterprise security controls for browser fallback. | Developer and startup plans meter browser hours plus $7/1k search calls and $0.5-$1/1k fetch calls, with enterprise custom. | Owns the authenticated browser-execution layer that search providers often hand off to when plain search fails. | It is a runtime for execution and capture, not a retrieval-routing layer that decides when to search, cache, or browse.

Why incumbents do not win by default

  • Search APIs. Exa and Tavily already optimize search quality, extraction, and research depth, but they are single-provider products that do not maximize multi-provider routing, evidence reuse, or neutral spend governance by default.
  • Cloud platforms. Google bundles grounding, enterprise search, and governance features, yet its economic and technical center of gravity is still the Google stack rather than a neutral control plane spanning several retrieval vendors and browser fallbacks.
  • Browser runtimes. Browserbase and Browserless are compelling for authenticated fallback and extraction, but they monetize execution primitives rather than query classification, evidence admissibility, or provider-agnostic search governance.
  • Gateway and observability vendors. Cloudflare proves that teams want analytics, fallbacks, custom cost accounting, and key management, but its scope is model-gateway infrastructure rather than a retrieval-specific orchestration layer across search, cache, and browser actions.
  • In-house. Tool-use frameworks and vendor SDKs make custom orchestration possible, but they push routing policy, failure handling, and auditability work back onto the platform team.

Business plan

This company should start as a provider-agnostic retrieval control plane for AI-native SaaS teams that already run production agents on live web data and feel search cost, answer-quality, and provenance pain at the same time. The first customer is a Series B-D AI-native SaaS company with 20-150 engineers, at least one retrieval-heavy research, sales-intel, or compliance workflow, and more than $25,000 per month of external search spend spread across Exa, Tavily, Google, or browser fallback. The initial product should sit as an API proxy and policy layer above those providers, deciding when to use fast search, deep search, cached evidence, or browser fallback while returning replayable citations and cost telemetry.

The wedge is attractive because buyers already see retrieval as metered infrastructure spend, so a pilot can be sold around measurable savings per successful agent task and faster debugging of stale or uncited answers. Research supports a meaningful market with an estimated $0.9B TAM, $180.0M beachhead SAM, and $5.4M reachable year-three SOM if the company can land roughly 60 customers at about $90k blended ACV. The deliberate tradeoff is not to build another search engine, broad enterprise search suite, or autonomous browser stack; the company should first win one control point where multi-provider routing and governance are more valuable than raw index ownership.

The biggest disconfirming risks are that providers bundle enough native governance to collapse the wedge, or that too few teams have enough routed-query spend to justify a standalone line item. The first 12 months should therefore focus on 3-5 paid design partners, proof of at least 25% retrieval-spend reduction on covered workflows, and evidence that browser fallback remains the exception rather than the margin-destroying default.

Problem

  • Retrieval-heavy agent products accumulate unpredictable search spend, stale or weak citations, and inconsistent failure handling once usage moves from pilot to production.
  • Platform teams can buy search APIs and browser runtimes, but they still lack a neutral control layer for routing, caching, provenance, budget caps, and source-policy enforcement across providers.

Solution

  • Provide an API proxy and dashboard that classifies each query and routes it to cached evidence, fast search, deep search, or constrained browser fallback based on freshness, confidence, and cost policy.
  • Normalize every retrieval step into replayable evidence traces with source metadata, policy logs, and per-task cost analytics so teams can debug bad answers and satisfy enterprise citation requirements.
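
A replayable evidence trace of the kind described above could be modeled as a list of retrieval steps that can be re-checked against policy after the fact. This is a minimal sketch under stated assumptions: the `EvidenceStep` and `EvidenceTrace` types, their fields, and the provider labels are hypothetical, and only the Python standard library is used.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse


@dataclass
class EvidenceStep:
    query: str
    provider: str         # hypothetical label, e.g. "cache" or "fast_search"
    source_url: str
    fetched_at: datetime
    cost_usd: float


@dataclass
class EvidenceTrace:
    task_id: str
    steps: list[EvidenceStep] = field(default_factory=list)

    def total_cost(self) -> float:
        """Per-task retrieval cost, summed over every step in the trace."""
        return sum(s.cost_usd for s in self.steps)

    def violations(self, approved_domains: set[str], max_age: timedelta,
                   now: datetime) -> list[str]:
        """Replay the trace against a source policy and a freshness limit."""
        found = []
        for s in self.steps:
            host = urlparse(s.source_url).hostname or ""
            # A host is approved if it equals an approved domain or is a subdomain of one.
            if not any(host == d or host.endswith("." + d) for d in approved_domains):
                found.append(f"unapproved domain: {host}")
            if now - s.fetched_at > max_age:
                found.append(f"stale source: {s.source_url}")
        return found


if __name__ == "__main__":
    now = datetime(2026, 5, 20, tzinfo=timezone.utc)
    trace = EvidenceTrace("task-1", [
        EvidenceStep("q", "fast_search", "https://docs.example.com/a", now, 0.007),
        EvidenceStep("q", "cache", "https://example.org/b", now - timedelta(days=3), 0.0),
    ])
    print(trace.violations({"example.com"}, timedelta(days=1), now))
```

Keeping the policy check separate from the stored trace means the same trace can be re-audited later under a different policy, which is the property an enterprise citation review would rely on.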

Why we win

  • The product is narrower than building another search engine and more neutral than any single provider stack, which makes the ROI case legible for buyers already juggling multiple retrieval vendors.
  • Defensibility compounds from routing-performance data, evidence-reuse patterns, and embedded source-policy configurations that improve with each covered workflow.
  • The first sale can be tied to a single painful workflow and a visible provider bill, which is faster to prove than a broader enterprise-search replacement motion.
Strategic choices
Beachhead: Series B-D AI-native SaaS companies with 20-150 engineers, one or more production research, sales-intel, or compliance agents, and at least 500,000 external web searches per day across Exa, Tavily, Google, or similar providers.
Wedge rationale: The narrow entry point is a retrieval-governance pilot for one live workflow where budget spikes and citation failures are already visible to the platform team; this creates faster proof than selling horizontal search governance to every enterprise app team at once.
Sequencing: Start with a thin proxy, provider analytics, replayable citations, and policy controls for the existing search stack so deployment is low-friction and savings are measurable. Add deeper caching, evals, browser fallback orchestration, and partner-led distribution only after pilots show that customers trust the routing layer and will pay for it as infrastructure rather than custom services.
Not yet:
  • Building a proprietary web index or crawler network
  • Selling to low-volume teams that have not yet reached budget pain
  • Broad internal enterprise search across HR, support, and knowledge management
  • Full autonomous browser execution for sensitive workflows without approval gates
Go-to-market
Wedge: Sell a paid retrieval-governance pilot for one production workflow, then convert the pilot into an always-on control plane once the customer trusts the routing rules and sees lower cost per successful agent task.
Channels:
  • Founder-led outbound into teams already using Exa, Tavily, Google Agent Search, or browser-based fallbacks
  • Co-selling and integration partnerships with search providers, browser-runtime vendors, and AI observability platforms
  • Technical content and benchmark reports aimed at platform engineers building retrieval-heavy agents
Funnel targets: lead→qualified pilot 15-25%, qualified pilot→paid pilot 40-50%, paid pilot→production 50%+, production→second workflow expansion 30%+ in year one
Pricing: Charge an annual platform subscription plus usage-based fees tied to managed query volume and retained evidence traces, because buyers already understand retrieval as metered infrastructure spend. Price the first paid pilot as a scoped savings and governance engagement, then convert to roughly $60k-$120k annual ACV for the first production workflow if the control plane reduces spend and improves citation replay speed.
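
The funnel targets above imply a lead volume for any customer goal. The sketch below works that out with exact fractions. Pairing the year-one funnel rates with the roughly 60-customer year-three SOM is an assumption made here for illustration, as is treating "50%+" as exactly 50%; the `leads_needed` helper is hypothetical.

```python
from fractions import Fraction
from math import ceil


def leads_needed(customers: int, *stage_rates_pct: int) -> int:
    """Leads required so that `customers` survive every funnel stage."""
    overall = Fraction(1)
    for pct in stage_rates_pct:
        overall *= Fraction(pct, 100)   # exact arithmetic avoids float rounding
    return ceil(customers / overall)


# Stages: lead -> qualified pilot, qualified pilot -> paid pilot, paid pilot -> production.
low_end = leads_needed(60, 15, 40, 50)    # pessimistic end of each stated range
high_end = leads_needed(60, 25, 50, 50)   # optimistic end of each stated range

print(f"Leads needed for 60 production customers: {high_end} to {low_end}")
```

Under these assumptions, 60 production customers require between 960 and 2,000 leads, which gives a rough sense of how much pipeline founder-led outbound and partners would need to generate.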
Product roadmap
MVP: A provider-agnostic proxy for Exa, Tavily, Google grounding, and one browser-fallback partner with query classification, cost accounting, cache controls, approved-domain policies, and replayable evidence traces. It must show which path was taken for each task, why it was chosen, and whether freshness and citation policy were met.
6 months: Ship production pilots with routing policies by query class, shared cache and evidence reuse, provider-level observability, and constrained browser fallback for the small subset of queries that cannot be resolved through search alone.
12 months: Add customer-specific eval harnesses, policy templates for regulated and customer-facing workflows, approval gates for risky fallback actions, and support for more provider mixes beyond the initial Exa-Tavily-Google stack.
24 months: Expand from one workflow wedge into a broader external-knowledge control plane with multi-team governance, benchmark data, enterprise retention controls, and adjacent products for retrieval evals and licensed-source orchestration.
Key bets:
  • Most target buyers can realize meaningful savings through better routing and evidence reuse before demanding proprietary search quality improvements.
  • Browser fallback is necessary often enough to matter strategically, but not so often that it destroys gross margin in the first wedge.
  • Replayable evidence traces and domain policies are strong enough enterprise differentiators to beat hand-built orchestration.
  • A thin proxy integration is easier for early customers to adopt than a full agent-stack rewrite.
Business model
Revenue streams:
  • Annual subscription for the retrieval control plane
  • Usage-based overage on managed query volume, evidence retention, and advanced policy controls
  • Enterprise add-ons for private deployment, retention controls, audit packaging, and advanced evaluation modules
Unit of value: Covered production workflow with included routed-query volume and evidence retention
Target gross margin: 72%
Expansion levers:
  • Add more retrieval-heavy workflows within the same customer after the first pilot converts
  • Expand provider coverage and policy depth across search, cache, and browser fallback paths
  • Sell higher-assurance governance, eval, and retention modules to larger enterprise deployments
Strategy map
North-star metric: Percent of covered agent tasks that meet freshness and citation policy at or below target retrieval cost
Input metrics:
  • Paid pilot to production conversion rate
  • Retrieval spend reduction per successful agent task on covered workflows
  • Median time to replay the retrieval chain behind a bad answer
  • Cache-hit rate on repeated high-value queries
  • Share of routed queries requiring browser fallback
Moats to build:
  • Historical routing graph showing which provider, depth, and fallback path works best by query class
  • Replayable evidence-trace corpus tied to customer outcomes and failure modes
  • Embedded domain, geography, and retention policies that become sticky inside customer workflows
Kill criteria:
  • Fewer than 3 paid design partners after 30 qualified platform-team conversations
  • Covered-workflow spend savings below 15% in the first 3 pilots
  • Paid pilot to production conversion below 40% after the first 5 pilots
  • Browser fallback required on more than 30% of covered queries without clear gross-margin offset
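
The north-star metric and the browser-fallback kill criterion above are both simple ratios over task records. The sketch below shows one way to compute them. The `TaskResult` fields are hypothetical, it counts fallback per task rather than per query, and it checks only the 30% threshold, not the "clear gross-margin offset" qualifier, which needs a business judgment.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    met_freshness: bool
    met_citation_policy: bool
    retrieval_cost_usd: float
    used_browser_fallback: bool


def north_star(tasks: list[TaskResult], target_cost_usd: float) -> float:
    """Share of covered tasks meeting freshness and citation policy at or below target cost."""
    if not tasks:
        return 0.0
    passing = sum(
        1 for t in tasks
        if t.met_freshness and t.met_citation_policy
        and t.retrieval_cost_usd <= target_cost_usd
    )
    return passing / len(tasks)


def browser_fallback_share(tasks: list[TaskResult]) -> float:
    """Share of covered tasks that needed browser fallback."""
    if not tasks:
        return 0.0
    return sum(1 for t in tasks if t.used_browser_fallback) / len(tasks)


def fallback_kill_triggered(tasks: list[TaskResult], threshold: float = 0.30) -> bool:
    """True when fallback share exceeds the 30% threshold named in the kill criteria."""
    return browser_fallback_share(tasks) > threshold


if __name__ == "__main__":
    sample = [
        TaskResult(True, True, 0.01, False),
        TaskResult(True, False, 0.01, False),
        TaskResult(True, True, 0.20, True),
        TaskResult(True, True, 0.02, True),
    ]
    print(north_star(sample, target_cost_usd=0.05), fallback_kill_triggered(sample))
```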

Milestones

0–12 months
  • Sign 3-5 paid design partners in the AI-native SaaS beachhead
  • Show at least 25% retrieval-spend reduction on one covered workflow in the first 3 pilots
  • Launch a production-ready proxy with replayable evidence traces and approved-domain policies
  • Convert at least 2 paid pilots into annual production contracts
12–24 months
  • Expand from the first workflow into at least 2 additional retrieval-heavy workflows per customer cohort
  • Add broader provider support, eval tooling, and higher-assurance governance modules
  • Establish 3 active ecosystem partners that source or accelerate pilots
  • Reach repeatable annual ACV in the modeled $60k-$120k range without services-heavy deployment
24–36 months
  • Standardize as the external-knowledge control plane for multi-team agent deployments
  • Launch enterprise retention and regional-policy packaging for broader international sales
  • Build a routing and evidence dataset large enough to support benchmark products and expansion modules
  • Reach the modeled year-three SOM path of roughly $5.4M in annualized revenue
Strategy map
flowchart LR
  Wedge[Retrieval-governance pilot] --> MVP[Provider-agnostic routing proxy]
  MVP --> Proof[Lower spend and replayable citations]
  Proof --> Expansion[More workflows and enterprise governance modules]

Founding team

Role | Start timing | Rationale
Founder CEO | Month 0 | Founder-led sales is required because the buyer is senior, the category needs education, and early pricing depends on quantified ROI.
Founding eng | Month 0 | The company needs a strong technical lead to build the proxy, routing logic, telemetry, and early integrations.
Applied ML and product engineer | Month 2 | Query classification, cache strategy, and retrieval evaluation are core product risks and need dedicated ownership early.
Solutions engineer | Month 6 | Early enterprise pilots will hinge on fast integration, workflow mapping, and proving value from customer traces.
Security and platform engineer | Month 9 | Retention controls, policy logging, and deployment hardening become critical once pilots move into procurement.
GTM lead | Month 12 | Add repeatable pipeline ownership only after pilot packaging and conversion evidence are clear.

Experiment roadmap

Horizon Experiment Hypothesis Success metric Owner
0–90 days ICP and budget-threshold discovery Retrieval-heavy AI-native SaaS teams with visible provider bills and customer-facing citations will describe a funded platform problem rather than a nice-to-have optimization. 15 discovery interviews completed, 8 matching the beachhead profile, and 5 sharing credible spend or failure baselines. Founder CEO
0–90 days Concierge routing benchmark Manually tuned routing across cache, fast search, deep search, and browser fallback can reduce cost per successful task by at least 25% on one workflow. 2 design partners benchmark at least 100 tasks each and both show 25%+ spend reduction without lower answer acceptance. Founding eng
90–180 days Thin-proxy pilot deployment Customers will deploy a provider-agnostic proxy faster than they would replace their retrieval stack outright. 3 paid pilots launch with time to first routed production task under 30 days. Founding eng
90–180 days Pricing and packaging test Platform subscription plus managed-query overage pricing converts better than pure consumption pricing. Preferred package wins in at least 4 of 6 budget-owner conversations and appears in 2 signed pilot scopes. Founder CEO
6–12 months Evidence-trace procurement validation Audit logs, approved-domain policies, and retention controls are sufficient to unblock security review for customer-facing AI use cases. At least 2 pilots complete security review without requiring customer-hosted deployment or custom one-off controls. Security platform lead
12–18 months Partner-sourced expansion motion Search-provider, browser-runtime, and observability partners can source qualified pilots with conversion comparable to founder-led outbound. 25% of qualified pipeline comes from 3 active partners and pilot conversion is no worse than direct outbound. GTM lead
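
The concierge routing benchmark's success metric can be checked mechanically once each design partner logs per-task spend and acceptance. The sketch below assumes that cost per successful task means total retrieval spend divided by accepted tasks; the `TaskRun` record, the `passes_benchmark` helper, and the sample figures are hypothetical and are not taken from the plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One benchmarked agent task (hypothetical record shape)."""
    spend_cents: int   # retrieval spend attributed to this task, in cents
    accepted: bool     # whether the answer was accepted


def cost_per_successful_task(runs):
    """Total spend divided by accepted tasks; failed tasks still cost money."""
    accepted = sum(1 for r in runs if r.accepted)
    if accepted == 0:
        raise ValueError("no accepted tasks in benchmark")
    return sum(r.spend_cents for r in runs) / accepted


def acceptance_rate(runs):
    """Share of benchmarked tasks whose answer was accepted."""
    return sum(1 for r in runs if r.accepted) / len(runs)


def passes_benchmark(baseline, routed, min_tasks=100, min_reduction=0.25):
    """Roadmap bar: 100+ tasks, 25%+ lower cost, no drop in acceptance."""
    if len(baseline) < min_tasks or len(routed) < min_tasks:
        return False
    reduction = 1 - cost_per_successful_task(routed) / cost_per_successful_task(baseline)
    return reduction >= min_reduction and acceptance_rate(routed) >= acceptance_rate(baseline)


# Hypothetical sample: 100 tasks per arm, 80 vs. 82 accepted answers.
baseline = [TaskRun(40, i < 80) for i in range(100)]
routed = [TaskRun(28, i < 82) for i in range(100)]
print(passes_benchmark(baseline, routed))
```

Dividing by accepted tasks rather than all tasks keeps a router from looking cheaper simply by answering fewer tasks well.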

Risk assessment

Business plan risks — 4 mapped
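Risk matrix (impact by likelihood): R1 is high likelihood and high impact; R2 and R3 are medium likelihood and high impact; R4 is medium likelihood and medium impact.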
  1. R1. Providers bundle routing, caching, or governance features fast enough to compress the standalone wedge. · High likelihood / High impact. Mitigation: Differentiate on multi-provider neutrality, workflow-specific policy controls, and evidence reuse across several backends rather than features tied to one provider.
  2. R2. Too few target teams have enough routed-query spend to support a new infrastructure vendor. · Medium likelihood / High impact. Mitigation: Focus on accounts already spending materially on retrieval, require a measurable spend baseline before pilots, and avoid low-volume segments.
  3. R3. Browser fallback becomes more common than expected and damages gross margin and deployment simplicity. · Medium likelihood / High impact. Mitigation: Partner for browser execution, meter fallback separately, and prioritize better routing and caching before expanding browser-heavy workflows.
  4. R4. Security and compliance reviews slow deals because logs, artifacts, and fallback paths feel risky. · Medium likelihood / Medium impact. Mitigation: Ship retention controls, domain restrictions, approval gates, and auditable routing logs in the MVP rather than as later enterprise features.
First customer
Title Head of AI Platform at an AI-native SaaS company
Profile A Series B-D software company with 50-200 employees, 20-150 engineers, and a production research, sales-intel, or compliance agent that already uses one or more external search providers.
Trigger Product GA, enterprise rollout, or a customer escalation exposes rising retrieval bills and stale or weakly cited answers.
Buyer VP Engineering or Head of AI Platform
Initial contract $20k-$40k paid pilot for one workflow and baseline measurement, converting to roughly $60k-$120k annual ACV once routing policies and evidence traces are trusted in production.

What must be true

  • A meaningful subset of target buyers already spends enough on external retrieval to justify a standalone control-plane line item.
  • A thin proxy can cut covered-workflow retrieval spend by at least 25% without reducing answer acceptance or freshness.
  • Buyers care enough about replayable citations and policy controls to choose a neutral layer over deeper commitment to one provider.
  • Browser fallback remains a minority of routed queries in the first wedge so gross margin stays above software-infrastructure norms.
  • Provider and framework churn does not force so much integration work that the business becomes services-heavy.
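
The browser-fallback condition above is a blended-cost argument: each routed query lands on one execution path, and gross margin is one minus the share-weighted cost divided by price. The sketch below illustrates that relationship only. Every price, per-path cost, and mix in it is a hypothetical placeholder, not a figure from the plan or from any provider's price list.

```python
# All prices, costs, and mixes are hypothetical illustration values.
PRICE_PER_1K_QUERIES_USD = 10.0

COST_PER_1K_USD = {
    "cache": 0.2,
    "fast_search": 2.0,
    "deep_search": 5.0,
    "browser_fallback": 30.0,
}


def blended_gross_margin(mix, cost_per_1k=COST_PER_1K_USD,
                         price_per_1k=PRICE_PER_1K_QUERIES_USD):
    """Return 1 - (share-weighted cost / price) for a routing mix.

    `mix` maps each execution path to its share of routed queries.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("path shares must sum to 1")
    blended_cost = sum(share * cost_per_1k[path] for path, share in mix.items())
    return 1.0 - blended_cost / price_per_1k


# Three hypothetical mixes that differ only in how much traffic moves
# from fast search to browser fallback.
mixes = {
    "2% browser": {"cache": 0.30, "fast_search": 0.48, "deep_search": 0.20, "browser_fallback": 0.02},
    "5% browser": {"cache": 0.30, "fast_search": 0.45, "deep_search": 0.20, "browser_fallback": 0.05},
    "10% browser": {"cache": 0.30, "fast_search": 0.40, "deep_search": 0.20, "browser_fallback": 0.10},
}

margins = {name: blended_gross_margin(mix) for name, mix in mixes.items()}
for name, margin in margins.items():
    print(f"{name}: {margin:.1%} gross margin")
```

Because the browser path is assumed to cost far more per query than any other path, small shifts in its share move the blended margin more than equal shifts elsewhere, which is why the plan meters fallback separately.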

Open diligence questions

  • What monthly routed-query spend or failure rate actually unlocks budget for a dedicated control plane?
  • In the first ten real prospects, which KPI matters most: spend saved, citation replay speed, or lower bad-answer incidence?
  • How often do target teams need browser fallback after query routing and caching are improved?
  • Which provider combinations dominate the beachhead accounts and therefore determine MVP integration scope?
  • What native governance features are Exa, Tavily, Google, and gateway vendors already shipping into the same workflow?
Investor verdict
Call Meet / investigate further
Conviction Strong wedge and market timing, but conviction depends on proving standalone budget thresholds and keeping browser-heavy workloads from eroding margins.
Why believe The plan targets a real production bottleneck where buyers already see metered retrieval costs and lack a neutral cross-provider governance layer.
Why doubt Search providers, clouds, and browser vendors can all bundle adjacent controls, so the startup must prove that neutrality and workflow data create enough value before incumbents close the gap.
Next diligence Confirm 3 paid pilots with customers already spending meaningfully on external retrieval and show at least 25% spend reduction plus sub-10-minute citation replay on one workflow.
Section

Financial model

3-year totals
Year 1 revenue $54K · EBITDA -$928K · Cash EOP $2.37M
Year 2 revenue $520K · EBITDA -$1.28M · Cash EOP $1.10M
Year 3 revenue $1.81M · EBITDA -$733K · Cash EOP $363K
Unit economics
ARPU (annual) $110K
Gross margin 72%
CAC $76K · Payback 11.6 months
LTV / CAC 6.9x · LTV $528K
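
The unit-economics figures can be cross-checked with a few lines of arithmetic. The sketch below assumes the common definitions LTV = monthly gross profit / monthly churn and payback = CAC / monthly gross profit; the report does not state its formulas, but these definitions reproduce its figures when combined with the 1.25% monthly churn (A15) and $76.32K CAC (A16) from the key assumptions.

```python
# Inputs taken from the report's unit economics and key assumptions.
ARPU_ANNUAL_USD = 110_000   # blended annual ACV
GROSS_MARGIN = 0.72         # target gross margin
CAC_USD = 76_320            # blended CAC per customer (A16)
MONTHLY_CHURN = 0.0125      # steady-state monthly churn (A15)

# Assumed definitions (not stated in the report).
monthly_gross_profit = ARPU_ANNUAL_USD * GROSS_MARGIN / 12
ltv = monthly_gross_profit / MONTHLY_CHURN
ltv_to_cac = ltv / CAC_USD
payback_months = CAC_USD / monthly_gross_profit

print(f"Monthly gross profit: ${monthly_gross_profit:,.0f}")
print(f"LTV: ${ltv:,.0f}")
print(f"LTV / CAC: {ltv_to_cac:.1f}x")
print(f"Payback: {payback_months:.1f} months")
```

Under these assumptions the results round to $528K LTV, 6.9x LTV / CAC, and 11.6 months payback, matching the figures above.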
Funding ask
Round pre-seed · $3.3M
Runway 36 months
Milestone Reach 16 paying customers by Q2Y3, prove blended production ACV above $100K with 72% gross margin, and have 3 active ecosystem partners while still holding roughly 6 months of buffer into Q4Y3.

Model sanity

  • Revenue engine. Base-case revenue comes from growing from 4 paying accounts in Y1 to 24 by Q4Y3 while blended ACV rises from $36K pilot pricing to $110K production contracts.
  • Must go right. Founder-led pilots need to convert into production before the GTM team scales, because the sales cycle is the largest revenue swing factor in the sensitivity analysis (-$385K in Y3) and the second-largest cash swing after CAC (-$290K versus -$375K).
  • Model breaks if. The downside case turns cash negative if buyers will not support a $60K-$120K annual control-plane budget or if browser fallback keeps gross margin below about 68%.
  • Next-round proof. The strongest next financing story is 16 paying customers by Q2Y3, 72% gross margin, and 3 active ecosystem partners that show the motion is no longer purely founder-sourced.
Figure: Revenue, cash, and EBITDA, 12-month Y1 + 8-quarter Y2/Y3. Y-axis runs from $0K to $4.00M; x-axis runs from M1 through Q4Y3. Series: Revenue (line, area), Cash EOP (dashed), EBITDA (bars, gray = loss).
Use of funds — $3.3M pre-seed
Engineering · 45% GTM · 28% G&A · 9% Buffer (6 mo) · 18%
Headcount build by role (peak 9 FTE)
FTE by quarter: Q1Y1 3 · Q2Y1 4 · Q3Y1 5 · Q4Y1 6 · Q1Y2 6 · Q2Y2 6 · Q3Y2 6 · Q4Y2 8 · Q1Y3 8 · Q2Y3 8 · Q3Y3 8 · Q4Y3 9
  • Founder / CEO
  • Founding engineer
  • Applied ML / product engineer
  • Solutions engineer
  • Security / platform engineer
  • GTM lead
  • Customer success / solutions architect
  • Account executive
  • Platform engineer
Year-3 scenarios — base / downside / upside
Scenario Y3 revenue Y3 EBITDA Cash low point Description
Downside $1.25M -$1.19M -$156K Slower pilot conversion and weaker standalone budget proof keep the company below the intended production-customer ramp.
Base $1.81M -$733K $363K Four paid pilots in Y1 convert into a measured but repeatable production ramp, and revenue grows through higher ACV plus modest customer count expansion.
Upside $2.46M -$220K $879K Design partners convert faster, partner channels contribute earlier, and production pricing expands with stronger usage and governance attach rates.
Sensitivity — Y3 cash and revenue impact, sorted by magnitude
Variable Downside Upside Cash impact Revenue impact
CAC $95K CAC per new customer $65K CAC per new customer -$375K $0K
sales cycle Pilot-to-production close slips by one quarter Partner-sourced deals close in under two months after pilot proof -$290K -$385K
hiring pace Pull the platform engineer and a second field-facing hire forward by two quarters Delay the Y3 platform hire by one quarter and cover overflow with partners -$140K -$60K
ARPU $100K blended annual ACV in Y3 $120K blended annual ACV in Y3 -$118K -$165K
churn 2.0% monthly churn after the first contract term 1.0% monthly churn -$90K -$120K
gross margin 68% 74% -$73K $0K

Scenarios

Scenario Y3 revenue Y3 EBITDA Cash low point Description Key changes
Downside $1.25M -$1.19M -$156K Slower pilot conversion and weaker standalone budget proof keep the company below the intended production-customer ramp.
  • Y2 exits at 8 paying customers and Y3 exits at 18 instead of 24
  • Blended annual ACV reaches only $75K in Y2 and $100K in Y3
  • Gross margin tops out around 68% because browser fallback and provider costs stay elevated
Base $1.81M -$733K $363K Four paid pilots in Y1 convert into a measured but repeatable production ramp, and revenue grows through higher ACV plus modest customer count expansion.
  • Y2 exits at 10 paying customers and Y3 exits at 24
  • Blended annual ACV rises from $36K pilot pricing to $80K in Y2 and $110K in Y3
  • Gross margin reaches the 72% business-plan target once routing and caching mature
Upside $2.46M -$220K $879K Design partners convert faster, partner channels contribute earlier, and production pricing expands with stronger usage and governance attach rates.
  • Y2 exits at 12 paying customers and Y3 exits at 30
  • Blended annual ACV reaches $85K in Y2 and $120K in Y3
  • Gross margin improves to 74% as cache hit rates and query routing outperform plan

Sensitivity

Variable Downside Base Upside
ARPU $100K blended annual ACV in Y3 $110K blended annual ACV in Y3 $120K blended annual ACV in Y3
CAC $95K CAC per new customer $76.32K CAC per new customer $65K CAC per new customer
churn 2.0% monthly churn after the first contract term 1.25% monthly churn 1.0% monthly churn
sales cycle Pilot-to-production close slips by one quarter Paid pilots convert inside two quarters Partner-sourced deals close in under two months after pilot proof
gross margin 68% 72% 74%
hiring pace Pull the platform engineer and a second field-facing hire forward by two quarters Add only one platform engineer in Y3 after Y2 production proof Delay the Y3 platform hire by one quarter and cover overflow with partners
Key assumptions (18)
ID Name Value Unit Source
A1 Model start month 2026-06 YYYY-MM [BP date 2026-05-21] the model starts the month after the business-plan date so the round closes before the hiring ramp.
A2 Opening cash 3300.0 USDK [BP fundingAsk targetFundingRangeUsd $2-4M] base case uses a $3.3M pre-seed, inside the stated range and large enough to reach the next milestone plus six months of buffer.
A3 Customer unit in the model active paying customer account definition [BP market.som 60 customers at about $90k blended ACV; BP businessModel.unitOfValue covered production workflow] customersEop tracks paying accounts, with expansion reflected through rising ARPU.
A4 Starting customers (M1) 0 count [BP milestones 0-12 months] the company starts pre-revenue before signing paid design partners.
A5 Y1 new paying customers by month [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0] count [BP milestones sign 3-5 paid design partners and convert at least 2 pilots] base case closes four paid pilots across the back half of Y1.
A6 Y2 new paying customers by quarter [1, 1, 2, 2] count [BP milestones 12-24 months] assumes a modest founder-led ramp from pilot conversions plus the first GTM support hire.
A7 Y3 new paying customers by quarter [3, 3, 4, 4] count [BP market.som 60 customers at about $90k ACV] base case ends Y3 at 24 customers, deliberately below the SOM path to stay conservative for a founder-led infrastructure sale.
A8 Annual price ladder Y1 36.0; Y2 80.0; Y3 110.0 annual USDK per customer [BP investorMemo.initialContract $20k-$40k paid pilot; BP gtm.pricing $60k-$120k annual ACV plus usage fees] pilots start near the midpoint, then production pricing moves into the target range with usage and governance add-ons.
A9 Revenue recognition policy average active customers in period multiplied by annualized contract value formula Startup-finance heuristic: new enterprise customers usually activate mid-period on average, so recognized revenue uses ((BoP + EoP) / 2) x annualized price.
A10 Gross margin ramp Y1 50-60 on paid months; Y2 62, 65, 68, 70 by quarter; Y3 71, 72, 72, 72 by quarter percent [BP businessModel.targetGrossMarginPct 72; BP risks on browser fallback; Research sensitivityCases browser fallback] margin climbs only as routing and caching reduce expensive browser and provider overuse.
A11 Loaded salary bands Founder 180; founding eng 180; applied ML/product eng 170; solutions eng 145; security/platform eng 170; GTM lead 170; customer success 130; AE 155; platform eng 160 annual USDK per FTE [BP team roles] plus startup-finance heuristic for seed-stage U.S.-centric infrastructure startups including payroll tax and benefits load.
A12 Hiring schedule founder and founding eng in M1; applied ML/product eng in M2; solutions eng in M6; security/platform eng in M9; GTM lead in M12; customer success/solutions architect in M15; AE in M18; platform engineer in M27 timing [BP team startTiming through Month 12] plus startup-finance heuristic that later hires wait for pilot proof and partner-sourced pipeline.
A13 Payroll allocation policy founder 70% S&M and 30% G&A; applied ML/product eng 80% R&D and 20% S&M; solutions eng 50% S&M, 30% R&D, 20% G&A; security/platform eng 85% R&D and 15% G&A; customer success 65% S&M, 20% R&D, 15% G&A; engineering fully R&D; GTM and AE fully S&M policy [BP team rationales; BP gtm; BP operations] reflects founder-led selling, implementation-heavy pilots, and an initially product-weighted org.
A14 Non-payroll operating expense ramp S&M 5-20 monthly; R&D 10-22 monthly; G&A 7-15 monthly USDK per month [BP operations; BP fundingAsk.useOfFundsSummary] plus startup-finance heuristic for cloud spend, eval runs, travel, legal, and compliance tooling in a lean pre-seed plan.
A15 Steady-state monthly churn 1.25 percent Startup-finance heuristic: annual infrastructure contracts with observable ROI should retain well, but the early category and provider bundling risk keep modeled churn above best-in-class infra retention.
A16 Blended CAC per customer 76.32 USDK Calculated from modeled Y2-Y3 sales and marketing spend of $1.526M divided by 20 new customers; consistent with founder-led, technical enterprise sales.
A17 Funding sizing rule reach the next fundable milestone and hold six months of buffer policy Developer instruction plus [BP fundingAsk runwayMonths 18] base case sizes the round to reach mid-Y3 proof points with H2Y3 buffer, not just to launch the first pilots.
A18 Cash flow simplification ending cash equals opening cash plus cumulative EBITDA formula Startup-finance heuristic: the model assumes an asset-light software business with negligible capex, debt, and working-capital distortion.
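
Several of these assumptions are formulas that can be replayed against the 3-year totals. The sketch below applies A9 (average active customers times annualized price) to the A5 monthly adds, A16 (sales and marketing spend divided by new customers), and A18 (opening cash plus cumulative EBITDA). It assumes no churn during Y1, because the model applies churn only after the first contract term, and it uses the rounded figures printed in this report, so later-year results can drift by a few thousand dollars. The function name `recognized_revenue` is hypothetical.

```python
# Inputs copied from the key assumptions and 3-year totals (USD thousands).
Y1_NEW_CUSTOMERS = [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]   # A5, by month
Y2_NEW_CUSTOMERS = [1, 1, 2, 2]                            # A6, by quarter
Y3_NEW_CUSTOMERS = [3, 3, 4, 4]                            # A7, by quarter
Y1_ANNUAL_PRICE = 36.0                                     # A8
OPENING_CASH = 3300                                        # A2
EBITDA_BY_YEAR = [-928, -1280, -733]                       # rounded totals
Y2_Y3_SALES_MARKETING = 1526                               # A16, rounded


def recognized_revenue(new_by_period, annual_price, periods_per_year, opening=0):
    """A9: revenue = ((BoP + EoP) / 2) x annualized price, per period.

    Ignores churn (an assumption for this sketch). Returns the total
    recognized revenue and the end-of-period customer count.
    """
    revenue = 0.0
    bop = opening
    for new in new_by_period:
        eop = bop + new
        revenue += (bop + eop) / 2 * annual_price / periods_per_year
        bop = eop
    return revenue, bop


y1_revenue, y1_customers = recognized_revenue(Y1_NEW_CUSTOMERS, Y1_ANNUAL_PRICE, 12)

# A16: blended CAC over the 20 customers added in Y2 and Y3.
new_customers = sum(Y2_NEW_CUSTOMERS) + sum(Y3_NEW_CUSTOMERS)
blended_cac = Y2_Y3_SALES_MARKETING / new_customers

# A18: ending cash equals opening cash plus cumulative EBITDA.
cash_eop = []
cash = OPENING_CASH
for ebitda in EBITDA_BY_YEAR:
    cash += ebitda
    cash_eop.append(cash)

print(y1_revenue, y1_customers, round(blended_cac, 1), cash_eop)
```

With these inputs the Y1 revenue comes out at $54K on 4 paying accounts, matching the 3-year totals. The cash walk gives $2,372K, $1,092K, and $359K against the reported $2.37M, $1.10M, and $363K; the small gaps are consistent with the EBITDA inputs being rounded.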
unit economics flow
flowchart LR
  QualifiedTeams[Qualified platform teams] --> PaidPilots[Paid pilots]
  PaidPilots --> ProductionAccounts[Production accounts]
  ProductionAccounts --> Revenue[Subscription and usage revenue]
  Revenue --> GrossProfit[Gross profit after provider and browser costs]
  GrossProfit --> Cash[Cash runway]

Flags: The model still exits Y3 EBITDA-negative, so it assumes the company raises again off proof of efficient growth rather than self-funding. · Revenue per FTE only reaches the low end of software benchmarks, which leaves little room for extra services work or a faster support ramp. · Gross margin assumes browser fallback remains a minority path; if it stays above the business-plan risk threshold, the funding ask would need to move higher.

Section

Top risks

  • Provider bundling. Search providers could add their own routing, caching, and observability features and squeeze the middleware layer. Mitigation: Focus on multi-provider governance, replayability, and source-policy controls that customers need across several backends, not just one API.
  • Weak early ROI. Teams below meaningful query volume may not feel enough pain to buy a dedicated retrieval control plane. Mitigation: Target customers already spending $25,000+ per month on search and sell a rapid pilot around hard dollar savings and citation reliability.
  • Fast-moving agent stacks. Framework churn could make deeply embedded integrations expensive to maintain. Mitigation: Integrate at the API proxy and telemetry layer so the product remains useful across changing models, frameworks, and search vendors.