BizIdea

AGENT-TO-AGENT ai-infra Scan 2026-04-25 to 2026-04-25 Run 20260426183436

MCP-native policy and benchmarking rail that lets enterprise buying agents negotiate tail-spend purchases safely.

Enterprises want AI agents to handle low-value purchasing, but today's buying workflows still rely on email, spreadsheets, and manual approvals because nobody trusts an agent to negotiate and commit spend. Anthropic's Project Deal shows agents can already clear real transactions, yet it also shows stronger models get better prices and weaker-agent users may not notice the loss.

Overall rating 3.6 / 5.0
  1. Market · 3/5
     $262.7M TAM and $131.3M SAM in a 10.0% CAGR category, but five credible incumbents make procurement controls a competitive market.

  2. Differentiation · 4/5
     A neutral, MCP-native control plane is a sharper wedge than suite-bound AI, with a benchmark dataset that could deepen over time.

  3. Execution · 3/5
     Planned hiring and milestones are clear, with 76% gross margin, 5.6x LTV/CAC, and 9.8-month payback, but four model flags remain.

  4. Timeliness · 5/5
     Anthropic's Project Deal and seven recent signals make the why-now unusually strong around agent commerce, oversight, and MCP adoption.

Why now

  1. Project Deal moved agent commerce from thought experiment to real-world proof by showing autonomous agents can complete real low-stakes transactions.
  2. The hidden pricing disadvantage from weaker agents creates an immediate need for benchmarking, disclosure, and policy controls before finance teams trust autonomous buying.
  3. MCP and portable agent skills mean a control layer can now sit above many agent and tool vendors instead of being built as a one-off integration.
  4. Anthropic's own agent work keeps showing that permissions, checkpoints, sandboxing, and human review are the gating factors for consequential autonomous actions.
  5. Buyers are already shifting from chatbots to delegated work products, so procurement is next if someone can make the transactions governable.

Catalyst. Anthropic's Project Deal proved agent commerce can clear real transactions, while MCP standardization and Anthropic's own trust-and-permission work make a dedicated governance layer newly feasible and newly urgent.

The idea

The product gives every enterprise buying agent a governed transaction envelope with spend limits, approved vendor lists, negotiation instructions, and escalation rules. It plugs into procurement systems, ERPs, and vendor channels through MCP where available, and falls back to email/browser automation where standards are missing. During each negotiation, it benchmarks quotes against prior deals and market baselines, flags likely weak-agent outcomes, and requires approval when variance or risk crosses a threshold. Every transaction produces an auditable transcript, policy decision log, and supplier-ready order package for finance. Over time, the company builds a proprietary dataset of agent-vs-agent deal outcomes that becomes the default benchmarking and trust layer for autonomous procurement.
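
The governed transaction envelope described above can be sketched as a small policy check. This is a minimal illustration, not the product's design: the class names, the 10% variance threshold, and the choice to reject unapproved vendors while escalating over-limit or above-benchmark quotes are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    ESCALATE = "escalate"
    REJECT = "reject"


@dataclass(frozen=True)
class Envelope:
    """Hypothetical policy envelope attached to one buying agent."""
    spend_limit: float                # hard cap per transaction, in USD
    approved_vendors: frozenset       # vendor identifiers the agent may buy from
    max_variance: float = 0.10        # assumed: allowed premium over benchmark


@dataclass(frozen=True)
class Quote:
    vendor: str
    total: float            # negotiated total, in USD
    benchmark_total: float  # baseline from prior deals or market data


def evaluate(envelope: Envelope, quote: Quote) -> tuple:
    """Return (Decision, reason); the reason is meant for a policy decision log."""
    if quote.vendor not in envelope.approved_vendors:
        return Decision.REJECT, "vendor not on approved list"
    if quote.total > envelope.spend_limit:
        return Decision.ESCALATE, "quote exceeds spend limit"
    if quote.benchmark_total <= 0:
        return Decision.ESCALATE, "no usable benchmark"
    variance = (quote.total - quote.benchmark_total) / quote.benchmark_total
    if variance > envelope.max_variance:
        return Decision.ESCALATE, f"price {variance:.0%} above benchmark"
    return Decision.APPROVE, "within policy and benchmark"


# Example: a quote 20% above benchmark is routed to a human.
envelope = Envelope(spend_limit=5000.0, approved_vendors=frozenset({"acme-supplies"}))
print(evaluate(envelope, Quote("acme-supplies", 1200.0, 1000.0)))
```

Returning a reason string alongside the decision keeps each outcome loggable, which matches the audit-trail requirement in the paragraph above.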

What's different. This is not another AI procurement copilot; it is the transaction control plane for any procurement copilot or agent. The wedge is where trust actually breaks: permissions, benchmarking, model-quality disclosure, and exception routing at the moment an agent tries to negotiate or spend money. Because it is MCP-native and model-agnostic, it can become the neutral trust layer across many agent vendors rather than a single-agent application. Its long-term moat is the outcome dataset linking agent configuration, negotiation behavior, supplier type, and realized price quality.

Startup thesis
Beachhead: Mid-market companies that want employees to use AI assistants to source office equipment, developer peripherals, lab consumables, and swag purchases under $5,000 from approved vendors
Wedge: An MCP-native control plane that sits between employee buying agents and vendor endpoints to enforce spend policy, benchmark negotiated prices, disclose agent/model quality, and route only exception cases to humans
Non-obvious insight: The first real market for agent-to-agent commerce is not an open consumer bazaar; it is enterprise tail-spend, where low-stakes purchases happen constantly, savings are measurable, and finance teams will pay for a control layer once they realize weaker agents can quietly lose money.
Venture-scale path: Start with tail-spend procurement guardrails, then expand into supplier onboarding, contract and services buying, autonomous accounts payable, and the broader trust layer for cross-vendor agent transactions.
Target user
Primary user: Procurement operations managers at 500-5,000 employee tech, biotech, and R&D-heavy companies with large tail-spend volume
Secondary user: Finance systems leaders rolling out internal AI assistants for employee purchasing workflows
Economic buyer: VP Finance or Head of Procurement
Go-to-market seed
First customer: A 1,000-employee AI-native software company whose procurement team already manages high-volume employee purchases for laptops, monitors, dev tools, and office equipment, and wants to pilot AI-assisted buying under strict spend caps
Buying trigger: Leadership approves internal use of AI assistants for delegated work and finance is asked to support autonomous purchasing without increasing control risk
Current alternative: Manual procurement workflow plus incumbent procurement software and ad hoc human negotiation over email
Switching reason: The first customer switches because this lets them automate high-volume low-dollar purchases while keeping policy enforcement, price protection, and human oversight that current suites do not provide for agent-led transactions
Pricing hypothesis: SaaS platform fee based on annual autonomous spend under management, with a minimum platform subscription plus usage-based pricing per completed transaction

Jobs to be done

Job | Current alternative | Success metric
When my company wants to let employees use AI to buy routine items, help me enforce policy and catch bad deals, so we can automate purchasing without creating finance risk. | Manual approvals inside procurement software plus email negotiation | Percent of tail-spend transactions completed autonomously within policy at equal or better unit pricing
When an internal buying agent negotiates with suppliers, help me see whether the agent got a market-competitive outcome, so I can trust autonomous purchasing instead of second-guessing every order. | Spot-checking a few quotes manually or relying on supplier list prices | Savings or avoided overpayment per transaction relative to baseline workflow
Governed autonomous procurement
flowchart LR
  Buyer[Head of Procurement] --> Pain[Unsafe invisible agent overpayment]
  Pain --> Product[Agent procurement control plane]
  Product --> Outcome[Autonomous tail-spend with policy and audit]
Idea scorecard: average 4.6 / 5 across 5 axes
  • Signal · 5/5. The core wedge maps directly to multiple strong signals, especially real-world agent commerce, invisible quality gaps, interoperability, and oversight bottlenecks.
  • Pain · 4/5. Procurement teams already feel the pain of manual tail-spend and will feel sharper risk once AI assistants are asked to transact autonomously.
  • Wedge · 5/5. The entry product is specific: policy, benchmarking, and approval controls for autonomous tail-spend procurement.
  • Defense · 4/5. Defensibility comes from benchmark data, deep workflow integrations, and the trust position at the transaction boundary.
  • Scale · 5/5. Procurement is a large spend category and the same control plane can expand into broader agent commerce, supplier onboarding, and financial workflows.
Business model canvas
Key partners
  • Procurement suites
  • ERP and finance system integrators
  • Supplier network and marketplace platforms
Key activities
  • Integrating buyer and vendor systems
  • Benchmarking transaction outcomes
  • Running trust, eval, and policy models
Key resources
  • MCP and procurement integrations
  • Deal outcome benchmark dataset
  • Policy engine and approval workflow infrastructure
Value propositions
  • Safe autonomous purchasing with policy enforcement
  • Price benchmarking that catches weak-agent negotiations
  • Audit-ready transaction logs for finance and compliance
Customer relationships
  • High-touch implementation
  • Shared policy tuning and rollout governance
  • Ongoing benchmark reviews tied to savings
Channels
  • Direct sales to finance and procurement leaders
  • Partnerships with procurement software and ERP integrators
  • Design-partner pilots with AI-native companies
Customer segments
  • Mid-market procurement teams adopting AI assistants
  • Finance leaders responsible for spend controls
  • AI-first enterprises with large tail-spend purchasing volume
Cost structure
  • Model and inference costs
  • Integration engineering
  • Enterprise implementation and support
Revenue streams
  • Platform subscription
  • Usage fees per completed transaction
  • Premium analytics and benchmarking modules

Market

Market sizing
TAM (total addressable): $262.7M. 43,779 US firms with 500+ employees x 15% target-sector share x $40k ACV.
SAM (serviceable available): $131.3M. TAM x 50% AI-ready filter.
SOM (serviceable obtainable): $2.4M. 60 customers x $40k ACV.
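
The sizing arithmetic above can be reproduced directly. The sketch below uses only the inputs stated in this section; the variable names and the formatting helper are illustrative.

```python
# Inputs as stated in the market sizing lines.
US_FIRMS_500_PLUS = 43_779
TARGET_SECTOR_SHARE = 0.15
ACV_USD = 40_000
AI_READY_FILTER = 0.50
SOM_CUSTOMERS = 60

target_firms = round(US_FIRMS_500_PLUS * TARGET_SECTOR_SHARE)  # firms in the TAM

tam = US_FIRMS_500_PLUS * TARGET_SECTOR_SHARE * ACV_USD
sam = tam * AI_READY_FILTER
som = SOM_CUSTOMERS * ACV_USD


def fmt_millions(amount_usd: float) -> str:
    """Format a dollar amount in millions with one decimal place."""
    return f"${amount_usd / 1e6:.1f}M"


print(target_firms, fmt_millions(tam), fmt_millions(sam), fmt_millions(som))
```

On these inputs the SOM of 60 customers is a little under 2% of the roughly 3,280 firms implied by the SAM.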

Executive takeaways

  • Project Deal proves agent commerce works but also surfaces hidden overpayment risk from weaker agents [1].
  • The credible beachhead is governed tail-spend, not open consumer commerce [14][20][22].
  • Incumbents are shipping AI fast, but mostly inside their own suites rather than as a neutral control plane [9][15][16][21].
  • Security, prompt injection, privacy, and human-oversight demands are category-defining constraints [4][25][26].
  • A conservative US beachhead still supports a few-hundred-million-dollar TAM, with larger top-down markets behind it [11][27][28][29].

Market definition

US-first control software for AI-assisted or autonomous enterprise purchasing in tail-spend workflows; excludes full S2P suites, ERP, cards, and consumer shopping agents [14][20][22][28].

Customer and buyer

Primary user is procurement ops or finance systems; buyer is VP Finance or Head of Procurement [14][20].

Buying triggers

  • AI assistants get approved for delegated work and finance is asked to keep controls intact. [1][4][15]
  • Manual tail-spend workflows create rogue spend, AP mismatches, and slow approvals. [14][20]

Willingness to pay

Participants in Anthropic's Project Deal said they would pay for delegated buying help, and adjacent vendors already monetize procurement savings and control [1][11][13].

Category dynamics

Growth signal 10.0% CAGR

Tailwinds

  • Procurement software remains a growing market.
  • MCP and AI feature launches expand category awareness.

Headwinds

  • Hidden model-quality gaps and security risk slow adoption.

Validation signals

  • Project Deal completed 186 real transactions and found willingness to pay.
  • Zip claims 10M insights and 60+ integrations.
  • Tropic reached 150 customers and shows public pricing.
  • Tonkean acquired Cinch to deepen spend intelligence.
  • Fairmarkit and Sievo/Pactum show overlay adoption.

Regulatory & technical constraints

  • Prompt injection, privacy, and unintended actions are core product risks.
  • Human oversight and explicit risk-management controls are required for buyer trust.
  • Interoperability remains uneven across vendor endpoints.
Agentic procurement market map
Figure: 2x2 map with specialization (low to high) on one axis and human-led workflows to agent-led execution on the other; Q1 is marked as the winning zone. Plotted: Proposed startup, Zip, Tropic, Order.co, Tonkean, Fairmarkit.

Competition

Zip, Tropic, Order.co, Tonkean, and Fairmarkit are the closest priority competitors or substitutes [9][11][15][16][21].

Competitor | Stage | Wedge | Pricing | Strength | Weakness vs. us
Zip | scale-up | AI-forward intake-to-pay suite | Custom quote | Workflow footprint and AI positioning | Not a neutral cross-agent control plane
Tropic | scale-up | Software-procurement savings platform | Starts at $3,167/month | Public pricing and data | Focused on SaaS spend, not autonomous tail-spend governance
Order.co | scale-up | Embedded operational procurement and AP | Custom quote | Strong operational pain alignment | Best inside its own buying rails
Tonkean | scale-up | Horizontal process orchestration | Custom quote | Broad orchestration flexibility | Less procurement-specific benchmarking
Fairmarkit | scale-up | Tail-spend sourcing automation | Custom quote | Closest tail-spend adjacency | More sourcing execution than neutral governance

Why incumbents do not win by default

  • Cloud platforms. Model vendors enable agents but do not solve buyer-specific spend governance and benchmarking.
  • Procurement suites. Suites extend their own workflow footprint; the wedge is cross-agent, cross-suite neutrality.
  • Workflow tools. Horizontal orchestration is broad but less procurement-specific on savings and model-quality benchmarking.
  • Tail-spend AI. Fairmarkit is close, but still centered on sourcing execution rather than neutral transaction governance.

Business plan

This company proposes an MCP-native control plane for AI-assisted and autonomous tail-spend procurement in mid-market enterprises. The immediate pain is not quote discovery; it is that finance teams do not trust an agent to negotiate and commit spend without hard policy controls, price benchmarking, and an audit trail. Research supports a narrow beachhead in sub-$5,000 tail-spend categories at 500-5,000 employee tech, biotech, and R&D-heavy companies, where transaction volume is high, savings are measurable, and buying complexity is still manageable. The product should start as a governed transaction layer that enforces approved vendors, spend limits, exception routing, and deal-quality checks across existing procurement and ERP systems rather than trying to replace those systems. The strongest evidence is that agent commerce now works in real transactions, while hidden model-quality gaps create a new governance problem that incumbents do not yet solve as a neutral cross-agent layer. The main strategic risk is timing: many finance leaders may permit AI assistance before they permit autonomous spend, which means the company must prove ROI first through approval-plus-benchmarking workflows. If early pilots show that benchmark alerts prevent overpayment and reduce manual review on routine purchases, the company can expand from control point to system of record for autonomous procurement decisions. If buyers refuse any delegated spend even under hard caps, or if procurement suites rapidly make a neutral layer unnecessary, the thesis weakens materially.

Problem

  • Procurement teams still handle tail-spend through email, spreadsheets, and manual approvals because incumbent systems were designed for human buyers, not autonomous agents.
  • Stronger models can negotiate better prices than weaker ones, creating hidden overpayment risk that finance teams cannot observe or govern in current workflows.
  • Enterprises want delegated AI work but need scoped permissions, checkpoints, and auditability before allowing an agent to commit company spend.

Solution

  • Insert a control plane between internal buying agents and supplier endpoints to enforce spend policy, approved vendors, negotiation instructions, and exception routing.
  • Benchmark each negotiated outcome against prior deals and market baselines so weak-agent outcomes trigger review before a purchase is finalized.
  • Produce an auditable transcript, policy log, and supplier-ready order package that fits existing procurement, ERP, and compliance processes.

Why we win

  • The wedge is the trust boundary at transaction time, where suites and model vendors are weakest and buyer urgency is highest.
  • A model-agnostic, MCP-native layer can sit across multiple assistants, suites, and vendor channels instead of being trapped inside one workflow stack.
  • Outcome data linking supplier, category, model configuration, and realized price quality can become a defensible benchmark moat if collected from day one.
Strategic choices
Beachhead: Mid-market North American tech, biotech, and R&D-heavy companies rolling out internal AI assistants for employee purchases under $5,000 in categories such as laptops, monitors, developer peripherals, lab consumables, and office supplies.
Wedge rationale: This slice has frequent transactions, measurable savings, relatively standard policy rules, and low enough dollar risk to win approval faster than services procurement, strategic sourcing, or open marketplace commerce.
Sequencing: Start with approval-plus-benchmarking on tail-spend so customers get immediate audit and savings value before full autonomy; then add governed auto-approval for low-risk categories; then expand into supplier onboarding, broader indirect spend, and downstream AP workflows once transaction trust data exists.
Not yet: Consumer or SMB shopping agents · Full source-to-pay suite replacement · High-stakes services, contract, or strategic sourcing categories · International compliance-heavy rollouts before a North America reference base exists
Go-to-market
Wedge: Sell a high-touch design-partner deployment for one or two tail-spend categories where finance wants AI-assisted buying but needs hard controls before granting autonomy.
Channels: Founder-led outbound to VP Finance, Head of Procurement, and finance transformation leaders · Design-partner sales into AI-native companies already piloting internal assistants · Integration and referral partnerships with procurement consultants, ERP implementers, and MCP ecosystem players
Funnel targets: Lead to qualified pilot 20-30%, pilot to paid production 50%+, first production deployment expanded to 3+ categories within 9 months in 50% of retained accounts.
Pricing: Annual platform subscription with a minimum contract in the $40k range, plus usage priced by governed autonomous spend or completed transactions; this matches the modeled ACV, funds high-touch implementation, and ties upside to customer adoption.
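
The funnel targets above imply a rough top-of-funnel requirement. The sketch below assumes the two stated conversion rates simply multiply, and applies them to the five Year 1 customers in the base-case plan. The function name and that pairing are illustrative, not part of the source model.

```python
import math


def leads_needed(customers: int, lead_to_pilot: float, pilot_to_paid: float) -> int:
    """Qualified leads required to land `customers` paid production accounts."""
    raw = customers / (lead_to_pilot * pilot_to_paid)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))


YEAR_1_CUSTOMERS = 5     # base-case new logos in Year 1
PILOT_TO_PAID = 0.50     # stated floor for pilot-to-production conversion

pilots_needed = math.ceil(YEAR_1_CUSTOMERS / PILOT_TO_PAID)
leads_low_conversion = leads_needed(YEAR_1_CUSTOMERS, 0.20, PILOT_TO_PAID)
leads_high_conversion = leads_needed(YEAR_1_CUSTOMERS, 0.30, PILOT_TO_PAID)

print(pilots_needed, leads_high_conversion, leads_low_conversion)
```

Under these assumptions, five production customers require about 10 pilots and between 34 and 50 leads, depending on where lead-to-pilot conversion lands in the 20-30% range.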
Product roadmap
MVP: Focus the MVP on governed tail-spend transactions for a small set of categories and approved vendors, with policy rules, benchmark alerts, human exception routing, audit logs, and ERP or procurement-system handoff. Support MCP where available and a limited fallback connector set where it is not.
6 months: Ship category templates, role-based approval policies, supplier transcript logging, benchmark scoring, and integrations for one ERP or P2P system plus email or browser fallback to support paid pilots.
12 months: Add governed auto-approval for low-risk scenarios, model-quality disclosure, savings reporting, supplier onboarding workflows, and two to three additional system integrations to convert pilots into repeatable deployments.
24 months: Expand into a broader autonomous procurement control layer with cross-customer benchmark models, policy simulation, multi-agent governance, and adjacent workflows in supplier onboarding and AP exception handling.
Key bets: Benchmarking weak-agent outcomes is a pain buyers will pay for before they permit broad autonomous spend. · A neutral layer can coexist with procurement suites because customers will run multiple assistants and partial tool stacks. · Tail-spend categories provide enough transaction volume to build a proprietary benchmark dataset quickly.
Business model
Revenue streams: Annual platform subscription for policy, audit, and benchmark controls · Usage fees per completed governed transaction or spend-under-management band · Premium analytics and benchmark modules for savings and model-quality reporting
Unit of value: Governed transaction volume and autonomous spend under management
Target gross margin: 75%
Expansion levers: More spend categories per customer · Additional ERP, P2P, and supplier-channel integrations · Higher auto-approval thresholds as trust increases · Benchmark and compliance modules sold to finance leadership
Strategy map
North-star metric: Annual autonomous spend processed within policy at equal or better benchmarked pricing
Input metrics: Number of live governed transactions per customer · Percent of transactions auto-approved within policy · Benchmark alert precision on bad or above-market outcomes · Pilot to production conversion rate · Category expansion rate per production customer
Moats to build: Cross-customer dataset of agent configuration, supplier context, and realized deal quality · Workflow embedment in approvals, audit logs, and ERP handoffs · Default policy templates for low-risk autonomous purchasing
Kill criteria: Fewer than 3 of the first 10 design partners allow any sub-$5k delegated spend after policy controls are demonstrated · Benchmarking fails to show at least 5% avoided overpayment or equivalent review-time reduction in two pilot categories · More than 30% of target transactions require unsupported custom integrations after the first product year
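
The three kill criteria above are concrete enough to encode as a checklist. This is a minimal sketch: the field names are hypothetical, and whether a category shows "equivalent review-time reduction" is left to whoever fills in the evidence.

```python
from dataclasses import dataclass


@dataclass
class PilotEvidence:
    """Hypothetical roll-up of design-partner results."""
    partners_allowing_delegated_spend: int  # out of the first 10 design partners
    categories_meeting_benchmark_bar: int   # >=5% avoided overpayment or equivalent
    custom_integration_share: float         # fraction of target transactions, 0..1


def triggered_kill_criteria(evidence: PilotEvidence) -> list:
    """Return a description of each kill criterion the evidence trips."""
    hits = []
    if evidence.partners_allowing_delegated_spend < 3:
        hits.append("fewer than 3 of 10 design partners allow delegated spend")
    if evidence.categories_meeting_benchmark_bar < 2:
        hits.append("benchmark value not shown in two pilot categories")
    if evidence.custom_integration_share > 0.30:
        hits.append("more than 30% of transactions need custom integrations")
    return hits


# Example: healthy partner uptake, but integration burden is too high.
print(triggered_kill_criteria(PilotEvidence(4, 2, 0.35)))
```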

Milestones

0-12 months
  • Sign 3-5 paid design partners in target verticals
  • Ship MVP with policy engine, benchmark alerts, audit logs, and first ERP or P2P integration
  • Complete 200+ governed transactions across pilot accounts
  • Convert at least 2 pilots to annual production contracts
12-24 months
  • Reach 10-15 production customers and expand the average account to 3 or more spend categories
  • Launch governed auto-approval for low-risk categories with model-quality disclosure
  • Build the first cross-customer benchmark dataset and savings reporting module
  • Establish 2 channel or integration partnerships that generate qualified pipeline
24-36 months
  • Become the default control layer for autonomous tail-spend in the initial segment
  • Expand into supplier onboarding and AP exception workflows
  • Introduce policy simulation and multi-agent governance capabilities
  • Demonstrate repeatable expansion beyond the initial vertical mix without custom implementation economics
Strategy map
flowchart LR
  Wedge[Governed tail-spend wedge] --> MVP[Policy plus benchmark MVP]
  MVP --> Proof[Paid pilots and avoided overpayment proof]
  Proof --> Expansion[More categories, auto-approval, supplier onboarding]

Founding team

Role | Start timing | Rationale
Founding eng | Month 0 | Build the policy engine, transaction logging, and first integration path without outsourcing core product architecture.
Founder-GTM | Month 0 | Early sales depend on deep buyer discovery, design-partner selling, and hands-on implementation credibility.
Solutions engineer | Month 3 | Customer success hinges on deployment speed, workflow mapping, and integration reliability in the first pilots.
Product and trust lead | Month 6 | The company needs explicit ownership of policy templates, evaluation harnesses, benchmark quality, and rollout safety.
Account executive | Month 9 | Add quota-carrying capacity only after the pilot motion and implementation scope are repeatable.

Experiment roadmap

Horizon | Experiment | Hypothesis | Success metric | Owner
0-90 days | Founder interviews with 15 procurement and finance leaders in target segments | Buyer urgency is high enough to fund an approval-plus-benchmarking pilot before full autonomy is allowed. | 8 or more interviews confirm a current delegated-spend initiative and at least 5 agree to pilot scoping. | CEO
0-90 days | Concierge benchmark review on historical tail-spend transactions | Weak-agent or weak-process outcomes can be detected and framed as avoided overpayment with clear economic value. | 2 customers identify at least one category where benchmark analysis would have changed approval or supplier choice. | CEO plus founding eng
90-180 days | MVP pilot in one category with policy rules, approval routing, and audit transcript logging | Customers will run live governed transactions if the workflow fits their existing ERP or P2P controls. | 50 or more governed transactions completed with zero policy escapes and less than 20% manual rework. | Founding eng
90-180 days | Pricing test across pilot proposals | A $25k-$50k pilot and $40k+ annual expansion are acceptable when tied to one integration and defined savings or control outcomes. | Close 3 paid pilots with no more than 20% discount from target pricing. | CEO
180-360 days | Auto-approval rollout for low-risk SKUs and vendors | Customers will raise automation thresholds after benchmark performance and audit controls are proven. | 30% or more eligible transactions in one pilot account move to auto-approval with no material incident. | Product lead
180-360 days | Partner channel test with one ERP implementer or procurement consultant | Trusted implementation partners can shorten sales cycles and reduce integration friction. | 2 qualified opportunities sourced by one partner and one converted to a paid pilot. | CEO

Risk assessment

Business plan risks: 4 mapped
Risk matrix (impact vs. likelihood): R1 and R2 sit at high likelihood / high impact; R3 and R4 sit at medium likelihood / high impact.
ID | Risk | Likelihood | Impact | Mitigation
R1 | Slow customer permissioning for autonomous spend | High | High | Lead with approval-plus-benchmarking and use hard spend caps, category limits, and human checkpoints to earn broader autonomy.
R2 | Integration fragmentation across procurement and supplier systems | High | High | Narrow the first workflows, prioritize one standard integration path, and support fallback automation only where it is repeatable.
R3 | Incumbent suites add enough native governance to compress the standalone wedge | Medium | High | Differentiate on neutrality, cross-agent benchmarking, and speed of deployment across heterogeneous tool stacks.
R4 | Security, prompt injection, or bad purchases damage trust | Medium | High | Build conservative permissions, release gates, auditability, and incident response into the operating model from day one.
First customer
Title: Procurement operations manager at an AI-native 1,000-employee software company
Profile: Company already uses internal AI assistants for delegated work and processes high-volume low-dollar purchases across IT equipment, developer gear, and office spend.
Trigger: Finance is asked to support agent-led purchasing without increasing rogue spend, overpayment, or audit risk.
Buyer: VP Finance or Head of Procurement
Initial contract: $25k-$50k paid pilot for 1-2 categories and one system integration, converting to roughly $40k-$80k annual subscription plus usage as auto-approved volume expands.

What must be true

  • At least half of qualified design partners will allow autonomous or semi-autonomous purchasing under hard spend caps within the next 12 months.
  • Benchmark alerts can identify materially bad agent-negotiated outcomes with enough precision to change approval behavior.
  • Buyers will pay a standalone budget line for a neutral control layer rather than waiting for their suite vendor.
  • One or two initial integrations are sufficient to make pilots operational without custom work overwhelming deployment.
  • Tail-spend transaction volume is high enough to build a differentiated outcome dataset before incumbents normalize similar features.

Open diligence questions

  • Which budget owns this purchase in the first sale: procurement, finance systems, or security?
  • How many categories can customers realistically put under hard-capped delegated spend in year one?
  • What evidence will make a VP Finance trust benchmark scores as more than another analytics dashboard?
  • How often do suites block or discourage external policy layers in live procurement workflows?
  • What is the implementation burden per customer for the first ERP or P2P integration?
Investor verdict
Call: Meet / investigate further
Conviction: Strong wedge and timing signal, but conviction depends on near-term buyer willingness to permit capped autonomous spend.
Why believe: Real agent-commerce proof, clear procurement pain, and a neutral governance layer create a credible path to early design-partner revenue and a defensible data asset.
Why doubt: Enterprise rollout timing may lag the technology cycle, and incumbent suites may close the governance gap before neutrality matters.
Next diligence: Verify with 10-15 finance and procurement leaders that approval-plus-benchmarking is budgetable now and can convert into capped autonomous spend within 12 months.

Financial model

3-year totals
Year 1: revenue $100K · EBITDA -$737K · Cash EOP $1.86M
Year 2: revenue $606K · EBITDA -$991K · Cash EOP $873K
Year 3: revenue $1.48M · EBITDA -$727K · Cash EOP $146K
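
The year-end cash figures above are consistent with a simple roll-forward from the $2.6M pre-seed round stated in the funding ask. The sketch below assumes that the change in cash each year equals EBITDA, with no capex, working-capital, or tax effects. That simplification is an assumption, not something the model states.

```python
OPENING_CASH = 2_600_000  # pre-seed round from the funding ask
EBITDA_BY_YEAR = {"Y1": -737_000, "Y2": -991_000, "Y3": -727_000}
REPORTED_CASH_EOP = {"Y1": 1_860_000, "Y2": 873_000, "Y3": 146_000}


def roll_forward(opening_cash: int, ebitda_by_year: dict) -> dict:
    """Accumulate EBITDA onto opening cash, returning end-of-period cash per year."""
    cash = opening_cash
    cash_eop = {}
    for year, ebitda in ebitda_by_year.items():
        cash += ebitda
        cash_eop[year] = cash
    return cash_eop


computed = roll_forward(OPENING_CASH, EBITDA_BY_YEAR)
for year, value in computed.items():
    gap = value - REPORTED_CASH_EOP[year]
    print(year, value, "gap vs reported:", gap)
```

The small gaps against the reported figures are within what rounding of the displayed inputs would produce.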
Unit economics
ARPU (annual): $80K
Gross margin: 76%
CAC: $50K · Payback: 9.8 months
LTV / CAC: 5.6x · LTV: $283K
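
The payback and LTV/CAC figures above can be sanity-checked with standard definitions. The model's exact formulas are not shown, so this sketch assumes payback = CAC / monthly gross profit per account, and uses the $80.4K mature ACV from the scenario notes rather than the rounded $80K. The implied customer lifetime is back-solved from the stated LTV and is an inference only.

```python
MATURE_ACV = 80_400    # blended mature ACV from the base scenario (displayed as $80K)
GROSS_MARGIN = 0.76
CAC = 50_000
LTV = 283_000

monthly_gross_profit = MATURE_ACV * GROSS_MARGIN / 12

# Assumed definition: months of gross profit needed to recover CAC.
payback_months = CAC / monthly_gross_profit

ltv_to_cac = LTV / CAC

# Inference: years of gross profit that would produce the stated LTV.
implied_lifetime_years = LTV / (MATURE_ACV * GROSS_MARGIN)

print(f"payback: {payback_months:.1f} months")
print(f"LTV/CAC from displayed figures: {ltv_to_cac:.2f}x")
print(f"implied lifetime: {implied_lifetime_years:.1f} years")
```

On the displayed figures the ratio comes out near 5.66x, so the reported 5.6x presumably reflects truncation or unrounded inputs in the underlying model.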
Funding ask
Round: pre-seed · $2.6M
Runway: 30 months
Milestone: Reach 15 production customers, prove category expansion and benchmark ROI, and preserve roughly 6 months of operating buffer before the seed raise.

Model sanity

  • Revenue engine. Base-case revenue is driven by 30 paying accounts by Y3, with land ACV near $48K expanding toward roughly $80K once categories and usage ramp.
  • Must go right. Pilot-to-production conversion has to stay near the plan's 50%+ target so the second AE sells repeatable deployments instead of bespoke pilots.
  • Model breaks if. A one-quarter sales-cycle slip plus weaker expansion drops Y3 revenue to about $1.0M and takes cash roughly $372K below zero.
  • Next-round proof. The seed case is credible if the company exits Y2 around 15 customers and exits Y3 near $2.0M ARR with burn multiple below 1x.
Revenue, cash, and EBITDA chart: 12 monthly points for Y1 plus 8 quarterly points for Y2 and Y3, on a $0K to $3.00M axis. Series: Revenue (line, area), Cash EOP (dashed), EBITDA (bars, gray = loss).
Use of funds — $2.6M pre-seed
Engineering · 44% GTM · 32% G&A · 11% Buffer (6 mo) · 13%
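Headcount build by role (peak 9 FTE)
Quarter | Q1 Y1 | Q2 Y1 | Q3 Y1 | Q4 Y1 | Q1 Y2 | Q2 Y2 | Q3 Y2 | Q4 Y2 | Q1 Y3 | Q2 Y3 | Q3 Y3 | Q4 Y3
Total FTE | 3 | 4 | 5 | 5 | 6 | 6 | 7 | 7 | 7 | 8 | 9 | 9
Roles charted: Founder-GTM, Engineering, Product/Trust, Solutions/Implementation, Sales, Customer Success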
Sensitivity: Y3 cash and revenue impact, sorted by magnitude
Variable | Downside | Upside | Cash impact | Revenue impact
Sales cycle | Several pilot decisions slip by one quarter. | References and partner intros pull multiple decisions forward. | -$284K | -$242K
Hiring pace | Planned hires start about 2 months earlier than revenue maturity. | Key hires are delayed until usage proves out. | -$207K | $0K
Churn | Older cohorts begin churning at ~1% monthly before broad expansion. | Retention stays perfect through the modeled period. | -$171K | -$177K
CAC | Keeping pipeline full requires about $4K more S&M spend per month. | Referrals and founder credibility lower paid acquisition needs. | -$144K | $0K
ARPU | Expanded accounts settle near a $70K mature ACV. | Expanded accounts settle near an $89K mature ACV. | -$134K | -$137K
Gross margin | Gross margin exits Y3 at 74% because implementation remains too bespoke. | Gross margin exits Y3 at 77%+ as workflows standardize faster. | -$44K | $0K

Scenarios

Scenario | Y3 revenue | Y3 EBITDA | Cash low point | Description | Key changes
Downside | $1.05M | -$1.08M | -$372K | Pilot conversion slips and expanded accounts land closer to high-$60Ks ACV, not the full mature case.
  • Y2-Y3 customer adds slow to 19 total new logos after Year 1 instead of 25.
  • Mature ACV falls from $80.4K to roughly $69.6K.
  • Gross margin stays 2 points below base in each year.
Base | $1.48M | -$727K | $146K | Five paid design partners in Year 1 compound into 30 paying customers by Year 3 with category expansion on roughly half of retained accounts.
  • New-logo plan closes 5 customers in Y1, 10 in Y2, and 15 in Y3.
  • Land ACV starts near $48K and expands toward an $80.4K blended mature ACV after 9 months.
  • Gross margin improves from 72% in Y1 to 76% in Y3 while hiring stays lean.
Upside | $1.79M | -$473K | $476K | Partner help and faster trust-building pull deals forward, and more accounts expand into higher-usage production deployments.
  • Y2-Y3 customer adds increase to 30 total new logos after Year 1 instead of 25.
  • Mature ACV rises to roughly $86K as more accounts add categories and usage.
  • Gross margin improves 1 point above base in each year.

Sensitivity

Variable | Downside | Base | Upside
ARPU | Expanded accounts settle near a $70K mature ACV. | Expanded accounts settle near an $80K mature ACV. | Expanded accounts settle near an $89K mature ACV.
Sales cycle | Several pilot decisions slip by one quarter. | Founder-led sales plus references convert on the planned cadence. | References and partner intros pull multiple decisions forward.
Churn | Older cohorts begin churning at ~1% monthly before broad expansion. | No realized churn appears in the 36-month model horizon. | Retention stays perfect through the modeled period.
Gross margin | Gross margin exits Y3 at 74% because implementation remains too bespoke. | Gross margin exits Y3 at 76%. | Gross margin exits Y3 at 77%+ as workflows standardize faster.
CAC | Keeping pipeline full requires about $4K more S&M spend per month. | Y3 blended CAC stays around $50K per customer. | Referrals and founder credibility lower paid acquisition needs.
Hiring pace | Planned hires start about 2 months earlier than revenue maturity. | Hiring stays on the lean plan shown in headcount. | Key hires are delayed until usage proves out.
Key assumptions (17)
ID | Name | Value | Unit | Source
A1 | Model start month | 2026-05 | month | [BP date] Model starts the month after the 2026-04-26 business plan.
A2 | Opening cash from pre-seed round | 2.6 | USD M | [BP fundingAsk] $2-4M target range; model uses $2.6M to fund the roadmap through the next seed-proof milestone plus buffer.
A3 | Initial landed ACV per new customer | 48.0 | USD K annual | [BP gtm][BP investorMemo] $40k minimum annual contract plus modest usage fees inside the first year.
A4 | Mature ACV after category expansion | 80.4 | USD K annual | [BP gtm][BP investorMemo] 50% of retained accounts expand to 3+ categories within 9 months, lifting blended ACV toward the $40k-$80k range plus usage.
A5 | Expansion timing | after 9 months | timing | [BP gtm] First production deployment expanded to 3+ categories within 9 months in 50% of retained accounts.
A6 | Year 1 new paying customers by month | [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1] | count | [BP milestones][BP experimentRoadmap] Base case closes 5 paid design partners in the first 12 months.
A7 | Year 2 new paying customers by month | [1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1] | count | [BP milestones] Base case reaches 15 total customers by end of Year 2, consistent with the 10-15 production-customer milestone plus a small pilot overhang.
A8 | Year 3 new paying customers by month | [1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 2] | count | [research market SOM][startup finance heuristic] Two-AE ramp plus founder-led sales reaches 30 total customers by end of Year 3, still below the 60-customer SOM.
A9 | Logo churn in 36-month P&L | 0.0 | pct monthly | [BP investorMemo] Annual-contract early cohorts are modeled as retained during the first 36 months; separate churn heuristic is used in unit economics and flagged in sanityChecks.
A10 | Steady-state churn for LTV | 1.8 | pct monthly | [startup finance heuristic: early-stage vertical SaaS] Conservative steady-state churn used for LTV and payback math.
A11 | Gross margin ramp | Y1 72%, Y2 74%, Y3 76% | gross margin pct | [BP businessModel] 75% target gross margin, with lower Year 1 margin due to high-touch implementation and early infra overhead.
A12 | Loaded annual cash compensation benchmarks | Founder-GTM 129.8; Engineer 188.8; Product/Trust 177.0; Solutions 153.4; Sales AE 200.6; Customer Success 129.8 | USD K annual | [BP team][startup finance heuristic] US seed-stage cash comp with 18% payroll tax/benefit load.
A13 | Hiring start months | Founder-GTM M1; Eng1 M1; Solutions M3; Product/Trust M6; AE1 M9; Eng2 M13; AE2 M19; Eng3 M28; Customer Success M31 | timing | [BP team] First five roles from plan; later hires added conservatively to support repeatable deployment and sales capacity.
A14 | Non-payroll S&M spend ladder | M1-M6 3; M7-M12 5; M13-M18 7; M19-M24 9; M25-M36 12 | USD K per month | [startup finance heuristic] Travel, partner development, and light demand generation for founder-led enterprise sales.
A15 | Non-payroll R&D spend ladder | M1-M6 5; M7-M12 6; M13-M18 8; M19-M24 9; M25-M36 11 | USD K per month | [BP operations][startup finance heuristic] Cloud, eval infrastructure, security testing, and developer tools rise with transaction volume.
A16 | Non-payroll G&A spend ladder | M1-M6 6; M7-M12 7; M13-M18 8; M19-M24 9; M25-M36 11 | USD K per month | [BP operations][startup finance heuristic] Legal, finance, insurance, compliance, and audit readiness for enterprise pilots.
A17 | Funding milestone | 15 production customers, benchmark dataset, repeatable expansion motion, and seed-readiness with 6 months of buffer | milestone | [BP milestones][developer requirement] Funding ask is sized to the next financing proof point with explicit 6-month buffer.
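The year-end customer counts follow directly from the monthly new-logo arrays in A6 through A8, given the zero in-period churn in A9. A minimal sketch of that roll-up (variable names are illustrative):

```python
from itertools import accumulate

# Monthly new paying customers, copied from assumptions A6, A7, and A8.
NEW_CUSTOMERS_BY_YEAR = {
    "Y1": [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Y2": [1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    "Y3": [1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 2],
}

# Flatten to a 36-month series; with 0% logo churn (A9), the running
# total of new logos equals the paying-customer count.
monthly_new = [n for year in ("Y1", "Y2", "Y3") for n in NEW_CUSTOMERS_BY_YEAR[year]]
cumulative_customers = list(accumulate(monthly_new))

year_end_customers = {
    f"Y{year}": cumulative_customers[12 * year - 1] for year in (1, 2, 3)
}
print(year_end_customers)  # {'Y1': 5, 'Y2': 15, 'Y3': 30}
```

The totals line up with the base case: 5 design partners in Year 1, 15 customers exiting Year 2, and 30 by the end of Year 3.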
Unit economics flow
flowchart LR
  Leads --> PaidPilots
  PaidPilots --> ProductionCustomers
  ProductionCustomers --> SubscriptionRevenue
  ProductionCustomers --> UsageRevenue
  SubscriptionRevenue --> GrossProfit
  UsageRevenue --> GrossProfit
  GrossProfit --> Opex
  Opex --> Cash

Flags:
  • Y3 revenue per ending FTE is still below mature SaaS benchmarks, so the next round depends on continued ARR growth rather than current efficiency.
  • The P&L assumes no realized logo churn inside 36 months; LTV uses a separate 1.8% monthly churn heuristic and should be read as directional.
  • Ending cash is only $145.9K in the base case, so even modest sales-cycle slippage or margin pressure would force an earlier raise.
  • The model assumes category expansion happens within 9 months for roughly half of retained accounts; if expansion is slower, CAC payback stretches materially.
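A rough check on the unit economics, assuming the simplest steady-state formulas: LTV as monthly gross profit per customer divided by monthly churn, and payback as CAC divided by monthly gross profit. These formulas and the pairing of inputs are assumptions, not the model's documented method. Inputs come from A4 (mature ACV), A10 (churn), A11 (Y3 gross margin), and the roughly $50K base-case CAC in the sensitivity table.

```python
# Inputs taken from the assumptions table; the formulas below are a
# common SaaS simplification and only an assumed reading of the model.
MATURE_ACV_USD = 80_400        # A4: mature ACV after category expansion
GROSS_MARGIN_Y3 = 0.76         # A11: Y3 gross margin
MONTHLY_CHURN = 0.018          # A10: steady-state churn used for LTV
BLENDED_CAC_USD = 50_000       # sensitivity base case: "around $50K"


def unit_economics(acv_usd, gross_margin, monthly_churn, cac_usd):
    """Return monthly gross profit, LTV, LTV/CAC, and payback in months."""
    monthly_gross_profit = acv_usd / 12 * gross_margin
    ltv = monthly_gross_profit / monthly_churn
    return {
        "monthly_gross_profit": monthly_gross_profit,
        "ltv": ltv,
        "ltv_to_cac": ltv / cac_usd,
        "payback_months": cac_usd / monthly_gross_profit,
    }


ue = unit_economics(MATURE_ACV_USD, GROSS_MARGIN_Y3, MONTHLY_CHURN, BLENDED_CAC_USD)
print(f"LTV/CAC: {ue['ltv_to_cac']:.2f}x")
print(f"Payback: {ue['payback_months']:.1f} months")
```

Under these inputs payback comes out near 9.8 months and LTV/CAC a little under 5.7x, close to the headline 9.8-month payback and 5.6x ratio. The small gap is plausibly rounding in the CAC figure. Because payback is computed on the mature ACV, a slower expansion path would lengthen it, which is the sensitivity the last flag describes.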

Section

Top risks

  • Slow enterprise rollout. Many companies may allow AI assistance before they allow AI agents to commit spend, stretching sales cycles. Mitigation: Start as a benchmark-and-approval layer for human-led purchases so customers get savings and audit value before full autonomy.
  • Integration fragmentation. Suppliers and procurement stacks will adopt MCP and agent standards unevenly, making end-to-end automation messy. Mitigation: Support MCP first but ship practical email, browser, and ERP connectors so the product works before standards are universal.
  • Liability from bad purchases. A few visible agent mistakes or overpayments could damage trust and stall adoption. Mitigation: Enforce spend thresholds, exception routing, full audit logs, and conservative default policies while building insurer and compliance partnerships over time.
Section

Evidence

Cited sources (29)

  1. Anthropic. Project Deal: our Claude-run marketplace experiment | Anthropic · https://www.anthropic.com/features/project-deal
  2. Anthropic. Project Vend: Can Claude run a small shop? (And why does that matter?) | Anthropic · https://www.anthropic.com/research/project-vend-1
  3. Anthropic. Project Vend: Phase two | Anthropic · https://www.anthropic.com/research/project-vend-2
  4. Anthropic. Trustworthy agents in practice | Anthropic · https://www.anthropic.com/research/trustworthy-agents
  5. Anthropic. Introducing the Model Context Protocol | Anthropic · https://www.anthropic.com/news/model-context-protocol
  6. Anthropic. Donating the Model Context Protocol and establishing the Agentic AI Foundation | Anthropic · https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation
  7. Anthropic. Building Effective AI Agents | Anthropic · https://www.anthropic.com/engineering/building-effective-agents
  8. Anthropic. Scaling Managed Agents: Decoupling the brain from the hands | Anthropic · https://www.anthropic.com/engineering/managed-agents
  9. Zip. AI Agents for Procurement | Zip AI Automation · https://ziphq.com/ai
  10. TechCrunch. Procurement platform Zip raises $100M at a $1.5 billion valuation | TechCrunch · https://techcrunch.com/2023/05/16/procurement-platform-zip-raises-100m-at-a-1-5-billion-valuation/
  11. Tropic. Spend Management Plans | Tropic · https://www.tropicapp.io/pricing
  12. Tropic. SaaS and AI Buying Trends Report | Tropic · https://www.tropicapp.io/reports/software-spending-trends-2025
  13. TechCrunch. Tropic takes in more capital as demand for software procurement savings continues | TechCrunch · https://techcrunch.com/2022/02/15/tropic-takes-in-more-capital-as-demand-for-software-procurement-savings-continues/
  14. Order.co. How Operational Procurement Improves Control and Speed | Order.co · https://www.order.co/blog/procurement/operational-procurement/
  15. Order.co. Order.co AI | AI-Powered Procurement & Spend Control | Order.co · https://www.order.co/ai/
  16. Tonkean. Tonkean - Agentic Orchestration Platform for the Enterprise · https://www.tonkean.com/agentic-orchestration
  17. Tonkean. Pricing | AI-Powered Enterprise Intake & Process Orchestration · https://www.tonkean.com/pricing
  18. Tonkean. Tonkean Acquires AI Spend Intelligence Startup Cinch, Doubling Down on Procurement, Finance, and EMEA | Tonkean blog · https://www.tonkean.com/blog/tonkean-acquires-ai-spend-intelligence-startup-cinch-doubling-down-on-procurement-finance-and-emea
  19. TechCrunch. Tonkean raises $50M Series B to accelerate its no-code business automation service | TechCrunch · https://techcrunch.com/2021/06/24/tonkean-raises-50m-series-b/
  20. Fairmarkit. What is tail spend and how can we manage it? | Fairmarkit Blog · https://www.fairmarkit.com/blog/what-is-tail-spend-and-how-can-we-manage-it
  21. Fairmarkit. Fairmarkit | RFx Agent · https://www.fairmarkit.com/platform/execution-agent
  22. TechCrunch. Fairmarkit's AI-fueled platform delivers autonomous procurement sourcing | TechCrunch · https://techcrunch.com/2022/09/01/fairmarkits-ai-fueled-platform-delivers-autonomous-procurement-sourcing/
  23. Sievo. Agentic AI in Procurement: Transforming Decision-Making at Scale · https://sievo.com/blog/agentic-ai-in-procurement-transforming-decision-making-at-scale
  24. Sievo. Sievo partners with Pactum: Procurement Analytics meets Autonomous Negotiations · https://sievo.com/news/press-release-sievo-pactum-partnership
  25. NIST. AI Risk Management Framework | NIST · https://www.nist.gov/itl/ai-risk-management-framework
  26. OWASP. LLMRisks Archive - OWASP Gen AI Security Project · https://genai.owasp.org/llm-top-10/
  27. NAICS Association. US Business Firmographics - Company Size · https://www.naics.com/business-lists/counts-by-company-size/
  28. Grand View Research. Procurement Software Market Size | Industry Report, 2033 · https://www.grandviewresearch.com/industry-analysis/procurement-software-market-report
  29. Grand View Research. Spend Management Platform Market Size Report, 2022-2030 · https://www.grandviewresearch.com/industry-analysis/spend-management-platform-market-report