BizIdea

PANTHALASSA'S ai-infra Scan 2026-05-05 to 2026-05-05 Run 20260506092635

Power-aware scheduler for AI clusters on variable energy, turning stranded megawatts into sellable GPU capacity.

AI infrastructure builders increasingly have megawatts that are intermittent, remote, or operationally unusual, but still sell compute as if power were flat and guaranteed. Generic schedulers optimize GPUs, not the changing power envelope behind them, so operators either overprovision, idle expensive hardware, or miss customer SLAs.

Overall rating 3.3 / 5.0

  1. Market: 2 / 5
     $74.6M TAM and $30.0M SAM are still small, but visible energy-first AI projects are scaling fast and only five competitors are mapped.

  2. Differentiation: 4 / 5
     The wedge is a neutral layer that turns variable power into sellable GPU products, with site data that can deepen forecasting and SLA advantages.

  3. Execution: 3 / 5
     Milestones are concrete and unit economics are strong at 8.5x LTV/CAC, 7.8-month payback, and 70% gross margin, though four model flags remain.

  4. Timeliness: 5 / 5
     A fresh $140M Panthalassa round and four recent signals point to AI power constraints becoming an immediate software bottleneck.

Why now

  1. Power availability is now explicitly identified as the bottleneck for larger AI models and new data centers.
  2. Offshore data-center infrastructure is moving from thought experiment to funded deployment, creating a new class of operators with novel scheduling needs.
  3. Renewable and ocean-energy companies are now designing directly for AI compute demand instead of only selling electricity.
  4. Lead investors are financing alternative AI-power architectures, which increases the number of sites that need software to commercialize variable energy.

Catalyst. Panthalassa's $140 million funding round shows that AI operators are actively moving compute toward nontraditional power sources because power availability has become the limiting factor.

The idea

The product sits above Slurm, Kubernetes, or proprietary cluster managers and turns real-time power availability into sellable compute tiers. It forecasts how many GPU-hours a site can safely promise, gates premium workloads from flex workloads, and reprices queue slots as energy conditions change. For operators, it converts stranded or intermittent power into a new revenue line without risking core SLAs. For customers, it exposes cheaper flex capacity with transparent completion windows and energy provenance. Over time, the company can build the benchmark dataset for how variable-power sites actually translate megawatts into AI throughput.
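A minimal sketch of the forecasting step described above, under stated assumptions rather than as the product's actual method. It assumes the site supplies an hourly power forecast as a conservative level and a median level (here in kW rather than MW to keep the arithmetic in integers), that every GPU draws a fixed all-in wattage, that premium capacity is sized from the worst conservative hour, and that flex capacity is the expected headroom above that floor. `WATTS_PER_GPU`, `HourForecast`, and `sellable_gpu_hours` are hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass

# ASSUMPTION: all-in draw per GPU (chip, server, cooling overhead), in watts.
WATTS_PER_GPU = 1400


@dataclass(frozen=True)
class HourForecast:
    p10_kw: int  # conservative power level for the hour, in kW
    p50_kw: int  # median expected power for the hour, in kW


def gpus_supported(kw: int, installed_gpus: int) -> int:
    """GPUs that `kw` of power can run, capped by installed hardware."""
    return min(installed_gpus, kw * 1000 // WATTS_PER_GPU)


def sellable_gpu_hours(window: list[HourForecast], installed_gpus: int) -> dict[str, int]:
    """Split a forecast window into premium and flex GPU-hours.

    Premium: the GPU count the worst conservative hour can sustain, held all window.
    Flex: whatever the median forecast supports above that premium floor.
    """
    firm_gpus = min(gpus_supported(h.p10_kw, installed_gpus) for h in window)
    premium = firm_gpus * len(window)
    flex = sum(
        max(0, gpus_supported(h.p50_kw, installed_gpus) - firm_gpus) for h in window
    )
    return {"premium_gpu_hours": premium, "flex_gpu_hours": flex}


# Three illustrative hours for a site with 9,000 installed GPUs.
window = [
    HourForecast(p10_kw=7000, p50_kw=14000),
    HourForecast(p10_kw=5600, p50_kw=11200),
    HourForecast(p10_kw=8400, p50_kw=14000),
]
tiers = sellable_gpu_hours(window, installed_gpus=9000)
print(tiers)  # {'premium_gpu_hours': 12000, 'flex_gpu_hours': 14000}
```

Sizing premium capacity from the worst conservative hour is deliberately cautious; a real deployment would likely tune that quantile per site once delivered GPU-hours can be compared against forecasts.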

What's different. Existing cluster schedulers assume power is a static constraint, while utility software assumes compute demand is someone else's problem. This company is built around the translation layer between volatile megawatts and contractual GPU products. The moat comes from site-level data on how power variability, queue mix, and hardware behavior interact, which improves forecasting, pricing, and SLA design over time. That dataset becomes valuable not just to operators but to financiers, insurers, and customers evaluating nontraditional AI capacity.

Startup thesis
Beachhead Independent GPU cloud operators commissioning first clusters on offshore, behind-the-meter renewable, or battery-backed power and initially selling batch training and offline inference jobs with flexible completion windows
Wedge A power-aware scheduling and commercialization layer that ingests live power envelopes, forecasts available GPU-hours, and automatically admits, prices, and routes only the jobs that fit that energy profile
Non-obvious insight The scarce resource in AI is no longer just GPUs but bankable power predictability. As offshore, battery-backed, and behind-the-meter sites become viable, the winning control layer will convert variable megawatts into contractual GPU products rather than assume grid-like stability.
Venture-scale path Start as the control plane for flexible AI workloads on variable-power clusters, then expand into standard scheduling, energy provenance, capacity marketplaces, and financing-grade performance data for all power-constrained AI infrastructure.
Target user
Primary user VP Infrastructure or Head of Capacity Engineering at an independent GPU cloud operator bringing 10 to 100 MW of nontraditional power-backed compute online
Secondary user Capacity planning lead at a foundation model lab willing to buy discounted flexible training capacity
Economic buyer COO, VP Infrastructure, or GM of cloud capacity at an AI infrastructure provider
Go-to-market seed
First customer A Series A or B GPU cloud startup launching its first 20 to 50 MW cluster on behind-the-meter renewable, battery-backed, or offshore-adjacent power and trying to sell enterprise training contracts before utilization is stable
Buying trigger A new site comes online with non-flat power delivery, and the sales team needs a credible way to contract capacity without overcommitting uptime
Current alternative Generic cluster schedulers plus spreadsheet-based capacity planning, manual throttling, and conservative overprovisioning
Switching reason The wedge lets operators monetize power variability as discounted flex compute while protecting premium queues, so they can sell more capacity faster than with manual gating
Pricing hypothesis Platform fee per MW under management plus a usage fee on GPU-hours scheduled into flex capacity tiers
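
The pricing hypothesis above can be sketched as a two-part invoice. This is an illustration, not a price list: the default platform fee of $60,000 per MW-year borrows the low end of the $60k-$70k per MW-year range used elsewhere in this report, while the usage fee of 5 cents per flex GPU-hour is a purely hypothetical placeholder, as are the function and parameter names.

```python
def annual_invoice_usd(
    managed_mw: int,
    flex_gpu_hours: int,
    fee_per_mw_year_usd: int = 60_000,      # low end of the report's per-MW range
    usage_fee_cents_per_gpu_hour: int = 5,  # ASSUMPTION: hypothetical usage fee
) -> dict[str, int]:
    """Return the platform, usage, and total annual charges in whole dollars."""
    platform = managed_mw * fee_per_mw_year_usd
    usage = flex_gpu_hours * usage_fee_cents_per_gpu_hour // 100
    return {"platform_usd": platform, "usage_usd": usage, "total_usd": platform + usage}


# Illustrative operator: 20 MW under management, 2M flex GPU-hours scheduled.
invoice = annual_invoice_usd(managed_mw=20, flex_gpu_hours=2_000_000)
print(invoice)  # {'platform_usd': 1200000, 'usage_usd': 100000, 'total_usd': 1300000}
```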

Jobs to be done

Job: When a new variable-power AI site goes live, help the infrastructure operator promise the right jobs to the right customers, so they can sell more capacity without missing SLAs.
Current alternative: Manual throttling and conservative overprovisioning inside generic schedulers.
Success metric: Increase in sellable utilization and flex-capacity revenue with no rise in missed job commitments.
Power-shaped AI capacity
flowchart LR
  Buyer[GPU cloud operator] --> Pain[Variable power makes capacity hard to sell]
  Pain --> Product[Power-aware GPU scheduler]
  Product --> Outcome[More sellable utilization without SLA failures]
Idea scorecard: average 4.6 / 5 across 5 axes
  • Signal (5/5): The cluster is anchored in a large financing event and repeated source language that power is now the AI bottleneck.
  • Pain (4/5): Operators can lose millions through idle GPUs or broken contracts when power variability is unmanaged.
  • Wedge (5/5): Power-aware admission control for flexible AI workloads is a narrow first product tied to a concrete workflow.
  • Defense (4/5): Proprietary operating data linking power envelopes to throughput and SLA outcomes can compound into a strong moat.
  • Scale (5/5): If nontraditional power becomes a major source of AI capacity, the scheduling and commercialization layer can become core infrastructure.
Business model canvas
Key partners
  • GPU cloud startups
  • Energy asset developers
  • Data-center operators
  • Cluster software vendors
Key activities
  • Integrating with cluster control planes
  • Forecasting power envelopes
  • Packaging compute tiers
Key resources
  • Scheduling software
  • Power-to-throughput forecasting models
  • Integrations with cluster managers
Value propositions
  • Turn intermittent power into sellable compute
  • Protect premium SLAs while monetizing flex workloads
  • Forecast and prove energy-backed capacity
Customer relationships
  • High-touch deployment
  • Technical account management
  • Capacity planning reviews
Channels
  • Direct sales
  • Data-center and energy project partners
  • GPU cloud ecosystem
Customer segments
  • Independent GPU cloud operators
  • AI infrastructure developers
  • Foundation model labs buying flex capacity
Cost structure
  • Engineering
  • Deployment and support
  • Cloud infrastructure
  • Sales to AI infrastructure operators
Revenue streams
  • Platform fee per MW
  • Usage fee per GPU-hour scheduled
  • Premium analytics modules

Market

Market sizing
TAM (total addressable): $74.6M · SAM (serviceable available): $30.0M · SOM (serviceable obtainable): $4.8M
Market sizing overview
TAM $74.6M Bottom-up, conservative visible-MW method: 900 MW announced at Crusoe's new Abilene campus plus 166 MW for Soluna's Project Kati = 1,066 MW of publicly visible energy-first AI capacity; applying an estimated $70k per MW-year software spend yields about $74.6M. This intentionally undercounts other sites and assumes pricing that is a small fraction of implied GPU compute revenue. [22][25]
SAM $30.0M Constrain the TAM to the first-wave beachhead: roughly 500 MW of North American independent GPU-cloud and renewable-backed sites likely to sell third-party capacity rather than internalize the capability; at $60k per MW-year, SAM is about $30.0M. [1][22][24][25]
SOM $4.8M Reachable year-3 SOM assumes 80 MW under management at roughly $60k per MW-year, equivalent to 8 sites averaging 10 MW or 4 sites averaging 20 MW. This is modest relative to visible public site sizes and fits a beachhead-first rollout. [16][18][22][25]
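
The sizing arithmetic above can be checked in a few lines. All MW figures and per-MW prices come from the TAM, SAM, and SOM rows; only the helper name `market_usd` is introduced here for illustration.

```python
def market_usd(mw: int, usd_per_mw_year: int) -> int:
    """Annual software spend implied by MW under management and a per-MW price."""
    return mw * usd_per_mw_year


# TAM: 900 MW (Abilene campus) + 166 MW (Project Kati) at $70k per MW-year.
tam = market_usd(900 + 166, 70_000)
# SAM: roughly 500 MW first-wave beachhead at $60k per MW-year.
sam = market_usd(500, 60_000)
# SOM: 80 MW under management in year 3 at $60k per MW-year.
som = market_usd(80, 60_000)

print(f"TAM ${tam / 1e6:.1f}M, SAM ${sam / 1e6:.1f}M, SOM ${som / 1e6:.1f}M")
# TAM $74.6M, SAM $30.0M, SOM $4.8M
```

The SOM is 16% of SAM, and the 80 MW target matches either 8 sites averaging 10 MW or 4 sites averaging 20 MW.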

Executive takeaways

  • The beachhead is real but narrow: public energy-first AI buildouts now span fresh Panthalassa funding and site announcements that jumped from tens of MW to 166 MW and 900 MW-class projects, but the first buyer set is still a small club of neoclouds and power-backed operators.
  • The strongest wedge is commercialization, not scheduling alone: hyperscaler spot/preemptible products prove demand for discounted flexible compute, while generic schedulers still treat power as a static constraint rather than a sellable product attribute.
  • Incumbent risk is feature absorption by cloud and scheduler vendors, so the startup must own live power forecasting, job admission, and SLA packaging specific to variable-energy sites rather than compete as yet another cluster dashboard.
  • Public GPU pricing shows that even a thin software take-rate can matter economically because each MW of AI capacity can produce high revenue density before any flex-capacity uplift.
  • Adoption friction is operational rather than conceptual: infra teams already buy managed Slurm, Kubernetes, and cluster tooling, but they are cautious about inserting a new control layer into the critical path.
  • Why now is stronger than the original offshore narrative alone: batteries, onsite generation, and renewable-curtailment strategies are becoming part of AI infrastructure design, expanding the number of sites that need a power-aware control plane.

Market definition

The market is software that turns constrained or variable-power AI clusters into sellable compute products for independent GPU clouds, renewable-backed data-center developers, and other operators who cannot assume flat grid-like capacity. The first buyer geography is North America, where public spot/preemptible compute markets are mature and many visible energy-first AI projects are being announced. Included are live power forecasting, admission control, flex-tier packaging, and queue orchestration above existing cluster managers. Excluded are generic schedulers, general DCIM/energy-management tools, retail GPU brokers, and vertically integrated clouds where the operator internalizes the problem instead of selling neutral control software. [3][4][6][8][10][12][22][24][27]

Customer and buyer

Primary users are capacity-engineering and cluster-operations teams who must decide which jobs can be promised under a non-flat power envelope. The economic buyer is typically the COO, VP Infrastructure, or GM of capacity at a GPU cloud or power-backed compute operator. Their urgent jobs are to protect premium SLAs, monetize flexible workloads, and stop sales teams from overcommitting a site whose available power, cooling, or energization schedule is still changing. Current alternatives are generic Slurm/Kubernetes-based queues, manual throttling, and spreadsheet capacity planning. [3][4][6][8][9][10][11][12][14][17][19][21]

Buying triggers

  • A new site or expansion phase comes online with non-flat power delivery or staged energization, and the operator needs a credible way to contract capacity before utilization is stable. [1][22][25]
  • The sales team wants to offer discounted flex capacity analogous to spot/preemptible compute rather than leave intermittent capacity idle. [3][6][8][20]
  • Existing Slurm or Kubernetes queues stop being enough because the operator now needs policy around who gets scarce capacity, when preemption is acceptable, and how to expose those tradeoffs commercially. [10][11][12][14][17][19]

Willingness to pay

Willingness to pay is supported indirectly by existing spend on flexible compute and cluster operations. AWS and Google market interruptible capacity at discounts of up to roughly 90% off on-demand pricing, while Lambda and Crusoe publish H100/H200 pricing in the range of roughly $3.9 to $6.2 per GPU-hour. That means even modest improvements in sellable utilization or safer flex-capacity packaging can justify meaningful software spend without requiring a new budget category. [3][6][16][20][32]
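
A rough worked example of the revenue-density argument, offered as a sketch under loudly stated assumptions. The GPU-hour prices ($3.90 and $6.20) and the $60k per MW-year software fee come from this report; the all-in power draw per GPU (1,400 W) and the 60% sold utilization are hypothetical inputs chosen only to make the arithmetic concrete, and real sites will differ.

```python
HOURS_PER_YEAR = 8760
WATTS_PER_GPU = 1400   # ASSUMPTION: all-in draw per GPU, including overhead
UTILIZATION_PCT = 60   # ASSUMPTION: share of GPU-hours actually sold


def annual_revenue_per_mw_usd(price_cents_per_gpu_hour: int) -> int:
    """Gross compute revenue per MW-year in whole dollars, under the assumptions above."""
    gpus_per_mw = 1_000_000 // WATTS_PER_GPU
    sold_gpu_hours = gpus_per_mw * HOURS_PER_YEAR * UTILIZATION_PCT // 100
    return sold_gpu_hours * price_cents_per_gpu_hour // 100


low = annual_revenue_per_mw_usd(390)   # $3.90 per GPU-hour
high = annual_revenue_per_mw_usd(620)  # $6.20 per GPU-hour
software_share_pct = 60_000 / low * 100

print(low, high)                      # 14635857 23267260
print(f"{software_share_pct:.2f}%")   # 0.41%
```

Under these assumptions a $60k per MW-year fee is well under 1% of gross compute revenue per MW, which is the sense in which a thin take-rate can still matter.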

Category dynamics

Growth signal Public energy-first AI projects stepped from 83-166 MW phases to 900 MW campus announcements in 2025-26.

Tailwinds

  • Spot and preemptible compute markets already normalize the idea that some AI workloads can trade certainty for lower cost.
  • Energy-first operators are pairing AI compute with onsite generation, storage, or curtailed renewables, increasing the number of non-flat-power sites.
  • Scheduler primitives are commoditizing, making a commercialization layer more plausible than building a net-new cluster manager.

Headwinds

  • The initial buyer set is still small, and some of the most sophisticated operators may prefer vertically integrated solutions or in-house tooling.
  • Any software that touches admission control faces trust, integration, and rollback concerns from operators running expensive GPU fleets.

Validation signals

  • Panthalassa's $140M round shows investors will finance nontraditional AI-power architectures rather than assume all future capacity sits on conventional grid-backed campuses.
  • Crusoe announced a new 900 MW AI campus for Microsoft and says the broader Abilene site is projected to reach 2.1 GW, signaling campus-scale demand for power-shaped AI infrastructure.
  • Crusoe and Form Energy announced 12 GWh of iron-air batteries for AI data centers, indicating power-shaping is moving into the operating design of new sites.
  • Soluna explicitly pitches AI loads as flexible demand and says Project Kati is a 166 MW wind-powered data center with an 83 MW first phase.
  • Lambda and Runpod both expose managed Slurm / cluster offerings, which confirms operators already buy higher-level control and operational tooling on top of raw GPUs.
  • AWS, Google, and Azure all market interruptible capacity products, proving demand for lower-cost compute with weaker delivery guarantees.

Regulatory & technical constraints

  • Data-center energy intensity and large-load planning remain real constraints, so any commercialization layer must map promised compute to actual site energy and cooling limits.
  • Flexible-capacity products only work for workloads tolerant of interruption, preemption, or completion windows.
  • Integration risk is high because the product must ingest or influence Slurm, Kubernetes, or managed-cluster control planes without creating instability.
  • Forecast quality depends on access to live power, cooling, and queue telemetry that often lives in separate operational systems.
  • Security and operator trust matter because the software touches scheduling policy, capacity allocation, and potentially customer-facing commitments.
Power-aware AI capacity control landscape
[Quadrant chart. Axes: power specialization (low to high) and operator urgency (low to high); Q1 is the winning zone. Plotted: Proposed startup, AWS Spot/Capacity Blocks, Slurm, Lambda, Runpod, Crusoe.]

Competition

Competition comes from four directions: cloud platforms that already sell interruptible or reserved capacity, open-source schedulers that own job placement, neoclouds that bundle operations with capacity sales, and vertically integrated energy-first operators that solve the problem inside their own stack. The startup should not try to replace Slurm or become another GPU broker; it should own the translation layer between volatile megawatts and contractual GPU products. [3][4][8][10][11][16][18][20][21][29][35]

Competitors (stage, wedge, pricing, strength, weakness vs. us)

SchedMD / Slurm (incumbent)
  Wedge: Default open-source workload manager for HPC and AI clusters with built-in QoS, reservations, and preemption.
  Pricing: Open-source software; enterprise support and services sold separately.
  Strength: Deep operational credibility and broad deployment in GPU/HPC environments.
  Weakness vs. us: Optimizes scheduling primitives but does not package variable-power capacity into customer-facing products or revenue-aware admission rules.

AWS EC2 Spot + Capacity Blocks (incumbent)
  Wedge: Commercializes interruptible and reserved GPU capacity directly inside a hyperscaler cloud.
  Pricing: Spot priced at up to 90% below on-demand; Capacity Blocks reserve ML GPU capacity.
  Strength: Buyer familiarity, mature APIs, and immediate proof that flexible compute is a real buying behavior.
  Weakness vs. us: Only solves the problem inside AWS and does not help third-party operators monetize their own variable-power sites.

Lambda (scale-up)
  Wedge: Neocloud with public GPU pricing and managed Slurm/Kubernetes offerings.
  Pricing: Public H100 and B200 instance / cluster pricing; H100 cluster pricing published from roughly $5.54-$6.16 per GPU-hour for larger reserved clusters.
  Strength: Transparent pricing, operational maturity, and managed cluster distribution into exactly the kind of buyer the startup wants.
  Weakness vs. us: Lambda sells capacity and managed infrastructure, not a neutral power-aware commercialization layer for third-party sites.

Runpod (scale-up)
  Wedge: Developer-friendly GPU cloud with instant clusters and Slurm-based cluster docs.
  Pricing: Public pricing for pods and serverless plus instant-cluster documentation.
  Strength: Fast time-to-value and strong fit for operators or buyers who prioritize speed and flexibility.
  Weakness vs. us: Not centered on power-envelope forecasting or flex-capacity packaging tied to site energy constraints.

Crusoe (scale-up)
  Wedge: Vertically integrated energy-first AI cloud combining data centers, power strategy, cloud pricing, and operations tooling.
  Pricing: Public cloud pricing including H100/H200 rates and spot / on-demand options.
  Strength: Most complete substitute because it can solve the problem via vertical integration rather than neutral software.
  Weakness vs. us: A neutral startup can serve operators that want the capability without buying Crusoe's full infrastructure stack or competing clouds.

Why incumbents do not win by default

  • Cloud platforms. AWS, Google Cloud, and Azure already train buyers to accept spot, preemptible, and reserved GPU capacity, but their products are tied to their own infrastructure and do not optimize around site-specific power envelopes, phased energization, or neutral multi-site commercialization.
  • Open-source schedulers. Slurm, Kueue, and Kubernetes handle quotas, preemption, queue fairness, and job placement, but they do not natively convert changing power availability into product tiers, delivery windows, or revenue-aware admission rules.
  • Neocloud operators. Lambda and Runpod sell managed clusters and capacity quickly, yet their economic center is selling standard compute inventory; a neutral control plane can still win at sites that need commercialization software before they look like a conventional cloud region.
  • Vertically integrated power-first AI clouds. Crusoe is the strongest strategic substitute because it combines energy, data centers, cloud pricing, and an operations platform, but that vertical model does not win by default when operators want the software capability without outsourcing the entire infrastructure stack.
  • In-house operations. Many operators can stitch together scheduler controls, spreadsheets, and manual throttling, but the operational burden of forecasting, packaging, and proving flexible capacity is precisely where specialist software should add value.

Business plan

This company should sell a power-aware commercialization layer to North American GPU cloud operators bringing 10-50 MW variable-power clusters online and needing to contract capacity before site performance is stable. The initial product is not a new scheduler; it is a read-only forecasting, flex-queue, and SLA-packaging layer above Slurm, Kubernetes, or managed cluster control planes. The first customer should be a Series A or B neocloud or power-backed operator launching a behind-the-meter, battery-backed, curtailed-renewable, or offshore-adjacent site and trying to protect premium queues while monetizing cheaper flexible training capacity. The buying trigger is a site launch or expansion with staged energization or non-flat power delivery, because that is when spreadsheet planning and generic schedulers stop being commercially sufficient. Research supports a real wedge and clear willingness to pay, but the near-term market is still narrow at roughly $30.0M SAM, so expansion into additional power-constrained AI infrastructure categories is required for venture scale. Product and GTM therefore need to sequence from read-only proof, to flex-capacity packaging, to admission control only after the company has site-level forecasting data and operator trust. The biggest open market risk is volume: the inputs do not establish how many non-hyperscaler variable-power AI sites will be commercially live in the next 24 months, so early success should be judged by paid pilots and managed MW rather than broad logo count.

Problem

  • Independent GPU clouds and energy-backed AI sites increasingly have usable megawatts that are intermittent, staged, or operationally unusual, but they still need to sell compute with credible delivery commitments.
  • Generic schedulers and manual capacity planning allocate jobs, but they do not convert changing power envelopes into sellable product tiers, pricing, and SLA rules, so operators either idle GPUs, overconstrain sales, or risk missed commitments.

Solution

  • Provide a control layer above existing cluster managers that forecasts available GPU-hours from live power and site telemetry, packages premium and flex capacity tiers, and recommends which workloads fit the current envelope.
  • Start with read-only forecasting, flex-queue creation, and commercial policy tooling, then graduate to automated admission control once operators trust the forecasts and exception handling.
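
A minimal sketch of the read-only recommendation step, assuming premium and flex GPU-hour budgets have already been forecast for the planning window. It uses a simple greedy rule (premium jobs first, then the tightest completion window first) as a stand-in for whatever policy an operator would actually adopt. `Job`, `recommend`, and the sample workloads are hypothetical, and the function only returns advice: it never calls Slurm, Kubernetes, or any other scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    tier: str        # "premium" or "flex"
    gpu_hours: int   # GPU-hours the job needs within the window
    deadline_h: int  # hours until the promised completion time


def recommend(jobs: list[Job], premium_budget: int, flex_budget: int) -> tuple[list[str], list[str]]:
    """Return (admit, defer) job names without touching any scheduler."""
    remaining = {"premium": premium_budget, "flex": flex_budget}
    admit: list[str] = []
    defer: list[str] = []
    # Premium jobs are considered first; within a tier, tightest deadline first.
    for job in sorted(jobs, key=lambda j: (j.tier != "premium", j.deadline_h)):
        if job.gpu_hours <= remaining[job.tier]:
            remaining[job.tier] -= job.gpu_hours
            admit.append(job.name)
        else:
            defer.append(job.name)
    return admit, defer


jobs = [
    Job("contract-train", "premium", 9000, 72),
    Job("batch-eval", "flex", 6000, 48),
    Job("finetune-sweep", "flex", 10000, 24),
    Job("offline-embed", "flex", 3000, 96),
]
admit, defer = recommend(jobs, premium_budget=12000, flex_budget=14000)
print(admit)  # ['contract-train', 'finetune-sweep', 'offline-embed']
print(defer)  # ['batch-eval']
```

Keeping the output as a recommendation list matches the read-only starting point: operators can compare it against manual planning before allowing any automated admission control.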

Why we win

  • The beachhead pain is tied to a specific trigger, buyer, and workflow: new variable-power sites need a way to contract capacity safely before utilization is stable.
  • Incumbent schedulers own queue mechanics, and clouds own their own spot products, but neither gives third-party operators a neutral layer that translates site-specific power volatility into contractual GPU products.
  • Early deployments can compound a proprietary dataset linking power variability, queue mix, and delivered GPU-hours, which improves forecasting, pricing, and SLA design over time.
Strategic choices
Beachhead North American independent GPU cloud operators launching their first 10-50 MW behind-the-meter renewable, battery-backed, curtailed-power, or offshore-adjacent clusters and initially selling batch training or offline inference with flexible completion windows.
Wedge rationale This slice has the clearest buying trigger, a small number of reachable technical buyers, and proof metrics such as managed MW, sellable utilization lift, forecast error, flex-queue fill rate, and SLA miss avoidance; it is faster to validate than selling a generic AI infrastructure optimization platform.
Sequencing Start with forecasting and flex-capacity packaging because operators will tolerate a read-only overlay sooner than a new control-plane dependency, then add admission control, broader site integrations, and expansion analytics only after the company proves commercial lift and earns operational trust.
Not yet Full replacement of Slurm, Kubernetes, or managed cluster control planes. · Retail marketplace aggregation for end customers across many providers. · Always-on latency-sensitive inference or premium enterprise workloads that require flat-power assumptions. · Financing, insurance, or energy-provenance analytics sold as standalone products before core control software is deployed.
Go-to-market
Wedge Sell a paid pilot to the infrastructure or capacity leader at a GPU cloud bringing a variable-power site online, package the software as the fastest way to create a discounted flex-compute product without risking premium SLAs, and convert to annual production pricing once the pilot demonstrates higher sellable utilization and reliable completion-window performance.
Channels Founder-led direct sales to neocloud, AI infrastructure, and power-backed site operators. · Energy and data-center developers that market AI-specific sites and can introduce new deployments before energization. · Managed Slurm, Kubernetes, and NVIDIA ecosystem partners that already sit in cluster operations.
Funnel targets 12-15 target accounts per year -> 30-40% qualified pilot discussions -> 25-35% paid pilot rate -> 50%+ pilot-to-production conversion -> 60%+ of production customers expanding managed MW or adding a second site within 12 months.
Pricing Paid pilot followed by annual software pricing anchored to MW under active management, with a secondary usage fee on flex GPU-hours scheduled through the platform; the rationale is that research indicates roughly $60k-$70k per MW-year is supportable if the software materially increases sellable utilization and reduces SLA risk.
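
The funnel targets above multiply out as follows. The percentages are the stated ranges, with the 50%+ pilot-to-production rate taken at its 50% floor; the function name is hypothetical.

```python
def production_customers_per_year(
    accounts: int, qualified_pct: int, paid_pilot_pct: int, production_pct: int
) -> float:
    """Expected new production customers per year from the stage conversion rates."""
    return accounts * qualified_pct * paid_pilot_pct * production_pct / 100**3


# Low end: 12 accounts, 30% qualified, 25% paid pilot, 50% pilot-to-production.
low = production_customers_per_year(12, 30, 25, 50)
# High end: 15 accounts, 40% qualified, 35% paid pilot, 50% pilot-to-production.
high = production_customers_per_year(15, 40, 35, 50)

print(f"{low:.2f} to {high:.2f} production customers per year")
# 0.45 to 1.05 production customers per year
```

At the 50% conversion floor this is roughly one new production customer every one to two years, so the milestone of 3-4 production customers by month 24 would appear to need conversion above the stated floors or a larger target-account list.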
Product roadmap
MVP The MVP should ingest live or periodic power-envelope inputs plus queue telemetry, forecast safe sellable GPU-hours, create premium versus flex capacity rules, and surface a read-only planning console with audit logs for one cluster running on Slurm, Kubernetes, or an equivalent managed control plane. It should not own full job placement at first; the proof point is that operators can contract and route flex workloads more confidently before inserting the product into the critical path.
6 months Land 1-2 paid pilots, support one production-like cluster integration, ship flex-queue policy templates for batch training and offline inference, and prove forecast accuracy and utilization lift on live workloads.
12 months Add guarded admission-control automation, site-level SLA policy management, customer-facing flex-capacity quoting workflows, and reusable integrations for the most common cluster telemetry and scheduler stacks seen in pilots.
24 months Standardize a multi-site control layer across several operators, add benchmarking for interruption tolerance and energy-linked throughput, and expand from single-site flex packaging into portfolio-level capacity planning and expansion inside existing customers.
Key bets A read-only overlay can show enough ROI to win deployment before operators trust live admission control. · Early sites will have enough interruption-tolerant batch training and offline inference demand to fill flex tiers. · Buyers will pay on managed MW and commercial uplift rather than view this as a low-value scheduler feature. · Site-level forecasting accuracy will improve materially with cross-customer operating data, creating a defensible advantage over in-house tooling.
Business model
Revenue streams Annual platform subscription priced by managed MW of variable-power AI capacity. · Usage-based fees on flex GPU-hours scheduled or contracted through the platform. · Implementation and integration fees for first-site deployment. · Premium analytics modules for forecasting, SLA reporting, and portfolio benchmarking.
Unit of value Managed MW of variable-power AI capacity.
Target gross margin 70%
Expansion levers Grow from one pilot site to more MW and additional sites inside the same operator. · Add admission-control and commercial-policy modules after forecasting is trusted. · Expand from batch training into additional interruption-tolerant workload classes as benchmark data improves. · Sell benchmarking and planning analytics to existing operator customers once multi-site data exists.
Strategy map
North-star metric Contracted flex GPU-hours delivered within promised completion windows from managed variable-power sites.
Input metrics Managed MW under active forecasting. · Forecast error between promised and delivered GPU-hours. · Sellable utilization lift versus pre-deployment baseline. · Premium queue SLA miss rate. · Flex-queue fill rate and repeat purchase rate. · Paid pilot to production conversion rate.
Moats to build Proprietary dataset linking site power envelopes, queue mix, and delivered GPU throughput. · Commercial-policy templates for packaging premium versus flex AI capacity by workload class. · Reusable integrations and rollback-safe deployment playbooks above common cluster control planes.
Kill criteria If the first 3 pilots cannot show at least 10% sellable utilization lift or materially lower overcommit risk without increasing premium SLA misses, the wedge is too weak. · If no operator converts to production pricing above a meaningful managed-MW contract within 12 months of pilot start, the category is likely too narrow or too easy to internalize. · If buyers consistently require full control-plane replacement before paying, the product will be too integration-heavy for efficient pre-seed execution.
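
A sketch of how the input metrics and the first kill criterion might be computed for one pilot. It assumes the 10% threshold means relative lift over the pre-deployment baseline (the report does not say whether it is relative or in percentage points) and it does not model the alternative "materially lower overcommit risk" test. All field and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PilotResult:
    baseline_sellable_gpu_hours: int
    pilot_sellable_gpu_hours: int
    promised_gpu_hours: int
    delivered_gpu_hours: int
    baseline_premium_sla_misses: int
    pilot_premium_sla_misses: int


def utilization_lift(r: PilotResult) -> float:
    """Relative change in sellable GPU-hours versus the pre-deployment baseline."""
    return (r.pilot_sellable_gpu_hours - r.baseline_sellable_gpu_hours) / r.baseline_sellable_gpu_hours


def forecast_error(r: PilotResult) -> float:
    """Absolute gap between promised and delivered GPU-hours, as a share of promised."""
    return abs(r.promised_gpu_hours - r.delivered_gpu_hours) / r.promised_gpu_hours


def passes_wedge_test(r: PilotResult, min_lift: float = 0.10) -> bool:
    """ASSUMPTION: pass means at least 10% relative lift and no rise in premium SLA misses."""
    return (
        utilization_lift(r) >= min_lift
        and r.pilot_premium_sla_misses <= r.baseline_premium_sla_misses
    )


pilot = PilotResult(
    baseline_sellable_gpu_hours=100_000,
    pilot_sellable_gpu_hours=112_000,
    promised_gpu_hours=50_000,
    delivered_gpu_hours=48_500,
    baseline_premium_sla_misses=2,
    pilot_premium_sla_misses=2,
)
weak_pilot = replace(pilot, pilot_sellable_gpu_hours=105_000)

print(passes_wedge_test(pilot), passes_wedge_test(weak_pilot))  # True False
```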

Milestones

0–12 months
  • Secure 2 design partners and convert at least 1 into a paid overlay pilot.
  • Prove a read-only deployment that improves sellable utilization or commitment confidence without increasing premium SLA misses.
  • Ship one repeatable integration path above a common cluster control stack and complete a production-ready security package.
  • Convert the first pilot into an annual production contract with a defined managed-MW expansion path.
12–24 months
  • Reach 3-4 production customers and roughly 30-40 MW under active management.
  • Launch guarded admission control for flex workloads at production customers.
  • Show at least one customer expansion from the first site into additional MW or a second site.
  • Build internal benchmarking on forecast accuracy and interruption tolerance across multiple deployments.
24–36 months
  • Reach the researched year-3 target of roughly 80 MW under management.
  • Establish the company as the default commercialization layer for third-party variable-power AI sites in its beachhead segment.
  • Expand product scope into portfolio-level planning and benchmarking while preserving neutral-control-plane positioning.
Strategy map
flowchart LR
  Wedge[Variable-power site launch wedge] --> MVP[Read-only forecasting and flex-queue MVP]
  MVP --> Proof[Forecast accuracy utilization lift first paid pilot]
  Proof --> Expansion[Admission control multi-site expansion benchmarking]

Founding team

Founding eng (start: Month 0)
  Rationale: Owns forecasting engine, telemetry ingestion, policy logic, and first operator integrations.

Founder CEO (start: Month 0)
  Rationale: Required for founder-led sales into a concentrated technical buyer set and for design-partner recruitment.

Founding product/infrastructure (start: Month 0)
  Rationale: Bridges operator workflows, pilot metrics, and scheduler integration details so the roadmap stays tied to commercial proof.

Integration / solutions engineer (start: Month 4-6)
  Rationale: Needed once pilots start to handle scheduler integrations, deployment safety, and customer-specific telemetry mapping without slowing core product work.

Customer success / implementation lead (start: Month 9-12)
  Rationale: Supports pilot-to-production conversion and creates a repeatable rollout process as customers add more MW or sites.

Experiment roadmap

Horizon | Experiment | Hypothesis | Success metric | Owner
0–90 days | Interview 12-15 target operators and site developers to build a named beachhead account list with launch timing, MW size, and buy-versus-build preferences. | The market contains enough imminent variable-power sites to support at least 3 paid pilot opportunities in the next 12 months. | At least 10 qualified target accounts and 3 active pilot-scope discussions. | Founder CEO
0–90 days | Build a simulation or shadow-mode prototype that converts sample power-envelope changes into revised sellable GPU-hour forecasts and flex-tier recommendations. | Operators will see clear commercial value from forecasting and packaging before the product touches live admission control. | At least 2 design partners agree the prototype is valuable enough to scope a paid overlay pilot. | Founding eng
3–6 months | Run the first paid overlay pilot on one live cluster with forecast dashboards, flex-queue policy rules, and post hoc comparison against manual planning. | The product can increase safe sellable utilization without increasing premium SLA misses. | At least 10% utilization lift or equivalent reduction in overcommit risk with no increase in premium queue SLA misses during the pilot window. | Founding product/infrastructure
6–12 months | Add guarded admission-control recommendations or automation for flex workloads at the first production candidate. | Once overlay value is proven, operators will trust limited control-path automation for non-premium jobs. | First customer authorizes production use for at least one flex workload class and signs an annual contract. | Integration / solutions engineer
12–18 months | Benchmark interruption tolerance, price sensitivity, and completion-window acceptance across early workload types and sites. | Cross-customer data will reveal repeatable flex-capacity product templates that raise win rate and forecast quality. | Publish internal benchmarks used in at least 2 expansion deals and show better forecast accuracy than site-specific manual rules alone. | Founder CEO
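
The shadow-mode prototype in the second 0–90 day experiment can be pictured with a small calculation. The sketch below is an illustration under stated assumptions, not the product's method: it assumes each hour comes with a conservative power bound and an expected power level, sizes a flat premium tier to the worst conservative hour, and offers the remaining expected headroom as interruptible flex capacity. The names `forecast_window`, `WindowForecast`, `kw_per_gpu`, and `installed_gpus`, and every sample number, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowForecast:
    firm_gpus: int                 # flat GPU count assumed safe to sell on premium terms
    flex_gpus_by_hour: List[int]   # extra interruptible GPUs available in each hour

    @property
    def firm_gpu_hours(self) -> int:
        return self.firm_gpus * len(self.flex_gpus_by_hour)

    @property
    def flex_gpu_hours(self) -> int:
        return sum(self.flex_gpus_by_hour)


def forecast_window(low_mw: List[float], expected_mw: List[float],
                    kw_per_gpu: float, installed_gpus: int) -> WindowForecast:
    """Split a power envelope into a flat firm tier and an hourly flex tier.

    low_mw      -- conservative hourly power bound in MW (hypothetical input)
    expected_mw -- expected hourly power in MW (hypothetical input)
    kw_per_gpu  -- assumed all-in facility kW needed per GPU
    """
    if not low_mw or len(low_mw) != len(expected_mw):
        raise ValueError("need two non-empty series of equal length")

    def gpus(mw: float) -> int:
        # GPUs the power level can feed, capped by what is installed
        return min(installed_gpus, int(mw * 1000 / kw_per_gpu))

    low = [gpus(mw) for mw in low_mw]
    # expected power is never treated as lower than the conservative bound
    expected = [gpus(max(lo, ex)) for lo, ex in zip(low_mw, expected_mw)]

    firm = min(low)                        # worst conservative hour sets the flat tier
    flex = [e - firm for e in expected]    # remaining expected headroom per hour
    return WindowForecast(firm_gpus=firm, flex_gpus_by_hour=flex)


if __name__ == "__main__":
    f = forecast_window([8.0, 5.0], [10.0, 7.5], kw_per_gpu=1.25, installed_gpus=7000)
    print(f.firm_gpus, f.flex_gpus_by_hour)      # 4000 [3000, 2000]
    print(f.firm_gpu_hours, f.flex_gpu_hours)    # 8000 5000
```

A real forecast would need probabilistic power inputs and per-site calibration; the point here is only that a changed envelope maps directly to revised firm and flex GPU-hour totals.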

Risk assessment

Business plan risks — 4 mapped
Risk matrix (likelihood vs. impact): R1 and R2 sit at high likelihood / high impact, R3 at medium likelihood / high impact, and R4 at medium likelihood / medium impact.
ID | Risk | Likelihood | Impact | Mitigation
R1 | The number of commercially relevant variable-power AI sites may be too small for fast ARR growth. | High | High | Sell to any power-constrained or staged-energization AI site with the same commercialization problem, not just offshore deployments.
R2 | Operators may refuse to trust a startup in the scheduling control path. | High | High | Begin with read-only forecasting and flex packaging, prove value on non-premium workloads, and add automation gradually.
R3 | Vertically integrated providers or in-house teams may absorb the functionality. | Medium | High | Focus on neutral third-party operators and compound a forecasting plus commercialization dataset that is hard to recreate quickly.
R4 | Flex-capacity demand may not be deep enough to justify the pricing model. | Medium | Medium | Validate interruption-tolerant workload classes early and tune product packaging around concrete completion-window preferences.
First customer
Title: Capacity leader at a variable-power neocloud
Profile: A North American GPU cloud startup commissioning its first 20-50 MW site with staged energization or non-flat behind-the-meter power and selling batch training capacity to external customers.
Trigger: The site is nearing launch, the sales team wants to contract capacity, and operations cannot yet promise flat always-on delivery with confidence.
Buyer: VP Infrastructure
Initial contract: Assumed $75k-$150k paid pilot on one site, converting to annual production pricing based on managed MW and flex GPU-hour volume once the operator proves safe utilization lift; credible early production range is roughly $300k-$800k annualized depending on live MW and usage.

What must be true

  • At least 5-10 non-hyperscaler variable-power AI sites in the target geography will be commercially relevant buyers within the next 24 months.
  • Operators will pay for a read-only forecasting and flex-capacity layer before demanding full scheduler replacement or full in-house builds.
  • One or more early workload classes such as batch training or offline inference will accept completion windows large enough to create repeatable flex demand.
  • The platform can improve sellable utilization or commitment confidence enough to justify roughly $60k-$70k per MW-year pricing.
  • Early deployments will generate forecasting and commercialization data that meaningfully outperform manual planning and generic scheduler rules over time.

Open diligence questions

  • How many live or funded sites in the next 24 months actually match the beachhead profile and third-party software buying model?
  • What quantitative proof would make an infra leader trust this product in overlay mode, and later in admission control?
  • Why will Crusoe-like vertical providers or in-house scheduler extensions not win the first deal by default?
  • Which workload types create the highest early flex fill rate without unacceptable customer support burden?
  • Does pricing by managed MW map to how buyers budget, or will they insist on pure usage pricing or bundled services?
Investor verdict
Call: Watch
Conviction: Medium-low. The product wedge is crisp and timely, but the first-wave buyer pool and standalone market size may be too small unless expansion happens quickly.
Why believe: Power-constrained AI infrastructure is becoming real, and a neutral layer that converts volatile megawatts into contractual compute products addresses a concrete operational pain that generic schedulers do not solve.
Why doubt: The inputs still leave open whether enough third-party variable-power sites will launch soon, and whether operators will buy neutral software instead of building internally or defaulting to vertically integrated providers.
Next diligence: Confirm that at least two target operators with live or imminent variable-power clusters will fund a paid overlay pilot before the company asks for deep control-plane authority.

Financial model

3-year totals
Year | Revenue | EBITDA | Cash EOP
Year 1 | $495K | -$658K | $1.34M
Year 2 | $1.93M | -$498K | $844K
Year 3 | $4.18M | $353K | $1.20M
Unit economics
ARPU (annual): $660K
Gross margin: 70%
CAC: $301K · Payback: 7.8 months
LTV / CAC: 8.5x · LTV: $2.57M
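
These headline ratios can be cross-checked with simple arithmetic. The sketch below assumes a plain gross-margin LTV (monthly gross profit divided by monthly churn) and uses the 1.5% steady-state monthly churn listed in the key assumptions. The report does not state its exact formula, so treat this as one formulation that lands on the same rounded figures. The function name `unit_economics` is hypothetical.

```python
def unit_economics(arpu_k: float, gross_margin: float,
                   cac_k: float, monthly_churn: float) -> dict:
    """Return payback, LTV, and LTV/CAC from per-customer inputs (USD thousands)."""
    monthly_gross_profit_k = arpu_k / 12 * gross_margin
    ltv_k = monthly_gross_profit_k / monthly_churn   # assumed simple-churn LTV
    return {
        "monthly_gross_profit_k": monthly_gross_profit_k,
        "payback_months": cac_k / monthly_gross_profit_k,
        "ltv_k": ltv_k,
        "ltv_to_cac": ltv_k / cac_k,
    }


if __name__ == "__main__":
    ue = unit_economics(arpu_k=660, gross_margin=0.70, cac_k=301, monthly_churn=0.015)
    print(f"payback {ue['payback_months']:.1f} months")   # payback 7.8 months
    print(f"LTV ${ue['ltv_k'] / 1000:.2f}M")              # LTV $2.57M
    print(f"LTV/CAC {ue['ltv_to_cac']:.1f}x")             # LTV/CAC 8.5x
```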
Funding ask
Round: pre-seed · $2.0M
Runway: 30 months
Milestone: Reach 4 production customers and roughly 40 MW under management, ship guarded admission control for flex workloads, and show the first multi-site expansion before a seed round.

Model sanity

  • Revenue engine. Base-case growth comes from scaling from 2 to 8 active operators at about 10 managed MW each and roughly $660K blended annual revenue per operator.
  • Must go right. The first paid pilots have to convert within about two quarters so the company reaches 4 production customers and ~40 MW under management by the end of Y2.
  • Model breaks if. The beachhead pipeline supports only 6 active operators or gross margin stays in the high 60s; in either case downside cash falls close to the floor before the seed milestone is proven.
  • Next-round proof. A seed round is justified by 4 production customers, a guarded admission-control module in market, and at least one multi-site expansion reference.
Chart: revenue, cash, and EBITDA over 12 months of Y1 and 8 quarters of Y2/Y3. Series: revenue (line, area), cash EOP (dashed), EBITDA (bars, gray = loss). Y-axis: $0K to $2.00M.
Use of funds — $2.0M pre-seed
Engineering 41% · GTM 23% · G&A 9% · Buffer (6 mo) 27%
Headcount build by role (peak 12 FTE)
Quarter | Q1Y1 | Q2Y1 | Q3Y1 | Q4Y1 | Q1Y2 | Q2Y2 | Q3Y2 | Q4Y2 | Q1Y3 | Q2Y3 | Q3Y3 | Q4Y3
FTE | 3 | 4 | 5 | 6 | 6 | 8 | 8 | 10 | 10 | 11 | 12 | 12
Roles charted: Founder CEO, Engineering, Product/Infrastructure, Solutions Engineer, Customer Success/Implementation, Sales, G&A/Finance
Sensitivity — Y3 cash and revenue impact, sorted by magnitude
Variable | Downside | Upside | Cash impact | Revenue impact
sales cycle | Pilot-to-production conversion slips by ~1 quarter because security and ops reviews take longer | Reference customers compress new-logo sales by ~1 quarter | -$430K | -$550K
ARPU | $600K blended annual revenue per customer | $720K blended annual revenue per customer | -$330K | -$380K
CAC | CAC rises toward ~$340K as the first rep needs more field time and references | Reference-led selling keeps CAC near ~$260K | -$235K | $0K
hiring pace | A second AE and extra engineering hire are pulled forward before repeatability is proven | One non-core hire is delayed until after Q4Y2 conversion targets are met | -$210K | $0K
churn | Monthly churn trends toward 2.5% and Y3 exits with one fewer retained operator | Monthly churn holds near 1.0% with strong renewal and expansion | -$180K | -$220K
gross margin | Gross margin stays at 67% because implementation remains service-heavy | Gross margin improves to 72% with cleaner integrations and less support load | -$125K | $0K

Scenarios

Downside: Y3 revenue $3.13M · Y3 EBITDA -$186K · Cash low point $214K
Description: Pilot conversion stretches by a quarter, one planned site expansion does not close, and the revenue mix lands closer to core managed-MW pricing than the blended package.
Key changes:
  • Blended annual revenue per active customer falls from $660K to $600K.
  • Customer ramp ends Y3 at 6 active operators instead of 8 because one pilot slips and one expansion never lands.
  • Gross margin stays at 67% because implementation remains more site-specific for longer.
Base: Y3 revenue $4.18M · Y3 EBITDA $353K · Cash low point $827K
Description: Founder-led year 1 lands two revenue-generating operators, year 2 reaches 4 production customers / ~40 MW, and year 3 expands to 8 active operators / ~80 MW.
Key changes:
  • Blended annual revenue per active customer stays at $660K.
  • Customer ramp reaches 2 active operators by Y1, 4 by Y2, and 8 by Y3.
  • Gross margin reaches the 70% business-plan target as integrations become more repeatable.
Upside: Y3 revenue $4.95M · Y3 EBITDA $920K · Cash low point $910K
Description: The first production customer expands faster, referenceability compresses the sales cycle, and one extra operator lands in Y3 without materially increasing fixed cost.
Key changes:
  • Blended annual revenue per active customer rises from $660K to $700K on stronger usage-fee capture.
  • Customer ramp reaches 9 active operators by Q4Y3 through one additional site expansion.
  • Gross margin improves from 70% to 72% as one integration path becomes standard.

Sensitivity

Variable | Downside | Base | Upside
ARPU | $600K blended annual revenue per customer | $660K blended annual revenue per customer | $720K blended annual revenue per customer
CAC | CAC rises toward ~$340K as the first rep needs more field time and references | Forecast CAC of ~$301K per net new operator | Reference-led selling keeps CAC near ~$260K
churn | Monthly churn trends toward 2.5% and Y3 exits with one fewer retained operator | Monthly churn stays at 1.5% with expansions offsetting early logo risk | Monthly churn holds near 1.0% with strong renewal and expansion
sales cycle | Pilot-to-production conversion slips by ~1 quarter because security and ops reviews take longer | Pilot-to-production conversion stays inside roughly 2 quarters after proof of value | Reference customers compress new-logo sales by ~1 quarter
gross margin | Gross margin stays at 67% because implementation remains service-heavy | Gross margin reaches the 70% business-plan target | Gross margin improves to 72% with cleaner integrations and less support load
hiring pace | A second AE and extra engineering hire are pulled forward before repeatability is proven | Hires stay milestone-based around pilots, conversions, and multi-site proof | One non-core hire is delayed until after Q4Y2 conversion targets are met
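
As a complementary lens, the four numeric variables above can be run one at a time through an LTV/CAC calculation. This is a minimal sketch and does not reproduce the report's Y3 cash and revenue impacts: it assumes LTV is monthly gross profit divided by monthly churn, holds every other input at its base value, and omits sales cycle and hiring pace because they do not enter that formula. The names `BASE`, `RANGES`, `ltv_to_cac`, and `ltv_cac_swings` are hypothetical.

```python
# Base values taken from the Base column (USD thousands where applicable)
BASE = {"arpu_k": 660.0, "gross_margin": 0.70, "cac_k": 301.0, "monthly_churn": 0.015}

# variable label -> (input key, downside value, upside value) from the table
RANGES = {
    "ARPU": ("arpu_k", 600.0, 720.0),
    "CAC": ("cac_k", 340.0, 260.0),
    "churn": ("monthly_churn", 0.025, 0.010),
    "gross margin": ("gross_margin", 0.67, 0.72),
}


def ltv_to_cac(arpu_k: float, gross_margin: float,
               cac_k: float, monthly_churn: float) -> float:
    """LTV/CAC under an assumed simple-churn LTV."""
    ltv_k = arpu_k / 12 * gross_margin / monthly_churn
    return ltv_k / cac_k


def ltv_cac_swings():
    """Return (variable, downside ratio, upside ratio), widest swing first."""
    rows = []
    for name, (key, down, up) in RANGES.items():
        low = ltv_to_cac(**{**BASE, key: down})    # change one input only
        high = ltv_to_cac(**{**BASE, key: up})
        rows.append((name, low, high))
    return sorted(rows, key=lambda r: abs(r[2] - r[1]), reverse=True)


if __name__ == "__main__":
    for name, low, high in ltv_cac_swings():
        print(f"{name:13s} {low:5.1f}x to {high:5.1f}x")
```

Under these assumptions churn produces the widest LTV/CAC swing, which is a different ordering from the cash-impact ranking because the two views measure different things.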
Key assumptions (30)
ID | Name | Value | Unit (USDK = thousand USD, USDM = million USD) | Source
A1 | Model start month | 2026-06 | month | [business-plan.yaml date] startup-finance heuristic: first full month after plan date
A2 | Opening cash at M1 | 2000 | USDK | [business-plan.yaml fundingAsk.targetFundingRangeUsd] selects the low end of the stated $2–4M pre-seed range because the base case reaches the next milestone before cash drops below ~$0.8M
A3 | Mature managed MW per active operator | 10 | MW/customer | [research.yaml market.som] 80 MW year-3 SOM mapped to 8 active operators in the base case
A4 | Blended annual revenue per mature customer | 660 | USDK/year | [business-plan.yaml gtm.pricing; businessModel.revenueStreams; research.yaml bottomUpSizingDrivers] 10 MW x ~$60K/MW-year plus ~10% usage and analytics uplift
A5 | Revenue recognized in landing month | 50 | percent of monthly ARPU | startup-finance heuristic: enterprise pilots and first production ramps start mid-month on average
A6 | Active paying customer ramp | 2 by Y1 / 4 by Y2 / 8 by Y3 | customers | [business-plan.yaml milestones; research.yaml validationPlan] conservative path inside the stated 4–8 early commercial deployments and 80 MW year-3 target
A7 | Gross margin target | 70 | percent | [business-plan.yaml businessModel.targetGrossMarginPct]
A8 | Founder CEO loaded annual cash cost | 144 | USDK/year | startup-finance heuristic: $120K cash salary plus 20% payroll tax and benefits
A9 | Engineer loaded annual cash cost | 180 | USDK/year | startup-finance heuristic: $150K salary plus 20% payroll tax and benefits for GPU infrastructure talent
A10 | Product/infrastructure loaded annual cash cost | 168 | USDK/year | startup-finance heuristic: $140K salary plus 20% payroll tax and benefits
A11 | Solutions engineer loaded annual cash cost | 156 | USDK/year | startup-finance heuristic: $130K salary plus 20% payroll tax and benefits
A12 | Customer success / implementation loaded annual cash cost | 132 | USDK/year | startup-finance heuristic: $110K salary plus 20% payroll tax and benefits
A13 | Sales AE loaded annual base cost | 168 | USDK/year | startup-finance heuristic: $140K base plus 20% payroll tax and benefits; commissions are modeled separately
A14 | G&A / finance loaded annual cash cost | 120 | USDK/year | startup-finance heuristic: $100K salary plus 20% payroll tax and benefits
A15 | R&D non-payroll spend | 10 in Y1, 12 in Y2, 14 in Y3 | USDK/month | startup-finance heuristic: cloud telemetry, observability, and security tooling for infrastructure software
A16 | Sales and marketing non-payroll spend | 6 in Y1, 8 in Y2, 10 in Y3 + 1.5 per AE + 8% of revenue | USDK/month | startup-finance heuristic: founder travel, references, conferences, and commissions for high-touch enterprise selling
A17 | G&A non-payroll spend | 7 in Y1, 9 in Y2 pre-finance hire, 10 after finance hire, and 11 in Y3 | USDK/month | startup-finance heuristic: legal, insurance, audit, and back-office overhead for critical infrastructure customers
A18 | First solutions engineer hire | Month 5 | month | [business-plan.yaml team] integration / solutions engineer needed once pilots start
A19 | Second engineer hire | Month 9 | month | startup-finance heuristic tied to [business-plan.yaml milestones 0–12 months] to finish repeatable integration and production security packaging before first conversion
A20 | First customer success hire | Month 10 | month | [business-plan.yaml team] customer success / implementation lead starts in the month 9–12 window
A21 | First AE hire | Month 14 | month | [business-plan.yaml gtm.funnelTargets; team] startup-finance heuristic: add dedicated sales only after the first production contract proves the wedge
A22 | Third engineer hire | Month 16 | month | [business-plan.yaml milestones 12–24 months] supports guarded admission control and multi-site benchmark ingestion
A23 | Finance / compliance hire | Month 22 | month | startup-finance heuristic tied to [business-plan.yaml milestones 12–24 months] when 3–4 production customers require more vendor management and finance ops
A24 | Fourth engineer hire | Month 22 | month | [business-plan.yaml milestones 24–36 months] supports portfolio analytics and second-site expansion readiness
A25 | Second customer success hire | Month 28 | month | startup-finance heuristic tied to [business-plan.yaml milestones 24–36 months] when customers add more MW or second sites
A26 | Second AE hire | Month 31 | month | startup-finance heuristic: add a second seller only after referenceable production deployments exist
A27 | Steady-state monthly churn for unit economics | 1.5 | percent | startup-finance heuristic: conservative early-stage enterprise infrastructure-software renewal risk
A28 | Blended CAC | 300.8 | USDK/customer | calculated from forecast Y2–Y3 sales and marketing spend of $1.805M divided by 6 net new active operators
A29 | Cash conversion timing | In-period collection | policy | startup-finance heuristic; flagged because infrastructure operators may still pay on 45–60 day terms
A30 | Funding ask | 2.0 | USDM | [business-plan.yaml fundingAsk] aligns with the low end of the stated range while preserving a >$0.8M cash floor in the base case
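
Several of the assumptions above are derived values, and the stated derivations can be replayed directly. The sketch below recomputes A4 (blended ARPU), A28 (blended CAC), and the loaded role costs A8 to A14 from the inputs quoted in the Source column. Function and variable names are hypothetical, and the 20% uplift is applied uniformly as the table describes.

```python
PAYROLL_UPLIFT = 0.20  # payroll tax and benefits, per A8 to A14

# Base salaries in USD thousands, as quoted in the Source column
BASE_SALARIES_K = {
    "Founder CEO": 120,
    "Engineer": 150,
    "Product/infrastructure": 140,
    "Solutions engineer": 130,
    "Customer success / implementation": 110,
    "Sales AE (base)": 140,
    "G&A / finance": 100,
}


def blended_arpu_k(mw_per_operator: float = 10, price_k_per_mw_year: float = 60,
                   usage_uplift: float = 0.10) -> float:
    """A4: managed MW x price per MW-year, plus usage and analytics uplift."""
    return mw_per_operator * price_k_per_mw_year * (1 + usage_uplift)


def blended_cac_k(sales_marketing_spend_k: float = 1805,
                  net_new_operators: int = 6) -> float:
    """A28: Y2-Y3 sales and marketing spend divided by net new operators."""
    return sales_marketing_spend_k / net_new_operators


def loaded_costs_k() -> dict:
    """A8 to A14: base salary plus the payroll uplift, per role."""
    return {role: salary * (1 + PAYROLL_UPLIFT)
            for role, salary in BASE_SALARIES_K.items()}


if __name__ == "__main__":
    print(f"ARPU ${blended_arpu_k():.0f}K")   # ARPU $660K
    print(f"CAC ${blended_cac_k():.1f}K")     # CAC $300.8K
    for role, cost in loaded_costs_k().items():
        print(f"{role}: ${cost:.0f}K")
```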
Unit economics flow
flowchart LR
  TargetAccounts --> PaidPilots
  PaidPilots --> ManagedMW
  ManagedMW --> ProductionCustomers
  ProductionCustomers --> Revenue
  Revenue --> GrossProfit
  GrossProfit --> Cash

Flags:
  • The initial SAM is still narrow, so the base case depends on winning 8 of a small number of North American variable-power operator deployments by Q4Y3.
  • Revenue is modeled as a blended per-operator ARPU and does not separately show subscription, usage, and implementation mix by contract.
  • Cash collection is assumed in-period even though infrastructure operators may still pay on 45–60 day terms, which would reduce the base-case cash cushion.
  • Holding 70% gross margin from Y2 onward assumes site integrations become meaningfully repeatable instead of remaining service-heavy.

Top risks

  • Market still early. There may be too few variable-power AI clusters in production this year to support fast initial ARR growth. Mitigation: Start with any behind-the-meter or curtailed-power GPU sites, not only offshore deployments, while keeping the same product architecture.
  • Integration friction. Infrastructure teams may resist inserting a new control layer into mission-critical cluster operations. Mitigation: Launch as a read-only forecasting and flex-queue product first, then earn trust before taking over admission control.
  • Customers may prefer fixed-power contracts. Enterprise buyers could avoid flex capacity if the operational tradeoff feels too complex. Mitigation: Focus early sales on batch training, backfills, and offline inference workloads where price savings clearly outweigh timing flexibility.