Token Budgets for Investment Firms: What's the ROI? And Where's the Limit?

👋 Hi, I’m Andre and welcome to my newsletter Data Driven VC which is all about becoming a better investor with data and AI.

Join our free virtual roundtable “The Compounding Data Layer: Building an Edge When Everyone Uses the Same AI Model” with Eight Roads, Offline Ventures, and Kruncher here

Brought to you by Vessel - Agentic fund operations for VC and PE firms

100+ done deals a year. A lean team. Zero operational drag.

FJ Labs runs their entire LP operations on Vessel: automated updates, on-demand reporting, and agents that handle the work so their partners don't have to.

For a decade, the cost of a knowledge worker came in two parts: payroll and software. A third line is now forming on the budget. Tokens.

Ramp's AI Index, which tracks observed corporate-card and bill-pay spend, puts the median US firm at $11.38 per employee per month on AI. The top 10% spend about $611. The top 1%, the cohort Ramp calls "AI-pilled," spend roughly $7,500 per employee per month.

That is roughly a 660x gap between the frontier and the middle. No other line item on the budget is distributed that unevenly, and among the top 1% the spend grew 14.1% in a single month.

The unit economics are counterintuitive. The cost to run a fixed level of AI capability has fallen by roughly 280x over two years on Stanford HAI's inference-cost index, yet total bills keep climbing.

The driver is consumption.

Reasoning models think several times longer per task, agentic tools chain dozens of calls, and Goldman Sachs projects total token usage growing about 24x by 2030. Cheaper per token has not meant cheaper overall.

So the budgeting question has changed. The old question was "what does a seat cost." The working question now is "what is a person's token budget, and how does it sit against what that person costs to employ."

For investment firms, where the people are expensive and the data is sensitive, that second question is the one worth getting right.

What firms actually spend

The token data in our DDVC community is still fresh and evolving. A first snapshot will be published soon in the DDVC Landscape report.

Public benchmarks for AI spend inside investment firms are almost nonexistent.

For the purpose of this episode, I’ll triangulate from broader knowledge-work and financial-services data, then adjust for the economics of an investment firm.

Three benchmark families are worth holding side by side, because they disagree, and the disagreement matters.

Cohort	AI spend per employee per month	What it buys
Median firm	$11.38	Basically one subscription
Top 10%	$611	A few enterprise seats plus some API
Top 1% ("AI-pilled")	$7,500	Heavy agentic and automation load

Source: Ramp AI Index, 2026.

The first family is Ramp's observed distribution above. It is the cleanest read on what companies actually pay per head, and it shows extreme dispersion.

The second is a blended average: enterprise AI spend of roughly $1,240 per employee per year across firms with 500 or more employees, per IDC-derived compilations. That is about $103 per month, a different methodology that lands between Ramp's median and top decile.

The third is the absolute level. CloudZero put the average organization at $85,521 per month on AI-native applications in 2025, up 36% year over year, with 45% of organizations planning to spend more than $100,000 per month.

The frontier is louder still. Mercor's CEO has said the company spends more on tokens for internal agents than on headcount, and an Nvidia executive has said his team's compute costs now exceed its salaries.

Two cautions on the data. The per-employee figures blend seats, API usage, and bundled SaaS, and they undercount "shadow AI" bought on personal cards, which Menlo Ventures estimates at close to 40% of application AI spend.

And self-reported and forecast figures should be read as direction, not precision.

The ratio that matters

Put the two budgets in the same frame. The fully loaded cost of an employee runs about 1.25x to 1.4x base salary once benefits, payroll taxes, and overhead are included. The BLS put benefits alone at 29.8% of total compensation in June 2025. For senior investment staff, bonus and carried interest push the multiple well past 3x base.

In monthly terms, a US software engineer costs on the order of $16,000 fully loaded. An investment professional in a major market sits in a similar band, and a partner is a multiple of it.

Per-employee AI spend	Share of a $16k/month loaded employee
Median, $11	~0.07%
Top decile, $611	~3.8%
Uber's per-tool cap, $1,500	~9%
Top 1%, $7,500	~47%
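
The shares in the table are simple arithmetic and easy to rerun against your own numbers. A minimal sketch in Python, assuming the $16,000 monthly loaded cost used above; the function name `share_of_loaded_cost` is illustrative, not from any library.

```python
# Reproduces the share-of-loaded-cost column from the table above.
# Assumption: $16,000 per month fully loaded, the article's illustrative figure.
LOADED_COST_PER_MONTH = 16_000

AI_SPEND_PER_MONTH = {
    "Median": 11.38,
    "Top decile": 611,
    "Uber per-tool cap": 1_500,
    "Top 1%": 7_500,
}


def share_of_loaded_cost(ai_spend: float, loaded_cost: float = LOADED_COST_PER_MONTH) -> float:
    """Return monthly AI spend as a percentage of monthly loaded cost."""
    return 100 * ai_spend / loaded_cost


for cohort, spend in AI_SPEND_PER_MONTH.items():
    print(f"{cohort:<18} ${spend:>9,.2f}  {share_of_loaded_cost(spend):6.2f}%")
```

Swap in a partner's loaded cost and the same dollar budget shrinks to an even smaller share, which is the argument for anchoring budgets to loaded cost rather than a flat figure.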

For almost every firm the conclusion is the same: Tokens are a rounding error against payroll, and they stay that way even at aggressive adoption levels.

A top-decile budget of $611 per person per month is under 4% of a loaded engineer's cost. To break even, it needs to return roughly 1.5 hours of that person's time per week. The measured productivity evidence clears that bar with room to spare.
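
The break-even figure follows from converting the loaded cost into an hourly rate. A rough sketch, assuming a 40-hour week and 52 working weeks spread evenly over 12 months; both are simplifying assumptions of mine, and the function name is hypothetical.

```python
# Break-even: how many hours per week must an AI budget return to pay for itself?
# Assumptions (not from the source data): 40-hour week, 52/12 weeks per month.
HOURS_PER_WEEK = 40
WEEKS_PER_MONTH = 52 / 12


def break_even_hours_per_week(ai_budget_per_month: float,
                              loaded_cost_per_month: float,
                              hours_per_week: float = HOURS_PER_WEEK) -> float:
    """Hours of a person's time per week worth the same as the monthly AI budget."""
    hourly_cost = loaded_cost_per_month / (hours_per_week * WEEKS_PER_MONTH)
    break_even_hours_per_month = ai_budget_per_month / hourly_cost
    return break_even_hours_per_month / WEEKS_PER_MONTH


hours = break_even_hours_per_week(611, 16_000)
print(f"Break-even: {hours:.2f} hours per week")
```

With $611 against $16,000, this works out to about 1.5 hours per week. The weeks-per-month factor cancels, so the result is simply the budget's share of loaded cost multiplied by the hours in a working week.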

So for most investment firms, affordability is rarely the binding constraint. Data control and runaway agentic spend are.

What the tokens actually buy

The case for spending rests on time returned, and the measured numbers are large enough that the math is not close.

The St. Louis Fed found generative AI users saved 5.4% of work hours, about 2.2 hours per week, with daily users saving four or more. The LSE Inclusion Initiative and Protiviti put the average at 7.5 hours per week per user, worth roughly £14,000 per employee per year, rising to 11 hours for trained users.

In controlled settings the effects are sharper. GitHub Copilot lifted weekly pull requests by about 26%, a professional-writing study cut completion time by roughly 40%, and Microsoft 365 Copilot users spent 3.6 fewer hours per week on email.

For an investment firm the analogous wins are sourcing pipelines, outbound emails, data-room synthesis, competitive landscape analysis, and investment memos. A few hours per professional per week, against a loaded cost that runs from hundreds to thousands of dollars per day, makes a $600 monthly budget trivially ROI-positive.
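
To see how the ROI moves with the hours-saved estimate, here is a small sketch that values returned time at the loaded hourly rate. It assumes a 40-hour week and the $16,000 loaded monthly cost from earlier; the function names are illustrative.

```python
# Values time returned by AI at the loaded hourly rate and compares it with the budget.
# Assumptions: 40-hour week; every saved hour is worth one hour of loaded cost.
HOURS_PER_WEEK = 40


def monthly_value_of_time_saved(hours_saved_per_week: float,
                                loaded_cost_per_month: float) -> float:
    """Dollar value per month of the hours returned each week."""
    return hours_saved_per_week / HOURS_PER_WEEK * loaded_cost_per_month


def roi_multiple(hours_saved_per_week: float,
                 loaded_cost_per_month: float,
                 ai_budget_per_month: float) -> float:
    """Value of time returned divided by the AI budget."""
    value = monthly_value_of_time_saved(hours_saved_per_week, loaded_cost_per_month)
    return value / ai_budget_per_month


# Hours-saved figures quoted above: St. Louis Fed (2.2) and LSE/Protiviti (7.5).
for label, hours in [("St. Louis Fed", 2.2), ("LSE / Protiviti", 7.5)]:
    value = monthly_value_of_time_saved(hours, 16_000)
    multiple = roi_multiple(hours, 16_000, 600)
    print(f"{label:<16} ${value:,.0f} per month, {multiple:.1f}x the budget")
```

Under these assumptions the 2.2-hour figure is worth about $880 per month against a $600 budget, and the 7.5-hour figure about $3,000. The conservative estimate clears the bar by a modest margin and the optimistic one by a wide margin, so the choice of estimate matters.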

The risk on this side is over-claiming. Self-reported gains around 40% run far above measured gains around 5%. The value is real, and it is easy to overstate.

Join 1785+ investors in our free Slack group as we automate our VC job end-to-end with AI. Live experiment. Full transparency.

Seat versus usage: two cautionary tales from 2026

The harder question than "how much" is "in what form." Two real cases frame it.

Uber rolled Claude Code to roughly 5,000 engineers and exhausted its entire 2026 AI coding-tools budget within four months. By March 2026, about 84% of its engineers were agentic-coding users. The response was a hard cap of $1,500 per employee per month, per tool. Power users had been running $500 to $2,000 each, against an average of $150 to $250.

Microsoft went the other way. It began canceling Claude Code licenses in its Experiences and Devices division and redirecting engineers to GitHub Copilot CLI, a move that press reports framed as driven by cost certainty. Copilot bills a flat per-seat rate, while Claude Code charges a base seat fee plus variable token usage.

The two cases are the whole decision in miniature:

Seats give predictability and a known ceiling
Usage scales with value and carries spike risk

Dimension	Per-seat	Usage / API (tokens)
Cost behavior	Fixed, predictable	Variable, scales with work
Best for	Individual assistant use	Agentic workflows, automations
Main risk	Idle seats	Runaway spend, spikes
Governance need	Low	High: caps, attribution, anomaly detection
Real signal	Microsoft chose flat seats	Uber capped usage at $1,500/mo
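
The tradeoff in the table can be made concrete by pricing the same team both ways. The sketch below is illustrative only: the $200 seat price and the per-user usage figures are hypothetical, and it ignores the usage limits that real seat plans carry.

```python
from dataclasses import dataclass


@dataclass
class PlanComparison:
    seat_total: float       # cost if everyone sits on a flat seat
    usage_total: float      # cost if everyone is billed on metered usage
    peak_user_usage: float  # the single largest metered bill
    cheaper: str            # "seat" or "usage"


def compare_plans(metered_usage_per_user: list[float], seat_price: float) -> PlanComparison:
    """Price one month for a team under flat seats versus metered usage."""
    seat_total = seat_price * len(metered_usage_per_user)
    usage_total = sum(metered_usage_per_user)
    cheaper = "seat" if seat_total <= usage_total else "usage"
    return PlanComparison(seat_total, usage_total, max(metered_usage_per_user), cheaper)


# Hypothetical monthly metered bills in USD for a five-person team.
steady_team = [40, 60, 80, 150, 120]
team_with_power_user = [40, 60, 80, 150, 1_800]

print(compare_plans(steady_team, seat_price=200))
print(compare_plans(team_with_power_user, seat_price=200))
```

In this made-up example the steady team is cheaper on usage, because most seats would sit partly idle, while one power user is enough to flip the answer toward seats. That is the spike risk in the table's right-hand column.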

The operator consensus lands in a sensible place.

Keep individual assistant use on seat plans, where it caps the downside and keeps a clean view of the per-hour-of-staff versus per-token tradeoff.

Move agentic workflows and automations to metered API behind a gateway, where the value is highest and the controls need to be tightest.

One more warning worth budgeting around. Today's token prices are partly subsidized by frontier labs competing for share. A workflow whose economics only clear at current prices is fragile if prices rise after the model providers go public.

Where the spend concentrates

Clean cross-department per-head data is not well published, so this is a synthesis of the case evidence. The pattern is consistent. Engineering and R&D dominate, because agentic coding is the single most token-hungry workload.

Function	Spend intensity	Dominant instrument
Engineering / R&D	Highest	Usage plus seats
Investment / deal team	Highest value	Seats plus gated API
Data / automation	Variable	API behind a gateway
Ops / marketing / knowledge	Low to medium	Seats

For an investment firm the practical reading is that the deal and research teams are the highest-value adopters, while engineering and data functions carry the highest spend and the highest runaway risk. The budget should follow that shape.

How to set a standard

The FinOps Foundation, whose practitioners now manage AI spend at 98% of surveyed organizations (up from 63% a year earlier), has converged on a phased approach. Here it is, sized for a firm rather than a hyperscaler.

Phase one, months 1 to 3. Baseline before you cap. Run a controlled free-for-all, route everything through one gateway, govern API keys, tag by user and use case, and stand up a dashboard with per-account alerts. A month of observation beats a guessed budget.
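
A baseline is mostly a matter of tagging every call and summing by tag. The sketch below assumes your gateway can export usage records with a user, a use case, a model, and a cost; the record shape, field names, and sample values are hypothetical and not tied to any specific gateway.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    user: str
    use_case: str
    model: str
    cost_usd: float


def spend_by(records: list[UsageRecord], key: str) -> dict[str, float]:
    """Total spend grouped by one tag, such as 'user', 'use_case', or 'model'."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[getattr(record, key)] += record.cost_usd
    return dict(totals)


def accounts_over(records: list[UsageRecord], alert_threshold_usd: float) -> list[str]:
    """Users whose total spend exceeds the per-account alert threshold."""
    per_user = spend_by(records, "user")
    return sorted(user for user, total in per_user.items() if total > alert_threshold_usd)


# Hypothetical export from a gateway.
records = [
    UsageRecord("analyst_a", "memo_drafting", "model-large", 42.5),
    UsageRecord("analyst_a", "sourcing", "model-small", 7.5),
    UsageRecord("engineer_b", "agentic_coding", "model-large", 310.0),
    UsageRecord("engineer_b", "agentic_coding", "model-large", 95.0),
]

print(spend_by(records, "use_case"))
print(accounts_over(records, alert_threshold_usd=300))
```

Grouping the same records by user, use case, and model gives the three views the dashboard needs before any cap is set.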

Phase two, months 3 to 9. Attribute and right-size. Show each team its own spend, break it down by model, turn on prompt caching and batching, and switch on anomaly detection.
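
Anomaly detection does not need to be sophisticated to catch a runaway agent. One simple option, sketched below, flags a day whose spend sits far above the recent median, using the median absolute deviation as the yardstick. The multiplier `k`, the floor, and the sample figures are assumptions to tune against your own baseline.

```python
from statistics import median


def spend_threshold(history_usd: list[float], k: float = 5.0, floor_usd: float = 1.0) -> float:
    """Alert threshold: median plus k times the median absolute deviation (MAD).

    floor_usd keeps the threshold sensible when recent spend is nearly constant.
    """
    center = median(history_usd)
    mad = median([abs(x - center) for x in history_usd])
    return center + k * max(mad, floor_usd)


def is_anomalous(history_usd: list[float], today_usd: float, k: float = 5.0) -> bool:
    """True if today's spend exceeds the threshold derived from recent history."""
    return today_usd > spend_threshold(history_usd, k=k)


# Hypothetical daily spend for one team over the past week, in USD.
last_week = [20, 22, 19, 25, 21, 23, 20]

print(spend_threshold(last_week))      # threshold for this history
print(is_anomalous(last_week, 24))     # an ordinary day
print(is_anomalous(last_week, 400))    # a runaway agent
```

The median-based rule is used here because a single earlier spike does not distort it the way a mean and standard deviation would.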

Phase three, beyond. Optimize and route. Default to the cheapest model that clears the quality bar and escalate only on evidence. Intelligent routing can cut cost per request by 60% to 80%, and the RouteLLM research reported cost reductions of up to 85% while preserving about 95% of flagship quality.
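
"Cheapest model first, escalate on evidence" can be sketched as a simple cascade. Note that this is a plain try-then-escalate loop, not the learned router described in the RouteLLM research. The model names, per-call costs, stub functions, and quality check are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Model:
    name: str
    cost_per_call_usd: float    # hypothetical flat cost per call
    run: Callable[[str], str]   # stand-in for a real model call


@dataclass
class RouteResult:
    model: str
    answer: str
    cost_usd: float
    cleared_bar: bool


def route(prompt: str, models: list[Model],
          clears_quality_bar: Callable[[str], bool]) -> RouteResult:
    """Try models from cheapest to most expensive; stop at the first good answer."""
    if not models:
        raise ValueError("at least one model is required")
    spent = 0.0
    for model in sorted(models, key=lambda m: m.cost_per_call_usd):
        answer = model.run(prompt)
        spent += model.cost_per_call_usd
        if clears_quality_bar(answer):
            return RouteResult(model.name, answer, spent, True)
    # No model cleared the bar: return the most capable attempt, flagged as such.
    return RouteResult(model.name, answer, spent, False)


# Stubs: the small model gives up on memo requests, the large model always answers.
MODELS = [
    Model("large", 2.0, lambda prompt: "detailed answer"),
    Model("small", 0.25, lambda prompt: "" if "memo" in prompt else "short summary"),
]


def non_empty(answer: str) -> bool:
    return bool(answer.strip())


print(route("summarize this email", MODELS, non_empty))
print(route("draft an investment memo", MODELS, non_empty))
```

An escalated request pays for both calls, so a cascade only saves money when most traffic clears the bar on the cheap model. Measuring that share is what the phase-one baseline is for.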

Join our paid community to access 450+ deep dives and step-by-step guides, 100+ masterclasses and recordings, prompt libraries, automation templates, and lots more.

5 rules above all

Anchor the budget to loaded cost, not a flat dollar. A single org-wide number under-serves a partner and over-serves an analyst. Set the guardrail as a share of each person's fully loaded monthly cost, then express it as a hard per-tool cap. A cap under roughly 10% of loaded cost is almost always ROI-positive.
Put hard caps at the tool and feature level, not just the team level. A single misbehaving agent can drain a team's monthly budget in hours, so the guardrail has to sit one level deeper, with alerts that fire before the spike.
Use a gateway. A proxy such as LiteLLM, Helicone, or Portkey gives one endpoint, a virtual key per user, per-key budgets, and call-level attribution across providers. This is the highest-leverage piece of plumbing for control.
Tier your policies by data sensitivity. This matters most for an investment firm. Run a strict default policy for anything touching deal data, LP information, or material non-public information, and a looser policy for low-sensitivity R&D. The gateway is where you enforce model allow-lists, zero-data-retention requirements, and data-classification rules. FINRA-regulated contexts make this non-negotiable.
Treat each agent like a new hire. Onboard an AI agent the way you would onboard a junior analyst: scoped permissions, a playbook, a spend limit, and recommend-then-approve rather than act-autonomously for anything expensive.
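
Several of these rules meet at the gateway, where a request is checked before it is sent. The sketch below combines a cap set as a share of loaded cost with a strict-by-default policy tier. The tier names, data classes, model identifiers, and the `authorize` function are hypothetical and do not reflect the API of LiteLLM, Helicone, or Portkey.

```python
from dataclasses import dataclass

# Hypothetical policy tiers. "strict" is the default for anything not listed as low sensitivity.
POLICY_TIERS = {
    "strict": {"allowed_models": {"zdr-model-a"}, "zero_data_retention": True},
    "rnd": {"allowed_models": {"zdr-model-a", "model-b", "model-c"}, "zero_data_retention": False},
}
LOW_SENSITIVITY_CLASSES = {"public", "rnd"}


@dataclass
class Request:
    user: str
    tool: str
    model: str
    data_class: str            # e.g. "deal", "lp", "mnpi", "public", "rnd"
    estimated_cost_usd: float


def tier_for(data_class: str) -> str:
    """Unknown or sensitive data classes fall through to the strict tier."""
    return "rnd" if data_class in LOW_SENSITIVITY_CLASSES else "strict"


def hard_cap(loaded_cost_per_month: float, max_share: float = 0.10) -> float:
    """Per-tool monthly cap expressed as a share of the person's loaded cost."""
    return loaded_cost_per_month * max_share


def authorize(request: Request, spent_on_tool_this_month: float,
              loaded_cost_per_month: float) -> tuple[bool, str]:
    """Check the model allow-list first, then the per-tool hard cap."""
    policy = POLICY_TIERS[tier_for(request.data_class)]
    if request.model not in policy["allowed_models"]:
        return False, "model not allowed for this data class"
    if spent_on_tool_this_month + request.estimated_cost_usd > hard_cap(loaded_cost_per_month):
        return False, "per-tool hard cap reached"
    return True, "ok"


print(authorize(Request("analyst", "chat", "model-b", "mnpi", 1.0), 0.0, 16_000))
print(authorize(Request("analyst", "chat", "zdr-model-a", "mnpi", 1.0), 0.0, 16_000))
print(authorize(Request("engineer", "coding", "model-b", "rnd", 50.0), 1_590.0, 16_000))
```

Defaulting unknown data classes to the strict tier is deliberate: a mislabeled request should fail closed rather than reach a model without zero data retention.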

A worked standard

Here is one concrete way to express the standard, sized for a lean firm. Treat the numbers as a starting frame to calibrate against your own baseline, and anchor the caps to your real loaded-cost figures once phase one gives you data.

Role	Soft alert / mo	Hard cap / mo	Instrument	Policy tier
Investment professional	$300	$800	Seats + gated API	Strict (deal, LP, MNPI)
Partner / GP	$600	$1,500	Seats + gated API	Strict
Engineering / R&D	$750	$1,500 per tool	Usage behind gateway	R&D
Data / automation	per job	workflow-level	API behind gateway	Mixed by data class
Ops / marketing	$100	$300	Seats	Default

The engineering hard cap mirrors Uber's $1,500 per tool. The policy tier, not the dollar figure, is the load-bearing control for an investment firm.
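
For teams that want the standard in machine-readable form, the table translates directly into configuration. A minimal sketch, with role keys and the `spend_status` helper as illustrative names; the data and automation row is left out because it is budgeted per job rather than per person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleBudget:
    soft_alert_usd: float   # month-to-date spend that triggers an alert
    hard_cap_usd: float     # month-to-date spend at which usage is blocked
    instrument: str
    policy_tier: str


# Values copied from the table above. The engineering hard cap applies per tool.
STANDARD = {
    "investment_professional": RoleBudget(300, 800, "seats + gated API", "strict"),
    "partner": RoleBudget(600, 1_500, "seats + gated API", "strict"),
    "engineering": RoleBudget(750, 1_500, "usage behind gateway", "rnd"),
    "ops_marketing": RoleBudget(100, 300, "seats", "default"),
}


def spend_status(role: str, month_to_date_usd: float) -> str:
    """Return 'ok', 'alert', or 'blocked' for a role's month-to-date spend."""
    budget = STANDARD[role]
    if month_to_date_usd >= budget.hard_cap_usd:
        return "blocked"
    if month_to_date_usd >= budget.soft_alert_usd:
        return "alert"
    return "ok"


print(spend_status("investment_professional", 120))
print(spend_status("investment_professional", 450))
print(spend_status("ops_marketing", 300))
```

Keeping the numbers in one place makes it straightforward to recalibrate them once the phase-one baseline is in.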

Bottom line

The instinct to set a per-employee token budget on day one is the wrong first move. Baseline first, govern the plumbing, then cap.

For an investment firm the arithmetic is freeing. Even top-decile AI spend is a low-single-digit percentage of what your people cost, and the productivity evidence clears the ROI bar with room to spare.

The question that deserves partner-level attention is the governance one: who can use which model, on which data, with what cap and what audit trail.

Set the standard there.

Stay driven,
Andre

PS: Check out Vessel to automate your fund operations

PPS: Join our free virtual roundtable “The Compounding Data Layer: Building an Edge When Everyone Uses the Same AI Model”