👋 Hi, I’m Andre and welcome to my newsletter Data Driven VC which is all about becoming a better investor with data and AI.

Brought to you by Kruncher - The AI-First Private Capital CRM

Understand every company before the market does.

Most firms are sitting on more data than they can act on. Kruncher turns external market signals, portfolio activity, and internal firm knowledge into a unified intelligence layer tailored to your thesis. 

Track companies continuously. Rank them against your strategy. Build conviction faster.

One platform purpose-built for private market investors: AI analyst + live data layer + signal engine + agents and automations.

  • Automate 80% of your firm's workflows from screening to LP reporting

  • 450+ configurable signals and time-series intelligence on company evolution, with source-linked evidence

  • Query your fund knowledge directly inside Claude and ChatGPT via Kruncher’s secure MCP server

  • Fully customizable to your thesis, scoring model, KPIs, and workflows

  • 90-day adoption support from the Kruncher team to ensure the platform is fully embedded in your workflow


Welcome to another Data Driven VC “Insights” episode where we cover the most interesting research and reports about startups, VCs, LPs, AI & automation.

Have Mega-Funds Taken Over Seed?

Pavel Prata from Murph Capital used Harmonic data to analyse early-stage deal activity at 10 mega-funds ($10B+ AUM) over three eras: SaaS (2015 to 2019), ZIRP (2020 to 2022), and AI (2023 to 2026).

  • a16z and General Catalyst at 4x SaaS-Era Levels: a16z went from 16.6 early-stage deals/year in the SaaS era to 75.3/year in the AI era. General Catalyst went from 15.2 to 61.5/year. Both funds' AI-era activity exceeds their ZIRP peaks, confirming this is not a cheap-capital hangover.

  • No Fund Has Returned to Pre-ZIRP Behaviour: "Stabiliser" funds like Sequoia (19.6 to 50.6/year) and Lightspeed (11.6 to 32.1/year) operate at 2 to 3x their SaaS-era baselines. Even the most disciplined funds (Bessemer, Index, Lux) show durable upward shifts.

  • Collective Seed Volume Nearly Tripled: These 10 funds made ~140 to 150 early-stage deals/year in the SaaS era. In the AI era that figure is ~370 to 400/year.
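The multiples behind these bullets can be re-derived from the quoted deal counts. The sketch below is only a sanity check on the numbers above; the midpoint treatment of the collective ranges is an assumption made here, not something the original analysis states.

```python
# Early-stage deals per year, as quoted above: (SaaS era, AI era).
deals_per_year = {
    "a16z": (16.6, 75.3),
    "General Catalyst": (15.2, 61.5),
    "Sequoia": (19.6, 50.6),
    "Lightspeed": (11.6, 32.1),
}


def era_multiple(saas: float, ai: float) -> float:
    """AI-era annual deal count as a multiple of the SaaS-era baseline."""
    return ai / saas


multiples = {
    fund: round(era_multiple(saas, ai), 1)
    for fund, (saas, ai) in deals_per_year.items()
}

# Assumption: use the midpoints of the quoted ranges (~140-150 and ~370-400).
collective_multiple = round(((370 + 400) / 2) / ((140 + 150) / 2), 1)

for fund, multiple in multiples.items():
    print(f"{fund}: {multiple}x")
print(f"All 10 funds (range midpoints): {collective_multiple}x")
```

On these inputs a16z and General Catalyst land at 4x or more, Sequoia and Lightspeed between 2x and 3x, and the collective figure a little under 3x.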

✈️ KEY TAKEAWAYS

The early-stage market has permanently repriced around mega-fund participation. Emerging managers need a specific, defensible sourcing edge: geographies or sectors where these 10 funds lack density, or pricing discipline that wins deals they will not chase. Generalist seed funds without a structural moat are entering the hardest competitive environment on record.

IRR Is a Trap Before Year 4

Peter Walker, Head of Insights at Carta, published net IRR percentiles by vintage across 2,276 US venture funds ($10M to $1B+), measured as of March 31, 2026, showing that early fund performance is an unreliable predictor of eventual outcomes.

  • 2021 Vintage at -4.3% Median IRR: Funds that printed explosive early IRR through rapid markup cycles are now dragging under stale marks. The top-quartile threshold (75th percentile) sits at just 1.3% as of Q1 2026.

  • 2025 Vintage Spread of 81 Points: The youngest cohort shows a 90th percentile of 54.1% versus a 25th percentile of -26.9%. AI markups in a few months can print outsized IRR before any real value is proven.

  • Mature Vintages Show Compression: 2016 to 2019 cohorts show tight, stable spreads at maturity, confirming that IRR only becomes meaningful as a comparative metric at year 4 to 6.
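Why a young fund can print an outsized IRR is easiest to see with a toy calculation. The sketch below assumes a single contribution and a single paper valuation (real net IRR uses many dated cash flows and fees), and the $100 and 1.3x figures are hypothetical, chosen only for illustration. It also re-derives the 81-point spread from the percentiles quoted above.

```python
def annualized_irr(paid_in: float, nav: float, years: float) -> float:
    """IRR for one contribution at t=0 and one residual value at `years`.

    With only two cash flows, IRR reduces to a compound annual growth rate.
    """
    return (nav / paid_in) ** (1 / years) - 1


# Hypothetical fund: $100 paid in, marked up 30% on paper.
early = annualized_irr(100, 130, years=0.5)   # markup six months in
mature = annualized_irr(100, 130, years=5.0)  # same multiple at year five

# Spread of the 2025 vintage quoted above: 90th minus 25th percentile.
spread_2025 = round(54.1 - (-26.9), 1)

print(f"1.3x after 0.5 years: {early:.1%} IRR")
print(f"1.3x after 5 years:   {mature:.1%} IRR")
print(f"2025 vintage spread:  {spread_2025} points")
```

The same 1.3x multiple annualizes to a very large IRR at six months and a single-digit IRR at year five, which is the mechanical reason early-vintage percentiles are so widely dispersed.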

✈️ KEY TAKEAWAYS

Early IRR in AI-era vintages mirrors the 2021 setup: rapid markups inflating numbers before fundamentals catch up. For LPs evaluating newer funds, IRR before year 4 is a marketing figure. Portfolio construction quality and entry price discipline are the only early signals worth tracking.

Join our next meetup in Berlin to discuss the future of VC

Stacking Management Fees

Dan Gray at Odin wrote a structural critique of how venture capital's fee economics have decoupled from LP returns, showing that maximising AUM to collect management fees has become the dominant business model at scale.

  • $10M vs $250M in Fees, Same Company: A pre-ZIRP era company generated roughly $10M in total management fees across its lifecycle. The same company held private for 14 years through multiple rounds absorbing ~$1B in VC capital generates $250M in fees, with no improved exit outcome.

  • Top Firm Fee Income Up 10x Since 2005: The five largest VC fundraisers in 2005 generated ~$150M in subsequent fee income. For 2025 vintages that figure is ~$1.6B.

  • $3B+ Raised in Private = Negative Return Signal: Of the 10 companies that raised more than $3B in private markets, only Robinhood shows a slight positive return vs. the S&P 500. A portfolio built on the thesis that large private raises signal quality would be at -120.5% today, presumably measured relative to the S&P 500, since an absolute loss cannot exceed 100%.
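The fee arithmetic can be sketched with a flat annual fee on committed capital. The summary above does not state the fee terms, so the 2.5% rate and 10-year fee period below are assumptions, picked because they are one combination that reproduces the ~$250M figure on ~$1B of capital.

```python
def lifetime_fees(committed_capital: float, annual_fee: float, fee_years: int) -> float:
    """Total management fees on committed capital over the fee period."""
    return committed_capital * annual_fee * fee_years


# Assumed terms (not from the source): 2.5% per year for 10 years.
ANNUAL_FEE = 0.025
FEE_YEARS = 10

# ~$1B of VC capital absorbed by one company held private for a long time.
long_private_fees = lifetime_fees(1_000_000_000, ANNUAL_FEE, FEE_YEARS)

# Capital that would generate the ~$10M figure under the same assumed terms.
implied_pre_zirp_capital = 10_000_000 / (ANNUAL_FEE * FEE_YEARS)

# Growth in top-five fee income quoted above: ~$150M (2005) to ~$1.6B (2025).
fee_income_multiple = round(1_600 / 150, 1)

print(f"Fees on $1B: ${long_private_fees / 1e6:.0f}M")
print(f"Capital implied by $10M of fees: ${implied_pre_zirp_capital / 1e6:.0f}M")
print(f"Top-five fee income multiple: {fee_income_multiple}x")
```

Under these assumed terms, fees scale linearly with capital raised, so a 25x increase in capital absorbed produces a 25x increase in fees regardless of the exit outcome.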

✈️ KEY TAKEAWAYS

At scale, fee income has become the primary business of large VC platforms. Fund size and brand recognition now correlate negatively with LP-aligned incentives, and the data on $3B+ private-raised companies makes that concrete. Smaller, carry-dependent managers with disciplined fund sizes hold a structural alignment advantage.

Emerging Managers Are Back

Peter Walker at Carta tracked new US venture fund formations ($10M to $100M) joining Carta by quarter, showing the first material recovery since the post-ZIRP contraction.

  • 78 New Funds in Q1 2026, Up from 58 in Q1 2025: Still well below the Q1 2022 peak of 147, but the first meaningful acceleration in two years.

  • 2023 to 2024 Were the Floor: Q1 2023 (59 funds) and Q1 2024 (51 funds) mark the trough. The correction lasted roughly eight quarters.

  • Formation Recovery Meets Peak Mega-Fund Activity: More new managers are entering the market precisely as the 10 largest funds post their highest seed deal volumes on record.
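For readers who want the recovery in percentage terms, the short sketch below computes year-over-year changes from the Q1 counts quoted above. Only the quoted figures are used; the helper name is illustrative.

```python
# New US venture funds ($10M to $100M) joining Carta in Q1, as quoted above.
q1_new_funds = {2022: 147, 2023: 59, 2024: 51, 2025: 58, 2026: 78}


def pct_change(old: int, new: int) -> float:
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# Year-over-year change for every year that has a prior-year count.
yoy = {
    year: round(pct_change(q1_new_funds[year - 1], count), 1)
    for year, count in q1_new_funds.items()
    if year - 1 in q1_new_funds
}

# Q1 2026 formations as a share of the Q1 2022 peak.
share_of_peak = round(q1_new_funds[2026] / q1_new_funds[2022] * 100, 1)

for year, change in yoy.items():
    print(f"Q1 {year}: {change:+.1f}% vs prior year")
print(f"Q1 2026 is {share_of_peak}% of the Q1 2022 peak")
```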

✈️ KEY TAKEAWAYS

The rebound in emerging manager formation is happening at exactly the wrong structural moment. More dry powder entering the emerging manager tier while mega-fund seed density hits a historical peak means more competition for the same early deals. LPs evaluating new managers should weight differentiation more heavily than at any prior point in the cycle.

Join 1615+ investors in our free Slack group as we automate our VC job end-to-end with AI. Live experiment. Full transparency.

Skill Distillation: The Cheapest Way to Run AI at Scale

Tomasz Tunguz at Theory Ventures documented a personal agent architecture called Pi, where frontier models author procedural skill files that smaller local models execute, a technique he calls skill distillation.

  • How It Works: A three-layer system uses a local markdown knowledge base (~80 workflow files), atomic SKILL.md playbooks written by frontier models (Opus 4.7, GPT-5.1, Gemini 3 Pro), and an agent loop that runs those skills using cheaper local models (Qwen 35B, Gemma 26B).

  • Different from Classical Distillation: Standard model distillation compresses a large model's probability outputs into a smaller model's weights. Skill distillation externalises procedures into inspectable markdown files the smaller model reads and follows.

  • The Model Becomes Interchangeable: Because the skill library is decoupled from the executing model, the model can be swapped for whatever is cheapest each quarter with no retraining. The institutional knowledge lives in the files, not the weights.
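The decoupling described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Tunguz's actual implementation: the directory layout, the function names, and the `stub_local_model` placeholder are all hypothetical, and a real setup would replace the stub with a call to a local model runtime.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable

# Any function mapping a prompt to a completion can act as the executing model.
Model = Callable[[str], str]


def load_skill(skills_dir: Path, name: str) -> str:
    """Read the procedure that a frontier model authored for this task."""
    return (skills_dir / name / "SKILL.md").read_text(encoding="utf-8")


def run_skill(model: Model, skills_dir: Path, name: str, task: str) -> str:
    """Hand the executing model the skill text plus the concrete task."""
    skill = load_skill(skills_dir, name)
    prompt = f"Follow this procedure exactly:\n\n{skill}\n\nTask: {task}"
    return model(prompt)


def stub_local_model(prompt: str) -> str:
    """Hypothetical stand-in for a local model; only reports what it received."""
    return f"[local model received {len(prompt)} characters]"


with TemporaryDirectory() as tmp:
    skills_dir = Path(tmp)
    skill_path = skills_dir / "summarize-deck"
    skill_path.mkdir()
    # In the described workflow a frontier model would write this file once.
    (skill_path / "SKILL.md").write_text(
        "# Summarize a pitch deck\n1. List the team.\n2. List traction metrics.\n",
        encoding="utf-8",
    )
    result = run_skill(stub_local_model, skills_dir, "summarize-deck", "deck.pdf")

print(result)
```

Because `run_skill` accepts any callable, swapping the executing model means passing a different function; the skill files stay untouched.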

✈️ KEY TAKEAWAYS

For VC firms building internal AI tooling, skill distillation is a practical architecture: encode proprietary processes once using a frontier model, then run them cheaply on local models indefinitely. The competitive moat is the quality of the captured procedures, not which model you use.

The Complete Hermes Agent Setup Guide

CyrilXBT published a masterclass on building a fully autonomous Hermes Agent operation, covering everything from installation to multi-agent deployment in a single guide.

  • Four Properties That Separate Hermes from Other Frameworks: Persistent memory across sessions via SQLite, a reusable skill system built on plain Markdown files, a configurable scheduler that runs workflows without manual triggers, and MCP server integration that connects the agent to real tools including file systems, web search, and external APIs.

  • The CLAUDE.md Is the Core Leverage Point: Every skill reads a single configuration file before executing. A precisely written CLAUDE.md shapes every output across every automated workflow. Vague configuration produces generic outputs regardless of skill quality.

  • The 90-Day Compounding Curve: A Hermes agent at day 90 has processed hundreds of sources, tracked dozens of decisions, and built a detailed picture of what works in a specific operation. That accumulated memory is not replicable by starting later. Day 1 is the only day the compounding starts.
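To make the memory and configuration ideas concrete, here is a minimal sketch using Python's standard `sqlite3` module. It does not reproduce Hermes's real schema or API: the table layout, function names, and example notes are assumptions made purely for illustration. It shows notes surviving between two sessions and a single configuration text being placed ahead of every skill prompt.

```python
import sqlite3
from pathlib import Path
from tempfile import TemporaryDirectory


def open_memory(db_path: Path) -> sqlite3.Connection:
    """Open (or create) the store that persists notes across sessions."""
    conn = sqlite3.connect(str(db_path))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memory ("
        "id INTEGER PRIMARY KEY, topic TEXT NOT NULL, note TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def remember(conn: sqlite3.Connection, topic: str, note: str) -> None:
    """Append one note; the `with` block commits on success."""
    with conn:
        conn.execute("INSERT INTO memory (topic, note) VALUES (?, ?)", (topic, note))


def recall(conn: sqlite3.Connection, topic: str) -> list[str]:
    """Return all notes for a topic in insertion order."""
    rows = conn.execute(
        "SELECT note FROM memory WHERE topic = ? ORDER BY id", (topic,)
    ).fetchall()
    return [note for (note,) in rows]


def build_prompt(config_text: str, skill_text: str, notes: list[str]) -> str:
    """Put the shared configuration first, then the skill, then recalled memory."""
    memory_block = "\n".join(f"- {n}" for n in notes) or "- (no notes yet)"
    return f"{config_text}\n\n## Skill\n{skill_text}\n\n## Memory\n{memory_block}"


with TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "memory.db"

    # Session one: the agent records something it learned, then exits.
    session_one = open_memory(db_path)
    remember(session_one, "sourcing", "Founder replied fastest to short emails")
    session_one.close()

    # Session two: a fresh connection still sees the earlier note.
    session_two = open_memory(db_path)
    notes = recall(session_two, "sourcing")
    session_two.close()

prompt = build_prompt(
    "Tone: concise. Audience: investment team.",  # hypothetical config content
    "Draft a morning intelligence brief.",        # hypothetical skill content
    notes,
)
print(prompt)
```

The point of the design is that the notes table only grows with use, which is why an agent started later cannot shortcut to the same accumulated context.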

✈️ KEY TAKEAWAYS

For data-driven investors building internal research and monitoring operations, Hermes offers a practical open-source infrastructure layer. The skill system maps directly to repeatable investment workflows: morning intelligence briefs, source monitoring, deal pipeline tracking, and content generation. The memory layer is the compounding asset.


That’s it for today!

Stay driven,
Andre

PS: In Berlin for SuperReturn? Join our DDVC Breakfast on 9th June.
