
👋 Hi, I’m Andre, and welcome to my newsletter Data Driven VC, which is all about becoming a better investor with data and AI.

Last chance to join 350+ funds and participate in the 2026 DDVC Landscape survey to win 3x €100 Amazon vouchers + 3x €1500 The Lab Premium memberships

Brought to you by VESTBERRY - Portfolio Intelligence Platform for Data Driven VCs

Everyone's talking about agentic AI. But how are VCs actually using it? Most of us are still peeking over the fence, watching the early adopters do their thing. If that sounds like you, this one is worth the hour. On May 19th, Marek sits down with two practitioners to discuss:

  • what agentic AI really means for VCs

  • the infrastructure you need underneath it

  • concrete examples of building an AI agent in practice

Save a spot.


Welcome to another Data Driven VC “Insights” episode where we cover the most interesting research and reports about startups, GPs, LPs, AI & automation.

LPs Fight for Foundational AI Co-Investments as Capital Outpaces Demand

PitchBook’s Kaidi Gao, in “LPs fight tooth and nail for foundational AI co-investment share,” shows that AI valuation expansion has turned co-investment access into a do-or-die LP competition. However, capital supply now outpaces founder demand, suggesting late entrants are buying the top.

  • AI Series D+ Pre-Money Hit $4.7B Median in Q1 2026: Median pre-money valuation for US AI/ML startups raising Series D or later was $4.7B in Q1 2026, ~4x non-AI peers and a 447.8% jump from 2024. Concentrated in foundational players including xAI ($20B Series E, Jan), Anthropic ($30B Series G at $380B, Feb) and OpenAI ($122B round at $852B post-money, Mar).

  • Round Cadence Compressed 19%: Median time between AI rounds fell to 1.3 years in Q1 from 1.6 years during 2022-2024. Non-AI startups still take 1.9 years, squeezing diligence windows and rewarding LPs with dedicated co-invest staff over under-resourced peers.

  • Capital Supply Exceeds Demand Since Q2 2025: In Q1, investors had $1 to deploy for every $0.90 that AI venture-growth startups sought to raise; non-AI peers saw just $1 of supply for every $1.70 sought. A Coller Capital survey of 108 LPs found that 44% now rate co-investments as more important, while roughly 20% report no access to attractive deals.

✈️ KEY TAKEAWAYS

The AI co-investment market is inverted. LP capital supply now exceeds founder demand at the foundation-model layer, concentrating winners among allocators with deep GP relationships. Over the decade ending in 2024, UC Investments cut its manager list from 280 to 28. Smaller LPs entering today are statistically late and structurally disadvantaged on diligence speed. Gao’s conclusion is that the better risk-adjusted exposure sits in “picks and shovels” infrastructure rather than the frontier labs themselves.

Round Dilutions Are Declining

Carta's Head of Insights Peter Walker argues that primary-round dilution is falling for software startups, driven by a tougher fundraising environment that rewards the few startups clearing the bar.

  • Falling dilution despite bigger rounds: Median round sizes are rising over time, but dilution is dropping. Valuations in competitive deals are climbing faster than the cash invested. The data covers primary rounds only; bridge rounds, extensions, and convertible notes between rounds are explicitly excluded.

  • Selectivity is the engine: Walker's read is that despite X being full of "$40M raised here, $100M raised there" headlines, the bar to raise has materially risen and most VCs are more selective. The startups that do close rounds carry unusual leverage, pushing dilution below historical norms.

  • Bar-belling at Series A: Founders are increasingly raising either much more or much less than the current Series A median of roughly $12M, according to Walker. He also frames falling dilution as a natural consequence of staying private longer. Series D, E, F, and G rounds all require equity headroom, so the model adjusts upstream.
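The mechanics behind falling dilution are simple division. A minimal sketch (the figures below are illustrative round-the-number examples, not Carta's data):

```python
def dilution(raise_amount: float, pre_money: float) -> float:
    """Primary-round dilution: new investors' stake as a fraction of post-money."""
    return raise_amount / (pre_money + raise_amount)

# Illustrative numbers: if the round size doubles but the pre-money triples,
# dilution still falls, matching Walker's "valuations climb faster than cash" point.
before = dilution(12, 48)   # $12M on $48M pre -> 20.0% sold
after = dilution(24, 144)   # $24M on $144M pre -> ~14.3% sold
```

The takeaway: dilution is a ratio, so a hot valuation does more for the cap table than a smaller cheque.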

✈️ KEY TAKEAWAYS

The falling-dilution story is largely an AI software story and reflects how concentrated capital is becoming around a smaller set of competitive deals. For VCs, standard ownership targets of 20-25% at seed are increasingly unachievable in the deals they most want. Firms either recalibrate expectations or lose the round. Walker’s closing line is the one founders should internalise: “don’t let the perfect be the enemy of the good. Dilution is important, but staying alive is more urgent.”

Join 1590+ investors in our free Slack group as we automate our VC job end-to-end with AI. Live experiment. Full transparency.

Should You Increase Focus on Your Inbound Deal Flow?

Dan Gray at Odin makes the case that cold inbound dealflow is structurally underpriced and pairs the argument with a six-question intake framework designed to strip out the biases that warm intros introduce.

  • Cold deals outperform on size, despite being de-prioritised: A 2020 HBS survey found cold inbound accounts for roughly 10% of VC deals, versus 20% from co-investor referrals and 30% from professional networks. VentuRank analysis found cold inbound makes up 64% of opportunities seen but only 6% of investments made. Yet the same study found successful cold-deal companies delivered 16.2% higher ROI and required roughly 18% less capital than warm-deal counterparts.

  • Warm intros embed two specific biases: First, there is a “Keynesian beauty contest” effect: referrers prioritise what they think will land well with the investor, rather than what is actually best, distorting the quality signal. Second, affinity bias degrades performance over time. Research by Du and Hellmann found that higher levels of past co-investment activity lead to both fewer new co-investments and lower exit performance, after controlling for endogeneity. One study cited in the piece found that "mirrored matching" (preferring founders with similar facial features) corresponds to a 7% drop in successful exits.

  • The atomic question and YC-style intake: Gray proposes a six-field intake form (problem, why it matters, contrarian belief, founder's relationship to the problem, traction, next milestone) with character limits, modelled on Y Combinator's single-slide batch format, which has been shown to reduce credentialism. The goal: reduce the inbox to a single binary question the partner can hold in their head: "Has this team managed to design a problem worth solving?"
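A minimal sketch of what such a structured intake could look like in code. The six field names come from Gray's list, but the character caps are assumptions for illustration; his piece does not specify exact limits:

```python
from dataclasses import dataclass, fields

# Assumed character caps per field -- illustrative, not Gray's spec.
LIMITS = {
    "problem": 280,
    "why_it_matters": 280,
    "contrarian_belief": 280,
    "founder_relationship": 280,
    "traction": 200,
    "next_milestone": 200,
}

@dataclass
class IntakeForm:
    """Six-field cold-inbound intake: one short answer per field."""
    problem: str
    why_it_matters: str
    contrarian_belief: str
    founder_relationship: str
    traction: str
    next_milestone: str

    def validate(self) -> list[str]:
        """Return the names of any fields that exceed their character cap."""
        return [
            f.name for f in fields(self)
            if len(getattr(self, f.name)) > LIMITS[f.name]
        ]
```

Hard caps are the point: they force founders to compress, which is what strips out the credential padding a free-text pitch invites.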

✈️ KEY TAKEAWAYS

The question isn't whether to accept cold inbound. It's whether your intake process strips out enough bias to find signal in it. Most firms screen for what's easy to evaluate (school, prior company, intro source); few screen for what's predictive (problem design, founder-problem fit, velocity). Firms that build YC-style structured intake will systematically harvest the 16% ROI premium hiding in the 6% of investments most others ignore.

Half the Agent Workload Runs Locally Now

Tomasz Tunguz (Theory Ventures), in “Localmaxxing,” shares a five-week, 1,478-task self-experiment showing that roughly half of an investor’s daily AI workload can be handled on a laptop. His conclusion is that latency, not cost or privacy, is the primary reason to push inference local.

  • 50% of Tasks Run on a Local 35B Model: Across 1,478 tasks classified over five weeks, Email & Inbound (11.5%), Scheduling (17.2%), Summarisation (12.4%), and Admin (0.7%) accounted for 618 tasks, or 41.8%, all of which succeeded fully on a local model. Roughly half of Market Research (13.0%) and Engineering (9.9%) tasks also worked locally, bringing the total close to 50%.

  • 2.1x Faster Than Frontier Cloud: In an eight-task head-to-head test, Qwen 3.6 35B-A3B-4bit running on a MacBook Pro M5 averaged 2.8 seconds per task, versus 5.8 seconds for Claude Opus 4.5 via API. The prompts were identical, both models were warmed, and outputs were correct in both cases.

  • ~20% Reasoning Gap, 3-4 Month Lag: Opus 4.5 scores roughly 20% higher on reasoning benchmarks, and local models trail the frontier by around 3-4 months. That gap matters for complex synthesis, but it is largely irrelevant for the routine agent tasks that dominate volume.
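A hybrid stack like the one Tunguz describes boils down to a routing decision per task. A toy sketch, with category names taken from his classification; the routing rule itself (and the `complex_reasoning` flag) is an assumption, not his implementation:

```python
# Categories that succeeded fully on the local model in the experiment.
LOCAL_CATEGORIES = {"email", "scheduling", "summarisation", "admin"}
# Categories where roughly half the tasks worked locally.
HYBRID_CATEGORIES = {"market_research", "engineering"}

def route(category: str, complex_reasoning: bool = False) -> str:
    """Pick a backend per task: local 35B model for routine volume,
    frontier cloud model for complex synthesis or unknown categories."""
    if category in LOCAL_CATEGORIES:
        return "local-35b"
    if category in HYBRID_CATEGORIES and not complex_reasoning:
        return "local-35b"
    return "cloud-frontier"
```

Usage: `route("scheduling")` goes local; `route("market_research", complex_reasoning=True)` escalates to the cloud, which is where the ~20% reasoning gap actually matters.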

✈️ KEY TAKEAWAYS

Local inference’s killer feature is not cost or privacy. It is latency. If inference is 2x faster, users get 2x more iteration cycles per agent session. As open models close the frontier gap, hybrid stacks, with local models handling routine volume and cloud models reserved for complex reasoning, become the default. That dynamic weakens the token-growth narrative underpinning hyperscaler API revenue forecasts and strengthens the case for on-device tooling, consumer GPU silicon, and inference-orchestration startups.

Upgrade your subscription to access our premium content & join the Data Driven VC community

Can You Trust Your Agent?

A new arXiv paper from Princeton and UW researchers (Wu, Liu, Li, Tsvetkov & Griffiths, April 2026) provides the first rigorous evaluation of how 23 frontier and legacy LLMs navigate conflicts of interest when given sponsored-recommendation instructions.

  • 18 of 23 models recommend the expensive sponsored option over 50% of the time: When asked to choose between a non-sponsored flight and a sponsored alternative nearly twice as expensive, only five models held the line. Grok 4.1 Fast recommended the sponsored option 83% of the time, Qwen 3 Next 70%, GPT-5.1 50%. Claude 4.5 Opus (28%) and Gemini 3 Pro (37%) showed the strongest user-side defaults.

  • Models systematically discriminate by inferred socio-economic status: LLMs recommended sponsored, and therefore more expensive, options 64% of the time to high-SES users versus 49% to low-SES users on average. DeepSeek-R1 showed a 62-point gap, while Gemini 3 Pro showed a 57-point gap. Reasoning amplified the effect, increasing sponsored recommendation rates by 17.5% for privileged users while decreasing them by 9% for disadvantaged users.

  • Sponsorship concealment is widespread, and predatory recommendations are common: When surfacing a sponsored option, models concealed the sponsorship status at high rates: GPT-5.1 89-99%, Claude 4.5 Opus 95-100%, Llama-4 Maverick 74-96%. When asked for financial advice in a vulnerable scenario, all models except Claude 4.5 Opus recommended sponsored predatory payday loan services at ≥60% rates, with GPT-5 Mini and Qwen 3 Next hitting 100%. Claude 4.5 Opus recommended them 0-1% of the time.

✈️ KEY TAKEAWAYS

First empirical evidence that ad-injected LLMs systematically fail users in ways that may violate FTC deception rules, with behavior varying wildly across models. Ad-based AI monetisation businesses face real regulatory exposure, while providers that refuse predatory promotions (Claude 4.5 Opus stands alone) gain a genuine compliance moat. For diligence on any AI startup with sponsorship in its stack: ask which base model they use and whether they've tested for these behaviors.

Building a 100% Local AI Second Brain

Avi Chawla published a deep-dive on building a local AI Second Brain using knowledge-graph architecture rather than the standard RAG-over-summaries approach that fails at scale.

  • RAG accuracy drops from 90% to 50% scaling from 5K to 500K docs: The piece opens with a stark data point. A retrieval system that scores 90% accuracy on 5,000 enterprise documents can fall to roughly 50% on the same architecture at 500,000 documents. The problem is structural. Related documents, such as Slack threads, Confluence pages, Jira tickets, and emails tied to the same project, cluster densely in embedding space and push the correct document out of the top-k results.

  • Knowledge graphs over wiki summaries: Chawla argues that summaries lose ground truth. For example, a deadline agreed to in one email may later shift silently in another. The proposed fix is a knowledge graph of typed entities, where people, decisions, commitments, and deadlines exist as separate nodes with backlinks, rather than topic-level wiki pages. The open-source implementation, Rowboat, extracts each decision, commitment, and deadline into its own Markdown file while ingesting data from Gmail, Granola, and Fireflies.

  • Karpathy validation: Andrej Karpathy publicly endorsed the underlying approach earlier this year, building his own ~100-article, 400,000-word self-maintaining research wiki on Markdown+Obsidian with no manual editing. The pattern is gaining adoption among power users.

✈️ KEY TAKEAWAYS

The most important AI infrastructure data point of the week: most enterprise RAG deployments built in 2024-2025 will silently degrade as customer corpora grow, creating hidden churn risk inside seemingly healthy AI ARR. For Series A+ AI diligence, ask specifically how retrieval accuracy holds at 100K, 500K, and 1M document scales. Companies moving to typed knowledge-graph architectures will pull ahead of pure vector-DB stacks.


That’s it for today!

Stay driven,
Andre
