👋 Hi, I’m Andre and welcome to my newsletter Data Driven VC which is all about becoming a better investor with data and AI.

Join our upcoming in-person events:

DDVC x SuperReturn Breakfast 9th June in Berlin
DDVC x Bits & Pretzels Breakfast 30th Sept in Munich

Brought to you by Affinity - The AI-First Private Capital CRM

The fastest way to use AI on your deals is to put your data where the AI already lives.

Affinity's new hosted MCP server connects your live CRM directly to Claude, ChatGPT, Copilot, and Gemini. Your AI assistant can now read from and write back to your pipeline in natural language.

Prep for a partner meeting in seconds. Pull deal comps mid-conversation. Update deal stages, add tags, and set reminders without leaving your AI tool.

Put Affinity, the AI-first private capital CRM, at the center of every AI workflow your team runs.

Learn more

Anthropic Shipped Claude Opus 4.8 Today..

.. and it beats its predecessor across most major benchmarks while topping OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro in several key categories.

Anthropic frames it as "a modest but tangible improvement" over Opus 4.7, not a leap. For investors, the interesting part is not the headline score. It is which improvements actually change how a fund operates.

The cadence is the first signal. Opus 4.6 landed in February, Opus 4.7 on April 16, and Opus 4.8 today, roughly six weeks later.

The frontier is now repricing itself every six weeks, and any thesis built on a static capability assumption is already stale.

The Benchmarks, Briefly

The agentic coding jump is the cleanest number. Opus 4.8 leads on SWE-bench Pro with 69.2%, versus 64.3% for Opus 4.7, 58.6% for GPT-5.5, and 54.2% for Gemini 3.1 Pro.

On multidisciplinary reasoning it scores 49.8% without tools and 57.9% with tools, ahead of all three rivals. On agentic computer use it reaches 83.4%, and one browser-agent partner clocked it at 84% on a separate web-navigation test.

It is not a clean sweep. GPT-5.5 still wins on terminal and command-line workflows, and the two are roughly tied on graduate-level science.

The takeaway for diligence: there is no longer one model that wins everything, so "which model" is becoming a workflow-by-workflow decision, not a vendor decision.

Join our next meetup in Berlin to discuss the future of VC

Sign up

The Honesty Upgrade

This is the change you will not see on a benchmark chart but will feel inside an hour of real work.

Anthropic reports Opus 4.8 is roughly 4x less likely than Opus 4.7 to let a flaw in its own output pass unremarked, and more likely to flag when it is uncertain. Its alignment assessment hit new highs on prosocial traits, with misalignment rates closer to Anthropic's best-aligned internal model.

For an investor, that is the difference between a tool you babysit and a tool you delegate to. A model that confidently hands you a wrong comp or a fabricated citation is a liability in an IC memo.

One of Anthropic's launch testimonials came from an investment associate, who singled out the model proactively surfacing problems in the inputs and outputs of an analysis that other models left for the human to catch.

A model that tells you when it is unsure is worth more in diligence than one that is occasionally brilliant and occasionally wrong with the same confident tone.

Cheaper, Faster, Same Price

Standard pricing is unchanged at $5 per million input tokens and $25 per million output tokens, identical to Opus 4.7.

The economics moved underneath.

Fast mode now runs at 2.5x the speed and is 3x cheaper than before, at $10 / $50 per million tokens, which finally puts a frontier model inside reach of latency-sensitive, high-volume pipelines.

Two partner data points make this concrete for data-driven workflows. One enterprise platform reported reasoning over PDFs and diagrams at 61% cheaper token cost than Opus 4.7, and a financial-document orchestrator cited better citation precision on dense filings.

There is also a new effort control on Claude AI and Cowork, letting you dial thinking from low to max. Run low effort on routine triage and max on the hard memo, and you cut spend without touching quality on what matters.

Same model quality at the same headline price, with the per-task cost of running it on your data falling sharply, is the upgrade that compounds across a fund's monthly bill.

Dynamic Workflows: Portfolio-Scale Work in One Instruction

The feature with the highest ceiling is Dynamic Workflows in Claude Code, now in research preview on Enterprise, Team, and Max plans.

For large tasks, Claude plans the work, runs hundreds of parallel subagents in a single session, then verifies its own output before reporting back. Anthropic's headline example is a codebase migration across hundreds of thousands of lines, from kickoff to merge.

Translate that off the engineering org and into the investment team. A market map across several hundred companies, a portfolio-wide data refresh, or a thematic teardown of an entire vertical starts to look like one instruction rather than a one-week analyst sprint.

The unit of delegation just jumped from a single task to an entire workstream, and that is where the operating leverage for a lean fund actually lives.

What It Actually Means

Zoom out, because the release lands inside a bigger picture investors should price in.

Opus 4.8 still sits below Anthropic's most capable internal model, Mythos, which is restricted to a handful of organizations for cybersecurity work today. Anthropic says it expects to bring Mythos-class models to all customers in the coming weeks.

The release also arrives as Anthropic and OpenAI race toward public-market debuts, with Anthropic having raised a $65B round at a $965B post-money valuation. The capability curve and the capital curve are steepening together.

For a VC portfolio, the implication is the one we keep returning to: Each release commoditizes another layer of the stack, so the durable value is not the model and not the thin wrapper around it, but the proprietary context and workflow that a product accumulates and that a new model release cannot replicate.

Buy the capability that is now available to everyone, and build the edge that a frontier release cannot hand your competitors for free.

Join 1611+ investors in our free Slack group as we automate our VC job end-to-end with Claude, OpenClaw, n8n & more.

Join the experiment

How VC Investors Should Actually Use Opus 4.8

The use cases where Opus 4.8 is better than any other model available right now and how to configure it to get started today 👇

The Four Use Cases Where It Wins Today

1) Diligence memos you can actually trust. The honesty upgrade matters most here, because the model now flags weak evidence instead of papering over it, which is exactly the failure mode that makes an IC memo dangerous.

Feed it a data room, a founder deck, and your call notes, and ask for the bear case first. The 4x lower rate of letting its own flaws slide is the difference between a draft you edit and a draft you have to fact-check line by line.

2) Financial analysis over messy filings. Opus 4.8 reasons directly over PDFs, cap tables, and diagrams, and partners reported it doing so at 61% cheaper token cost than Opus 4.7 with better citation precision.

This is the workflow to point it at for cohort math, burn analysis, and comp pulls where a fabricated number is worse than no number.

3) Market maps at portfolio scale. Dynamic Workflows lets Claude run hundreds of parallel subagents in one session, so a full-vertical teardown across several hundred companies becomes one instruction rather than an analyst week.

This is Claude Code on Enterprise, Team, or Max plans, and it is the single biggest leverage unlock for a lean fund this release.

4) Long-context synthesis. With the 1M-token context window, you can drop an entire founder corpus, a sector's worth of memos, or a quarter of portfolio updates into one prompt and ask for the throughline.

The pattern across all four: Opus 4.8 wins where the cost of a confident wrong answer is highest, which is most of investing.

How to Configure It

Pick your effort level deliberately. Opus 4.8 defaults to high effort, but you can now dial it on Claude AI and Cowork from low to max.

Run low on routine triage like inbox sorting and first-pass screening, and max (or "extra" / xhigh in Claude Code) on the hard memo or the contentious comp. That single habit cuts spend without touching quality on the work that matters.

Turn on fast mode for pipelines. Use /fast in Claude Code for high-volume, latency-sensitive jobs like enriching a deal list, now 3x cheaper at 2.5x the speed.

Use the API for anything recurring. The model string is claude-opus-4-8 at the unchanged $5 / $25 per million tokens, and the Messages API now lets you update instructions mid-run without breaking the prompt cache, which matters for long agentic jobs.

The Bottom Line

Opus 4.8 is a steady step, not a revolution. Better judgment, sharper honesty, cheaper inference, same price.

For investors, the signal matters more than the model. The frontier is repricing every few weeks, the gap between vendors is narrowing into workflow-specific tradeoffs, and the question is no longer whether to adopt but how fast your edge can stay ahead of the next release.

Stay driven,
Andre

PS: Don’t forget to sign up for our next DDVC get-together on June 9th

Anthropic Released Claude Opus 4.8 Today