Data-Driven VC

Data-driven VC #10: Processes and frameworks for augmented VCs

Where venture capital and data intersect. Every week.

Andre Retterath
Nov 17, 2022

👋 Hi, I’m Andre and welcome to my weekly newsletter, Data-driven VC. Every Thursday I cover hands-on insights into data-driven innovation in venture capital and connect the dots between the latest research, reviews of novel tools and datasets, deep dives into various VC tech stacks, interviews with experts and the implications for all stakeholders. Follow along to understand how data-driven approaches change the game, why it matters, and what it means for you.

Current subscribers: 2,682, +143 since last week


Let machines do what machines do best and humans do what humans do best: collecting and processing data, and interacting with other humans, respectively. What does that mean in practice? I believe in a future VC setup where web crawlers/scrapers collect identification and enrichment data at the top of the funnel to achieve comprehensive startup coverage and holistic, well-balanced information about the respective companies, while humans eventually sit together and decide to partner up at the bottom of the funnel. But what happens in between?

VC use cases: Exploration versus Research

Assuming comprehensive company and data coverage at the top, the next question is how to leverage this information most effectively to narrow down the funnel. In line with the previous identification and enrichment logic, there exist two major use cases that need to be considered for data presentation:

  • Exploration. Identification of new, promising investment opportunities. Investors need effective and inclusive filters to cut through the noise. I covered this use case in detail in the previous episodes “How to automate startup screening” and “Patterns of successful startups”.

  • Research. Tracking previously identified opportunities over time, or searching for specific companies/markets/trends to conduct a deep dive. Investors need a well-organized overview of all existing data as well as the ability to run analyses on the spot. This has been solved by CRM providers and mainly requires some great UI/UX work, so I won’t go into detail here.

Ultimately, we need to unite both use cases in one interface to create a single source of truth and remove the need for context switching. Exploration is a screening challenge; Research is a UI/UX challenge. To solve the screening challenge, I strongly believe in a hybrid setup that combines static, deterministic filters (what the investors look at) with dynamic, ML-based filters (what the data tells us), as covered in my previous episode.

Benefits of a hybrid screening approach

Static, deterministic screening:

  • Incorporate your own “style” to differentiate. Assuming every VC uses exactly the same data to identify patterns of successful companies, everyone would end up with the same set of screening filters. In order to differentiate, VCs can easily balance filters with their specific preferences in terms of geography, ticket size, industry, technology or even founder profiles.

  • Be more inclusive. Purely data-driven approaches project the past into the future because their filters are based on historical data. Considering that minorities received little funding in the past, a purely data-driven approach would perpetuate this. To tackle the problem, VCs can empower minorities by prioritizing them accordingly in their filters.

  • Adapt to changing requirements. In line with the previous bullet, purely data-driven approaches cannot adapt to changing market dynamics on their own. For example, ML-based approaches might tell us what a successful B2C FinTech Neobank looks like, but they would struggle to identify patterns of a successful Core Fusion company as the data does not yet exist. Investors can rebalance by including their own perspectives about the future in their filters.

Dynamic, ML-based screening:

  • Remove human bias. Human investors have limited experience and have seen only an incomplete sample of successful companies. As a result, they over-index on the patterns they know and miss success patterns that are still unknown to them. Moreover, traditional screening is affected by similarity bias, recency bias and others. Incorporating patterns from comprehensive datasets reduces these biases significantly compared to purely human processes.

  • Scalability. ML-based approaches can process and screen hundreds of thousands, even millions, of startups in a short time, whereas humans would take forever to do so.

By combining the best of both worlds, we obtain a single “likelihood of success score” that helps investors cut through the noise. We then have two options. Either we present all companies and leave it to the investor to sort them by score and manually define a cut-off in line with the overarching filters and the resources available at the time, or we define a fixed score cut-off and only present the companies that score above it. So far, so good. But how can we integrate these novel sourcing and screening approaches into existing processes to close the remaining gap between machines at the top of the funnel and human investors at the bottom of the funnel?
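To make the hybrid score and the two presentation options more tangible, here is a minimal sketch. Everything in it is an assumption for illustration: the `Startup` fields, the preferred geographies and industries, the 50/50 weighting and the idea that the ML model outputs a score between 0 and 1. It is not a description of any particular fund's stack.

```python
from dataclasses import dataclass


@dataclass
class Startup:
    name: str
    geography: str
    industry: str
    ml_score: float  # assumed output of an ML model, between 0 and 1


# Hypothetical static, deterministic filters reflecting an investor's "style".
PREFERRED_GEOGRAPHIES = {"DACH", "Nordics"}
PREFERRED_INDUSTRIES = {"FinTech", "DeepTech"}


def static_score(startup: Startup) -> float:
    """Share of the static filters that the startup matches (0 to 1)."""
    checks = [
        startup.geography in PREFERRED_GEOGRAPHIES,
        startup.industry in PREFERRED_INDUSTRIES,
    ]
    return sum(checks) / len(checks)


def hybrid_score(startup: Startup, static_weight: float = 0.5) -> float:
    """Weighted blend of the static and the ML-based score.

    The default weight of 0.5 is an arbitrary assumption.
    """
    return static_weight * static_score(startup) + (1 - static_weight) * startup.ml_score


def rank(startups: list[Startup]) -> list[Startup]:
    """Option 1: present all companies, sorted by score (highest first)."""
    return sorted(startups, key=hybrid_score, reverse=True)


def shortlist(startups: list[Startup], cutoff: float) -> list[Startup]:
    """Option 2: present only companies at or above a fixed score cut-off."""
    return [s for s in rank(startups) if hybrid_score(s) >= cutoff]


if __name__ == "__main__":
    sample = [
        Startup("A", "DACH", "FinTech", 0.4),
        Startup("B", "UK", "Logistics", 0.9),
        Startup("C", "Nordics", "Mobility", 0.2),
    ]
    for s in rank(sample):
        print(s.name, round(hybrid_score(s), 2))
```

Shifting `static_weight` towards 1 lets the investor's own preferences dominate, while shifting it towards 0 lets the data dominate. In this sketch, that single parameter is where the balance between the two approaches is set.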

Traditional VC processes and frameworks

First off, let’s understand the status quo of the VC funnel. The processes and frameworks VCs use to split responsibilities among their team members are highly diverse and depend on a variety of factors. Clearly, there are interdependencies, as the focus of the fund is the sum of the focus of all individual team members. A bigger fund generates more management fees, which in turn allows it to grow a bigger team, which then allows every individual to specialise more. I see 5 major dimensions of focus.

  1. Industry: DeepTech, IndustrialTech, FinTech, Logistics, Mobility etc.

  2. Technology: AI, Blockchain, IoT, Robotics, VR/AR, web3 etc.

  3. Business model: SaaS, marketplace, open-source, transactional etc.

  4. Value chain: Sourcing, screening, deal assessment, deal winning, closing, portfolio work etc.

  5. Geography: DACH, France, UK, Nordics, Southern Europe etc.
