👋 Hi, I’m Andre and welcome to my weekly newsletter, Data-driven VC. Every Thursday I cover hands-on insights into data-driven innovation in venture capital and connect the dots between the latest research, reviews of novel tools and datasets, deep dives into various VC tech stacks, interviews with experts and the implications for all stakeholders. Follow along to understand how data-driven approaches change the game, why it matters, and what it means for you.
Current subscribers: 6,360+, +250 since last week
Following your great feedback on last week’s guest post covering Hustle Fund’s data-driven journey, I’m happy to take a completely different angle and have Dries Faems contribute today’s episode. Dries is a Professor of Entrepreneurship, Innovation and Technological Transformation at the WHU Otto Beisheim School of Management, one of the leading entrepreneurial universities in Europe, which is lucky to count the founders of Zalando, Rocket Internet, Forto, Flixbus, HelloFresh and many more unicorns among its alumni.

I’m particularly excited about this episode as it perfectly exemplifies how data-driven approaches can be leveraged outside of VC, for example in academic research, M&A or corporate innovation scouting. Thank you, Dries, for sharing your innovative work with us and providing a blueprint in your guest post below 👇🏻
At the Chair of Entrepreneurship, Innovation and Technological Transformation of WHU, we have started building the WHU Founder Database, a data infrastructure that allows us to address exactly these kinds of research questions. In this guest contribution, I want to provide a blueprint that will allow any data enthusiast to build a similar data infrastructure for his or her own organization. I will describe the following steps:
(i) Step 1: Identifying founders
(ii) Step 2: Collecting company data
(iii) Step 3: Collecting investor data
(iv) Step 4: Merging founder, company and investor data
(v) Step 5: Developing use cases for your data infrastructure
Step 1: Identifying founders
A valuable data source for collecting founder data is LinkedIn. Searching LinkedIn Sales Navigator or LinkedIn Recruiter for the terms “Founder” and “Co-Founder” in the category “Job Title”, with your organization in the category “Company” or “School”, will give you a good overview of all the founders in your ecosystem.
Some people are quite proactive in claiming a founder role. As an organization, for instance, you might not be interested in people who have founded the local synchronized swimming club in their village (yes, this is a real example…). Another issue is that employees in corporates might claim “founder” roles for specific activities within the company (e.g., “I am the founder of the feminist book club at Google”). The search results therefore require careful cleaning to make sure that only relevant founders are identified.
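Part of this cleaning can be supported by simple rules once you have a list of profiles that you are permitted to use (for example, a manually compiled spreadsheet). The snippet below is a minimal sketch under stated assumptions, not the pipeline used for the WHU Founder Database: the `Profile` fields, the keyword list `NON_VENTURE_TERMS` and the corporate list `CORPORATES` are all hypothetical and would need to be adapted to your own ecosystem. It only flags profiles for manual review; it does not decide on its own who counts as a relevant founder.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Profile:
    # Hypothetical fields; adapt them to whatever your own list contains.
    name: str
    job_title: str
    organization: str


# Matches "Founder", "Co-Founder" and "Cofounder", case-insensitively.
FOUNDER_PATTERN = re.compile(r"\b(?:co-?)?founder\b", re.IGNORECASE)

# Assumed keywords hinting at clubs or societies rather than ventures.
NON_VENTURE_TERMS = ("club", "society", "choir", "association")

# Assumed list of corporates where "founder" usually means an internal initiative.
CORPORATES = {"google"}


def review_reason(profile: Profile) -> Optional[str]:
    """Return why a profile needs manual review, or None if it looks relevant."""
    if not FOUNDER_PATTERN.search(profile.job_title):
        return "no founder title"
    text = f"{profile.job_title} {profile.organization}".lower()
    for term in NON_VENTURE_TERMS:
        if term in text:
            return f"mentions '{term}'"
    if profile.organization.lower() in CORPORATES:
        return "founder role inside a corporate"
    return None


def split_profiles(
    profiles: List[Profile],
) -> Tuple[List[Profile], List[Tuple[Profile, str]]]:
    """Separate likely relevant founders from profiles flagged for review."""
    keep: List[Profile] = []
    review: List[Tuple[Profile, str]] = []
    for profile in profiles:
        reason = review_reason(profile)
        if reason is None:
            keep.append(profile)
        else:
            review.append((profile, reason))
    return keep, review
```

Calling `split_profiles` on your list returns the profiles that passed all rules and, separately, the flagged profiles together with the reason for the flag. Keeping the reason makes it easier to check the flagged cases by hand and to refine the keyword lists over time.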
Whereas LinkedIn is a valuable tool for identifying founders, it must not be used for unauthorized scraping of founder profiles. LinkedIn defines unauthorized scraping as “the use of code and automated collection methods to make (up to) thousands of queries per second and evade technical blocks in order to take data without permission.” Andre has provided more info on the do’s and don’ts of web scraping in this newsletter post.

