Data-Driven VC

First steps as a data-driven VC without coding skills: AI-powered Google Sheet to track LinkedIn profiles

DDVC #32: Where venture capital and data intersect. Every week.

Andre Retterath
Apr 20, 2023

👋 Hi, I’m Andre and welcome to my weekly newsletter, Data-driven VC. Every Thursday I cover hands-on insights into data-driven innovation in venture capital and connect the dots between the latest research, reviews of novel tools and datasets, deep dives into various VC tech stacks, interviews with experts and the implications for all stakeholders. Follow along to understand how data-driven approaches change the game, why it matters, and what it means for you.

Current subscribers: 7,747+, +162 since last week


Incentives to become more data-driven are obvious, yet many firms are stuck in a buy versus build trade-off and end up doing nothing.

“What are the first steps on the journey from productivity VC to data-driven VC?” Today, I’m incredibly excited to have Vlastimil Vodička, CEO and Founder of Leadspicker, share his step-by-step guide to start leveraging AI as a VC - without coding skills - in his guest post below.


If you're a fan of Andre's newsletter, chances are you're already intrigued by using LinkedIn for startup sourcing and taking advantage of GPT's powerful classification and categorization capabilities, even if you lack coding skills.

As part of a little no-code experiment, we've put together a comprehensive guide on how to connect OpenAI's GPT to your Google Spreadsheet. This guide will show you how to evaluate companies scraped from LinkedIn and determine if they're a startup or not, and how to categorize them into predefined categories directly in your spreadsheet using GPT.

By reading this blog post, you will learn:

  • How to Add OpenAI's GPT to Your Google Spreadsheet

  • Challenges We Faced When Scraping Data from LinkedIn

  • How to Classify and Categorize Startups with GPT via Your Google Spreadsheet

  • Outcomes of Our Little No-Code Experiment

Extracting Data from LinkedIn Sales Navigator

For this experiment, we extracted data from LinkedIn Sales Navigator, focusing on people in the Central and Eastern Europe (CEE) region who became founders within the last two years. To do this, we used Sales Navigator's advanced search filters to target specific job titles. We recommend a boolean query such as "founder" OR "co-founder" OR "CEO" OR "CTO" instead of LinkedIn's built-in filter options, as it can return more accurate and comprehensive results.
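If you maintain your list of target job titles in a sheet or a script, the boolean query can be assembled programmatically. The following is a minimal sketch; the helper name buildTitleQuery is hypothetical and not part of any LinkedIn tooling.

```javascript
// Hypothetical helper: joins job titles into a quoted boolean OR query
// that can be pasted into the Sales Navigator title search field.
function buildTitleQuery(titles) {
  return titles
    .map((title) => title.trim())
    .filter((title) => title.length > 0) // skip empty entries
    .map((title) => `"${title}"`)        // quote each title for exact matching
    .join(" OR ");
}

const query = buildTitleQuery(["founder", "co-founder", "CEO", "CTO"]);
console.log(query); // "founder" OR "co-founder" OR "CEO" OR "CTO"
```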

We chose to focus on the CEE region, but you can select any region according to your specific interests.

To extract data from Sales Navigator, we recommend using no-code tools such as PhantomBuster, Dux-Soup or Apify.

Challenges we faced: 

  • Batch size: Split your search region into smaller samples of at most 2,000 contacts per batch. LinkedIn doesn't display the exact number of people in its database who match your search criteria, so smaller batches help ensure that you retrieve all relevant data.

  • False positives: LinkedIn sometimes displays completely irrelevant profiles that don't match your search criteria, and it's unclear why this happens. These profiles make it difficult to categorize and analyze the extracted data accurately, so clean and filter the data carefully before using it for further analysis. This step is time-consuming, but it is crucial for the accuracy and reliability of your data.

  • Data cleaning: Deduplicate the data and remove obviously irrelevant profiles so that the final dataset is accurate.

  • Data enrichment: Depending on the tool you used for extraction, enrichment may be necessary. Make sure that you have scraped the information from LinkedIn company profiles, such as the company description, headquarters location, and number of employees. PhantomBuster and Dux-Soup can do the trick.
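To see which rows still need enrichment, you can check each exported row for the company fields mentioned above. This is only a sketch: the field names companyDescription, headquarters and employeeCount are assumptions and should be renamed to match the columns your extraction tool produces.

```javascript
// Assumed column names; adjust them to your own export.
const REQUIRED_FIELDS = ["companyDescription", "headquarters", "employeeCount"];

// Returns the names of required fields that are missing or blank in a row.
function findMissingFields(row) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = row[field];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// Splits rows into those that are complete and those that need enrichment.
function splitByCompleteness(rows) {
  const complete = [];
  const needsEnrichment = [];
  for (const row of rows) {
    if (findMissingFields(row).length === 0) {
      complete.push(row);
    } else {
      needsEnrichment.push(row);
    }
  }
  return { complete, needsEnrichment };
}
```

Rows that end up in needsEnrichment can then be sent back through your scraping tool for a company-profile pass.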

Keep LinkedIn's profile visit limits in mind to avoid getting your account blocked, as Andre already described in his article.

Outcome: We were able to export a total of 29,763 profiles. After some basic deduplication and data cleaning, we ended up with 21,311 unique firms in the dataset. 
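The post does not spell out how the deduplication was done. One simple approach, shown here as a sketch, is to keep the first row per company, keyed by a normalized company URL and falling back to the company name. The field names companyUrl and companyName are assumptions about your export.

```javascript
// Builds a comparison key: the lowercased company URL without trailing
// slashes, or the lowercased company name if no URL is present.
function normalizeCompanyKey(row) {
  const url = String(row.companyUrl || "").trim().toLowerCase().replace(/\/+$/, "");
  if (url) return url;
  return String(row.companyName || "").trim().toLowerCase().replace(/\s+/g, " ");
}

// Keeps the first row seen for each company key.
// Rows without any company identifier are dropped in this sketch.
function dedupeByCompany(rows) {
  const seen = new Set();
  const unique = [];
  for (const row of rows) {
    const key = normalizeCompanyKey(row);
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    unique.push(row);
  }
  return unique;
}
```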

Now, let's find out whether they really are new startups and which industries they fall into.

How to Add GPT-3.5 to Your Google Spreadsheet

  1. Open the Google Sheet where you want to add GPT-3.5, then go to Extensions -> Apps Script.

  2. Copy and paste the code below into your Google Apps Script project, replacing "YourAPIkey" with your own OpenAI API key.

/**
 * Generates text using OpenAI's gpt-3.5-turbo chat model.
 *
 * @param {string} prompt The prompt to send to the model
 * @param {string} cell The cell value to append to the prompt
 * @param {number} [maxWords] Approximate maximum number of words to generate (100 tokens if omitted)
 * @return {string} The generated text
 * @customfunction
 */
function runOpenAI(prompt, cell, maxWords) {
  const API_KEY = "YourAPIkey"; // replace with your own OpenAI API key
  const model = "gpt-3.5-turbo";
  const temperature = 0;

  // One token is roughly 0.75 words, so divide to convert words to tokens.
  // max_tokens must be an integer, hence Math.ceil.
  let maxTokens = 100;
  if (maxWords) {
    maxTokens = Math.ceil(maxWords / 0.75);
  }

  // Append the cell content to the prompt
  const fullPrompt = prompt + cell + ":";

  // Set up the request body with the given parameters
  const requestBody = {
    "model": model,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant that answers questions."},
      {"role": "user", "content": fullPrompt}
    ],
    "temperature": temperature,
    "max_tokens": maxTokens
  };
  console.log(requestBody);

  // Set up the request options with the required headers
  const requestOptions = {
    "method": "POST",
    "headers": {
      "Content-Type": "application/json",
      "Authorization": "Bearer " + API_KEY
    },
    "payload": JSON.stringify(requestBody)
  };

  // Send the request to the chat completions endpoint
  const response = UrlFetchApp.fetch("https://api.openai.com/v1/chat/completions", requestOptions);
  console.log(response.getContentText());

  // Parse the response body as a JSON object
  const responseBody = JSON.parse(response.getContentText());

  // Return the generated text from the first choice
  return responseBody.choices[0].message.content;
}
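When you ask GPT to sort companies into predefined categories, the answer can come back with extra punctuation or wrapped in a sentence. The sketch below maps a free-text answer back onto a fixed list. The helper normalizeCategory and the category names are hypothetical placeholders, not the list used in the experiment.

```javascript
// Hypothetical category list; replace it with your own predefined categories.
const CATEGORIES = ["Fintech", "Healthtech", "SaaS", "E-commerce", "Other"];

// Maps a free-text model answer to one of the predefined categories.
// Tries an exact match first, then a substring match, then the fallback.
function normalizeCategory(answer, categories = CATEGORIES, fallback = "Other") {
  const cleaned = String(answer).trim().toLowerCase().replace(/[."']/g, "");
  const exact = categories.find((c) => c.toLowerCase() === cleaned);
  if (exact) return exact;
  const contained = categories.find((c) => cleaned.includes(c.toLowerCase()));
  return contained || fallback;
}

console.log(normalizeCategory(" fintech. "));             // Fintech
console.log(normalizeCategory("This is a SaaS company")); // SaaS
console.log(normalizeCategory("Robotics"));               // Other
```

In the sheet itself, a formula along the lines of =runOpenAI("Is this company a startup? Answer yes or no. Company description: ", B2, 5) would call the custom function on the description in cell B2; the prompt wording and the cell reference are illustrative only.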
