Financial Statement Analysis with Large Language Models
Predicting Company Performance Better Than Human Analysts
👋 Hi, I’m Andre and welcome to my weekly newsletter, Data-driven VC. Every Tuesday, I publish “Insights” to digest the most relevant startup research & reports, and every Thursday, I publish “Essays” that cover hands-on insights about data-driven innovation & AI in VC. Follow along to understand how startup investing becomes more data-driven, why it matters, and what it means for you.
In 2020, I published the study “Human Versus Computer: Benchmarking Venture Capitalists and Machine Learning Algorithms for Investment Screening”. It was an early attempt to convince investors with hard facts to start augmenting their analysts with ML models.
While I dedicated my paper to the screening stage of the investment process, I’ve been on the lookout for studies focused on other parts of the value chain, such as due diligence or portfolio value creation.
Without success.
Until today.
Today, I’m really excited to share a paper that I came across last night, providing tangible benchmarks on how LLMs perform financial statement analysis compared to professional analysts. It’s written by Alex G. Kim, Maximilian Muhn, and Valeri V. Nikolaev from The University of Chicago, Booth School of Business.
Their findings are particularly interesting. On the one hand, LLMs are adept at processing vast amounts of textual data and extracting patterns, which suggests they could potentially replace financial analysts. On the other hand, LLMs typically lack the deep numerical reasoning and intuitive judgment that human analysts apply when evaluating financial health and sustainability based on complex financial statements.
TL;DR
In short, this paper examines whether an LLM, specifically GPT-4, can analyze financial statements to predict future company performance (earnings in this case) as effectively as professional human analysts. Though it’s focused on more mature companies, the study provides unique direction on how LLMs perform in financial due diligence more broadly.
Remarkably, the LLM demonstrated superior performance in predicting earnings changes compared to seasoned financial analysts, especially in scenarios that are typically challenging for humans, such as cases where analyst forecasts are likely to be biased or inefficient ex ante.
The study also reveals that the LLM's forecasting accuracy matches that of state-of-the-art, narrowly focused machine learning models. Notably, these predictions are not merely recollections from its training data; rather, GPT-4 provides insightful narrative analyses that shed light on potential future performance of companies.
Moreover, trading strategies developed from GPT-4’s forecasts achieved higher Sharpe ratios and alphas than those based on other models. These findings underscore the potential for LLMs to play a pivotal role in financial decision-making processes. Let’s dive into a bit more detail.
Compustat Data from 1968 to 2021
The authors used the entire universe of Compustat annual financial data from the 1968 to 2021 fiscal years. They also set aside data for 2022 to predict 2023 earnings to test the robustness of the model’s performance outside GPT’s training window. Specifically, the GPT-4-Turbo preview’s training window ends in April 2023, so the model cannot have seen the 2023 earnings data, which was released in late March 2024.
The Methodology
The researchers' methodology involves anonymizing and standardizing financial statements to prevent the LLM from recalling any specific company information. They then ask the LLM to predict whether a company's earnings will increase or decrease, focusing solely on the figures provided in the balance sheet and income statement.
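To make this setup more tangible, here is a minimal Python sketch of what such an anonymized prompt could look like. It is not the authors' code. The function names, the prompt wording, the sample line items, and the idea of relabeling fiscal years as t and t-1 are all assumptions made for illustration, and no actual model is called.

```python
from typing import Dict, Optional

# Hypothetical input shape: line item -> {fiscal year -> value}
Statement = Dict[str, Dict[int, float]]


def relabel_years(statement: Statement, latest_year: int) -> Dict[str, Dict[str, float]]:
    """Replace calendar years with relative labels (t, t-1, ...).

    Assumption: hiding dates is one way to keep the model from
    recognizing a specific company and fiscal year.
    """
    def label(year: int) -> str:
        offset = latest_year - year
        if offset == 0:
            return "t"
        return f"t-{offset}" if offset > 0 else f"t+{-offset}"

    return {
        item: {label(year): value for year, value in sorted(by_year.items())}
        for item, by_year in statement.items()
    }


def build_prompt(balance_sheet: Statement, income_statement: Statement,
                 latest_year: int) -> str:
    """Render both statements as plain text with no company name or dates."""
    lines = ["Financial statements of an unnamed company."]
    sections = (("Balance sheet", balance_sheet),
                ("Income statement", income_statement))
    for title, stmt in sections:
        lines.append(f"\n{title}:")
        for item, values in relabel_years(stmt, latest_year).items():
            cells = ", ".join(f"{period}: {value:,.1f}"
                              for period, value in values.items())
            lines.append(f"  {item} | {cells}")
    # Illustrative wording only; the paper's actual prompt is not shown here.
    lines.append("\nUsing only the figures above, will earnings in period t+1 "
                 "increase or decrease? Answer with one word.")
    return "\n".join(lines)


def parse_direction(answer: str) -> Optional[str]:
    """Map a free-text model answer to 'increase', 'decrease', or None."""
    text = answer.strip().lower()
    for direction in ("increase", "decrease"):
        if text.startswith(direction):
            return direction
    return None


if __name__ == "__main__":
    # Made-up numbers for a made-up company.
    bs = {"Total assets": {2020: 500.0, 2021: 550.0}}
    inc = {"Revenue": {2020: 100.0, 2021: 120.0}}
    print(build_prompt(bs, inc, latest_year=2021))
```

The resulting prompt contains only line items, relative periods, and numbers, so the model has to reason from the figures rather than from anything it may remember about a named company. The returned string would then be sent to the LLM of your choice, and parse_direction would turn its reply into a binary label that can be compared with the realized earnings change.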