We investigate how state-of-the-art large language models (LLMs) such as GPT-4 process and utilize information from corporate disclosures to predict future earnings. Analyzing 2,344 earnings press releases, we uncover key factors influencing GPT's information selection, including sentence position, clarity, numerical content, and sentiment. Comparing GPT's earnings forecasts with analyst consensus shows that GPT is less accurate overall, but its performance improves significantly for firms with better information environments and higher-quality disclosures. The most consistent determinants of GPT's forecast performance are a firm's information environment, disclosure specificity, and readability. While LLMs show promise in identifying essential information and providing initial analyses, they are not yet capable of replacing human analysts, particularly for firms with poor information environments or subpar disclosures. Our results offer guidance on disclosure practices in the era of LLMs and underscore their current limitations in financial forecasting.
<aside>
Coauthored with Edward Li and Zhiyuan Tu
</aside>
<aside>
Download: WP
</aside>
<aside>
Presented at: Hawaii Accounting Research Conference, Wolfe Research
</aside>