GitHub - dmamakas2000/ipo: This GitHub repository implements a novel approach for detecting Initial Public Offering (IPO) underpricing using pre-trained Transformers. The models, extended to handle large S-1 filings, leverage both textual information and financial indicators, outperforming traditional machine learning methods.

Abstract

Over the past decades, Initial Public Offerings (IPOs) have evolved into an indispensable tool for companies to raise capital. In an IPO, a private company offers its shares on the primary market, attracting professional investors or venture capitalists to purchase them. Afterward, the securities become available on the secondary market, where individuals can trade them easily. When U.S. firms go public, they typically follow an explicit procedure. Specifically, the U.S. Securities and Exchange Commission (SEC) requires the submission of the S-1 filing (also referred to as the IPO prospectus) on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system. This requirement ensures that investors have prior knowledge of the issuing company's valuation, potential risks, and future business plans. IPO underpricing has received considerable attention from economists and financial experts over the years. Underpricing refers to listing an IPO at a price lower than its value on the stock market after the first trading day. The opposite scenario indicates IPO overpricing. To investigate these phenomena, previous work applied Machine Learning (ML) techniques that are not state of the art, using features retrieved from S-1 filings to classify IPOs or specific financial indicators. However, traditional ML methods face processing limitations because prospectuses are large documents with a very high number of words, which makes them hard to process and analyze. To address this issue, in this study we explore the predictive power of IPOs using pre-trained Transformers. To detect underpricing, we use textual information retrieved from S-1 filings, along with economic knowledge drawn from specific financial indicators. We introduce a collection of models that extend the vanilla architectures to handle up to 20,480 tokens, making them a reliable option for this classification task. Finally, the findings indicate that our methods outperform the baselines in most experiments and pave the way for further investigation of this topic.
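
The definition of underpricing given above can be illustrated with a short sketch. It assumes that the comparison is between the offer price and the closing price on the first trading day, and that any positive first-day return counts as underpricing. The function names `first_day_return` and `label_ipo` are hypothetical and are not taken from this repository, whose exact labeling rule may differ.

```python
def first_day_return(offer_price: float, first_day_close: float) -> float:
    """Relative price change between the offer and the first-day close."""
    if offer_price <= 0:
        raise ValueError("offer_price must be positive")
    return (first_day_close - offer_price) / offer_price


def label_ipo(offer_price: float, first_day_close: float) -> str:
    """Label an IPO from its first-day return.

    Assumption: a zero threshold separates the classes. A real study
    might use a different threshold or a different reference price.
    """
    ret = first_day_return(offer_price, first_day_close)
    if ret > 0:
        return "underpriced"   # listed below its first-day market value
    if ret < 0:
        return "overpriced"    # listed above its first-day market value
    return "fairly priced"


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    print(label_ipo(20.0, 25.0))
    print(label_ipo(20.0, 18.0))
```

With these illustrative inputs, an IPO offered at 20.0 that closes its first day at 25.0 has a first-day return of 0.25 and is labeled as underpriced, while one that closes at 18.0 is labeled as overpriced.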

Folders and files

experiments
functions
models
scripts
trainer
tuning
LICENSE
README.md
requirements.txt
