Automatic Extraction of Startup Metrics
KFund, a leading Spanish venture capital fund, invests in innovative early-stage startups such as Factorial and Graphext. Because its strategy targets companies with high disruptive and growth potential, it has to manage a large number of investment proposals submitted as PDF presentations. These presentations vary significantly, yet they contain crucial data, such as monthly recurring revenue (MRR), number of founders, and five-year financial projections, which must be analyzed in a structured manner to make informed investment decisions.
In response to this challenge, WhiteBox developed an automated processing system. We first converted the PDF files to text using computer vision techniques, including Vision Transformers. We then built an application on OpenAI's large language models (LLMs), using the LangChain framework to extract the relevant data from the pitch texts. Finally, we stored each startup's KPIs in a relational database, which makes it quick and efficient to query and compare the metrics of different companies.
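The last two steps of the pipeline, turning the model's reply into structured fields and persisting them, can be illustrated with a minimal sketch. It is not the production system: it assumes the LLM has been prompted to answer with a JSON object, and the field names (`name`, `mrr_eur`, `founders`), the table `startup_kpis`, and the use of SQLite are hypothetical choices made only for this example. The PDF-to-text and LLM calls are left out, and a hard-coded reply stands in for them.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartupKpis:
    """Hypothetical record for the metrics extracted from one pitch deck."""
    name: str
    mrr_eur: Optional[float]
    founders: Optional[int]


def parse_llm_output(raw: str) -> StartupKpis:
    """Turn the model's JSON reply into a typed record; missing metrics become None."""
    data = json.loads(raw)
    mrr = data.get("mrr_eur")
    founders = data.get("founders")
    return StartupKpis(
        name=str(data["name"]),
        mrr_eur=float(mrr) if mrr is not None else None,
        founders=int(founders) if founders is not None else None,
    )


def store(conn: sqlite3.Connection, kpis: StartupKpis) -> None:
    """Insert or update one startup's KPIs in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS startup_kpis ("
        "name TEXT PRIMARY KEY, mrr_eur REAL, founders INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO startup_kpis VALUES (?, ?, ?)",
        (kpis.name, kpis.mrr_eur, kpis.founders),
    )
    conn.commit()


# Stand-in for the LLM's answer to an extraction prompt (assumed format).
raw_reply = '{"name": "ExampleCo", "mrr_eur": 42000, "founders": 2}'

conn = sqlite3.connect(":memory:")
store(conn, parse_llm_output(raw_reply))
rows = conn.execute("SELECT name, mrr_eur, founders FROM startup_kpis").fetchall()
```

Keeping the parsing step separate from the storage step means a malformed or incomplete reply can be caught before anything is written to the database.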
This approach transformed KFund's evaluation process from a manual, laborious task that required a full-time employee into an automated and effective one. KFund can now focus on evaluating only those startups whose key indicators meet its pre-established investment criteria. The system not only saves time but also gives KFund access to consolidated metrics and valuable insights about the startups that request funding, improving the selection and analysis of potential investments.
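Screening against pre-established criteria could look something like the sketch below. The thresholds, the `Criteria` class, and the sample startups are all invented for illustration; KFund's actual criteria are not described here. One assumption worth noting: a startup with a missing metric is treated as not passing, although in practice it might instead be flagged for manual review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Criteria:
    """Hypothetical investment thresholds."""
    min_mrr_eur: float
    min_founders: int


def meets_criteria(
    mrr_eur: Optional[float], founders: Optional[int], criteria: Criteria
) -> bool:
    """Return True only when both metrics are present and reach the thresholds."""
    if mrr_eur is None or founders is None:
        return False  # assumption: incomplete data does not pass automatically
    return mrr_eur >= criteria.min_mrr_eur and founders >= criteria.min_founders


# Invented sample data: (name, MRR in EUR, number of founders).
startups = [
    ("AlphaCo", 50000.0, 2),
    ("BetaCo", 8000.0, 3),
    ("GammaCo", None, 2),
]

criteria = Criteria(min_mrr_eur=20000.0, min_founders=2)
shortlist = [
    name for name, mrr, founders in startups if meets_criteria(mrr, founders, criteria)
]
```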