
Understanding the costs of using LLMs

Are you an investor or entrepreneur considering getting into the Generative AI space? There is a big difference between showing a proof of concept (POC) and going into production, so don't even start without understanding the costs.

Graph: Bigger AI models, at a higher cost, require less training data.

Recently, Andreessen Horowitz published a great article about the Generative AI field. Still, one major element was missing: how to analyze the cost structure of Gen-AI startups and the implications for their potential success.


Zero-shot prompting is one of #OpenAI's recent major advancements: there is no need for you to supply examples. #ChatGPT was built on GPT-3.5, and this model delivers decent results when you simply explain what you need. That opens the door to many creative uses that can be described in plain words, and to perhaps 100 new startup ideas per second.

That isn't the case with competing Large Language Models, which often require many "shots", i.e., examples of the desired outcome.
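
To make the contrast concrete, here is a rough sketch using the legacy (pre-1.0) openai Python client with a Davinci-class completion model. The prompts and product examples are purely illustrative assumptions, not taken from any particular product.

```python
# Zero-shot vs. few-shot prompting, sketched with the legacy (pre-1.0) openai client.
# Assumes OPENAI_API_KEY is set in the environment; prompts are illustrative only.
import openai

zero_shot = "Write a friendly two-sentence product description for a solar-powered backpack."

few_shot = """Write a friendly two-sentence product description.

Product: noise-cancelling headphones
Description: Silence the world and sink into your music. Lightweight, wireless, and ready for long flights.

Product: insulated water bottle
Description: Keeps drinks icy for 24 hours or steaming for 12. Built to survive gym bags and mountain trails.

Product: solar-powered backpack
Description:"""

for name, prompt in [("zero-shot", zero_shot), ("few-shot", few_shot)]:
    response = openai.Completion.create(model="text-davinci-003",
                                        prompt=prompt, max_tokens=80)
    print(name, "->", response["choices"][0]["text"].strip())
```

With a strong zero-shot model, the first prompt alone is often enough; weaker models tend to need the second, longer (and therefore more expensive) style.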


Due to its quality and simple onboarding and usage, basing a startup on this model may seem like a great decision, but it may be a honeytrap. Look at the actual cost, two cents per 1,000 tokens (roughly 750 words), and ask yourself: how much can I charge my clients to cover it? ChatGPT's premium version costs $42 for a reason. A large amount of computing power is required to run this model.
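
A quick back-of-envelope calculation shows how fast those cents add up. Only the $0.02 per 1,000 tokens price comes from the discussion above; the token counts and usage figures below are hypothetical assumptions.

```python
# Back-of-envelope API economics. The price matches the $0.02 / 1,000 tokens figure
# above; request size and usage per user are illustrative assumptions.
PRICE_PER_1K_TOKENS = 0.02          # USD, prompt + completion combined
tokens_per_request = 1_500          # e.g. a long prompt plus the model's answer
requests_per_user_per_month = 300   # a hypothetical active user

cost_per_request = PRICE_PER_1K_TOKENS * tokens_per_request / 1_000
cost_per_user = cost_per_request * requests_per_user_per_month
print(f"${cost_per_request:.3f} per request, ${cost_per_user:.2f} per user per month")
# -> $0.030 per request, $9.00 per user per month, before paying for anything else
```

If your users expect a $10-per-month subscription, the API bill alone can eat nearly all of it.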


Given the high cost of OpenAI's Davinci, an open-source GPT-like model on Hugging Face may appeal to you. BLOOM has 176B parameters, and you think you've figured it out. Then you discover that hosting it on AWS costs $40-50k per month before you even scale up. Try bootstrapping with that...
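
As a sanity check of that figure, here is a rough estimate. It assumes BLOOM-176B needs around 350 GB of GPU memory in fp16, i.e. at least two 8xA100-40GB machines; the p4d.24xlarge on-demand rate is approximate and varies by region and by reserved-capacity discounts.

```python
# Rough hosting estimate for a 176B-parameter model on AWS. All numbers are
# assumptions: two p4d.24xlarge instances (8x A100 40GB each) to fit the model
# in fp16, at an approximate on-demand rate of ~$32.77/hour per instance.
hourly_rate = 32.77          # USD per p4d.24xlarge instance (approximate)
instances = 2                # enough A100 memory to hold ~350 GB of weights
hours_per_month = 24 * 30    # running 24/7

monthly_cost = hourly_rate * instances * hours_per_month
print(f"~${monthly_cost:,.0f} per month")   # ~$47,189, before any traffic scaling
```

That lands squarely in the $40-50k-per-month range, and it buys you exactly one always-on copy of the model.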

If your product relies on prompting a large LLM, you may have to charge a lot just to cover your costs, and you might never become profitable.


A breakthrough in computing costs may change this space entirely; in the meantime, however, there is another way to keep costs under control.


You could be in a much better position cost-wise if you are a data-driven startup. When you have access to a large amount of well-structured data, you can train smaller models, which reduces inference costs. Then you can decide whether to host your medium-sized model yourself via Hugging Face or fine-tune an AI21 Labs or OpenAI model.
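
As a sketch of that route, here is roughly what training a small model on your own data could look like with the Hugging Face transformers library. The model choice (distilgpt2), the data file name, and the hyperparameters are illustrative assumptions, not a recommendation for any specific use case.

```python
# A minimal sketch: fine-tuning a small open model on your own well-structured data
# with Hugging Face transformers. Model, file name, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "distilgpt2"                  # ~82M parameters vs. 175B+ for the giants
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style models have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# "your_domain_data.txt" is a hypothetical file: one training example per line.
dataset = load_dataset("text", data_files={"train": "your_domain_data.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="small-domain-model",
                           num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A model this size can run on a single modest GPU, so inference costs per request are a fraction of what a 175B+ parameter model demands.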


The cost structure of #generativeai startups can significantly impact their success, so entrepreneurs and investors should understand it. IP is also a key element: IP enables data to be leveraged, while without IP, data can become a risk. My next post will address this. Keep an eye out!
