We have spent this year on a series covering how a RAG system works and how to build one. Now it’s time to look at what it will cost to deploy one!
BLUF: If your constraints require you to host your own LLM instead of using a service such as OpenAI, Anthropic, or Google, this is going to get expensive.
The baseline that we will use to derive requirements and prices is a subset of the NASA documents that we have been using along the way. To start, I have pulled 500 of these documents, which total 109,516 paragraphs and 4,254,032 words.
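To make those corpus numbers easier to reason about, here is a minimal sketch that turns them into per-document averages and a rough token count. The 0.75 words-per-token ratio is an assumption (a common rule of thumb for English text, not something measured on this corpus), and `embedding_cost` with its `price_per_million_tokens` parameter is a hypothetical helper: plug in whatever rate your provider actually charges.

```python
# Corpus figures quoted in the text above.
DOCUMENTS = 500
PARAGRAPHS = 109_516
WORDS = 4_254_032

# ASSUMPTION: roughly 0.75 English words per token. This is a rule of thumb,
# not a measurement; run a real tokenizer over the corpus for exact counts.
WORDS_PER_TOKEN = 0.75


def corpus_summary(documents, paragraphs, words, words_per_token=WORDS_PER_TOKEN):
    """Return per-document averages and an estimated token count."""
    return {
        "paragraphs_per_document": paragraphs / documents,
        "words_per_document": words / documents,
        "words_per_paragraph": words / paragraphs,
        "estimated_tokens": round(words / words_per_token),
    }


def embedding_cost(tokens, price_per_million_tokens):
    """Hypothetical helper: cost of processing `tokens` at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million_tokens


summary = corpus_summary(DOCUMENTS, PARAGRAPHS, WORDS)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:,.2f}")
```

That works out to roughly 219 paragraphs and 8,508 words per document, or about 39 words per paragraph. Under the words-per-token assumption the corpus lands somewhere around 5.7 million tokens, which is the number that matters once we start pricing embeddings and prompts.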
We will start with a basic use case using OpenAI GPT-4o and a Weaviate-hosted vector store. From there we will move to a self-hosted solution for all of the components and add up the associated monthly cost.