This event has passed.

Private RAG Deployment & Cost

July 24, 2024 @ 6:00 pm – 7:00 pm

This year we have been running a series on how retrieval-augmented generation (RAG) works and how to build a RAG system. Now it’s time to look at what it will cost to deploy one!

BLUF: If you have constraints that require you to host your own LLM instead of using services such as OpenAI, Anthropic, or Google – this is going to get expensive.

The baseline that we will use to derive requirements and prices is a subset of the NASA documents that we have been using along the way. To start, I have pulled 500 of these documents, which total 109,516 paragraphs and 4,254,032 words.
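As a rough illustration of how the corpus figures above can feed a cost estimate, here is a minimal Python sketch. The document, paragraph, and word counts come from this announcement. Everything else is an assumption: the tokens-per-word ratio is a common rule of thumb for English text rather than a measurement of this corpus, the function names are hypothetical, and the price passed in at the bottom is a placeholder, not a quote from any provider.

```python
# Corpus figures quoted in this announcement.
DOCUMENTS = 500
PARAGRAPHS = 109_516
WORDS = 4_254_032

# Assumption: about 4/3 tokens per English word (rule of thumb, not measured).
TOKENS_PER_WORD = 4 / 3


def corpus_stats(documents, paragraphs, words):
    """Return simple per-document and per-paragraph averages."""
    return {
        "paragraphs_per_document": paragraphs / documents,
        "words_per_document": words / documents,
        "words_per_paragraph": words / paragraphs,
    }


def estimate_tokens(words, tokens_per_word=TOKENS_PER_WORD):
    """Estimate the token count from a word count using a fixed ratio."""
    return round(words * tokens_per_word)


def one_time_embedding_cost(tokens, usd_per_million_tokens):
    """Cost of embedding the corpus once at a caller-supplied price."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    for name, value in corpus_stats(DOCUMENTS, PARAGRAPHS, WORDS).items():
        print(f"{name}: {value:,.1f}")

    tokens = estimate_tokens(WORDS)
    print(f"estimated tokens: {tokens:,}")

    # Placeholder price for illustration only; substitute the real rate.
    placeholder_price = 0.10
    cost = one_time_embedding_cost(tokens, placeholder_price)
    print(f"one-time embedding cost at placeholder price: ${cost:,.2f}")
```

The price is a parameter rather than a constant, so the same arithmetic can be rerun with whatever rate a hosted provider or a self-hosted setup actually works out to.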

We will start with a basic use case using OpenAI GPT-4o and a hosted Weaviate vector store. From there we will move to a self-hosted solution for all of the components and add up the associated monthly cost.

Series:

Over the next several sessions, we will be diving deeper into the separate components needed for RAG – hopefully resulting in a chat-based Q&A service for the NASA Technical Report Server (NTRS). We were introduced to this data during our submission for the 2022 NASA SpaceApps Challenge – where we placed 2nd. Our submission was a semantic search over the abstracts of the 10,000 papers in the NTRS dataset.

Links:

Huntsville AI 2022 SpaceApps Submission – https://github.com/HSV-AI/spaceapps2022

Details:

Date – 07/24/2024
Time – 6-7pm
Location – HudsonAlpha
Address – 601 Genome Way Northwest, Huntsville, AL 35806
Zoom – https://us02web.zoom.us/j/84452278503?pwd=rJwxSbD1EAUdIHuzGoMscHYxpfULhR.1
