This event has passed.

Document Chunks with LLM Sherpa

Name: Document Chunks with LLM Sherpa
Start: 2024-04-03T18:00:00-05:00
End: 2024-04-03T19:00:00-05:00
Location: HudsonAlpha

April 3, 2024 @ 6:00 pm – 7:00 pm

Continuing our discussion about Retrieval Augmented Generation (RAG), this week we will incorporate LLM Sherpa to provide chunks of text from PDF documents that have been retrieved from the NASA archive.

Our initial attempt used PyPDF2 to read text from the PDF documents. It was very slow and returned short strings of text that did not line up with the paragraphs in the documents. We’ll look back at what was available at the time, then walk through the LLM Sherpa API and see how the application looks with that piece incorporated.
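As a rough illustration of the LLM Sherpa step, the sketch below reads one PDF through the `llmsherpa` package's `LayoutPDFReader` and collects the text of each chunk. The helper names `chunks_to_texts` and `read_pdf_chunks`, the `min_chars` filter, and the path and URL in the usage comment are assumptions for illustration, not the code presented at the event.

```python
from typing import Iterable, List, Protocol


class Chunk(Protocol):
    """Anything that can render itself as text, like an llmsherpa chunk."""

    def to_context_text(self) -> str: ...


def chunks_to_texts(chunks: Iterable[Chunk], min_chars: int = 1) -> List[str]:
    """Return the stripped text of each chunk, skipping ones shorter than min_chars."""
    texts = []
    for chunk in chunks:
        text = chunk.to_context_text().strip()
        if len(text) >= min_chars:
            texts.append(text)
    return texts


def read_pdf_chunks(pdf_path: str, api_url: str) -> List[str]:
    """Parse a PDF with LLM Sherpa and return its chunk texts.

    Requires the third-party `llmsherpa` package and a reachable parser
    service at `api_url`; imported lazily so the helper above works without it.
    """
    from llmsherpa.readers import LayoutPDFReader

    reader = LayoutPDFReader(api_url)
    doc = reader.read_pdf(pdf_path)
    return chunks_to_texts(doc.chunks())


# Hypothetical usage, assuming a parser service hosted locally:
# texts = read_pdf_chunks(
#     "report.pdf",
#     "http://localhost:5010/api/parseDocument?renderFormat=all",
# )
```

Unlike page-level text extraction, each chunk here follows the layout of the document (paragraphs, lists, tables), which is the property the PyPDF2 attempt was missing.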

As we get further into this project, it has become apparent that we need to split the monolithic application into components that can be hosted and updated separately. We will go through what has been done so far to containerize both the ChromaDB vector database and the LLM Sherpa chunking service.
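To make the split concrete, here is a minimal sketch of an application talking to a separately hosted ChromaDB instance over HTTP instead of embedding the database in-process. The function names, the id scheme, the metadata field, and the host, port, and collection name in the usage comment are all assumptions for illustration; the actual project layout may differ.

```python
from typing import Any, List, Sequence


def store_chunks(collection: Any, texts: Sequence[str], source: str) -> List[str]:
    """Add chunk texts to a Chroma-style collection and return the ids used.

    Ids are built as "<source>-<index>" (an illustrative scheme, not a
    ChromaDB requirement beyond ids being unique strings).
    """
    ids = [f"{source}-{i}" for i in range(len(texts))]
    if ids:  # avoid calling add() with empty lists
        collection.add(
            ids=ids,
            documents=list(texts),
            metadatas=[{"source": source} for _ in texts],
        )
    return ids


def connect_collection(host: str, port: int, name: str) -> Any:
    """Connect to a ChromaDB server running in its own container.

    Requires the third-party `chromadb` package; imported lazily so
    store_chunks can be exercised without it.
    """
    import chromadb

    client = chromadb.HttpClient(host=host, port=port)
    return client.get_or_create_collection(name=name)


# Hypothetical usage, assuming the container publishes port 8000:
# collection = connect_collection("localhost", 8000, "nasa_documents")
# store_chunks(collection, ["first chunk", "second chunk"], "report.pdf")
```

Because the application only holds a host and port, the database container can be restarted or upgraded without rebuilding the rest of the application.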

Details

Date: April 3, 2024
Time: 6:00 pm – 7:00 pm

Venue

HudsonAlpha
601 Genome Way Northwest
Huntsville, AL 35806