4 major challenges with RAG

Amit Singh Rathore
2 min read · Jan 5, 2024

Experience-based learning on what’s difficult with RAG in real life

Real data [Documents] are Messy

Most RAG examples out there deal with a single document type, usually a PDF or a Word doc. In real life, though, information is spread across formats: PDFs, PPTs, GitHub READMEs, RST files hosted on a Sphinx server, wiki pages on Confluence, and so on. On top of this, each document contains images, tables, and code blocks, so parsing the right elements out of each one becomes extremely challenging.
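To make the problem concrete, here is a minimal sketch of what per-format, per-element parsing could look like. It is an illustration under stated assumptions, not a production parser: the names `Element`, `split_markdown`, `PARSERS`, and `parse_document` are hypothetical, only Markdown is handled, and tables are detected with a naive "line starts with a pipe" rule. Real pipelines would plug a dedicated parser into the registry for each format.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List

FENCE = "`" * 3  # Markdown code-fence marker, built this way to avoid nesting fences


@dataclass
class Element:
    kind: str      # "text", "table", or "code"
    content: str


def split_markdown(text: str) -> List[Element]:
    """Split Markdown into text, pipe-table, and fenced-code elements (naive rules)."""
    elements: List[Element] = []
    buffer: List[str] = []
    kind = "text"

    def flush() -> None:
        # Emit the buffered lines as one element of the current kind.
        nonlocal buffer
        content = "\n".join(buffer).strip()
        if content:
            elements.append(Element(kind, content))
        buffer = []

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # A fence line toggles between code and text; the fence itself is dropped.
            flush()
            kind = "text" if kind == "code" else "code"
            continue
        if kind != "code":
            # Assumption: any line starting with "|" belongs to a table.
            line_kind = "table" if stripped.startswith("|") else "text"
            if line_kind != kind:
                flush()
                kind = line_kind
        buffer.append(line)
    flush()
    return elements


# Hypothetical registry: one parser per file extension.
PARSERS: Dict[str, Callable[[str], List[Element]]] = {
    ".md": split_markdown,
}


def parse_document(file_name: str, text: str) -> List[Element]:
    """Route a document to the parser for its format; fall back to plain text."""
    parser = PARSERS.get(Path(file_name).suffix.lower())
    if parser is None:
        return [Element("text", text)]
    return parser(text)


sample = (
    "Intro line\n\n"
    "| a | b |\n| 1 | 2 |\n\n"
    + FENCE + "python\nprint('hi')\n" + FENCE + "\n"
    "Outro"
)
for element in parse_document("README.md", sample):
    print(element.kind, repr(element.content))
```

Keeping the element type alongside the content lets later stages treat tables and code differently from prose, for example by chunking or embedding them separately. Every additional format (PDF, PPT, RST, wiki export) needs its own entry in the registry, which is where most of the real effort goes.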

Unstructured is the…

Amit Singh Rathore

Staff Data Engineer @ Visa — Writes about Cloud | Big Data | ML