Semi-structured RAG with LangChain and OpenAI GPT-4: RAG on tabular data, semi-structured documents

Published: 05 January 2025
Channel: Rithesh Sreenivasan

If you would like to support me financially (this is totally optional and voluntary), you can buy me a coffee here: https://www.buymeacoffee.com/rithesh

Many documents contain a mixture of content types, including text and tables.
Semi-structured data can be challenging for conventional RAG for at least two reasons:
• Text splitting may break up tables, corrupting the data at retrieval time
• Embedding tables may pose challenges for semantic similarity search
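The first problem can be seen with a small, self-contained sketch. It is only an illustration: the sample document, the `split_fixed` helper, and the chunk size of 60 are made up for this example and are not taken from the video or from LangChain.

```python
def split_fixed(text: str, chunk_size: int) -> list[str]:
    """Naive splitter: cut every `chunk_size` characters, ignoring structure."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# Hypothetical document: one sentence, a small table, one more sentence.
table_rows = [
    "Region | Revenue | Growth",
    "North  | 120     | 5%",
    "South  | 95      | -2%",
]
document = "\n".join(
    ["Quarterly results were mixed."] + table_rows + ["Outlook remains stable."]
) + "\n"

chunks = split_fixed(document, chunk_size=60)

# A row is "broken" if no single chunk contains it in full.
broken_rows = [row for row in table_rows if not any(row in c for c in chunks)]

for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ---")
    print(chunk)
print("Rows split across chunks:", broken_rows)
```

Here the header row lands in one chunk while the "North" row is cut in half and the "South" row ends up in a chunk without its header, so a retrieved chunk no longer carries a complete, readable table.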
This video shows how to perform RAG on documents with semi-structured data:
• We will use Unstructured to parse both text and tables from documents (PDFs).
• We will use the multi-vector retriever to store the raw tables and text along with table summaries, which are better suited for retrieval.
• We will use LCEL to implement the chains used.
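The idea behind the multi-vector retriever step can be sketched in plain Python: search over short summaries, but return the raw table or text they point to. This is a toy illustration under stated assumptions, not LangChain's implementation. The `ToyMultiVectorRetriever` class, the bag-of-words `embed` function (standing in for a real embedding model), and the hand-written summaries (standing in for LLM-generated ones) are all hypothetical.

```python
import math
import re
import uuid
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyMultiVectorRetriever:
    """Indexes summaries for search, but returns the raw content they describe."""

    def __init__(self) -> None:
        self.index: list[tuple[Counter, str]] = []  # (summary vector, doc_id)
        self.docstore: dict[str, str] = {}          # doc_id -> raw table or text

    def add(self, raw: str, summary: str) -> str:
        doc_id = str(uuid.uuid4())
        self.docstore[doc_id] = raw                  # the raw element is kept whole
        self.index.append((embed(summary), doc_id))  # only the summary is searched
        return doc_id

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.index, key=lambda item: cosine(q, item[0]), reverse=True)
        return [self.docstore[doc_id] for _, doc_id in ranked[:k]]


# Hypothetical parsed elements and summaries.
raw_table = "Region | Revenue | Growth\nNorth | 120 | 5%\nSouth | 95 | -2%"
raw_text = "Quarterly results were mixed. Outlook remains stable."

retriever = ToyMultiVectorRetriever()
retriever.add(raw_table, summary="Table of revenue and growth by region")
retriever.add(raw_text, summary="Narrative paragraph about the company outlook")

result = retriever.retrieve("What was revenue growth in the South region?")
print(result[0])
```

Because the match is made against the summary, the table is found by a natural-language question, and because the docstore holds the untouched original, the full table (header and all rows) is what gets passed on to the language model.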
Colab notebook: https://colab.research.google.com/dri...
https://github.com/langchain-ai/langc...


If you like such content, please subscribe to the channel here:
https://www.youtube.com/c/RitheshSree...