Tudor Golubenco - Semantic vs keyword search as context for GPT

Published: 02 November 2024
on channel: Plain Schwarz

OpenAI's ChatGPT has taken the world by storm, and people want to offer the same kind of chatbot experience on their own data. Such a bot can answer questions based on your documentation or knowledge base.

This can be done with the OpenAI API by providing the right context, extracted from your data, to the model. You can do this in two steps:

the search step: perform a search to select the documentation pages that are likely to contain the answer.
the GPT step: provide these pages as context with a prompt like "With this context: .... answer this question: ...".
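
The two steps above can be sketched roughly as follows. This is only an illustration, not the speaker's implementation: the sample pages, the function names (`tokenize`, `bm25_scores`, `search`, `build_prompt`) and the parameter values `k1=1.2` and `b=0.75` are all assumptions, and the scoring is a small hand-written version of one common BM25 variant. The resulting prompt string is what would then be sent to the model through the OpenAI API; that call is left out here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.2, b=0.75):
    # Score every document against the query with a common BM25 variant.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def search(query, docs, top_k=2):
    # Search step: keep the best-scoring pages that match at least one term.
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k] if scores[i] > 0]


def build_prompt(question, pages):
    # GPT step: put the selected pages in front of the question.
    context = "\n---\n".join(pages)
    return f"With this context:\n{context}\n\nAnswer this question: {question}"


# Hypothetical documentation pages, made up for this example.
DOCS = [
    "To create a new database, run the create command from the CLI.",
    "Billing is calculated monthly based on storage and requests.",
    "Branches let you test schema changes before merging them.",
]

question = "How do I create a database?"
pages = search(question, DOCS)
prompt = build_prompt(question, pages)
print(prompt)
```

Keeping the search step separate from the prompt construction means the retrieval part can be swapped (keyword or semantic) without touching the GPT step.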

For the search step, semantic search is often used, because it makes use of the LLM's capabilities. However, we have found that in practice keyword search (e.g. BM25-based) has some advantages when it comes to tuning the search step, and it tends to be more "explainable".
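
One way to see what "explainable" can mean for keyword search: a BM25 score is a sum of per-term contributions, so each result can be broken down by query term. The sketch below is an illustration under assumptions, not something taken from the talk: `explain_bm25`, the sample pages and the parameter values `k1=1.2` and `b=0.75` are hypothetical, and the formula is one common BM25 variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def explain_bm25(query, docs, k1=1.2, b=0.75):
    # For each document, map every matched query term to its score contribution.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    df = Counter(term for t in tokenized for term in set(t))
    explanations = []
    for tokens in tokenized:
        tf = Counter(tokens)
        parts = {}
        for term in set(tokenize(query)):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            parts[term] = idf * tf[term] * (k1 + 1) / norm
        explanations.append(parts)
    return explanations


# Hypothetical documentation pages, made up for this example.
PAGES = [
    "Use the backup command to export a snapshot of your data.",
    "Restore a snapshot with the restore command.",
    "Pricing depends on the number of requests.",
]

explanations = explain_bm25("restore snapshot", PAGES)
for page, parts in zip(PAGES, explanations):
    total = sum(parts.values())
    print(f"{total:.3f}  {page}")
    for term, value in sorted(parts.items()):
        print(f"    {term}: {value:.3f}")
```

A breakdown like this shows which words pulled a page into the context, which is the kind of information that helps when tuning the search step. An embedding similarity score does not decompose by term in the same way.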

Speaker: Tudor Golubenco

More: https://2023.berlinbuzzwords.de/sessi...

Web: https://2023.berlinbuzzwords.de/
Fediverse: https://floss.social/@berlinbuzzwords
LinkedIn: / 13978964
Twitter: / berlinbuzzwords