Alessandro Benedetti - Introducing Multi-valued Vector Fields in Apache Lucene

Published: 17 January 2025
Channel: Plain Schwarz

Since the introduction of native vector-based search in Apache Lucene, many features have been developed, but support for multiple vectors in a dedicated KNN vector field remained unexplored.
Being able to index (and search) multiple values per field makes it possible to work with long textual documents by splitting them into paragraphs and encoding each paragraph as a separate vector, a scenario that many businesses encounter.
This talk explores the challenges, the technical design, and the implementation work behind this contribution to the Apache Lucene project.
The audience can expect to come away with an understanding of how multi-valued fields can work in a vector-based search use case and how this feature has been implemented.

Speaker: Alessandro Benedetti

More: https://2023.berlinbuzzwords.de/sessi...

Web: https://2023.berlinbuzzwords.de/
Fediverse: https://floss.social/@berlinbuzzwords
Linkedin:   / 13978964  
Twitter:   / berlinbuzzwords