[RAG-GOOGLE] MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings

About this audio

Welcome to our podcast! Today, we're diving into MUVERA (Multi-Vector Retrieval Algorithm), a development from researchers at Google Research, UMD, and Google DeepMind. Neural embedding models are fundamental to modern information retrieval (IR). Multi-vector models, which represent each query and document with a set of embeddings rather than a single one, tend to retrieve better but are far more expensive to search. MUVERA addresses this by reducing multi-vector similarity search to single-vector search, which allows the use of highly optimised MIPS (Maximum Inner Product Search) solvers.
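To make the cost concrete, here is a minimal sketch of the multi-vector scoring that MUVERA is trying to avoid at search time. It assumes the similarity is Chamfer similarity (for each query vector, take the best inner product against the document's vectors, then sum), which is how the linked paper describes it. The function names and the toy corpus are ours, for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def chamfer_similarity(query: List[Vector], doc: List[Vector]) -> float:
    """Sum, over query vectors, of the best inner product with any doc vector."""
    return sum(max(dot(q, d) for d in doc) for q in query)


def brute_force_search(query: List[Vector],
                       corpus: List[List[Vector]],
                       k: int = 1) -> List[int]:
    """Score every document exhaustively and return the indices of the top k.

    The cost grows with (#docs x #query vectors x #doc vectors), which is
    why a single-vector proxy that a MIPS index can handle is attractive.
    """
    scored = sorted(
        ((chamfer_similarity(query, doc), i) for i, doc in enumerate(corpus)),
        reverse=True,
    )
    return [i for _, i in scored[:k]]
```

For example, with `query = [[1, 0], [0, 1]]`, a document holding both of those vectors scores 2.0, while a document holding only `[1, 0]` scores 1.0, so the first one ranks higher.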

The core innovation is Fixed Dimensional Encodings (FDEs): each set of embeddings is mapped to a single fixed-length vector so that the inner product of two FDEs approximates the multi-vector similarity. FDEs are the first single-vector proxies of this kind to come with theoretical guarantees (ε-approximations). Empirically, MUVERA outperforms heavily optimised prior implementations such as PLAID, achieving an average of 10% higher recall with 90% lower latency across diverse BEIR retrieval datasets. It also incorporates product quantization, which compresses FDEs by 32x in memory with minimal quality loss.
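The FDE idea can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes space is partitioned with random hyperplanes (SimHash-style), that the query side sums the vectors landing in each bucket, and that the document side averages them. The repetitions, the dimension-reducing projections, and the handling of empty buckets described in the paper are left out. All names and parameters below are hypothetical.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def make_hyperplanes(num_bits: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Random Gaussian hyperplanes; they must be shared by queries and documents."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def bucket_of(v: Vector, planes: List[List[float]]) -> int:
    """Bucket id built from which side of each hyperplane v falls on."""
    bucket = 0
    for plane in planes:
        bucket = (bucket << 1) | (1 if dot(plane, v) > 0 else 0)
    return bucket


def encode(vectors: List[Vector], planes: List[List[float]], average: bool) -> List[float]:
    """Concatenate one dim-sized block per bucket into a single fixed-length vector."""
    dim = len(vectors[0])
    num_buckets = 1 << len(planes)
    sums = [[0.0] * dim for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for v in vectors:
        b = bucket_of(v, planes)
        counts[b] += 1
        for j, x in enumerate(v):
            sums[b][j] += x
    fde: List[float] = []
    for b in range(num_buckets):
        # Documents use the bucket centroid; queries keep the raw sum.
        scale = 1.0 / counts[b] if average and counts[b] else 1.0
        fde.extend(x * scale for x in sums[b])
    return fde


def query_fde(vectors: List[Vector], planes: List[List[float]]) -> List[float]:
    return encode(vectors, planes, average=False)


def doc_fde(vectors: List[Vector], planes: List[List[float]]) -> List[float]:
    return encode(vectors, planes, average=True)
```

Once every document has been encoded this way, `dot(query_fde(q, planes), doc_fde(d, planes))` is an ordinary inner product, so any off-the-shelf MIPS index can retrieve candidates. The output length is `2 ** num_bits * dim` regardless of how many vectors went in. On the 32x figure: assuming float32 coordinates and one 8-bit code per group of 8 dimensions, 32 bytes shrink to 1 byte, which matches the compression ratio quoted above.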

One current limitation is that MUVERA did not outperform PLAID on the MS MARCO dataset, possibly because PLAID has been tuned extensively for that specific benchmark. How the average number of embeddings per document affects FDE retrieval quality also remains an open question for future study. MUVERA's applications lie primarily in modern IR pipelines, and it could make the retrieval components used alongside LLMs more efficient.

Learn more: https://arxiv.org/pdf/2405.19504
