
Don’t forget private retrieval: distributed private similarity search for large language models

Abstract:

Large language models (LLMs) are capable of performing general-purpose tasks, but they lack knowledge of recent events and of private data. To address this, LLMs increasingly use Retrieval-Augmented Generation (RAG) to fetch data relevant to a given query and feed both the query and the retrieved data to the model. However, because the data may be hosted in private data stores, and queries may contain valuable intellectual property (IP), it is important to keep both opaque to the servers handling these requests.

The paper proposes a method named Private Retrieval Augmented Generation (PRAG) to accomplish this goal. To protect the confidentiality of both the query and the database, the authors use secure multi-party computation (MPC) techniques that are compatible with standard neural IR embeddings and top-k retrieval algorithms, and that scale to medium-sized databases with minimal impact on accuracy. Communication costs are also relatively low, a substantial improvement over previous MPC approaches. The method could still be made more efficient for very large databases, which remains an avenue for further research.
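To give a flavor of why MPC composes well with embedding-based retrieval, the following is a minimal sketch (not the paper's actual protocol) of two-server additive secret sharing: because the dot product is linear, each server can compute a similarity score on its share of the query alone, and the true score is recovered only when the partial results are combined. All values and the modulus here are illustrative assumptions; in the full PRAG setting the database is also secret-shared and the top-k selection itself runs under MPC.

```python
import secrets

P = 2**61 - 1  # public prime modulus (illustrative choice)

def share(vec):
    """Split each coordinate into two additive shares mod P."""
    s1 = [secrets.randbelow(P) for _ in vec]
    s2 = [(v - r) % P for v, r in zip(vec, s1)]
    return s1, s2  # each share alone is uniformly random

def partial_dot(share_vec, doc):
    """A server's partial similarity score, computed on its share only."""
    return sum(s * d for s, d in zip(share_vec, doc)) % P

# toy fixed-point integer embeddings (hypothetical values)
query = [3, 1, 4]
docs = [[2, 7, 1], [5, 0, 2]]

q1, q2 = share(query)
scores = []
for doc in docs:
    p1 = partial_dot(q1, doc)          # computed by server 1
    p2 = partial_dot(q2, doc)          # computed by server 2
    scores.append((p1 + p2) % P)       # reconstructs <query, doc>

# matches the plaintext dot products
assert scores == [sum(a * b for a, b in zip(query, d)) for d in docs]
```

Neither server's share reveals anything about the query on its own, yet the combined partial scores equal the plaintext similarities, which is what lets standard top-k ranking proceed over the reconstructed scores.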

Authors:
Guy Zyskind, Tobin South, Alex ‘Sandy’ Pentland
Year:
2024
MIT Political Science
ECIR
GSS