VeRA

VeRA, short for Vector database Retrieval Augmentation, is the repository for our project "Optimizing Knowledge Retrieval: A Benchmark-Driven Study of VecDB-RAG Integration".

VeRA addresses the limitations of Large Language Models (LLMs) by integrating Vector Databases (VecDBs) with Retrieval-Augmented Generation (RAG) models. LLMs often suffer from hallucinations, a lack of domain expertise, and biased responses, which undermine their reliability, and their knowledge cannot easily be updated in real time. Moreover, training and operating LLMs requires heavy computational resources.

The project proposes a solution by leveraging VecDBs to provide efficient storage and retrieval of domain-specific knowledge for RAG models. By integrating VecDBs, RAG models can access up-to-date and contextually relevant data, leading to more accurate and relevant responses.
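As a rough illustration of the retrieval step described above, the sketch below keeps a few document embeddings in memory, ranks them by cosine similarity against a query embedding, and places the best matches in a prompt. It is not the code from this repository: the toy three-dimensional vectors, the `InMemoryVecDB` class, and the `build_prompt` helper are hypothetical stand-ins for a real embedding model and a real VecDB.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class InMemoryVecDB:
    """Hypothetical stand-in for a vector database (brute-force search)."""
    records: list = field(default_factory=list)

    def add(self, text, embedding):
        self.records.append((text, embedding))

    def search(self, query_embedding, k=2):
        # Score every stored record, then keep the k most similar texts.
        scored = [
            (cosine_similarity(query_embedding, emb), text)
            for text, emb in self.records
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def build_prompt(question, passages):
    """Prepend retrieved passages to the question (assumed prompt layout)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Toy embeddings chosen by hand; a real pipeline would compute them
# with an embedding model.
db = InMemoryVecDB()
db.add("S3 stores objects in buckets.", [0.9, 0.1, 0.0])
db.add("Lambda runs code without servers.", [0.1, 0.9, 0.0])
db.add("DynamoDB is a key-value store.", [0.0, 0.2, 0.9])

query_embedding = [0.8, 0.2, 0.1]
top_passages = db.search(query_embedding, k=2)
prompt = build_prompt("Where are objects stored?", top_passages)
print(prompt)
```

Swapping `InMemoryVecDB` for an actual vector database would change only the `add` and `search` calls; the prompt assembly stays the same.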

We also aim to develop a standardized evaluation framework to assess the performance of various VecDB-RAG combinations and conduct comparative analyses with different types of open-source vector databases. By optimizing the retrieval procedure and exploring alternative algorithms, we seek to enhance the efficiency and effectiveness of RAG models in dynamically evolving environments. Time and resources permitting, we also aim to explore the performance of various algorithms in optimally matching and retrieving relevant records from the VecDB.

The proposed benchmark evaluates each VecDB-RAG combination on the following metrics:

Retrieval Latency
Throughput
Computational Complexity
Scalability
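To make the first two metrics concrete, here is a minimal sketch of how retrieval latency and throughput could be measured for any search function, repeated at two corpus sizes for a first look at scalability. It is an assumption-laden illustration, not the benchmark used in this project: `brute_force_search`, `benchmark`, and `random_vectors` are hypothetical helpers, the vectors are random, and the brute-force search stands in for a real VecDB query.

```python
import random
import statistics
import time


def brute_force_search(index, query, k):
    """Return indices of the k vectors with the largest dot product."""
    scores = [
        (sum(q * v for q, v in zip(query, vec)), i)
        for i, vec in enumerate(index)
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]


def benchmark(search_fn, index, queries, k=5):
    """Time each query; report latency (ms) and throughput (queries/s)."""
    latencies = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        search_fn(index, query, k)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_ms": 1000 * statistics.mean(latencies),
        # With n=20 the last cut point is the 95th percentile.
        "p95_latency_ms": 1000 * statistics.quantiles(latencies, n=20)[-1],
        "throughput_qps": len(queries) / elapsed,
    }


def random_vectors(count, dim, rng):
    """Generate synthetic vectors (placeholder for real embeddings)."""
    return [[rng.random() for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
queries = random_vectors(20, 16, rng)

# Repeat at growing corpus sizes to see how latency scales.
results = {}
for size in (100, 1000):
    index = random_vectors(size, 16, rng)
    results[size] = benchmark(brute_force_search, index, queries)
    print(size, results[size])
```

Because `benchmark` only needs a callable with the signature `(index, query, k)`, the same harness could wrap a query against any of the vector databases under comparison. For the brute-force baseline shown here, the scoring cost per query grows linearly with the number of stored vectors.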

Some open-source vector databases that can be considered for this project include:

FAISS
Milvus
Weaviate

Key Features

Integration of VecDBs with RAG models
Efficient storage and retrieval of domain-specific knowledge
Real-time updates and contextually relevant responses
Development of standardized evaluation frameworks

Report

The detailed project report is available as VeRA_Report.pdf in this repository.

Online Proceedings

Check out our SPOTLIGHT Paper in the proceedings here

Contributors

Vihang Pancholi
Shubh Mehta
Aditi Ganapathi
