Improving RAG-based GPT Search Accuracy from 65% to 90%

Retrieval Augmented Generation (RAG) combines a traditional search step with a Large Language Model's ability to understand text and generate answers. Because the model answers from the documents it is given, the answers generated by RAG are only as accurate as the documents it retrieves, which makes search accuracy a direct driver of answer quality.

In this article, we will explore how we improved search accuracy for RAG applications from 65% using basic text search to over 90%.


The Initial Framework: Setting the Stage for Advanced Search

Our initial setup for testing and improving accuracy involved several key components:

Azure Cognitive Search: The search service provided by Microsoft Azure. It offers scalable and reliable search capabilities, which are crucial for handling large volumes of data.
Document and Question Database: We indexed 121 document chunks and linked 186 questions to specific documents containing the answers.
Success Measurement: A search counted as successful when the document linked to the question was among the results that fit within a 1000-token context window.

This foundational structure was essential for our subsequent enhancements.
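The success measurement described above can be sketched in a few lines. This is a minimal illustration, not the evaluation code used for this article: the `Chunk` type, the function names, and the rule of packing results greedily in rank order until the 1000-token budget is exhausted are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str       # identifier of the indexed document chunk (hypothetical)
    token_count: int  # chunk length in tokens; the tokenizer is an assumption

def hit_within_budget(ranked, expected_doc_id, budget=1000):
    """Return True if the expected chunk fits into the context window.

    Chunks are taken in rank order until the next one would exceed the budget.
    """
    used = 0
    for chunk in ranked:
        if used + chunk.token_count > budget:
            break
        used += chunk.token_count
        if chunk.doc_id == expected_doc_id:
            return True
    return False

def accuracy(results, budget=1000):
    """results: list of (ranked_chunks, expected_doc_id) pairs, one per question."""
    if not results:
        return 0.0
    hits = sum(hit_within_budget(r, e, budget) for r, e in results)
    return hits / len(results)

ranked = [Chunk("a", 400), Chunk("b", 500), Chunk("c", 300)]
results = [
    (ranked, "b"),  # hit: "a" and "b" use 900 tokens
    (ranked, "c"),  # miss: adding "c" would need 1200 tokens
]
print(accuracy(results))  # 0.5
```

Measuring against a token budget rather than a fixed top-k rewards rankings that place the right chunk early, since long irrelevant chunks can crowd the answer out of the window.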


📊 Basic Text Search: The Starting Point

The initial approach was basic text search, a straightforward method:

How Basic Text Search Operates

Keyword Matching: Searches for exact keyword matches within documents.
Limitations: Misses context and deeper semantic meaning, so a question worded differently from the document may not match at all.
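To make the limitation concrete, here is a toy keyword matcher. It is only an illustration: a production engine such as Azure Cognitive Search uses analyzers and BM25 scoring rather than this simple term count, and the documents and query below are invented for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_score(query, document):
    """Count how many distinct query terms appear verbatim in the document."""
    doc_terms = set(tokenize(document))
    return sum(1 for term in set(tokenize(query)) if term in doc_terms)

# Invented sample data for illustration only.
docs = {
    "refund-policy": "Customers can request a refund within 30 days of purchase.",
    "shipping": "Orders ship within two business days.",
}
query = "How do I get my money back?"
scores = {doc_id: keyword_score(query, text) for doc_id, text in docs.items()}
print(scores)  # {'refund-policy': 0, 'shipping': 0}
```

The refund document clearly answers the question, yet it scores zero because the query never uses the word "refund".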

Initial Results

Baseline Accuracy: This method achieved a starting accuracy of 65.41%.

Though basic text search was a good starting point, it was clear that more sophisticated methods were needed.

🔍 Implementing Search Term Expansion: Enhancing Queries

To improve upon basic text search, we introduced a Search Term Expansion step that works as follows:

Each question first passes through a GPT-3.5-powered Search Term Expansion step, which derives additional relevant search terms and keywords.
The resulting contextually enriched search terms are then used to query Azure Cognitive Search.
Accuracy Uplift: This approach raised our search accuracy to 70.81%.

Integrating Search Term Expansion was a key move in bridging the gap between simple queries and the complex content within our documents.
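A minimal sketch of such an expansion step is shown below. The prompt wording, the `expand_query` function, and the choice to append the new terms to the original question are assumptions made for illustration; the model call is passed in as a plain callable so the sketch runs without any API key.

```python
from typing import Callable, List

# Hypothetical prompt; the prompt actually used is not given in this article.
PROMPT_TEMPLATE = (
    "Suggest up to {n} additional search terms or keywords for the question "
    "below. Reply with a comma-separated list only.\n\nQuestion: {question}"
)

def expand_query(question: str, complete: Callable[[str], str], n: int = 5) -> str:
    """Return the question followed by de-duplicated expansion terms."""
    reply = complete(PROMPT_TEMPLATE.format(n=n, question=question))
    seen = set()
    terms: List[str] = []
    for raw in reply.split(","):
        term = raw.strip().strip('"').lower()
        # Skip empty terms, repeats, and terms already contained in the question.
        if term and term not in seen and term not in question.lower():
            seen.add(term)
            terms.append(term)
    return " ".join([question] + terms[:n])

def fake_complete(prompt: str) -> str:
    """Stand-in for a GPT-3.5 call, returning a canned reply."""
    return "refund, reimbursement, Refund, return policy"

expanded = expand_query("How do I get my money back?", fake_complete)
print(expanded)  # How do I get my money back? refund reimbursement return policy
```

In a real pipeline, `fake_complete` would be replaced by a function that sends the prompt to the chat model and returns its text reply.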

🌐 Semantic Reranking: The Leap to Contextual Understanding

While text search is good at finding an initial set of documents, it lacks a contextual understanding of the question. As a result, the relevance scores of the returned documents (typically computed with BM25 for text search, or with RRF when several result lists are merged) are often not accurate.

To solve this, we enabled the Semantic Ranking feature in Cognitive Search, which takes the initial set of documents returned by the search and re-ranks them using its natural language understanding of the query and the documents.
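The two-stage pattern behind this, retrieve first and then rescore, can be sketched generically. Semantic Ranking itself is a managed feature whose internals are not covered here, so the `rerank` function, its parameters, and the toy synonym-based scorer below are purely illustrative assumptions.

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    candidates: List[Tuple[str, str]],
    semantic_score: Callable[[str, str], float],
    top_n: int = 50,
    top_k: int = 3,
) -> List[str]:
    """Rescore the first top_n candidates and return the best top_k doc ids.

    candidates: (doc_id, text) pairs already ordered by the first-stage search.
    """
    head = candidates[:top_n]
    scored = [(semantic_score(query, text), doc_id) for doc_id, text in head]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy stand-in for a language model: a tiny hand-written synonym table.
SYNONYMS = {"money": {"refund"}, "back": {"refund"}}

def toy_score(query: str, text: str) -> float:
    words = set(text.lower().replace(".", "").split())
    score = 0.0
    for q in query.lower().replace("?", "").split():
        if q in words or SYNONYMS.get(q, set()) & words:
            score += 1.0
    return score

candidates = [
    ("shipping", "Orders ship within two business days."),
    ("refund-policy", "Customers can request a refund within 30 days."),
]
order = rerank("How do I get my money back?", candidates, toy_score, top_k=2)
print(order)  # ['refund-policy', 'shipping']
```

The first stage only has to get the right document somewhere into the candidate set; the reranker is then responsible for moving it to the top.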

Impact on Accuracy

Increased Precision: This method boosted our search accuracy to 82.70%, the largest single gain of any stage. The ability of semantic search to interpret the nuances of queries was a major factor in this improvement.

📈 Final Refinement: Incorporating Sample Questions

The last refinement involved adding sample questions to documents:

Enhancing Document Relevance

Creating Targeted Queries: Questions generated through human input and the Search Term Expansion step were added to documents.
Semantic Ranking Adjustment: We reconfigured semantic search to give more weight to these new questions.
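As a rough sketch of the first step, a chunk can carry its sample questions in a dedicated field next to its content. The field name `sample_questions` and the record layout are hypothetical, and how the semantic ranker was configured to weight that field is not shown here.

```python
def add_sample_questions(chunk: dict, questions: list) -> dict:
    """Return a copy of the chunk with a de-duplicated sample_questions field."""
    merged = list(chunk.get("sample_questions", []))
    for q in questions:
        q = q.strip()
        if q and q not in merged:
            merged.append(q)
    return {**chunk, "sample_questions": merged}

# Invented example record; field names are assumptions.
chunk = {
    "id": "refund-policy",
    "content": "Customers can request a refund within 30 days of purchase.",
}
enriched = add_sample_questions(
    chunk,
    ["How do I get my money back?", "Can I return my order? ", "How do I get my money back?"],
)
print(enriched["sample_questions"])
# ['How do I get my money back?', 'Can I return my order?']
```

Keeping the questions in their own field, rather than appending them to the content, leaves the original text untouched and lets the search configuration treat the two differently.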

Accuracy Improvement

Notable Accuracy Increase: This strategy further improved our accuracy to 90.27%. With sample questions attached, each document carries phrasing close to what a user might actually ask, which makes it easier to match.

📝 What didn't make the cut

In addition to the features that enhanced our search accuracy, we evaluated other functionalities but ultimately chose not to implement them. These include:

Hybrid Search, which combines vector, text, and semantic approaches. Although this approach boosted accuracy by a modest 0.5%, it came with considerable resource and latency drawbacks.
Search Term Expansion using GPT-4, which raised accuracy to 93%. However, its higher latency and costs led us to continue with GPT-3.5.
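Hybrid search needs a way to merge the ranked lists produced by its text and vector queries, and Reciprocal Rank Fusion (RRF), mentioned earlier, is a common choice for that. The sketch below shows the standard formula, where each list contributes 1 / (k + rank) to a document's score; the function name and sample rankings are illustrative, and k = 60 is a commonly used constant rather than a value taken from this article.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids; each list contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Invented rankings for illustration.
text_ranking = ["a", "b", "c"]
vector_ranking = ["c", "a", "d"]
fused = reciprocal_rank_fusion([text_ranking, vector_ranking])
print(fused)  # ['a', 'c', 'b', 'd']
```

Documents that appear near the top of both lists rise, but every query now requires an embedding call and a second search, which is where the extra resource use and latency come from.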

Final Thoughts: Striking the Right Balance

Our efforts took RAG search accuracy from 65% to 90% in three steps: search term expansion, semantic reranking, and sample questions.

[Figure: bar chart of search accuracy at each improvement stage]
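The accuracy figures reported in the sections above can be tabulated with a short script that also computes the gain contributed by each stage. The stage labels and the text bar rendering are choices made for this sketch.

```python
# Accuracy figures as reported in the sections above.
stages = [
    ("Basic text search", 65.41),
    ("Search term expansion", 70.81),
    ("Semantic reranking", 82.70),
    ("Sample questions", 90.27),
]

def stage_gains(stages):
    """Return (name, accuracy, gain over the previous stage) for each stage."""
    rows = []
    previous = None
    for name, acc in stages:
        gain = 0.0 if previous is None else round(acc - previous, 2)
        rows.append((name, acc, gain))
        previous = acc
    return rows

for name, acc, gain in stage_gains(stages):
    bar = "#" * round(acc / 5)  # one '#' per five percentage points
    print(f"{name:<24}{acc:6.2f}%  {gain:+6.2f}  {bar}")
```

The gains come to 5.40, 11.89, and 7.57 percentage points respectively, for a total of 24.86.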

By exploring various methods and weighing their trade-offs in accuracy, latency, and cost, we achieved a balance that significantly enhances the precision and efficiency of information retrieval. This marks a significant step forward for search technology and its application in the business world.


Consult with our experts at Amity Solutions for additional information on Improving RAG Search Accuracy here