Understanding Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) technology enhances language models by integrating retrieved documents with user queries, injecting additional contextual information. This combination not only enriches the model’s knowledge base but also improves the accuracy and relevance of its responses.
For instance, when users pose questions involving specific personal information or business data, RAG can extract key information from predefined data sources, helping the model provide more precise answers.
In simple terms, RAG equips a language model with an intelligent background knowledge base, enabling it to respond more accurately to complex questions.
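To make the flow concrete, here is a minimal sketch of the retrieve-then-augment pattern. The retrieve and build_prompt functions are hypothetical illustrations, not part of any particular library:
# Minimal RAG sketch: retrieve relevant documents, then fold them into the prompt.
# retrieve() and build_prompt() are hypothetical illustrations, not a library API.
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    # Naive keyword-overlap scoring; production systems use vector embeddings instead.
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    # The retrieved passages become extra context the language model can draw on.
    context = "\n".join(retrieve(query, documents))
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"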
GraphRAG: Enhancing Context Through Graph Structures
Building on traditional RAG, GraphRAG takes a further step by constructing graphical networks of information, allowing for deeper understanding and application of data. This approach goes beyond merely retrieving relevant documents; it enables insight into the complex relationships between entities within those documents.
For example, if a dataset contains information about cities like New York, Los Angeles, and Chicago, GraphRAG can not only retrieve relevant data about these cities but also identify and utilize various relationships among them, such as trade partnerships between New York and Los Angeles or cultural exchange projects between Los Angeles and Chicago.
This technology allows GraphRAG to exhibit richer insights and deeper analytical capabilities when processing information.
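As a toy illustration of that idea (not GraphRAG's actual internal representation), the city example can be viewed as a small graph of entities connected by typed relationships:
# Toy entity-relationship graph for the city example; purely illustrative,
# not GraphRAG's internal data structure.
relationships = [
    ("New York", "trade partnership", "Los Angeles"),
    ("Los Angeles", "cultural exchange project", "Chicago"),
]

def related_to(city: str) -> list[tuple[str, str]]:
    # Return every (relationship, other entity) pair that touches the given city.
    return [(rel, b if a == city else a) for a, rel, b in relationships if city in (a, b)]

print(related_to("Los Angeles"))
# [('trade partnership', 'New York'), ('cultural exchange project', 'Chicago')]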
Setting Up the Environment
Source Code Location:
You can find the source code at: GitHub Repository
Creating a Virtual Environment
To get started, create a virtual environment:
python -m venv why_graphRAG_such_game_changer
Activating the Environment
Activate the environment using the following commands:
- Windows:
why_graphRAG_such_game_changer\Scripts\activate
- Unix/macOS:
source why_graphRAG_such_game_changer/bin/activate
Installing Dependencies
Navigate to your project directory and install the necessary dependencies:
cd path/to/your/project
pip install -r requirements.txt
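The authoritative dependency list is the requirements.txt shipped with the repository; as an assumption about what a local GraphRAG-plus-Ollama setup like this one needs, it will include at least packages along these lines:
# Illustrative minimum only; use the requirements.txt from the repository.
graphrag
ollama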
Installing Ollama
Visit the Ollama official website to download the appropriate version for your operating system (Windows, macOS, or Linux) and follow the installation instructions provided.
Once installed, open your terminal or command prompt and verify the installation by running:
ollama --version
Pulling Necessary Models with Ollama
With Ollama installed, pull the two models used in this walkthrough: mistral for chat completions and nomic-embed-text for embeddings.
ollama pull mistral
ollama pull nomic-embed-text
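As an optional extra sanity check (assuming Ollama's default port), you can confirm from Python that the local API GraphRAG will talk to is reachable at http://localhost:11434:
# Optional sanity check: confirm the local Ollama server is reachable on its default port.
# Assumes Ollama is running on localhost:11434, the endpoint referenced later in settings.yaml.
import urllib.request

with urllib.request.urlopen("http://localhost:11434") as response:
    # Ollama responds with a short plain-text status message at its root URL.
    print(response.status, response.read().decode())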
Initializing and Using GraphRAG
To initialize GraphRAG, run the following command:
python -m graphrag.index --init --root ragtest
Downloading Data
You can download any data you prefer. A small, scenario-based, relationship-oriented storyline has already been checked in and placed in the “input” folder.
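If you would rather index your own data, drop plain-text files into that input folder. A minimal sketch, assuming the ragtest project root used in this walkthrough and an illustrative file name:
# Sketch: place your own plain-text data into the GraphRAG input folder.
# "ragtest" is the project root used here; "story.txt" and its contents are illustrative.
from pathlib import Path

input_dir = Path("ragtest") / "input"
input_dir.mkdir(parents=True, exist_ok=True)
(input_dir / "story.txt").write_text(
    "Lily lived in a small village, and her actions touched every villager...",
    encoding="utf-8",
)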
Initialization Process
When you run the command python -m graphrag.index --init --root ragtest, the following operations are performed automatically:
Creating Configuration Files
- .env File:
- Purpose: Stores environment variables, including API keys for OpenAI or Azure OpenAI endpoints.
- Usage: This file is referenced by the GraphRAG system to access the necessary credentials and configuration settings; a minimal example appears after this list.
- settings.yaml File:
- Purpose: Contains detailed configuration settings for the GraphRAG pipeline.
- Usage: This file allows users to customize the behavior of the GraphRAG pipeline, specifying which language model (LLM) to use and how to interact with it.
- prompts/ Directory:
- Purpose: Stores default prompts used by GraphRAG to guide the LLM in generating responses.
- Usage: Users can modify these prompts to improve the accuracy and relevance of generated responses. The directory also supports running automatic prompt adjustment commands to create new, data-specific prompts.
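Because everything here runs locally through Ollama, which does not validate API keys, the .env file only needs a placeholder value; this is an assumption about the local setup, not something GraphRAG requires in general:
# .env for a fully local run; the key is a placeholder because Ollama ignores it.
GRAPHRAG_API_KEY=ollama-local-placeholder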
Modifying settings.yaml
Since we plan to run GraphRAG locally, we need to make some changes to the default settings.yaml file:
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: mistral
  model_supports_json: true
  api_base: http://localhost:11434/v1

parallelization:
  stagger: 0.3

async_mode: threaded

embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/api
Modifying to Use Ollama Embeddings
GraphRAG does not support Ollama embedding models natively, but you can add that support by modifying one file in the GraphRAG library. By editing openai_embeddings_llm.py, we can route embedding requests to Ollama’s embedding model: we import the ollama module and update the execution code to call its embedding function, letting the GraphRAG pipeline use the locally served embeddings. Here is the updated openai_embeddings_llm.py:
from typing_extensions import Unpack

from graphrag.llm.base import BaseLLM
from graphrag.llm.types import EmbeddingInput, EmbeddingOutput, LLMInput

from .openai_configuration import OpenAIConfiguration
from .types import OpenAIClientTypes

import ollama


class OpenAIEmbeddingsLLM(BaseLLM[EmbeddingInput, EmbeddingOutput]):
    """Custom LLM implementation that uses Ollama for embeddings."""

    def __init__(self, client: OpenAIClientTypes, configuration: OpenAIConfiguration):
        """Initialize an OpenAIEmbeddingsLLM instance.

        Parameters:
            client (OpenAIClientTypes): Client used to interact with AI services.
            configuration (OpenAIConfiguration): Configuration settings for AI services.
        """
        self._client = client
        self._configuration = configuration

    async def _execute_llm(
        self, input: EmbeddingInput, **kwargs: Unpack[LLMInput]
    ) -> EmbeddingOutput | None:
        """Execute the embedding call using Ollama.

        Parameters:
            input (EmbeddingInput): Input texts for the embedding model.
            **kwargs (Unpack[LLMInput]): Additional keyword parameters for model settings.

        Returns:
            EmbeddingOutput | None: Output embeddings from the model, or None if no output.
        """
        args = {
            "model": self._configuration.model,
            **(kwargs.get("model_parameters") or {}),
        }
        embedding_list = []
        for inp in input:
            embedding = ollama.embeddings(model="nomic-embed-text", prompt=inp)
            embedding_list.append(embedding["embedding"])
        return embedding_list
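Before re-running the indexer, you can optionally verify the Ollama embedding call used above in isolation (assuming the ollama Python package is installed and nomic-embed-text has been pulled):
# Standalone check of the same Ollama embedding call used in the patched file.
import ollama

result = ollama.embeddings(model="nomic-embed-text", prompt="GraphRAG test sentence")
print(len(result["embedding"]))  # embedding dimension; 768 for nomic-embed-text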
Running Queries with GraphRAG
To index data, use the following command:
python -m graphrag.index --root ./ragtest
Output
You will receive output confirming the indexing process.
To test your setup, run a query with:
python -m graphrag.query --root ./ragtest --method global "What was the impact of Lily's actions on the villagers?"
Output
This will display the results of your query.
Conclusion
In the future, the influence of artificial intelligence in the business sector will continue to grow. Companies that can effectively integrate AI technologies into their operations will experience faster innovation, enhanced customer service, and more efficient process optimization. The emergence of GraphRAG technology is just the tip of the iceberg regarding AI potential.
As time progresses, AI will become increasingly sophisticated. It will not only understand questions but also generate responses that closely mirror human thought processes. In the near future, businesses will witness the birth of a new type of AI system—one that can not only answer questions but also provide strategic insights and recommendations based on extensive data analysis. Therefore, for any company looking to maintain a competitive edge, actively embracing AI and staying informed about its latest developments is not just a choice but an imperative trend.