NashTech Blog

Meet AgenticRAG: Indexing Knowledge using Akka Workflows

Welcome back to our journey to AgenticRAG! The previous article, Meet AgenticRAG: A Smart Interface to Akka Agentic AI – Part-1, provided a step-by-step guide to building a Retrieval-Augmented Generation (RAG) application using a very simple agent that streams responses from an LLM.

However, for a RAG application to work well, documents must be indexed, because the indexed documents are what the AI Agent(s) draw on when responding to User queries. Akka Workflows simplify indexing documents in a responsive and scalable manner. This article walks through indexing documents in MongoDB Atlas using Akka Workflows.

Knowledge Indexing is a 3-step process:

  1. Adding Indexing Workflow: RagIndexingWorkflow
  2. Injecting a MongoDB Client
  3. Exposing the Workflow via Endpoint(s)

Step 1: Adding RagIndexingWorkflow

The core logic of RagIndexingWorkflow has 3 major components:

1. State

The workflow needs to maintain state: a list of files that need to be processed (toProcess) and a list of files that have already been processed (processed).
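As a rough illustration of that state, here is a minimal plain-Java sketch. Only the field names toProcess and processed come from the article; the class name IndexingStateSketch and the helper methods (of, hasFilesToProcess, head, headProcessed) are assumptions made for this example and may differ from the real workflow.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexingStateSketch {

    /** Immutable state: files still queued and files already indexed. */
    public record State(List<String> toProcess, List<String> processed) {

        public State {
            // Defensive copies keep the record immutable.
            toProcess = List.copyOf(toProcess);
            processed = List.copyOf(processed);
        }

        /** Hypothetical factory: a fresh state with nothing processed yet. */
        public static State of(List<String> files) {
            return new State(files, List.of());
        }

        public boolean hasFilesToProcess() {
            return !toProcess.isEmpty();
        }

        /** Next file in the queue; callers check hasFilesToProcess() first. */
        public String head() {
            return toProcess.get(0);
        }

        /** Moves the head of the queue to the processed list and returns the new state. */
        public State headProcessed() {
            List<String> done = new ArrayList<>(processed);
            done.add(toProcess.get(0));
            return new State(toProcess.subList(1, toProcess.size()), done);
        }
    }

    public static void main(String[] args) {
        State state = State.of(List.of("intro.md", "setup.md"));
        state = state.headProcessed();
        System.out.println(state);
    }
}
```

Keeping the state immutable means every step returns a new value, which is easy to persist and to reason about when the workflow is resumed.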

2. Processing Files

We treat the list of files as a queue, so a processingFileStep (StepEffect) is required that reads documents one by one, indexes them, and adds them to MongoDB Atlas as segments.

3. Termination

Finally, the workflow needs to pause and resume later when new documents arrive. An interesting aspect of this workflow is that it never ends: if it runs out of files to process, it simply pauses itself.
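The three components above can be sketched together in plain Java, without the Akka SDK. This is only a model of the control flow: the step takes the head of the queue, hands it to an indexer, and repeats until the queue is empty, at which point it pauses instead of ending. The names Next, StepResult, runUntilPaused and addFiles are hypothetical, and the indexer is a simple callback standing in for the real embedding and MongoDB logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class IndexingLoopSketch {

    public record State(List<String> toProcess, List<String> processed) {}

    /** Outcome of one step: run the step again, or pause until new files arrive. */
    public enum Next { CONTINUE, PAUSE }

    public record StepResult(State state, Next next) {}

    /** One step: index the file at the head of the queue, or pause if the queue is empty. */
    public static StepResult processingFileStep(State state, Consumer<String> indexer) {
        if (state.toProcess().isEmpty()) {
            return new StepResult(state, Next.PAUSE);
        }
        String file = state.toProcess().get(0);
        indexer.accept(file); // stand-in for splitting, embedding and storing segments
        List<String> done = new ArrayList<>(state.processed());
        done.add(file);
        List<String> rest = state.toProcess().subList(1, state.toProcess().size());
        return new StepResult(new State(List.copyOf(rest), List.copyOf(done)), Next.CONTINUE);
    }

    /** Repeats the step until it asks to pause, then returns the paused state. */
    public static State runUntilPaused(State start, Consumer<String> indexer) {
        StepResult result = new StepResult(start, Next.CONTINUE);
        while (result.next() == Next.CONTINUE) {
            result = processingFileStep(result.state(), indexer);
        }
        return result.state();
    }

    /** Resuming: new files are appended to the queue of a paused state. */
    public static State addFiles(State paused, List<String> newFiles) {
        List<String> queue = new ArrayList<>(paused.toProcess());
        queue.addAll(newFiles);
        return new State(List.copyOf(queue), paused.processed());
    }

    public static void main(String[] args) {
        State paused = runUntilPaused(
            new State(List.of("intro.md", "setup.md"), List.of()),
            file -> System.out.println("indexing " + file));
        // The paused state keeps its history, so resuming only handles the new file.
        State resumed = runUntilPaused(addFiles(paused, List.of("faq.md")),
            file -> System.out.println("indexing " + file));
        System.out.println(resumed.processed());
    }
}
```

In the real workflow the runtime drives the transitions between steps; the while loop here only imitates that behaviour so the sketch can run on its own.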

Step 2: Injecting a MongoDB Client

The next step is to inject an embeddingStore field into the workflow. This field is of type MongoDbEmbeddingStore, and to create an instance of it we need to inject a MongoClient (MongoDB) into the workflow’s constructor.

To make the MongoClient instance available, we can use a bootstrap class that uses Akka’s @Setup annotation.
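The idea of creating the client once at startup and handing it to the workflow through its constructor can be modelled in plain Java. Everything in the sketch below is a hypothetical stand-in: FakeMongoClient is not the MongoDB driver class, Dependencies is not an Akka SDK interface, and the connection string is a placeholder. It only shows the shape of the pattern.

```java
import java.util.Map;

public class BootstrapSketch {

    /** Hypothetical stand-in for the MongoDB client; holds only the connection URI. */
    public record FakeMongoClient(String connectionUri) {}

    /** Hypothetical lookup of shared instances by type. */
    public interface Dependencies {
        <T> T get(Class<T> type);
    }

    /** Builds the shared client once, at startup, and exposes it by type. */
    public static Dependencies bootstrap(String mongoUri) {
        FakeMongoClient client = new FakeMongoClient(mongoUri);
        Map<Class<?>, Object> instances = Map.of(FakeMongoClient.class, client);
        return new Dependencies() {
            @Override
            public <T> T get(Class<T> type) {
                Object instance = instances.get(type);
                if (instance == null) {
                    throw new IllegalArgumentException("No dependency registered for " + type.getName());
                }
                return type.cast(instance);
            }
        };
    }

    /** Stand-in workflow that receives the client through its constructor. */
    public static class WorkflowSketch {
        private final FakeMongoClient client;

        public WorkflowSketch(FakeMongoClient client) {
            this.client = client;
        }

        public String storeTarget() {
            return client.connectionUri();
        }
    }

    public static void main(String[] args) {
        Dependencies deps = bootstrap("mongodb://localhost:27017"); // placeholder URI
        WorkflowSketch workflow = new WorkflowSketch(deps.get(FakeMongoClient.class));
        System.out.println(workflow.storeTarget());
    }
}
```

Because the client is created in one place, every workflow instance shares the same connection instead of opening its own.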

Step 3: Exposing the Workflow via Endpoint(s)

Akka offers HTTP endpoint(s) to control indexing, i.e., start/abort indexing.

Since both endpoints return an HTTP 202 (Accepted) status code, callers do not have to wait for indexing to finish, which makes external orchestration easy.
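To make the 202 behaviour concrete, here is a small sketch that uses only the JDK's built-in HTTP server and client rather than Akka's endpoint API. The paths /api/index/start and /api/index/abort, the INDEXING flag, and the helper names are assumptions for illustration; the real endpoint would hand the request to the workflow instead of flipping a flag.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.atomic.AtomicBoolean;

public class IndexingEndpointSketch {

    /** Flag standing in for "the indexing workflow is running". */
    public static final AtomicBoolean INDEXING = new AtomicBoolean(false);

    /** Starts a local server with hypothetical start/abort paths; port 0 picks a free port. */
    public static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
            server.createContext("/api/index/start", exchange -> accept(exchange, true));
            server.createContext("/api/index/abort", exchange -> accept(exchange, false));
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Records the request and answers 202 right away, without waiting for any work. */
    private static void accept(HttpExchange exchange, boolean running) throws IOException {
        if (!"POST".equals(exchange.getRequestMethod())) {
            exchange.sendResponseHeaders(405, -1); // only POST is accepted
        } else {
            INDEXING.set(running);
            exchange.sendResponseHeaders(202, -1); // -1 means no response body
        }
        exchange.close();
    }

    /** Sends an empty POST and returns the HTTP status code. */
    public static int post(int port, String path) {
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create("http://127.0.0.1:" + port + path))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
            return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = start(0);
        int port = server.getAddress().getPort();
        System.out.println("start -> " + post(port, "/api/index/start"));
        System.out.println("abort -> " + post(port, "/api/index/abort"));
        server.stop(0);
    }
}
```

An orchestrator can therefore fire the request, get an immediate acknowledgement, and check progress separately, for example through the application logs.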

Time to Index Documents!

1. Start the service locally

2. Trigger Indexing

The indexing of all *.md documents can be observed in the application logs. Once indexing completes, MongoDB will contain the embeddings, ready for semantic search.

Next Steps

Now we know how to index documents in a vector DB (MongoDB Atlas). With indexing in place, the next step is to build an actual Akka AI Agent that wires the embeddings into a RAG search query, and finally to create Endpoint(s)/UI for interactive User querying. Stay tuned to learn more 🙂

Himanshu Gupta

Himanshu Gupta is a Principal Architect passionate about building scalable systems, AI‑driven solutions, and high‑impact digital platforms. He enjoys exploring emerging technologies, writing technical articles, and creating accelerators that help teams move faster. Outside of work, he focuses on continuous learning and sharing knowledge with the tech community.