A production-ready template for building RAG (Retrieval-Augmented Generation) applications using Swarms and LlamaIndex. This template provides everything you need to get started with document-based AI interactions.
- 📚 Built-in RAG implementation with LlamaIndex
- 🤖 Advanced AI agent for healthcare data summarization
- 🔄 Interactive chat interface
- 📊 Sample data generator for testing
- 🛠️ Easy-to-customize architecture
- Clone the repository
git clone https://github.com/The-Swarm-Corporation/Swarms-LlamaIndex-RAG-Template.git
cd Swarms-LlamaIndex-RAG-Template
- Install dependencies
pip install -r requirements.txt
- Set up your environment
# Create a .env file and add your API keys and configurations
touch .env
# export OPENAI_API_KEY=<your-openai-api-key>
# export GROQ_API_KEY=<your-groq-api-key>
# export WORKSPACE_DIR=<your-workspace-directory>
- Generate sample data or add your own
# To generate sample data:
python -m llamaindex_rag.fake_data_generator
# OR add your own documents to the 'docs' folder
- Run the application
python main.py
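Before running `main.py`, it can help to confirm that the keys from the setup step are actually visible to Python. The following is a minimal sketch using only the standard library. The variable names come from the setup step above, but treating all three as required is an assumption of this example, and `missing_env_vars` is a hypothetical helper, not part of the template.

```python
import os

# Names taken from the setup step; treating all three as required is an
# assumption of this sketch, not something the template enforces.
REQUIRED_VARS = ("OPENAI_API_KEY", "GROQ_API_KEY", "WORKSPACE_DIR")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```

If you only use one provider, trim `REQUIRED_VARS` to the keys you actually need.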
- The `healthcare_summarizer` agent is a pre-built agent that summarizes medical data.
- You can customize the prompt to fit your needs.
- The `all_cores=True` argument is optional; it allows the agent to use all available cores on your machine for faster processing.
- The agent first queries the LlamaIndexDB for relevant documents, then uses the LLM to summarize the data.
from llamaindex_rag.agent import healthcare_summarizer

# Example usage
print(
    healthcare_summarizer.run(
        """
        What is the medical history of patient 1? Create a report with the following format:
        - Chief Complaint
        - Vitals
        - Assessment
        - Medications
        - Plan
        """,
        all_cores=True,
    )
)
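The retrieve-then-summarize flow described above can be sketched independently of Swarms and LlamaIndex. This is a conceptual illustration only: `answer_with_retrieval`, `fake_retrieve`, and `fake_generate` are hypothetical names, the prompt layout is an assumption, and the agent's real internals may differ.

```python
from typing import Callable


def answer_with_retrieval(
    question: str,
    retrieve: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Fetch context for the question, then ask the model to answer with it."""
    context = retrieve(question)
    # Assumed prompt layout: retrieved context first, then the task.
    prompt = f"Context:\n{context}\n\nTask:\n{question}"
    return generate(prompt)


# Stand-ins so the sketch runs without API keys or a document index.
def fake_retrieve(query: str) -> str:
    return "placeholder document text for: " + query


def fake_generate(prompt: str) -> str:
    return f"[model output for a prompt of {len(prompt)} characters]"


result = answer_with_retrieval(
    "What is the medical history of patient 1?",
    fake_retrieve,
    fake_generate,
)
print(result)
```

In the template, the retrieval role is played by the LlamaIndexDB and the generation role by the configured LLM.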
- You can create your own agent with RAG by using the `Agent` class.
- The `system_prompt` argument is the prompt that the agent will use to summarize the data.
- The `llm` argument is the LLM that the agent will use to summarize the data.
- The `max_loops` argument is the number of loops the agent will run.
- The `long_term_memory` argument is the LlamaIndexDB instance that the agent will use to query the vector database. Any RAG system can be used instead, as long as it is a class with a `query` method that takes a query string and returns a response string.
from swarms import Agent

# HEALTHCARE_SUMMARY_PROMPT, model, and database are assumed to be defined
# earlier (the system prompt, the LLM instance, and the LlamaIndexDB instance).

# Initialize the healthcare summarization agent
healthcare_summarizer = Agent(
    agent_name="Healthcare-Data-Summarizer",
    system_prompt=HEALTHCARE_SUMMARY_PROMPT,
    llm=model,
    max_loops=2,
    autosave=True,
    dashboard=False,
    verbose=True,
    dynamic_temperature_enabled=False,
    saved_state_path="healthcare_summarizer.json",
    user_name="Human:",
    retry_attempts=1,
    context_length=250000,
    return_step_meta=True,
    output_type="json",
    streaming_on=False,
    long_term_memory=database,
    auto_generate_prompt=False,
    # output_file="healthcare_report.md",
    # state_save_file_type="json",
    rag_every_loop=True,
    interactive=True,
)
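The `long_term_memory` contract described above (a class with a `query` method that takes a string and returns a string) is small enough to illustrate with a toy class. The sketch below is not a vector database: `KeywordMemory` is a hypothetical stand-in that matches on shared words, and the sample texts are placeholders.

```python
class KeywordMemory:
    """Toy long-term memory: returns stored texts that share a word with the query."""

    def __init__(self, documents):
        self.documents = list(documents)

    def query(self, query: str) -> str:
        query_words = set(query.lower().split())
        # Keep every document that has at least one word in common with the query.
        matches = [
            doc for doc in self.documents
            if query_words & set(doc.lower().split())
        ]
        return "\n".join(matches) if matches else "No relevant documents found."


memory = KeywordMemory([
    "patient 1 placeholder note",
    "unrelated placeholder text",
])
print(memory.query("patient history"))  # prints: patient 1 placeholder note
```

Based on the description above, an object like this could be passed as `long_term_memory=memory`. Whether `Agent` calls `query` with additional keyword arguments is not covered here, so treat that as something to verify against the Swarms version you have installed.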
- LlamaIndexDB is a class that allows you to query a vector database of documents.
- Add your documents to the `docs` folder or use the fake data generator to populate the database.
- The `data_dir` argument is the directory containing your documents.
- `similarity_top_k` is the number of similar documents to retrieve; a higher value can make the response more relevant and longer.
from llamaindex_rag.memory import LlamaIndexDB

# Example usage
llama_index_db = LlamaIndexDB(
    data_dir="docs",
    filename_as_id=True,
    recursive=True,
    required_exts=[".txt", ".pdf", ".docx"],
    similarity_top_k=10
)
llama_index_db.query(
    "What is the medical history of patient 1?",
    streaming=True,
    response_mode="compact"
)
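To make the effect of `similarity_top_k` more concrete, here is a conceptual sketch of top-k retrieval by cosine similarity in plain Python. It does not show how LlamaIndex implements retrieval. The functions `cosine_similarity` and `top_k` are hypothetical, and the 2-D vectors are hand-made stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hand-made 2-D vectors; a real index would use embedding vectors.
docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
print([doc_id for doc_id, _ in top_k([1.0, 0.0], docs, k=2)])  # prints: ['a', 'b']
```

A larger `k` passes more documents to the LLM, which is why the response can become longer as well as better grounded.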
.
├── llamaindex_rag/
│ ├── agent.py # Main RAG agent implementation
│ ├── memory.py # LlamaIndex database integration
│ └── fake_data_generator.py # Sample data generator
├── docs/ # Document storage directory
├── requirements.txt # Project dependencies
└── main.py # Application entry point
- Swarms Framework - AI agent framework
- LlamaIndex - Data framework for LLM applications
- Python 3.10+
- GROQ/OpenAI API
- Add Your Documents
  - Place your documents in the `docs` folder, or
  - Use the fake data generator for testing
- Interact with the Agent
  - Run `main.py` to start the interactive session
  - Ask questions about your documents
  - Get AI-powered insights and summaries
Contributions are welcome! Please feel free to submit a Pull Request.
- Swarms Framework
- Python 3.10+
- GROQ API key (or switch to any model from Swarm Models)
Questions? Reach out:
- Twitter: @kyegomez
- Email: kye@swarms.world
Book a call here for real-time assistance:
⭐ Star us on GitHub if this project helped you!
Built with ♥ using Swarms Framework