ragutils Cog

Advanced RAG Features

  • Sentence-aware chunking (12-120 words) keeps context tight for long knowledge docs; toggle with [p]ragutils enable chunking
  • Cross-encoder reranking (default cross-encoder/ms-marco-MiniLM-L-6-v2, threshold 0.33) re-scores the results of cosine-similarity retrieval for better relevance; enable via [p]ragutils enable reranking
  • MMR deduplication (lambda 0.5) diversifies results and reduces repeats; toggle with [p]ragutils enable mmr
  • Keyword/numeric boosts lift answers containing numbers, dates, or workflow terms; works automatically when RAG is enabled
  • Backends: ChromaDB by default or optional Qdrant ([p]ragutils backend qdrant) for larger deployments
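To make the sentence-aware chunking idea concrete, here is a minimal sketch, not the cog's actual implementation. It assumes sentences end in ".", "!" or "?" and reads the 12-120 word range as a soft minimum and maximum per chunk; the function name `chunk_sentences` and its parameters are hypothetical.

```python
import re


def chunk_sentences(text, min_words=12, max_words=120):
    """Group whole sentences into chunks of roughly min_words..max_words words.

    Hypothetical helper for illustration; the real cog may split differently.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if this sentence would push it past the cap.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        # A single sentence longer than max_words is kept whole, never cut.
        current.append(sentence)
        count += n
    if current:
        tail = " ".join(current)
        if chunks and count < min_words:
            # A too-short tail is merged into the previous chunk,
            # which may then slightly exceed max_words.
            chunks[-1] = chunks[-1] + " " + tail
        else:
            chunks.append(tail)
    return chunks
```

Sentences are never split, so a chunk always ends on a sentence boundary; the trade-off is that the word limits are approximate rather than strict.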
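The MMR step listed above can be sketched as follows. This is the textbook Maximal Marginal Relevance selection with lambda 0.5, written in plain Python for illustration; it assumes cosine similarity for both relevance and redundancy, and the names `cosine` and `mmr_select` are hypothetical rather than taken from the cog.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query_vec, doc_vecs, k, lam=0.5):
    """Return indices of up to k documents chosen by MMR.

    lam=1.0 ranks purely by relevance; lower values penalise
    documents that resemble ones already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Two near-duplicate vectors (indices 0 and 1) and one distinct vector (index 2).
docs = [[1.0, 0.4], [1.0, 0.45], [0.3, 1.0]]
picked = mmr_select([1.0, 0.5], docs, k=2, lam=0.5)
```

With lambda 0.5 the second pick skips the near-duplicate in favour of the distinct vector, which is the "reduces repeats" behaviour the feature list describes.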

Quickstart

  • [p]ragutils settings to view status
  • [p]ragutils test "<query>" to preview retrieval with the current pipeline
  • [p]ragutils threshold 0.33 to tune the rerank cutoff
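To illustrate what tuning the rerank cutoff changes, here is a small sketch. It assumes reranker scores sit on a 0 to 1 scale and that results scoring at or above the threshold are kept; both are assumptions for illustration, and `apply_rerank_threshold` is a hypothetical name, not a function from the cog.

```python
def apply_rerank_threshold(scored, threshold=0.33):
    """Keep (text, score) pairs at or above the threshold, best score first.

    Hypothetical helper; the cog's own comparison and score scale may differ.
    """
    kept = [(text, score) for text, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


candidates = [("a", 0.9), ("b", 0.2), ("c", 0.33), ("d", 0.5)]
filtered = apply_rerank_threshold(candidates, threshold=0.33)
```

Raising the threshold returns fewer but more confident results; lowering it admits weaker matches. Previewing with [p]ragutils test "<query>" after each change shows the effect on real queries.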

Qdrant Migration

  1. Install qdrant-client
  2. Ensure Qdrant is reachable at http://localhost:6333 or set your URL with [p]ragutils backend qdrant
  3. Run [p]ragutils backend qdrant to sync existing embeddings
  4. Switch back anytime with [p]ragutils backend chromadb
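Before switching backends, it can help to confirm that Qdrant is actually reachable. The sketch below uses only the Python standard library and assumes that an HTTP GET on the server's base URL returns status 200 when Qdrant is up; the helper name `qdrant_reachable` is hypothetical and is not part of the cog or of qdrant-client.

```python
import urllib.error
import urllib.request


def qdrant_reachable(base_url="http://localhost:6333", timeout=2.0):
    """Return True if an HTTP GET on base_url answers with status 200.

    Assumption: a running Qdrant instance responds on its base URL.
    Any connection error, timeout, or malformed URL counts as unreachable.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    print("Qdrant reachable:", qdrant_reachable())
```

If this prints False, check that the Qdrant service is running and that the URL and port match your deployment before running the migration steps.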

Performance Snapshot

| Mode | Strength | Notes |
|------|----------|-------|
| ChromaDB (default) | Fast local similarity | Lowest overhead |
| Qdrant | Scales to larger corpora | Network hop; optional |
| RAG enhanced (rerank+MMR) | Higher relevance + diversity | Uses cross-encoder + extra compute |