How Reranking Works
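- Vector Search - Retrieve the top 20 results
- LLM Evaluation - An LLM scores each result for relevance to the query
- Reorder - Results are reordered by LLM score
- Return - The highest-scoring results are returned, which are typically more accurate than the raw vector search order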
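The four steps above can be sketched in a few lines of Python. This is only an illustration of the score-and-reorder idea, not the product's implementation: `rerank` and `keyword_overlap` are hypothetical names, and the keyword scorer is a cheap stand-in for the LLM relevance call so that the example runs without network access.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score each candidate for relevance and return the top_k by score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # list.sort() is stable, so ties keep their original vector-search order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def keyword_overlap(query: str, doc: str) -> float:
    """Stand-in scorer (assumption): fraction of query words found in the doc.

    In a real pipeline this would be replaced by a call to the LLM.
    """
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


# Pretend these came back from vector search, in similarity order.
candidates = [
    "Shipping times for international orders",
    "How to reset your password",
    "Password reset link expired",
]
top = rerank("reset password", candidates, keyword_overlap, top_k=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Passing the scorer in as a function keeps the reorder logic independent of which model does the scoring, so swapping models only changes `score_fn`.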
When to Use Reranking
Use Reranking
- Complex queries
- Need high accuracy
- Willing to pay for quality
Skip Reranking
- Simple queries
- Speed critical
- Cost-conscious
Enabling Reranking
- Knowledge > Select KB
- Settings > Search Configuration
- Toggle Enable Reranking
- Click Save
Reranking Models
Available models:

| Model | Quality |
|---|---|
| gpt-4o-mini | Good |
| gpt-4o | Better |
| gpt-5-nano | Best |
Citation Tracking
With reranking, you can track which documents contributed to an answer.
Performance
Reranking adds latency:
- Vector search: 100-200ms
- Reranking: 500-1000ms
- Total: roughly 600-1200ms (about 1 second)
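As a quick check on the figures above, the per-stage ranges can be summed to see whether a given response-time budget is realistic. The helper names and the budget values below are assumptions for illustration only.

```python
# Per-stage latency ranges in milliseconds, taken from the list above.
STAGE_LATENCY_MS = {
    "vector_search": (100, 200),
    "reranking": (500, 1000),
}


def total_latency_ms(stages):
    """Return the (best case, worst case) total across all stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


def fits_budget(stages, budget_ms):
    """True only if even the worst case stays within the budget."""
    _, high = total_latency_ms(stages)
    return high <= budget_ms


total = total_latency_ms(STAGE_LATENCY_MS)
print(total)                                # (600, 1200)
print(fits_budget(STAGE_LATENCY_MS, 1000))  # False: worst case is 1200ms
```

The worst case, not the average, is what matters for a speed-critical path, which is why a 1-second budget does not comfortably fit reranking.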
Disabling Reranking
If reranking is not needed or is too expensive:
- Settings > Search Configuration
- Toggle Disable Reranking
- Search goes back to vector search only
Best Practices
Use When:
- Quality matters more than speed
- Complex domain knowledge
- Legal/compliance content
- Customer-facing answers
Skip When:
- Speed critical
- Simple lookups
- Internal use
- Budget constrained
Evaluation
Improve reranking quality:
- Test different models
- Monitor citation accuracy
- Adjust threshold if needed
- A/B test with and without reranking
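One simple way to run the with/without comparison is to label a small set of queries with their known-relevant documents and compare precision at k for both configurations. The sketch below assumes you can export ranked document IDs per query; the query IDs, document IDs, and labels are made up for illustration.

```python
from typing import Dict, List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are labelled relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant) / len(top)


def mean_precision(
    runs: Dict[str, List[str]], labels: Dict[str, Set[str]], k: int
) -> float:
    """Average precision@k over every labelled query."""
    scores = [precision_at_k(runs[q], labels[q], k) for q in labels]
    return sum(scores) / len(scores)


# Hypothetical relevance labels and ranked results for two test queries.
labels = {"q1": {"d1", "d2"}, "q2": {"d5"}}
vector_only = {"q1": ["d3", "d1", "d4"], "q2": ["d6", "d7", "d5"]}
with_rerank = {"q1": ["d1", "d2", "d3"], "q2": ["d5", "d6", "d7"]}

baseline = mean_precision(vector_only, labels, k=2)
reranked = mean_precision(with_rerank, labels, k=2)
print(f"vector only: {baseline:.2f}, with reranking: {reranked:.2f}")
```

Running the same labelled set against each reranking model gives a like-for-like basis for the "test different models" step as well.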