Reranking uses an LLM to re-score and reorder vector search results, improving retrieval accuracy.

How Reranking Works

  1. Vector Search - Retrieve the top 20 candidate results
  2. LLM Evaluation - An LLM scores each candidate for relevance to the query
  3. Reorder - Results are reordered by LLM score
  4. Return - The top results are returned, now more accurately ranked
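The four steps above can be sketched as a small pipeline. `vector_search` and `llm_relevance_score` are hypothetical stand-ins for your vector store and LLM scoring call; here they are stubbed (keyword overlap instead of an LLM) so the flow is runnable:

```python
def vector_search(query, k=20):
    # Stub: a real implementation would query the vector store.
    corpus = [
        "Refunds are issued within 14 days of purchase.",
        "Our office is closed on public holidays.",
        "Refund requests must include the order number.",
    ]
    return corpus[:k]

def llm_relevance_score(query, document):
    # Stub: a real implementation would prompt an LLM to rate
    # relevance (e.g. 0-100). Here we use simple keyword overlap.
    q_words = set(query.lower().split())
    d_words = set(document.lower().split())
    return len(q_words & d_words)

def rerank(query, k=20, top_n=5):
    candidates = vector_search(query, k=k)           # 1. vector search
    scored = [(llm_relevance_score(query, d), d)     # 2. LLM evaluation
              for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # 3. reorder
    return [d for _, d in scored[:top_n]]            # 4. return top results

results = rerank("refund policy order number")
```

In a real deployment, only the scoring function changes: the candidate list from vector search is passed to the LLM, and the sort and truncation logic stays the same.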

When to Use Reranking

Use Reranking

  • Complex queries
  • Need high accuracy
  • Willing to pay for quality

Skip Reranking

  • Simple queries
  • Speed critical
  • Cost-conscious

Enabling Reranking

  1. Knowledge > Select KB
  2. Settings > Search Configuration
  3. Toggle Enable Reranking
  4. Click Save

Reranking Models

Available models:

  Model         Quality
  gpt-4o-mini   Good
  gpt-4o        Better
  gpt-5-nano    Best

Citation Tracking

With reranking enabled, you can track which documents contributed to each answer:

  Answer: "Our refund policy is..."
  Source: Page 3, Customer Policy.pdf (89% confidence)

The agent can cite these sources in its response.
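A sketch of how a reranked result might carry citation metadata for the output shown above. The field names (`source`, `page`, `confidence`) are illustrative assumptions, not the product's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    source: str        # document the passage came from
    page: int          # page within that document
    confidence: float  # reranker relevance score, 0.0-1.0

def format_citation(c: Citation) -> str:
    # Renders the confidence as a whole-number percentage.
    return f"Page {c.page}, {c.source} ({c.confidence:.0%} confidence)"

cite = Citation(source="Customer Policy.pdf", page=3, confidence=0.89)
print(format_citation(cite))  # Page 3, Customer Policy.pdf (89% confidence)
```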

Performance

Reranking adds latency to each query:
  • Vector search: 100-200ms
  • Reranking: 500-1000ms
  • Total: ~1 second
The trade-off is accuracy versus speed.
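To see how the latency splits in your own deployment, you can time each stage separately. This is a minimal sketch; the stage functions are hypothetical stubs standing in for the real vector search and rerank calls:

```python
import time

def timed(stage_fn, *args):
    """Run a pipeline stage and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = stage_fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

# Hypothetical stubs standing in for the real stage calls.
def vector_search(query):
    return ["doc-a", "doc-b"]

def rerank(query, candidates):
    return list(reversed(candidates))

query = "refund policy"
candidates, search_ms = timed(vector_search, query)
ranked, rerank_ms = timed(rerank, query, candidates)
total_ms = search_ms + rerank_ms
```

If `total_ms` regularly exceeds your latency budget, that is a signal to disable reranking for that knowledge base or reserve it for complex queries.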

Disabling Reranking

If reranking is not needed or is too expensive:
  1. Settings > Search Configuration
  2. Toggle Enable Reranking off
  3. Search falls back to vector search only

Best Practices

Use When:

  • Quality matters more than speed
  • Complex domain knowledge
  • Legal/compliance content
  • Customer-facing answers

Skip When:

  • Speed critical
  • Simple lookups
  • Internal use
  • Budget constrained

Evaluation

Improve reranking quality:
  1. Test different models
  2. Monitor citation accuracy
  3. Adjust threshold if needed
  4. A/B test with/without
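Step 4 above can be sketched as a simple A/B comparison. `run_query` and the success rates are hypothetical (a stub that pretends reranking lifts answer quality); in practice you would collect relevance judgments on real queries:

```python
import random

def run_query(query, reranking):
    # Stub: simulate whether the answer was judged relevant.
    # Assumed success rates (0.9 vs 0.7) are illustrative only.
    return random.random() < (0.9 if reranking else 0.7)

def ab_test(queries, trials=1000, seed=42):
    random.seed(seed)  # fixed seed so the comparison is repeatable
    hits = {"rerank": 0, "baseline": 0}
    for _ in range(trials):
        q = random.choice(queries)
        if run_query(q, reranking=True):
            hits["rerank"] += 1
        if run_query(q, reranking=False):
            hits["baseline"] += 1
    # Return the fraction of relevant answers under each condition.
    return {k: v / trials for k, v in hits.items()}

rates = ab_test(["refund policy", "shipping times"])
```

Comparing the two rates against the extra latency and cost tells you whether reranking is worth enabling for that knowledge base.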

Next Steps