Reranking & Filtering

Reranking improves the ordering of retrieval results by re-scoring how relevant each result is to the query.

🎯 Purpose

  • Improve ranking: Raise the quality of the result ordering
  • Query relevance: Match the intent behind the query more closely
  • Diversity: Reduce redundancy and increase variety
  • Precision: Move the best results to the top

🔄 Process

🛠️ Techniques

Cross-Encoder Reranking

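  • Pairwise scoring: Scores the query together with each document, one pair at a time
  • High accuracy: Typically more accurate than a bi-encoder, because query and document are encoded jointly
  • Computational cost: Expensive for large candidate sets, since every query-document pair needs its own forward pass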
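
The scoring pattern above can be sketched as follows. This is a minimal illustration, not a production reranker: `rerank` and `overlap_score` are hypothetical names, and `overlap_score` is a toy term-overlap function standing in for a real cross-encoder, which would take the same (query, document) pair and return a learned relevance score.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query terms found in doc."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & d_terms) / len(q_terms)


def rerank(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair and sort by score, highest first."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # sorted() is stable, so tied documents keep their retrieval order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "contract law basics",
    "termination of a lease contract",
    "weather forecast for tomorrow",
]
ranked = rerank("lease contract termination", candidates)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```

Because `score_fn` is called once per document, the cost grows linearly with the number of candidates, which is why this step is usually applied to a short list only.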

Learning to Rank

  • Feature engineering: Query-doc features
  • ML models: Gradient boosting, neural networks
  • Training data: Human-labeled relevance
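
As a rough sketch of the idea, the example below learns a linear scoring function from labeled preference pairs. It is a perceptron-style pairwise ranker, a deliberately simpler stand-in for the gradient boosting or neural models listed above. The function names, the two features (`retrieval_score`, `title_match`), and the training pairs are all made up for illustration.

```python
from typing import List, Tuple

Features = List[float]  # assumed layout: [retrieval_score, title_match]


def score(weights: List[float], features: Features) -> float:
    """Linear relevance score: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(
    pairs: List[Tuple[Features, Features]],
    epochs: int = 20,
    lr: float = 0.1,
) -> List[float]:
    """Each pair is (more_relevant, less_relevant) features for one query."""
    dim = len(pairs[0][0])
    weights = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            # Update only when the current weights mis-order the pair.
            if score(weights, better) <= score(weights, worse):
                for i in range(dim):
                    weights[i] += lr * (better[i] - worse[i])
    return weights


# Toy human-labeled preferences: the first document of each pair was judged better.
training_pairs = [
    ([2.0, 1.0], [1.0, 0.0]),
    ([1.5, 1.0], [1.8, 0.0]),
    ([0.9, 1.0], [0.5, 0.0]),
]
weights = train_pairwise(training_pairs)
print(weights)
```

At query time, candidates would be sorted by `score(weights, features)` in descending order.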

Diversity-based Reranking

  • Maximal marginal relevance: Balance relevance and diversity
  • Query aspect coverage: Cover different query aspects
  • Result clustering: Group similar results
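
Maximal marginal relevance (MMR) can be sketched as a greedy loop: at each step, pick the document that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any document already selected. The function names, the default `lam=0.5`, and the toy 2-D vectors below are assumptions for illustration; real systems would use embedding vectors.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query_vec: Sequence[float],
    doc_vecs: List[Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Return indices of up to k documents chosen greedily by MMR."""
    selected: List[int] = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:

        def mmr_score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0]
docs = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]  # docs 0 and 1 are near-duplicates
diverse = mmr(query, docs, k=2, lam=0.5)
relevance_only = mmr(query, docs, k=2, lam=1.0)
print(diverse, relevance_only)
```

With `lam=1.0` the redundancy term vanishes and the result is a plain relevance ranking; lowering `lam` pushes near-duplicates down in favor of documents that cover something new.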

🤖 Models

Cross-Encoders

  • MS MARCO models: Pre-trained for passage ranking
  • Domain-specific: Fine-tuned for legal documents
  • Multilingual: Models that support Vietnamese

Neural Rankers

  • BERT-based: Deep semantic understanding
  • RoBERTa variants: Same architecture as BERT with a more robust pretraining recipe, often giving better results
  • Ensemble methods: Combine multiple models

📊 Implementation

Batch Processing

  • Top-K reranking: Rerank top candidates only
  • Parallel execution: Process multiple documents together
  • Caching: Cache reranking results
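
The three ideas above can be combined in one small sketch: only the first `k` retrieved documents are rescored, they are processed in batches, and scores are memoized per (query, document) pair. All names here are hypothetical, `cached_score` is a placeholder for a real model call, and the default sizes are arbitrary.

```python
from functools import lru_cache
from typing import Iterator, List


@lru_cache(maxsize=10_000)
def cached_score(query: str, doc: str) -> float:
    """Placeholder scorer; a real system would call its reranking model here."""
    q_terms = set(query.lower().split())
    return float(len(q_terms & set(doc.lower().split())))


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def rerank_top_k(
    query: str, retrieved: List[str], k: int = 50, batch_size: int = 16
) -> List[str]:
    """Rerank only the first k documents; the tail keeps its retrieval order."""
    head, tail = retrieved[:k], retrieved[k:]
    scored = []
    for batch in batches(head, batch_size):
        # A real model would score the whole batch in a single call.
        scored.extend((doc, cached_score(query, doc)) for doc in batch)
    scored.sort(key=lambda pair: pair[1], reverse=True)  # stable for ties
    return [doc for doc, _ in scored] + tail


retrieved = ["a b", "x y", "query match here", "match", "late query match"]
first = rerank_top_k("query match", retrieved, k=3, batch_size=2)
second = rerank_top_k("query match", retrieved, k=3, batch_size=2)  # served from cache
print(first)
print(cached_score.cache_info())
```

Note that "late query match" stays in last place even though it would score well: anything outside the top `k` is never rescored, which is the price of bounding the reranking cost.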

Integration

  • Post-retrieval: After initial retrieval
  • Hybrid search: Rerank fusion results
  • Multi-stage: Multiple reranking passes
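
For the hybrid search case, one common way to produce the fused list is reciprocal rank fusion (RRF), where each document receives `1 / (k + rank)` from every ranking it appears in. The source does not say which fusion method is used, so RRF, the constant `k = 60`, and the document IDs below are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one fused ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


keyword_hits = ["d1", "d2", "d3"]  # e.g. from a keyword search
vector_hits = ["d3", "d1", "d4"]   # e.g. from a vector search
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

The fused list then becomes the candidate set for the reranker, so documents found by either retriever get a chance to be rescored.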

🚀 Optimization

Performance

  • Model compression: Smaller, faster models
  • Approximation: Trade some ranking accuracy for lower latency
  • Early stopping: Stop if confidence high
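
Early stopping can be sketched as scoring candidates in retrieval order, batch by batch, and stopping once enough of them clear a confidence threshold. The function name, the threshold, and the hard-coded scores are hypothetical; the lookup table stands in for a real model.

```python
from typing import Callable, List, Tuple


def rerank_with_early_stop(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float],
    k: int = 3,
    threshold: float = 0.9,
    batch_size: int = 4,
) -> List[Tuple[str, float]]:
    """Score docs batch by batch; stop once k docs reach the threshold."""
    scored: List[Tuple[str, float]] = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        scored.extend((doc, score_fn(query, doc)) for doc in batch)
        confident = sum(1 for _, s in scored if s >= threshold)
        if confident >= k:
            break  # enough high-confidence results; skip remaining batches
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy scores standing in for model output; `calls` records scorer invocations.
toy_scores = {"a": 0.95, "b": 0.2, "c": 0.92, "d": 0.1, "e": 0.99, "f": 0.3}
calls: List[str] = []


def toy_scorer(query: str, doc: str) -> float:
    calls.append(doc)
    return toy_scores[doc]


top = rerank_with_early_stop(
    "q", list(toy_scores), toy_scorer, k=2, threshold=0.9, batch_size=3
)
print(top, len(calls))
```

Only the first batch is scored here, so document "e" is never seen despite having the highest score. That is the trade-off of this optimization: it saves model calls but can miss a better result further down the candidate list.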

Quality

  • A/B testing: Compare reranking strategies
  • User feedback: Learn from interactions
  • Continuous learning: Adapt to user preferences
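
For A/B testing, each user needs to see the same reranking strategy on every request. One simple way to do this, sketched below, is to hash the user ID together with an experiment name and use the result to pick a variant. The function name, the experiment label, and the variant names are hypothetical.

```python
import hashlib
from typing import Sequence


def assign_variant(
    user_id: str,
    experiment: str,
    variants: Sequence[str] = ("control", "treatment"),
) -> str:
    """Deterministically map a user to one variant of an experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the hash as an integer and bucket it by number of variants.
    return variants[int(digest, 16) % len(variants)]


variant = assign_variant("user-42", "reranker-comparison")
print(variant)
```

Including the experiment name in the hash means the same user can land in different buckets across experiments, so one test does not systematically bias another.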

Reranking raises the quality of the final results and improves the user experience.