Differences

Below are the differences between two revisions of the page.

--- informatique:ai_lm:ai_nlp_rag [21/04/2026 09:06] – [AI NLP and RAG] cyrille
+++ informatique:ai_lm:ai_nlp_rag [04/06/2026 08:21] (current version) – [AI NLP and RAG] cyrille
@@ Line 9: / Line 9: @@
   * https://github.com/run-llama/llama_index
   * [[https://github.com/AnswerDotAI/RAGatouille|RAGatouille]] : bridging the gap between state-of-the-art research and alchemical RAG pipeline practices
+    * LangChain integration as a [[https://docs.langchain.com/oss/python/integrations/providers/ragatouille|provider]] or [[https://docs.langchain.com/oss/python/integrations/retrievers/ragatouille|retriever]]
+  * [[https://github.com/stanford-futuredata/ColBERT|ColBERT v2]] is a fast and accurate retrieval model, enabling scalable BERT-based search over large text collections in tens of milliseconds.
+    * One of the easiest ways to use ColBERT in applications nowadays is the semi-official, fast-growing RAGatouille library (2024)
 Named entity recognition (NER), also called entity chunking or entity extraction, is a component of natural language processing (NLP) that identifies predefined categories of objects in a body of text.
@@ Line 23: / Line 25: @@
   * Entity detection (NER)
     * aims to recognize and classify named entities such as people, places, organizations, and other specific information
+ReRanking
+  * ReRanking models: use of specialized models (such as cross-encoders) that directly compare the question with each chunk to compute a more precise relevance score.
+  * Score fusion: combining several criteria (vector relevance, popularity, data freshness, etc.) to obtain an optimized final ranking.
+  * Redundancy filtering: removing chunks that overlap too much, to avoid repeating the same information.
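A minimal sketch of two of the reranking steps listed above, score fusion and redundancy filtering. Everything here is an illustrative assumption rather than the API of any library: the `Chunk` fields, the weights, the Jaccard token overlap used as a redundancy measure, and the `max_overlap` threshold. A cross-encoder score could be added as one more weighted criterion.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector_score: float  # similarity from the vector search, assumed in [0, 1]
    freshness: float     # assumed in [0, 1], where 1 means most recent


def fused_score(chunk, w_vector=0.8, w_fresh=0.2):
    """Weighted sum of the criteria (the weights are arbitrary assumptions)."""
    return w_vector * chunk.vector_score + w_fresh * chunk.freshness


def jaccard(a, b):
    """Token-level Jaccard overlap between two texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rerank(chunks, top_k=3, max_overlap=0.6):
    """Sort by fused score, then skip chunks too similar to one already kept."""
    ranked = sorted(chunks, key=fused_score, reverse=True)
    kept = []
    for candidate in ranked:
        if all(jaccard(candidate.text, k.text) <= max_overlap for k in kept):
            kept.append(candidate)
        if len(kept) == top_k:
            break
    return kept
```

With two near-duplicate chunks about Paris and one about Berlin, only the higher-scoring Paris chunk survives the redundancy filter, and the Berlin chunk is kept because it overlaps less.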
+SEQUOIA (Semantic-Evolved QUery-Optimized Iterative Abstraction) is a novel RAG architecture that combines four techniques into a unified retrieval pipeline:
+  - Semantic Chunking -- splits documents by embedding similarity boundaries instead of fixed-size windows
+  - RAPTOR Tree -- recursively clusters chunks and summarizes via LLM, building a hierarchy
+  - Step-Back Prompting -- LLM generates a more abstract query; both queries used for retrieval across all tree levels
+  - Confidence-Gated Adaptive Depth -- retrieval starts at leaf level, ascends tree only if confidence is below threshold
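The confidence-gated adaptive depth idea from the list above can be sketched as follows. This is not the SEQUOIA implementation: the tree is reduced to a list of levels (`levels[0]` holds the leaf chunks, later entries hold more abstract summaries), and the confidence is a toy word-overlap score standing in for an embedding similarity. The names `overlap_score`, `gated_retrieve` and the `threshold` default are hypothetical.

```python
def overlap_score(query, text):
    """Toy confidence: fraction of query words found in the text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def gated_retrieve(query, levels, threshold=0.5):
    """Search the leaf level first and ascend only while confidence is low.

    Returns (score, text, depth) for the best node seen so far.
    """
    best = (0.0, None, -1)
    for depth, nodes in enumerate(levels):
        for text in nodes:
            score = overlap_score(query, text)
            if score > best[0]:
                best = (score, text, depth)
        if best[0] >= threshold:
            break  # confident enough: do not climb to the next level
    return best
```

A specific query that matches a leaf chunk stops at depth 0, while a broad query that no leaf answers well falls through to the summary level.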
+<code>
+query
+  → multi-query expansion (2 rewrites + 1 step-back, via LLM)
+  → hybrid retrieval per variant (BM25 + dense + RRF, top-20 each)
+  → RRF merge across all variants
+  → cross-encoder rerank (top-50 → top-5)
+  → context compression (sentence-level filtering by cosine sim to query,
+                         keep top 12 sentences, collapse into one chunk)
+  → LLM with short-answer prompt
+</code>
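The RRF (Reciprocal Rank Fusion) merge steps in the pipeline above can be illustrated with a short sketch. Each document receives the sum of `1 / (k + rank)` over every ranking in which it appears. The function name `rrf_merge`, the document ids, and the default `k=60` (a commonly used constant) are assumptions for illustration and are not taken from the benchmark repository.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, top_n=None):
    """Merge several ranked lists of document ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    merged = sorted(scores, key=lambda d: (-scores[d], d))
    return merged if top_n is None else merged[:top_n]


bm25_hits = ["a", "b", "c"]    # hypothetical BM25 ranking
dense_hits = ["b", "c", "d"]   # hypothetical dense-retrieval ranking
print(rrf_merge([bm25_hits, dense_hits]))
```

Documents that appear in both lists ("b" and "c") rise above those found by only one retriever, which is the reason RRF is used to combine BM25 and dense results without having to calibrate their raw scores against each other.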
+Articles
+  * [[https://aimultiple.com/rag-frameworks|Benchmark 5 RAG frameworks: LangChain, LangGraph, LlamaIndex, Haystack, and DSPy]] (2026)
+  * [[https://github.com/Diyago/rag-benchmark/tree/main|RAG Benchmark: comparing 8 retrieval-augmented generation architectures including SEQUOIA]]
 ==== Glossary ====