MiuLab/GenIR-Survey

Generative IR Survey Paper Structure

Introduction

  • Briefly introduce the history and trends in IR
  • Traditional retrieve-then-rank pipelines (e.g., DPR)
  • Differentiable Search Index (DSI): an alternative IR architecture in which a seq2seq model maps a query directly to a document ID

Definition of GenIR (DSI)

Intro to DSI

Two Major Components

  • Indexing
  • Retrieval
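
The two components can be illustrated by the training pairs a seq2seq model would see. The following is a minimal sketch, not the setup of any particular paper: the function name `build_dsi_examples`, the `index:` and `query:` prefixes, and the toy corpus are all hypothetical. Indexing pairs map document text to a document ID, and retrieval pairs map a query to the same ID space.

```python
from typing import Dict, List, Tuple


def build_dsi_examples(
    docs: Dict[str, str],
    queries: List[Tuple[str, str]],
    max_doc_tokens: int = 32,
) -> List[Tuple[str, str]]:
    """Return (input_text, target_docid) pairs for seq2seq training.

    The "index:" / "query:" task prefixes are illustrative assumptions.
    """
    examples: List[Tuple[str, str]] = []
    # Indexing: the model learns to emit a docid given (truncated) document text.
    for docid, text in docs.items():
        truncated = " ".join(text.split()[:max_doc_tokens])
        examples.append((f"index: {truncated}", docid))
    # Retrieval: the model learns to emit the relevant docid given a query.
    for query, docid in queries:
        examples.append((f"query: {query}", docid))
    return examples


# Toy data, invented for illustration only.
docs = {"101": "The cat sat on the mat", "102": "Dogs chase cats in the park"}
queries = [("where did the cat sit", "101")]
examples = build_dsi_examples(docs, queries, max_doc_tokens=4)
for source, target in examples:
    print(f"{source!r} -> {target!r}")
```

Because both tasks share one output vocabulary of document IDs, a single model can be trained on the mixed list of pairs.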

Pros and Cons of DSI

  • Pros
    • End-to-end training
  • Cons
    • Difficult to scale to large document collections

Identifier Strategies in Generative IR

Document Identifiers

  • Transformer Memory as a Differentiable Search Index
  • DSI++: Updating Transformer Memory with New Documents
  • Learning to Tokenize for Generative Retrieval

String Identifiers

  • Autoregressive Search Engines: Generating Substrings as Document Identifiers
  • Semantic-Enhanced Differentiable Search Index Inspired by Learning Strategies
  • Learning to Rank in Generative Retrieval
  • TOME: A Two-stage Approach for Model-based Retrieval
  • Multiview Identifiers Enhanced Generative Retrieval

Enhancing Query Generation and Expansion

  • Generating queries to obtain more training data
  • Representing documents by generated queries

Document Representation in DSI

  • DSI
    • Direct Indexing
    • Set Indexing
    • Inverted Index
  • Bridging the Gap
    • Queries as document representations
    • Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation
    • Multiview Identifiers Enhanced Generative Retrieval
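
The three DSI document representations listed above can be sketched as simple token transformations. This is a rough illustration under assumptions: it uses whitespace tokens instead of a real tokenizer, a toy stopword list, and hypothetical function names. Set indexing here preserves first-occurrence order, which is a choice of this sketch.

```python
import random
from typing import List

# Toy stopword list, an assumption for illustration.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is"}


def direct_indexing(tokens: List[str], length: int) -> List[str]:
    """Keep the first `length` tokens in their original order."""
    return tokens[:length]


def set_indexing(tokens: List[str], length: int) -> List[str]:
    """Drop stopwords and repeated terms, then keep the first `length` tokens."""
    seen = set()
    kept: List[str] = []
    for tok in tokens:
        key = tok.lower()
        if key in STOPWORDS or key in seen:
            continue
        seen.add(key)
        kept.append(tok)
    return kept[:length]


def inverted_index_chunk(
    tokens: List[str], chunk_size: int, rng: random.Random
) -> List[str]:
    """Sample one contiguous chunk of tokens to associate with the docid."""
    if len(tokens) <= chunk_size:
        return list(tokens)
    start = rng.randrange(len(tokens) - chunk_size + 1)
    return tokens[start:start + chunk_size]


tokens = "the cat sat on the mat and the cat slept".split()
direct = direct_indexing(tokens, 4)
dedup = set_indexing(tokens, 4)
chunk = inverted_index_chunk(tokens, 3, random.Random(0))
print(direct, dedup, chunk)
```

Each function returns the text that would be paired with the document ID as an indexing example.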

Add New Documents

  • DSI++: Updating Transformer Memory with New Documents
  • Continual Learning for Generative Retrieval over Dynamic Corpora

Evaluation of Generative IR

Compare the performance of traditional IR systems with DSI-based IR systems

  • Datasets
    • IR tasks
    • Knowledge-intensive tasks (e.g., QA)
  • Metrics
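
The outline does not name specific metrics. As an assumption, the sketch below shows two that are commonly reported for retrieval, Hits@k and mean reciprocal rank (MRR), computed over ranked lists of document IDs. The function names and toy data are hypothetical.

```python
from typing import Sequence


def hits_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of queries whose gold docid appears in the top-k predictions."""
    hits = sum(1 for preds, g in zip(ranked, gold) if g in preds[:k])
    return hits / len(gold)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1/rank of the gold docid, counting 0 when it is missing."""
    total = 0.0
    for preds, g in zip(ranked, gold):
        preds = list(preds)
        if g in preds:
            total += 1.0 / (preds.index(g) + 1)
    return total / len(gold)


# Toy predictions for three queries, one gold docid per query.
ranked = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d9", "d8", "d7"]]
gold = ["d1", "d4", "d0"]
print(hits_at_k(ranked, gold, 1), hits_at_k(ranked, gold, 2))
print(mean_reciprocal_rank(ranked, gold))
```

The same functions apply to a traditional retriever and to a generative one, as long as both produce a ranked list of document IDs (for example, from beam search).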

Limitations and Future Directions

  • Challenges with DSI
    • Generating non-existent document IDs (cf. FM-index constrained generation)
    • Scaling DSI systems to handle large data volumes
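
One way to avoid generating non-existent document IDs is to constrain decoding to valid identifiers. The sketch below uses a prefix trie over document IDs rather than an FM-index, as a simpler stand-in. The character-level tokens, the `<eos>` marker, the function names, and the toy scorer are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List

EOS = "<eos>"  # hypothetical end-of-identifier marker


def build_trie(docids: Iterable[str]) -> Dict:
    """Build a character-level prefix trie over the valid docids."""
    root: Dict = {}
    for docid in docids:
        node = root
        for ch in docid:
            node = node.setdefault(ch, {})
        node[EOS] = {}
    return root


def allowed_next(trie: Dict, prefix: str) -> List[str]:
    """Return the tokens that keep `prefix` on a path to some valid docid."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]
    return sorted(node.keys())


def constrained_greedy_decode(trie: Dict, score: Callable[[str, str], float]) -> str:
    """Greedy decoding that only considers tokens allowed by the trie."""
    prefix = ""
    while True:
        options = allowed_next(trie, prefix)
        if not options:
            raise ValueError("no valid continuation; is the trie empty?")
        best = max(options, key=lambda tok: score(prefix, tok))
        if best == EOS:
            return prefix
        prefix += best


def toy_score(prefix: str, token: str) -> float:
    """Stand-in for model logits: prefers larger digits, EOS scores lowest."""
    return -1.0 if token == EOS else float(int(token))


docids = ["120", "125", "31"]
trie = build_trie(docids)
decoded = constrained_greedy_decode(trie, toy_score)
print(decoded)
```

Without the constraint, this toy scorer would prefer the digit 9 at every step and never produce a valid identifier. With the trie, every decoded string is guaranteed to be one of the indexed IDs.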

This structure aims to provide a comprehensive overview of Generative IR and the pivotal role of Differentiable Search Indexes, making it accessible to newcomers while detailing the progress and challenges in the field.
