
[Obs AI Assistant] Make content from Search connectors fully searchable #175434

Open
miltonhultgren opened this issue Jan 24, 2024 · 12 comments
@miltonhultgren
Contributor

miltonhultgren commented Jan 24, 2024

Today, if we ingest a large piece of text into a Knowledge base entry, only the first 512 word pieces are used for creating the embeddings that ELSER uses to match on during semantic search.

This means that if the parts relevant to the query are not at the "start" of this big text, the entry won't match, even though there may be critical information at the end of the text.

We should attempt to apply chunking to all documents ingested into the Knowledge base so that the recall search has a better chance of finding relevant hits, regardless of their size.

As a stretch goal, it would also be valuable to extract only the relevant chunk (512 word pieces?) from the matched document, in order to send less (and only relevant) text to the LLM.

AC

  • Large texts imported into the Knowledge base get embeddings that cover the full text
  • The Ingest pipeline used to apply the chunking is shared in docs so users can apply it to their search-* indices as well
  • Recall is able to search across small Knowledge base documents ("single" embedding) and large documents ("multiple" embeddings) in a seamless manner
  • (Stretch) Only the relevant part of a "multiple embeddings" document is passed to the LLM

More resources on chunking: https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/document-chunking
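
The AC mentions sharing the chunking pipeline in docs. As a rough sketch of what that could look like (not a final design: the text/passages field names, chunk size, and overlap are all assumptions, loosely following the elasticsearch-labs notebooks linked above):

PUT _ingest/pipeline/chunk-text
{
  "processors": [
    {
      "script": {
        "description": "Split the text field into overlapping word-based passages (word count as a rough proxy for ELSER's 512 word-piece window)",
        "lang": "painless",
        "source": """
          String[] words = ctx['text'].splitOnToken(' ');
          int chunkSize = 400;  // assumed: stays under the 512 word-piece limit
          int overlap = 200;    // assumed: ~50% overlap, mirroring the 512/256 recommendation
          List passages = new ArrayList();
          for (int start = 0; start < words.length; start += chunkSize - overlap) {
            int end = Math.min(start + chunkSize, words.length);
            StringBuilder sb = new StringBuilder();
            for (int i = start; i < end; i++) {
              if (i > start) sb.append(' ');
              sb.append(words[i]);
            }
            passages.add(['text': sb.toString()]);
            if (end == words.length) break;
          }
          ctx['passages'] = passages;
        """
      }
    },
    {
      "foreach": {
        "field": "passages",
        "processor": {
          "inference": {
            "model_id": ".elser_model_2_linux-x86_64",
            "input_output": [
              {
                "input_field": "_ingest._value.text",
                "output_field": "_ingest._value.sparse"
              }
            ]
          }
        }
      }
    }
  ]
}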

@miltonhultgren miltonhultgren added the Team:obs-knowledge Observability Experience Knowledge team label Jan 24, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-knowledge-team (Team:obs-knowledge)

@miltonhultgren miltonhultgren self-assigned this Jan 24, 2024
@miltonhultgren
Contributor Author

If we want to retrieve multiple passages from the same text document, we need to split it before ingesting and store one document per passage.
The recommended chunk size for ELSER is 512 tokens, but to make the search more coherent it's also recommended to overlap the chunks by 256 tokens.
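
A minimal sketch of that one-document-per-passage layout (index and field names are assumptions; the sparse_vector field type needs 8.11+):

PUT kb-passages
{
  "mappings": {
    "properties": {
      "source_id":     { "type": "keyword" },       // points back to the original Knowledge base entry
      "passage_index": { "type": "integer" },       // position of the chunk within the source text
      "text":          { "type": "text" },
      "sparse":        { "type": "sparse_vector" }  // ELSER token/weight pairs for this passage
    }
  }
}

With a 512-piece window and a 256-piece stride, passage i starts at word piece i * 256, so any span of up to 256 word pieces is fully contained in at least one window.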

@dgieselaar
Member

If we want to retrieve multiple passages from the same text document, we need to split it before ingesting and store one document per passage.

Do you mean that we can only select a subset of passages if we split them up into separate documents?

@miltonhultgren
Contributor Author

Yes, at least that is my understanding after talking to the AI Search folks.

Assuming you have a large document, you create nested fields for each passage and embeddings for each passage:
you'll be able to use knn with inner_hits to search across all passages, but it will still give back the whole document (and perhaps some information about which passage caused the match). You can't pull out more than one passage this way; even raising the knn's k value just gives you more whole-document hits, each with a single passage.

So to get multiple passage hits we need to store multiple documents in ES, which would then let us turn up the k value in our search and potentially find multiple hits from the same original large text. Not sure if semantic_text would change this.
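
For context, the nested layout this discussion assumes would look roughly like this (a sketch; field names mirror the example queries further down in this thread, and the 768 dims are an assumption matching the all-distilroberta-v1 model used later):

PUT wiki-dual_semantic
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "passages": {
        "type": "nested",
        "properties": {
          "text": { "type": "text" },
          "sparse": { "type": "sparse_vector" },  // ELSER embeddings per passage
          "embeddings": {                         // dense embeddings per passage, for knn
            "type": "dense_vector",
            "dims": 768,
            "index": true,
            "similarity": "cosine"
          }
        }
      }
    }
  }
}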

@miltonhultgren
Contributor Author

miltonhultgren commented Feb 2, 2024

Do you mean that we can only select a subset of passages if we split them up into separate documents?

@dgieselaar The thing I said above is true for knn (I've asked if this will change at some point), but if you're using ELSER you cannot use knn (dense vector vs sparse vector), so you need to stick to text_expansion queries, which also support inner_hits and, in this case, can give back more than one hit.

So as long as we use ELSER (or rather, some model that produces a sparse_vector) for the chunking, we can search across a large document and return the X passages in that document that matched.

Example query:

GET wiki-dual_semantic*/_search
{
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "text_expansion": {
          "passages.sparse": {
            "model_id": ".elser_model_2_linux-x86_64",
            "model_text": "Where is the Eiffel Tower?"
          }
        }
      },
      "inner_hits": {
        "_source": false,
        "size": 5,
        "fields": [
          "passages.text"
        ]
      }
    }
  },
  "_source": false,
  "fields": [
    "title"
  ]
}

Pseudo query for multi-model hybrid search:

GET my-index/_search
{
  "query": {
    "bool": {
      "should": [
        { text_expansion }, // on nested field1, with inner_hits
        { text_expansion }, // on nested field2, with inner_hits
        { match_phrase }    // on nested field3
      ]
    }
  },
  "knn": [
    {
      "field": "image-vector",
      "query_vector": [-5, 9, -12],
      "k": 10,
      "num_candidates": 100
      // with inner_hits
    },
    {
      "field": "image-vector",
      "query_vector": [-5, 9, -12],
      "k": 10,
      "num_candidates": 100
      // with inner_hits
    }
  ],
  "rank": {
    "rrf": {
      "window_size": 50,
      "rank_constant": 20
    }
  }
}

@dgieselaar
Member

@miltonhultgren that sounds good AFAICT, do you see any concerns?

@miltonhultgren
Contributor Author

miltonhultgren commented Feb 2, 2024

KNN supports multiple inner hits in 8.13 🚀

I haven't gotten to really trying these things out yet. It seems the path is being paved for us here (and semantic_text will only make it easier).
A lot of the things I've looked at are out of scope for this issue and will be things we can plan for future iterations.

For this issue I will stick to using ELSER, chunking into a nested object, using a nested query with text_expansion and inner_hits to grab multiple relevant passages.

I have two small concerns for this ticket:

  1. Should we aim to support keyword/hybrid search (using a normal BM25 text match query, with or without RRF)?
  2. I'm not sure I fully understand how to apply the chunking yet, in particular the "512 size, 256 overlap" recommendation.

Number 1 would cover the case where, for example, there aren't any embeddings in a search-* index, or there are only dense_vector embeddings; we could still fall back on keyword search and maybe find good matches that way.
That could also allow users to use our Knowledge base without ELSER installed.
I'm leaning towards deferring that until later though (together with multi-model support), do you agree @dgieselaar ?
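
As a sketch, that fallback could be a plain BM25 match query that needs no model at all (the index pattern and field name are assumptions):

GET search-*/_search
{
  "size": 5,
  "query": {
    "match": {
      "body_content": "how do I configure the agent"
    }
  }
}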

I'm going to research number 2 next.

@miltonhultgren
Contributor Author

Sample query combining a nested match query with inner_hits and a knn search with inner_hits, ranked with RRF:

GET wikipedia_*/_search
{
  "size": 5,
  "_source": false,
  "fields": [
    "title",
    "passages.text"
  ], 
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "passages.text": "who is batman"
              }
            }
          ]
        }
      },
      "inner_hits": {
        "name": "query",
        "_source": false,
        "fields": [
          "passages.text"
        ]
      }
    }
  },
  "knn": {
    "inner_hits": {
      "name": "knn",
      "_source": false,
      "fields": [
        "passages.text"
      ]
    },
    "field": "passages.embeddings",
    "k": 5,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "sentence-transformers__all-distilroberta-v1",
        "model_text": "who is batman"
      }
    }
  },
  "rank": {
    "rrf": {}
  }
}

@miltonhultgren
Contributor Author

Would it be desirable/ideal to perform a single ranked search across text, dense, and sparse vectors, but also across all indices at once, rather than per source (Knowledge base, Search connectors in different indices)? What are the trade-offs of that?

How would one combine that with "API search", meaning searches that hit an API rather than Elasticsearch? Just thinking out loud here for the future.

@dgieselaar
Member

dgieselaar commented Feb 6, 2024

@miltonhultgren yes it would be preferable (a single search), but we have different privilege models for the knowledge base versus search-* - the former uses the internal user, and the latter uses the current user, so we cannot (at least to my understanding) execute it as a single search request.

@miltonhultgren
Contributor Author

We're waiting for semantic_text to become available since it will handle chunking for us. At that point this ticket can be rewritten to reflect the work needed to migrate the Knowledge base to semantic_text instead.
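
For reference, a sketch of what the migration target could look like once semantic_text lands (the index name and inference endpoint id are hypothetical; semantic_text chunks the text automatically at ingest time):

PUT my-knowledge-base
{
  "mappings": {
    "properties": {
      "text": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}

GET my-knowledge-base/_search
{
  "query": {
    "semantic": {
      "field": "text",
      "query": "Where is the Eiffel Tower?"
    }
  }
}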

@emma-raffenne emma-raffenne added this to the 8.15 milestone Apr 17, 2024
@sorenlouv
Member

Update: This is still blocked by semantic_text

@emma-raffenne emma-raffenne changed the title Make Knowledge base articles fully searchable Make content from Search connectors fully searchable Jun 27, 2024
@emma-raffenne emma-raffenne changed the title Make content from Search connectors fully searchable [Obs AI Assistant] Make content from Search connectors fully searchable Jun 27, 2024
@emma-raffenne emma-raffenne removed this from the 8.15 milestone Jun 27, 2024
@emma-raffenne emma-raffenne added Team:Obs AI Assistant and removed Team:obs-knowledge Observability Experience Knowledge team labels Jun 27, 2024