[RFC] Suggest/Infer Mapping Based on Inspection #145

Open
macohen opened this issue Feb 8, 2023 · 0 comments
Labels
rfc: Request for Comments
Search: Indicates a search feature - useful for cross project searches

Comments

Collaborator

macohen commented Feb 8, 2023

Summary

When building a product search or full-text search application on OpenSearch, the index must be designed carefully around the data being ingested. Often, a document to be indexed is derived from several different sources (e.g. databases, tables, or files). In OpenSearch, the best practice is to denormalize this data into a single document. This proposal is to build tooling that inspects source data and generates a suggested OpenSearch index mapping file, so that search application builders neither have to start from scratch nor fall back on dynamic mapping.

General Use Case

  1. Upload a CSV file or JSON document, or provide a SQL query to run against a source
  2. Optionally specify a primary or unique key
  3. Mapping file generator then:
    a. sorts the data based on the key provided
    b. loops through X rows to generate a suggested mapping file, including nested documents, datatypes, analyzers, and autosuggest
  4. Search Application Builder can take the mapping file and tweak/correct the mappings.

Questions

  1. Is there an existing component, like data-prepper, that can be used to support this case?
  2. Many organizations reuse datatypes, so we would want to learn from existing mappings. How can this be done?
@macohen macohen added rfc Request for Comments and removed untriaged labels Feb 8, 2023
@macohen macohen added the Search Indicates a search feature - useful for cross project searches label Mar 10, 2023
Projects
Status: Later (6 months plus)