diff --git a/rfcs/text/0000-additional-label-types.md b/rfcs/text/0000-additional-label-types.md
new file mode 100644
index 0000000000..5c955767fa
--- /dev/null
+++ b/rfcs/text/0000-additional-label-types.md
@@ -0,0 +1,220 @@
+# 0000: Label fields for additional types
+
+- Stage: **0 (strawperson)**
+- Date: **TBD**
+
+The existing `labels` field captures custom key/value pairs that add meta information to events. All `labels` values are indexed as `keyword`.
+
+Some use cases need label values of other Elasticsearch data types. To accommodate them, this RFC proposes introducing several new object fields within the existing `labels` object.
+
+## Fields
+
+### Proposed Fields
+
+* `labels.*` fields will remain type `keyword`
+* `labels.long.*` fields will be type `long`
+* `labels.double.*` fields will be type `double`
+* `labels.boolean.*` fields will be type `boolean`
+
+### Proposed mapping settings
+
+```json
+{
+  "mappings" : {
+    "dynamic_templates" : [
+      {
+        "labels": {
+          "path_match": "labels.*",
+          "mapping": {
+            "type": "keyword"
+          },
+          "match_mapping_type": "string"
+        }
+      },
+      {
+        "labels_boolean": {
+          "path_match": "labels.boolean.*",
+          "mapping": {
+            "type": "boolean"
+          },
+          "match_mapping_type": "boolean"
+        }
+      },
+      {
+        "labels_double": {
+          "path_match": "labels.double.*",
+          "mapping": {
+            "type": "double"
+          },
+          "match_mapping_type": "double"
+        }
+      },
+      {
+        "labels_long": {
+          "path_match": "labels.long.*",
+          "mapping": {
+            "type": "long"
+          },
+          "match_mapping_type": "long"
+        }
+      },
+      {
+        "strings_as_keywords" : {
+          "match_mapping_type" : "string",
+          "mapping" : {
+            "ignore_above" : 1024,
+            "type" : "keyword"
+          }
+        }
+      }
+    ],
+    "date_detection" : false,
+    "properties": {
+      "@timestamp": {
+        "type": "date"
+      },
+      "labels": {
+        "type": "object",
+        "properties": {
+          "long": {
+            "type": "object"
+          },
+          "double": {
+            "type": "object"
+          },
+          "boolean": {
+            "type": "object"
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+## Usage
+
+APM uses non-ECS fields for other data types, such as `boolean` and `scaled_float`. Expanding `labels` support in ECS to types beyond `keyword` could also better support storing OpenTelemetry attributes (key:value pairs that provide context to a distributed trace).
+
+## Source data
+
+```json
+  "labels" : {
+    "worker": "netclient",
+    "long": {
+      "events_published" : 50,
+      "events_original" : 50,
+      "events_encoded" : 50,
+      "events_failed" : 0
+    }
+  }
+```
+
+## Scope of impact
+
+Sources such as Beats or Elastic Agent would need to adopt the new mappings and dynamic field mapping settings.
+
+Kibana index patterns would need to identify the new nested fields for each of the new `labels` child objects.
+
+## Concerns
+
+### Using flattened
+
+There has been discussion about whether the `flattened` field type could work for the `labels` field to avoid mapping explosions. However, `flattened` has limitations as of this writing:
+
+* A `flattened` object indexes all leaf values as `keyword`. Numeric flattened field support is not yet available in Elasticsearch ([elastic/elasticsearch#61550](https://github.com/elastic/elasticsearch/issues/61550)).
+* Kibana has limited autocompletion and KQL support for `flattened` fields ([elastic/kibana#25820](https://github.com/elastic/kibana/issues/25820)).
+
+## People
+
+The following are the people that consulted on the contents of this RFC.
+
+* @ebeahan | author
+
+## References
+
+* https://github.com/elastic/apm-server/issues/3873
+
+### RFC Pull Requests
+
+* Stage 0: https://github.com/elastic/ecs/pull/1341
+