
Allow disabling and configuring the add_x_metadata default global processors #4670

Open
gizas opened this issue Oct 16, 2023 · 35 comments
Labels: Team:Elastic-Agent (Label for the Agent team), Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team)

Comments

@gizas
Contributor

gizas commented Oct 16, 2023

Edited by @cmacknz to generalize to all add_x_metadata processors and include the suggested approach from #4670 (comment)

Describe the enhancement:

The default set of processors for each Beat, when started by the agent, is defined in the code and is not configurable. Here is the definition for Metricbeat; the same default configuration applies to the rest of the Beats.

For example, the add_kubernetes_metadata processor is used to enrich Kubernetes events with extra metadata fields. This results in the number of k8s API calls being proportional to the number of inputs in the agent policy. The same functionality is also available through enrichers, which are started by default in the Beats code as described here (see enrichers).

This enhancement will allow disabling or configuring any of the default Beat global processors. This will be done by defining a new type of provider, an event_metadata provider, which controls the behavior of the event metadata processors added by each Beat.

We will additionally expose the entire contents of the providers section of the agent policy in Fleet to allow configuring this. Here is a sample of the intended configuration:

providers:
     event_metadata:
         # The default value would be true to avoid a breaking change when they aren't specified.
         host: # controls add_host_metadata
             enabled: true
         kubernetes: # controls add_kubernetes_metadata
             enabled: false
         cloud: # controls add_cloud_metadata
             enabled: true
         docker: # controls add_docker_metadata 
             enabled: false

Describe a specific use case for the enhancement or feature:

All Elastic Agent Kubernetes deployments have this processor enabled by default in the background, which can be "expensive" (in terms of extra memory and the number of API calls the agent has to make), especially in large Kubernetes deployments.

Related Issues:

Definition of Done:

  • Allow disabling the add_kubernetes_metadata processor for all Beats run by Elastic Agent.
  • Provide tests that showcase the same functionality and enrichment with and without the processor enabled.
  • Demonstrate that the event metadata processors can be disabled from Fleet.
  • Document the new functionality
@gizas gizas added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Oct 16, 2023
@botelastic botelastic bot removed the needs_team label Oct 16, 2023
@joshdover
Contributor

I'm still unfamiliar with how enrichers work, so bear with me here. As far as I understand, they only apply to the metricsets that use them directly and are not configurable. I think we still need an option for users to be able to easily add the same metadata as before with any input type.

The problem we're trying to solve here is especially problematic on Elastic Agent, where we start a new Beat instance for each unique input type. Each of these Beats runs its own instance of add_kubernetes_metadata in addition to the Agent's built-in kubernetes provider. Each of these essentially makes the same K8s API requests, which puts a lot of load on the K8s API. This problem scales horizontally both with the number of K8s nodes in the cluster and with the number of integrations added to the policy.

I was thinking that for Agent users we could add the exact same properties that the processor provides, but instead through the Kubernetes provider, by updating integrations and standalone config examples to populate add_fields processors. An example would look like this:

inputs:
- id: container-log-${kubernetes.pod.name}-${kubernetes.container.id}
  type: filestream
  streams:
    - id: container-log-${kubernetes.pod.name}-${kubernetes.container.id}
      data_stream:
        dataset: kubernetes.container_logs
        type: logs
      prospector.scanner.symlinks: true
      parsers:
        - container: ~
      paths:
        - /var/log/containers/*${kubernetes.container.id}.log

      # 👇 Here is what is new
      processors:
        - add_fields:
            target: kubernetes
            fields:
              pod.name: ${kubernetes.pod.name}
              pod.id: ${kubernetes.pod.id}
              namespace: ${kubernetes.namespace}
              labels: ${kubernetes.labels}

Ideally this is actually something that Agent could pull off automatically without having to change integrations and configurations. This would help avoid a breaking change. I see a couple options to solve this:

  • Change how the add_kubernetes_metadata processor works when running under Agent to somehow fetch this info from the agent core process instead of K8s API directly. This may have some of the same memory issues though, not sure. This might set us up to more easily have a cluster-level agent that provides this info for all agents later.
  • Disable the add_kubernetes_metadata by default when running Beats under Agent, but have Agent insert this add_fields processor block automatically when the provider detects it's on K8s.

@gizas
Contributor Author

gizas commented Oct 16, 2023

they only apply to the metricsets that use them directly and are not configurable

A small correction on this: yes, they start by default, but they are configurable through the add_resource_metadata block. See the example below from a managed agent (the same applies to standalone):

[Screenshot: add_resource_metadata block configured in the managed agent policy]

So at the moment you can only disable namespace enrichment (this is the main one that gives us labels and annotations), node enrichment (which adds node and hostname), cronjob (which adds the cronjob name to pods) and deployment (which adds the deployment name to pods).
See all the options here in the add_resource_metadata block.
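
For illustration, a rough sketch of such an add_resource_metadata block (the exact keys should be checked against the linked documentation):

add_resource_metadata:
  namespace:
    enabled: true     # namespace labels and annotations enrichment
  node:
    enabled: true     # node name and hostname enrichment
  cronjob: false      # do not add the cronjob name to pod events
  deployment: false   # do not add the deployment name to pod events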

So what you describe above for the processor can still happen in the enricher and be configured with the options you specify. We may need to spell out extra user scenarios, but that is OK.

The processor still needs to be disabled, though, because at the moment both the enrichers and the processor start by default, and this is not correct.

Agent to somehow fetch this info from the agent core process instead of K8s API directly.

I understand your point: check whether the agent already has this information so we don't have to request it again. We will need to look at the code and see whether the agent's cache can be utilised from the integration side.
cc @ChrsMark and @MichaelKatsoulis that have worked in the past in this part of the code to see if they remember anything more

@ChrsMark
Member

Hey folks, I'm not sure if the components are completely distinguished here. So here is a quick list of the related components:

1) k8s provider of Elastic Agent

2) k8s enrichers of Metricbeat

3) add_kubernetes_metadata processor

⚠️ This is what the issue is about ⚠️

This processor is started by default by all Beats because each Beat's default configuration includes it: https://github.com/elastic/beats/blob/b8a241b0703d252c3ff1ba9b417aea585f811da1/metricbeat/metricbeat.yml#L127 (the same applies to other Beats like Filebeat).
I'm not sure how https://github.com/elastic/beats/blob/main/x-pack/metricbeat/cmd/root.go#L74-L78 interacts with this tbh.

From my perspective, in Beats I should be able to remove the add_kubernetes_metadata: ~ processor from the Beat's default config and thereby disable the processor. Isn't that correct, @gizas?
But Elastic Agent does not provide that option, which leads to the following issue ---->

In the past we had a related issue at elastic/beats#27216 when Elastic Agent was starting (it still does :)) Filebeat with the default config file (filebeat.yml), and as a result the processor (add_kubernetes_metadata) produced tons of warnings because the defaults were not valid.
That's why we introduced elastic/beats#27689 to skip the enrichment whenever we detect that the k8s metadata is already present on the event (from the k8s autodiscovery provider, for example).

However, this only makes the processor skip the enrichment part, not the "watching" part, since the processor still starts and makes all the required API calls to populate its caches. This is why we need to completely disable it if we don't specifically need it.

⚠️ However, the original issue is still there and is filed at #90. So I think this is why we are still hitting this, @joshdover @gizas, and this is what we should focus on, since it's a generic problem with how Elastic Agent starts and configures the underlying Beats. ⚠️

Note that the default enablement of the processor in Beats was a "nice-to-have", but in Agent it's completely useless since other components already do the job, and if anyone does need the processor they can add it to the policy manually.

I hope it's clear why it's crucial to either disable the processor by default at the Beats level or provide the option to do it from Elastic Agent through #90.
All the information about the problematic behaviour of the add_kubernetes_metadata processor can be found in the linked issues above, so please have a look, but I'm happy to elaborate further if needed.

ps1: Reusing metadata is a valid point in general but not applicable here, so let's try to keep the focus on the existing issue :).

@gizas
Contributor Author

gizas commented Oct 16, 2023

Thanks @ChrsMark, so trying to summarise your info and Josh's comments:

  • Yes, this issue is about stopping the add_kubernetes_metadata processor from running by default in Beats and Agent, and testing whether this causes any differences compared to what users see today.
  • @joshdover (correct me here) was suggesting a more granular disablement of the existing metadata, if that makes sense.
  • The suggestion to reuse metadata from the agent's provider is not feasible in Metricbeat, as the provider is not used at all.

@ChrsMark
Member

@gizas 👍🏽

I had filed an issue regarding the memory utilization improvement by using common shared caches etc: elastic/beats#35591. But as I said this is out of scope for the current problem.

So yes, let's try to see if we can completely disable the add_kubernetes_metadata processor on Beat startup (ideally when we detect that we run under Elastic Agent).
We could introduce an ENV var check at https://github.com/MichaelKatsoulis/beats/blob/3955e717adb8a1204464de756fcba91d7df89d75/libbeat/processors/add_kubernetes_metadata/kubernetes.go#L142 to skip the whole initialization if the specific variable is present. Something similar to what we did for the add_host_metadata processor, @gizas.

@MichaelKatsoulis
Contributor

+1 to what @ChrsMark mentioned.
Also regarding @joshdover's comment #4670: in the case of the container_logs integration, on a managed or standalone agent, the use of dynamic variables (/var/log/containers/*${kubernetes.container.id}.log) starts the kubernetes provider.
The provider enriches the log events with metadata, so adding the add_fields processor won't add anything new.

@MichaelKatsoulis
Contributor

More on the ENV var approach: a variable worth investigating is BEAT_SETUID_AS, which under Agent is set to elastic-agent by the Dockerfile. When a Beat runs on its own the variable is unset, as far as I have seen so far.

@gsantoro
Contributor

The default set of processors for each Beat, when they are started by agent, is defined in the code. Here is the definition for Metricbeat. Same default config stands for rest of beats.

@gizas This is not entirely true. Filebeat behaves differently.

The code at https://github.com/elastic/beats/blob/157b31a79e3ad52578f30399049cac030d688934/x-pack/filebeat/cmd/root.go#L51 shows that for Filebeat, when SHIPPER_MODE=true, all the default processors are disabled. This only applies to Filebeat, not Metricbeat.

@gsantoro
Contributor

@MichaelKatsoulis personally I wouldn't repurpose BEAT_SETUID_AS as a feature flag for enabling/disabling processors.

@gizas
Contributor Author

gizas commented Dec 12, 2023

@gsantoro we had done something similar with an environment variable here to alter the configuration of processors. Maybe we can think of another variable to disable the kubernetes processor without it being a breaking change?

Besides the x-pack configuration in this story (the x-pack folder, I guess, only affects Elastic Agent when it starts Beats underneath), we need to verify whether the configuration files (like https://github.com/elastic/beats/blob/main/filebeat/filebeat.yml#L176) also need to be changed (for when we run each Beat on its own).

Additionally, we would need diagnostics before and after the changes to quantify the improvement in memory and be sure that the processor does not otherwise run. Any other ideas for measuring the possible improvement?

@cmacknz
Member

cmacknz commented Dec 12, 2023

One thing we could explore here is using the recently added agent.features section of the policy. This was built as a mechanism for passing feature flags to agent and the processes it runs.

# Feature Flags
# This section enables or disables feature flags supported by Agent and its components.
#agent.features:
#  fqdn:
#    enabled: false

Right now there is only a single feature flag, which controls whether the agent hostname is the fully qualified domain name or not.

While it would be better if we explicitly supported global processors as a first-class concept in the agent policy, there is still a lot of debate about how to do this properly, since doing it for every process in agent, regardless of whether it is implemented as a Beat or not, is complex.

So rather than waiting for the global processors problem to be solved, we could allow configuring the set of Beats global processors in the features section of the policy. If we expose these as feature flags we can conceivably obsolete them later once global processor support exists. For example we could expose them as something a bit abstracted from the implementation like (final syntax TBD):

agent.features:
    event_metadata:
         # The default value would be true to avoid a breaking change when they aren't specified.
         host: true # controls add_host_metadata
         kubernetes: false # controls add_kubernetes_metadata
         cloud: true # controls add_cloud_metadata
         docker: false # controls add_docker_metadata

This would give us a nice way to control this in the agent policy. This wouldn't be available in the Fleet UI without a change there to expose it, but it could immediately be controlled using the Fleet override API.

The agent passes the feature flags to sub-processes like Beats using the control protocol: https://github.com/elastic/elastic-agent-client/blob/21e4fd899bbcbe622248407e90744217da77b2b4/elastic-agent-client.proto#L268-L271

    // Features are the expected feature flags configurations. Can apply to either components or
    // individual units depending on the flag and its implementation. Omitted if the client reports
    // it has applied the current configuration. Added in Elastic Agent v8.7.1.
    Features features = 3;

This eventually ends up being handled in the x-pack/libbeat/management code in Beats:

https://github.com/elastic/beats/blob/06a8c09655b0d1d5d4ec4bd77bc930d18975155e/x-pack/libbeat/management/managerV2.go#L572-L584

Most of the work with this approach would be allowing changes to the feature flags to remove processors from the list of default global processors. These are currently initialized once at startup; we'd need a way to manipulate the processors list in the pipeline based on feature flag changes.

https://github.com/elastic/beats/blob/06a8c09655b0d1d5d4ec4bd77bc930d18975155e/x-pack/filebeat/cmd/root.go#L41-L66

@gizas
Contributor Author

gizas commented Dec 13, 2023

Nice @cmacknz !
So @gsantoro I guess this is a nice test for your debugger to see when

agent.features:
    event_metadata:
         kubernetes: false # controls add_kubernetes_metadata

then we actually pass the correct config here https://github.com/elastic/beats/blob/06a8c09655b0d1d5d4ec4bd77bc930d18975155e/x-pack/libbeat/management/managerV2.go#L572-L584

Just a reminder that regardless of how we do it, we must verify that if we disable add_kubernetes_metadata by default we don't break anything and don't lose metadata in our collection. We should also measure the improvement in memory and in the number of API calls, if any.

@gsantoro
Contributor

hey @gizas and @cmacknz ,
I am more inclined to follow the approach of ELASTIC_NETINFO at https://github.com/elastic/beats/pull/36506/files.

I would introduce a variable ELASTIC_K8S_METADATA, set to false in any elastic-agent (both standalone and managed) and be done with it.

No Fleet code change necessary. Just set the right values in the manifests and add some docs.
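
For illustration only, a sketch of what that manifest change might look like, assuming the hypothetical ELASTIC_K8S_METADATA variable proposed above (it is not implemented anywhere yet):

containers:
  - name: elastic-agent
    image: docker.elastic.co/beats/elastic-agent:8.x.y   # illustrative tag
    env:
      # hypothetical flag proposed above; would disable add_kubernetes_metadata in the spawned Beats
      - name: ELASTIC_K8S_METADATA
        value: "false"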

What do you guys think?

@cmacknz
Member

cmacknz commented Jan 31, 2024

That is what I would call the minimum viable change. That only fixes the k8s metadata processor for users running the agent as a container (see below), and requires them to update their k8s YAML to include it. You would think the k8s configuration stays up to date, but I have seen it lag the agent version (edit: this is probably true of the features based approach as well since it defaults to on).

What is proposed in #4670 solves this problem for every global processor in every deployment model. Adding environment variables to agents deployed on individual endpoints (laptops, desktops, etc) requires mass reconfiguration of the agent outside of Fleet since the agent installation itself must be modified. Using an environment variable is really only viable in containers/IaC use cases.

I'd prefer this get implemented as described in #4670, being able to disable global processors is a pretty common request. This is the platform level solution, which I am obviously biased towards as a person who maintains agent as a platform.

That said doing it that way will take significantly longer, probably 1 month for someone who has never touched the inside of agent. The agent team itself is unlikely to be able to take that on itself in the near future either. So if you can't commit to that then the simpler solution is the way to go.

Ideally we aren't building up layers of band aids like this all over the place, but we already have a precedent for this processor with the NETINFO variable at least so it isn't new to this area.

@gsantoro
Contributor

gsantoro commented Feb 1, 2024

@cmacknz I understand that this feels like a band aid, but I don't think any of us has 1 month to spend working on this.

I have been thinking that probably we don't even need the environment variable.

If elastic-agent is already doing the work to add k8s metadata, can't we just remove add_kubernetes_metadata from the list of default processors in the code?

AFAIK Beats doesn't add this processor unless it is instructed to do so via config. So we just don't want to add that processor by default when running under elastic-agent.

@gizas
Contributor Author

gizas commented Feb 1, 2024

Sorry, late to the party. I would only add that this story will initially try to verify whether the complete removal of the add_kubernetes_metadata processor breaks anything. So indeed I would try to see whether everything is OK if we just remove it.

Of course there might be cases we cannot think of right now that we might break (and correct me on this). If the user then manually adds the processor or globally enables it (whenever that is ready), won't this be sufficient?

@cmacknz
Member

cmacknz commented Feb 1, 2024

add_kubernetes_metadata applies the metadata to each event that passes through each of the Beats. The Elastic Agent Kubernetes provider collects the same information, but does not apply it to any of the events. It makes the information available for variable substitution in the agent policy and adds it as metadata when the agent checks in with Fleet.

The only way to get the k8s metadata onto events with the agent kubernetes provider is to have an integration template the processors (or add them manually) using the data the k8s provider collected. I believe you do some of this in the k8s integration already, but removing add_kubernetes_metadata would be a breaking change for users that rely on this processor but don't use the Kubernetes integration.

I think Josh proposed this earlier, but we could have agent push the k8s metadata it collects down to each Beat using the control protocol, where they could automatically generate the equivalent static processor. Then you could remove the add_kubernetes_metadata processor and the duplicate API calls it makes without it being a breaking change, with the catch that you would still have to change the interface between agent and its sub-processes to make it work. I'd say this would also cost you a month to do correctly, but reading your goals I like it better than what is in #4670. This approach works by default with no breaking change or manual configuration required.

The extra work here is a product of the agent sub-process architecture, which we are stuck with until we can shift more of this functionality into the OTel collector embedded in the agent. That is not going to happen in the short term.

@cmacknz
Member

cmacknz commented Feb 1, 2024

This conversation is getting farther into the internals of the agent, happy to jump on a call to help close this out if that would help.

@gsantoro
Contributor

gsantoro commented Mar 4, 2024

How to run audit-logs tests locally

discussion on github.

You must check out the repo obs-infraobs-team since we will use some taskfiles and env variables.

Before starting, you need to change the params at tools/envs/elastic-agent/.env to build elastic-agent:

  • change platform to either linux/amd64 or linux/arm64. E.g. on a Mac M1 laptop I use linux/arm64.
  • I use EXTERNAL=false, or don't provide this env variable at all, so that Beats is compiled and the code runs from source.

I have an elastic-agent repo with optimizations that make the compilation faster at https://github.com/gsantoro/elastic-agent/tree/feature/default_processors. You might want to check it out locally to build elastic-agent in 2-3 minutes instead of 20 minutes.

You need to start with a clean version of the Beats repo.

Run the following commands to set up Elastic Agent with add_kubernetes_metadata enabled (no changes to the code):

# activate env variables to configure audit logs in Kind
cd tools/envs/local-auditlogs

# start k8s on Kind, Elastic via elastic-package, configure Fleet to use elastic-agent built from source
task local-managed-source:up

Keep it running for 30 minutes to get some audit logs.

You can visualize the audit logs using the dashboard at tools/apps/audits-notebook/data/kibana-dashboard.ndjson.

Now you want to change the code in beats to disable the processor add_kubernetes_metadata.

You will need to comment out the line and rebuild/rerun elastic-agent with those changes:

# activate env variables to configure audit logs in Kind
cd tools/envs/local-auditlogs

# start k8s on Kind, Elastic via elastic-package, configure Fleet to use elastic-agent built from source
task local-managed-source:reload-from-source

Keep it running for another 30 minutes and compare the results.

@gsantoro
Contributor

gsantoro commented Mar 7, 2024

Summary of a zoom call with @cmacknz and @MichaelKatsoulis

  • we agreed that we should implement a more long-term solution, as highlighted here, instead of the env variable that would solve the problem only for the k8s integration. This is because other teams (with other integrations) may want to disable/enable add_kubernetes_metadata or other processors without encoding the logic in the Beats code.
  • an alternative to mitigate the problem might be to reduce the number of Beats by grouping more than one input type in the same Beat. This might solve other problems as well, and it is something @cmacknz's team has already been thinking about. It is up to @cmacknz to discuss and implement this with his team.
  • the problem of implementing a way to configure Beat processors will be discussed by @cmacknz in order to find someone with enough time to implement this or some other solution.

@gsantoro
Contributor

I'm closing this issue for now since it's probably going to be offloaded to @cmacknz's team

@gsantoro
Contributor

Some comments about the performance implications of disabling the processor add_kubernetes_metadata.

audit logs with and without the processor

  • two sample docs, one with the processor disabled and one with it enabled
  • and a video explaining the difference in the number of requests to K8s, attached here
auditlogs_noload.mp4

@gsantoro
Contributor

To summarise the video: there are 5 fewer requests per minute to the Kubernetes API. With a bigger cluster or with more integrations installed, the impact of the change would be even bigger.

@cmacknz
Member

cmacknz commented Mar 13, 2024

I'm going to keep this open, as the problem in the description isn't technically solved yet. I will change the team label though.

@cmacknz cmacknz reopened this Mar 13, 2024
@cmacknz cmacknz added Team:Elastic-Agent Label for the Agent team and removed Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team labels Mar 13, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@cmacknz
Member

cmacknz commented May 2, 2024

It may actually be better to put this in the providers section of the policy, and kill two birds with one stone by exposing agent.providers in Fleet, which isn't supported there yet. Something like:

providers:
     event_metadata:
         # The default value would be true to avoid a breaking change when they aren't specified.
         host: # controls add_host_metadata
             enabled: true
         kubernetes: # controls add_kubernetes_metadata
             enabled: false
         cloud: # controls add_cloud_metadata
             enabled: true
         docker: # controls add_docker_metadata 
             enabled: false

@nimarezainia
Contributor

@kpollich I wanted to bring this to your attention after speaking to Craig. I think perhaps we should finally introduce the "advanced" section in the Agent Policy and add this "providers" section there. These booleans could also be generated from the config.

@kpollich
Member

kpollich commented May 3, 2024

Sounds good to me. I think this should be achievable with the advanced settings framework.

@cmacknz cmacknz changed the title Disable add_kubernetes_metadata processor on beats startup Elastic:agent Allow disabling add_x_metadata default global processors May 3, 2024
@cmacknz
Member

cmacknz commented May 3, 2024

Updated the description and transferring to the agent repository since this is really an agent feature.

@cmacknz cmacknz transferred this issue from elastic/beats May 3, 2024
@cmacknz cmacknz changed the title Elastic:agent Allow disabling add_x_metadata default global processors Allow disabling add_x_metadata default global processors May 3, 2024
@ycombinator ycombinator added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label May 17, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@cmacknz cmacknz changed the title Allow disabling add_x_metadata default global processors Allow disabling and configuring the add_x_metadata default global processors Jun 20, 2024
@pchila
Member

pchila commented Jun 24, 2024

Initial design/implementation ideas:

  • The new agent event_metadata "provider" is not really a provider, but rather a component that injects/manages component configuration (either directly in the transpiler or, more likely, as a ComponentModifier)
  • The client v2 will be modified to allow opting in to subscribing to ComponentChanges() and not just UnitChanges()
  • For Beats, when receiving configuration for the global processors at the component level, the Beat will dump the config into a fleet-component-config.yaml file in the data.path location and re-exec itself to load that fragment of configuration (it's unclear whether the global processors can be changed dynamically without restarting the Beat, since most of them are instantiated very early in the Beat's lifecycle); see the sketch below
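
Purely as a hypothetical sketch of that last point (the file name, location, and contents reflect only the idea described above; nothing is implemented):

# <data.path>/fleet-component-config.yaml (hypothetical)
# global processors the Beat should re-exec with, as received at component level
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
# add_kubernetes_metadata and add_docker_metadata omitted here because they were
# disabled via the event_metadata settings in the agent policy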

Any thoughts/feedback/observations, @cmacknz, @ycombinator?

@cmacknz
Member

cmacknz commented Jun 24, 2024

For beats, when receiving configuration for the global processors at component level, it will dump the config in a fleet-component-config.yaml file in the data.path location and re-exec itself to load that fragment of configuration (not sure it's possible to dynamically change the global providers without restarting beats since most of them are instantiated very early in the beats lifecycle)

Why do the Beats need to re-exec themselves? These processors map to global Beat processors (processors at the top level of the Beat configuration, as defined in https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html#where-valid) and should be reloadable, just as they are if you were writing the Beat configuration by hand.
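
For reference, this is roughly what those global (top-level) processors look like in a hand-written filebeat.yml; these are the same processors the policy settings above would toggle:

# filebeat.yml (standalone Beat): processors at the top level apply to all events
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~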

@pchila
Member

pchila commented Jun 25, 2024

Why do the Beats need to re-exec themselves? These processors map to global Beat processors (processors at the top level of the Beat configuration as defined in https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html#where-valid) and should be reloadable like they are if you were writing the Beat configuration by hand.

I didn't find anywhere other than the very early stages, where the pipelines are built out using the global processors...

At that point, if we want to configure those processors we need to have the configuration available, but I don't think the client is up yet (hence the idea of writing out a partial config file and re-exec'ing)... I will have another look once I have the new client in and see whether we can reconfigure the processors after we've assembled the pipeline.

I would be happy to avoid writing out a file and re-exec'ing, so if by the time the Beat starts receiving CheckinExpected messages we can still configure global processors on the fly, I will be more than happy to save myself the extra complication and hassle.

@cmacknz
Member

cmacknz commented Jun 25, 2024

You should be able to test whether hot-reloading the global processors at any time works by trying it with a standalone Beat configuration.

@pchila
Member

pchila commented Jul 23, 2024

Quick sum up of the solution and progress so far:

The solution requires modifications across 3 repositories:

  • elastic-agent: the "provider" (it's not really a provider, rather a ComponentModifier) has been roughly implemented ---> Allow configuring global processors #5052
  • elastic-agent client: modified the protobuf definitions to include a global processor config and introduced an 'opt-in' component changes publishing ---> Add global processor component config elastic-agent-client#117
  • beats: global processors are configured at a very early stage of the Beat's init and are used before the V2 client has had a chance to run and fetch configuration from agent; at this point we don't have a clear solution.

To go a bit deeper on the beats initialization problem:

  • here is where the function controlling the initialization of processors is created
  • here is the creation of the pipeline builder which contains the instances of the global processors
  • here is where the Pipeline is finally instantiated with a set of processors
  • Pipeline clients (inputs) then connect to the Pipeline and receive a copy of the global processors to invoke before queuing events for publishing. The global processors have already been set in the Pipeline during the initialization above.

To try and load global processor config I opened 2 draft PRs in the beats repo with 2 different approaches:

  1. Add a reload mechanism for global processor config, make Pipeline track all the opened clients (inputs) and, when a configuration update arrives, update itself and all the connected clients with the new set of processors in a concurrency-safe manner ---> Global processors reload beats#40091
  2. Delay initialization of the Pipeline object to give the Beat a chance to load the component configuration before spawning inputs and outputs. If an update for the global processors comes after the pipeline has been created, restart the Beat like we do for outputs ---> Global processor reload beats#40312

Neither of these Beats PRs is ready yet:

  1. Global processors reload beats#40091 still suffers from some segmentation faults during initialization and is at risk of deadlocks for the Client <---> Pipeline updates (both need mutexes to make the update safe)
  2. Global processor reload beats#40312: delaying the Pipeline initialization seems safer and the updates look easier to handle; however, there seem to be issues with the correct output initialization (the outputController for the Pipeline object needs to be looked into). The logic for restarting has been implemented and seems to behave sensibly
