[Fleet] Defining Output per integration #143905

nimarezainia · 2022-10-24T20:04:39Z

There are many legitimate reasons why an operator may need or want to send data from different integrations to different outputs within a policy. Some may even need to send individual data streams to different outputs. Currently we only allow an output to be defined on a per-policy basis. To support this request, the per-policy output definition needs to be overridden by the output defined in the integration. Our config should support this already.

Use Cases:

As an operator, I need the security logs from an agent to be sent to one Logstash instance, whereas informational logs should be sent to another Logstash instance.
We operate multiple beats on a given system and would like to migrate to using Elastic Agent. For historical and operational reasons these beats are writing data to distinct outputs. Once we migrate over to using Agent, we would like to keep the upstream pipeline intact.

elasticmachine · 2022-10-24T20:22:48Z

Pinging @elastic/fleet (Team:Fleet)

willemdh · 2023-05-03T08:03:08Z

There are many reasons why multiple agents are needed on a host. One example, which applies to the Elastic ecosystem itself: a customer typically needs to forward the Elasticsearch / Logstash / Kibana logs and metrics to a separate monitoring cluster. This is generally not possible, as there is already a set of agents running on this node to index system logs and metrics.

Elastic should really support multiple outputs per integration or provide a supported way to install and manage multiple identical agents on a system.

amitkanfer · 2023-05-03T08:35:10Z

@nimarezainia what else is needed from you on this one?

nimarezainia · 2023-05-03T09:00:00Z

@amitkanfer definition is fairly self explanatory but I need to create a mock up for the UI.

amitkanfer · 2023-05-03T09:28:27Z

Thanks @nimarezainia - once ready let's chat online and pass it to @jlind23 for development.

nicpenning · 2023-05-16T14:00:58Z

Another big reason for output per integration is when you have 20+ integrations in a specific policy, there is a good chance that some of those integrations have very different performance requirements.

The biggest need for this feature for me is the ability to set the number of workers and the bulk max size to account for a particular integration that ingests 30K events per second. We have some integrations that only receive 1-5 events per minute, so it doesn't make sense to crank up the workers and bulk max size for the whole policy, since not all integrations need that performance adjustment.

Here is a sample policy with their respective EPS and need for per integration output selection:

* Firewall - 30K EPS: 12 workers, 2500 bulk max size
* Windows Events - 3000 EPS: 4 workers, 500 bulk max size
* HTTP Input - 0.5 EPS: default
* API - 20 EPS: default
* Web Logs - 300 EPS: default

jlind23 · 2023-10-24T14:54:27Z

@nimarezainia What would be the user experience here?
Shall we display per output/policy a list of integrations that users can check to see which one is using what output?
Or shall we in the UI below offer the option to switch the output for each integration?

nimarezainia · 2023-10-26T04:18:56Z

@jlind23 I propose the following: (@zombieFox we need to discuss this also)

In the integrations settings page we need a drop down which would display the set of outputs available to the user (configured on the Fleet->settings tab). This should default to whatever output is configured in the policy for integration data. We may want to put this in the advanced settings drop down.

The agent policy page should be modified also to show a summary of what output is allocated to which integration:

nimarezainia · 2023-10-26T04:24:42Z

In scenarios where the user needs to send different data streams to different outputs, the above model still works, as the user can add two instances of the same integration to the policy. For example, with the NGINX integration:

nginx-1 instance:

enable access logs datastream
disable error logs datastream
set integration to send data to output Logstash-A

nginx-2 instance:

disable access logs datastream
enable error logs datastream
set integration to send data to output Logstash-B

zombieFox · 2023-10-26T15:38:40Z

We reviewed this in the UX sync. Looks good to go. The additions indicated above don't require design mocks.

The copy sounds right to me too, but we might want to pass it by David Kilfoyle or Karen Metts.

jen-huang · 2023-10-31T15:22:55Z

Moving this to tech definition for this sprint, if the work identified is a small amount, we'll proceed with implementation.

nchaulet · 2023-11-06T19:22:27Z

Proposed technical implementation for that

I did a small POC implementing only the API part, with some shortcuts (PR), to ensure it will work as expected, and it seems it will.

Package policy/saved object changes

We will introduce a new property named output_id to the package policy. This property will be added/updated in the following components:

Saved object
Type and schema for package policy, preconfigured package policy, and simplified package policy

We will need to validate that creating/editing a package policy output respects the same rules as agent policy outputs:

APM and Fleet Server package policies cannot use a non-ES output.
Licence restriction: it should only be available with an enterprise licence, like multiple outputs, correct? @nimarezainia
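A minimal sketch of how the two validation rules above could be checked, written as an illustration only. The type names, the function `validatePackagePolicyOutput`, the package names `apm` and `fleet_server`, and the choice to treat only `elasticsearch` as an "ES output" are all assumptions, not Fleet code. The licence check mirrors the open question in the last item.

```typescript
// Hypothetical shapes for illustration; not the actual Fleet types.
type SketchOutputType = 'elasticsearch' | 'remote_elasticsearch' | 'logstash' | 'kafka';

interface SketchOutput {
  id: string;
  type: SketchOutputType;
}

interface PackagePolicyLike {
  name: string;
  packageName: string;
  output_id?: string | null; // the new property proposed above
}

// Assumed package names for the integrations restricted to ES outputs.
const ES_ONLY_PACKAGES: string[] = ['apm', 'fleet_server'];

// Returns an error message when the assignment breaks a rule, or null when it is allowed.
function validatePackagePolicyOutput(
  policy: PackagePolicyLike,
  output: SketchOutput,
  hasEnterpriseLicence: boolean
): string | null {
  if (!hasEnterpriseLicence) {
    return 'Output per integration requires an enterprise licence';
  }
  // Assumption: "non ES output" means any type other than 'elasticsearch'.
  if (ES_ONLY_PACKAGES.indexOf(policy.packageName) !== -1 && output.type !== 'elasticsearch') {
    return `Package "${policy.packageName}" cannot use output type "${output.type}"`;
  }
  return null;
}

const apmPolicy: PackagePolicyLike = { name: 'apm-1', packageName: 'apm', output_id: 'logstash-a' };
const nginxPolicy: PackagePolicyLike = { name: 'nginx-1', packageName: 'nginx', output_id: 'logstash-a' };
const logstashOutput: SketchOutput = { id: 'logstash-a', type: 'logstash' };
const esOutput: SketchOutput = { id: 'es-default', type: 'elasticsearch' };

// APM on Logstash is rejected; APM on Elasticsearch and NGINX on Logstash are allowed.
const apmOnLogstash = validatePackagePolicyOutput(apmPolicy, logstashOutput, true);
const apmOnEs = validatePackagePolicyOutput(apmPolicy, esOutput, true);
const nginxOnLogstash = validatePackagePolicyOutput(nginxPolicy, logstashOutput, true);
```

Returning a message instead of throwing keeps the check reusable from both the create and the edit paths.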

Deleting/Editing output changes

We will have to implement the same rules as we have for agent policy:

When an output assigned to a package policy is deleted, the associated package policy will revert to using the default output
Furthermore, if an output is updated, we will increment the revision for package policies and agent policies utilizing it
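The two rules above can be sketched roughly as follows. This is an in-memory illustration under assumed shapes (`StoredPackagePolicy`, `StoredAgentPolicy`, `agentPolicyId` are hypothetical names); the real objects are Kibana saved objects, and agent policies that reference the output directly are left out for brevity.

```typescript
// Hypothetical in-memory shapes; the real objects are Kibana saved objects.
interface StoredPackagePolicy {
  id: string;
  agentPolicyId: string;
  output_id: string | null; // null means "fall back to the default output"
  revision: number;
}

interface StoredAgentPolicy {
  id: string;
  revision: number;
}

// Rule 1: deleting an output clears it from the package policies that referenced it.
function handleOutputDeleted(
  packagePolicies: StoredPackagePolicy[],
  deletedOutputId: string
): StoredPackagePolicy[] {
  return packagePolicies.map((p) =>
    p.output_id === deletedOutputId ? { ...p, output_id: null } : p
  );
}

// Rule 2: updating an output bumps the revision of the package policies using it
// and of the agent policies those package policies belong to.
function handleOutputUpdated(
  packagePolicies: StoredPackagePolicy[],
  agentPolicies: StoredAgentPolicy[],
  updatedOutputId: string
): { packagePolicies: StoredPackagePolicy[]; agentPolicies: StoredAgentPolicy[] } {
  const affectedAgentPolicyIds = packagePolicies
    .filter((p) => p.output_id === updatedOutputId)
    .map((p) => p.agentPolicyId);
  return {
    packagePolicies: packagePolicies.map((p) =>
      p.output_id === updatedOutputId ? { ...p, revision: p.revision + 1 } : p
    ),
    agentPolicies: agentPolicies.map((a) =>
      affectedAgentPolicyIds.indexOf(a.id) !== -1 ? { ...a, revision: a.revision + 1 } : a
    ),
  };
}

const storedPackagePolicies: StoredPackagePolicy[] = [
  { id: 'pp-1', agentPolicyId: 'ap-1', output_id: 'logstash-a', revision: 1 },
  { id: 'pp-2', agentPolicyId: 'ap-2', output_id: null, revision: 1 },
];
const storedAgentPolicies: StoredAgentPolicy[] = [
  { id: 'ap-1', revision: 3 },
  { id: 'ap-2', revision: 5 },
];

const afterDelete = handleOutputDeleted(storedPackagePolicies, 'logstash-a');
const afterUpdate = handleOutputUpdated(storedPackagePolicies, storedAgentPolicies, 'logstash-a');
```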

Full agent policy generation changes (aka policy sent to agents)

We need to adapt the policy sent to the agent to reflect our model change. The agent already supports this through the use_output property and already supports multiple outputs.

I tested this locally with the POC PR using multiple logs package policies, and it seems to work as expected.

The use_output field has to be populated with the package policy output ID or the default data output (code here)

kibana/x-pack/plugins/fleet/server/services/agent_policies/package_policies_to_agent_inputs.ts

Line 57 in 74509cd

use_output: outputId,
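As a rough sketch of the rule just described (package policy output ID if set, otherwise the default data output), assuming simplified hypothetical shapes rather than the real `package_policies_to_agent_inputs.ts` types; the input `id` format is made up for the example:

```typescript
// Simplified, hypothetical shapes for illustration only.
interface SketchInput {
  type: string;
  enabled: boolean;
}

interface SketchPackagePolicy {
  id: string;
  output_id?: string | null;
  inputs: SketchInput[];
}

interface SketchAgentInput {
  id: string;
  type: string;
  use_output: string;
}

function packagePoliciesToAgentInputs(
  policies: SketchPackagePolicy[],
  defaultDataOutputId: string
): SketchAgentInput[] {
  const agentInputs: SketchAgentInput[] = [];
  for (const policy of policies) {
    // Package policy output wins; otherwise use the default data output.
    const outputId = policy.output_id ?? defaultDataOutputId;
    for (const input of policy.inputs) {
      if (!input.enabled) continue;
      agentInputs.push({
        id: `${input.type}-${policy.id}`, // illustrative id format
        type: input.type,
        use_output: outputId,
      });
    }
  }
  return agentInputs;
}

const sketchInputs = packagePoliciesToAgentInputs(
  [
    { id: 'logs-1', output_id: 'logstash-b', inputs: [{ type: 'logfile', enabled: true }] },
    { id: 'logs-2', inputs: [{ type: 'logfile', enabled: true }, { type: 'udp', enabled: false }] },
  ],
  'default'
);
```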

The role permissions have to change so that we generate permissions for each output based on the package policies assigned to it, instead of one set for data and one for monitoring (code here)

kibana/x-pack/plugins/fleet/server/services/agent_policies/full_agent_policy.ts

Lines 205 to 226 in 74509cd

    
    fullAgentPolicy.output_permissions = Object.keys(fullAgentPolicy.outputs).reduce<
      NonNullable<FullAgentPolicy['output_permissions']>
    >((outputPermissions, outputId) => {
      const output = fullAgentPolicy.outputs[outputId];
      if (
        output &&
        (output.type === outputType.Elasticsearch || output.type === outputType.RemoteElasticsearch)
      ) {
        const permissions: FullAgentPolicyOutputPermissions = {};
        if (outputId === getOutputIdForAgentPolicy(monitoringOutput)) {
          Object.assign(permissions, monitoringPermissions);
        }
        if (outputId === getOutputIdForAgentPolicy(dataOutput)) {
          Object.assign(permissions, dataPermissions);
        }
        outputPermissions[outputId] = permissions;
      }
      return outputPermissions;
    }, {});
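One way the per-output permission generation could look after the change, as a sketch only. It assumes each package policy already carries its resolved output ID and its own permission block, and that permissions are keyed by package policy ID; `PermissionBlock`, `AssignedPackagePolicy` and the output type strings are hypothetical simplifications, not the real Fleet types.

```typescript
// Hypothetical, simplified shapes; the real Fleet types are richer.
type PermissionBlock = { indices: Array<{ names: string[]; privileges: string[] }> };
type PermissionsByKey = Record<string, PermissionBlock>;

interface PolicyOutput {
  type: string;
}

interface AssignedPackagePolicy {
  id: string;
  outputId: string; // already resolved: package policy output or default data output
  permissions: PermissionBlock;
}

// Assumed string values for the Elasticsearch-type outputs.
const ES_OUTPUT_TYPES: string[] = ['elasticsearch', 'remote_elasticsearch'];

function buildOutputPermissions(
  outputs: Record<string, PolicyOutput>,
  packagePolicies: AssignedPackagePolicy[],
  monitoringOutputId: string,
  monitoringPermissions: PermissionsByKey
): Record<string, PermissionsByKey> {
  const result: Record<string, PermissionsByKey> = {};
  Object.keys(outputs).forEach((outputId) => {
    // As in the existing code, only Elasticsearch-type outputs get permissions.
    if (ES_OUTPUT_TYPES.indexOf(outputs[outputId].type) === -1) return;
    const permissions: PermissionsByKey = {};
    if (outputId === monitoringOutputId) {
      Object.keys(monitoringPermissions).forEach((key) => {
        permissions[key] = monitoringPermissions[key];
      });
    }
    // New behaviour: add the permissions of every package policy assigned to this output.
    packagePolicies
      .filter((p) => p.outputId === outputId)
      .forEach((p) => {
        permissions[p.id] = p.permissions;
      });
    result[outputId] = permissions;
  });
  return result;
}

const logsPermissions: PermissionBlock = {
  indices: [{ names: ['logs-*'], privileges: ['auto_configure', 'create_doc'] }],
};
const sketchPermissions = buildOutputPermissions(
  { 'es-a': { type: 'elasticsearch' }, 'es-b': { type: 'elasticsearch' }, ls: { type: 'logstash' } },
  [
    { id: 'pp-1', outputId: 'es-a', permissions: logsPermissions },
    { id: 'pp-2', outputId: 'es-b', permissions: logsPermissions },
    { id: 'pp-3', outputId: 'ls', permissions: logsPermissions },
  ],
  'es-a',
  { monitoring: logsPermissions }
);
```

In this example `es-a` ends up with the monitoring block plus `pp-1`, `es-b` with `pp-2` only, and the Logstash output gets no entry.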

Things to verify

Ensure that the changes are compatible with all input types. It has been tested with log inputs and seems functional. cc @cmacknz
It seems that if one package policy output is broken, the input is still reported as healthy in the UI. We need to verify this and create a follow-up Elastic Agent issue if that is the case.

nimarezainia · 2023-11-07T00:02:10Z

> Package policy/saved object changes
>
> We will introduce a new property named output_id to the package policy. This property will be added/updated in the following components:
>
> * Saved object
> * Type and schema for package policy, preconfigured package policy, and simplified package policy
>
> We will need to validate that creating/editing a package policy output respects the same rules as agent policy outputs:
>
> * APM and Fleet Server package policies cannot use a non-ES output.
> * Licence restriction: it should only be available with an enterprise licence, like multiple outputs, correct? @nimarezainia

thanks @nchaulet - yes this is correct, same licensing restriction as we have for per policy output.

cmacknz · 2023-11-07T20:14:52Z

> Ensure that the changes are compatible with all input types. It has been tested with log inputs and seems functional. cc @cmacknz

We don't have any special handling for specific input types. The use_output option in the agent supports multiple outputs like this already. The only under the hood effect of multiple outputs is the possibility that the agent will run more processes than before. This will add additional queues increasing the memory usage of the agent.

For example, the following results in one logfile input process (or component in the agent model) named input-default implemented by Filebeat:

outputs:
  default:
    type: elasticsearch
    ...
inputs:
  - id: logfileA
    type: logfile
    use_output: default
    ...
  - id: logfileB
    type: logfile
    use_output: default
    ...

While the configuration below with two distinct outputs will result in two Filebeat processes/components, one named logfile-outputA and one named logfile-outputB:

outputs:
  outputA:
    type: elasticsearch
    ...
  outputB:
    type: elasticsearch
    ...
inputs:
  - id: logfileA
    type: logfile
    use_output: outputA
    ...
  - id: logfileB
    type: logfile
    use_output: outputB
    ...

You should be able to observe this directly in the output of elastic-agent status and in the set of component states reported to Fleet.

cmacknz · 2023-11-07T20:16:15Z

I should note that you only end up with additional processes when assigning inputs of the same type to different outputs. If, in the example above, there was a system/metrics input instead of logfileB, there would be no change. This is because the agent runs instances of the same input type in the same process, and is already isolating different input types into their own processes.
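The grouping behaviour described in these two comments can be modelled roughly as "one component per distinct (input type, output) pair". The sketch below is an illustration of that counting rule only, not agent code; the `<input type>-<output name>` key format is assumed from the `logfile-outputA` example above.

```typescript
// Hypothetical shape mirroring the YAML examples above.
interface PolicyInput {
  id: string;
  type: string;
  use_output: string;
}

// Groups input ids by an assumed "<input type>-<output name>" component key.
function groupIntoComponents(inputs: PolicyInput[]): Record<string, string[]> {
  const components: Record<string, string[]> = {};
  for (const input of inputs) {
    const key = `${input.type}-${input.use_output}`;
    if (!components[key]) components[key] = [];
    components[key].push(input.id);
  }
  return components;
}

// Same type, same output: a single component.
const sameOutput = groupIntoComponents([
  { id: 'logfileA', type: 'logfile', use_output: 'default' },
  { id: 'logfileB', type: 'logfile', use_output: 'default' },
]);

// Same type, different outputs: two components.
const splitOutputs = groupIntoComponents([
  { id: 'logfileA', type: 'logfile', use_output: 'outputA' },
  { id: 'logfileB', type: 'logfile', use_output: 'outputB' },
]);

// Different types: two components whether or not the outputs differ.
const mixedSameOutput = groupIntoComponents([
  { id: 'logfileA', type: 'logfile', use_output: 'outputA' },
  { id: 'metricsA', type: 'system/metrics', use_output: 'outputA' },
]);
const mixedSplitOutputs = groupIntoComponents([
  { id: 'logfileA', type: 'logfile', use_output: 'outputA' },
  { id: 'metricsA', type: 'system/metrics', use_output: 'outputB' },
]);
```

Comparing `Object.keys(...).length` across the four results shows that only the same-type, different-output case adds a component.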

jen-huang · 2023-11-14T18:58:23Z

Thanks @nchaulet, @nimarezainia, @cmacknz for the work & discussion here. Based on recent discussions about priority, I am going to kick this by a few sprints for implementation work.

BenB196 · 2023-12-13T16:58:36Z

One of the biggest drivers from our company's end on this would be APM Server, which can only support the Elasticsearch output. We mainly leverage the Logstash output for agents. This requires us to run a second Agent just for APM Server, and at scale (100+ APM Server/Elastic Agent deployments across multiple Kubernetes clusters) we end up "wasting" 500MB on each node just operating the second agent for APM, rather than being able to use our existing ones that use Logstash by default.

Depending on how you look at it, 500MB might not seem like a lot, but when you're having to operate 50-100 deployments, that is 25GB-50GB of memory. This also indirectly generates additional monitoring data from the additional agents that we need to run and be monitored.

nimarezainia · 2023-12-14T00:58:52Z

> One of the biggest drivers from our company's end on this would be APM Server, which can only support the Elasticsearch output. We mainly leverage the Logstash output for agents. This requires us to run a second Agent just for APM Server, and at scale (100+ APM Server/Elastic Agent deployments across multiple Kubernetes clusters) we end up "wasting" 500MB on each node just operating the second agent for APM, rather than being able to use our existing ones that use Logstash by default.
>
> Depending on how you look at it, 500MB might not seem like a lot, but when you're having to operate 50-100 deployments, that is 25GB-50GB of memory. This also indirectly generates additional monitoring data from the additional agents that we need to run and be monitored.

thanks @BenB196. How would you deploy the agent if you could indeed have the ability to define output per integration?

BenB196 · 2023-12-14T01:58:32Z

Hi @nchaulet currently for each Kubernetes cluster we deploy 2 DaemonSets, one that uses Logstash output and contains all normal integrations, a second which uses the Elasticsearch output and contains just APM Server. If per integration output was supported, we'd switch to deploying a single DaemonSet which uses Logstash as the default, and specifies the Elasticsearch output solely for the APM Server integration.

nicpenning · 2024-02-01T12:04:41Z

👋 just checking in on this feature! Any progress, or any details needed to get this implemented?

8.12 added the remote Elasticsearch output, which was significant! The ability to do this per integration would be very beneficial, for the reasons previously stated. Thank you!

nimarezainia · 2024-02-02T05:41:20Z

thanks @nicpenning this is still prioritized but we have other higher impacting issues to resolve. We should get to this one soon as well.

nicpenning · 2024-02-02T11:30:41Z

Thank you for the update, Nima!

jlind23 · 2024-04-03T13:17:04Z

@nimarezainia I might have missed them but do we have any UI/UX mockup for this?

nimarezainia · 2024-04-04T01:28:27Z

> @nimarezainia I might have missed them but do we have any UI/UX mockup for this?

#143905 (comment)

kpollich · 2024-06-12T17:30:03Z

Want to bump Nicolas's comment above with the necessary implementation plan as this is coming up soon in our roadmap: #143905 (comment)

karnamonkster · 2024-06-20T11:10:28Z

We really need to see this one running in our cluster, as we have made a stupid (not yet) but brave decision to move to a unified agent that will be used for all infra, security and application-specific data logging for different teams, with different ECE instances as consumers.
From a data quality perspective, governance of ECS compliance led to this decision. We cannot have anyone sending the same data over in different ways.
Of course there are exceptions, but we still aim to keep them to a minimum.
A sincere request to expedite this enhancement/feature request.

mbudge · 2024-07-16T17:47:46Z

We also need this so we can send system metrics (to avoid the Logstash 403 forbidden infinite retry issue crashing Logstash), firewall security logs and Netflow to different Logstash pipeline inputs, as they are higher throughput and we don't want them impacting Windows security log collection.

nimarezainia · 2024-07-17T00:47:40Z

We will soon have news for you all on this issue with the targeted release. Thanks for your patience.

supu2 · 2024-07-22T06:42:42Z

@nimarezainia Is there any ETA for the release? In which release will we get this feature?
Thank you so much for that integration.

## Summary

Resolves #143905.

This PR adds support for integration-level outputs. This means that different integrations within the same agent policy can now be configured to send data to different locations. This feature is gated behind `enterprise` level subscription.

For each input, the agent policy will configure sending data to the following outputs in decreasing order of priority:

1. Output set specifically on the integration policy
2. Output set specifically on the integration's parent agent policy (including the case where an integration policy belongs to multiple agent policies)
3. Global default data output set via Fleet Settings

Integration-level outputs will respect the same rules as agent policy-level outputs:

- Certain integrations are disallowed from using certain output types, attempting to add them to each other via creation, updating, or "defaulting", will fail
  - `fleet-server`, `synthetics`, and `apm` can only use same-cluster Elasticsearch output
- When an output is deleted, any integrations that were specifically using it will "clear" their output configuration and revert back to either `#2` or `#3` in the above list
- When an output is edited, all agent policies across all spaces that use it will be bumped to a new revision, this includes:
  - Agent policies that have that output specifically set in their settings (existing behavior)
  - Agent policies that contain integrations which specifically have that output set (new behavior)
- When a proxy is edited, the same new revision bump above will apply for any outputs using that proxy

The final agent policy YAML that is generated will have:

- `outputs` block that includes:
  - Data and monitoring outputs set at the agent policy level (existing behavior)
  - Any additional outputs set at the integration level, if they differ from the above
- `output_permissions` block that includes permissions for each Elasticsearch output depending on which integrations and/or agent monitoring are assigned to it
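The three-level priority order and the contents of the `outputs` block can be sketched as below. This is an illustration, not the PR's implementation: the interface and function names are hypothetical, the `data_output_id` / `monitoring_output_id` field names are assumed, and falling back to the global default data output for monitoring is a simplification.

```typescript
// Hypothetical shapes; field names are assumptions for this sketch.
interface IntegrationPolicySketch {
  id: string;
  output_id?: string | null;
}

interface AgentPolicySketch {
  data_output_id?: string | null;
  monitoring_output_id?: string | null;
}

// Priority: 1. integration policy, 2. parent agent policy, 3. global default data output.
function resolveIntegrationOutputId(
  integration: IntegrationPolicySketch,
  agentPolicy: AgentPolicySketch,
  globalDefaultDataOutputId: string
): string {
  return integration.output_id ?? agentPolicy.data_output_id ?? globalDefaultDataOutputId;
}

// Output ids for the `outputs` block: agent policy-level outputs first,
// then any integration-level outputs that differ from them.
function collectOutputIds(
  integrations: IntegrationPolicySketch[],
  agentPolicy: AgentPolicySketch,
  globalDefaultDataOutputId: string
): string[] {
  const ids: string[] = [
    agentPolicy.data_output_id ?? globalDefaultDataOutputId,
    // Simplification: monitoring falls back to the same global default here.
    agentPolicy.monitoring_output_id ?? globalDefaultDataOutputId,
  ];
  for (const integration of integrations) {
    ids.push(resolveIntegrationOutputId(integration, agentPolicy, globalDefaultDataOutputId));
  }
  // Keep the first occurrence of each id.
  return ids.filter((id, index) => ids.indexOf(id) === index);
}

const sketchAgentPolicy: AgentPolicySketch = { data_output_id: 'logstash-a' };
const sketchIntegrations: IntegrationPolicySketch[] = [
  { id: 'firewall', output_id: 'kafka-1' },
  { id: 'system' },
];

const firewallOutput = resolveIntegrationOutputId(sketchIntegrations[0], sketchAgentPolicy, 'es-default');
const systemOutput = resolveIntegrationOutputId(sketchIntegrations[1], sketchAgentPolicy, 'es-default');
const globalFallback = resolveIntegrationOutputId(sketchIntegrations[1], {}, 'es-default');
const outputIds = collectOutputIds(sketchIntegrations, sketchAgentPolicy, 'es-default');
```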
Integration policies table now includes `Output` column. If the output is defaulting to agent policy-level output, or global setting output, a tooltip is shown:

<img width="1392" alt="image" src="https://github.com/user-attachments/assets/5534716b-49b5-402a-aa4a-4ba6533e0ca8">

Configuring an integration-level output is done under Advanced options in the policy editor. Setting to the blank value will "clear" the output configuration. The list of available outputs is filtered by what outputs are available for that integration (see above):

<img width="799" alt="image" src="https://github.com/user-attachments/assets/617af6f4-e8f8-40b1-b476-848f8ac96e76">

An example of failure: ES output cannot be changed to Kafka while there is an integration

<img width="1289" alt="image" src="https://github.com/user-attachments/assets/11847eb5-fd5d-4271-8464-983d7ab39218">

## TODO

- [x] Adjust side effects of editing/deleting output when policies use it across different spaces
- [x] Add API integration tests
- [x] Update OpenAPI spec
- [x] Create doc issue

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>


nimarezainia · 2024-08-13T20:09:13Z

> @nimarezainia Is there any ETA for the release? Which release we will get that feature? Thanks you so much for that integration.

If our testing completes successfully, the target is 8.16.

amolnater-qasource · 2024-09-13T05:03:26Z

Hi Team,

We have created 7 test cases in Testmo for this feature, in the Fleet test suite under the following section:

Output per integration

Please let us know if any other scenario needs to be added from our end.

Thanks!

nimarezainia transferred this issue from elastic/elastic-agent Oct 24, 2022

nimarezainia added the Team:Fleet label Oct 24, 2022

nimarezainia self-assigned this Mar 7, 2023

jlind23 changed the title ~~Defining Output per integration~~ [Fleet]Defining Output per integration Oct 24, 2023

jlind23 changed the title ~~[Fleet]Defining Output per integration~~ [Fleet] Defining Output per integration Oct 24, 2023

nimarezainia removed their assignment Oct 31, 2023

jen-huang assigned nchaulet Oct 31, 2023

jen-huang unassigned nchaulet Nov 14, 2023

BenB196 mentioned this issue Dec 29, 2023

Provide way of disabling/configuring Leader Election Provider on Fleet Enrolled Elastic Agents elastic/elastic-agent#3968

Closed

nimarezainia mentioned this issue Apr 23, 2024

Support data tagging/add_field in Agent Policy #179915

Closed

kpollich assigned criamico Jun 18, 2024

kpollich assigned juliaElastic and jen-huang and unassigned criamico and juliaElastic Jul 2, 2024

jen-huang mentioned this issue Jul 24, 2024

[UII] Support integration-level outputs #189125

Merged

7 tasks

kilfoyle mentioned this issue Aug 2, 2024

Docs for integration-level outputs elastic/ingest-docs#1217

Closed

jen-huang closed this as completed in #189125 Aug 13, 2024

kpollich added the QA:Needs Validation label Aug 13, 2024

amolnater-qasource mentioned this issue Aug 21, 2024

Agent gets unhealthy temporarily on switching integration output to Remote ES. elastic/elastic-agent#5332

Open
