[7.16] Support multiple endpoints #965

Closed


michalpristas
Contributor

What is the problem this PR solves?

This PR fixes a problem where the agent gets unenrolled under heavier load: when the agent managing Fleet Server cannot check in to its own server, it falls back to unenrolling.
Closes #741

How does this PR solve the problem?

The problem is solved by adding an internal endpoint that is used for communication on the local network (with the agent handling Fleet Server).
It lets Fleet Server spin up two sets of handlers: one on the public port 8220 and one on a port defined in the config.

How to test this PR locally

This needs to be tested together with the corresponding elastic-agent work: elastic/beats#28993

  • Start stack
  • Install agent with FS in a policy
  • Check ports
sh-3.2# lsof -i -P | grep LISTEN | grep fleet
fleet-ser  7056            root   19u  IPv4 0xba7881a9227099a5      0t0    TCP localhost:{random_port} (LISTEN)
fleet-ser  7056            root   21u  IPv6 0xba7881a91284721d      0t0    TCP *:8220 (LISTEN)
  • Run Wireshark and set the filter to the random port; there should be some traffic
  • Set the filter to port 8220; there should be no traffic
  • Enroll a new agent from another VM
  • There should now be traffic on both ports

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

blakerouse and others added 30 commits January 20, 2021 10:07
* Add communication to Elastic Agent.

* Fix command-line args and config from Agent.

* Fix tests.

* Handle config errors on initial config from Agent.

* Log to err on failure.

* Add syncing of the log writer.

* Add docstring.

* Re-structure the log init.

* Remove unused code in logger.

* Fix logging of failed start.

* Don't become a leader until fleet.agent.id is set.

* Fixes from code review.

(cherry picked from commit d40570d)
This commit is adding execution permission to dependencies-report script
which are required during the unified release build.

(cherry picked from commit 19fd842)

Co-authored-by: Julien Mailleret <8582351+jmlrt@users.noreply.github.com>
[Backport 7.x]: Reduce actions fetching interval if the full page of action documents was fetched
…sults-backport

[Backport 7.x] Flatten .fleet-actions-results schema
* Add ssl configuration to fleet server http configuration.

* Add log message when tls disabled.

* Fix import.

* Fix integration test.

(cherry picked from commit 9d451b7)
* Moved it to testing/esutil package because it is still used for
  integration testing indices bootstrapping
* Remove saved objects code

* Make check happy
The index monitoring got broken due to replacing the original
.fleet-actions and .fleet-policies indices with aliases.

This change uses the first index global checkpoint value received from
stats and returns the error if there are more than two indices for
alias for any reason.
* Add API key invalidation on unenroll ACK.

* Fix import location.

* Fix test.

(cherry picked from commit dc762b5)
…elastic#122)

* Add optional maximum connection limit to the server.  No more than N connections will be active at any time.

* And licenses

* And notice
* Add /api/status endpoint.

* Run fmt.

* Remove version from status.

(cherry picked from commit 7cb2930)
apmmachine and others added 24 commits November 3, 2021 05:24
…ing (elastic#830)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#835)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#840)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#851)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
(cherry picked from commit 8a4855b)

Co-authored-by: Sean Cunningham <sean.cunningham@elastic.co>
This came out of a debugging session around fleet-server in which the meaning of some log messages was not clear to me.
…ing (elastic#866)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#871)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#885)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#892)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#900)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#912)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#917)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#922)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#928)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#932)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…#937)

* keep trucking on ES availability errors; more tests to come

(cherry picked from commit 7fb0138)

* don't attempt to distinguish between errors, just keep retrying

(cherry picked from commit 2c75552)

* move error blackholing up the stack so the monitor will never crash, added additional logging

(cherry picked from commit f5fead9)

* pr feedback

(cherry picked from commit 1886dc5)

* upped logging level, properly wrapped errors

(cherry picked from commit 97524dc)

Co-authored-by: bryan <bclement01@gmail.com>
…ing (elastic#941)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#947)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#950)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#956)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
…ing (elastic#961)

Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
@michalpristas michalpristas requested a review from a team as a code owner December 7, 2021 15:51
@michalpristas michalpristas self-assigned this Dec 7, 2021
@mergify
Contributor

mergify bot commented Dec 7, 2021

This pull request is now in conflicts. Could you fix it @michalpristas? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b backport_multiple_endpoints upstream/backport_multiple_endpoints
git merge upstream/master
git push upstream backport_multiple_endpoints

@mergify
Contributor

mergify bot commented Dec 7, 2021

This pull request does not have a backport label. Could you fix it @michalpristas? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 7.\d branch. \d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Dec 7, 2021
Labels
backport-skip Skip notification from the automated backport with mergify
Development

Successfully merging this pull request may close these issues.

Fleet server is unexpectedly unenrolled under load
8 participants