Vault 1.5 crashes on Linux on ARM if telemetry is enabled #9553
Comments
You can disable the gauge collection process, which is the source of this bug, by adding the configuration
to your telemetry stanza, which will let the rest of telemetry work.
Unfortunately, that workaround didn't work for me. I have the same issue with the gauge interval disabled.
Sorry, I used the wrong name. https://www.vaultproject.io/docs/configuration/telemetry says it's
And I should have looked at the documentation first. It works. This will get me by until 1.5.1. Thanks for the help.
Describe the bug
After upgrading to Vault 1.5 on Linux/ARM, I enabled telemetry in the Vault config file for the first time. When I do, the active Vault node crashes on restart with a Go panic: "panic: invalid argument to Intn". See the bottom for the rest of the error.
I have a 3-node HA cluster, so it keeps promoting a new node to active; that node then fails and hands off to the next one, around in a circle.
If I remove the telemetry stanza, it works again. If I put the telemetry stanza back with any combination of parameters, I get the same panic.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Vault should successfully start up and expose the Prometheus telemetry metrics at: /v1/sys/metrics
Environment:
Vault Server Version (retrieve with vault status): 1.5.0
Vault CLI Version (retrieve with vault version): v1.5.0
Vault server configuration file(s):
Additional context
Vault has been working well for a long time, so there were no preexisting issues. The 1.5 upgrade went completely smoothly until I enabled telemetry. I have tried a number of different telemetry stanza parameters, including just "telemetry {}", and they all behave the same way. I also moved the telemetry stanza around in the HCL file just in case, and there was no noticeable change.
Here is the rest of the error message:
systemd was also logging a number of these messages while trying to start the service, which seems to start but then crashes. Otherwise, Vault is able to start all the backends and even unseal before it crashes. These messages go away when I remove the telemetry stanza.