
Vault 1.4.0 won't start when a seal stanza is added in aws eks #8844

Closed
corbesero opened this issue Apr 24, 2020 · 16 comments
Labels: bug (Used to indicate a potential bug), core/seal

Comments

@corbesero

Describe the bug

When I add a seal stanza (awskms) to a Vault configuration via the vault-helm chart (0.5.0), Vault never becomes available in the containers.

To Reproduce

  1. I do a helm install without the seal stanza. Vaults (one active and one standby) will come up and I can unseal manually.
  2. Add the seal stanza to access an existing kms key
  3. Do a helm upgrade
  4. Delete the pods so they get recreated with the new configuration.
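The steps above, sketched as CLI commands (the release name, chart reference, and namespace here are assumptions, not taken from the report):

```shell
# 1. Initial install without the seal stanza
helm install vault hashicorp/vault -n vault -f values.yaml

# 2-3. Add the seal "awskms" stanza to the config in values.yaml, then upgrade
helm upgrade vault hashicorp/vault -n vault -f values.yaml

# 4. Delete the pods so they are recreated with the new configuration
kubectl delete pod vault-0 vault-1 -n vault
```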

Expected behavior

The vaults should come up so that I can do the unseal migrate.
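For reference, the unseal migration mentioned here is driven with the existing Shamir keys via the `-migrate` flag; a sketch, assuming the pod name and namespace from this setup (the key values are placeholders):

```shell
# Repeat with distinct Shamir keys until the key threshold is reached
kubectl exec -n vault vault-0 -- vault operator unseal -migrate <shamir-key-1>
kubectl exec -n vault vault-0 -- vault operator unseal -migrate <shamir-key-2>
kubectl exec -n vault vault-0 -- vault operator unseal -migrate <shamir-key-3>
```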

Environment:
Vault 1.4.0
AWS EKS 1.15
vault-helm chart at tag 0.5.0

Vault server configuration file(s):

disable_mlock = true
ui = true
log_level = "trace"
listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "raft" {
  path = "/vault/data"
  retry_join {
    leader_api_addr = "http://vault-0.vault-internal:8200"
  }
  retry_join {
    leader_api_addr = "http://vault-1.vault-internal:8200"
  }
}
seal "awskms" {
 region     = "us-east-1"
 kms_key_id = "7899be0b-3be8-4dd7-a5b5-32b85c02c406"
}

Also, see attached values file for helm

Additional context

No log output is produced.

This is the output of ps on the container:

/ $ ps awx | grep vault
    1 vault     0:00 /bin/sh -ec sed -E "s/HOST_IP/${HOST_IP?}/g" /vault/config/extraconfig-from-values.hcl > /tmp/storageconfig.hcl; sed -Ei "s/POD_IP/${POD_IP?}/g" /tmp/storageconfig.hcl; /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl  
    8 vault     0:00 {docker-entrypoi} /usr/bin/dumb-init /bin/sh /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
    9 vault     0:00 vault server -config=/tmp/storageconfig.hcl
  208 vault     0:00 /bin/sh
  254 vault     0:00 ps awx
  255 vault     0:00 grep vault

I have attached the output of kubectl describe pods -n vault for the vault-0 and vault-1 pods. The vault-0 output is from while the pod is still there, but the vault-1 output shows the state after a while, when the container has completely failed.

If I comment out the seal stanza, I can do a helm upgrade, delete the pods, and the new ones come up and can be unsealed.

kubectl-describe-pod-vault-0.txt
kubectl-describe-pod-vault-1.txt
values.yaml.txt

@calvn
Member

calvn commented Apr 25, 2020

@corbesero thanks for opening a separate issue to follow up on this. We've done some initial investigation on our end, and believe that it might be an issue with the instance profile not being detected correctly.

Can you create IAM Access Keys (with the proper permissions), and provide them directly to test things out?

server:
  # extraSecretEnvironmentVars is a list of extra environment variables to set with the stateful set.
  # These variables take value from existing Secret objects.
  extraSecretEnvironmentVars:
  - envName: AWS_ACCESS_KEY_ID
    secretName: vault-aws
    secretKey: AWS_ACCESS_KEY_ID
  - envName: AWS_SECRET_ACCESS_KEY
    secretName: vault-aws
    secretKey: AWS_SECRET_ACCESS_KEY

  ha:
    enabled: true
    raft:
      enabled: true
      config: |
        ui = true

        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }

        storage "raft" {
          path = "/vault/data"
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-1.vault-internal:8200"
          }
        }

        service_registration "kubernetes" {}

        seal "awskms" {
          region     = "us-east-1"
          kms_key_id = "<aws-kms-key-id>"
        }

Side note: If you're doing this in a test environment, make sure to delete the volumes that were created on the last attempt (via kubectl delete pvc <name of claim>), since helm does not do that automatically.
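Assuming the `vault-aws` secret name used in the snippet above, creating the secret and cleaning up the old claims might look like this (the namespace and claim names are guesses based on the chart's defaults):

```shell
# Create the secret referenced by the extraSecretEnvironmentVars entries
kubectl create secret generic vault-aws -n vault \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>

# Remove the volumes left over from the previous attempt
kubectl delete pvc -n vault data-vault-0 data-vault-1
```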

@corbesero
Author

I did that. I created a new AWS key pair with a policy that allows KMS access, and Vault did come up. I did not unseal it, but I saw log messages. This is not exactly an identical test, since I didn't first let it come up without the awskms seal. I will try that on Monday.

But this does imply that Vault is not happy when it depends on the instance role profile or the pod OIDC from the service account.

We were really expecting to be able to use that feature since our other EKS services use that mechanism.

@corbesero
Author

I can confirm my original scenario. I have an AWS access/secret key pair set in the helm values and the secret in the namespace. If I create a vault without the seal stanza, it can start up. If I then add the seal stanza, the new containers do come up. I was able to do the migrate, and afterwards the containers did seem to do the auto-unseal correctly.

This seems to strongly imply that vault is having a problem coming up when depending on the instance or service account profile instead of an explicit AWS key configuration.

@calvn
Copy link
Member

calvn commented Apr 27, 2020

Thanks for doing the setup to verify things! I don't want to draw conclusions yet, but it may be related to #8847 (also an issue with instance profile metadata not being picked up).

@calvn added the bug label Apr 27, 2020
@corbesero
Author

@calvn I think it is the same issue. I noticed #8847 recently too. When we were installing vault 1.3, the pods were only getting the instance profile of the worker node, not the role specified in the service account via the OIDC. Switching to 1.4 just didn't expose the underlying profile issue until I added the awskms seal, which completely broke the instance profile being used.

@inkblot

inkblot commented May 2, 2020

I opened #8847. I am also using an AWS KMS seal with 1.4.0, and vault successfully uses the ECS task role and not the EC2 instance profile to acquire AWS credentials for using KMS to unseal. Even so, credential acquisition is not working for the AWS auth backend.

@rubroboletus
Contributor

Any progress here? We have the same issue on AWS EKS 1.15 with OIDC mapped to a serviceaccount: creating a new Vault 1.4.1 cluster from scratch, deploying from the fresh git helm chart. The SA is annotated according to the EKS documentation, the pods have the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables set, tokens are mounted, and access to the EC2 instance profile is disabled (we drop any packets to 169.254.169.254).
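For context, the EKS-documented setup referred to here annotates the service account with the role to assume; that annotation is what causes AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE to be injected into the pods. A minimal sketch (the account ID and role name are placeholders):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vault
  namespace: vault
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/vault-kms-unseal
```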

@kalafut
Contributor

kalafut commented May 15, 2020

My comment on the linked issue might apply here too: #8847 (comment)

@michaeljohnalbers

michaeljohnalbers commented May 20, 2020

I'm seeing pretty much the same problem, but on a manually created Kubernetes cluster (not EKS, kops or anything, just plain EC2 instances with kubeadm). As soon as I add the awskms seal, Vault starts but outputs no logs (regardless of log level) and does not open port 8200.

I've tried attaching a completely wide open IAM role to the EC2 instance as well as using a secret key/access key pair for a role which is constrained to just the KMS operations listed in the docs. I verified these keys work with KMS operations when used with the aws cli.

Kubernetes version: 1.18.2
Vault Version: 1.4.0
Helm Chart version: 0.5.0
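For reference, the KMS operations the docs call for amount to an IAM policy along these lines (the key ARN is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:DescribeKey"
      ],
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/<kms-key-id>"
    }
  ]
}
```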

@chancez

chancez commented Jul 1, 2020

I believe #7738 fixes this

@michaeljohnalbers

@chancez I just tried the new 1.4.3 vault image and it doesn't appear to have fixed the issue. I'm still seeing the exact same symptoms as I described above. I'm also using version 0.6.0 of the Helm chart.

@chancez

chancez commented Jul 11, 2020

I had to set AWS_ROLE_SESSION_NAME to make it work with 1.4.3
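With the vault-helm chart, that variable can be passed through `server.extraEnvironmentVars`, e.g. (the session name itself is arbitrary):

```yaml
server:
  extraEnvironmentVars:
    AWS_ROLE_SESSION_NAME: vault-unseal
```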

@kalafut
Contributor

kalafut commented Jul 11, 2020

This is good info @chancez , and relates to #9415.

@tvoran
Member

tvoran commented Jul 16, 2020

Now that #9416 has been merged to fix #9415, setting AWS_ROLE_SESSION_NAME should no longer be required in vault 1.4.4 and 1.5.0 (when they're released, that is).

@tvoran
Member

tvoran commented Jul 30, 2020

Hi @corbesero, have you had a chance to try vault 1.5.0 to see if that resolves the issue? Or 1.4.3 w/AWS_ROLE_SESSION_NAME set?

@tvoran
Member

tvoran commented Aug 20, 2020

Closing for now.

@tvoran tvoran closed this as completed Aug 20, 2020