AWS auth backend client unable to use IAM credentials from ECS task metadata #8847

Closed
inkblot opened this issue Apr 25, 2020 · 5 comments
Labels: auth/aws, bug

Comments

inkblot commented Apr 25, 2020

This is a regression

Worked in version: 1.3.0
Broken in version: 1.4.0

Bug Description
I have Vault deployed as an ECS service, using an ECS task definition with an associated task role. I have an AWS auth backend configured with a client that uses the IAM credentials from ECS task metadata. This configuration was working without issues with Vault 1.3.0. After upgrading Vault to version 1.4.0, I am unable to create AWS auth backend roles. Vault is unable to resolve the ARN and produces the following output (IDs and URLs redacted or modified):

$ vault write auth/aws/role/example auth_type=iam bound_iam_principal_arn=arn:aws:iam::3xxxxxxxxxx6:role/example policies=default
Error writing data to auth/aws/role/example: Error making API request.

URL: PUT https://vault.internal:8200/v1/auth/aws/role/example
Code: 400. Errors:

* unable to resolve ARN "arn:aws:iam::3xxxxxxxxxx6:role/example" to internal ID: unable to fetch current caller: InvalidClientTokenId: The security token included in the request is invalid
	status code: 403, request id: 5xxxxxxx-5xxx-4xxx-9xxx-5xxxxxxxxxx

The backend client is apparently able to use the AWS access key ID and secret access key from ECS metadata, but not the session token, which is also required to authenticate.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy Vault 1.4.0 in AWS ECS using a task role that grants ec2:DescribeInstances, iam:GetInstanceProfile, iam:GetRole, and iam:GetUser
  2. Configure an AWS auth backend, omitting credentials from the backend client
  3. Attempt to create an AWS auth role as indicated in the bug description

Expected behavior
I expect Vault to use AWS credentials (access key, secret key, and token) from ECS metadata, successfully resolve the IAM role, and create the auth role.

Environment:

  • Vault Server Version: 1.4.0
  • Vault CLI Version: 1.4.0
  • Server Operating System/Architecture: vault docker image deployed to AWS ECS

Vault server configuration file template:

storage "consul" {
  address = "${consul_http_address}"
  path    = "${vault_consul_path}"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/ssl/certs/server.pem"
  tls_key_file  = "/ssl/private/server.pem"
}

ui = true

inkblot (Author) commented May 2, 2020

There is additional configuration via environment variables which I realize I neglected to include. I have VAULT_SEAL_TYPE set to awskms with a corresponding VAULT_AWSKMS_SEAL_KEY_ID, and I am also setting VAULT_API_ADDR and VAULT_CLUSTER_ADDR to appropriate values for each of two nodes.

kalafut (Contributor) commented May 15, 2020

Hi. Vault 1.4.0 updated the AWS SDK, and along with that began using IMDSv2. In researching this issue, I've come across various issues raised against the SDK and in other projects concerning IMDSv2 within container services. A common recommendation is to increase the response hop limit for the underlying instances. Have you tried that? This is not a desirable requirement, and it looks like there are requests into AWS to improve it. Nonetheless, it would be very useful to know whether this at least works as a workaround for now.
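For anyone who wants to try the hop-limit workaround, here is a minimal sketch using the AWS CLI. The helper name `raise_hop_limit`, the default of 2 hops, and the instance ID in the usage comment are illustrative assumptions, not values from this thread. It also assumes the AWS CLI is installed and allowed to call `ec2:ModifyInstanceMetadataOptions`.

```shell
# Sketch: raise the IMDSv2 PUT response hop limit on one EC2 container instance.
# raise_hop_limit is a hypothetical helper name; the default of 2 is an assumption.
raise_hop_limit() {
  instance_id="$1"
  hop_limit="${2:-2}"
  if [ -z "$instance_id" ]; then
    echo "usage: raise_hop_limit <instance-id> [hop-limit]" >&2
    return 1
  fi
  aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-put-response-hop-limit "$hop_limit" \
    --http-endpoint enabled
}

# Example (placeholder instance ID):
#   raise_hop_limit i-0123456789abcdef0
```

This would have to be repeated for every container instance in the cluster, which is part of why it is only a workaround.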

inkblot (Author) commented May 16, 2020

I was able to fix this, but not with the hop limit.

On my ECS hosts, I have a rule in the FORWARDING table that redirects 169.254.169.254:80 to a container running nginx that is configured as a WAF for the EC2 metadata service. This IMDS WAF is one hop from the task container at L3, where the hop limit is imposed. The WAF configuration blackholes credentials and user data, but leaves open other instance metadata like networking, AMI ID, etc. I added /latest/api/token to the blackhole list, and now Vault running as an ECS task correctly uses its ECS metadata credentials (169.254.170.2 rather than 169.254.169.254) and can resolve the IAM role when I create a new AWS auth backend role.
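To illustrate the two credential sources involved, here is a rough sketch of how one might check them from inside a task container. `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` is the variable ECS sets for tasks that have a task role. The helper names `ecs_credentials_url` and `imds_token_status` are made up for this example, and `curl` is assumed to be available in the container.

```shell
# Build the ECS task credentials URL (the 169.254.170.2 endpoint).
# ecs_credentials_url is a hypothetical helper name.
ecs_credentials_url() {
  if [ -z "${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI:-}" ]; then
    echo "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set" >&2
    return 1
  fi
  printf 'http://169.254.170.2%s\n' "$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"
}

# Print the HTTP status of an IMDSv2 token request (000 means no HTTP response).
# imds_token_status is a hypothetical helper name.
imds_token_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 \
    -X PUT 'http://169.254.169.254/latest/api/token' \
    -H 'X-aws-ec2-metadata-token-ttl-seconds: 60'
}

# Example, run inside the task container:
#   curl -s "$(ecs_credentials_url)"   # JSON with AccessKeyId, SecretAccessKey, Token
#   imds_token_status                  # 200 means the IMDSv2 token endpoint answered
```

Comparing the two results before and after a change to the metadata filtering should show which endpoint the container can actually reach.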

chancez commented Jul 1, 2020

I believe #7738 should fix this

tvoran (Member) commented Aug 20, 2020

Closing since the original issue has been resolved.

tvoran closed this as completed on Aug 20, 2020