Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for AWS STS external ID #5033

Closed
wants to merge 3 commits into from

Conversation

j-vizcaino
Copy link

This PR adds support for providing an external ID when calling sts:AssumeRole in AWS.
Related issue: #3790

Motivation

It's mostly a security concern, referred to as the Confused Deputy problem. The AWS docs best describe why this is needed, especially when assuming a role in a foreign account.
Docs are here: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html

TL;DR

  • add support for providing external_id when creating a new AWS role.
  • add support for providing external_id when assuming an AWS role, possibly overriding existing external_id for the single sts:AssumeRole call

Implementation details and backward compatibility

The existing implementation stored either an ARN or an IAM policy document internally as policy/<name>. Unfortunately, this was stored as a string, making it a bit difficult to add another field to the existing key.
Furthermore, it relied on string prefix arn: to identify further down the road if the STS call should be AssumeRole (using ARN) or GetFederationToken (using IAM policy).

The new implementation:

  • stores a JSON document inside role/<name>
  • reads from role/<name> first, then policy/<name> as a fallback. The read behaviour, previously copy/pasted, is now centralised in a getRoleConfig() function used where needed.
  • lists from role/* and policy/*
  • delete both role/<name> and policy/<name>

Why provide support for external_id during write aws/sts/<role>?

We are moving from a setup where applications called STS directly, providing external IDs pulled from a secret storage backend, to a setup where Vault is used to assume the roles. Moving external IDs into that Vault is the long-term goal, but, as with every security-related matters, this will likely take time. This option is meant for us to be able to benefit from Vault assuming roles, while using our existing external secret storage backend.

@joelthompson
Copy link
Contributor

Hey @j-vizcaino -- I'm actually doing a refactor of the AWS secret backend that solves some of the config problems you described, in #4360. I think once that is merged, it'll make this PR a lot simpler, but it looks like we've duplicated some effort :(

On the specifics here, as I commented in #3790, the generation of the external ID is really important when trying to solve the confused deputy problem, and just passing an external ID in like this would be insecure if you don't fully trust the operators who can modify the configurations of Vault roles. The most secure option would be to just have Vault generate an external ID per Vault role (or possibly per AWS mount?), and that gives the owner of the AWS role that is being assumed the ability to just trust the single Vault role (or AWS mount). By accepting external ID input like you're proposing, it becomes impossible to ever move to the trust model that I'm describing here, which worries me a little bit.

How much effort would it be for you to just have Vault generate the external ID and put that back in the target role trust policies? Another option might me to just basically "reserve" some sort of prefix for Vault-generated external IDs. That is, Vault would never allow users to specify an external ID that starts with something like "vault-generated/" and so if the feature gets added to have Vault generate the external ID, it would always prepend "vault-generated/" to something like a generated UUID and then people would know that if the external ID starts with "vault-generated/" it wasn't specified by a malicious user but rather was generated by Vault.

Thoughts?

@j-vizcaino
Copy link
Author

Hey @joelthompson, thanks for the pointers and comment!
It seems we hit the same issue on the AWS secret backend and, the current PR overlaps a little what's done in #4360 (sadly I did not see this). It seems the other PR is more ambitious and, as you said, I should rewrite this patch once it gets merged. It seems #4360 has been opened for quite some time now, do you think this is going to land soon?

Regarding external ID discussion and motivation behind providing external ID either in AWS role or at assume role call time, it seems my initial use-case description was incomplete. Here's another take at it.

Our current setup involves assuming AWS roles in AWS client accounts and we use an external ID for the security reasons described in the AWS docs. That means, we generate an external ID, provide it to the client so he can update his IAM policies accordingly, then save the ID away in a separate secure store (only containing customers credentials). As a result, for each client, applications pull the shared external ID from the secure store and call STS to assume role in the AWS client account.

The target setup is to use Vault to handle the assume role calls instead of applications and use IAM credentials to authenticate to AWS (instead of instance profile).

an external ID per Vault role (or possibly per AWS mount?)

Given the example above, having an external ID per role would provide the most flexibility.

the generation of the external ID is really important when trying to solve the confused deputy problem, and just passing an external ID in like this would be insecure if you don't fully trust the operators who can modify the configurations of Vault roles

I'm not sure I fully understand the rationale behind this: IMHO, the roles/* path should always be secured unless the attached IAM policies are restrictive enough for the AWS user or role used by Vault.
Also, I do not see much benefits in having Vault generate the external ID for us, even worse, should AWS update its external ID policies (say, allow for a longer ID), suddenly Vault would need to be patched to support this.

Another option might me to just basically "reserve" some sort of prefix for Vault-generated external IDs. That is, Vault would never allow users to specify an external ID that starts with something like "vault-generated/" and so if the feature gets added to have Vault generate the external ID, it would always prepend "vault-generated/" to something like a generated UUID and then people would know that if the external ID starts with "vault-generated/" it wasn't specified by a malicious user but rather was generated by Vault.

What do you want to protect in that case? A malicious user could easily craft a string with the same prefix and add it the IAM policies (given proper escalation). It feels like having this as the only available feature will likely paint us into a corner when the Vault external ID generation suddenly don't match the in-house, specific security rules.

If we want to make the external_id generation a feature, why not add an option to do that? Something like vault write aws/roles/foo arn=... generate_external_id=true.

This solution would:

  • allow users to generate or provide an external ID, by either using write aws/roles/foo external_id=... or write aws/roles/foo generate_external_id=true. Note: generate_external_id should be opt-in, false by default.
  • support policies generate_external_id=true for writing to aws/roles, thus enforcing usage of external ID generation for a given AWS secret backend.
  • generated external ID should be stored like a user provided external ID. Unless denied by policies, generated external ID should be able to be overwritten by a user-provided one.

I hope this clears most of the questions raised. If not, I'd be happy to discuss this and provide more details if needed. I'll be available in the Vault Gitter.

@joelthompson
Copy link
Contributor

I hope #4360 will land soon :) I let it linger for a couple months for personal reasons, but I should be able to engage fully now.

When you say, "The roles/* path should always be secured unless the attached IAM policies are restrictive enough for the AWS user or role used by Vault" I agree, but the question is, secured against what types of threats/attackers? The Vault team has talked a lot about trying to make Vault more federatable, and so if you have different mounts of the AWS secrets engine with different security classifications, then you need Vault to generate the GUIDs so that the Vault roles in one mount couldn't get access to the AWS roles that were set up to only trust external IDs in the other mount.

What do you want to protect in that case? A malicious user could easily craft a string with the same prefix and add it the IAM policies (given proper escalation).

I've tried to describe what I want to protect. It doesn't matter if a user adds the string with this prefix to the IAM role trust policy because, in my proposal, Vault would never include that string as an external ID when making an AssumeRole call unless it had been internally and automatically generated, and thus the owners of the target IAM roles can be confident that Vault itself isn't being vulnerable to confused deputy attacks to get into the IAM roles.

@j-vizcaino
Copy link
Author

Your description of the problem makes a lot of sense, I initially failed to see this. Thanks for the explanation.
I still feel that with my previous proposal we can still achieve the same goal, while letting the Vault operator choose what model fits its needs, leveraging Vault policies.

@jefferai
Copy link
Member

I think we should wait until #4360 lands and then circle back on this. I think @joelthompson has the right approach here and I'd want to better understand why we'd want to support two models.

@jefferai jefferai added this to the next-release milestone Aug 13, 2018
@bpineau
Copy link

bpineau commented Aug 22, 2018

Hi @jefferai, Jerome is absent for a few weeks, so I'll take it from here (I work with him).

Allowing users to provide external ids (even if at a later point vault implements builtin external id generation at role creation time) is still required for common cases, for instance:

  • External ids may be exchanged out of band
  • Users may have already existing external ids
  • One may want to isolate role storage, and external id storage, for security reasons (a compromised vault would gain limited privileges)

I just pushed a stripped down (and rebased) approach.

(let me know if you'd rather this, as was suggested, to reject "vault-generated/" prefixed ids, to reserve a well known prefix for potential future usages).

@bpineau bpineau force-pushed the aws-sts-external-id branch 2 times, most recently from 3c1e2cf to 1b2f78f Compare August 22, 2018 15:33
Allow users to provide an external_id at assumerole creds generation time.
@jefferai
Copy link
Member

@joelthompson Do those arguments above make sense? Do you agree that there are use-cases for both?

@tyrannosaurus-becks
Copy link
Contributor

@bpineau and @j-vizcaino , thank you for submitting this code and working on this!

I completely agree that this will be a very useful feature. I could already name one company off the cuff that could use it immediately.

I think that Vault should be generating the External ID and keeping it internally, and after that, there should be an (access controlled, of course) endpoint for reading the ID. In the AWS docs, they state multiple times that it's essential that Example Corp (who, in this case is Vault) be the one issuing the External ID.

Example Corp can use any string value they want for the ExternalId, as long as it is unique for each customer. It can be a customer account number or even a random string of characters, as long as no two customers have the same value. It is not intended to be a 'secret'. Example Corp must provide the ExternalId value to each customer. What is crucial is that it must be generated by Example Corp and not their customers.

Aside from merely quoting the text, the External ID should be generated within Vault and tightly controlled, because that is part of Vault's security promise. Having it floating around somewhere else in plain text, in my opinion, causes Vault's involvement to only give an illusion of security that doesn't actually exist.

I think you make a great point here:

Users may have already existing external ids

I wonder if in that case, it might be possible for Vault to still generate a new External ID and it would be updated elsewhere as well.

@joelthompson
Copy link
Contributor

Hey @j-vizcaino @jefferai @bpineau,

I've never disagreed with @bpineau's arguments, for a certain use case. They make perfect sense.

However, I believe that the release of Namespaces really makes the other use case more real. I haven't had the opportunity to educate myself deeply on how Namespaces are implemented, so the following might be totally wrong, and if so, please correct me. Consider the following scenario:

  • A Vault server is set up in AWS using either environment variables or an IAM instance profile to give it credentials, let's call this the "shared Vault principal."
  • Vault server admins set up two namespaces, namespace A and namespace B
  • I believe that namespace A and namespace B will, by default, both use the exact same credentials to connect to AWS -- the credentials of the shared Vault principal.
  • The namespace A admin sets up some AWS roles in her AWS account with a trust policy that allows the shared Vault principal to assume it, and sets up an external ID requirement on the role. She also configures an AWS Secret Engine role to assume the AWS role.
  • The namespace B admin can now get access to the role set up by the namespace A admin once learning the external ID by configuring a similar Vault role to assume the same AWS role because it also uses the shared Vault principal
  • Somebody who happens to learn the required external ID (and they aren't treated as a secret by AWS) could then use that external ID to assume the role via namespace B without the consent of the namespace A admin.

This can be worked around by configuring each namespace with a unique set of AWS IAM user credentials. But, I feel like there's a longer design discussion about how to properly isolate namespaced mounts from ambient configuration information (such as AWS credentials supplied either via environment variables or IAM Instance Profiles), and the usage of external IDs as a way of achieving that isolation. I have some very rudimentary thoughts about how this might be achieved in the context of the AWS backends, but nothing that I'm really ready yet to write down.

My worry is that accepting this PR as is doesn't really allow for this additional usecase inside of namespaces, and further, there's a certain degree of lock-in associated with it. That is, if Vault today doesn't accept user-supplied external IDs that start with "vault-generated/" but tomorrow does, nobody cares. However, if today Vault does accept user-supplied external IDs that start with "vault-generated/" but tomorrow it doesn't, that's a backwards-incompatible, breaking change. So it seems better, IMHO, to add a few lines of code now to reserve a prefix that can be used later, than it is to preclude the usage of this prefix (or, at least, preclude it absent a breaking change). If, say, in a year or two there is no demand for having Vault generate the prefix, then the restriction in the code can be pretty trivially deleted.

If we assume Vault is to control the external ID, then I think @tyrannosaurus-becks has some great points. There should be an endpoint to read it, and also an endpoint to rotate it if needed. However, I also recognize that the requirement that Vault manage the external IDs, while it might have security benefits, would represent a barrier to adoption for customers who have pre-existing external IDs that they need to integrate into Vault, and it would significantly complicate any rotation as it would need to be coordinated with the owners of the target roles.

p.s. This is minor, but I want to offer a point of disagreement on this:

a compromised vault would gain limited privileges

A compromised Vault could listen to requests and learn external IDs that are supplied to it, so over time, this distinction doesn't matter that much.

@jefferai
Copy link
Member

@joelthompson is correct with his assertions around ambient credential information. One thing I'll add there is that "namespace admin" only exists within the context of "having appropriate policies to do things". It is possible that an uber-admin in the root namespace can simply never give out privileges for configuring the AWS credentials, allowing delegated admins to only configure, say, roles. The uber-admin can therefore override the credentials in each namespace's mount with static credentials; add in root credential rotation and this seems like a relatively viable solution, provided people actually realize the problem.

@joelthompson
Copy link
Contributor

Ah. Another thing that would be nice in this context then is, rather than having an AWS access key and secret key configured, give it an AWS IAM role to assume and use that to get credentials for other operations (similar to how STS operates in the AWS auth method). Then the uber-admin could just force the usage of that role.

However, the uber-admin would need to make sure the delegated admin can't create a sub-namespace which then is able to access the ambient credentials, but instead is forced to at least "start with" the credentials of the parent namespace (e.g., I could imagine namespace A would want to use role Foo, and namespace A/B would want to use role Bar and Bar is assumable only by Foo).

Is there any way to block a namespace from accessing ambient credential information?

Another thought -- maybe a way to force a namespace to always use credentials that are provided by a role in the AWS secret engine in the parent namespace? That feels a bit more natural.

Anyway, I realize that this is really far off on a tangent about namespaces rather than talking about using external IDs, maybe better to move this discussion elsewhere?

@tyrannosaurus-becks
Copy link
Contributor

@joelthompson I think you make excellent points and I am persuaded. Also, I'm going to do a quick add to our docs around Namespaces pointing out the issues with ambient credentials.

@jefferai
Copy link
Member

Is there any way to block a namespace from accessing ambient credential information?

There isn't. I'm not sure how we could, to be honest, but we could think about it. The answer may just be "there isn't" -- with another answer of "to prevent this from happening, never use IAM creds for any of them, and don't give those creds permission, and instead give actual static creds and let the endpoint rotate it". With the rotation it feels like this seems a valid approach.

Another thought -- maybe a way to force a namespace to always use credentials that are provided by a role in the AWS secret engine in the parent namespace? That feels a bit more natural.

Maybe but it's also a child namespace reaching into a parent, which is not something we allow right now.

@chrishoffman chrishoffman modified the milestones: 0.11.2, next-release Oct 1, 2018
@vishalnayak
Copy link
Member

Adding some more thoughts into this discussion. Note that some of these points are in contrast with what is being proposed/discussed. The goal here only is to add more angles to the thought process in order to land on an informed decision, and not to argue against any previously mentioned ideas.

  1. Vault generating external IDs
  • AWS explicitly mentions that external IDs are not treated to be secrets, and it also gives flexibility to the callers (ExampleCorp) to choose IDs of choice as long as they are distinct.

  • External IDs are to be synced with AWS out-of-band. Meaning there is no scope here to not trust the operators. If Vault generates the external IDs or if something else generates it, those will definitely fall in the hands of the operators. In other words, Vault will never have a way to know for sure if an external ID is compromised or not.

  • Vault can restrict access to external IDs, through ACLs, sure. But, the same can also be achieved by storing the external IDs in the KV store and ACLing that. Hence, I don't see a big value in Vault's AWS role generating external IDs. It could complicate adoption and integration, and remove the flexibility to use already existing unique customer identifiers.

  • Also this pops a question: Vault's AWS role supports multiple ARNs to be provided. If Vault's role generates external IDs, are we talking about restricting the role to use only one ARN and with one external ID? If yes, this is also a point that acts against generating external IDs in Vault.

  1. Is Vault an ExampleCorp?

It seems to me that Vault is not an ExampleCorp. Vault is supporting multiple applications get AssumeRole credentials through supplied external ID (which Vault has no option but to simply trust), and that seems reasonable.

  1. Can Vault become a Confused Deputy?

Vault's responsibility is to get AssumeRole credentials using the supplied external ID. A malicious client supplying a compromised external ID doesn't not make Vault a confused deputy IMHO (unless someone thinks otherwise). That's a different problem and must be handled differently (by tightly controlling access to external IDs, wherever they are).

Maybe #5329 is doing just what is supposed to be done.

@joelthompson
Copy link
Contributor

My point is that different Vault namespaces using the same ambient credentials essentially turns Vault into ExampleCorp, with each namespace corresponding to a tenant of ExampleCorp, and this can make Vault a confused deputy.

  1. Is Vault an ExampleCorp?
    It seems to me that Vault is not an ExampleCorp. Vault is supporting multiple applications get AssumeRole credentials through supplied external ID (which Vault has no option but to simply trust), and that seems reasonable.

I would rephrase that as: Vault is supporting multiple tenants getting AssumeRole credentials through the supplied external ID. That's what can make Vault like an ExampleCorp

  1. Can Vault become a Confused Deputy?
    Vault's responsibility is to get AssumeRole credentials using the supplied external ID. A malicious client supplying a compromised external ID doesn't not make Vault a confused deputy IMHO (unless someone thinks otherwise). That's a different problem and must be handled differently (by tightly controlling access to external IDs, wherever they are).

I would rephrase that as: Vault's responsibility is to get AssumeRole credentials using the supplied external ID and using the credentials it already has. When multiple tenants/namespaces share the same credentials (which might happen in the case of ambient credentials), that's what allows Vault to become a confused deputy, because one tenant can point Vault to a different tenant's role and assume those credentials using the same External ID.

@chrishoffman chrishoffman modified the milestones: next-release, 0.12 Oct 9, 2018
@briankassouf briankassouf removed this from the 1.0 milestone Nov 28, 2018
@briankassouf briankassouf added this to the 1.0.1 milestone Nov 28, 2018
@flyinbutrs
Copy link

To add (yet another) voice and opinion to the discussion, it seems to me that the place this would be most useful would be at aws/roles/<role> rather than at aws/sts/<role>. That way the external ID would be stored by the vault admin when creating the cross account role. This would allow a vault client to access cross account roles that are secured by an external_id without needing to retrieve the external ID from another source, or indeed without even needing to know about it.

A vault client should be allowed to access an endpoint of aws/sts/<role> by policy, and should not have to know or care if the role is secured by an external ID. It should only need to know that it needs credentials from that endpoint.

@jefferai jefferai modified the milestones: 1.0.1, 1.0.2 Dec 12, 2018
@chrishoffman chrishoffman modified the milestones: 1.0.2, 1.0.3 Jan 7, 2019
@hashicorp-cla
Copy link

hashicorp-cla commented Jan 15, 2019

CLA assistant check
All committers have signed the CLA.

@jefferai jefferai modified the milestones: 1.0.3, 1.1 Feb 1, 2019
@jefferai jefferai modified the milestones: 1.1, near-term, next-release Feb 14, 2019
@tyrannosaurus-becks tyrannosaurus-becks self-assigned this Dec 20, 2019
@tyrannosaurus-becks
Copy link
Contributor

Hi! I am in favor of moving this forward and would love to pick this back up. Aside from discussing this theoretically, this user has a valid use case. Also, the external_id would just be an optional field that people can use if needed.

@j-vizcaino if you are still interested in this, would you be willing to:

  • Merge in Vault master to clear merge conflicts?
  • Add one positive and one negative test covering what happens when the external ID is and isn't provided?

Thank you so much! If you don't end up circling back, totally understandable.

@j-vizcaino
Copy link
Author

Hey @tyrannosaurus-becks, I'll try to squeeze some time and update the PR.
Regarding tests, adding tests cases where external ID is not provided should not be necessary as I suspect this should already be covered by the existing ones (external ID being optional).

@seh
Copy link

seh commented Feb 20, 2020

I'm with @flyinbutrs here (#5033 (comment)): The external ID belongs in aws/roles, ideally paired with each of the IAM role ARNs. Those come in as a list today, so pairing these would be awkward. The Vault operator who exposes the ability to assume a given IAM role should constrain the external ID to use in that assumption, if any. If Vault needed to be able to assume the same IAM role using different external IDs, I'd expect an operator to register that same IAM role with Vault multiple times, once for each external ID.

@nouseforaname
Copy link

Very interested in this

@fairclothjm
Copy link
Contributor

@j-vizcaino Hi, thanks for the contribution. We decided to close this in favor of #18813

@fairclothjm fairclothjm closed this Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.