Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

Closed
wyike opened this issue May 5, 2023 · 6 comments · Fixed by #4250
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.


@wyike
Contributor

wyike commented May 5, 2023

/kind feature

Describe the solution you'd like

Regarding #4037 support, I would propose to set the default HTTPPutResponseHopLimit to 2 in container environment.

When customers use an instance profile role instead of base64-encoded AWS credentials (very typical in production environments), the CAPA container needs 2 hops to retrieve AWS credentials from the metadata service. If the default hop limit is 1, CAPA fails to get credentials and fails at the very beginning with:

E0430 03:18:00.552022       1 controller.go:329] "Reconciler error" err=<
	.spec.vpc.id is set but VPC resource is missing in AWS; failed to describe VPC resources. (might be in creation process): failed to query ec2 for VPCs: NoCredentialProviders: no valid providers in chain. Deprecated.
		For verbose messaging see aws.Config.CredentialsChainVerboseErrors
 > controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="tkg-system/tkg-mgmt-aws-pvbgc" namespace="tkg-system" name="tkg-mgmt-aws-pvbgc" reconcileID=8e5f1348-7f16-4441-9d18-a4b141eb973b

If we set HTTPPutResponseHopLimit to 2 by default, it will avoid this failure for CAPA and for other applications that need to access AWS. Otherwise, customers need to set it explicitly in the AWSMachineTemplate:

  template:
    spec:
      instanceMetadataOptions:
        httpPutResponseHopLimit: 2

They are very likely to forget this, or not be aware of it, and end up with a failed environment.

Another benefit is that customers don't need to change the AWSMachineTemplate as often because of the hop limit issue in production environments. As we know, the template is immutable, and moving to a new machine template is a burden.
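A minimal sketch of what the proposed defaulting could look like, assuming a simplified options struct. The `InstanceMetadataOptions` type, the `SetDefaults` method, and the `defaultHopLimit` constant below are hypothetical stand-ins for illustration, not CAPA's actual code:

```go
package main

import "fmt"

// InstanceMetadataOptions is a simplified, hypothetical stand-in for the
// instanceMetadataOptions block of a machine spec (assumption, not CAPA's type).
type InstanceMetadataOptions struct {
	// HTTPPutResponseHopLimit is the IMDS PUT response hop limit; 0 means unset.
	HTTPPutResponseHopLimit int64
}

// defaultHopLimit is the value proposed in this issue for container environments.
const defaultHopLimit int64 = 2

// SetDefaults fills in the hop limit only when the user left it unset,
// so an explicit value in a machine template is preserved.
func (o *InstanceMetadataOptions) SetDefaults() {
	if o.HTTPPutResponseHopLimit == 0 {
		o.HTTPPutResponseHopLimit = defaultHopLimit
	}
}

func main() {
	unset := &InstanceMetadataOptions{}
	unset.SetDefaults()
	fmt.Println("unset becomes:", unset.HTTPPutResponseHopLimit)

	explicit := &InstanceMetadataOptions{HTTPPutResponseHopLimit: 1}
	explicit.SetDefaults()
	fmt.Println("explicit stays:", explicit.HTTPPutResponseHopLimit)
}
```

Defaulting only when the field is unset keeps existing templates that deliberately pin the hop limit to 1 working as before.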

I also see that setting HTTPPutResponseHopLimit to 2 is recommended in container environments:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#imds-considerations

To avoid problems with instance metadata retrieval, consider the following:
The AWS SDKs use IMDSv2 calls by default. If the IMDSv2 call receives no response, the SDK retries the call and, if still unsuccessful, uses IMDSv1. This can result in a delay. In a container environment, if the hop limit is 1, the IMDSv2 response does not return because going to the container is considered an additional network hop. To avoid the process of falling back to IMDSv1 and the resultant delay, in a container environment we recommend that you set the hop limit to 2

https://aws.amazon.com/about-aws/whats-new/2020/08/amazon-eks-supports-ec2-instance-metadata-service-v2/

Now, newly launched and any updated EKS managed node groups will be configured with a metadata token response hop limit set to 2. For self-managed nodes, CloudFormation templates and eksctl have been updated to launch nodes by default with a hop limit of 2.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-priority labels May 5, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 5, 2023
@wyike
Contributor Author

wyike commented May 5, 2023

/assign

@Skarlso
Contributor

Skarlso commented May 6, 2023

This wasn't fixed yet, right?

@wyike
Contributor Author

wyike commented May 7, 2023

> This wasn't fixed yet, right?

I think it's more like a feature enhancement :)

@Skarlso
Contributor

Skarlso commented May 7, 2023

Yes, what I meant to ask is whether it is done or not... :D

@wyike
Contributor Author

wyike commented May 8, 2023

ah, SORRY...my eyes 😵...
I'll submit the commit.

3 participants