Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

wyike · 2023-05-05T01:16:02Z

/kind feature

Describe the solution you'd like
[A clear and concise description of what you want to happen.]

Regarding #4037 support, I would propose to set the default HTTPPutResponseHopLimit to 2 in container environment.

When customers is using instance profile role instead of using base64 aws credentials (very typical usage in production env), capa container needs 2 hops to retrieve aws credentials from metadata service. If default hop limit is 1, capa fails to get credentials and fail at the first with:

E0430 03:18:00.552022       1 controller.go:329] "Reconciler error" err=<
	.spec.vpc.id is set but VPC resource is missing in AWS; failed to describe VPC resources. (might be in creation process): failed to query ec2 for VPCs: NoCredentialProviders: no valid providers in chain. Deprecated.
		For verbose messaging see aws.Config.CredentialsChainVerboseErrors
 > controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="tkg-system/tkg-mgmt-aws-pvbgc" namespace="tkg-system" name="tkg-mgmt-aws-pvbgc" reconcileID=8e5f1348-7f16-4441-9d18-a4b141eb973b

If we set HTTPPutResponseHopLimit to 2 as default, it will avoid capa failure and other applications that needs to access AWS. Otherwise we need customers to set the awsmachine template explicitly:

  template:
    spec:
      instanceMetadataOptions:
        httpPutResponseHopLimit: 2

They are very likely to forget or not aware of this knowledge and get a failed env.

Another benefit is customers don't need to change awsmachinetemplate very often due to the HopLimit issue in production env , which as we known, is immutable and it is a burden to update to a new machinetemplate.

I also see HTTPPutResponseHopLimit to 2 is recommended in container environment:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#imds-considerations

To avoid problems with instance metadata retrieval, consider the following:
The AWS SDKs use IMDSv2 calls by default. If the IMDSv2 call receives no response, the SDK retries the call and, if still unsuccessful, uses IMDSv1. This can result in a delay. In a container environment, if the hop limit is 1, the IMDSv2 response does not return because going to the container is considered an additional network hop. To avoid the process of falling back to IMDSv1 and the resultant delay, in a container environment we recommend that you set the hop limit to 2

https://aws.amazon.com/about-aws/whats-new/2020/08/amazon-eks-supports-ec2-instance-metadata-service-v2/

Now, newly launched and any updated EKS managed node groups will be configured with a metadata token response hop limit set to 2. For self-managed nodes, CloudFormation templates and eksctl have been updated to launch nodes by default with a hop limit of 2.

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

Cluster-api-provider-aws version:
Kubernetes version: (use kubectl version):
OS (e.g. from /etc/os-release):

The text was updated successfully, but these errors were encountered:

k8s-ci-robot · 2023-05-05T01:16:10Z

This issue is currently awaiting triage.

If CAPA/CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

wyike · 2023-05-05T09:53:19Z

/assign

Skarlso · 2023-05-06T10:20:05Z

This wasn't fixed yet, right?

wyike · 2023-05-07T00:43:34Z

This wasn't fixed yet, right?

I think it's more like a feature enhancement :)

Skarlso · 2023-05-07T15:39:15Z

Yes, what I meant to say is that if it is done or not... :D

wyike · 2023-05-08T00:42:47Z

ah, SORRY...my eyes 😵...
I'll submit the commit.

k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-priority labels May 5, 2023

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 5, 2023

wyike mentioned this issue May 5, 2023

Fix InstanceMetadataOptions defaults #4147

Merged

4 tasks

k8s-ci-robot assigned wyike May 5, 2023

wyike mentioned this issue May 8, 2023

Set httpPutResponseHopLimit to 2 when creating instances #4250

Merged

4 tasks

k8s-ci-robot closed this as completed in #4250 May 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

wyike commented May 5, 2023 •

edited

Loading

k8s-ci-robot commented May 5, 2023

wyike commented May 5, 2023

Skarlso commented May 6, 2023

wyike commented May 7, 2023

Skarlso commented May 7, 2023 •

edited

Loading

wyike commented May 8, 2023 •

edited

Loading

Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

Set httpPutResponseHopLimit to 2 in instanceMetadataOptions as default when creating instance #4247

Comments

wyike commented May 5, 2023 • edited Loading

k8s-ci-robot commented May 5, 2023

wyike commented May 5, 2023

Skarlso commented May 6, 2023

wyike commented May 7, 2023

Skarlso commented May 7, 2023 • edited Loading

wyike commented May 8, 2023 • edited Loading

wyike commented May 5, 2023 •

edited

Loading

Skarlso commented May 7, 2023 •

edited

Loading

wyike commented May 8, 2023 •

edited

Loading