🐛 Update awsmachine providerID and instanceID immediately after ec2:RunInstances is called #4670

Merged
merged 1 commit into from
Dec 11, 2023

Conversation

mjlshen
Contributor

@mjlshen mjlshen commented Nov 30, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:
This mitigates issues caused by falling back to tag-based searching for EC2 instances when a subsequent AWS call within CreateInstance() fails, such as attaching ENIs to security groups or tagging ENIs. The fallback is taken in findInstance whenever the ProviderID is still empty:

func (r *AWSMachineReconciler) findInstance(scope *scope.MachineScope, ec2svc services.EC2Interface) (*infrav1.Instance, error) {
	var instance *infrav1.Instance
	// Parse the ProviderID.
	//nolint:staticcheck
	// Usage of noderefutil pkg would be removed in a future release.
	pid, err := noderefutil.NewProviderID(scope.GetProviderID())
	if err != nil {
		//nolint:staticcheck
		// Usage of noderefutil pkg would be removed in a future release.
		if !errors.Is(err, noderefutil.ErrEmptyProviderID) {
			return nil, errors.Wrapf(err, "failed to parse Spec.ProviderID")
		}
		// If the ProviderID is empty, try to query the instance using tags.
		// If an instance cannot be found, GetRunningInstanceByTags returns empty instance with nil error.
		instance, err = ec2svc.GetRunningInstanceByTags(scope)
		if err != nil {
			return nil, errors.Wrapf(err, "failed to query AWSMachine instance by tags")
		}
	} else {
		// If the ProviderID is populated, describe the instance using the ID.
		// InstanceIfExists() returns error (ErrInstanceNotFoundByID or ErrDescribeInstance) if the instance could not be found.
		//nolint:staticcheck
		instance, err = ec2svc.InstanceIfExists(pointer.String(pid.ID()))
		if err != nil {
			return nil, err
		}
	}

	return instance, nil
}

Which issue(s) this PR fixes:
Related to #4629

Special notes for your reviewer:

This PR focuses on mitigating the impact when the controller fails to tag ENIs after creating EC2 instances. A separate PR that makes the fallback tag-based searching more robust is in progress in #4630.

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Update AWSMachine providerID and instanceID earlier to minimize scenarios where tag-based searching is needed

…Instances is called

This mitigates issues caused by falling back to tag-based searching for
EC2 instances in case future AWS calls fail, such as attaching ENIs to
security groups or tagging ENIs.

Signed-off-by: Michael Shen <mishen@umich.edu>
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Nov 30, 2023
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 30, 2023
@enxebre
Member

enxebre commented Nov 30, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 30, 2023
@mjlshen
Contributor Author

mjlshen commented Dec 4, 2023

/assign killianmuldoon

Contributor

@AndiDog AndiDog left a comment

Nice improvement!

/lgtm

@richardcase
Member

/test ?

@k8s-ci-robot
Contributor

@richardcase: The following commands are available to trigger required jobs:

  • /test pull-cluster-api-provider-aws-build
  • /test pull-cluster-api-provider-aws-test
  • /test pull-cluster-api-provider-aws-verify

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-provider-aws-apidiff-main
  • /test pull-cluster-api-provider-aws-e2e
  • /test pull-cluster-api-provider-aws-e2e-blocking
  • /test pull-cluster-api-provider-aws-e2e-clusterclass
  • /test pull-cluster-api-provider-aws-e2e-conformance
  • /test pull-cluster-api-provider-aws-e2e-conformance-with-ci-artifacts
  • /test pull-cluster-api-provider-aws-e2e-eks
  • /test pull-cluster-api-provider-aws-e2e-eks-gc
  • /test pull-cluster-api-provider-aws-e2e-eks-testing

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-provider-aws-apidiff-main
  • pull-cluster-api-provider-aws-build
  • pull-cluster-api-provider-aws-test
  • pull-cluster-api-provider-aws-verify

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@richardcase
Member

/test pull-cluster-api-provider-aws-e2e

@dlipovetsky
Contributor

/lgtm

Can we please keep #4629 open after merging this PR? I don't think we have addressed the root cause there, but this is a welcome improvement nevertheless!

@mjlshen
Contributor Author

mjlshen commented Dec 11, 2023

/test pull-cluster-api-provider-aws-e2e

@dlipovetsky
Contributor

We discussed this at today's office hours.

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dlipovetsky

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 11, 2023
@k8s-ci-robot k8s-ci-robot merged commit 5780796 into kubernetes-sigs:main Dec 11, 2023
25 of 26 checks passed
@mjlshen mjlshen deleted the 4629-status branch December 11, 2023 18:31
@damdo
Member

damdo commented Jan 11, 2024

@richardcase do you think we could add this to the upcoming 2.4.0 and 2.3.2 milestones?
