Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use symmetric return path for non-VPC traffic #1460

Closed
wants to merge 2 commits into from
Closed

use symmetric return path for non-VPC traffic #1460

wants to merge 2 commits into from

Conversation

kishorj
Copy link
Contributor

@kishorj kishorj commented May 11, 2021

Configure permissive rp_filter on the secondary ENIs and use symmetric
return path for non-VPC traffic ingress from the seconary ENIs.

What type of PR is this?
bug

Which issue does this PR fix:
Fixes #1392

What does this PR do / Why do we need it:
When external SNAT is disabled, this PR does the following

  • configure permissive RP filter for the secondary ENIs as well
  • setup iptables rules to mark ingress traffic from secondary ENIs
  • setup ip rules to send the return traffic through the incoming interface

This fix is needed in order to support non-VPC traffic ingress from the secondary ENIs, for example in case of NLB with IP targets and preserve client IP enabled.

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
N/A

Testing done on this change:

  • ran the automated e2e tests
  • Secondary connmark can be configured via env variable on the aws-node ds
    • ip route rules get reconciled with the new secondary connmark
    • iptables mangle chain will contain the rules with the old connmark as well, manual cleanup or node restart required
  • With External SNAT disabled
    • pods are able to initiate connections to hosts outside the VPC, traffic is through the primary interface
    • in case of NLB-IP, verify traffic flows as expected for client IP preservation enabled/disabled, pods on primary/secondary interfaces
    • instance mode NLB works as expected
    • in cluster traffic works as expected
    • enabled security group per pod, verified traffic works as expected
  • With external SNAT is enabled
    • pod specific ip route rule with the fwmark do not get configured.
    • instance mode NLB works fine with IP preserve enabled/disabled, pods on primary/secondary interfaces
    • IP mode NLB works as expected with IP preserve enabled/disabled, pods on primary/secondary interfaces
    • pod can communicate to hosts within or outside VPC
    • enabled security group per pod, verified traffic flows

Automation added to e2e:

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
upgrading to new version on a running cluster works fine

Does this change require updates to the CNI daemonset config files to work?:

Added secondaryENIConnmark configuration in the file 10.aws.conflist for cni plugin to pickup the configured connmark.

Does this PR introduce any user-facing change?:

Fixes the issue with non-VPC traffic ingress from secondary ENIs when external SNAT feature is not used.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@kishorj kishorj marked this pull request as draft May 11, 2021 07:52
@kishorj kishorj closed this May 11, 2021
@kishorj kishorj deleted the symmertic-path branch May 11, 2021 21:55
@kishorj kishorj restored the symmertic-path branch May 11, 2021 21:59
@kishorj kishorj reopened this May 11, 2021
Configure permissive rp_filter on the secondary ENIs and use symmetric
return path for non-VPC traffic ingress from the seconary ENIs.
@kishorj kishorj marked this pull request as ready for review May 20, 2021 17:46
@@ -42,13 +42,18 @@ for b in $PLUGIN_BINS; do
done

# Configure rp_filter
echo "Configure default rp_filter loose... "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below set of changes will set rp_filter to 2 on all the interfaces and that might be a concern because we currently allow Cx to specify via tags the ENIs they don't want VPC CNI to manage. Same is true for P4d instances that support multiple network cards. Although, we only manage interfaces belonging to network card 0 this change will modify the settings on interfaces tied to non-zero network cards as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any extra information that we can use to reliably ignore the interfaces, for example via IMDS? Since we are setting the default rp_filter, any new interfaces created after this point gets the permissive rp_filter configuration.

@achevuru
Copy link
Contributor

achevuru commented May 25, 2021

With this change, we will start marking the packets ingressing via secondary ENIs with 0x100 by default and that might be a breaking change for some customers if the connamrk value coincides with any connmark values they are currently using. It is good that we do a provide an option for users to choose their own secondary connmark value but I feel it might be good if we let users explicitly enable this feature/option via env variable. By doing so, they know they are enabling the connmark on packets ingressing through secondary ENIs and if they want to change the default connmark value VPC CNI uses they can go ahead and change it. Right now, we do it by default and unless someone reads the release notes they might not be aware of this new change...

@kishorj
Copy link
Contributor Author

kishorj commented May 25, 2021

With this change, we will start marking the packets ingressing via secondary ENIs with 0x100 by default and that might be a breaking change for some customers if the connamrk value coincides with any connmark values they are currently using. It is good that we do a provide an option for users to choose their own secondary connmark value but I feel it might be good if we let users explicitly enable this feature/option via env variable. By doing so, they know they are enabling the connmark on packets ingressing through secondary ENIs and if they want to change the default connmark value VPC CNI uses they can go ahead and change it. Right now, we do it by default and unless someone reads the release notes they might not be aware of this new change...

This is not a feature, but a bug fix. Something that should have worked but didn't. Having a separate flag to enable the fix specifically would be burdensome for a far more users than the ones that might have used the 0x100 connmark on their setup previously.

@kishorj
Copy link
Contributor Author

kishorj commented May 28, 2021

closing in favor of #1475

@kishorj kishorj closed this May 28, 2021
@kishorj kishorj deleted the symmertic-path branch June 8, 2021 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Traffic from outside VPC does not reach pod
2 participants