Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup: remove send_redirects sysctl configuration #1364

Merged
merged 1 commit into from
Oct 20, 2020

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Oct 13, 2020

With L3 flows, Pod-to-Pod traffic will no longer go through gateway interfaces. For Pod-to-Service traffic, Linux doesn't allow sending redirect packets for NAT'd connections. Therefore, setting send_redirects is no longer needed.

For #1363

@antrea-bot
Copy link
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@tnqn
Copy link
Member Author

tnqn commented Oct 13, 2020

/test-all

@codecov-io
Copy link

codecov-io commented Oct 13, 2020

Codecov Report

Merging #1364 into master will increase coverage by 0.10%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1364      +/-   ##
==========================================
+ Coverage   64.36%   64.46%   +0.10%     
==========================================
  Files         159      159              
  Lines       12674    12654      -20     
==========================================
  Hits         8157     8157              
+ Misses       3665     3648      -17     
+ Partials      852      849       -3     
Flag Coverage Δ
#integration-tests 45.05% <ø> (+0.19%) ⬆️
#kind-e2e-tests 50.24% <ø> (-0.15%) ⬇️
#unit-tests 42.12% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/route/route_linux.go 59.88% <ø> (+1.59%) ⬆️
pkg/apis/controlplane/v1beta1/sets.go 36.76% <0.00%> (-2.95%) ⬇️
pkg/agent/controller/networkpolicy/priority.go 87.85% <0.00%> (-2.15%) ⬇️
pkg/apiserver/storage/ram/store.go 78.94% <0.00%> (-1.51%) ⬇️
...ntroller/networkpolicy/networkpolicy_controller.go 80.56% <0.00%> (-0.27%) ⬇️
pkg/agent/proxy/proxier.go 74.80% <0.00%> (ø)
...agent/controller/traceflow/traceflow_controller.go 0.00% <0.00%> (ø)
pkg/agent/controller/networkpolicy/cache.go 84.98% <0.00%> (+0.39%) ⬆️
pkg/ovs/openflow/ofctrl_builder.go 76.17% <0.00%> (+0.43%) ⬆️
pkg/agent/controller/networkpolicy/reconciler.go 69.03% <0.00%> (+0.59%) ⬆️
... and 4 more

@tnqn
Copy link
Member Author

tnqn commented Oct 13, 2020

/test-windows-conformance

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - but when AntreaProxy is not enabled, do we still need send_redirect for Service traffic?

antoninbas
antoninbas previously approved these changes Oct 14, 2020
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this, LGTM

@antoninbas
Copy link
Contributor

@tnqn nit: in the commit message / PR description: you wrote "capacities" instead of "capabilities"

@tnqn
Copy link
Member Author

tnqn commented Oct 14, 2020

Question - but when AntreaProxy is not enabled, do we still need send_redirect for Service traffic?

@jianjuns No, I have verified no redirect packets are generated for such traffic. It should make sense that a DNATed packet may be sent out from the interface which it comes in so linux must have handled it. I could also try to confirm from code.

@tnqn nit: in the commit message / PR description: you wrote "capacities" instead of "capabilities"

Thanks @antoninbas, corrected.

antoninbas
antoninbas previously approved these changes Oct 14, 2020
@tnqn
Copy link
Member Author

tnqn commented Oct 14, 2020

/test-all

@tnqn
Copy link
Member Author

tnqn commented Oct 14, 2020

/test-whole-conformance

jianjuns
jianjuns previously approved these changes Oct 14, 2020
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Let us remove it then.

@tnqn
Copy link
Member Author

tnqn commented Oct 14, 2020

/test-conformance

@@ -350,16 +335,6 @@ func writeLine(buf *bytes.Buffer, words ...string) {
}
}

func disableICMPSendRedirects(intfName string) error {
cmdStr := fmt.Sprintf("echo 0 > /proc/sys/net/ipv4/conf/%s/send_redirects", intfName)
Copy link
Member

@srikartati srikartati Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn As part of the flow exporter, we make sure a couple of netfilter/conntrack parameters to be set.
https://github.com/vmware-tanzu/antrea/blob/5f65bba1b771b6fd1f1b548a3458bd3647c74489/pkg/agent/flowexporter/connections/conntrack_linux.go#L151
I think the change of privileges for the antrea-agent container will affect those checks, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @srikartati for raising this, I didn't know we have other places to perform sysctl. Is the conntrack configuration must-have to make flow exporter work? If so, I'm thinking two approaches:

  1. Keep privilege in the upstream yaml and use customized manifest for clusters that do not allow privileged pod (but flow exporter and other features that require sysctl configuration won't be available in these clusters).
  2. Try performing sysctl in antrea-agent and if it fails, logging a warning but do not exit the program if an env variable like IGNORE_SYSCTL_ERR is set. This means user doesn't give antrea-agent pod privilege, assuming sysctl will be configured out-of-band.

@antoninbas @jianjuns @srikartati any idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this @srikartati

  • these parameters are not must-have, but they do provide additional information. The code already does not fail and exit if the kernel parameters cannot be set, so maybe IGNORE_SYSCTL_ERR is not required if we go with Quan's idea. Although the problem with always ignoring it is we do not catch issues when testing, as was the case here.
  • a possibility is to use an initContainer that's privileged in order to set these kernel parameters; and run the Agent as unprivileged. The initContainer can easily be removed from the manifest when needed.
  • could we consider setting the sysctls in the Pod spec as that doesn't require privileged mode (https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/#setting-sysctls-for-a-pod)? That relies on container runtime support to set the kernel parameters.

Copy link
Member

@srikartati srikartati Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn Setting up the conntrack parameters is not a must per say for Flow Exporter, but the flow visibility would not be complete especially because of the lack of flow stats. Currently, if there is no write access to sysctl parameters and the values are not desired, then we just log an error and continue. In this case, we just provide flow records without stats and timestamps.

Since having no privileges doesn't break functionality, IMO we can go for option 2 to configure them out-of-band and describe the importance of setting required sysctl parameters in the flow exporter documentation.
I would like to hear others opinion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 3rd option doesn't work as setting unsafe-sysctls need extra argument --allowed-unsafe-sysctls set to kubelet and it's not allowed to set net namespace related kernel parameter for a host network Pod: https://github.com/kubernetes/kubernetes/blob/2046f4212a355a3381da20c4d800283b98e9a081/pkg/kubelet/sysctl/whitelist.go#L89-L91.

I think the initContainer approach is feasible, we may need to measure how long would it add to the start time if having another initContainer. Anyway I could reduce the scope of this PR to remove send_redirects configuration only first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I like about the initContainer approach is we can group all operations that require privileged mode and we ensure that the antrea-agent container itself can run without privileged mode since the default YAML itself will not include privileged: true. But yes, that's a separate question that can be resolved in a future PR. After merging this PR, one can already edit the manifest and remove privileged: true, which will cause a warning to be logged if FlowExporter is enabled, but will not terminate the agent.

@tnqn
Copy link
Member Author

tnqn commented Oct 19, 2020

Question - but when AntreaProxy is not enabled, do we still need send_redirect for Service traffic?

@jianjuns @antoninbas I have confirmed from code that redirect packets for a NATed connection is not allowed for Linux, it's dropped by a NAT hook:
https://github.com/torvalds/linux/blob/a811c1fa0a02c062555b54651065899437bacdbe/net/netfilter/nf_nat_proto.c#L585-L586
So we are safe to remove the sysctl configuration.

With L3 flows, Pod-to-Pod traffic will no longer go through gateway
interfaces. For Pod-to-Service traffic, Linux doesn't allow sending
redirect packets for NAT'd connections. Therefore, configuring
send_redirects is no longer needed.
@tnqn
Copy link
Member Author

tnqn commented Oct 19, 2020

/test-all
/test-whole-conformance

@tnqn tnqn changed the title Cleanup: remove send_redirects sysctl configuration and reduce privilege Cleanup: remove send_redirects sysctl configuration Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants