Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix policyOnlyMode and cloud support for AKS & EKS #1585

Conversation

antoninbas
Copy link
Contributor

policyOnlyMode was broken since adding support for IPv6 clusters in the
code base. This is because the code used the Node's PodCIDR(s) to
determine which address family was supported, which doesn't work in
policyOnlyMode, for which IPAM is not the responsibility of Antrea.

Instead we now use the following rules:

  • if there is an IPv4 PodCIDR for the Node, then v4 is supported
  • otherwise, if policyOnlyMode is used and the Node's IP address
    (primary, as reported by K8s) is an IPv4 address then v4 is supported
    Same rules for v6.

This may not work for dual-stack, but IIRC none of the cloud services
support IPv6 / dual-stack. There are also other issues for dual-stack
support in Antrea, which themselves depend on upstream issues.

Additionally, we include the following changes:

  • the build/yamls/antrea-eks-node-init.yml manifest is updated to
    account for a breaking change in the AWS CNI introspection API.
  • we pin the K8s conformance test image for all clouds to v1.18.5 to
    avoid issues with recently-added conformance tests.

Fixes #1572

@antrea-bot
Copy link
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-all-features-conformance: to trigger conformance tests with all alpha features enabled.
  • /skip-all-features-conformance: to skip conformance tests with all alpha features enabled.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

jianjuns
jianjuns previously approved these changes Nov 20, 2020
@@ -64,13 +74,11 @@ spec:
echo "Waiting for aws-k8s-agent"
done

# copied from https://github.com/cilium/cilium/blob/master/install/kubernetes/cilium/charts/nodeinit/templates/daemonset.yaml#L199
# Fetch running containers from aws-k8s-agent and kill it
# copied from https://github.com/cilium/cilium/blob/v1.8.0/install/kubernetes/cilium/charts/nodeinit/templates/daemonset.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not mean your PR, but should we rewrite the script?

Copy link
Contributor Author

@antoninbas antoninbas Nov 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script seems to be working ok with the patch. Actually I need to remove the comment about cilium. The code is different from the cilium script now, since they haven't updated it to account for the API change :P

@codecov-io
Copy link

codecov-io commented Nov 20, 2020

Codecov Report

Merging #1585 (e735e7e) into master (9d3d10b) will increase coverage by 1.07%.
The diff coverage is 55.98%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1585      +/-   ##
==========================================
+ Coverage   63.31%   64.38%   +1.07%     
==========================================
  Files         170      181      +11     
  Lines       14250    15299    +1049     
==========================================
+ Hits         9023     9851     +828     
- Misses       4292     4421     +129     
- Partials      935     1027      +92     
Flag Coverage Δ
e2e-tests 47.75% <33.03%> (?)
kind-e2e-tests 52.90% <51.03%> (-2.49%) ⬇️
unit-tests 40.41% <12.20%> (-0.86%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
cmd/antrea-agent/agent.go 0.00% <0.00%> (ø)
cmd/antrea-agent/options.go 21.69% <0.00%> (+0.97%) ⬆️
pkg/agent/stats/collector.go 97.72% <ø> (ø)
pkg/controller/networkpolicy/tier.go 90.00% <ø> (ø)
pkg/features/antrea_features.go 16.66% <ø> (ø)
pkg/ovs/openflow/ofctrl_builder.go 61.67% <0.00%> (-0.50%) ⬇️
pkg/ovs/openflow/ofctrl_group.go 48.57% <0.00%> (-4.56%) ⬇️
pkg/ovs/openflow/ofctrl_action.go 68.90% <17.14%> (+7.55%) ⬆️
pkg/agent/proxy/proxier_linux.go 33.33% <33.33%> (+8.33%) ⬆️
pkg/agent/proxy/types/types.go 46.66% <33.33%> (-37.95%) ⬇️
... and 77 more

policyOnlyMode was broken since adding support for IPv6 clusters in the
code base. This is because the code used the Node's PodCIDR(s) to
determine which address family was supported, which doesn't work in
policyOnlyMode, for which IPAM is not the responsibility of Antrea.

Instead we now use the following rules:
* if there is an IPv4 PodCIDR for the Node, then v4 is supported
* otherwise, if policyOnlyMode is used and the Node's IP address
  (primary, as reported by K8s) is an IPv4 address then v4 is supported
Same rules for v6.

This may not work for dual-stack, but IIRC none of the cloud services
support IPv6 / dual-stack. There are also other issues for dual-stack
support in Antrea, which themselves depend on upstream issues.

Additionally, we include the following changes:
* the build/yamls/antrea-eks-node-init.yml manifest is updated to
  account for a breaking change in the AWS CNI introspection API.
* we pin the K8s conformance test image for all clouds to v1.18.5 to
  avoid issues with recently-added conformance tests.

Fixes antrea-io#1572
@antoninbas
Copy link
Contributor Author

I have confirmed that all the tests pass on AKS & EKS!

@antoninbas
Copy link
Contributor Author

/test-all
/test-ipv6-e2e
/test-ipv6-conformance
/test-ipv6-networkpolicy

@lzhecheng
Copy link
Contributor

/test-ipv6-only-all
/test-windows-conformance

@lzhecheng
Copy link
Contributor

/test-ipv6-e2e

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit 02215d0 into antrea-io:master Nov 20, 2020
@antoninbas antoninbas deleted the fix-network-policy-only-mode-and-cloud-support branch November 20, 2020 17:26
antoninbas added a commit to antoninbas/antrea that referenced this pull request Nov 20, 2020
policyOnlyMode was broken since adding support for IPv6 clusters in the
code base. This is because the code used the Node's PodCIDR(s) to
determine which address family was supported, which doesn't work in
policyOnlyMode, for which IPAM is not the responsibility of Antrea.

Instead we now use the following rules:
* if there is an IPv4 PodCIDR for the Node, then v4 is supported
* otherwise, if policyOnlyMode is used and the Node's IP address
  (primary, as reported by K8s) is an IPv4 address then v4 is supported
Same rules for v6.

This may not work for dual-stack, but IIRC none of the cloud services
support IPv6 / dual-stack. There are also other issues for dual-stack
support in Antrea, which themselves depend on upstream issues.

Additionally, we include the following changes:
* the build/yamls/antrea-eks-node-init.yml manifest is updated to
  account for a breaking change in the AWS CNI introspection API.
* we pin the K8s conformance test image for all clouds to v1.18.5 to
  avoid issues with recently-added conformance tests.

Fixes antrea-io#1572
antoninbas added a commit that referenced this pull request Nov 21, 2020
policyOnlyMode was broken since adding support for IPv6 clusters in the
code base. This is because the code used the Node's PodCIDR(s) to
determine which address family was supported, which doesn't work in
policyOnlyMode, for which IPAM is not the responsibility of Antrea.

Instead we now use the following rules:
* if there is an IPv4 PodCIDR for the Node, then v4 is supported
* otherwise, if policyOnlyMode is used and the Node's IP address
  (primary, as reported by K8s) is an IPv4 address then v4 is supported
Same rules for v6.

This may not work for dual-stack, but IIRC none of the cloud services
support IPv6 / dual-stack. There are also other issues for dual-stack
support in Antrea, which themselves depend on upstream issues.

Additionally, we include the following changes:
* the build/yamls/antrea-eks-node-init.yml manifest is updated to
  account for a breaking change in the AWS CNI introspection API.
* we pin the K8s conformance test image for all clouds to v1.18.5 to
  avoid issues with recently-added conformance tests.

Fixes #1572
antoninbas added a commit that referenced this pull request Dec 23, 2020
policyOnlyMode was broken since adding support for IPv6 clusters in the
code base. This is because the code used the Node's PodCIDR(s) to
determine which address family was supported, which doesn't work in
policyOnlyMode, for which IPAM is not the responsibility of Antrea.

Instead we now use the following rules:
* if there is an IPv4 PodCIDR for the Node, then v4 is supported
* otherwise, if policyOnlyMode is used and the Node's IP address
  (primary, as reported by K8s) is an IPv4 address then v4 is supported
Same rules for v6.

This may not work for dual-stack, but IIRC none of the cloud services
support IPv6 / dual-stack. There are also other issues for dual-stack
support in Antrea, which themselves depend on upstream issues.

Additionally, we include the following changes:
* the build/yamls/antrea-eks-node-init.yml manifest is updated to
  account for a breaking change in the AWS CNI introspection API.
* we pin the K8s conformance test image for all clouds to v1.18.5 to
  avoid issues with recently-added conformance tests.

Fixes #1572
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI jobs for AKS & EKS are failing
7 participants