Frequent "Failed to send gratuitous NDP" warnings in IPv6 cluster #6293

Open
antoninbas opened this issue May 6, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

@antoninbas
Contributor

Describe the bug
I have created an IPv6-only cluster using Kind. When creating Pods, I frequently observe warnings reporting a failure to send the gratuitous NDP messages. These warnings always come as a group of 3, meaning that all 3 attempts to send the NDP message have failed.

I0506 21:37:45.805542       1 server.go:428] "Received CmdAdd request" request="cni_args:{container_id:\"0253ec9e6ca355411a896a6a175c0a07b2460fc8948545f4175d49a2e0a76bb2\"  netns:\"/var/run/netns/cni-e5448682-3224-4f17-ddf3-da66d41ba179\"  ifname:\"eth0\"  args:\"IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=antrea-toolbox-fgmhd;K8S_POD_INFRA_CONTAINER_ID=0253ec9e6ca355411a896a6a175c0a07b2460fc8948545f4175d49a2e0a76bb2;K8S_POD_UID=b3e01035-e79d-4ad1-bb58-18117c261309\"  path:\"/opt/cni/bin\"  network_configuration:\"{\\\"cniVersion\\\":\\\"0.3.0\\\",\\\"ipam\\\":{\\\"type\\\":\\\"host-local\\\"},\\\"name\\\":\\\"antrea\\\",\\\"type\\\":\\\"antrea\\\"}\"}"
I0506 21:37:45.812608       1 server.go:498] "Allocated IP addresses" container="0253ec9e6ca355411a896a6a175c0a07b2460fc8948545f4175d49a2e0a76bb2" result={"cniVersion":"1.0.0","ips":[{"address":"fd00:10:244:2::7/64","gateway":"fd00:10:244:2::1"}],"dns":{},"VLANID":0}
I0506 21:37:47.607434       1 pod_configuration.go:257] "Configured container interface" Pod="default/antrea-toolbox-fgmhd" container="0253ec9e6ca355411a896a6a175c0a07b2460fc8948545f4175d49a2e0a76bb2" interface="eth0" hostInterface="antrea-t-301d59"
I0506 21:37:47.607602       1 server.go:527] "CmdAdd for container succeeded" container="0253ec9e6ca355411a896a6a175c0a07b2460fc8948545f4175d49a2e0a76bb2"
W0506 21:37:47.608506       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #0: write ip fe80::3c06:d4ff:fe08:8289%eth0->ff02::1%eth0: sendmsg: invalid argument
W0506 21:37:47.663654       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #1: write ip fe80::3c06:d4ff:fe08:8289%eth0->ff02::1%eth0: sendmsg: invalid argument
W0506 21:37:47.710718       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #2: write ip fe80::3c06:d4ff:fe08:8289%eth0->ff02::1%eth0: sendmsg: invalid argument

I have also seen this variation, with a slightly different error message:

I0506 21:32:32.578448       1 server.go:527] "CmdAdd for container succeeded" container="2c3f0e1c740a11a7b17359c6c91ee2670569839aecb1cbddf3edf5fe8537d108"
W0506 21:32:32.579049       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #0: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::1425:15ff:fea2:ac95%eth0: bind: no such device
W0506 21:32:32.631461       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #1: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::1425:15ff:fea2:ac95%eth0: bind: no such device
W0506 21:32:32.679866       1 interface_configuration_linux.go:332] Failed to send gratuitous NDP #2: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::1425:15ff:fea2:ac95%eth0: bind: no such device

To Reproduce
I used this configuration for Kind:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  ipFamily: ipv6
  apiServerAddress: 127.0.0.1
  disableDefaultCNI: true
nodes:
- role: control-plane
- role: worker
- role: worker

Note that the apiServerAddress configuration is necessary because I am on macOS (refer to the Kind documentation).

To create Pods, I use the following DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: antrea-toolbox
spec:
  selector:
    matchLabels:
      app: antrea-toolbox
  template:
    metadata:
      labels:
        app: antrea-toolbox
    spec:
      hostNetwork: false
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: antrea-toolbox
        image: antrea/toolbox:latest
        securityContext:
          privileged: true

You may need to apply / delete the file a few times to observe the issue.

Expected

Actual behavior

Versions:
I am using the latest version of Antrea (built from the main branch).

Additional context
I am not sure what can be causing this, but I haven't really looked too much into it. The Pods seem otherwise healthy.
The NDP messages are sent after network provisioning has completed, so it's strange to see this failing.
I also don't recall ever seeing a similar issue in the past for IPv4 / gratuitous ARP messages.

cc @tnqn

antoninbas added the kind/bug label May 6, 2024
@tnqn
Member

tnqn commented May 7, 2024

I deployed the first version that has the code (#3998) and the same error happened, so I guess the code never worked:

I0507 14:08:56.995681       1 agent.go:96] Starting Antrea agent (version v1.8.0)
...
W0507 14:09:28.668531       1 interface_configuration_linux.go:330] Failed to send gratuitous NDP #0: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::bc0b:7fff:fe93:c7f7%eth0: bind: no such device
W0507 14:09:28.719695       1 interface_configuration_linux.go:330] Failed to send gratuitous NDP #1: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::bc0b:7fff:fe93:c7f7%eth0: bind: no such device
W0507 14:09:28.769339       1 interface_configuration_linux.go:330] Failed to send gratuitous NDP #2: failed to create NDP responder for "eth0": listen ip6:ipv6-icmp fe80::bc0b:7fff:fe93:c7f7%eth0: bind: no such device

@tnqn
Member

tnqn commented May 7, 2024

After replacing the func with ndp.NeighborAdvertisement, as suggested in #3998 (review), the message was sent successfully, but I haven't verified that it is valid and can really mark a neighbor cache entry as stale:

15:22:16.825943 IP6 fe80::44a9:86ff:feec:7702 > ff02::1: ICMP6, neighbor advertisement, tgt is fd00:10:244::7, length 32
15:22:16.876195 IP6 fe80::44a9:86ff:feec:7702 > ff02::1: ICMP6, neighbor advertisement, tgt is fd00:10:244::7, length 32
15:22:16.926622 IP6 fe80::44a9:86ff:feec:7702 > ff02::1: ICMP6, neighbor advertisement, tgt is fd00:10:244::7, length 32

Contributor

github-actions bot commented Aug 6, 2024

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

github-actions bot added the lifecycle/stale label Aug 6, 2024
antoninbas removed the lifecycle/stale label Aug 6, 2024