Upscaling of coredns pods leads to DNS timeout errors #113080
Comments
@sli720: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig network
Perhaps the issue is with networking to a particular node, and increasing the number of instances > 3 results in a coredns pod running on a problematic node?
I've tried it out on specific nodes and I don't see a relation to the hardware. Every time I increase the number of pods I get this issue. I also don't see any issues in the calico or system/kernel logs. Can I somehow debug whether the issue is related to a specific node?
Perhaps try issuing TCP queries to each individual CoreDNS pod IP directly. Do they all exhibit the same degree of sporadic timeouts, or do some more so than others? Note: the forward timeout is 2 seconds in nodelocal/coredns.
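A quick way to run that check (assuming the coredns pods carry the standard k8s-app=kube-dns label and that dig is available where you run it) is something like:

```sh
# Query every coredns pod IP directly over TCP and flag any that time out
for ip in $(kubectl -n kube-system get pods -l k8s-app=kube-dns \
    -o jsonpath='{.items[*].status.podIP}'); do
  echo "== $ip =="
  # +tcp forces TCP; +time/+tries keep a hung query from stalling the loop
  dig +tcp +time=2 +tries=1 @"$ip" kubernetes.default.svc.cluster.local \
    | grep -E 'status:|timed out' || echo "query to $ip failed"
done
```

Running this repeatedly from several nodes should show whether the timeouts cluster on particular pods or nodes.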
How healthy are your kube-proxies (specifically on the nodes that host the pods that can't resolve DNS)?
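One way to check that (label, namespace, and the pod name below are assumptions based on a standard kube-proxy DaemonSet) is to look at restart counts and recent logs on the affected nodes:

```sh
# Restart counts and node placement of the kube-proxy pods
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide

# Recent errors from the kube-proxy on one suspect node (pod name is illustrative)
kubectl -n kube-system logs --tail=200 kube-proxy-xxxxx | grep -iE 'error|fail'
```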
I ran a fast nslookup loop against all coredns pods from different hosts and it always resolved successfully. Only when nodelocaldns sits in between does it sometimes fail, and each time on different hosts.
If you mean crashes: they never have so far, and there are no errors in the logs.
Do we have any updates here?
My hypothesis on why it's happening: LocalCoreDNS (nodelocaldns) uses the CoreDNS kube-dns Cluster IP as its upstream. So, if simultaneous connections are made (within 2 ns) and there are multiple rules with multiple endpoints, the packets can be sent to the wrong pod/node (see race #3). The probability of sending a packet to the correct pod/node therefore decreases as the number of coredns pods grows. There are also others who have faced this issue.
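If that is what is happening, the conntrack insert failures should be visible on the nodes. A minimal check (requires the conntrack tool on the host) looks like:

```sh
# Run on a node that hosts a nodelocaldns or coredns pod.
# A non-zero, growing insert_failed counter is the usual signature of the conntrack race.
conntrack -S | grep -o 'insert_failed=[0-9]*'
```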
many moving parts here ;) |
I've disabled nodelocaldns completely and scaled up the coredns pods again. No problems anymore. |
It's possible that you simply don't have visibility into DNS errors anymore; did you validate that? Earlier, LocalCoreDNS was the central place logging the error. It's interesting that this fixes the error.
Nodelocaldns instances use TCP to forward DNS requests to the Cluster IP DNS, which should mitigate the conntrack issue - requests getting resent when the sender does not get ACKs.
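This can be confirmed by looking for force_tcp in the forward block of the nodelocaldns Corefile (the config map name nodelocaldns and data key Corefile below match the kubespray defaults; the stock manifest uses node-local-dns):

```sh
# Print the nodelocaldns Corefile and show the forward/upstream settings
kubectl -n kube-system get configmap nodelocaldns -o jsonpath='{.data.Corefile}' \
  | grep -n -A3 'forward'
```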
Hi @jaswanthikolla, I was trying to reproduce the SYN-ACK issue but ran into another problem: nodelocaldns always seems to go back to coredns for name resolution. I tested by running a 1s loop doing nslookup kubernetes.default.svc.cluster.local on a node with nodelocaldns running. Everything goes well, but as soon as I scale CoreDNS to zero, the nslookup fails immediately and the nodelocaldns log reports connection refused to CoreDNS. I thought there was a 5s TTL set by CoreDNS, so nodelocaldns should not fail immediately and should respond from the cached record? Once I scale CoreDNS back up, resolution returns to normal. It looks like nodelocaldns isn't really caching any results to reduce calls to CoreDNS. I was using the stock nodelocaldns.yaml; apart from the three standard environment variables, nothing else was changed.
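One way to see whether nodelocaldns is answering from its cache at all is to watch its cache metrics while the nslookup loop runs (the listen IP 169.254.20.10 and metrics port 9253 below are the stock defaults; kubespray uses 169.254.25.10):

```sh
# On the node: cache hit counters should grow if repeated lookups are served locally
curl -s http://169.254.20.10:9253/metrics | grep coredns_cache
```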
What happened?
Since upgrading to CentOS 9 and Kubernetes 1.24.6 (on the same hardware), sporadic DNS resolution errors occur when too many coredns pods are running at the same time. If I reduce them to <= 3, no more errors occur. Once the error occurs, you see lines in the logs of nodelocaldns pods like:
[ERROR] plugin/errors: 2 git-cache.ci.svc.cluster.local. A: select tcp 10.233.0.3:53: i/o timeout
It looks like the nodelocaldns pod sometimes can't contact the coredns pods for some reason. There are no errors in the logs of the coredns pods, nor in the calico pods. It also occurs under low load (CPU, network, disk) on the cluster. Could this be a bug in nodelocaldns or coredns, or a misconfiguration of /etc/resolv.conf? It is strange that it disappears when I reduce the number of coredns pods.
What did you expect to happen?
nodelocaldns pods can always contact coredns
How can we reproduce it (as minimally and precisely as possible)?
Run the nslookup command many times. Sometimes it fails, sometimes not.
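For example, a loop along these lines (the name resolved is the one from the error above; interval and iteration count are arbitrary) reproduces it from any pod:

```sh
# Repeat the lookup and print a timestamp whenever it fails
for i in $(seq 1 1000); do
  nslookup git-cache.ci.svc.cluster.local >/dev/null 2>&1 \
    || echo "$(date +%T) lookup $i failed"
  sleep 0.2
done
```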
Anything else we need to know?
Here is the resolv.conf of the pod used to test:
Kubernetes version
1.24.6
Cloud provider
on-premise (kubespray, see kubernetes-sigs/kubespray#9328)
OS version
CentOS Stream 9 (Kernel Linux 5.14.0-160.el9.x86_64 x86_64)
Install tools
ansible through kubespray
Container runtime (CRI) and version (if applicable)
containerd 1.6.8 (also tested with docker 20.10)
Related plugins (CNI, CSI, ...) and versions (if applicable)
calico