API Server cannot talk to in-cluster service #2467

Closed
sburnwal opened this issue Sep 21, 2021 · 8 comments
@sburnwal

sburnwal commented Sep 21, 2021

I am deploying a kind cluster using Docker Desktop on Mac. I need the API server to contact a webhook deployed as a Kubernetes service, using its DNS name (for example mywebhook.dex.svc.cluster.local), but the API server fails with an error while resolving the service's DNS name:

dial tcp: lookup myservice.dex.svc.cluster.local on 192.168.65.2:53: read udp 172.18.0.3:54634->192.168.65.2:53: i/o timeout

I want to understand why it is trying to reach 192.168.65.2:53 for DNS resolution. That address appears to be the host IP of Docker Desktop's VPNKit (com.docker.vpnkit). I see that IP in the process list:

ps -ef | grep vpn
    0    68     1   0 Tue11PM ??        24:05.61 /opt/cisco/anyconnect/bin/vpnagentd -execv_instance
  501 56986 56984   0  5:16PM ??         0:00.12 com.docker.vpnkit --ethernet fd:3 --diagnostics fd:4 --pcap fd:5 --vsock-path vms/0/connect --host-names host.docker.internal,docker.for.mac.host.internal,docker.for.mac.localhost --listen-backlog 32 --mtu 1500 --allowed-bind-addresses 0.0.0.0 --http /Users/sburnwal/Library/Group Containers/group.com.docker/http_proxy.json --dhcp /Users/sburnwal/Library/Group Containers/group.com.docker/dhcp.json --port-max-idle-time 300 --max-connections 2000 --gateway-ip 192.168.65.1 --host-ip 192.168.65.2 --lowest-ip 192.168.65.3 --highest-ip 192.168.65.254 --gc-compact-interval 1800
  501 56988 56984   0  5:16PM ??         0:05.67 vpnkit-bridge --disable wsl2-cross-distro-service,wsl2-bootstrap-expose-ports,transfused --addr listen://1999 host
  501 56992 56989   0  5:16PM ??         1:01.13 com.docker.hyperkit -A -u -F vms/0/hyperkit.pid -c 4 -m 2048M -s 0:0,hostbridge -s 31,lpc -s 1:0,virtio-vpnkit,path=vpnkit.eth.sock,uuid=1a082c4b-04b9-44a6-83ae-f44d68d4591e -U ed765c1a-f8a3-49a9-9f97-f5e1cc8ba7b8 -s 2:0,virtio-blk,/Users/sburnwal/Library/Containers/com.docker.docker/Data/vms/0/data/Docker.raw -s 3,virtio-sock,guest_cid=3,path=vms/0,guest_forwards=2376;1525 -s 4,virtio-rnd -l com1,null,asl,log=vms/0/console-ring -f kexec,/Applications/Docker.app/Contents/Resources/linuxkit/kernel,/Applications/Docker.app/Contents/Resources/linuxkit/initrd.img,earlyprintk=serial page_poison=1 vsyscall=emulate panic=1 nospec_store_bypass_disable noibrs noibpb no_stf_barrier mitigations=off console=ttyS0 console=ttyS1  vpnkit.connect=connect://2/1999

Also, I want to know whether there is a way to configure the kind cluster so that the API server does not use the Docker VPNKit IP as its DNS server.

@sburnwal sburnwal added the kind/support Categorizes issue or PR as a support question. label Sep 21, 2021
@BenTheElder
Member

The API server cannot resolve in-cluster domains, because the source of truth for in-cluster service domains is the API server itself. If the API server used in-cluster DNS resolution, there would be a circular dependency in the cluster.

There is no way to disable use of the docker embedded DNS. KIND depends on this, and it wouldn't make in-cluster domains resolvable by the API server anyhow.

@sburnwal
Author

@BenTheElder are you saying this is an issue with kind clusters only, or will we also face it in real Kubernetes clusters (deployed on VMs or bare metal)?

@BenTheElder
Member

This is an issue you will face in all Kubernetes distributions I am aware of. It's fundamentally problematic to have the API server use DNS ultimately originating from itself.

@BenTheElder
Member

See previously: kubernetes/kubeadm#1236

Kind caught an issue with an attempt at enabling this upstream previously but the problem wasn't / isn't limited to kind. At that point in time kubeadm wasn't in required PR tests and the change was only in kubeadm.

@sburnwal
Author

Thanks for explaining. The doc here https://v1-20.docs.kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector does explain the circular dependency, but then it also says you can use a service reference. Can you please clarify?

  1. If the API server cannot resolve in-cluster services, how does it resolve the same service when it is used as a reference?
  2. Does the API server pod never have the in-cluster DNS resolver config in its /etc/resolv.conf file?

@BenTheElder
Member

Thanks for explaining. The doc here https://v1-20.docs.kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-namespaceselector does explain the circular dependency, but then it also says you can use a service reference. Can you please clarify?

From the relevant section

https://v1-20.docs.kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#url

The host should not refer to a service running in the cluster; use a service reference by specifying the service field instead. The host might be resolved via external DNS in some apiservers (e.g., kube-apiserver cannot resolve in-cluster DNS as that would be a layering violation). host may also be an IP address.

kube-apiserver is the standard Kubernetes API server, so because of this limitation a webhook url must use external DNS or an IP address.

[...] but then it also says you can use a service reference. Can you please clarify?

From that section https://v1-20.docs.kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#service-reference

If the webhook is running within the cluster, then you should use service instead of url.

In this case we're referencing the service object / concept, not a DNS name or URL. Because we reference the service directly, kube-apiserver is aware of the service object and can look it up from its own service data in etcd (the same data the in-cluster DNS server uses). It does NOT make any DNS calls to do this.
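As a rough illustration of the service-reference form, the sketch below builds a ValidatingWebhookConfiguration that points at the service by namespace and name rather than by URL, and prints it as JSON. It assumes the webhook is an admission webhook; the service name and namespace come from this issue, while the configuration name, webhook name, path, port, rules, and CA bundle placeholder are all hypothetical.

```python
import json


def webhook_config(service_name: str, namespace: str, path: str, ca_bundle_b64: str) -> dict:
    """Build an admission webhook config that uses a service reference, not a url."""
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "ValidatingWebhookConfiguration",
        "metadata": {"name": "example-webhook"},  # hypothetical name
        "webhooks": [
            {
                "name": "example.mywebhook.dex",  # hypothetical; must be domain-style
                "clientConfig": {
                    # Service reference: no DNS name for kube-apiserver to resolve.
                    "service": {
                        "namespace": namespace,
                        "name": service_name,
                        "path": path,  # hypothetical path
                        "port": 443,  # assumed port
                    },
                    "caBundle": ca_bundle_b64,
                },
                # Hypothetical rule: intercept pod creation only.
                "rules": [
                    {
                        "operations": ["CREATE"],
                        "apiGroups": [""],
                        "apiVersions": ["v1"],
                        "resources": ["pods"],
                    }
                ],
                "sideEffects": "None",
                "admissionReviewVersions": ["v1"],
            }
        ],
    }


config = webhook_config("mywebhook", "dex", "/validate", "<base64-encoded CA bundle>")
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl after replacing the CA bundle placeholder with a real value.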

  1. If API server cannot resolve the in-cluster services, how is it resolving when the same service is used as reference?

It does not resolve them through DNS at all. Services are objects in the API server, so the API server "asks" itself directly when using a service reference.

  2. Does the API server pod never have the in-cluster DNS resolver config in its /etc/resolv.conf file?

It does not. It typically runs with hostNetwork and the default dnsPolicy, including under upstream kubeadm (which is what kind builds on). Under hostNetwork with the default dnsPolicy there is only the host's DNS, not any injected cluster DNS.

https://github.com/kubernetes/kubernetes/blob/b6924839cad87e7e32c6cc162c950a34b6730d1b/cmd/kubeadm/app/util/staticpod/utils.go#L76

https://github.com/kubernetes/kubernetes/blob/b6924839cad87e7e32c6cc162c950a34b6730d1b/cluster/gce/manifests/kube-apiserver.manifest#L23
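One way to check this on a given pod is to look at its resolv.conf and see whether it carries the cluster search domains. The following is a minimal sketch that parses resolv.conf text; the helper names, the heuristic, and both sample files are illustrative assumptions (10.96.0.10 is just a common cluster DNS address, not something taken from this issue).

```python
def parse_resolv_conf(text: str) -> dict:
    """Extract nameserver addresses and search domains from resolv.conf text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":  # skip blanks and comment lines
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]  # a later search line overrides an earlier one
    return {"nameservers": nameservers, "search": search}


def looks_like_cluster_dns(conf: dict, cluster_domain: str = "cluster.local") -> bool:
    """Heuristic: cluster DNS config normally includes a *.svc.<cluster domain> search path."""
    suffix = "svc." + cluster_domain
    return any(d == suffix or d.endswith("." + suffix) for d in conf["search"])


# Illustrative samples, not captured from a real cluster.
host_style = "nameserver 192.168.65.2\n"
pod_style = (
    "nameserver 10.96.0.10\n"
    "search default.svc.cluster.local svc.cluster.local cluster.local\n"
    "options ndots:5\n"
)

for label, text in (("host-style", host_style), ("pod-style", pod_style)):
    conf = parse_resolv_conf(text)
    print(label, conf["nameservers"], looks_like_cluster_dns(conf))
```

Feeding it the contents of /etc/resolv.conf from the API server pod and from an ordinary pod should make the difference visible.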

@BenTheElder BenTheElder self-assigned this Sep 23, 2021
@sburnwal
Author

Thank you for the explanation @BenTheElder! One last piece of the puzzle: for my API server pod, I see dnsPolicy set to ClusterFirst. Say the cluster has the default domain cluster.local. Going by your explanation, does that mean that even though dnsPolicy is ClusterFirst, the API server is not really using CoreDNS but always goes to the host's DNS server?

@BenTheElder
Member

Yes, because it is ClusterFirst, not ClusterFirstWithHostNet. See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy and kubernetes/dns#316.

"ClusterFirst" is the default DNS setting (confusingly, not "Default"; see the docs above), but by default hostNetwork pods use host networking ~100%, including DNS resolution, whereas non-host-network pods use the in-cluster DNS.
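The interaction between hostNetwork and dnsPolicy described here can be summarized in a small lookup. This is only a simplified model of the documented behavior, not kubelet code; the function name and the returned labels are made up for illustration.

```python
def effective_dns(host_network: bool, dns_policy: str = "ClusterFirst") -> str:
    """Return which resolver config a pod ends up with, per the documented policies."""
    if dns_policy == "None":
        return "custom"  # taken entirely from the pod's dnsConfig
    if dns_policy == "Default":
        return "node"  # inherits the node's resolver config
    if dns_policy == "ClusterFirstWithHostNet":
        return "cluster"  # cluster DNS even on the host network
    if dns_policy == "ClusterFirst":
        # On the host network, ClusterFirst falls back to the "Default" behavior.
        return "node" if host_network else "cluster"
    raise ValueError(f"unknown dnsPolicy: {dns_policy!r}")


# The API server static pod: hostNetwork with the default policy.
print(effective_dns(host_network=True))
# An ordinary pod with the default policy.
print(effective_dns(host_network=False))
```

Under this model the API server's ClusterFirst setting yields the node's DNS, which matches the lookup against the host resolver seen in the error above.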
