This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Pods are unable to resolve DNS for services both internally and externally. #2999

Closed
anujshankar opened this issue May 17, 2018 · 27 comments · Fixed by #3373

Comments

@anujshankar

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Issue

What version of acs-engine?:
0.15.2

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes Version: 1.10.1

What happened:
All our internal services report a DNS resolution failure when called, resulting in an overall network outage in our cluster.

What you expected to happen:
DNS should be resolved.

How to reproduce it (as minimally and precisely as possible):
Happens erratically. Root cause of the problem is still unknown.

Anything else we need to know:

We are facing this issue in production. It has a huge business impact, as all services running on our cluster go down.

Observation:

  1. Calls to external services fail.
  2. Inter-service calls within the cluster fail.
  3. Due to the above, some pods in the cluster go into CrashLoopBackOff mode.

Current temporary hack/fix that we use today (see the sketch after this list):

  1. Remove the failing pods that go into CrashLoopBackOff state.
  2. Restart the kube-dns pods present in the cluster.
  3. Restart all worker nodes in the cluster - a very dangerous operation.
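
For reference, a rough sketch of the first two recovery steps as kubectl commands (the namespace and label selector are examples based on the kube-dns deployment posted later in this thread; adjust for your cluster):

# 1. Find and delete the pods stuck in CrashLoopBackOff so their controllers recreate them.
kubectl get pods --all-namespaces | grep CrashLoopBackOff
kubectl delete pod <failing-pod-name> -n <its-namespace>

# 2. Restart the kube-dns pods; the kube-dns-v20 Deployment recreates them immediately.
kubectl delete pod -n kube-system -l k8s-app=kube-dns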

Attached kube-dns logs:

⇒  kubectl logs -n kube-system kube-dns-v20-59b4f7dc55-4dxh8 kubedns
I0515 12:30:47.752415       1 dns.go:48] version: 1.14.8
I0515 12:30:47.753947       1 server.go:71] Using configuration read from directory: /kube-dns-config with period 10s
I0515 12:30:47.754135       1 server.go:119] FLAG: --alsologtostderr="false"
I0515 12:30:47.754177       1 server.go:119] FLAG: --config-dir="/kube-dns-config"
I0515 12:30:47.754186       1 server.go:119] FLAG: --config-map=""
I0515 12:30:47.754192       1 server.go:119] FLAG: --config-map-namespace="kube-system"
I0515 12:30:47.754197       1 server.go:119] FLAG: --config-period="10s"
I0515 12:30:47.754204       1 server.go:119] FLAG: --dns-bind-address="0.0.0.0"
I0515 12:30:47.754209       1 server.go:119] FLAG: --dns-port="10053"
I0515 12:30:47.754216       1 server.go:119] FLAG: --domain="cluster.local."
I0515 12:30:47.754223       1 server.go:119] FLAG: --federations=""
I0515 12:30:47.754229       1 server.go:119] FLAG: --healthz-port="8081"
I0515 12:30:47.754235       1 server.go:119] FLAG: --initial-sync-timeout="1m0s"
I0515 12:30:47.754241       1 server.go:119] FLAG: --kube-master-url=""
I0515 12:30:47.754247       1 server.go:119] FLAG: --kubecfg-file=""
I0515 12:30:47.754253       1 server.go:119] FLAG: --log-backtrace-at=":0"
I0515 12:30:47.754261       1 server.go:119] FLAG: --log-dir=""
I0515 12:30:47.754267       1 server.go:119] FLAG: --log-flush-frequency="5s"
I0515 12:30:47.754272       1 server.go:119] FLAG: --logtostderr="true"
I0515 12:30:47.754277       1 server.go:119] FLAG: --nameservers=""
I0515 12:30:47.754291       1 server.go:119] FLAG: --stderrthreshold="2"
I0515 12:30:47.754297       1 server.go:119] FLAG: --v="2"
I0515 12:30:47.754302       1 server.go:119] FLAG: --version="false"
I0515 12:30:47.754310       1 server.go:119] FLAG: --vmodule=""
I0515 12:30:47.754411       1 server.go:201] Starting SkyDNS server (0.0.0.0:10053)
I0515 12:30:47.754470       1 server.go:222] Skydns metrics not enabled
I0515 12:30:47.754480       1 dns.go:146] Starting endpointsController
I0515 12:30:47.754485       1 dns.go:149] Starting serviceController
I0515 12:30:47.754528       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0515 12:30:47.754547       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0515 12:30:48.254763       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:48.754798       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:49.254807       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:49.754810       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:50.254806       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:50.754734       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:51.254743       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:51.754780       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:52.254785       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:52.754699       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:52.891031       1 logs.go:41] skydns: failure to forward request "read udp 10.241.5.12:44073->168.63.129.16:53: i/o timeout"
I0515 12:30:53.254755       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0515 12:30:53.754780       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
anujshankar changed the title from "Pods are uanble to resolve DNS for any of Azure service or other external sites." to "Pods are unable to resolve DNS both internally and externally." on May 17, 2018
anujshankar changed the title from "Pods are unable to resolve DNS both internally and externally." to "Pods are unable to resolve DNS for services both internally and externally." on May 17, 2018
@anujshankar
Author

anujshankar commented May 17, 2018

kube-dns deployment file:

{
  "kind": "Deployment",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "kube-dns-v20",
    "namespace": "kube-system",
    "selfLink": "/apis/extensions/v1beta1/namespaces/kube-system/deployments/kube-dns-v20",
    "uid": "7a2c4252-416f-11e8-8be0-000d3aa242e5",
    "resourceVersion": "5136316",
    "generation": 1,
    "creationTimestamp": "2018-04-16T12:12:48Z",
    "labels": {
      "k8s-app": "kube-dns",
      "kubernetes.io/cluster-service": "true",
      "version": "v20"
    },
    "annotations": {
      "deployment.kubernetes.io/revision": "1",
      "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"apps/v1beta1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"k8s-app\":\"kube-dns\",\"kubernetes.io/cluster-service\":\"true\",\"version\":\"v20\"},\"name\":\"kube-dns-v20\",\"namespace\":\"kube-system\"},\"spec\":{\"replicas\":2,\"selector\":{\"matchLabels\":{\"k8s-app\":\"kube-dns\",\"version\":\"v20\"}},\"template\":{\"metadata\":{\"annotations\":{\"scheduler.alpha.kubernetes.io/critical-pod\":\"\"},\"labels\":{\"k8s-app\":\"kube-dns\",\"kubernetes.io/cluster-service\":\"true\",\"version\":\"v20\"}},\"spec\":{\"affinity\":{\"podAntiAffinity\":{\"preferredDuringSchedulingIgnoredDuringExecution\":[{\"podAffinityTerm\":{\"labelSelector\":{\"matchExpressions\":[{\"key\":\"k8s-app\",\"operator\":\"In\",\"values\":[\"kube-dns\"]}]},\"topologyKey\":\"kubernetes.io/hostname\"},\"weight\":100}]}},\"containers\":[{\"args\":[\"--domain=cluster.local.\",\"--dns-port=10053\",\"--v=2\",\"--config-dir=/kube-dns-config\"],\"image\":\"k8s-gcrio.azureedge.net/k8s-dns-kube-dns-amd64:1.14.8\",\"livenessProbe\":{\"failureThreshold\":5,\"httpGet\":{\"path\":\"/healthz-kubedns\",\"port\":8080,\"scheme\":\"HTTP\"},\"initialDelaySeconds\":60,\"successThreshold\":1,\"timeoutSeconds\":5},\"name\":\"kubedns\",\"ports\":[{\"containerPort\":10053,\"name\":\"dns-local\",\"protocol\":\"UDP\"},{\"containerPort\":10053,\"name\":\"dns-tcp-local\",\"protocol\":\"TCP\"}],\"readinessProbe\":{\"httpGet\":{\"path\":\"/readiness\",\"port\":8081,\"scheme\":\"HTTP\"},\"initialDelaySeconds\":30,\"timeoutSeconds\":5},\"resources\":{\"limits\":{\"memory\":\"170Mi\"},\"requests\":{\"cpu\":\"100m\",\"memory\":\"70Mi\"}},\"volumeMounts\":[{\"mountPath\":\"/kube-dns-config\",\"name\":\"kube-dns-config\"}]},{\"args\":[\"-v=2\",\"-logtostderr\",\"-configDir=/kube-dns-config\",\"-restartDnsmasq=true\",\"--\",\"-k\",\"--cache-size=1000\",\"--no-resolv\",\"--server=127.0.0.1#10053\",\"--server=/in-addr.arpa/127.0.0.1#10053\",\"--server=/ip6.arpa/127.0.0.1#10053\",\"--log-facility=-\"],\"image\":\"k8s-gcrio.azureedge.net/k8s-dns-dnsmasq-nanny-amd64:1.14.8\",\"name\":\"dnsmasq\",\"ports\":[{\"containerPort\":53,\"name\":\"dns\",\"protocol\":\"UDP\"},{\"containerPort\":53,\"name\":\"dns-tcp\",\"protocol\":\"TCP\"}],\"volumeMounts\":[{\"mountPath\":\"/kube-dns-config\",\"name\":\"kube-dns-config\"}]},{\"args\":[\"--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 \\u003e/dev/null\",\"--url=/healthz-dnsmasq\",\"--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 \\u003e/dev/null\",\"--url=/healthz-kubedns\",\"--port=8080\",\"--quiet\"],\"image\":\"k8s-gcrio.azureedge.net/exechealthz-amd64:1.2\",\"livenessProbe\":{\"failureThreshold\":5,\"httpGet\":{\"path\":\"/healthz-dnsmasq\",\"port\":8080,\"scheme\":\"HTTP\"},\"initialDelaySeconds\":60,\"successThreshold\":1,\"timeoutSeconds\":5},\"name\":\"healthz\",\"ports\":[{\"containerPort\":8080,\"protocol\":\"TCP\"}],\"resources\":{\"limits\":{\"memory\":\"50Mi\"},\"requests\":{\"cpu\":\"10m\",\"memory\":\"50Mi\"}}}],\"dnsPolicy\":\"Default\",\"nodeSelector\":{\"beta.kubernetes.io/os\":\"linux\"},\"serviceAccountName\":\"kube-dns\",\"tolerations\":[{\"key\":\"CriticalAddonsOnly\",\"operator\":\"Exists\"}],\"volumes\":[{\"configMap\":{\"name\":\"kube-dns\",\"optional\":true},\"name\":\"kube-dns-config\"}]}}}}\n"
    }
  },
  "spec": {
    "replicas": 2,
    "selector": {
      "matchLabels": {
        "k8s-app": "kube-dns",
        "version": "v20"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "k8s-app": "kube-dns",
          "kubernetes.io/cluster-service": "true",
          "version": "v20"
        },
        "annotations": {
          "scheduler.alpha.kubernetes.io/critical-pod": ""
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "kube-dns-config",
            "configMap": {
              "name": "kube-dns",
              "defaultMode": 420,
              "optional": true
            }
          }
        ],
        "containers": [
          {
            "name": "kubedns",
            "image": "k8s-gcrio.azureedge.net/k8s-dns-kube-dns-amd64:1.14.8",
            "args": [
              "--domain=cluster.local.",
              "--dns-port=10053",
              "--v=2",
              "--config-dir=/kube-dns-config"
            ],
            "ports": [
              {
                "name": "dns-local",
                "containerPort": 10053,
                "protocol": "UDP"
              },
              {
                "name": "dns-tcp-local",
                "containerPort": 10053,
                "protocol": "TCP"
              }
            ],
            "resources": {
              "limits": {
                "memory": "170Mi"
              },
              "requests": {
                "cpu": "100m",
                "memory": "70Mi"
              }
            },
            "volumeMounts": [
              {
                "name": "kube-dns-config",
                "mountPath": "/kube-dns-config"
              }
            ],
            "livenessProbe": {
              "httpGet": {
                "path": "/healthz-kubedns",
                "port": 8080,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 60,
              "timeoutSeconds": 5,
              "periodSeconds": 10,
              "successThreshold": 1,
              "failureThreshold": 5
            },
            "readinessProbe": {
              "httpGet": {
                "path": "/readiness",
                "port": 8081,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 30,
              "timeoutSeconds": 5,
              "periodSeconds": 10,
              "successThreshold": 1,
              "failureThreshold": 3
            },
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          },
          {
            "name": "dnsmasq",
            "image": "k8s-gcrio.azureedge.net/k8s-dns-dnsmasq-nanny-amd64:1.14.8",
            "args": [
              "-v=2",
              "-logtostderr",
              "-configDir=/kube-dns-config",
              "-restartDnsmasq=true",
              "--",
              "-k",
              "--cache-size=1000",
              "--no-resolv",
              "--server=127.0.0.1#10053",
              "--server=/in-addr.arpa/127.0.0.1#10053",
              "--server=/ip6.arpa/127.0.0.1#10053",
              "--log-facility=-"
            ],
            "ports": [
              {
                "name": "dns",
                "containerPort": 53,
                "protocol": "UDP"
              },
              {
                "name": "dns-tcp",
                "containerPort": 53,
                "protocol": "TCP"
              }
            ],
            "resources": {},
            "volumeMounts": [
              {
                "name": "kube-dns-config",
                "mountPath": "/kube-dns-config"
              }
            ],
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          },
          {
            "name": "healthz",
            "image": "k8s-gcrio.azureedge.net/exechealthz-amd64:1.2",
            "args": [
              "--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null",
              "--url=/healthz-dnsmasq",
              "--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null",
              "--url=/healthz-kubedns",
              "--port=8080",
              "--quiet"
            ],
            "ports": [
              {
                "containerPort": 8080,
                "protocol": "TCP"
              }
            ],
            "resources": {
              "limits": {
                "memory": "50Mi"
              },
              "requests": {
                "cpu": "10m",
                "memory": "50Mi"
              }
            },
            "livenessProbe": {
              "httpGet": {
                "path": "/healthz-dnsmasq",
                "port": 8080,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 60,
              "timeoutSeconds": 5,
              "periodSeconds": 10,
              "successThreshold": 1,
              "failureThreshold": 5
            },
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "Default",
        "nodeSelector": {
          "beta.kubernetes.io/os": "linux"
        },
        "serviceAccountName": "kube-dns",
        "serviceAccount": "kube-dns",
        "securityContext": {},
        "affinity": {
          "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
              {
                "weight": 100,
                "podAffinityTerm": {
                  "labelSelector": {
                    "matchExpressions": [
                      {
                        "key": "k8s-app",
                        "operator": "In",
                        "values": [
                          "kube-dns"
                        ]
                      }
                    ]
                  },
                  "topologyKey": "kubernetes.io/hostname"
                }
              }
            ]
          }
        },
        "schedulerName": "default-scheduler",
        "tolerations": [
          {
            "key": "CriticalAddonsOnly",
            "operator": "Exists"
          }
        ]
      }
    },
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": "25%",
        "maxSurge": "25%"
      }
    },
    "revisionHistoryLimit": 2,
    "progressDeadlineSeconds": 600
  },
  "status": {
    "observedGeneration": 1,
    "replicas": 2,
    "updatedReplicas": 2,
    "readyReplicas": 2,
    "availableReplicas": 2,
    "conditions": [
      {
        "type": "Progressing",
        "status": "True",
        "lastUpdateTime": "2018-04-16T12:13:29Z",
        "lastTransitionTime": "2018-04-16T12:12:48Z",
        "reason": "NewReplicaSetAvailable",
        "message": "ReplicaSet \"kube-dns-v20-59b4f7dc55\" has successfully progressed."
      },
      {
        "type": "Available",
        "status": "True",
        "lastUpdateTime": "2018-05-16T01:25:10Z",
        "lastTransitionTime": "2018-05-16T01:25:10Z",
        "reason": "MinimumReplicasAvailable",
        "message": "Deployment has minimum availability."
      }
    ]
  }
}

@anujshankar
Author

anujshankar commented May 17, 2018

The kube-dns pods also continuously go back and forth between Running and CrashLoopBackOff states.

@CecileRobertMichon
Contributor

This looks related to #2971 and #2880

@jackfrancis
Member

@anujshankar asked in #2971, but will paste here as well:

Could you kindly:

  • set up a tcpdump for DNS traffic on a node that has a scheduled pod that is doing DNS requests
  • do some DNS lookups from the node itself (from the Ubuntu CLI, for example) -- we expect these lookups to succeed
  • let the pod run long enough to do its own DNS lookups -- we expect these lookups to fail

What we'd like to see is: what is the difference, if any, between the DNS lookups going out on the wire? Are the pod-originating lookups being SNAT'd in a particular way compared to the DNS lookups from the node OS?
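
For anyone trying to reproduce this, a minimal sketch of such a capture (interface and file name are only examples; run it on the node hosting the affected pod):

# Capture all DNS traffic on the node, including queries forwarded to the Azure DNS VIP 168.63.129.16.
sudo tcpdump -i any -n 'udp port 53 or tcp port 53' -w /tmp/dns-capture.pcap

# In another shell, a lookup from the node itself -- expected to succeed:
nslookup bing.com

# Then let the affected pod perform its own lookups (or kubectl exec into it and run nslookup),
# stop the capture, and compare the source addresses / SNAT behaviour of the two sets of queries.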

@anujshankar
Author

anujshankar commented May 18, 2018

For others facing this issue: we tried to implement the workaround suggested in #2880.

We changed the DNS nameserver IPs to 8.8.8.8 (Google DNS) on the nodes (VMs) by modifying the /etc/resolv.conf file. This solved our problem for now.

However, the scary part of this workaround is that /etc/resolv.conf should not be edited by hand this way on an Azure cluster, as it will be overwritten the next time the cluster is re-provisioned for any reason.
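
For completeness, roughly what that temporary edit looks like on a node (a non-persistent sketch only; on these Ubuntu nodes /etc/resolv.conf is typically regenerated, so the change is lost on reboot or re-provisioning):

sudo cp /etc/resolv.conf /etc/resolv.conf.bak
# Point the node at Google DNS instead of the Azure-provided resolver 168.63.129.16.
sudo sed -i 's/^nameserver .*/nameserver 8.8.8.8/' /etc/resolv.conf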

Points to note and incidents:

  1. The kube-dns pods went into CrashLoopBackOff state; restart counts were more than 50.
  2. We tried to add the following ConfigMap and restarted the kube-dns pods. We verified that the config was picked up by checking the kubedns logs after applying it (see the sketch after this list). This didn't solve our problem.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"acme.local": ["1.2.3.4"]}
  upstreamNameservers: |
    ["8.8.8.8", “8.8.4.4”]
  3. Created a pod in our cluster and ran a simple apt-get update command. It failed. Logs attached below.
root@anuj-shell-688784f69f-5jn99:/# apt-get update
  Err:1 http://archive.ubuntu.com/ubuntu bionic InRelease
  Temporary failure resolving 'archive.ubuntu.com'
  4. At this point the pod's /etc/resolv.conf was pointing to 10.0.0.10 (the kube-dns ClusterIP). Changed the nameserver to 8.8.8.8 in the pod's /etc/resolv.conf and ran the apt-get command again - it worked.
  5. Then we removed the ConfigMap applied above and changed the DNS nameserver IPs from 168.63.129.16 to 8.8.8.8 on all nodes.
  6. Restarted the kube-dns pods.
  7. Manually restarted the pods that were failing.
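
A rough sketch of applying and checking the ConfigMap from point 2 (the file name is hypothetical; the kubedns container logs a "Using configuration read from directory" line on startup, which is one way to confirm the config directory is being read):

kubectl apply -f kube-dns-upstream-configmap.yaml
# Restart the kube-dns pods so the new config is definitely loaded.
kubectl -n kube-system delete pod -l k8s-app=kube-dns
# Check the kubedns container logs of one of the new pods for the config being picked up.
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system logs <kube-dns-pod-name> -c kubedns | grep -i config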

@jackfrancis @CecileRobertMichon
It is quite strange that the nameservers can cause such problems and that changing them fixes the issue. Again, this solution is not sustainable: if we restart our VMs, the /etc/resolv.conf setting will revert to the default.

@jackfrancis
Member

Thanks for the update @anujshankar. Of course this is not an acceptable workaround; we're investigating why 168.63.129.16 is dropping DNS requests from pod-originating traffic on some clusters.

@jackfrancis
Member

@anujshankar Will you be able to spend some time this afternoon repro'ing this failure condition so we can do some real-time debugging?

@anujshankar
Author

anujshankar commented May 19, 2018

Sure @jackfrancis - let's do it.
How do you want to do it?
Can you share your email address/phone number?

@anujshankar
Author

anujshankar commented May 19, 2018

@jackfrancis take a look at the nslookup results below, fired from within a pod.

Any thoughts around this?

⇒  kubectl exec -it busybox-3-c8f969bdd-5xj8b -n default -- sh
/ # nslookup bing.com 168.63.129.16
Server:    168.63.129.16
Address 1: 168.63.129.16

nslookup: can't resolve 'bing.com'
/ # nslookup bing.com 8.8.8.8
Server:    8.8.8.8
Address 1: 8.8.8.8 google-public-dns-a.google.com

Name:      bing.com
Address 1: 204.79.197.200 a-0001.a-msedge.net
Address 2: 13.107.21.200

nslookup fired from a node

azureuser@k8s-master-29529181-0:~$ nslookup bing.com 168.63.129.16
Server:		168.63.129.16
Address:	168.63.129.16#53

Non-authoritative answer:
Name:	bing.com
Address: 204.79.197.200
Name:	bing.com
Address: 13.107.21.200

@jackfrancis
Member

@anujshankar Just to confirm, are you using the Azure CNI network implementation on your cluster? (--network-plugin=cni kubelet runtime config option)
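
One quick way to check this on a node, sketched under the assumption of a typical acs-engine node layout (the exact file holding the kubelet flags may differ between versions):

# Inspect the running kubelet's command line for the network plugin flag.
ps aux | grep '[k]ubelet' | tr ' ' '\n' | grep -- --network-plugin
# Or search common kubelet config locations for the flag.
grep -r -- --network-plugin /etc/default/kubelet /etc/systemd/system 2>/dev/null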

@anujshankar
Author

@jackfrancis yes, we are using the Azure CNI network implementation and we are running kubelet with the --containerized flag.

@jackfrancis
Member

@khenidak FYI

@anujshankar
Author

anujshankar commented May 25, 2018

For reference: @khenidak has posted the workaround in ticket #2971.

@anujshankar
Author

@jackfrancis @khenidak this issue has come up again in our QA cluster.
Let me know if we can debug together to gather more insights.

@jackfrancis
Member

Hi @anujshankar, the next time you encounter this issue, could you kindly:

  • share the output of "ebtables -t nat -L"
  • cat /var/log/kern.log

Thanks!
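
For anyone else hitting this, a small sketch of collecting those two outputs on the affected node (output file names are only examples):

# Run on the node where the failing pod is scheduled; both need root.
sudo ebtables -t nat -L > /tmp/ebtables-nat.txt
sudo cp /var/log/kern.log /tmp/kern.log.txt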

@anujshankar
Author

Sure @jackfrancis !

@anujshankar
Author

anujshankar commented Jun 8, 2018

@jackfrancis @khenidak I collected the output of "ebtables -t nat -L" and kern.log, and have sent you both outputs by email. Let us know if we can do a screen-share session.

@anujshankar
Author

@jackfrancis would the solution mentioned in #2971 help us out too?

@sharmasushant
Contributor

@anujshankar Can you please attach the output here? Depending on the reason, it may or may not help. In addition to what Jack has mentioned, please share the output of
kubectl get pods -o wide --all-namespaces
and the name of the pod that had problems resolving the DNS name.

@jackfrancis
Member

We think it may, yes.

@anujshankar
Author

anujshankar commented Jun 15, 2018

@sharmasushant can you send me your email address?

@jackfrancis that's great. Before proceeding, it would be good if @sharmasushant took a close look at the output of the above commands.

@anujshankar
Author

@jackfrancis @khenidak @sharmasushant
We re-created our development cluster using the flannel network plugin and have been using it for around two weeks.

Outcome: we haven't faced any DNS resolution failures so far.

We plan to re-create our QA and production clusters using the flannel network plugin as well.

Do you think we are going in the right direction?

@sharmasushant
Contributor

@anujshankar My email is sushant.sharma@microsoft.com
Regarding flannel vs. Azure CNI, I don't see anything DNS-specific that Azure CNI does that would cause the above issue. In the email, can you please also include your subscription ID and your cluster details (region & VM names), both for the clusters where you are not observing the DNS issue and for those where you are.

@anujshankar
Author

anujshankar commented Jun 15, 2018

Sure - will send you the details by EOD (GMT+6:30).

@anujshankar
Author

@sharmasushant I have sent you an email with the details.

@diwakar-s-maurya
Contributor

It has been over a month with flannel and we have not seen such a DNS resolution problem even once.

@jackfrancis
Member

@diwakar-s-maurya Thanks for sharing! Would love to hear more anecdotes on real-world flannel experience, positive or negative.
