
NetworkPolicy blocks return traffic when connecting to a pod IP #100

Closed
jbeerden opened this issue Oct 18, 2023 · 5 comments · Fixed by #102
Labels
bug Something isn't working

Comments

@jbeerden

What happened:
When connecting from one pod to another, inbound requests are allowed, but the return traffic is not.

When making the same connection through a service, the return traffic is allowed as expected.

/var/log/aws-routed-eni/network-policy-agent.log

{"level":"info","ts":"2023-10-18T09:52:23.319Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":33140,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:23.720Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.152","Src Port":60816,"Dest IP":"10.255.7.42","Dest Port":7979,"Proto":"TCP","Verdict":"ACCEPT"}
{"level":"info","ts":"2023-10-18T09:52:23.720Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.152","Src Port":60816,"Dest IP":"10.255.7.42","Dest Port":7979,"Proto":"TCP","Verdict":"ACCEPT"}
{"level":"info","ts":"2023-10-18T09:52:23.721Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":60816,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:23.721Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":60816,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:24.609Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":38838,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:24.849Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":60808,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:25.809Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":60816,"Proto":"TCP","Verdict":"DENY"}
{"level":"info","ts":"2023-10-18T09:52:25.809Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.152","Src Port":60816,"Dest IP":"10.255.7.42","Dest Port":7979,"Proto":"TCP","Verdict":"ACCEPT"}

Attach logs
I tried running the /opt/cni/bin/aws-cni-support.sh script, but it does not seem to be supported on Bottlerocket.

What you expected to happen:
Return traffic is allowed

How to reproduce it (as minimally and precisely as possible):

  1. Create a namespace A with a pod running, for example, busybox.
  2. Create a namespace B with a pod running a webserver.
  3. Create a service in namespace B through which the webserver is exposed, using the same port as the pod.
  4. Deploy a default-deny (ingress) network policy in both namespaces.
  5. Deploy a network policy in namespace B that allows ingress from the pod in namespace A, using a podSelector and namespaceSelector (see the sketch after this list).
  6. curl from the pod in namespace A to the service in namespace B --> this works.
  7. curl from the pod in namespace A to the pod in namespace B --> this does not work; the return traffic is blocked.
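
For reference, steps 4 and 5 might look like the sketch below. All names are hypothetical (namespaces ns-a and ns-b, a webserver pod labeled app: web, a client pod labeled app: client); only port 7979 comes from the logs in this report.

# Step 4: default-deny ingress (apply an equivalent policy in ns-a too).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ns-b            # hypothetical namespace name
spec:
  podSelector: {}            # empty selector: every pod in the namespace
  policyTypes:
    - Ingress
---
# Step 5: allow ingress to the webserver in ns-b from the client pod in ns-a.
# namespaceSelector and podSelector in the same "from" entry are ANDed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-client-from-ns-a
  namespace: ns-b
spec:
  podSelector:
    matchLabels:
      app: web               # hypothetical webserver pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ns-a
          podSelector:
            matchLabels:
              app: client    # hypothetical client pod label
      ports:
        - protocol: TCP
          port: 7979         # webserver port, per the logs above

With policies like these applied, the report shows curl against the service in namespace B succeeding, while curl against the pod IP fails: the inbound SYN is accepted, but the return packets are denied.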

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.27.4-eks-2d98532
  • CNI Version: v1.15.1-eksbuild.1 (EKS add-on)
  • OS (e.g: cat /etc/os-release): Bottlerocket OS 1.14.1 (aws-k8s-1.27)
  • Kernel (e.g. uname -a): 5.15.108
@jbeerden jbeerden added the bug Something isn't working label Oct 18, 2023
@jdn5126 jdn5126 transferred this issue from aws/amazon-vpc-cni-k8s Oct 18, 2023
@jayanthvn
Contributor

Just taking one sample flow -

{"level":"info","ts":"2023-10-18T09:52:23.720Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.152","Src Port":60816,"Dest IP":"10.255.7.42","Dest Port":7979,"Proto":"TCP","Verdict":"ACCEPT"}
{"level":"info","ts":"2023-10-18T09:52:23.720Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.152","Src Port":60816,"Dest IP":"10.255.7.42","Dest Port":7979,"Proto":"TCP","Verdict":"ACCEPT"}

After allowing the traffic (i.e., after a trie lookup ACCEPT), we make an entry in the agent's conntrack cache for the return flow.

So the return packet should just hit the conntrack entry (we do a reverse lookup) and be allowed. But here it appears the packet is falling through to a trie lookup, meaning the conntrack lookup failed for some reason.

{"level":"info","ts":"2023-10-18T09:52:23.721Z","logger":"ebpf-client","msg":"Flow Info:  ","Src IP":"10.255.7.42","Src Port":7979,"Dest IP":"10.255.7.152","Dest Port":60816,"Proto":"TCP","Verdict":"DENY"}

I will try to repro this locally.

@jayanthvn
Contributor

We were able to repro this. The issue occurs when the pods are on the same node, in combination with the way the network policies are set up in your cluster. We have a possible fix and will open a PR soon.

@jbeerden
Author

Thank you @jayanthvn for the quick response and fix!

Do you happen to have any insights as to when this change would make it to a new amazon-vpc-cni-k8s release?

@smithc14

> Do you happen to have any insights as to when this change would make it to a new amazon-vpc-cni-k8s release?

Also curious about release timing. This is a pretty big showstopper for users (like us) trying to move from Calico to the built-in VPC CNI network policy support.

@jayanthvn
Contributor

We are in the release testing phase and should have the release by next week.
