
Port-forward drops connection to pod after first connection #1169

Closed
mkfdoherty opened this issue Jan 26, 2022 · 72 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@mkfdoherty

What happened:

When running Kubernetes v1.23.1 on Minikube with kubectl v1.23.2 I experienced the following unexpected behaviour when trying to create a port-forward to a pod running an arbitrary service.

kubectl version:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.2", GitCommit:"9d142434e3af351a628bffee3939e64c681afa4d", GitTreeState:"clean", BuildDate:"2022-01-19T17:27:51Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/amd64"}                                                        
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:34:54Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}

What we see is that after the first netcat connection successfully closes, we lose the connection to the pod and the port-forward closes:

kubectl port-forward to pod output:

Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
Handling connection for 5432
E0125 16:43:20.470080   17437 portforward.go:406] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod 55b25aeaae996c672f7eb762ce083e9b9666acabe96946d47790c167f1949d64, uid : exit status 1: 2022/01/25 15:43:20 socat[5831] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
E0125 16:43:20.470389   17437 portforward.go:234] lost connection to pod

We would expect the connection to stay open, as is the case with Kubernetes before v1.23.0.

What you expected to happen:
When running the test against EKS running Kubernetes version v1.21.5-eks-bc4871b we get the port-forward behavior we are used to. The port-forward remains open after the first successful netcat connection.

kubectl version:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.2", GitCommit:"9d142434e3af351a628bffee3939e64c681afa4d", GitTreeState:"clean", BuildDate:"2022-01-19T17:27:51Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.5-eks-bc4871b", GitCommit:"5236faf39f1b7a7dabea8df12726f25608131aa9", GitTreeState:"clean", BuildDate:"2021-10-29T23:32:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.21) exceeds the supported minor version skew of +/-1

Notice how the kubectl version is v1.23.2 and the server version is v1.21.5-eks-bc4871b. EKS seems to manage version skew on its own somehow.

The output we get after opening multiple connections is what we expect. The connection is not closed after subsequent nc commands (don’t be alarmed by the connection refusal from PostgreSQL; we are not using the right protocol or credentials. We are just trying to test the port-forward behavior, and this is a simple way to demonstrate the issue).

kubectl port-forward to pod output:

Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
Handling connection for 5432
E0125 16:35:32.441184   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:32 socat[45088] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432
E0125 16:35:35.765744   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:35 socat[45202] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432
E0125 16:35:37.129167   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:37 socat[45243] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432

As we can see, the port-forward connection survives many netcat connections. This is the behavior we expect.

For completeness, this was also tested using Minikube running Kubernetes v1.21.5. The problem still exists if we ignore version skew, but if we match the kubectl and Minikube Kubernetes versions at v1.21.5 then we again get the expected behavior of the port-forward remaining open past the first connection.

How to reproduce it (as minimally and precisely as possible):

My test is as follows:

  1. Open a port-forward to a pod with a running service like PostgreSQL (kubectl port-forward $POD_WITH_SERVICE 5432:5432)
  2. Try to open an nc connection on localhost to the local port (nc -v localhost 5432)
  3. We should be able to open nc connections multiple times without the port-forward breaking (the behaviour on Kubernetes before v1.23.0)

Tests were conducted against Kubernetes versions (v1.21.5, v1.22.1 and v1.23.1) on Minikube using minikube start --kubernetes-version=v1.21.5. Using minikube kubectl -- we can match the kubectl version to the Kubernetes version Minikube is using to avoid version skew. The problem I describe only appears when running Kubernetes above v1.23.0.
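
For reference, a minimal sketch of these steps in shell form (the pod name is the same placeholder as above):

# terminal 1: forward local port 5432 to the pod
kubectl port-forward $POD_WITH_SERVICE 5432:5432

# terminal 2: connect a few times; before v1.23 the forward survives each connection
nc -v localhost 5432
nc -v localhost 5432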

Anything else we need to know?:
Based on the above testing, it would seem that there is a bug introduced in kubectl >v1.23.0 which causes port-forwards to close immediately after a successful connection. This is a problem given that the above test expects the old behaviour of long-lasting kubectl port-forwards. My assumption is that this is a bug, since this behavior is not mentioned explicitly in CHANGELOG-1.23, so it may be a regression. Could someone please shed light on whether this is a regression or expected behavior now, for reasons unknown to me?

Environment:

  • Kubernetes client and server versions (use kubectl version): Listed above based on my expectations
  • Cloud provider or hardware configuration: minikube v1.25.1 on Darwin 12.1 using Docker Desktop 4.4.2 (73305) and EKS v1.21.5-eks-bc4871b to verify behavior.
  • OS (e.g: cat /etc/os-release): When testing locally on a Docker node:
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
@mkfdoherty mkfdoherty added the kind/bug Categorizes issue or PR as related to a bug. label Jan 26, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 26, 2022
@k8s-ci-robot
Contributor

@mkfdoherty: This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eddiezane
Member

Possibly related to kubernetes/kubernetes#103526. Though that should have only had an effect when the pod dies anyways. We just stopped hiding the broken behavior.

We want to rewrite the port forward for this release as well.

@brianpursley
Member

@eddiezane Yes, I think this is probably related to the "fix" in kubernetes/kubernetes#103526. I put "fix" in quotes because the fix was to allow port-forward to fail when there is an error instead of getting stuck in an unrecoverable non-failed state that can never process connections again.

@mkfdoherty You mentioned that this is expected behavior:

Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
Handling connection for 5432
E0125 16:35:32.441184   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:32 socat[45088] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432
E0125 16:35:35.765744   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:35 socat[45202] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432
E0125 16:35:37.129167   17073 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod b4b99448ef949d8f4a2f7960edf5d25eaf0e3c7b82bb1fcd525c7f30ad2830d7, uid : exit status 1: 2022/01/25 15:35:37 socat[45243] E connect(5, AF=2 127.0.0.1:5432, 16): Connection refused
Handling connection for 5432

But I'm guessing that you continue to get connection refused even though the pod has failed and restarted. It says it is handling the connection, but it fails every time, so it's not really forwarding them. In this case port-forward is still technically running (from a process standpoint on your local machine), but it is never able to forward connections again until you stop and restart it. This was the behavior of kubectl prior to 1.23.0.

@mkfdoherty Can you double-check the kubectl version you were using in both cases? I don't think this problem should be dependent on the cluster version, which is why I'm asking. It would surprise me if the behavior of port-forward using the same kubectl version were different depending on the cluster version.

Also, can you check whether your pod has restarted while port-forward is running? If that happens, the behavior from kubectl 1.23.0 and later is for the kubectl port-forward command to log an error saying "lost connection to pod" and exit.
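
One quick way to check for restarts while the forward is running is to watch the pod (the pod name below is a placeholder):

kubectl get pod postgres -w
# or read the restart count directly
kubectl get pod postgres -o jsonpath='{.status.containerStatuses[0].restartCount}'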


For reference, I tried reproducing using kubectl 1.23.1 with a 1.24.0-alpha cluster and also with a 1.21 cluster (this one an EKS cluster).

I was starting a tcp echo server, like this:

kubectl run tcpecho --image=alpine --restart=Never -- /bin/sh -c "apk add socat && socat -v tcp-listen:8080,fork EXEC:cat"
kubectl port-forward pod/tcpecho 8080

Then connecting like this:

nc -v localhost 8080

Are you able to reproduce the problem using the tcp echo server I mentioned above?

@mkfdoherty
Author

@eddiezane and @brianpursley I do agree that it does sound like the "fix" in kubernetes/kubernetes#103526, which in general is a fix. But I found it unexpected in this specific scenario, which could be generalised to other cases of opening and closing connections to a service that a pod is running:

  1. A pod running PostgreSQL is up and running.
  2. We create a port-forward to the PostgreSQL pod using kubectl v1.23.3.
  3. We now open a successful psql connection to a database on the PostgreSQL instance via the port-forward.
  4. We close the connection to psql gracefully.
  5. The port-forward is now closed with the following error:
E0207 08:03:13.969992   13701 portforward.go:406] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod ae5cb9fc17a1a793190887ac6d87bb3bf12e06df55bb03e370480884d2b4d69f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-06fe22d7-1b91-ffc9-2c5b-568ff9137a34": read tcp4 127.0.0.1:59580->127.0.0.1:5432: read: connection reset by peer
E0207 08:03:13.970197   13701 portforward.go:234] lost connection to pod

We expected the port-forward to remain open for subsequent psql connections (that we close after each use). This was the case before v1.23.x. I have tested using kubectl v1.22.6 and port-forward does continue to remain open and functional even when we close psql connections although the port-forward does complain of errors (these errors are not unrecoverable).

Handling connection for 5432
E0207 09:01:20.124576   14998 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod ae5cb9fc17a1a793190887ac6d87bb3bf12e06df55bb03e370480884d2b4d69f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-06fe22d7-1b91-ffc9-2c5b-568ff9137a34": read tcp4 127.0.0.1:33310->127.0.0.1:5432: read: connection reset by peer

I would not consider this scenario to be an example of a port-forward error that requires the port-forward connection to be closed. Opening a psql connection and closing it does not indicate that there is anything wrong with the underlying pod or the port-forward connection. So I would say that in this scenario the port-forward is in a recoverable state and can accept new connections, unlike many other scenarios in which a port-forward may return an error. Can we better distinguish between these different cases of port-forward errors?

I have run the echo server, which does not cause the port-forward to break when using netcat to connect to it. This does indeed work as you have said. I would hope that this same behaviour would be the case when opening up connections using psql. Opening and closing a netcat connection does not close the port-forward, but opening and closing psql connections gracefully does return an error and close the port-forward. The only difference I see is that a netcat connection closing does not cause the port-forward to return a recoverable error but closing psql gracefully does. Are these scenarios so different? Might we not expect the same behaviour? Or might we consider it problematic to use the port-forward in this way?

To clarify an error in my reproduction from the original post:
My original reproduction method using netcat does not suffice and was the result of a false positive. Minikube has network issues in a recent update that cause my pods to fail periodically. I understand this to be the case because my pod is running Patroni, which manages PostgreSQL and restarts the PostgreSQL process without kubelet being aware of it (a major downside of this design approach). Therefore the port-forward would fail for good reasons, without pods actually being restarted by kubelet, which I think is the intended value behind the kubernetes/kubernetes#103526 PR. I am sorry for realising this after the fact, and I really appreciate you taking the time to reproduce my issue. I am now using kind and EKS to avoid the networking issue I currently experience with Minikube for my use case. And so the issue appears to be only with kubectl versions, as you have proposed.

@brianpursley
Member

This was the case before v1.23.x. I have tested using kubectl v1.22.6 and port-forward does continue to remain open and functional even when we close psql connections although the port-forward does complain of errors (these errors are not unrecoverable).

Hmm, so maybe there are different types of errors: some unrecoverable, where it makes sense to stop forwarding, but others recoverable, like the example you mentioned.

I’ll have to test with psql and see if I can reproduce that way and see what the difference is. It sounds like you are indeed hitting an issue with the change that was made in kubectl 1.23.0.

If that’s the case, using kubectl <1.23 should be a workaround for now until we can figure out what is going on here and fix it.

@brianpursley
Member

I'm still trying to reproduce this using kubectl 1.23.3 (and other versions), and still am not able to. There must be something else different about our clusters.

First, just so we're on the same page, here is my latest attempt to reproduce. I create a pod running PostgreSQL, forward port 5432, then connect using psql from my local machine:

Terminal 1:

kubectl run postgres --image=postgres --env=POSTGRES_PASSWORD=hunter2
kubectl port-forward postgres 5432

Terminal 2:

~ $ psql -h localhost -U postgres << EOF
> create table foo (bar integer, baz varchar); 
> insert into foo values(1, 'a'),(2, 'b'); 
> select * from foo; 
> drop table foo;
> EOF
CREATE TABLE
INSERT 0 2
 bar | baz 
-----+-----
   1 | a
   2 | b
(2 rows)

DROP TABLE
~ $ psql -h localhost -U postgres << EOF
> create table foo (bar integer, baz varchar); 
> insert into foo values(1, 'a'),(2, 'b'); 
> select * from foo; 
> drop table foo;
> EOF
CREATE TABLE
INSERT 0 2
 bar | baz 
-----+-----
   1 | a
   2 | b
(2 rows)

DROP TABLE
~ $ psql -h localhost -U postgres -c "SELECT * FROM somethingThatDoesntExist"
ERROR:  relation "somethingthatdoesntexist" does not exist
LINE 1: SELECT * FROM somethingThatDoesntExist
                      ^
~ $ psql -h localhost -U postgres -c "SELECT * FROM somethingThatDoesntExist"
ERROR:  relation "somethingthatdoesntexist" does not exist
LINE 1: SELECT * FROM somethingThatDoesntExist
                      ^
~ $ psql -h localhost -U postgres << EOF
> create table foo (bar integer, baz varchar); 
> insert into foo values(1, 'a'),(2, 'b'); 
> select * from foo; 
> drop table foo;
> EOF
CREATE TABLE
INSERT 0 2
 bar | baz 
-----+-----
   1 | a
   2 | b
(2 rows)

DROP TABLE

After several psql sessions, port forwarding remains running. Even when I issued a command that failed, the port forwarding connection itself remained intact.

@mkfdoherty Are my above commands similar to what you are doing when the problem happens?

Next, I'd like to try to find out if there is some difference in the clusters which is making this behave differently for you than it is for me.

@mkfdoherty Can you post your output of kubectl describe nodes?

I'm wondering if there is a difference in the container runtime or CNI provider.

Thanks for any additional info you can provide. I'm hoping we can get to the bottom of this as I'm sure if you're having this problem, some others will too.

@brianpursley
Member

I also tried using a Minikube cluster...

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:34:54Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}

And EKS as well.

@mkfdoherty
Author

Thank you so much for your replication attempt. I appreciate you taking the time.

I decided to try to replicate this again using a popular PostgreSQL Helm chart. To my surprise, the behavior is different from that of the PostgreSQL instance I have running with the Patroni replication manager, which likely explains why your replication procedure does not match mine. This appeared quite odd to me given that my PostgreSQL logs and pod appeared completely healthy when closing psql connections, yet only the port-forward failed, leading me to believe that this issue would be standard behavior across psql clients interacting with PostgreSQL servers over kubectl >1.23 port-forwards.

So my original assumption that this was a common recoverable error that could affect psql and other common clients communicating with servers over port-forwards seems to be incorrect.

I am investigating this less typical issue further but at this point the kubernetes/kubernetes#103526 fix seems to work mostly as intended in the common cases I have tested. I would consider this closed unless I find reason to suspect otherwise.

@Silvenga

Also started happening for us after upgrading to 1.23 from 1.22 - also postgres - only impacting port-forwarding.

I'm a little confused, what's the fix for this?

@cconnert

I experienced a similar issue while doing port-forwarding to a PGO managed database:

Client Version: v1.24.2
Kustomize Version: v4.5.4
Server Version: v1.23.5

At some point the connection gets lost. Interestingly, the connection is also lost when I exit the local psql.
Using kubectl v1.22.0, port-forwarding is rock solid.

@nic-6443

nic-6443 commented Jun 24, 2022

I had the same problem. I found that the reason is that the Postgres server sends an RST packet when the client (like psql) disconnects from the server in TLS mode, because it shuts down the sub-process handling the connection without reading the SSL shutdown packet sent by the client. And the new logic introduced in kubernetes/kubernetes#103526 causes the port-forward itself to be shut down when an RST packet is read from the server side of a connection established via the port-forward.

If you are okay with using the plaintext protocol, you can use PGSSLMODE=disable to temporarily bypass this issue in the Postgres case.
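
For example, for a one-off psql session (host, port, and user are placeholders):

PGSSLMODE=disable psql -h localhost -p 5432 -U postgres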

@panteparak

panteparak commented Jul 21, 2022

@nic-6443 So is there a fix for this other than falling back to disabling SSL on PostgreSQL? I am experiencing this problem too.

@Silvenga

FWIW, I'm not sure what I upgraded, but I'm no longer having this issue with DataGrip on the same cluster (so an upgrade to DataGrip, kubectl, or the cluster backplane).

@panteparak

FWIW, I'm not sure what I upgraded, but I'm no longer having this issue with DataGrip on the same cluster (so an upgrade to DataGrip, kubectl, or the cluster backplane).

I see. I will check whether upgrading the cluster works (my DataGrip and kubectl are up to date).

@PKizzle

PKizzle commented Jul 22, 2022

@Silvenga Which versions have you upgraded the cluster, kubectl and DataGrip to?

@Silvenga

@PKizzle I'm just assuming something changed, since I noticed it stopped happening (disabling SSL for my login is a chore to get done, so I never tried). I recently went through and upgraded all my local software, e.g. Windows, Datagrip, etc. Postgres in the cluster wasn't upgraded.

I do distinctly remember upgrading the Datagrip Postgres driver. This was also when I migrated to use kubelogin after getting a bit too annoyed with the CLI warnings. 😆 But I doubt kubelogin would have impacted anything.

I'm on PTO, I'll check on the cluster's current version when I get back next week. Feel free to poke me if I forget.

@panteparak

panteparak commented Aug 1, 2022

@Silvenga Just a gentle ping on this issue :D

Also, can you confirm that your PostgreSQL is using SSL?

@Silvenga

Silvenga commented Aug 1, 2022

K8s:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.8", GitCommit:"bd30c9fbc8dc9668d7ab4b0bd4fdab5c929c1ad7", GitTreeState:"clean", BuildDate:"2022-06-21T17:15:16Z", GoVersion:"go1.17.11", Compiler:"gc", Platform:"linux/amd64"}

# kubelogin --version
kubelogin version
git hash: v0.0.14/f345047a580aaaf133b009041963d50b98d8d2e2
Go version: go1.17.11
Build time: 2022-07-07T17:00:54Z

I'm apparently still using the old xenial repositories in my WSL2 instance, where 1.20.4 is latest. I should switch that over at some point...

This cluster (v1.23.8) is located in Azure where the backplane is managed by AKS. All nodes have the latest security patches installed weekly with the K8s version matching the backplane. The cluster is using the standard Azure network driver.

Datagrip:

Datagrip: 2022.2.1
Driver: PostgreSQL JDBC Driver (ver. 42.4.0, JDBC4.2)
SSL: yes

All defaults except:

Run keep-alive query every 10 sec. (FWIW, doesn't actually seem to help)

When executing kubectl port-forward the following output is typical for me (Datagrip is functional in this case):

Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
Handling connection for 5432
Handling connection for 5432
E0801 10:27:19.881663     121 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod 1f9a1685499a757ea34f558db57bd7bf29c54118c682e94fcd936cbd754df46d, uid : failed to execute portforward in network namespace "/var/run/netns/cni-7bfd7099-ab38-8f8b-b220-f653bb6013f4": read tcp4 127.0.0.1:50698->127.0.0.1:5432: read: connection reset by peer
E0801 10:28:16.876917     121 portforward.go:400] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod 1f9a1685499a757ea34f558db57bd7bf29c54118c682e94fcd936cbd754df46d, uid : failed to execute portforward in network namespace "/var/run/netns/cni-7bfd7099-ab38-8f8b-b220-f653bb6013f4": read tcp4 127.0.0.1:50496->127.0.0.1:5432: read: connection reset by peer

Where the connection reset by peer error is from Datagrip disconnecting one of its connections.

Previously, the lost connection to pod error would occur nearly instantly after starting the port forward and letting Datagrip connect (I would lose the connection when Datagrip attempted to connect to the local port). It happened so quickly that running any command was fruitless.


Let me know @panteparak if I missed anything. I would really suspect that the driver was the real fix for me. Of course, I can't discount networking changes in Azure by Microsoft (there have been several since Jun 17, my first comment here).

@dnnnvx

dnnnvx commented Aug 1, 2022

Hey, found this issue since I have the same problem but with argocd while trying to access the web UI:

kubectl port-forward service/argocd-server -n argocd 8080:443
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080

Handling connection for 8080
E0801 18:06:03.794037   26029 portforward.go:406] an error occurred forwarding 8080 -> 8080: error forwarding port 8080 to pod f4f1b30d071d4f15a4132aec5048cb482ef6f58699a32e74f284acd2bc8dd87b, uid : failed to execute portforward in network namespace "/var/run/netns/cni-7924ddc6-f0c5-1383-1d9d-0a011a47b2a7": read tcp4 127.0.0.1:56102->127.0.0.1:8080: read: connection reset by peer
E0801 18:06:03.794507   26029 portforward.go:234] lost connection to pod

The version is:

Client Version: v1.24.3
Kustomize Version: v4.5.4
Server Version: v1.24.3

(The cluster is a local kubeadm setup on 2 Intel NUCs with Debian).

EDIT:
Using http instead of https solved the problem 🤧

@kieranbenton

Same - experiencing this with DBeaver. I am unsure as to why this issue has been closed. Surely just disabling HTTPS is not a long term solution to this problem?

@nic-6443

nic-6443 commented Aug 5, 2022

@nic-6443 So is there a fix for this other than falling back to disabling SSL on PostgreSQL? I am experiencing this problem too.

@panteparak I don't have any good ideas yet. There is also a hacky approach (which we currently use in our test environment): create iptables rules in an init container to drop TCP RST packets sent from Postgres.

      initContainers:
      - args:
        - -I
        - OUTPUT
        - -p
        - tcp
        - --tcp-flags
        - RST
        - RST
        - -j
        - DROP
        command:
        - iptables
        image: istio/proxy_init:1.0.2
        imagePullPolicy: IfNotPresent
        name: drop-tcp-rst
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
            drop:
            - ALL
          privileged: true
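
For the record, the init container above boils down to running this single command inside the pod's network namespace before the main containers start:

iptables -I OUTPUT -p tcp --tcp-flags RST RST -j DROP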

@marcellodesales

DataGrip confirmed

  • Setting sslmode: disabled worked in DataGrip


@ibot3

ibot3 commented Apr 17, 2023

This isn't really helpful if the Postgres server enforces SSL.

@kieranbenton

kieranbenton commented Apr 17, 2023 via email

@jdsdc

jdsdc commented Apr 17, 2023

This isn't really helpful if the Postgres server enforces SSL.

I agree. It is just a workaround and does not address the original problem.

@kieranbenton

This issue needs reopening

@yuvalavidor

seeing the same, on several pods.

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:47:25Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.10-eks-48e63af", GitCommit:"9176fb99b52f8d5ff73d67fea27f3a638f679f8a", GitTreeState:"clean", BuildDate:"2023-01-24T19:17:48Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}

@barisbll

barisbll commented May 8, 2023

I am facing the same issue with an M1 Mac; this issue needs reopening.

@MegalodonBite

same problem here. latest k8s, latest kubectl
tried on several laptops and clusters

@adityasamant25

Same issue here. kubectl with kind. Port forwarding for port 443 works only once after which the connection is lost and all further requests are left hanging.

@trickstyler

Same issue with M1 MBP, trying to port-forward a redis pod into local 6379 port unsuccessfully

@zulfikar4568

zulfikar4568 commented Jun 25, 2023

I have the same issue here, port-forwarding 9090 on my load balancer; I created the cluster using kind.

But I solved it by adding extra port mappings in my kind config:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
  - containerPort: 9090
    hostPort: 9090
    protocol: TCP
  - containerPort: 1833
    hostPort: 1833
    protocol: TCP
  - containerPort: 4222
    hostPort: 4222
    protocol: TCP
- role: worker
- role: worker
- role: worker

@thesuperzapper

thesuperzapper commented Jul 4, 2023

For those watching, it's possible that this issue is the same as kubernetes/kubernetes#74551, which people believe is actually an issue with the container runtime.

There is a proposed PR to fix containerd here: containerd/containerd#8418


Also, here is another related issue, for reference: #1368

@bergner

bergner commented Sep 28, 2023

Since I'm seeing the same problem with port-forward closing after the first connection, I resorted to downloading a much older version of kubectl to use for the port-forward: https://cdn.dl.k8s.io/release/v1.22.14/bin/linux/amd64/kubectl
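
A minimal sketch of fetching that older binary and keeping it alongside the regular kubectl (the install path is just an example):

curl -LO https://cdn.dl.k8s.io/release/v1.22.14/bin/linux/amd64/kubectl
chmod +x kubectl
mv kubectl ~/bin/kubectl22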

Newer version of kubectl port-forward:

$ oc port-forward -n vizone-dev service/vizone-db 15432:5432
Forwarding from 127.0.0.1:15432 -> 5432
Forwarding from [::1]:15432 -> 5432
Handling connection for 15432
Handling connection for 15432
E0928 14:20:52.249249 3315375 portforward.go:406] an error occurred forwarding 15432 -> 5432: error forwarding port 5432 to pod 7d75b15749aa96e0d76632f20a00ad08aa2ff5482d10f1688c9c75ad6f60c669, uid : port forward into network namespace "/var/run/netns/60b9691c-10e7-42bc-8b14-1f38ee62268f": read tcp [::1]:38464->[::1]:5432: read: connection reset by peer
E0928 14:20:52.250059 3315375 portforward.go:234] lost connection to pod

The 1.22 version of kubectl port-forward, with me connecting and exiting psql twice. The port-forward remains running.

$ kubectl22 port-forward -n vizone-dev service/vizone-db 15432:5432
Forwarding from 127.0.0.1:15432 -> 5432
Forwarding from [::1]:15432 -> 5432
Handling connection for 15432
Handling connection for 15432
E0928 14:27:45.878971 3316749 portforward.go:400] an error occurred forwarding 15432 -> 5432: error forwarding port 5432 to pod 7d75b15749aa96e0d76632f20a00ad08aa2ff5482d10f1688c9c75ad6f60c669, uid : port forward into network namespace "/var/run/netns/60b9691c-10e7-42bc-8b14-1f38ee62268f": read tcp [::1]:32780->[::1]:5432: read: connection reset by peer
Handling connection for 15432
Handling connection for 15432
E0928 14:27:52.651387 3316749 portforward.go:400] an error occurred forwarding 15432 -> 5432: error forwarding port 5432 to pod 7d75b15749aa96e0d76632f20a00ad08aa2ff5482d10f1688c9c75ad6f60c669, uid : port forward into network namespace "/var/run/netns/60b9691c-10e7-42bc-8b14-1f38ee62268f": read tcp [::1]:49546->[::1]:5432: read: connection reset by peer

@P9os

P9os commented Jan 19, 2024

Trouble exists on k8s 1.28.5 and kubectl 1.29.1:

$ kubectl -n postgres-operator port-forward pod/hippo-hippo-instance-x5ls-0 5432:5432
Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
Handling connection for 5432
Handling connection for 5432
E0119 19:52:37.036121 1412760 portforward.go:409] an error occurred forwarding 5432 -> 5432: error forwarding port 5432 to pod 8dacd19a0fc4a570997e394e1ec4b98d849ac129377f266c7377bef888d4511e, uid : failed to execute portforward in network namespace "/var/run/netns/cni-76d1769b-331f-b977-5ae3-7c24f677755d": read tcp4 127.0.0.1:46938->127.0.0.1:5432: read: connection reset by peer
E0119 19:52:37.080359 1412760 portforward.go:370] error creating forwarding stream for port 5432 -> 5432: EOF
error: lost connection to pod

@spyro2000

spyro2000 commented Jan 30, 2024

Same for MariaDB and a simple port-forwarding. No SSL, using default options with JDBC.

E0130 14:27:35.925386 25068 portforward.go:406] an error occurred forwarding 3307 -> 3307: error forwarding port 3307 to pod cbe8b3812e0dd7e8c793f4395d80bc3a42679829b7ab8f2c3831eaf7f4003e2b, uid : failed to execute portforward in network namespace "/var/run/netns/cni-0e9c2674-0ed6-b133-3453-140c73d6899b": failed to connect to localhost:3307 inside namespace "cbe8b3812e0dd7e8c793f4395d80bc3a42679829b7ab8f2c3831eaf7f4003e2b", IPv4: dial tcp4 127.0.0.1:3307: connect: connection refused IPv6 dial tcp6: address localhost: no suitable address found
E0130 14:27:35.999824 25068 portforward.go:234] lost connection to pod

@aaron-sysdig

Just ran into this issue running kube locally. The issue was fixed by increasing the memory available to the node.

@piec

piec commented Feb 8, 2024

A better-than-nothing workaround: wrap it in a while true loop:

while true; do kubectl port-forward ...; echo .; done
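
A slightly more robust variant (pod name and ports are placeholders) sleeps between restarts so a hard failure doesn't spin:

while true; do
  kubectl port-forward pod/my-postgres 5432:5432
  echo "port-forward exited, restarting in 1s" >&2
  sleep 1
done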

@manojtr

manojtr commented Feb 26, 2024

Same issue with both server and client at version v1.27.9, on an old Mac. Please reopen the issue. Thanks

@vreddhi

vreddhi commented Feb 29, 2024

I am also seeing the same issue with a simple nginx pod exposed on port 8080

@enihcam

enihcam commented Mar 13, 2024

same issue here:

# kubectl version
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.18.4-tke.25

====update====

Just realized that only ports listening on localhost in the target pod can be port-forwarded. It would be good if we could have an option to make all ports forwardable.
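
For what it's worth, one way to check which address a process inside the pod is bound to (assuming ss is available in the image; the pod name is a placeholder):

kubectl exec my-pod -- ss -lntp
# per the observation above, only sockets bound to 127.0.0.1 (or 0.0.0.0) are reachable via the port-forward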

@thesuperzapper

The related issues are:

It also looks like @sxllwx has a PR which might fix it: kubernetes/kubernetes#117493

@ruslaniv

ruslaniv commented Jul 4, 2024

Same here, had to disable SSL mode in Pycharm Pro to fix the connection being dropped after first attempt

@manschoe

manschoe commented Aug 2, 2024

Having the same issue with port-forwarding since Kubernetes version 1.29.4:
(screenshot of the port-forward error omitted)
Everything works fine on the current version, 1.28.5.
Kubectl versions:
(screenshot of kubectl version output omitted)
Any other suggestions/workarounds/actual fixes?
