port-forward dropping the connection for certain types of files #1368

Open · dvaldivia opened this issue Feb 10, 2023 · 18 comments
Labels: kind/bug (Categorizes issue or PR as related to a bug.), triage/accepted (Indicates an issue or PR is ready to be actively worked on.)

Comments

@dvaldivia
We are seeing a problem where kubectl port-forward drops the connection after serving a particular type of file, in this case a video. Strangely enough, when we serve videos from non-static sources (copying a stream of bytes from MinIO, for example), the pipe never breaks, and file size makes no difference.

I created a repo with a reproduction case at https://github.com/dvaldivia/gofileserve

Steps to reproduce

1. Deploy the app in your Kubernetes cluster

kubectl apply -f kubernetes/deployment.yaml

2. Port Forward to this service

kubectl port-forward svc/gofileserve 4001:4000

3. In another terminal, try to fetch Spinning-Cat.mp4

curl http://localhost:4001/assets/media/Spinning-Cat.mp4

You'll see the port-forward break, and no request will work afterwards:

➜ kubectl port-forward svc/gofileserve 4001:4000
Forwarding from 127.0.0.1:4001 -> 4000
Forwarding from [::1]:4001 -> 4000
Handling connection for 4001
E0210 13:22:27.776615   61159 portforward.go:379] error copying from remote stream to local connection: readfrom tcp4 127.0.0.1:4001->127.0.0.1:52390: write tcp4 127.0.0.1:4001->127.0.0.1:52390: write: broken pipe

If you restart the port-forward and fetch other static files, like hello.json, the pipe never breaks:

➜ curl http://localhost:4001/assets/hello.json
{
  "hello": "world"
}

So for this particular type of file, something breaks inside kubectl port-forward.
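
For context, the repro app boils down to a plain Go static file server. A minimal sketch along those lines (the port and directory layout here are assumptions; see the linked repo for the actual code):

// main.go: minimal static file server in the spirit of gofileserve.
// Port and paths are assumptions; the real code lives in the repo above.
package main

import (
	"log"
	"net/http"
)

func main() {
	// Serve ./assets (e.g. assets/media/Spinning-Cat.mp4, assets/hello.json)
	// under the /assets/ URL prefix.
	fs := http.FileServer(http.Dir("assets"))
	http.Handle("/assets/", http.StripPrefix("/assets/", fs))
	log.Println("listening on :4000")
	log.Fatal(http.ListenAndServe(":4000", nil))
}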

dvaldivia added the kind/bug label on Feb 10, 2023
k8s-ci-robot added the needs-triage label on Feb 10, 2023
@mpuckett159 (Contributor)

/triage accepted
/assign @brianpursley

k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Mar 1, 2023
@brianpursley (Member) commented Mar 2, 2023

It looks like curl is the one breaking the pipe.

When I run your curl command, I get this warning and it exits with exit code 23 (in curl, exit code 23 means a write error):

$ curl http://localhost:4001/assets/media/Spinning-Cat.mp4
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.
$ echo $?
23

curl gives you a couple of options:

  1. You can use the --output flag with a filename, like curl http://localhost:4001/assets/media/Spinning-Cat.mp4 --output Spinning-Cat.mp4
  2. If you really want the binary output to go to stdout, you can use --output - like this: curl http://localhost:4001/assets/media/Spinning-Cat.mp4 --output -. I'd imagine you use this if you want to pipe the output to another process.

When I did option 1, it worked:

$ curl http://localhost:4001/assets/media/Spinning-Cat.mp4 --output Spinning-Cat.mp4
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1427k  100 1427k    0     0   138M      0 --:--:-- --:--:-- --:--:--  154M
$ ls -la Spinning-Cat.mp4 
-rw-rw-r-- 1 bpursley bpursley 1461984 Mar  1 21:43 Spinning-Cat.mp4

I did, however, notice that once you get the broken pipe, kubectl's port forwarding seems to get into some weird, seemingly random timeout state until you restart port forwarding. It just doesn't recover well from errors (and it doesn't exit either). If port-forward cannot recover from an error, it should exit with a non-zero exit code rather than keep running in a semi-operable state.

Example output of this weird timeout state:

$ kubectl port-forward svc/gofileserve 4001:4000
Forwarding from 127.0.0.1:4001 -> 4000
Forwarding from [::1]:4001 -> 4000
Handling connection for 4001
E0301 21:28:52.072825   20184 portforward.go:378] error copying from remote stream to local connection: readfrom tcp4 127.0.0.1:4001->127.0.0.1:54100: write tcp4 127.0.0.1:4001->127.0.0.1:54100: write: broken pipe
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
E0301 21:31:05.071949   20184 portforward.go:346] error creating error stream for port 4001 -> 4000: Timeout occurred
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
E0301 21:31:48.082873   20184 portforward.go:346] error creating error stream for port 4001 -> 4000: Timeout occurred
Handling connection for 4001
E0301 21:32:00.898307   20184 portforward.go:346] error creating error stream for port 4001 -> 4000: Timeout occurred
Handling connection for 4001
Handling connection for 4001
Handling connection for 4001
E0301 21:32:14.506840   20184 portforward.go:346] error creating error stream for port 4001 -> 4000: Timeout occurred
Handling connection for 4001
E0301 21:32:32.954398   20184 portforward.go:346] error creating error stream for port 4001 -> 4000: Timeout occurred
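
For what it's worth, this direction of failure is easy to reproduce outside kubectl: once the local client (curl here) hangs up mid-transfer, the forwarder's next writes to that socket fail with a broken pipe. A standalone Go sketch of that mechanism, not kubectl's actual portforward code (the payload size and sleep are arbitrary):

// brokenpipe.go: shows how an early client hang-up surfaces as
// "write: broken pipe" (or "connection reset") in a forwarder-style copy.
// This is a sketch of the mechanism, not kubectl's portforward implementation.
package main

import (
	"errors"
	"io"
	"log"
	"net"
	"strings"
	"syscall"
	"time"
)

func main() {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}

	// The "local client": connects and immediately hangs up,
	// like curl exiting with code 23 before consuming the body.
	go func() {
		conn, err := net.Dial("tcp4", ln.Addr().String())
		if err != nil {
			log.Fatal(err)
		}
		conn.Close()
	}()

	local, err := ln.Accept()
	if err != nil {
		log.Fatal(err)
	}
	time.Sleep(100 * time.Millisecond) // give the close time to land

	// The "remote stream": a large body still being copied to the client.
	remote := strings.NewReader(strings.Repeat("x", 1<<22))
	if _, err := io.Copy(local, remote); err != nil &&
		(errors.Is(err, syscall.EPIPE) || errors.Is(err, syscall.ECONNRESET)) {
		log.Printf("error copying from remote stream to local connection: %v", err)
	}
}

Which matches the split above: the broken pipe itself is the client's doing; failing to recover afterwards is kubectl's.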

@vangie commented Mar 10, 2023

I encountered a similar problem on a Mac M1, where the websocket protocol was running on the forwarded port. There is no forwarding interruption for the same service with the Linux arm64 build of kubectl:

E0310 15:49:30.434931   37797 portforward.go:406] an error occurred forwarding 7070 -> 7070: error forwarding port 7070 to pod 36a7b2dfefae7f1a5efdb5c2d4302f308373e04573ec4cf5fb7ad62f7e8be944, uid : failed to execute portforward in network namespace "/var/run/netns/cni-748379a4-44d1-3802-107a-8bc98f3e1fb4": readfrom tcp4 127.0.0.1:46016->127.0.0.1:7070: write tcp4 127.0.0.1:46016->127.0.0.1:7070: write: broken pipe
E0310 15:49:30.438039   37797 portforward.go:234] lost connection to pod

@dvaldivia (Author)

@brianpursley I don't think it's curl alone; we noticed this happening when our web app attempted to load a video for a background animation, so I think it has more to do with the content being returned than anything else.

@andreldm

Sorry for not bringing any concrete evidence, but I'm experiencing the same problem while port-forwarding kafka-ui, which is just doing boring REST calls under the hood. Using a Mac M1 as well, if that matters.

@mooshmoosh

I'm also having this issue with Apache Superset on an M1 Mac.

@ravindk89

containerd/containerd#8418 <- there might be a fix in the pipeline for this, per a comment on that issue.

@john-ramsey

> I'm also having this issue with Apache Superset on an M1 Mac.

Same here - M1 as well FWIW

@gusandrioli

Same here for Grafana UI on a Mac M1.

@dvaldivia (Author)

Seems to be an issue happening mainly on Apple Silicon.

@dwt commented Sep 20, 2023

Apple Silicon here; grafana-ui but also cockroach-ui triggers this for me, both for local clusters via Docker Desktop and for remote x86 clusters.

On kubernetes-cli 1.27.4 this hangs after the first problem; on 1.28.2 it seems to at least recover after such a problem. So still excessive logging, but better behavior?

Do you guys consider this problem solved with that, or is the logging (see the collapsed section below) something you also want to fix?

Logs
❯ kubectl port-forward svc/rancher-monitoring-grafana 8080:80
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
E0920 15:38:56.723419   70989 portforward.go:381] error copying from remote stream to local connection: readfrom tcp6 [::1]:8080->[::1]:58214: write tcp6 [::1]:8080->[::1]:58214: write: broken pipe
E0920 15:38:56.723658   70989 portforward.go:394] error copying from local connection to remote stream: read tcp6 [::1]:8080->[::1]:58214: read: connection reset by peer
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
E0920 15:39:04.363633   70989 portforward.go:381] error copying from remote stream to local connection: readfrom tcp6 [::1]:8080->[::1]:58221: write tcp6 [::1]:8080->[::1]:58221: write: broken pipe
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
E0920 15:39:26.930397   70989 portforward.go:370] error creating forwarding stream for port 8080 -> 8080: Timeout occurred
E0920 15:39:31.194304   70989 portforward.go:370] error creating forwarding stream for port 8080 -> 8080: Timeout occurred
Handling connection for 8080
E0920 15:39:39.843494   70989 portforward.go:347] error creating error stream for port 8080 -> 8080: Timeout occurred
Handling connection for 8080
E0920 15:39:40.720737   70989 portforward.go:370] error creating forwarding stream for port 8080 -> 8080: Timeout occurred
Handling connection for 8080
E0920 15:40:09.845963   70989 portforward.go:347] error creating error stream for port 8080 -> 8080: Timeout occurred
Handling connection for 8080
E0920 15:40:10.730176   70989 portforward.go:370] error creating forwarding stream for port 8080 -> 8080: Timeout occurred
Handling connection for 8080

@debarshibasak

This issue happens on Windows too for us.

@hadifarnoud

I have the same issue with Postgres and pgadmin4, and keeping the connection alive with while true; do nc -vz 127.0.0.1 8080; sleep 7; done did not help.

@allanrogerr

Still an issue.

@ravindk89

Maybe containerd/containerd#9875 is related? It seems like attempts to fix this have been ongoing.

@ramondeklein commented Jul 5, 2024

Not 100% sure my issue is the same, but I have created a reproduction scenario that locks up the port forwarding every time, on Apple Silicon, on Windows (running WSL2), and on AKS (MS Azure). The full repo can be found here: https://github.com/ramondeklein/k8s-portforward-bug.

@dvaldivia (Author)

I think @ramondeklein's repro is also interesting, @brianpursley, as it shines more light on this problem and on how the TCP streams seem to be the issue; but I don't see what could have changed in kubectl around the 1.20 release to introduce this.

@surendragalwa

I have also encountered a similar issue when running load tests against a forwarded port:

E0828 17:56:03.007294   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.007362   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.007439   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.007578   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.008054   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.008098   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.008137   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
E0828 17:56:03.008248   81349 portforward.go:346] error creating error stream for port 8080 -> 8080: Timeout occurred
