TypeError: cannot pickle '_io.TextIOWrapper' object #961

Open
lorddaedra opened this issue Feb 14, 2020 · 19 comments

@lorddaedra

gsutil -m -h "Cache-Control: public, max-age=31536000" cp -r test/** gs://some-bucket
Traceback (most recent call last):
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil.py", line 124, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 424, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 762, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 620, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1201, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1499, in Apply
    self._ParallelApply(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1719, in _ParallelApply
    self._CreateNewConsumerPool(process_count, thread_count,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1384, in _CreateNewConsumerPool
    p.start()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object

gsutil version: 4.47

@crwilcox

I ran into the same issue. If you use another interpreter (Python 3.7, for instance), all is well. This is a problem specifically with Python 3.8.

Google Cloud SDK 281.0.0
beta 2019.05.17
bq 2.0.53
cloud-firestore-emulator 1.10.4
core 2020.02.14
gsutil 4.47

@lorddaedra
Author

The bug is still there:

Google Cloud SDK 283.0.0
alpha 2019.05.17
app-engine-python 1.9.88
beta 2019.05.17
bq 2.0.54
cloud-datastore-emulator 2.1.0
core 2020.02.28
gsutil 4.48

@calebmoss

The bug still exists. It only occurs with the multiprocessing flag; the command runs fine without -m:

Google Cloud SDK 286.0.0
bq 2.0.55
core 2020.03.24
gsutil 4.48

@calebmoss

I tracked this down: the error comes from a change to the multiprocessing library in Python 3.8:

Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.

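Because nothing is set explicitly through either get_context or set_start_method, spawn is used by default for those on macOS with Python 3.8+. Under spawn, the process object is pickled before being handed to the child (the reduction.dump call in the traceback), which is presumably where the open text stream gets rejected.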
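
The failure described above can be reproduced without gsutil. This is a minimal sketch, not gsutil's actual code: it assumes only that the object handed to the child process carries an open text stream somewhere inside it (the `state` dict and its `log` key are hypothetical). It uses ForkingPickler, the same pickler that appears at the bottom of the traceback.

```python
import io
import multiprocessing
from multiprocessing.reduction import ForkingPickler


def pickling_error(obj):
    """Return the TypeError message raised when pickling obj, or None if it pickles."""
    try:
        ForkingPickler.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in for state attached to a process object.
stream = io.TextIOWrapper(io.BytesIO())
state = {"log": stream}

# The start method in effect on this platform.
print("default start method:", multiprocessing.get_start_method())
# Plain data pickles without complaint.
print("plain data:", pickling_error({"log": "path/to/file"}))
# An open text stream does not.
print("open stream:", pickling_error(state))
```

With the fork start method the child inherits the parent's memory and this pickling step is skipped, which would explain why the same command works on Python 3.7. The documentation quoted above considers fork unsafe on macOS, so switching back is a trade-off rather than a clean fix.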

@caizixian

The issue is still present with Cloud SDK 302.0.0 (gsutil 4.52) on macOS 10.15.6, with Python 3.8.5 installed from Homebrew.

@caizixian

One workaround is to use the Python 3 interpreter shipped with macOS, /usr/bin/python3, by setting the Cloud SDK interpreter path: https://cloud.google.com/sdk/gcloud/reference/topic/startup

@aleb

aleb commented Jul 24, 2020

gsutil does not work with Python 3.8. Force it to use Python 3.7 with something like:

export CLOUDSDK_PYTHON=/usr/bin/python3     # on mac
export CLOUDSDK_PYTHON=/usr/bin/python3.7   # on linux
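
Before exporting CLOUDSDK_PYTHON, it may help to confirm which version a candidate interpreter actually is, since /usr/bin/python3 is not the same version on every machine. A small sketch in Python; the helper names are hypothetical, and only the CLOUDSDK_PYTHON variable comes from this thread.

```python
import os
import subprocess
import sys

# Written so that the probed interpreter may be Python 2 or Python 3.
_PROBE = "import sys; sys.stdout.write('%d %d' % sys.version_info[:2])"


def interpreter_version(path):
    """Ask the interpreter at `path` for its (major, minor) version."""
    out = subprocess.run(
        [path, "-c", _PROBE],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    major, minor = out.split()
    return int(major), int(minor)


def cloudsdk_env(path):
    """Return a copy of the current environment with CLOUDSDK_PYTHON set to `path`."""
    env = dict(os.environ)
    env["CLOUDSDK_PYTHON"] = path
    return env


if __name__ == "__main__":
    candidate = sys.executable  # replace with the path you intend to export
    print(candidate, interpreter_version(candidate))
```

The dictionary returned by `cloudsdk_env` can be passed as `env=` to `subprocess.run` when launching gsutil from a script, which avoids changing the shell environment globally.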

@dbrookeUAB

dbrookeUAB commented Jul 29, 2020

@aleb I'm not sure if this is specific to macOS Mojave, but the path to python3 for me was /usr/local/bin/python3. I couldn't get it to work with python3 anyway, but forcing it to use 2.7 worked like a charm.

export CLOUDSDK_PYTHON=/usr/local/bin/python3      # did not work
export CLOUDSDK_PYTHON=/usr/bin/python2.7          # worked

From the link @caizixian provided:

Python 3 is preferred over Python 2. Note that gcloud requires Python version 2.7.x or 3.5 and up. Other Python tools shipped in the Cloud SDK do not support Python 3 and require Python 2.7.x,

@dinvlad

dinvlad commented Sep 1, 2020

Another workaround on macOS is to install Python 3.7 with Homebrew and point the Cloud SDK at it:

brew install python@3.7
export CLOUDSDK_PYTHON=/usr/local/opt/python@3.7/bin/python3

@pcantalupo

@dinvlad That worked for me! Thank you so much

@rogerthomas84

@dinvlad Thank you! Works perfectly!

@wayjake

wayjake commented Sep 21, 2020

With such a strange "pickle" error, I didn't expect to find my resolution so quickly. Thank you, @dinvlad!!

@Catherine19950122

export CLOUDSDK_PYTHON=/usr/bin/python2.7 works. Using export CLOUDSDK_PYTHON=/usr/bin/python3 or export CLOUDSDK_PYTHON=path/for/python3.7 gets past the current issue, but then runs into a "module 'sys' has no attribute 'maxint'" error.
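
For context on that second error: sys.maxint exists only in Python 2. Python 3 removed it, and sys.maxsize is the usual replacement, so the message suggests that some code on that path still assumed Python 2. A minimal illustration (not gsutil's code):

```python
import sys

# sys.maxint was removed in Python 3; fall back to sys.maxsize where it is missing.
max_int = getattr(sys, "maxint", sys.maxsize)

print(hasattr(sys, "maxint"), max_int)
```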

@max-sixty

While I recognize comments like "Is tHiS fiXEd??" are not helpful, would it be possible for someone on the Google side to acknowledge this is a bug in gsutil and plan to resolve it?

Currently, IIUC, gsutil breaks on Python 3.8, a version released a year ago and the default brew version. Workarounds like installing another version of Python are not small adjustments, and they are difficult for less technical colleagues. There are 49 👍s on the issue.

@dilipped
Collaborator

Sorry for the delay. We are aware of this bug and are working on releasing this workaround soon: #1107

@dilipped
Collaborator

dilipped commented Oct 21, 2020

Another workaround would be to disable multiprocessing altogether when using Python 3.8. This can be done either by setting parallel_process_count=1 in the boto config file or by passing the option on the command line, like this:

gsutil -o "GSUtil:parallel_process_count=1" -m cp .....

This will be relatively slow because it uses a single process; however, multithreading will still be on.
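
For scripts that shell out to gsutil, the same workaround can be applied conditionally. The sketch below is only an illustration: the helper names are hypothetical, and `python_version` must be the version of the interpreter gsutil runs under, which the caller has to supply. The option name and the GSUtil section name are taken from the comment above.

```python
import configparser
import io


def gsutil_command(args, python_version):
    """Build a gsutil argv, forcing a single process on Python 3.8 or newer."""
    cmd = ["gsutil"]
    if tuple(python_version) >= (3, 8):
        cmd += ["-o", "GSUtil:parallel_process_count=1"]
    return cmd + list(args)


def boto_fragment():
    """Render the equivalent boto config section as text."""
    config = configparser.ConfigParser()
    config["GSUtil"] = {"parallel_process_count": "1"}
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


# Command-line form of the workaround.
print(gsutil_command(["-m", "cp", "-r", "test", "gs://some-bucket"], (3, 8)))
# Config-file form of the workaround.
print(boto_fragment())
```

Building the command as a list rather than a string avoids quoting problems around the -o value when it is passed to `subprocess.run`.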

@max-sixty

That's an excellent workaround, thanks @dilipped !

@tartavull

Updating gsutil solved the issue with Python 3.8.

@cuonghuynh

gsutil does not work with Python 3.8. Force it to use Python 3.7 with something like:

export CLOUDSDK_PYTHON=/usr/bin/python3     # on mac
export CLOUDSDK_PYTHON=/usr/bin/python3.7   # on linux

It works fine for me.
