Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make _PYTHON_SYSCONFIGDATA_NAME available to cmake #19301

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

henryiii
Copy link

@henryiii henryiii commented May 5, 2023

Cross compiling Python modules is broken if they use CMake to build, due to FindPython (and other libs like pybind11's config) not being able to get the correct cross-compiling values from sysconfig.

This is a suggested fix which allows emcmake.py to run by passing the value through and restoring it after Python starts up. I haven't tested the Windows version of it.

Personally not a fan of wrappers - most of the values here can be set via environment variables (CMAKE_TOOLCHAIN_FILE and CMAKE_GENERATOR), only CMAKE_CROSSCOMPILING_EMULATOR is technically needed AFAICT.

CC @hoodmane, this is a fix I suggested on Gitter.

emcmake.py Outdated
@@ -27,6 +27,9 @@ def run():

args = sys.argv[1:]

# Restore removed SYSCONFIG settings so they will be present in the CMake environment
os.environ["_PYTHON_SYSCONFIGDATA_NAME"] = os.environ["_EMCMAKE_PYTHON_SYSCONFIGDATA_NAME"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where exactly is _PYTHON_SYSCONFIGDATA_NAME removed? Its removed by the -E we pass to python in the wrapper scripts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emscripten/emcmake.bat

Lines 13 to 15 in d1530e7

:: -E will not ignore _PYTHON_SYSCONFIGDATA_NAME an internal
:: of cpython used in cross compilation via setup.py.
@set _PYTHON_SYSCONFIGDATA_NAME=

emscripten/emcmake

Lines 14 to 16 in d1530e7

# $PYTHON -E will not ignore _PYTHON_SYSCONFIGDATA_NAME an internal
# of cpython used in cross compilation via setup.py.
unset _PYTHON_SYSCONFIGDATA_NAME

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like we deliberately strip/remove that environment variable. See #16736.

Should we just remove the stripping on _PYTHON_SYSCONFIGDATA_NAME?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah interesting so this is a fight between two python-specific use cases. I vote to remove the stripping on _PYTHON_SYSCONFIGDATA_NAME.

@hoodmane
Copy link
Contributor

hoodmane commented May 9, 2023

We probably need a better way to handle this than patching emcmake to be aware of our environment variables...

Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
@@ -10,9 +10,6 @@

:: All env. vars specified in this file are to be local only to this script.
@setlocal
:: -E will not ignore _PYTHON_SYSCONFIGDATA_NAME an internal
:: of cpython used in cross compilation via setup.py.
@set _PYTHON_SYSCONFIGDATA_NAME=
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files are auto-generated. See the comment at the top of the file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I suspected that but missed the comment. That begs the question, should it change for all of them? The GCC wrappers (probably!) aren't going to be calling Python to build extension modules, while CMake does. Thought was this just added due to standard practice with the -E, or was it really needed? Not sure having sysconfig report the cross-compiling Python is likely to affect things that aren't building extensions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm traveling at a conference so slow/limited currently)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure why _PYTHON_SYSCONFIGDATA_NAME was added explicitly here. Perhaps @pmp-p remembers?

I think we should probably just remove it from all of them, unless there is a real good reason not to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PYTHON_SYSCONFIGDATA_NAME will be reused in any python subprocess spawned by emcc if found in env so it must always be removed or emsdk will use the python cross compiler sysconfig for compiling ( and often fail ).

imho that fix is not ideal - and possibly wrong - it is simpler to force python3/python by altering default bin PATH to point to scripts that wrap python and are specifically set for cross compiling. That will work for any build tool including pyodide, see https://github.com/pygame-web/python-wasm-sdk for a POC.

Copy link
Author

@henryiii henryiii May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, even it it breaks emcc itself, that's fixable, as sysconfig isn't in the default packages loaded by site.py. You can just do:

import os
if "_PYTHON_SYSCONFIGDATA_NAME" in os.eviron:
    old = os.environ.pop("_PYTHON_SYSCONFIGDATA_NAME")
    import sysconfig
    os.environ["_PYTHON_SYSCONFIGDATA_NAME"] = old

at the top. The original sysconfigdata module will be loaded.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a reasonable hack yes.

I would like to be able to write a test for this.. is there some way we can make emscripten fail given a bad _PYTHON_SYSCONFIGDATA_NAME? What kind of failure would I expect?

Copy link
Contributor

@hoodmane hoodmane May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we would set _PYTHON_SYSCONFIGDATA_NAME to something like "_sysconfigdata__emscripten_wasm32-emscripten.py".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you can set _PYTHON_SYSCONFIGDATA_NAME=nothing-real and see exactly where sysconfig is being imported. If it doesn't fail, sysconfig is never imported.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm at the scientific-python developer's summit and likely can't work on this till next week or so)

Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants