The ray_julia library artifact is not Julia version agnostic #168

Open
omus opened this issue Sep 28, 2023 · 3 comments

omus commented Sep 28, 2023

See #167 (comment). Example of Julia specific artifacts in action: https://github.com/beacon-biosignals/Ray.jl/releases/tag/v0.0.1

omus commented Oct 2, 2023

Fixing this would also allow us to reduce our GHA cache size, as we currently save the cache per Julia version: #63

omus commented Oct 27, 2023

Example of what happens when you attempt to use the Ray package on different versions of Julia with the same library:

❯ julia-1.8 --project=build build/build_library.jl && julia-1.8 --project -e '@show VERSION; using Ray; println("Success")'
...
VERSION = v"1.8.5"
Success

❯ julia-1.9 --project -e '@show VERSION; using Ray; println("Success")'
VERSION = v"1.9.3"
C++ exception while wrapping module ray_julia_jll: invalid subtyping in definition of CxxMapStringDouble with supertype Any
ERROR: LoadError: invalid subtyping in definition of CxxMapStringDouble with supertype Any
Stacktrace:
 [1] register_julia_module
   @ ~/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:393 [inlined]
 [2] readmodule(so_path_cb::Ray.ray_julia_jll.var"#1#2", funcname::Symbol, m::Module, flags::Nothing)
   @ CxxWrap.CxxWrapCore ~/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:751
 [3] wrapmodule(so_path_cb::Function, funcname::Symbol, m::Module, flags::Nothing)
   @ CxxWrap.CxxWrapCore ~/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:761
 [4] include(mod::Module, _path::String)
   @ Base ./Base.jl:457
 [5] include(x::String)
   @ Ray ~/.julia/dev/Ray/src/Ray.jl:6
 [6] include
   @ ./Base.jl:457 [inlined]
 [7] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
   @ Base ./loading.jl:2049
in expression starting at /Users/cvogt/.julia/dev/Ray/src/ray_julia_jll/ray_julia_jll.jl:1
in expression starting at /Users/cvogt/.julia/dev/Ray/src/Ray.jl:1
in expression starting at stdin:3
ERROR: Failed to precompile Ray [3f779ece-f0b6-4c4f-a81a-0cb2add9eb95] to "/Users/cvogt/.julia/compiled/v1.9/Ray/jl_IqYMQb".
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
   @ Base ./loading.jl:2300
 [3] compilecache
   @ ./loading.jl:2167 [inlined]
 [4] _require(pkg::Base.PkgId, env::String)
   @ Base ./loading.jl:1805
 [5] _require_prelocked(uuidkey::Base.PkgId, env::String)
   @ Base ./loading.jl:1660
 [6] macro expansion
   @ ./loading.jl:1648 [inlined]
 [7] macro expansion
   @ ./lock.jl:267 [inlined]
 [8] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1611

omus commented Oct 27, 2023

Took a look into this. It turns out Bazel's cc_binary uses linkstatic=True by default, which is problematic because the libraries we link against are statically compiled into the built shared library. Using otool -L we can see there are no dynamic library links to libjulia or libcxxwrap-julia:

❯ otool -L julia_core_worker_lib.so
julia_core_worker_lib.so:
	bazel-out/darwin_arm64-opt/bin/julia_core_worker_lib.so (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
	/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1971.0.0)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1500.65.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1971.0.0)

Applying the following diff allows us to statically link the core worker libraries while dynamically linking libjulia and libcxxwrap-julia:

diff --git a/build/BUILD.bazel b/build/BUILD.bazel
index 1d04641..5e2748b 100644
--- a/build/BUILD.bazel
+++ b/build/BUILD.bazel
@@ -26,8 +26,8 @@ cc_binary(
     deps = [
         "@com_github_ray_project_ray//:core_worker_lib",
         "@com_github_ray_project_ray//:global_state_accessor_lib",
-        "@julia//:headers",
-        "@libcxxwrap_julia//:headers",
+        "@julia//:libjulia",
+        "@libcxxwrap_julia//:libcxxwrap_julia",
     ],
     linkshared=True,
 )
diff --git a/build/WORKSPACE.bazel.tpl b/build/WORKSPACE.bazel.tpl
index 693497d..77b0d29 100644
--- a/build/WORKSPACE.bazel.tpl
+++ b/build/WORKSPACE.bazel.tpl
@@ -19,6 +19,7 @@ local_repository(
 )

 # https://groups.google.com/g/bazel-discuss/c/lsbxZxNjJQw/m/NKb7f_eJBwAJ
+# https://groups.google.com/g/bazel-discuss/c/RtbidPdVFyU
 _JULIA_BUILD_FILE_CONTENT = """\
 package(
     default_visibility = [
@@ -28,8 +29,14 @@ package(

 cc_library(
     name = "headers",
-    hdrs = glob(["julia/**/*.h"]),
-    strip_include_prefix = "julia",
+    hdrs = glob(["include/julia/**/*.h"]),
+    strip_include_prefix = "include/julia",
+)
+
+cc_import(
+    name = "libjulia",
+    deps = [":headers"],
+    shared_library = "lib/libjulia.dylib",
 )
 """

@@ -37,7 +44,7 @@ cc_library(
 # https://github.com/beacon-biosignals/Ray.jl/issues/62
 new_local_repository(
     name = "julia",
-    path = "{{{JULIA_INCLUDE_DIR}}}",
+    path = "{{{JULIA_PREFIX_DIR}}}",
     build_file_content = _JULIA_BUILD_FILE_CONTENT,
 )

@@ -53,6 +60,12 @@ cc_library(
     hdrs = glob(["include/**/*.hpp"]),
     strip_include_prefix = "include",
 )
+
+cc_import(
+    name = "libcxxwrap_julia",
+    deps = [":headers"],
+    shared_library = "lib/libcxxwrap_julia.dylib",
+)
 """

 # TODO: Will eventually use a BinaryBuilder environment to access
diff --git a/build/build_library.jl b/build/build_library.jl
index 8cbe440..e5397d3 100644
--- a/build/build_library.jl
+++ b/build/build_library.jl
@@ -9,7 +9,7 @@ const RAY_DIR = joinpath(BUILD_DIR, "ray")
 const RAY_COMMIT = readchomp(joinpath(BUILD_DIR, "ray_commit"))
 const LIBRARY_NAME = "julia_core_worker_lib.so"

-const TEMPLATE_DICT = Dict("JULIA_INCLUDE_DIR" => joinpath(Sys.BINDIR, "..", "include"),
+const TEMPLATE_DICT = Dict("JULIA_PREFIX_DIR" => joinpath(Sys.BINDIR, ".."),
                            "CXXWRAP_PREFIX_DIR" => CxxWrap.prefix_path(),
                            "RAY_DIR" => RAY_DIR)

Using otool -L on this shared library shows:

❯ otool -L julia_core_worker_lib.so
julia_core_worker_lib.so:
	bazel-out/darwin_arm64-opt/bin/julia_core_worker_lib.so (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libjulia.dylib (compatibility version 1.0.0, current version 1.8.5)
	@rpath/libcxxwrap_julia.0.dylib (compatibility version 0.0.0, current version 0.11.1)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
	/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1971.0.0)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1500.65.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1971.0.0)
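
The check above can be scripted so the before/after comparison does not rely on reading the otool output by eye. This is a minimal sketch, not part of the build: `rpath_deps` is a hypothetical helper name, and it assumes the usual `otool -L` layout (a header line naming the file, then one dependency per line with the install name as the first field).

```shell
# Hypothetical helper: reads `otool -L` output on stdin and prints only the
# dependencies that are resolved through @rpath (i.e. dynamically linked
# libraries located at load time).
rpath_deps() {
    # NR > 1 skips the "<file>:" header line; $1 is the install name.
    awk 'NR > 1 && $1 ~ /^@rpath\// { print $1 }'
}

# Usage (macOS only, assumes the library has already been built):
#   otool -L julia_core_worker_lib.so | rpath_deps
```

With the patched build this should list the libjulia and libcxxwrap_julia entries shown above, and print nothing for the original build.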

Attempting to use the library:

❯ julia-1.8 --project=build build/build_library.jl && julia-1.8 --project -e 'using Ray; println("Success")'
...
Success

❯ julia-1.9 --project -e 'using Ray; println("Success")'

[46534] signal (11.2): Segmentation fault: 11
in expression starting at /Users/cvogt/.julia/dev/Ray/src/ray_julia_jll/ray_julia_jll.jl:14
_ZN5jlcxx6Module17add_type_internalINSt3__113unordered_mapINS2_12basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEdNS2_4hashIS9_EENS2_8equal_toIS9_EENS7_INS2_4pairIKS9_dEEEEEENS_13ParameterListIJEEE14_jl_datatype_tEENS_11TypeWrapperIT_EERSF_PT1_ at /private/var/tmp/_bazel_cvogt/f6792afd486f2310f20344e531af81b0/execroot/com_github_beacon_biosignals_ray_wrapper/bazel-out/darwin_arm64-opt/bin/julia_core_worker_lib.so (unknown line)
define_julia_module at /private/var/tmp/_bazel_cvogt/f6792afd486f2310f20344e531af81b0/execroot/com_github_beacon_biosignals_ray_wrapper/bazel-out/darwin_arm64-opt/bin/julia_core_worker_lib.so (unknown line)
register_julia_module at /Users/cvogt/.julia/artifacts/2526f1faf6c345898421326fb03a88a5e7875b71/lib/libcxxwrap_julia.0.11.1.dylib (unknown line)
register_julia_module at /Users/cvogt/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:393 [inlined]
readmodule at /Users/cvogt/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:751
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
wrapmodule at /Users/cvogt/.julia/packages/CxxWrap/5IZvn/src/CxxWrap.jl:761
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
jl_apply at /Users/cvogt/Development/Julia/aarch64/1.9/src/./julia.h:1880 [inlined]
do_call at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:126
eval_body at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:762
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:912
jl_eval_module_expr at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:203 [inlined]
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:715
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:856
ijl_toplevel_eval at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:921 [inlined]
ijl_toplevel_eval_in at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./Base.jl:457
jfptr_include_40476 at /Users/cvogt/Development/Julia/aarch64/1.9/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
jl_apply at /Users/cvogt/Development/Julia/aarch64/1.9/src/./julia.h:1880 [inlined]
jl_f__call_latest at /Users/cvogt/Development/Julia/aarch64/1.9/src/builtins.c:774
include at /Users/cvogt/.julia/dev/Ray/src/Ray.jl:6
unknown function (ip: 0x1050880d3)
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
jl_apply at /Users/cvogt/Development/Julia/aarch64/1.9/src/./julia.h:1880 [inlined]
do_call at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:126
eval_body at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:762
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:912
jl_eval_module_expr at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:203 [inlined]
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:715
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:856
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:856
ijl_toplevel_eval at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:921 [inlined]
ijl_toplevel_eval_in at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
_include at ./loading.jl:1963
include at ./Base.jl:457 [inlined]
include_package_for_output at ./loading.jl:2049
jfptr_include_package_for_output_35976 at /Users/cvogt/Development/Julia/aarch64/1.9/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
jl_apply at /Users/cvogt/Development/Julia/aarch64/1.9/src/./julia.h:1880 [inlined]
do_call at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:126
eval_body at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/cvogt/Development/Julia/aarch64/1.9/src/interpreter.c:762
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:912
jl_toplevel_eval_flex at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:856
ijl_toplevel_eval at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:921 [inlined]
ijl_toplevel_eval_in at /Users/cvogt/Development/Julia/aarch64/1.9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1903
include_string at ./loading.jl:1913 [inlined]
exec_options at ./client.jl:305
_start at ./client.jl:522
jfptr__start_45425 at /Users/cvogt/Development/Julia/aarch64/1.9/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/cvogt/Development/Julia/aarch64/1.9/src/gf.c:2940
jl_apply at /Users/cvogt/Development/Julia/aarch64/1.9/src/./julia.h:1880 [inlined]
true_main at /Users/cvogt/Development/Julia/aarch64/1.9/src/jlapi.c:573
jl_repl_entrypoint at /Users/cvogt/Development/Julia/aarch64/1.9/src/jlapi.c:717
Allocations: 748057 (Pool: 747562; Big: 495); GC: 1
ERROR: Failed to precompile Ray [3f779ece-f0b6-4c4f-a81a-0cb2add9eb95] to "/Users/cvogt/.julia/compiled/v1.9/Ray/jl_ACyUxk".
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
   @ Base ./loading.jl:2300
 [3] compilecache
   @ ./loading.jl:2167 [inlined]
 [4] _require(pkg::Base.PkgId, env::String)
   @ Base ./loading.jl:1805
 [5] _require_prelocked(uuidkey::Base.PkgId, env::String)
   @ Base ./loading.jl:1660
 [6] macro expansion
   @ ./loading.jl:1648 [inlined]
 [7] macro expansion
   @ ./lock.jl:267 [inlined]
 [8] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1611

One interesting thing to note about that stack trace is that it shows the dynamic linking is working, as it tries to use the right artifact for that Julia version:

❯ julia-1.8 --project -e 'using CxxWrap; println(CxxWrap.prefix_path())'
/Users/cvogt/.julia/artifacts/41add8c9ab2e88d387fe2342076221d162e4c80a

❯ julia-1.9 --project -e 'using CxxWrap; println(CxxWrap.prefix_path())'
/Users/cvogt/.julia/artifacts/2526f1faf6c345898421326fb03a88a5e7875b71

The reason libcxxwrap-julia uses different artifacts per Julia version at all is that, for some reason, this binary is Julia version specific. It's not fully clear to me why this is the case, as one of the reasons CxxWrap.jl works and Cxx.jl does not is that Cxx.jl was too tied to the internals of Julia.

Anyway, Geant4_julia_jll also has Julia version specific binaries, so we may just be stuck with Julia version specific binaries.
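
If the binaries do stay Julia version specific, anything that stores a build (a cache entry, a release asset) needs to be keyed by the Julia minor version. A minimal sketch of deriving such a key from a version string like the `VERSION` values printed earlier. The function names and the `ray_julia-julia<minor>` key format are assumptions for illustration only, not the naming this repository actually uses.

```shell
# Hypothetical helper: reduce a full Julia version ("1.9.3") to its
# minor series ("1.9"), mirroring the v1.8 / v1.9 split seen above.
julia_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Hypothetical key format (an assumption, not the project's real scheme).
build_key() {
    printf 'ray_julia-julia%s\n' "$(julia_minor "$1")"
}

# Usage:
#   build_key 1.8.5
#   build_key 1.9.3
```

Two patch releases in the same minor series map to the same key, while 1.8 and 1.9 get distinct keys.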
