You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There is a known limitation in the address sanitizer tool that it cannot resolve symbol names for .so libraries once they have been dlclose'd, which makes it difficult to attribute memory leaks correctly. For example in #4 we saw back traces that looked like this:
#0 0x7efd5eda19a7 in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:454
#1 0x7efd5e8c5203 in json_object_object_add_ex /tmp/carns/spack-stage/spack-stage-json-c-0.15-4i2cra2gmog2tf6qstkw35ox4jbqbtfy/spack-src/json_object.c:608
#2 0x7efd5a958c0c (<unknown module>)
#3 0x7efd5a957274 (<unknown module>)
#4 0x7efd5a96063d (<unknown module>)
#5 0x7efd5ebdb573 in bedrock::DynLibServiceFactory::registerProvider(bedrock::FactoryArgs const&) /tmp/carns/spack-stage/spack-stage-mochi-bedrock-main-fqlbjyevtzxobws7dci7wwpgojb4xtjj/spack-src/src/DynLibServiceFactory.hpp:78
That leak originated in a quintain provider library that had been loaded dynamically, but the key parts of the stack trace were obfuscated as (<unknown module>).
The easiest one for bedrock is to simply add RTLD_NODELETE to the flags passed in to the dlopen calls; this causes the dynamic library to remain in memory long enough for address sanitizer to produce a stack trace that looks like this instead:
#0 0x7fe34f5c89a7 in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:454
#1 0x7fe34f0bd203 in json_object_object_add_ex /tmp/carns/spack-stage/spack-stage-json-c-0.15-4i2cra2gmog2tf6qstkw35ox4jbqbtfy/spack-src/json_object.c:608
#2 0x7fe34b14fdb5 in validate_and_complete_config ../src/quintain-server.c:312
#3 0x7fe34b14e3b4 in quintain_provider_register ../src/quintain-server.c:106
#4 0x7fe34b15763d in quintain_register_provider ../src/quintain-bedrock-module.c:32
#5 0x7fe34f400c03 in bedrock::DynLibServiceFactory::registerProvider(bedrock::FactoryArgs const&) /tmp/carns/spack-stage/spack-stage-mochi-bedrock-main-legssn752dwsdq3lbyxiuk5wwzi7ecbr/spack-src/src/DynLibServiceFactory.hpp:78
#6 0x7fe34f41d6a7 in bedrock::ProviderManager::addProviderFromJSON(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /tmp/carns/spack-stage/spack-stage-mochi-bedrock-main-legssn752dwsdq3lbyxiuk5wwzi7ecbr/spack-src/src/ProviderManager.cpp:268
#7 0x7fe34f41e4cb in bedrock::ProviderManager::addProviderListFromJSON(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /tmp/carns/spack-stage/spack-stage-mochi-bedrock-main-legssn752dwsdq3lbyxiuk5wwzi7ecbr/spack-src/src/ProviderManager.cpp:281
... which makes it clear that the leak really originated in quintain.
This might not be a safe flag to use in cases, especially if we have a future scenario in which a library is loaded and unloaded repeatedly, because the RTLD_NODELETE flag prevents global and static variables from re-initializing.
For provider development/debugging purposes we could just have a bedrock json option to toggle this behavior so that bedrock configurations used for debugging/development can set that flag cleanly.
The text was updated successfully, but these errors were encountered:
For now I just added RTLD_NODELETE along with a comment saying we should come back to this if we ever want to dynamically load and unload libraries (right now Bedrock is able to load more libraries at run time, but it's not able to unload them).
There is a known limitation in the address sanitizer tool that it cannot resolve symbol names for .so libraries once they have been dlclose'd, which makes it difficult to attribute memory leaks correctly. For example in #4 we saw back traces that looked like this:
That leak originated in a quintain provider library that had been loaded dynamically, but the key parts of the stack trace were obfuscated as
(<unknown module>)
.Some workarounds have been noted at google/sanitizers#89.
The easiest one for bedrock is to simply add
RTLD_NODELETE
to the flags passed in to the dlopen calls; this causes the dynamic library to remain in memory long enough for address sanitizer to produce a stack trace that looks like this instead:... which makes it clear that the leak really originated in quintain.
This might not be a safe flag to use in cases, especially if we have a future scenario in which a library is loaded and unloaded repeatedly, because the
RTLD_NODELETE
flag prevents global and static variables from re-initializing.For provider development/debugging purposes we could just have a bedrock json option to toggle this behavior so that bedrock configurations used for debugging/development can set that flag cleanly.
The text was updated successfully, but these errors were encountered: