Amesos2 SuperLUDist wrappers failing to compile for all ATDM Trilinos builds supporting SPARC starting 2020-10-27 #8258

Closed
bartlettroscoe opened this issue Oct 27, 2020 · 9 comments
Labels
ATDM Sev: Blocker Problems that make Trilinos unfit to be adopted by one or more ATDM APPs client: ATDM Any issue primarily impacting the ATDM project client: SPARC Issues related to or needed more specifically by the ATDM SPARC code impacting: configure or build The issue is primarily related to configuring or building PA: Linear Solvers Issues that fall under the Trilinos Linear Solvers Product Area pkg: Amesos2 type: bug The primary issue is a bug in Trilinos code or tests

Comments

@bartlettroscoe
Member

CC: @trilinos/amesos2 , @srajama1 (Trilinos Linear Solvers Product Lead), @trilinos/framework

Next Action Status

Description

As shown on CDash in these builds, it appears that the commits in PR #8138 merged to 'develop' on 2020-10-26 broke all of the ATDM Trilinos builds that enable SuperLUDist and support SPARC.

The build errors shown here occur when compiling the files:

  • packages/amesos2/src/Amesos2_Factory.cpp
  • packages/amesos2/src/Amesos2_Details_LinearSolverFactory.cpp
  • packages/amesos2/src/Amesos2_Superludist.cpp

showing compiler errors like:

DETAILED BUILD ERRORS:
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_TypeMap.hpp(252): error: qualified name is not allowed

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_TypeMap.hpp(252): error: explicit type is missing ("int" assumed)

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(127): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(144): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(162): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(183): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(203): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(334): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(354): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(378): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(398): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(405): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(423): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_FunctionMap.hpp(432): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_decl.hpp(275): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"
          detected during:
            instantiation of class "Amesos2::Superludist<Matrix, Vector>::SLUData [with Matrix=Tpetra::CrsMatrix<double, int, long long, Tpetra::Details::DefaultTypes::node_type>, Vector=Tpetra::MultiVector<double, int, long long, Tpetra::Details::DefaultTypes::node_type>]" 
(307): here
            instantiation of class "Amesos2::Superludist<Matrix, Vector> [with Matrix=Tpetra::CrsMatrix<double, int, long long, Tpetra::Details::DefaultTypes::node_type>, Vector=Tpetra::MultiVector<double, int, long long, Tpetra::Details::DefaultTypes::node_type>]" 
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist.cpp(99): here

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_decl.hpp(290): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"
          detected during:
            instantiation of class "Amesos2::Superludist<Matrix, Vector>::SLUData [with Matrix=Tpetra::CrsMatrix<double, int, long long, Tpetra::Details::DefaultTypes::node_type>, Vector=Tpetra::MultiVector<double, int, long long, Tpetra::Details::DefaultTypes::node_type>]" 
(307): here
            instantiation of class "Amesos2::Superludist<Matrix, Vector> [with Matrix=Tpetra::CrsMatrix<double, int, long long, Tpetra::Details::DefaultTypes::node_type>, Vector=Tpetra::MultiVector<double, int, long long, Tpetra::Details::DefaultTypes::node_type>]" 
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist.cpp(99): here

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_decl.hpp(275): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"
          detected during:
            instantiation of class "Amesos2::Superludist<Matrix, Vector>::SLUData [with Matrix=Tpetra::CrsMatrix<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>, Vector=Tpetra::MultiVector<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>]" 
(307): here
            instantiation of class "Amesos2::Superludist<Matrix, Vector> [with Matrix=Tpetra::CrsMatrix<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>, Vector=Tpetra::MultiVector<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>]" 
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist.cpp(214): here

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist_decl.hpp(290): error: class "Amesos2::TypeMap<Amesos2::Superludist, double>" has no member "LUstruct_t"
          detected during:
            instantiation of class "Amesos2::Superludist<Matrix, Vector>::SLUData [with Matrix=Tpetra::CrsMatrix<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>, Vector=Tpetra::MultiVector<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>]" 
(307): here
            instantiation of class "Amesos2::Superludist<Matrix, Vector> [with Matrix=Tpetra::CrsMatrix<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>, Vector=Tpetra::MultiVector<double, int, long long, Kokkos_Compat_KokkosSerialWrapperNode>]" 
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/amesos2/src/Amesos2_Superludist.cpp(214): here

/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Kokkos_Complex.hpp(695): warning: calling a constexpr __host__ function("hypot") from a __host__ __device__ function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
          detected during:
            instantiation of "RealType Kokkos::abs(const Kokkos::complex<RealType> &) [with RealType=float]" 
(712): here
            instantiation of "Kokkos::complex<RealType> Kokkos::sqrt(const Kokkos::complex<RealType> &) [with RealType=float]" 
/vscratch1/jenkins/vortex-slave/workspace/Trilinos-atdm-ats2-cuda-10.1.243-xl-2020.03.18_spmpi-rolling_static_opt/SRC_AND_BUILD/Trilinos/packages/kokkos-kernels/src/Kokkos_ArithTraits.hpp(1590): here

18 errors detected in the compilation of "/tmp/jenkins/tmpxft_00015556_00000000-6_Amesos2_Superludist.cpp1.ii".

This will break all of the SPARC Trilinos Integration builds that depend on these installs of Trilinos starting on SPARC testing day 2020-10-28.
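
The repeated `has no member "LUstruct_t"` errors above all trace back to one nested type that is missing from the type map. As a minimal C++11 sketch, a template could check for that member once and report a single descriptive message. Every name here (`has_LUstruct_t`, `SolverDataSketch`, `CompleteMap`, `IncompleteMap`) is hypothetical and not taken from Amesos2; this illustrates the detection idiom only and is not the fix that was applied.

```cpp
#include <type_traits>

// C++11 stand-in for std::void_t.
template <typename...>
struct make_void { typedef void type; };

// Primary template: assume the nested type is absent.
template <typename T, typename = void>
struct has_LUstruct_t : std::false_type {};

// Chosen only when T::LUstruct_t names a type.
template <typename T>
struct has_LUstruct_t<T, typename make_void<typename T::LUstruct_t>::type>
    : std::true_type {};

// Hypothetical type maps: one declares the nested type, one does not.
struct CompleteMap { typedef int LUstruct_t; };
struct IncompleteMap {};

// Hypothetical consumer that checks the requirement before using it.
template <typename Map>
struct SolverDataSketch {
  static_assert(has_LUstruct_t<Map>::value,
                "type map lacks LUstruct_t; check the library version guards");
  typename Map::LUstruct_t lu;
};

static_assert(has_LUstruct_t<CompleteMap>::value, "member detected");
static_assert(!has_LUstruct_t<IncompleteMap>::value, "member absent");
```

Instantiating `SolverDataSketch<IncompleteMap>` would report the `static_assert` message alongside the member lookup error, which points more directly at the type map than the lookup error does on its own.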

Current Status on CDash

See the current status of these builds for the current testing day here.

Steps to Reproduce

One should be able to reproduce this failure on any CEE LAN machine (using the cee-rhel6 env) or the machine 'vortex' (using the ats2 env), or any of the CTS-1 machines like 'eclipse' or 'chamma' (using the cts' env) as described in:

More specifically, the commands for the CEE LAN cee-rhel6 env (which is now a RHEL7 env) are provided at:

The exact commands to reproduce this build error on a CEE LAN RHEL7 machine should be, for example:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh \
    Trilinos-atdm-cee-rhel6_clang-9.0.1_openmpi-4.0.3_serial_static_opt

$ cmake \
 -GNinja \
 -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
 -DTrilinos_ENABLE_TESTS=ON \
 -DTrilinos_ENABLE_Amesos2=ON \
 $TRILINOS_DIR

$ make NP=16
@bartlettroscoe
Member Author

@sebrowne, @tcfisher,

FYI: This will likely block the next upgrade of Trilinos for SPARC, and it will take down all of the SPARC Trilinos Integration builds until this gets fixed.

@bartlettroscoe
Member Author

@keitat, I think this was your PR #8138 (to upgrade the version of SuperLUDist for #8062)? Can you coordinate with SPARC developers on how to fix this? (Contact me offline if you need introductions or contact info.)

@keitat
Contributor

keitat commented Oct 27, 2020

It's strange. It should activate the macro for old SuperLU_DIST. Let me build with that.

@keitat
Contributor

keitat commented Oct 27, 2020

I confirmed the bug on vortex60. Making a couple of PRs.

@bartlettroscoe
Member Author

I confirmed the bug on vortex60. Making a couple of PRs.

Thanks @keitat!

@bartlettroscoe
Member Author

@keitat, just for clarification, were you able to reproduce the build error on 'vortex' using the ATDM Trilinos build instructions at:

?

@keitat
Contributor

keitat commented Oct 28, 2020

@keitat, just for clarification, were you able to reproduce the build error on 'vortex' using the ATDM Trilinos build instructions at:

?

Yes. I did:
source ../cmake/std/atdm/load-env.sh cuda-debug
cmake -GNinja -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Kokkos=ON -DTrilinos_ENABLE_Amesos2=ON ${Trilinos_DIR}

I found that the build system found SuperLU_DIST-5.1, which helped me to identify the typo in an ifdef block in Amesos2_Superludist_TypeMap.hpp.
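
The thread does not show the typo itself. As a hedged illustration of why a misspelled macro in a preprocessor guard fails silently, here is a small C++ sketch. The macro `SKETCH_HAVE_OLD_SLU_API`, its misspelling, and the structs `LegacyLU` and `RenamedLU` are invented for this example; they are not the actual Amesos2 or SuperLU_DIST names.

```cpp
#include <type_traits>

// Assumption: the build system defines a macro like this when it detects
// the older library release. The name is hypothetical.
#define SKETCH_HAVE_OLD_SLU_API 1

struct LegacyLU {};   // stand-in for the struct name in the older release
struct RenamedLU {};  // stand-in for the struct name in the newer release

// Correctly spelled guard: the legacy branch is compiled.
struct GoodTypeMap {
#if defined(SKETCH_HAVE_OLD_SLU_API)
  typedef LegacyLU LUstruct_t;
#else
  typedef RenamedLU LUstruct_t;
#endif
};

// Misspelled guard (API -> APY): an unknown macro is simply "not defined",
// so the preprocessor compiles the #else branch without any diagnostic.
struct TypoTypeMap {
#if defined(SKETCH_HAVE_OLD_SLU_APY)
  typedef LegacyLU LUstruct_t;
#else
  typedef RenamedLU LUstruct_t;
#endif
};

static_assert(std::is_same<GoodTypeMap::LUstruct_t, LegacyLU>::value,
              "correct guard selects the legacy type");
static_assert(std::is_same<TypoTypeMap::LUstruct_t, RenamedLU>::value,
              "misspelled guard falls through to the other branch");
```

If the wrongly selected branch named a type that the installed library does not declare, the typedef itself would fail to compile and every later use would report a missing member, which is at least consistent with the errors listed in this issue.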

@keitat
Contributor

keitat commented Oct 28, 2020

I am able to confirm the bug in other environments that use an older version of SuperLU_DIST.
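
For context on selecting code paths by library version, one common pattern packs the version numbers into a single integer and compares it in `#if`. The sketch below assumes hypothetical macros `SKETCH_LIB_MAJOR` and `SKETCH_LIB_MINOR` and an illustrative 6.0 threshold. A real header would take these values from the library, and nothing here is claimed to match how Amesos2 detects SuperLU_DIST.

```cpp
// Hypothetical version macros; a real library header would supply its own.
#define SKETCH_LIB_MAJOR 5
#define SKETCH_LIB_MINOR 1

// Pack major.minor into one integer so versions compare with plain operators.
#define SKETCH_VERSION(major, minor) ((major) * 1000 + (minor))
#define SKETCH_LIB_VERSION SKETCH_VERSION(SKETCH_LIB_MAJOR, SKETCH_LIB_MINOR)

// Illustrative threshold only: pretend the API changed at version 6.0.
#if SKETCH_LIB_VERSION >= SKETCH_VERSION(6, 0)
constexpr bool uses_renamed_structs = true;
#else
constexpr bool uses_renamed_structs = false;
#endif

static_assert(!uses_renamed_structs, "version 5.1 takes the older code path");
```

Because `#if` treats an undefined identifier as 0, a comparison like this is also sensitive to misspelled macro names; GCC and Clang offer `-Wundef` to warn about that case.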

trilinos-autotester added a commit that referenced this issue Oct 28, 2020
…-13-0-branch

Automatically Merged using Trilinos Pull Request AutoTester rel 13.00
PR Title: Fix for #8258.  (Fixing typo in TypeMap declaration.)
PR Author: keitat
trilinos-autotester added a commit that referenced this issue Oct 28, 2020
…_develop

Automatically Merged using Trilinos Pull Request AutoTester
PR Title: Fix for #8258. (Fixing typo in TypeMap declaration.) in 13.x release
PR Author: keitat
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Oct 29, 2020
…s:develop' (0fb0c9f).

* trilinos-develop: (149 commits)
  MiniEM: Update rebalancing parameters
  Fix type-test-suite issue.
  Switch globals to REDUCTION instead of TRANSIENT
  stk snapshot as of 10/27/2020
  Fix for trilinos#8258
  MueLu RefMaxwell: Make nullspace normalization optional
  Fix fei issue for clang compiler, fix provided by Mike Glass.
  MueLu: Fix Aggregates when UVM=off
  Geminga: Set up paths for SEMS Boost
  Ifpack2: Fix tests to build with UVM disabled
  Xpetra: rm obsolete SplitMatrix implementation
  Xpetra: cleanup header inclusions in test
  Xpetra: cleanup some file headers
  Xpetra: reactivate Epetra block matrix test
  Xpetra: remove trailing white spaces
  Xpetra: improce Doxygen documentation
  MueLu RefMaxwell: Set threshold for diagonal fix
  MueLu: Rebase gold files
  MueLu: Add threshold and replacement value for diagonal fix in RAPFactory
  Xpetra: Print threshold in CheckRepairMainDiagonal, allow setting replacement
  ...
@bartlettroscoe
Member Author

As shown here, all of the build errors are gone today due to the merge of PR #8271 yesterday.

Closing as complete.

Thanks for the quick fix @keitat!

jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Oct 31, 2020
…s:develop' (0fb0c9f).

* trilinos-develop: (154 commits)
  atdm/contributed/blake: clean up copy/paste comment
  Framework: Fix to clang 7.0.1 build configuration
  atdm/contributed: environment script for blake testbed
  MiniEM: Update rebalancing parameters
  Fix type-test-suite issue.
  Switch globals to REDUCTION instead of TRANSIENT
  stk snapshot as of 10/27/2020
  Fix for trilinos#8258
  MueLu RefMaxwell: Make nullspace normalization optional
  Fix fei issue for clang compiler, fix provided by Mike Glass.
  MueLu: Fix Aggregates when UVM=off
  Geminga: Set up paths for SEMS Boost
  Ifpack2: Fix tests to build with UVM disabled
  Xpetra: rm obsolete SplitMatrix implementation
  Xpetra: cleanup header inclusions in test
  Xpetra: cleanup some file headers
  Xpetra: reactivate Epetra block matrix test
  Xpetra: remove trailing white spaces
  Xpetra: improce Doxygen documentation
  MueLu RefMaxwell: Set threshold for diagonal fix
  ...
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Nov 1, 2020
…s:develop' (0fb0c9f).

* trilinos-develop: (154 commits)
  atdm/contributed/blake: clean up copy/paste comment
  Framework: Fix to clang 7.0.1 build configuration
  atdm/contributed: environment script for blake testbed
  MiniEM: Update rebalancing parameters
  Fix type-test-suite issue.
  Switch globals to REDUCTION instead of TRANSIENT
  stk snapshot as of 10/27/2020
  Fix for trilinos#8258
  MueLu RefMaxwell: Make nullspace normalization optional
  Fix fei issue for clang compiler, fix provided by Mike Glass.
  MueLu: Fix Aggregates when UVM=off
  Geminga: Set up paths for SEMS Boost
  Ifpack2: Fix tests to build with UVM disabled
  Xpetra: rm obsolete SplitMatrix implementation
  Xpetra: cleanup header inclusions in test
  Xpetra: cleanup some file headers
  Xpetra: reactivate Epetra block matrix test
  Xpetra: remove trailing white spaces
  Xpetra: improce Doxygen documentation
  MueLu RefMaxwell: Set threshold for diagonal fix
  ...
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Nov 2, 2020
…s:develop' (0fb0c9f).

* trilinos-develop: (162 commits)
  atdm/contributed/blake: clean up copy/paste comment
  Framework: Fix to clang 7.0.1 build configuration
  atdm/contributed: environment script for blake testbed
  MiniEM: Update rebalancing parameters
  Fix type-test-suite issue.
  Switch globals to REDUCTION instead of TRANSIENT
  Xpetra: remove unused include and using statements
  MueLu: refurbish CoarseMapFactory unit tests
  Xpetra: refurbish StridedMapFactory
  Xpetra: refurbish parts of Xpetra::MapFactory
  MueLu: refactor (Blocked)CoarseMapFactory
  stk snapshot as of 10/27/2020
  Fix for trilinos#8258
  MueLu RefMaxwell: Make nullspace normalization optional
  MueLu: rename and cleanup unit test
  MueLu: remove debug output from unit test
  MueLu: remove obsolete TODO comment
  Fix fei issue for clang compiler, fix provided by Mike Glass.
  MueLu: Fix Aggregates when UVM=off
  Geminga: Set up paths for SEMS Boost
  ...
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Nov 3, 2020
…s:develop' (0fb0c9f).

* trilinos-develop: (168 commits)
  Tacho - testing on example as well
  Tacho - add an option for clark
  Framework: Add color codes to PR script for terminals
  Framework: PR Driver - Catch the checked-call and print a non-traceback error message for the log
  Framework: Cleanup on PR scripts to play nice with markdown conversions on Github
  Framework: Style pass for consistency on output from pr generator
  atdm/contributed/blake: clean up copy/paste comment
  Framework: Fix to clang 7.0.1 build configuration
  atdm/contributed: environment script for blake testbed
  MiniEM: Update rebalancing parameters
  Fix type-test-suite issue.
  Switch globals to REDUCTION instead of TRANSIENT
  Xpetra: remove unused include and using statements
  MueLu: refurbish CoarseMapFactory unit tests
  Xpetra: refurbish StridedMapFactory
  Xpetra: refurbish parts of Xpetra::MapFactory
  MueLu: refactor (Blocked)CoarseMapFactory
  stk snapshot as of 10/27/2020
  Fix for trilinos#8258
  MueLu RefMaxwell: Make nullspace normalization optional
  ...