Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve test coverage of CrsGraph and CrsMatrix #262

Merged
merged 7 commits into from
Sep 19, 2019

Conversation

kruger
Copy link
Contributor

@kruger kruger commented Mar 18, 2019

This goes through and increases coverage in CrsGraph and CrsMatrix.
Many of the methods are just testing that they run and not that they run
correctly because they use this idiom:
fresult = graph%getFoo(); TEST_IERR()
This probably needs improving.

gaussSeidel is an example of a test that needs further work.

Also, the skeleton for these tests was done awhile ago and the API has
changed. Coverage tests should be used to see what is missing once the
nightlies come through.

@dalg24-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@aprokop
Copy link
Collaborator

aprokop commented Mar 18, 2019

ok to test

@kruger
Copy link
Contributor Author

kruger commented Mar 18, 2019

Is there an easy way of seeing the configure scripts for the Jenkins jobs? I found the CMakeCache.txt, but if there is a script, that would be easier (and maybe I am just missing something).

@aprokop
Copy link
Collaborator

aprokop commented Mar 18, 2019

@kruger yes, the scripts are in scripts directory and are named docker_cmake*. There are several of them, depending on configuration.

@kruger
Copy link
Contributor Author

kruger commented Apr 24, 2019

This PR addresses Issue #135 and some of the more detailed issues are there as well.

@codecov-io
Copy link

codecov-io commented Apr 24, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@666c0f3). Click here to learn what that means.
The diff coverage is 99.35%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #262   +/-   ##
=========================================
  Coverage          ?   57.78%           
=========================================
  Files             ?       49           
  Lines             ?    16993           
  Branches          ?        0           
=========================================
  Hits              ?     9819           
  Misses            ?     7174           
  Partials          ?        0
Impacted Files Coverage Δ
src/interface/test/test_eigen_handle.F90 93.65% <ø> (ø)
...c/tpetra/test/test_tpetra_import_export_helper.F90 100% <100%> (ø)
src/teuchos/test/test_teuchos_plist.F90 100% <100%> (ø)
src/interface/test/test_eigen_handle.cpp 97.36% <100%> (ø)
src/teuchos/test/test_teuchos_array.F90 100% <100%> (ø)
src/tpetra/src/swig/fortpetraFORTRAN_wrap.cxx 35.31% <100%> (ø)
src/teuchos/test/test_teuchos_comm.F90 100% <100%> (ø)
src/tpetra/test/test_tpetra_multivector_helper.F90 100% <100%> (ø)
src/tpetra/test/test_tpetra_crsgraph.F90 98.28% <100%> (ø)
src/tpetra/test/test_tpetra_multivector.F90 100% <100%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 666c0f3...cbe762a. Read the comment docs.

@kruger
Copy link
Contributor Author

kruger commented Jun 7, 2019

The jenkins test result is similar to my own tests: graph1%isIdenticalTo(graph2) is failing. I welcome suggestions on how to fix this.

@aprokop
Copy link
Collaborator

aprokop commented Jun 13, 2019

@kruger Interesting, just realized that the single coverage post here is updated by the coverage. So, in fact, we have the coverage report from the latest build.

@kruger
Copy link
Contributor Author

kruger commented Jun 13, 2019

@aprokop Interesting. Any idea of what the fortpetra.f90 is below 50%? If I read it correctly, it says that this branch improved things by 87%, but I can't see how to tell what's missing in coverage.

@aprokop
Copy link
Collaborator

aprokop commented Jun 13, 2019

@kruger Take a look here

@kruger
Copy link
Contributor Author

kruger commented Jun 13, 2019

Thanks!
The main issue is that not all permutations of a given method are tested. I propose that we finish the PR, close the related issue, and then open a new issue to deal with it. This can be done when funding is sorted. The issue will be to increase coverage by increasing the ways of calling a method. Maps and MultiVectors are in worse shape than CrsMatrix so this would allow a uniform approach to cleaning them all up.

As a comment for future reference: I think the main thing for maps/matrix/graphs is more 2D decomposition (or perhaps, any 2D decomposition). I have a note in the helper functions about this probably being a good idea. If we create a second version of the helper routines with the 2D decomp and then program up the same type of tests, I think it would be easy to get most of the missing coverage.

There are some miscellaneous helper routines and if/else branches that we are missing as well, but these are relatively minor.

@aprokop
Copy link
Collaborator

aprokop commented Jun 13, 2019

The main issue is that not all permutations of a given method are tested. I propose that we finish the PR, close the related issue, and then open a new issue to deal with it. This can be done when funding is sorted.

Agreed

The issue will be to increase coverage by increasing the ways of calling a method. Maps and MultiVectors are in worse shape than CrsMatrix so this would allow a uniform approach to cleaning them all up.

Yes, Maps and MultiVectors would be of higher priority.

As a comment for future reference: I think the main thing for maps/matrix/graphs is more 2D decomposition (or perhaps, any 2D decomposition).

I don't understand what 2D you are talking about. I think for me 2D may mean something different. Can you please elaborate?

There are some miscellaneous helper routines and if/else branches that we are missing as well, but these are relatively minor.

I have rebased the PR. Now I need to look through it, and see if everything looks OK. Once I've done that, I will merge it.

@kruger
Copy link
Contributor Author

kruger commented Jun 14, 2019

Maybe I misunderstood the tpetra tutorial, but I thought if you specify just a row map for the matrix constructor, then you have just the row disributed where as if you specify both row and column, then you have both distributed. In any case, we need permutations on the constructors primarily if I am understanding how the swig wrappers work.

Copy link
Collaborator

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a partial review, just saving my comments so far.

docs/source/developer_tools.rst Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
src/tpetra/test/test_tpetra_crsgraph.F90 Outdated Show resolved Hide resolved
@aprokop
Copy link
Collaborator

aprokop commented Jun 24, 2019

@kruger I rebased this PR on top of current master. Couple things that I had to fix were:

  • Replaced all DynamicProfile with StaticProfile
    DynamicProfile is deprecated, and will be gone in the future
  • Removed getLocalDiagOffsets tests
    @sethrj discovered issues with the function, and commented it out in his earlier PR.

Please take a look, and I think it would be nice to do some more thorough testing in some functions. TEST_NOTHROW only tests that the call does not throw, not that it is produces the correct results. I think the higher priority should be placed on functions that could potentially deal with plus-minus one issues.

@aprokop
Copy link
Collaborator

aprokop commented Sep 18, 2019

There is some problem in compilation with test_tpetra_crsmatrix_helper. I am able to reproduce it locally, so it's not spurious.

@aprokop
Copy link
Collaborator

aprokop commented Sep 18, 2019

I wonder if it is some kind of race condition. You try to compile that source file multiple times in that directory. If different make threads will try to compile it simultaneously, it is possible that one of them is recompiling, while the second one is trying to use it. Something like this?

@aprokop
Copy link
Collaborator

aprokop commented Sep 18, 2019

We used to have that problem before (#94). One way to fix thatis is to add a dummy library for all test_tpetra_*.F90 files, and then do DEPLIBS in TRILINOS_ADD_EXECUTABLE_AND_TEST.

@kruger kruger force-pushed the increase-cov branch 3 times, most recently from c18f553 to 089e034 Compare September 19, 2019 02:58
@kruger kruger mentioned this pull request Sep 19, 2019
3 tasks
@kruger
Copy link
Contributor Author

kruger commented Sep 19, 2019

This was PR was originally meant to cover only Issue #135, but I think we can close that issue as well as Issue #120, Issue #121, Issue #122, Issue #130, and Issue #136.

kruger and others added 2 commits September 19, 2019 10:54
This increases the coverage of ForTrilinos and addresses
primarily Issues trilinos#135 (CrsGraph), but also
Issues trilinos#121 (MultiVector) and trilinos#120 (CrsMatrix)
as well as improving coverage in tpetra and interface.

The method is to create *Unit-style tests that are as small
as possible while providing a thorough test.  To that end,
"helper files" (e.g., test_tpetra_crsgraph_helper.F90 or
test_tpetra_crsmatrix_helper.F90) were created to provide
standard matrices, graphs, maps, etc.  for testing.  The
matrix helper file is also used by the interface routines to
solve linear systems or find eigenvalues.

Other development goals:
- the pre-existing "fat tests" are maintained for useful examples
- tests must run in serial or parallel from command-line (ctest
  only tests one or the other)
- tests must be clean (no RCP errors) when running from command-line
- methods for crsgraph and crsmatrix try to show when fillComplete
  is needed for a given call.  Helper files do not fillComplete

Miscellaneous documentation changes are also included, and
build system changes were required.

Co-authored-by: Andrey Prokopenko <prokopenkoav@ornl.gov>
Co-authored-by: Michel de Messiere <demessie@txcorp.com>
Co-authored-by: Jared Popelar <jpopelar@txcorp.com>
@aprokop
Copy link
Collaborator

aprokop commented Sep 19, 2019

Hmm, I've got 2 failure locally that I need to take a look at (at least, in Debug mode).

  • test_eigen_unit
    At line 212 of file /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/interface/test/test_eigen_unit.F90                                                                                     
    Fortran runtime error: Index '0' of dimension 1 of array 'evalues' below lower bound of 1
    
    This is obviously a mistake in the indexing
  • test_tpetra_multivector
    It's is in MultiVector subView test, when releasing vec.

@kruger I have made some changes to the PR, and was not careful to stack them on top of your commit, instead amending it. I cannot force-push to the branch. Can you please add me as a collaborator for the TechX ForTrilinos fork? Please do that before you fix the test_eigen_unit test.

@aprokop
Copy link
Collaborator

aprokop commented Sep 19, 2019

OK, I pushed. Go ahead and fix the test_eigen_unit test. I'm looking into the multivector one.

@kruger
Copy link
Contributor Author

kruger commented Sep 19, 2019

@aprokop I'm not seeing the multivector problem. I'm using docker_cmake which is in Debug mode. What are you seeing?

@aprokop
Copy link
Collaborator

aprokop commented Sep 19, 2019

I see things like this:

Thread 1 "ForTrilinosTpet" received signal SIGSEGV, Segmentation fault.
0x00007ffff769b87a in Teuchos::RCP<Tpetra::MultiVector<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::assert_valid_ptr (this=0x4010000000000000) at /home/xap/code/trilinos/fortrilinos/packages/teuchos/core/src/Teuchos_RCP.hpp:540
540       if (ptr_)
(gdb) bt
#0  0x00007ffff769b87a in Teuchos::RCP<Tpetra::MultiVector<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::assert_valid_ptr (this=0x4010000000000000) at /home/xap/code/trilinos/fortrilinos/packages/teuchos/core/src/Teuchos_RCP.hpp:540
#1  0x00007ffff7685894 in Teuchos::RCP<Tpetra::MultiVector<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::debug_assert_valid_ptr (this=0x4010000000000000) at /home/xap/code/trilinos/fortrilinos/packages/teuchos/core/src/Teuchos_RCPDecl.hpp:832
#2  0x00007ffff767ac2c in Teuchos::RCP<Tpetra::MultiVector<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::get (this=0x4010000000000000) at /home/xap/code/trilinos/fortrilinos/packages/teuchos/core/src/Teuchos_RCP.hpp:365
#3  0x00007ffff763c45d in _wrap_delete_TpetraMultiVector (farg1=0x7fffffffb9b0) at /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/tpetra/src/swig/fortpetraFORTRAN_wrap.cxx:3391
#4  0x00007ffff7735aca in fortpetra::swigf_release_tpetramultivector (self=...) at /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/tpetra/src/swig/fortpetra.F90:5439
#5  0x000055555555bbd9 in tpetramultivector_subviewnonconst (success=.TRUE.) at /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/tpetra/test/test_tpetra_multivector.F90:723
#6  0x000055555555869f in test_tpetramultivector () at /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/tpetra/test/test_tpetra_multivector.F90:46
#7  0x0000555555564100 in main (argc=1, argv=0x7fffffffc3fa) at /home/xap/code/trilinos/fortrilinos/ForTrilinos/src/utils/src/ForTrilinos.h:9
#8  0x00007ffff6653b97 in __libc_start_main (main=0x5555555640c0 <main>, argc=1, argv=0x7fffffffbee8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffbed8) at ../csu/libc-start.c:310
#9  0x0000555555557c1a in _start ()
(gdb) 

If I comment out any res usage in subView, it goes away. If I do res = vec%subView(cols), it reappears. So, I think there may be something weird in SWIG memory management code.

@aprokop
Copy link
Collaborator

aprokop commented Sep 19, 2019

OK, the bug was found and fixed. Once jenkins passes, can squash merge.

@aprokop aprokop merged commit 6570e8b into trilinos:master Sep 19, 2019
@aprokop aprokop deleted the increase-cov branch September 19, 2019 17:39
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants