Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nanobind leaks memory in builtin mesh generation? #2884

Closed
francesco-ballarin opened this issue Nov 9, 2023 · 2 comments
Closed

nanobind leaks memory in builtin mesh generation? #2884

francesco-ballarin opened this issue Nov 9, 2023 · 2 comments
Labels
question Further information is requested

Comments

@francesco-ballarin
Copy link
Member

I have been struggling for the last couple of days in my multiphenicsx CI, where I have parallel unit tests that (on the same commit of multiphenicsx, and same dolfinx/dolfinx:nightly image) take either 10 minutes (as expected) or 3 hours.

I've simplified my CI and ran, out of all my tests, only this test
https://github.com/multiphenics/multiphenicsx/blob/artifact/tests/unit/fem/test_dofmap_restriction.py
which is heavily parametrized, for a total of more than 50 possible configurations.
I then ran only that test, either in serial or in parallel on three processors. I profiled a good CI run (which takes 3.670 s in parallel) and a bad CI run (which takes 122.788 seconds in parallel).

I attach the profiling results in pstats.zip: you have two cases (good vs bad), cProfile files for a serial vs parallel run, and in each case an auxiliary file print.py to print the following tables on process #0 of the parallel run.

For the good case (https://github.com/multiphenics/multiphenicsx/actions/runs/6810825566/job/18519911648), I get:

Thu Nov  9 12:00:15 2023    pstats-parallel-0/combined.prof

         321486 function calls (302748 primitive calls) in 3.670 seconds

   Ordered by: cumulative time
   List reduced from 584 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.001    0.000    3.819    0.061 runner.py:111(pytest_runtest_protocol)
       63    0.006    0.000    3.798    0.060 runner.py:119(runtestprotocol)
      189    0.001    0.000    3.792    0.020 runner.py:219(call_and_report)
 1309/693    0.004    0.000    3.788    0.005 _hooks.py:479(__call__)
 1309/693    0.002    0.000    3.785    0.005 _manager.py:106(_hookexec)
 1309/693    0.020    0.000    3.784    0.005 _callers.py:27(_multicall)
      189    0.001    0.000    3.759    0.020 runner.py:247(call_runtest_hook)
      189    0.001    0.000    3.755    0.020 runner.py:318(from_call)
      189    0.000    0.000    3.753    0.020 runner.py:262(<lambda>)
       63    0.001    0.000    2.981    0.047 pytest_hooks_unit_tests.py:27(runtest_teardown)
       63    2.812    0.045    2.812    0.045 {built-in method gc.collect}
       63    0.000    0.000    0.488    0.008 runner.py:160(pytest_runtest_call)
       63    0.000    0.000    0.487    0.008 python.py:1790(runtest)
       63    0.004    0.000    0.485    0.008 python.py:187(pytest_pyfunc_call)
       56    0.001    0.000    0.443    0.008 test_dofmap_restriction.py:125(test_dofmap_restriction_is_subset_of_dofmap)
       63    0.124    0.002    0.243    0.004 function.py:500(functionspace)
       63    0.000    0.000    0.208    0.003 runner.py:155(pytest_runtest_setup)
       63    0.001    0.000    0.206    0.003 runner.py:478(setup)
       63    0.000    0.000    0.206    0.003 python.py:1794(setup)
       63    0.000    0.000    0.205    0.003 fixtures.py:561(_fillfixtures)
      182    0.000    0.000    0.205    0.001 fixtures.py:568(getfixturevalue)
  420/182    0.001    0.000    0.205    0.001 fixtures.py:592(_get_active_fixturedef)
      182    0.002    0.000    0.203    0.001 fixtures.py:620(_compute_fixture_value)
      182    0.001    0.000    0.197    0.001 fixtures.py:1041(execute)
       56    0.011    0.000    0.186    0.003 test_dofmap_restriction.py:56(assert_dofmap_restriction_is_subset_of_dofmap)
      182    0.001    0.000    0.182    0.001 fixtures.py:1108(pytest_fixture_setup)
      182    0.001    0.000    0.178    0.001 fixtures.py:886(call_fixture_func)
       63    0.000    0.000    0.175    0.003 test_dofmap_restriction.py:20(mesh)
       63    0.001    0.000    0.175    0.003 mesh.py:547(create_unit_square)
       63    0.157    0.002    0.173    0.003 mesh.py:510(create_rectangle)

For the bad case (https://github.com/multiphenics/multiphenicsx/actions/runs/6810780257/job/18519767431), I get

Thu Nov  9 11:58:38 2023    pstats-parallel-0/combined.prof

         321486 function calls (302748 primitive calls) in 122.788 seconds

   Ordered by: cumulative time
   List reduced from 584 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.001    0.000  122.948    1.952 runner.py:111(pytest_runtest_protocol)
       63    0.008    0.000  122.905    1.951 runner.py:119(runtestprotocol)
      189    0.003    0.000  122.897    0.650 runner.py:219(call_and_report)
 1309/693    0.006    0.000  122.896    0.177 _hooks.py:479(__call__)
 1309/693    0.003    0.000  122.891    0.177 _manager.py:106(_hookexec)
 1309/693    0.034    0.000  122.889    0.177 _callers.py:27(_multicall)
      189    0.002    0.000  122.793    0.650 runner.py:247(call_runtest_hook)
      189    0.008    0.000  122.786    0.650 runner.py:318(from_call)
      189    0.001    0.000  122.777    0.650 runner.py:262(<lambda>)
       63    0.000    0.000   78.202    1.241 runner.py:155(pytest_runtest_setup)
       63    0.001    0.000   78.200    1.241 runner.py:478(setup)
       63    0.000    0.000   78.199    1.241 python.py:1794(setup)
       63    0.001    0.000   78.199    1.241 fixtures.py:561(_fillfixtures)
      182    0.000    0.000   78.198    0.430 fixtures.py:568(getfixturevalue)
  420/182    0.002    0.000   78.197    0.430 fixtures.py:592(_get_active_fixturedef)
      182    0.003    0.000   78.196    0.430 fixtures.py:620(_compute_fixture_value)
      182    0.002    0.000   78.185    0.430 fixtures.py:1041(execute)
      182    0.001    0.000   78.157    0.429 fixtures.py:1108(pytest_fixture_setup)
      182    0.002    0.000   78.142    0.429 fixtures.py:886(call_fixture_func)
       63    0.000    0.000   78.137    1.240 test_dofmap_restriction.py:20(mesh)
       63    0.001    0.000   78.137    1.240 mesh.py:547(create_unit_square)
       63   78.115    1.240   78.135    1.240 mesh.py:510(create_rectangle)
       63    0.001    0.000   36.944    0.586 runner.py:160(pytest_runtest_call)
       63    0.000    0.000   36.942    0.586 python.py:1790(runtest)
       63    0.008    0.000   36.938    0.586 python.py:187(pytest_pyfunc_call)
       56    0.003    0.000   33.680    0.601 test_dofmap_restriction.py:125(test_dofmap_restriction_is_subset_of_dofmap)
       63   21.876    0.347   23.662    0.376 function.py:500(functionspace)
       63    7.662    0.122    7.662    0.122 dofmap_restriction.py:21(__init__)
       63    0.009    0.000    7.449    0.118 pytest_hooks_unit_tests.py:27(runtest_teardown)
       63    0.002    0.000    5.376    0.085 common.py:20(ActiveDofs)

In this second table, notice in particular

       63    0.000    0.000   78.137    1.240 test_dofmap_restriction.py:20(mesh)
       63    0.001    0.000   78.137    1.240 mesh.py:547(create_unit_square)
       63   78.115    1.240   78.135    1.240 mesh.py:510(create_rectangle)

where https://github.com/multiphenics/multiphenicsx/blob/artifact/tests/unit/fem/test_dofmap_restriction.py#L20 is a fixture that generates a builtin mesh.

Could there be a memory leak (or something similar) in nanobind wrappers of builtin mesh generators that causes this?

@francesco-ballarin francesco-ballarin added the question Further information is requested label Nov 9, 2023
@francesco-ballarin
Copy link
Member Author

For completeness: there seems to be something specific about the runners in GitHub actions that exacerbates a bad run. I have never been able to get such a spectacularly slow run of the same test on my laptop, even when using the same multiphenicsx commit and dolfinx/dolfinx:nightly image. However, on GitHub actions it seems to happen around 50% of the times.

@francesco-ballarin
Copy link
Member Author

Fixed by #2894 #2895 #2896 #2897

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

1 participant