GPU Batch 5 #1167

Merged
merged 34 commits into from
Aug 6, 2024

Conversation

mborland
Member

Adds GPU support to: sqrt1pm1, erf, erfc, erf_inv, erfc_inv, powm1, lgamma, tgamma, and associated gamma functions.

Removes recursion from lgamma and tgamma by adding dispatch functions to handle the reflection case.
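A minimal sketch of the idea, not the PR's actual code: an implementation that calls itself on the reflected argument can be split into a non-recursive kernel plus a thin dispatch function that applies the reflection formula Γ(x) = π / (sin(πx) Γ(1 − x)) once. The names `gamma_positive` and `gamma_dispatch` are hypothetical, and `std::tgamma` merely stands in for a real positive-argument kernel.

```cpp
#include <cmath>
#include <limits>

namespace sketch {

constexpr double pi = 3.141592653589793238462643383279502884;

// Hypothetical stand-in for the non-recursive kernel.
// Assumption: it is only ever called with x > 0.
inline double gamma_positive(double x)
{
    return std::tgamma(x);
}

// Dispatch function: decides once whether reflection is needed and then
// calls the kernel directly, so no function ever calls itself.
inline double gamma_dispatch(double x)
{
    if (x > 0)
        return gamma_positive(x);

    // Gamma has poles at zero and the negative integers.
    if (x == std::floor(x))
        return std::numeric_limits<double>::quiet_NaN();

    // Reflection: Gamma(x) = pi / (sin(pi * x) * Gamma(1 - x)), with 1 - x > 0.
    return pi / (std::sin(pi * x) * gamma_positive(1 - x));
}

} // namespace sketch
```

Because the reflected argument `1 - x` is always positive here, the kernel never needs to reflect again, which is what makes the call graph acyclic.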


codecov bot commented Jul 31, 2024

Codecov Report

Attention: Patch coverage is 93.67089% with 10 lines in your changes missing coverage. Please review.

Project coverage is 94.08%. Comparing base (ef3892c) to head (445e36a).

Files Patch % Lines
include/boost/math/special_functions/powm1.hpp 72.22% 5 Missing ⚠️
include/boost/math/special_functions/gamma.hpp 94.59% 4 Missing ⚠️
include/boost/math/special_functions/erf.hpp 94.11% 1 Missing ⚠️

@@           Coverage Diff            @@
##           develop    #1167   +/-   ##
========================================
  Coverage    94.07%   94.08%           
========================================
  Files          780      780           
  Lines        65764    65797   +33     
========================================
+ Hits         61867    61904   +37     
+ Misses        3897     3893    -4     
Files Coverage Δ
include/boost/math/special_functions/beta.hpp 100.00% <ø> (ø)
...de/boost/math/special_functions/detail/erf_inv.hpp 100.00% <100.00%> (ø)
...ost/math/special_functions/detail/igamma_large.hpp 33.11% <100.00%> (ø)
...ost/math/special_functions/detail/lgamma_small.hpp 59.90% <100.00%> (+0.54%) ⬆️
...h/special_functions/detail/unchecked_factorial.hpp 100.00% <100.00%> (ø)
include/boost/math/special_functions/sign.hpp 100.00% <100.00%> (ø)
include/boost/math/special_functions/sqrt1pm1.hpp 100.00% <100.00%> (ø)
include/boost/math/tools/fraction.hpp 91.04% <100.00%> (+0.56%) ⬆️
include/boost/math/tools/precision.hpp 100.00% <100.00%> (ø)
test/test_erf.cpp 95.00% <ø> (ø)
... and 7 more

... and 3 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mborland
Member Author

@jzmaddock Can you see any reason why Windows platforms would not like the diffs for erfc_inv and tgamma? The former just changes the dispatching method from pointers to std::integral_constant to references, and the latter adds a dispatching function. I would assume that if there were something fundamentally wrong with the way dispatching is occurring, it would hit other platforms too. Both are throwing on numeric overflow.
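For readers unfamiliar with the two tag-dispatch styles mentioned here, a minimal sketch follows. It is not the erfc_inv code; the function names and the 64-digit cutoff are assumptions chosen for illustration. Both styles pick an overload from the tag's type alone; the only difference is whether the tag travels as a null pointer or as a temporary bound to a const reference.

```cpp
#include <type_traits>

namespace sketch {

// Pointer style: the tag is passed as a null pointer and never dereferenced.
inline int select_ptr(const std::integral_constant<int, 64>*) { return 64; }
inline int select_ptr(const std::integral_constant<int, 0>*)  { return 0; }

// Reference style: the tag is passed as a temporary bound to a const reference.
inline int select_ref(const std::integral_constant<int, 64>&) { return 64; }
inline int select_ref(const std::integral_constant<int, 0>&)  { return 0; }

// Hypothetical rule: types with at most 64 binary digits use the "64" overload,
// everything else falls back to the generic "0" overload.
template <int Digits>
int via_pointer()
{
    using tag = std::integral_constant<int, (Digits <= 64) ? 64 : 0>;
    return select_ptr(static_cast<const tag*>(nullptr));
}

template <int Digits>
int via_reference()
{
    using tag = std::integral_constant<int, (Digits <= 64) ? 64 : 0>;
    return select_ref(tag());
}

} // namespace sketch
```

Overload resolution is identical in both versions, so a switch between them would not be expected to change which implementation is selected.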

@dschmitz89
Contributor

@mborland : this looks awesome. Could you give a little high level overview of what the goals of these GPU efforts in Boost.Math are? SciPy is doing something similar at the moment and we are also using some special functions from you guys. CC @steppi in case he did not see these recent GPU PRs

@steppi

steppi commented Aug 5, 2024

This is really awesome to see! Thanks for the ping @dschmitz89. In scipy/scipy#20785 (comment), you had said

From a stats maintenance perspective, I would be sad to see scipy dropping boost due to array API concerns. Their implementations of special functions for statistical purposes are a great resource for us, not mentioning their responsiveness to our issues.

and I'd replied something to the effect that if Boost also supported GPU execution for special functions, then we could keep using Boost's special functions in SciPy without having to maintain separate implementations for other array libraries. @jzmaddock thought that was reasonable and now here we are. I'll play around with getting Boost special functions working in CuPy, and it looks like we'll be able to use them directly there, and also use them internally in the xsf special function library @izaid and I are working on.

@mborland
Member Author

mborland commented Aug 5, 2024

@mborland : this looks awesome. Could you give a little high level overview of what the goals of these GPU efforts in Boost.Math are? SciPy is doing something similar at the moment and we are also using some special functions from you guys.

I think Albert's summary is pretty good. @jzmaddock has a quite old PR adding CUDA support here: #127, so I have been extracting from and expanding on that. Development happens in this repo: https://github.com/cppalliance/cuda-math, since my employer can only directly add GPU-enabled CI there, and changes are then cherry-picked onto Boost proper. Two questions for you two, @dschmitz89 and @steppi:

  1. Is there a specific prioritization of functions or distributions that would benefit you?

  2. Are you looking for CUDA only, or are there other platforms that you need/want covered? We have been inserting both CUDA and SYCL support so far. SYCL has some more significant limitations, such as no recursion. So far it's been relatively easy to work around by adding dispatching functions, but there will be functions that need to be completely rewritten or will remain unsupported, such as anything involving root finding, so I'd rather punt on rewrites until there is real demand.
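As a rough illustration of what covering both CUDA and SYCL from a single header can look like, here is a sketch under stated assumptions. The macro name `SKETCH_GPU_ENABLED` is hypothetical and this thread does not show the mechanism the PR actually uses. `__host__ __device__` and `SYCL_EXTERNAL` are the standard CUDA and SYCL 2020 annotations, and the function body is an illustrative way to compute sqrt(1 + x) − 1, not a copy of the library's implementation.

```cpp
#include <cmath>

// Hypothetical annotation macro (the name is an assumption for this sketch).
#if defined(__CUDACC__)
#  define SKETCH_GPU_ENABLED __host__ __device__   // CUDA: callable on host and device
#elif defined(SYCL_LANGUAGE_VERSION) && defined(SYCL_EXTERNAL)
#  define SKETCH_GPU_ENABLED SYCL_EXTERNAL         // SYCL 2020: usable from device code
#else
#  define SKETCH_GPU_ENABLED                       // plain C++: expands to nothing
#endif

namespace sketch {

// Computes sqrt(1 + x) - 1 for x >= -1 while avoiding cancellation for small |x|.
// No recursion and no function pointers, so it fits the SYCL restriction above.
SKETCH_GPU_ENABLED inline double sqrt1pm1(double x)
{
    if (std::fabs(x) > 0.75)
        return std::sqrt(1 + x) - 1;          // no cancellation problem here
    return std::expm1(std::log1p(x) / 2);     // exp(log(1 + x) / 2) - 1
}

} // namespace sketch
```

On a host-only build the macro expands to nothing, so the same header compiles unchanged with an ordinary C++ compiler.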

@izaid

izaid commented Aug 5, 2024

I think this is great too! I'll let @steppi answer any specific prioritization of functions, but the ones you've done so far look good to me. I think we really only care about CUDA for now, but others may think differently.

@mborland I wanted to throw in one point though. Are you testing these using just nvcc, or nvrtc as well? nvrtc is CUDA's runtime (JIT) compiler, and is what CuPy uses for instance. It has more stringent requirements...

@steppi

steppi commented Aug 5, 2024

2. anything involving root finding

I’m trying to figure out root finding that will work well on GPU now. I’ll keep you posted on my progress.

@mborland
Member Author

mborland commented Aug 5, 2024

@mborland I wanted to throw in one point though. Are you testing these using just nvcc, or nvrtc as well? nvrtc is CUDA's runtime (JIT) compiler, and is what CuPy uses for instance. It has more stringent requirements...

Right now it's just NVCC but we can look into adding NVRTC as well. Tracked here: cppalliance/cuda-math#7.

I’m trying to figure out root finding that will work well on GPU now. I’ll keep you posted on my progress.

Cool. Thank you

@dschmitz89
Contributor

Nice that the conversation is starting. :) On the SciPy side I would defer to @steppi for all technical details as I don't have expertise in CUDA or special function computations.

@mborland
Member Author

mborland commented Aug 6, 2024

Nice that the conversation is starting. :) On the SciPy side I would defer to @steppi for all technical details as I don't have expertise in CUDA or special function computations.

Feel free to open further issues/discussion here or downstream at: https://github.com/cppalliance/cuda-math. Either way it will get seen.

The only CI failure is from codecov, so I am going to merge this one in.

@mborland mborland merged commit ab09ece into develop Aug 6, 2024
77 of 78 checks passed
@mborland mborland deleted the cuda_5 branch August 6, 2024 13:01