[GraphBolt][CUDA] Specialize non-weighted neighbor sampling impl #7215

Merged
mfbalin merged 7 commits into dmlc:master from gb_cuda_optimize_neighbor_sampler
Mar 18, 2024

Conversation

mfbalin
Collaborator

@mfbalin mfbalin commented Mar 14, 2024

Fixes #7173. The neighbor sampling kernel is 4-5x faster than the old, non-specialized version.
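For context, non-weighted neighbor sampling picks neighbors uniformly at random, so no per-edge probabilities need to be read or compared. The sketch below shows that idea on the CPU in plain Python. It is an illustration only, not the CUDA kernel from this PR: the function name `sample_neighbors_uniform`, the CSC-style `indptr`/`indices` layout, and sampling without replacement are all assumptions made for the example.

```python
import random


def sample_neighbors_uniform(indptr, indices, seeds, fanout, rng=None):
    """Uniformly sample up to `fanout` neighbors for each seed node.

    Hypothetical helper for illustration. Assumes a CSC-style layout where
    the neighbors of node v are indices[indptr[v]:indptr[v + 1]], and
    samples without replacement.
    """
    rng = rng or random.Random()
    sampled_indptr = [0]
    sampled_indices = []
    for v in seeds:
        neighbors = indices[indptr[v]:indptr[v + 1]]
        if len(neighbors) <= fanout:
            # Fewer neighbors than the fanout: keep all of them.
            picked = list(neighbors)
        else:
            # No edge weights involved: every neighbor is equally likely.
            picked = rng.sample(neighbors, fanout)
        sampled_indices.extend(picked)
        sampled_indptr.append(len(sampled_indices))
    return sampled_indptr, sampled_indices


# Usage: node 0 has neighbors [1, 2, 3], node 1 has [0], node 2 has [0, 1].
indptr = [0, 3, 4, 6]
indices = [1, 2, 3, 0, 0, 1]
sub_indptr, sub_indices = sample_neighbors_uniform(
    indptr, indices, seeds=[0, 1, 2], fanout=2, rng=random.Random(0)
)
```

With a fanout of 2, only node 0 is actually subsampled; nodes 1 and 2 keep all of their neighbors.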

Description

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small. See the Google engineering practices (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code changes to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

Commit ID: cd4b0e1

Build ID: 1

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

@mfbalin
Collaborator Author

mfbalin commented Mar 14, 2024

@TristonC The CI has the following error message:

/home/ubuntu/jenkins/workspace/dgl_PR-7215/third_party/cccl/libcudacxx/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."
   12 | #  error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."

I am going to work around it.

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

Commit ID: cc4f6b2

Build ID: 2

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

Commit ID: 68c0da3

Build ID: 3

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

Commit ID: adb35df

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2024

Commit ID: 69575c3

Build ID: 5

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

graphbolt/src/cuda/neighbor_sampler.cu: 3 review threads (outdated, resolved)
@dgl-bot
Collaborator

dgl-bot commented Mar 18, 2024

Commit ID: 8811f46691dc68d496e777217070f33ef8e1d5fa

Build ID: 6

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 18, 2024

Commit ID: c093e65

Build ID: 7

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin mfbalin merged commit 9632ab1 into dmlc:master Mar 18, 2024
2 checks passed
@mfbalin mfbalin deleted the gb_cuda_optimize_neighbor_sampler branch March 18, 2024 14:38
This pull request was closed.
Development

Successfully merging this pull request may close these issues.

[GraphBolt][CUDA] Specialized kernels for simple Neighbor Sampling
3 participants