[GraphBolt][CUDA] Pipelined sampling optimization #7039

Merged

@mfbalin mfbalin commented Jan 26, 2024

Description

We overlap the UVA graph accesses with the rest of the computation by launching them in the UVA stream. We use a separate thread because index_select_csc is blocking, unlike index_select, which is non-blocking. I suspect that the Python GIL is making multithreading somewhat inefficient, so more performance may be unlocked in the future once Python without the GIL is available; see https://peps.python.org/pep-0703/#gpu-heavy-workloads-require-multi-core-processing
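As a rough illustration of the threading idea above, the following minimal sketch prefetches the next batch in a worker thread while the main thread computes on the current one. It uses only the Python standard library; `blocking_fetch`, `compute`, and `pipelined` are hypothetical stand-ins invented for this example and are not the GraphBolt or DGL API. No CUDA stream is involved here, so it shows only the overlap pattern, not the actual implementation in this PR.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def blocking_fetch(batch_id):
    """Hypothetical stand-in for a blocking graph access (not a DGL call)."""
    time.sleep(0.01)  # simulate waiting on a slow transfer
    return list(range(batch_id, batch_id + 3))


def compute(fetched):
    """Hypothetical stand-in for the rest of the sampling computation."""
    time.sleep(0.01)  # simulate work that can overlap with the next fetch
    return sum(fetched)


def pipelined(batch_ids):
    """Fetch batch i+1 in a worker thread while computing on batch i."""
    results = []
    if not batch_ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(blocking_fetch, batch_ids[0])
        for next_id in batch_ids[1:]:
            fetched = pending.result()  # wait for the current batch
            pending = pool.submit(blocking_fetch, next_id)  # start the next fetch
            results.append(compute(fetched))  # overlaps with that fetch
        results.append(compute(pending.result()))  # last batch, nothing to prefetch
    return results


results = pipelined([0, 10, 20])
```

In this sketch the overlap works because `time.sleep` releases the GIL while waiting. Python-side work in a real sampler would still contend for the GIL, which is the inefficiency the description suspects.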

I am getting a 10-20% end-to-end speedup for sample_layer_neighbor. sample_neighbor currently does not see much speedup because it does not need to access the whole indices tensor, which is what this optimization fetches. We will need to specialize for sample_neighbor later.

This optimization will be the core of how we give users a way to define their own custom samplers using dgl.sparse. My next PR will show how to implement LADIES while still taking advantage of the GraphBolt optimizations.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code changes to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referred to in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

@mfbalin mfbalin marked this pull request as draft January 26, 2024 23:26

dgl-bot commented Jan 26, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

dgl-bot commented Jan 26, 2024

Commit ID: e8ae274

Build ID: 1

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@mfbalin mfbalin force-pushed the gb_cuda_pipelined_sampling_optimization branch from e8ae274 to 7d46316 Compare January 27, 2024 11:47

dgl-bot commented Jan 27, 2024

Commit ID: 571fd8757ea01f02ef6b02f9d0ed7de9be268ae3

Build ID: 2

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@mfbalin mfbalin force-pushed the gb_cuda_pipelined_sampling_optimization branch 3 times, most recently from f3dd0b3 to ba18b7f Compare January 27, 2024 11:55

dgl-bot commented Jan 27, 2024

Commit ID: f3dd0b3

Build ID: 4

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot commented Jan 27, 2024

Commit ID: 1a5abae

Build ID: 3

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot commented Jan 27, 2024

Commit ID: ba18b7f

Build ID: 5

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

dgl-bot commented Jan 27, 2024

Commit ID: 3547c00

Build ID: 6

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

dgl-bot commented Jan 28, 2024

Commit ID: 1dce39e

Build ID: 7

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@mfbalin mfbalin marked this pull request as ready for review January 28, 2024 02:17

dgl-bot commented Jan 28, 2024

Commit ID: 0ad178a

Build ID: 8

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@frozenbugs

Overall LGTM, questions on python/dgl/graphbolt/impl/neighbor_sampler.py.

dgl-bot commented Feb 4, 2024

Commit ID: b52adea4a117bc8c2369dd6a96d4623d85c9e716

Build ID: 35

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

dgl-bot commented Feb 4, 2024

Commit ID: 23ade71

Build ID: 36

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot commented Feb 4, 2024

Commit ID: 6b21742

Build ID: 37

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

mfbalin commented Feb 4, 2024

@dgl-bot

dgl-bot commented Feb 4, 2024

Commit ID: 6b21742

Build ID: 38

Status: ❌ CI test failed in Stage [Torch CPU].

Report path: link

Full logs path: link

dgl-bot commented Feb 4, 2024

Commit ID: 0a21186

Build ID: 39

Status: ❌ CI test failed in Stage [Torch CPU].

Report path: link

Full logs path: link

mfbalin commented Feb 4, 2024

@dgl-bot

mfbalin commented Feb 4, 2024

@Rhett-Ying Does the CI have a problem?

dgl-bot commented Feb 4, 2024

Commit ID: 0a21186

Build ID: 40

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

mfbalin commented Feb 4, 2024

@dgl-bot

dgl-bot commented Feb 4, 2024

Commit ID: 0a21186

Build ID: 41

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

dgl-bot commented Feb 4, 2024

Commit ID: e0fa30b

Build ID: 42

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

@Rhett-Ying

@dgl-bot

dgl-bot commented Feb 5, 2024

Commit ID: e0fa30b

Build ID: 43

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link

@Rhett-Ying

@dgl-bot

dgl-bot commented Feb 5, 2024

Commit ID: e0fa30b

Build ID: 44

Status: ❌ CI test failed in Stage [GPU Build].

Report path: link

Full logs path: link

@Rhett-Ying

@dgl-bot

dgl-bot commented Feb 5, 2024

Commit ID: e0fa30b

Build ID: 45

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@frozenbugs frozenbugs merged commit badeaf1 into dmlc:master Feb 5, 2024
2 checks passed
This pull request was closed.
4 participants