
[GraphBolt] Fix gpu NegativeSampler for seeds. #7068

Merged
merged 3 commits into from
Feb 4, 2024

Conversation

@yxy235 (Collaborator) commented Feb 2, 2024

Description

  1. Move seeds, indexes, and labels to the GPU when sampling on the GPU.
  2. Add tests for NegativeSampler on the GPU.
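The first change amounts to ensuring all minibatch tensors live on the sampling device. As a minimal sketch (the helper name `to_device` and the standalone tensors are illustrative, not the actual GraphBolt API):

```python
import torch

def to_device(data, device):
    # Hypothetical helper: recursively move a tensor, or a dict of
    # tensors, to `device`. Illustrative only; not DGL's real API.
    if isinstance(data, torch.Tensor):
        return data.to(device)
    if isinstance(data, dict):
        return {k: to_device(v, device) for k, v in data.items()}
    return data

# When sampling on the GPU, seeds/indexes/labels must all be on the
# same device as the graph being sampled, or kernels will fail.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
seeds = to_device(torch.tensor([0, 1, 2]), device)
indexes = to_device(torch.tensor([0, 0, 1]), device)
labels = to_device(torch.tensor([1, 1, 0]), device)
assert seeds.device == indexes.device == labels.device
```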

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • The related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot (Collaborator) commented Feb 2, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot (Collaborator) commented Feb 2, 2024

Commit ID: 1835ce3

Build ID: 1

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin mfbalin self-requested a review February 2, 2024 08:38
@mfbalin (Collaborator) commented Feb 2, 2024

@yxy235 If you ask for my review as well, it will be easier for me to keep track of what is changing when it comes to GPU GraphBolt support.

@mfbalin (Collaborator) left a comment

LGTM overall, suggested a minor improvement.

python/dgl/graphbolt/impl/uniform_negative_sampler.py (review thread, outdated, resolved)

@dgl-bot (Collaborator) commented Feb 2, 2024

Commit ID: 4c976f4ac12c462d8726678b4bde5ddb17fec99e

Build ID: 2

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mfbalin (Collaborator) commented Feb 2, 2024

I was experimenting to see the best way to create such tensors. Here is what I ran:

```python
import torch
import torch.utils.benchmark as benchmark

def f(pos_num, neg_num, dtype=torch.bool, device="cuda:0"):
    # Variant 1: allocate the two halves, then concatenate.
    return torch.cat(
        (
            torch.ones(pos_num, dtype=dtype, device=device),
            torch.zeros(neg_num, dtype=dtype, device=device),
        ),
    )

def g(pos_num, neg_num, dtype=torch.bool, device="cuda:0"):
    # Variant 2: preallocate once, then fill the two ranges in place.
    labels = torch.empty(pos_num + neg_num, dtype=dtype, device=device)
    labels[:pos_num] = 1
    labels[pos_num:] = 0
    return labels

assert torch.equal(f(10, 20), g(10, 20))

N = 10000000
neg_factor = 2

stmt = f'f({N}, {N * neg_factor})'

f_timer = benchmark.Timer(stmt=stmt, setup='import torch', globals={'f': f})
# Note: the stmt still calls `f`, but `globals` binds the name `f` to `g`,
# so this second timer actually measures `g`.
g_timer = benchmark.Timer(stmt=stmt, setup='import torch', globals={'f': g})

f_timer.timeit(1000), g_timer.timeit(1000)
```

The output of the code above on Colab is as follows:

```
(<torch.utils.benchmark.utils.common.Measurement object at 0x7d385ed635b0>
 f(10000000, 20000000)
 setup: import torch
   604.70 us
   1 measurement, 1000 runs, 1 thread,
 <torch.utils.benchmark.utils.common.Measurement object at 0x7d377182ef80>
 f(10000000, 20000000)
 setup: import torch
   133.53 us
   1 measurement, 1000 runs, 1 thread)
```

So the preallocate-and-fill variant (`g`) is roughly 4.5x faster than the allocate-and-`torch.cat` variant (`f`) at this size.
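As an aside not raised in the PR itself: a third way to build such a positive/negative label mask is a single `torch.arange` comparison. This is only a sketch of the idea; its performance was not benchmarked here.

```python
import torch

def h(pos_num, neg_num, dtype=torch.bool, device="cpu"):
    # Build the label mask from one arange: indices below pos_num are
    # positives (1/True), the rest are negatives (0/False).
    return (torch.arange(pos_num + neg_num, device=device) < pos_num).to(dtype)

# Agrees with concatenating ones and zeros of the same lengths.
expected = torch.cat(
    (torch.ones(10, dtype=torch.bool), torch.zeros(20, dtype=torch.bool))
)
assert torch.equal(h(10, 20), expected)
```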

@mfbalin (Collaborator) commented Feb 2, 2024

My experiment and suggestion above are just a nit; I wanted to see the best way to do it.

@yxy235 (Collaborator, Author) commented Feb 2, 2024

> My experiment and suggestion above is a nit, I just wanted to see what is the best way to do it.

I see. I will change it later for better performance.

@mfbalin (Collaborator) left a comment

LGTM, with minor nit comments that don't need to be addressed in this PR. However, we might want to scan the whole codebase and make similar improvements. Such small improvements, applied across the whole codebase, would make a meaningful difference in performance.

@yxy235 yxy235 merged commit af0b63e into dmlc:master Feb 4, 2024
2 checks passed
@yxy235 yxy235 deleted the fix_negative_sampler_seeds_gpu branch February 6, 2024 07:39
@yxy235 yxy235 mentioned this pull request Mar 6, 2024