Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance][Feature] Implement edge excluding in EdgeDataLoader on GPU #3226

Merged
merged 17 commits into from
Aug 19, 2021

Conversation

nv-dlasalle
Copy link
Collaborator

Description

When the output device for the dataloader is a GPU, this enable making use of the cuda hash table implementation to find the set of edges to remove.

In my testing using /examples/pytorch/graphsage/train_sampling_unsupervised.py, this reduced the time to find the edges from 148ms using numpy's isin() method, to 2.0ms to copy the edge ids to the GPU and check them against the excluded edges (74x speedup). The majority of this is performing the CPU to GPU copy. In terms of just checking the edges, it takes just 200us.

Although I believe outside of the scope of this PR, the CPU side of things would benefit from a similar strategy.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change

Changes

Add Filter class to store the hashmap of edges to be excluded. This speeds up the process when there are multiple layers, as opposed to generating the same hashmap for each layer.

This also adds a unit test.

@dgl-bot
Copy link
Collaborator

dgl-bot commented Aug 6, 2021

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@BarclayII
Copy link
Collaborator

@zheng-da How do you like it combined with #2971? Ideally I would like to see the entry point of CPU and GPU implementations unified

Copy link
Collaborator

@BarclayII BarclayII left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some inspection I found the PR largely orthogonal to #2971 since it is just moving the edge ID exclusion postprocessing to C++/CUDA. Although if the latter was merged then we probably want to skip the edge ID exclusion postprocessing.

There's a potential conflict with #2971 though since the latter is moving the edge ID exclusion into sample_frontier.



class Filter(object):
"""Class used to either find either find the subset of ids that are in this
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"either find either find". Also we usually capitalize "ids" as "IDs".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in
fe0bc66.


class Filter(object):
"""Class used to either find either find the subset of ids that are in this
filter, or th subset of ids that are not in this filter,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in
fe0bc66.

self._exclude_eids = None
self._filter = None

if device == F.cpu():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a TODO for CPU? If the filters are implemented for CPU then we won't use _locate_eids_to_exclude right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in
aa21891.

@zheng-da zheng-da merged commit f634950 into dmlc:master Aug 19, 2021
BarclayII added a commit that referenced this pull request Aug 26, 2021
…GPU (#3226)

* Update filter code

* Add unit tests

* Fixes

* Switch to indices

* Rename functions

* Fix linting

* Fix whitespace

* Add doc

* Fix heterograph

* Change workspace allocation

* Fix linting

* Fix docs in filter.py

* Add todo

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants