[Data] new functional for creating data splits in graph #5418

Merged
21 commits merged on Mar 9, 2023

Conversation

gvbazhenov
Contributor

@gvbazhenov gvbazhenov commented Mar 2, 2023

Description

New functionality for creating data splits, which makes it possible to induce distributional shifts in a graph and to run experiments with graph models in more challenging setups.
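For readers unfamiliar with the idea, here is a minimal, dependency-free sketch of a property-based split. It is an illustration only, not the code added in this PR: the function name `split_nodes_by_property`, its parameters, and the toy graph are all hypothetical. Nodes are ordered by a structural property (node degree here) and cut into consecutive parts, so the training part and the test part come from different regions of the property distribution.

```python
import random


def split_nodes_by_property(property_values, part_ratios, ascending=True, seed=None):
    """Hypothetical helper: cut node IDs into consecutive parts ordered by a property.

    property_values[i] is the property of node i; part_ratios must sum to 1.
    """
    if abs(sum(part_ratios) - 1.0) > 1e-6:
        raise ValueError("part_ratios must sum to 1")

    num_nodes = len(property_values)
    node_ids = list(range(num_nodes))

    # Shuffle before the (stable) sort so that ties are broken at random.
    random.Random(seed).shuffle(node_ids)
    node_ids.sort(key=lambda i: property_values[i], reverse=not ascending)

    parts, start, cumulative = [], 0, 0.0
    for k, ratio in enumerate(part_ratios):
        cumulative += ratio
        # The last part takes whatever remains, so no node is dropped.
        is_last = k == len(part_ratios) - 1
        end = num_nodes if is_last else int(cumulative * num_nodes)
        parts.append(node_ids[start:end])
        start = end
    return parts


# Toy undirected graph (an assumption for illustration), with degree as the property.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (2, 3), (4, 5)]
degree = [0] * 6
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Train on high-degree nodes, validate and test on progressively lower-degree ones.
train_ids, val_ids, test_ids = split_nodes_by_property(
    degree, [0.5, 0.25, 0.25], ascending=False, seed=0
)
```

With `ascending=False`, every training node has a degree at least as large as every test node, which is the kind of shift between training and evaluation data that the description refers to.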

Unfortunately, I did not manage to cover the changes with tests, because I could not work out the proper way to create the pytorch-ci environment that task_unit_test.sh expects. In my custom environment, some tests failed because of incompatible package versions (module 'numpy' has no attribute 'asscalar'), missing packages (no module named 'ogb'), etc.
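On the first error: `numpy.asscalar` was deprecated in NumPy 1.16 and removed in NumPy 1.23, so any code that still calls it fails on newer NumPy versions. The thread does not show which test code triggered the error, so the snippet below is only a sketch of the usual replacement, `ndarray.item()`; the helper name `as_python_scalar` is hypothetical.

```python
import numpy as np


def as_python_scalar(value):
    """Hypothetical helper: convert a size-1 array or NumPy scalar to a Python scalar.

    Uses ndarray.item(), the documented replacement for the removed np.asscalar.
    """
    return np.asarray(value).item()


count = as_python_scalar(np.array([3]))    # plain Python int
ratio = as_python_scalar(np.float32(1.5))  # plain Python float
```

Pinning NumPy to a release that still ships `asscalar` would be the alternative when the calling code cannot be changed.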

However, I have successfully built DGL from source with the new functions and checked that they work as expected.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small. Read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@dgl-bot
Collaborator

dgl-bot commented Mar 2, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 2, 2023

Commit ID: cbd793b

Build ID: 1

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@gvbazhenov
Contributor Author

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 3, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 3, 2023

Commit ID: f8e3460

Build ID: 2

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@jermainewang
Member

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 3, 2023

Commit ID: f8e3460

Build ID: 3

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 3, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 3, 2023

Commit ID: a1e8c55

Build ID: 4

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@gvbazhenov
Contributor Author

I have just made a minor fix to the data split implementation, which should not affect test coverage.

By the way, it is not clear to me why the CI test fails in the Torch CPU (Win64) stage. I have successfully built DGL from source and tested the new functions on a Linux system with a GPU, so I might be missing something. Could you please help me figure out the problem and solve it?

@mufeili
Member

mufeili commented Mar 6, 2023

The failures were not related to your changes. Let me trigger the tests again.

@mufeili
Member

mufeili commented Mar 6, 2023

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 6, 2023

Commit ID: 34ff71d179a64f68ad9f9b067477380a232c4f41

Build ID: 5

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

10 review comments on python/dgl/data/utils.py (outdated, resolved)

@dgl-bot
Collaborator

dgl-bot commented Mar 8, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 8, 2023

Commit ID: c7ddd2b

Build ID: 16

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 8, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 8, 2023

Commit ID: 00c842b

Build ID: 17

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@gvbazhenov
Contributor Author

> please remove the unrelated third_party/nccl

Done.

@frozenbugs
Collaborator

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: 2f9677c

Build ID: 18

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: 2f9677c

Build ID: 19

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@frozenbugs
Collaborator

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: f3058c4e73a104605789273c51bbf948e58eeac8

Build ID: 20

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mufeili
Member

mufeili commented Mar 9, 2023

@gvbazhenov You need to fix the issues raised by Lint / lintrunner. Click the Details button to see the requests.

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: 6d98537

Build ID: 21

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: 57fd348

Build ID: 22

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@gvbazhenov
Contributor Author

> @gvbazhenov You need to fix the issues raised by Lint / lintrunner. Click the Details button to see the requests.

Done in commit d60cf9a.

@frozenbugs
Collaborator

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Mar 9, 2023

Commit ID: 57fd348

Build ID: 23

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@mufeili mufeili merged commit 54b4bd0 into dmlc:master Mar 9, 2023
@gvbazhenov
Contributor Author

@jermainewang @mufeili @frozenbugs Thank you very much for your help and guidance.

@mufeili
Member

mufeili commented Mar 9, 2023

@gvbazhenov Thank you for the great job!

DominikaJedynak pushed a commit to DominikaJedynak/dgl that referenced this pull request Mar 12, 2024
* new functional for creating data splits in graph

* minor fix in data split implementation

* apply suggestions from code review

Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* refactoring + unit tests

* fix test file name

* move imports to the top

* Revert "fix test file name"

This reverts commit 126323e.

* remove nccl submodule

* address linter issues

---------

Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
This pull request was closed.
5 participants