
[DO NOT MERGE][Sparse] Support SpSpMul #5464

Open · wants to merge 1 commit into master

Conversation

czkkkkkk (Collaborator)

Description

Resolves #5368.
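
For context, SpSpMul performs element-wise multiplication of two sparse matrices; the result is non-zero only at coordinates where both inputs are non-zero, even when the two sparsity patterns differ. A minimal dense-reference sketch of that semantics (plain torch, illustrative names, not the DGL API):

```python
import torch

def spspmul_dense_reference(shape, a_idx, a_val, b_idx, b_val):
    # Build dense equivalents and multiply; the product is non-zero only
    # where BOTH inputs are non-zero, i.e. on the intersection of the
    # two sparsity patterns.
    A = torch.zeros(shape)
    A[a_idx[0], a_idx[1]] = a_val
    B = torch.zeros(shape)
    B[b_idx[0], b_idx[1]] = b_val
    return A * B

# A is non-zero at (0, 0) and (1, 1); B is non-zero at (1, 1) and (0, 2).
shape = (2, 3)
a_idx = torch.tensor([[0, 1], [0, 1]])
a_val = torch.tensor([2.0, 3.0])
b_idx = torch.tensor([[1, 0], [1, 2]])
b_val = torch.tensor([5.0, 7.0])
# Only the shared coordinate (1, 1) survives: 3.0 * 5.0 = 15.0.
print(spspmul_dense_reference(shape, a_idx, a_val, b_idx, b_val))
```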

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small, read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • The related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot (Collaborator) commented Mar 17, 2023

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: e23ddaf

Build ID: 1

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: 81705d5

Build ID: 2

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: e5e804c

Build ID: 3

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 19, 2023

Commit ID: 032fb2e

Build ID: 4

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

shared by both matrices and the non-zero value from the first matrix at each
coordinate. The indices tensor shows the indices of the common coordinates
based on the first matrix.
*/
Collaborator: Add * at the beginning of each row

* @return SparseMatrix
*/
c10::intrusive_ptr<SparseMatrix> SpSpMul(
const c10::intrusive_ptr<SparseMatrix>& lhs_mat,
Collaborator: Make it consistent --> use A & B?

static variable_list forward(
AutogradContext* ctx, c10::intrusive_ptr<SparseMatrix> lhs_mat,
torch::Tensor lhs_val, c10::intrusive_ptr<SparseMatrix> rhs_mat,
torch::Tensor rhs_val);
Collaborator: Ditto: use A & B, A_val, B_val.

std::tie(lhs_intersect_rhs, lhs_indices) =
SparseMatrixIntersection(lhs_mat, lhs_val, rhs_mat);
std::tie(rhs_intersect_lhs, rhs_indices) =
SparseMatrixIntersection(rhs_mat, rhs_val, lhs_intersect_rhs);
Collaborator: Why do we need to call intersection twice? Can we simplify it to just one intersection calculation?
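
For reference, a minimal sketch of what the two passes compute, assuming linearized, sorted, duplicate-free coordinates (`intersect` below is a stand-in for SparseMatrixIntersection, not the actual C++ helper): the first call restricts A to the coordinates shared with B, recording each kept entry's position in A, and the second aligns B's values to exactly that common coordinate set.

```python
import torch

def intersect(a_lin, a_val, b_lin):
    # Keep only the entries of (a_lin, a_val) whose coordinate also
    # appears in b_lin; also return their positions within a.
    mask = torch.isin(a_lin, b_lin)
    pos = mask.nonzero(as_tuple=True)[0]
    return a_lin[pos], a_val[pos], pos

a_lin = torch.tensor([0, 4, 7])
a_val = torch.tensor([2.0, 3.0, 4.0])
b_lin = torch.tensor([4, 5, 7])
b_val = torch.tensor([5.0, 6.0, 7.0])

# Pass 1: restrict A to coordinates shared with B (carrying A's values).
common, a_common_val, a_pos = intersect(a_lin, a_val, b_lin)
# Pass 2: align B's values to that same common coordinate set.
_, b_common_val, b_pos = intersect(b_lin, b_val, common)
print(a_common_val * b_common_val)  # tensor([15., 28.])
```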

@jermainewang (Member) left a comment:

Generally LGTM. Two comments.

}
TORCH_CHECK(
!lhs_mat->HasDuplicate() && !rhs_mat->HasDuplicate(),
"Only support SpSpMul on sparse matrices without duplicate values")
Member: HasDuplicate() is costly, as it requires sorting and a linear scan (on GPU it will also incur a CPU-GPU synchronization). I understand that SpSpMul shall not support matrices with duplicate entries. My question is: what is the general best practice for handling such cases? I see three options:

  1. Use a heavy check like the code here.
  2. Try to design a light check.
  3. Say this is undefined behavior, but make sure the operation will not crash.

cc @frozenbugs
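
For reference, the cost profile being described, sketched in plain torch (sort plus adjacent-pair scan over linearized coordinates; this is an illustration, not DGL's HasDuplicate implementation):

```python
import torch

def has_duplicate(row, col, num_cols):
    # Linearize (row, col), sort, and compare adjacent entries.
    lin = (row * num_cols + col).sort().values   # O(nnz log nnz) sort
    dup = (lin[1:] == lin[:-1]).any()            # linear scan
    return dup.item()                            # on GPU: CPU-GPU sync here

row = torch.tensor([0, 1, 1])
col = torch.tensor([2, 0, 0])
print(has_duplicate(row, col, num_cols=3))  # True: (1, 0) appears twice
```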

@@ -225,18 +233,36 @@ def test_sub_sparse_diag(val_shape):
assert torch.allclose(dense_diff, -diff4)


@pytest.mark.parametrize("op", ["mul", "truediv", "pow"])
Member: Looks like the new test omits the cases for "truediv" and "pow". Have they been covered in other test cases?
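
For reference, the value-wise pattern such parametrized tests typically reduce to when both operands share a sparsity pattern, sketched with plain torch and `operator` dispatch (illustrative only, not the PR's test code):

```python
import operator
import pytest
import torch

@pytest.mark.parametrize("op", ["mul", "truediv", "pow"])
def test_elementwise_value_wise(op):
    f = getattr(operator, op)
    # With identical sparsity patterns, an element-wise op on two sparse
    # matrices reduces to applying the op on the non-zero value arrays.
    val1 = torch.tensor([2.0, 3.0])
    val2 = torch.tensor([4.0, 2.0])
    expected = torch.tensor([f(2.0, 4.0), f(3.0, 2.0)])
    assert torch.allclose(f(val1, val2), expected)
```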

@frozenbugs changed the title from [Sparse] Support SpSpMul to [DO NOT MERGE] Support SpSpMul (Mar 21, 2023)
@dgl-bot (Collaborator) commented Mar 21, 2023

Commit ID: 631577eeb6b4f0e2f12450707424dd4c49b095c8

Build ID: 5

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

@czkkkkkk changed the title from [DO NOT MERGE] Support SpSpMul to [DO NOT MERGE][Sparse] Support SpSpMul (Mar 23, 2023)
@dgl-bot (Collaborator) commented Mar 23, 2023

Commit ID: ad7a194d06296950ca5065584b19952722a89064

Build ID: 6

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

Development

Successfully merging this pull request may close these issues.

[Sparse] Support SparseMatrix element-wise multiplication with different sparsities.
4 participants