
[DO NOT MERGE][Sparse] Support SpSpMul #5464

Open · wants to merge 1 commit into master

Conversation

czkkkkkk (Collaborator)

Description

Resolves #5368.
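
For context, SpSpMul performs element-wise multiplication of two sparse matrices; the result is non-zero only at coordinates where both inputs are non-zero, even when the two sparsity patterns differ. A minimal dense-reference sketch of that semantics (plain torch, illustrative names, not the DGL API):

```python
import torch

def spspmul_dense_reference(shape, a_idx, a_val, b_idx, b_val):
    # Build dense equivalents and multiply; the product is non-zero only
    # where BOTH inputs are non-zero, i.e. on the intersection of the
    # two sparsity patterns.
    A = torch.zeros(shape)
    A[a_idx[0], a_idx[1]] = a_val
    B = torch.zeros(shape)
    B[b_idx[0], b_idx[1]] = b_val
    return A * B

# A is non-zero at (0, 0) and (1, 1); B is non-zero at (1, 1) and (0, 2).
shape = (2, 3)
a_idx = torch.tensor([[0, 1], [0, 1]])
a_val = torch.tensor([2.0, 3.0])
b_idx = torch.tensor([[1, 0], [1, 2]])
b_val = torch.tensor([5.0, 7.0])
# Only the shared coordinate (1, 1) survives: 3.0 * 5.0 = 15.0.
print(spspmul_dense_reference(shape, a_idx, a_val, b_idx, b_val))
```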

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small, read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • The related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot (Collaborator) commented Mar 17, 2023

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: e23ddaf

Build ID: 1

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: 81705d5

Build ID: 2

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 17, 2023

Commit ID: e5e804c

Build ID: 3

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

@dgl-bot (Collaborator) commented Mar 19, 2023

Commit ID: 032fb2e

Build ID: 4

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

shared by both matrices and the non-zero value from the first matrix at each
coordinate. The indices tensor shows the indices of the common coordinates
based on the first matrix.
*/
Collaborator: Add * at the beginning of each row

* @return SparseMatrix
*/
c10::intrusive_ptr<SparseMatrix> SpSpMul(
const c10::intrusive_ptr<SparseMatrix>& lhs_mat,
Collaborator: Make it consistent --> use A & B?

static variable_list forward(
AutogradContext* ctx, c10::intrusive_ptr<SparseMatrix> lhs_mat,
torch::Tensor lhs_val, c10::intrusive_ptr<SparseMatrix> rhs_mat,
torch::Tensor rhs_val);
Collaborator: Ditto: use A & B, A_val, B_val.

std::tie(lhs_intersect_rhs, lhs_indices) =
SparseMatrixIntersection(lhs_mat, lhs_val, rhs_mat);
std::tie(rhs_intersect_lhs, rhs_indices) =
SparseMatrixIntersection(rhs_mat, rhs_val, lhs_intersect_rhs);
Collaborator: Why do we need to call intersection twice? Can we simplify it to just one intersection calculation?
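
For reference, a minimal sketch of what the two passes compute, assuming linearized, sorted, duplicate-free coordinates (`intersect` below is a stand-in for SparseMatrixIntersection, not the actual C++ helper): the first call restricts A to the coordinates shared with B, recording each kept entry's position in A, and the second aligns B's values to exactly that common coordinate set.

```python
import torch

def intersect(a_lin, a_val, b_lin):
    # Keep only the entries of (a_lin, a_val) whose coordinate also
    # appears in b_lin; also return their positions within a.
    mask = torch.isin(a_lin, b_lin)
    pos = mask.nonzero(as_tuple=True)[0]
    return a_lin[pos], a_val[pos], pos

a_lin = torch.tensor([0, 4, 7])
a_val = torch.tensor([2.0, 3.0, 4.0])
b_lin = torch.tensor([4, 5, 7])
b_val = torch.tensor([5.0, 6.0, 7.0])

# Pass 1: restrict A to coordinates shared with B (carrying A's values).
common, a_common_val, a_pos = intersect(a_lin, a_val, b_lin)
# Pass 2: align B's values to that same common coordinate set.
_, b_common_val, b_pos = intersect(b_lin, b_val, common)
print(a_common_val * b_common_val)  # tensor([15., 28.])
```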

@jermainewang (Member) left a comment:

Generally LGTM. Two comments.

}
TORCH_CHECK(
!lhs_mat->HasDuplicate() && !rhs_mat->HasDuplicate(),
"Only support SpSpMul on sparse matrices without duplicate values")
Member: HasDuplicate() is costly, as it requires sorting and a linear scan (on GPU it will also incur a CPU-GPU synchronization). I understand that SpSpMul shall not support matrices with duplicate entries. My question is: what is the general best practice for handling such cases? I see three options:

  1. Use a heavy check like the code here.
  2. Try to design a light check.
  3. Say this is undefined behavior, but make sure the operation will not crash.

cc @frozenbugs
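
For reference, the cost profile being described, sketched in plain torch (sort plus adjacent-pair scan over linearized coordinates; this is an illustration, not DGL's HasDuplicate implementation):

```python
import torch

def has_duplicate(row, col, num_cols):
    # Linearize (row, col), sort, and compare adjacent entries.
    lin = (row * num_cols + col).sort().values   # O(nnz log nnz) sort
    dup = (lin[1:] == lin[:-1]).any()            # linear scan
    return dup.item()                            # on GPU: CPU-GPU sync here

row = torch.tensor([0, 1, 1])
col = torch.tensor([2, 0, 0])
print(has_duplicate(row, col, num_cols=3))  # True: (1, 0) appears twice
```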

@@ -225,18 +233,36 @@ def test_sub_sparse_diag(val_shape):
assert torch.allclose(dense_diff, -diff4)


@pytest.mark.parametrize("op", ["mul", "truediv", "pow"])
Member: Looks like the new test omits the cases for "truediv" and "pow". Have they been covered in other test cases?
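
For reference, the value-wise pattern such parametrized tests typically reduce to when both operands share a sparsity pattern, sketched with plain torch and `operator` dispatch (illustrative only, not the PR's test code):

```python
import operator
import pytest
import torch

@pytest.mark.parametrize("op", ["mul", "truediv", "pow"])
def test_elementwise_value_wise(op):
    f = getattr(operator, op)
    # With identical sparsity patterns, an element-wise op on two sparse
    # matrices reduces to applying the op on the non-zero value arrays.
    val1 = torch.tensor([2.0, 3.0])
    val2 = torch.tensor([4.0, 2.0])
    expected = torch.tensor([f(2.0, 4.0), f(3.0, 2.0)])
    assert torch.allclose(f(val1, val2), expected)
```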

@frozenbugs changed the title from [Sparse] Support SpSpMul to [DO NOT MERGE] Support SpSpMul (Mar 21, 2023)
@dgl-bot (Collaborator) commented Mar 21, 2023

Commit ID: 631577eeb6b4f0e2f12450707424dd4c49b095c8

Build ID: 5

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

@czkkkkkk changed the title from [DO NOT MERGE] Support SpSpMul to [DO NOT MERGE][Sparse] Support SpSpMul (Mar 23, 2023)
@dgl-bot (Collaborator) commented Mar 23, 2023

Commit ID: ad7a194d06296950ca5065584b19952722a89064

Build ID: 6

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

Development

Successfully merging this pull request may close these issues.

[Sparse] Support SparseMatrix element-wise multiplication with different sparsities.
4 participants