[Utils] Edge and LINKX homophily measure #5382

mufeili · 2023-02-24T10:09:18Z

Description

Checklist

Please feel free to remove inapplicable items for your PR.

The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
I've leveraged the tools to beautify the Python and C++ code.
The PR is complete and small; read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests and documentation can be exempted).
All changes have test coverage
Code is well-documented
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
Related issue is referred in this PR
If the PR is for a new model/paper, I've updated the example index here.

Changes

dgl-bot · 2023-02-24T10:11:13Z

To trigger regression tests:

@dgl-bot run [instance-type] [which tests] [compare-with-branch];
For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

dgl-bot · 2023-02-24T11:13:27Z

Commit ID: 35e4d87

Build ID: 1

Status: ❌ CI test failed in Stage [Tensorflow GPU Unit test].

Report path: link

Full logs path: link

dgl-bot · 2023-02-24T12:29:13Z

Commit ID: a4a2f46

Build ID: 2

Status: ❌ CI test failed in Stage [Torch GPU Unit test].

Report path: link

Full logs path: link

dgl-bot · 2023-02-24T13:25:36Z

Commit ID: 1cdf545

Build ID: 3

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

dgl-bot · 2023-02-27T08:43:36Z

Commit ID: afd11e76b4d8064ba514a49496b359ac0c302654

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot · 2023-02-28T08:51:13Z

Commit ID: 899344ce39bf545bc451d6fa39c1a712267642af

Build ID: 7

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

jermainewang · 2023-02-28T09:49:52Z

python/dgl/homophily.py

        )
-        return graph.ndata["node_value"].mean().item()
+        return F.as_scalar(F.mean(graph.ndata["same_class_deg"], dim=0))


I want to point out that the implementation is awkward because of constraints in our current APIs: (1) we need to use the framework-agnostic backend, (2) integer-type aggregation is not supported, etc.

Ideally, it should be as simple as:

    u, v = graph.edges()
    graph.edata['same_class'] = (y[u.long()] == y[v.long()]).float()
    graph.update_all(...)
    return graph.ndata["same_class_deg"].mean()
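For readers following along without DGL installed, the quantity being computed can be sketched in plain Python. This is only an illustration of the node-wise definition discussed here, not the code in this PR: `node_homophily_ref` is a hypothetical name, the toy graph is made up, and treating nodes without incoming edges as contributing 0 is an assumption of the sketch.

```python
def node_homophily_ref(edges, labels):
    """Mean over all nodes of the fraction of incoming edges whose source
    has the same label as the destination.

    edges:  list of (src, dst) pairs
    labels: list with one class id per node
    Assumption: a node with no incoming edges contributes 0.
    """
    num_nodes = len(labels)
    same = [0] * num_nodes  # same-label incoming edges per node
    deg = [0] * num_nodes   # in-degree per node
    for u, v in edges:
        deg[v] += 1
        same[v] += int(labels[u] == labels[v])
    per_node = [s / d if d > 0 else 0.0 for s, d in zip(same, deg)]
    return sum(per_node) / num_nodes


# Toy graph, made up for illustration.
edges = [(1, 0), (2, 0), (0, 1), (2, 1), (0, 2)]
labels = [0, 0, 1]
print(node_homophily_ref(edges, labels))
```

Nodes 0 and 1 each have one matching in-edge out of two, and node 2 has none, so the result is the mean of 0.5, 0.5 and 0.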

dgl-bot · 2023-03-02T09:31:13Z

Commit ID: 2058918

Build ID: 8

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot · 2023-03-02T09:32:43Z

Commit ID: 2058918

Build ID: 9

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

dgl-bot · 2023-03-02T09:35:15Z

Commit ID: 83b6677

Build ID: 10

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

frozenbugs · 2023-03-02T10:07:43Z

python/dgl/homophily.py

+__all__ = ["node_homophily", "edge_homophily", "linkx_homophily"]
+
+
+def get_long_edges(graph):


Sounds good.

nit: Maybe rename to get_edges_long, more natural.

frozenbugs · 2023-03-02T10:08:46Z

python/dgl/homophily.py

+    ----------
+    graph : DGLGraph
+        The graph.
+    y : Tensor


torch.Tensor, and likewise for the others.

frozenbugs · 2023-03-02T10:10:03Z

python/dgl/homophily.py

+
+        for k in range(num_classes):
+            # Get the nodes that belong to class k.
+            class_mask = y == k


nit: class_mask = (y == k)

I initially did what you suggested, and then the lint check failed.

frozenbugs · 2023-03-02T10:14:54Z

tests/python/common/test_homophily.py

+    dgl.backend.backend_name != "pytorch", reason="Only support PyTorch for now"
+)
+@parametrize_idtype
+def test_linkx_homophily(idtype):


Are there any corner cases you need to handle?
e.g., there was a max(0, xxxx).
Should we check the 0 cases?

I think the current cases are sufficient.
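As a reference for what the "0 cases" look like, here is a minimal pure-Python sketch. It assumes the LINKX homophily definition in which each class k contributes max(0, h_k - |C_k| / n), where h_k is the fraction of edges starting in class k that also end in class k, and the sum is divided by (number of classes - 1). `linkx_homophily_ref` is a hypothetical name, not the function under test in this PR, and skipping classes with no outgoing edges is an assumption of the sketch.

```python
def linkx_homophily_ref(edges, labels):
    """Reference sketch of the LINKX homophily measure.

    edges:  list of (src, dst) pairs
    labels: list with one class id per node (ids 0..C-1, at least 2 classes)
    """
    num_nodes = len(labels)
    num_classes = max(labels) + 1
    total = 0.0
    for k in range(num_classes):
        # Edges whose source node belongs to class k.
        k_edges = [(u, v) for u, v in edges if labels[u] == k]
        if not k_edges:
            continue  # assumption: a class with no outgoing edges adds 0
        same = sum(1 for _, v in k_edges if labels[v] == k)
        class_frac = labels.count(k) / num_nodes
        # The clamp is what produces exact zeros for heterophilous classes.
        total += max(0.0, same / len(k_edges) - class_frac)
    return total / (num_classes - 1)


# Fully heterophilous pair: every per-class term is clamped to 0.
print(linkx_homophily_ref([(0, 1), (1, 0)], [0, 1]))
# Two disconnected same-class pairs: each class contributes 1 - 0.5.
print(linkx_homophily_ref([(0, 1), (1, 0), (2, 3), (3, 2)], [0, 0, 1, 1]))
```

A test of the clamp would need at least one graph like the first one, where the unclamped term is negative for every class.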

jermainewang · 2023-03-02T10:42:58Z

python/dgl/homophily.py

+def get_long_edges(graph):
+    """Internal function for getting the edges of a graph as long tensors."""
+    src, dst = graph.edges()
+    return src.long(), dst.long()


If it is only two lines, consider just inlining them.

I'm fine either way. Maybe you two can start a fight. :) @frozenbugs

jermainewang · 2023-03-02T10:47:57Z

python/dgl/homophily.py

+        graph.edata["same_class"] = (y[src] == y[dst]).float()
+        graph.update_all(
+            fn.copy_e("same_class", "m"), fn.sum("m", "same_class_deg")
+        )


OK, now I'm pushing this further. Would using the sparse API make the code more readable?

How so? You convert the graph to a sparse matrix and call AX. I don't think there are significant differences.

    with graph.local_scope():
        # Handle the case where graph is of dtype int32.
        src, dst = get_long_edges(graph)
        # Compute y_v = y_u for all edges.
        graph.edata["same_class"] = (y[src] == y[dst]).float()
        graph.update_all(
            fn.copy_e("same_class", "m"), fn.mean("m", "same_class_deg")
        )
        return graph.ndata["same_class_deg"].mean(dim=0).item()

vs.

    A = graph.adj
    same_class = (y[A.row] == y[A.col]).float()
    same_class_avg = dglsp.val_like(A, same_class).smean(dim=1)
    return same_class_avg.mean(dim=0).item()

vs. the new message passing API style:

    src, dst = get_long_edges(graph)
    same_class = (y[src] == y[dst]).float()
    same_class_avg = dgl.mpops.copy_e_mean(graph, same_class)
    return same_class_avg.mean(dim=0).item()

Still, it's quite subtle. I'm fine either way. The question is more about when we encourage the use of message passing APIs versus sparse APIs.

My opinion is to go with the math formulation: if the model is described as node-wise/edge-wise computation, then message passing is the way to go; otherwise, use sparse. In this case, the definition is in terms of nodes and edges, so message passing is more suitable. You can see that although the sparse version is shorter, it doesn't align well with the definition, e.g., the use of val_like and smean is not straightforward.
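To make the comparison concrete without depending on either DGL API, the two viewpoints can be sketched in plain Python. This is an illustration only: both function names are hypothetical, a dense nested list stands in for a sparse matrix, and nodes without incoming edges are assumed to contribute 0.

```python
def homophily_edgewise(edges, labels):
    """Message-passing view: every edge sends a 0/1 message to its
    destination, and each node averages the messages it receives."""
    n = len(labels)
    inbox = [[] for _ in range(n)]
    for u, v in edges:
        inbox[v].append(float(labels[u] == labels[v]))
    per_node = [sum(m) / len(m) if m else 0.0 for m in inbox]
    return sum(per_node) / n


def homophily_matrix(edges, labels):
    """Matrix view: build A with rows indexed by destination, keep only
    the same-class entries, and take the mean over each row's nonzeros."""
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[v][u] += 1
    per_node = []
    for v in range(n):
        nnz = sum(adj[v])
        same = sum(adj[v][u] for u in range(n) if labels[u] == labels[v])
        per_node.append(same / nnz if nnz else 0.0)
    return sum(per_node) / n


# Toy graph, made up for illustration.
edges = [(1, 0), (2, 0), (0, 1), (2, 1), (0, 2)]
labels = [0, 0, 1]
print(homophily_edgewise(edges, labels), homophily_matrix(edges, labels))
```

Both functions return the same value; the edge-wise version reads closer to the per-node, per-edge wording of the definition, which is the argument made above.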

jermainewang · 2023-03-02T10:54:45Z

tests/python/common/test_homophily.py

+    dgl.backend.backend_name != "pytorch", reason="Only support PyTorch for now"
+)
+@parametrize_idtype
+def test_linkx_homophily(idtype):


dgl-bot · 2023-03-03T05:47:07Z

Commit ID: 45116b9f96057f4dacfc7a612497bbe78d4969e8

Build ID: 11

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

dgl-bot · 2023-03-03T06:12:53Z

Commit ID: a2ecacdea5e48576edf01bb459160ffa851869da

Build ID: 12

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

jermainewang

LGTM. Do you want to remove the [DoNotMerge] tag?

dgl-bot · 2023-03-03T07:38:22Z

Commit ID: 989fd86810e476dd7e01e26fa9923716972d4ac8

Build ID: 13

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

dgl-bot · 2023-03-03T08:04:59Z

Commit ID: 31fabd94d5b10d7f07a4d1714a385ca5125b60a4

Build ID: 14

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

dgl-bot · 2023-03-03T08:49:28Z

Commit ID: 8714d4a

Build ID: 15

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

* Update
* lint
* lint
* r prefix
* CI
* lint
* skip TF
* Update
* edge homophily
* linkx homophily
* format
* skip TF
* fix test
* update
* lint
* lint
* review
* lint
* update
* lint
* update
* CI

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-36-188.ap-northeast-1.compute.internal>

Ubuntu and others added 14 commits February 23, 2023 07:41

Update

3e66773

lint

d1785ef

lint

e613ddf

r prefix

4aac2dd

CI

d7d2144

lint

618c963

skip TF

cfdd975

Update

8396fc5

Merge branch 'master' into homophily

50ec561

edge homophily

20ac1a9

Merge branch 'homophily' of https://github.com/mufeili/dgl into homophily

7bd5e44

linkx homophily

2e8104b

format

331ad0b

Merge branch 'master' into homophily

35e4d87

mufeili requested review from jermainewang and frozenbugs February 24, 2023 10:11

mufeili mentioned this pull request Feb 24, 2023

Utils for homophily measures #5351

Closed

3 tasks

Ubuntu added 2 commits February 24, 2023 11:21

skip TF

13cc169

Merge branch 'homophily' of https://github.com/mufeili/dgl into homophily

a4a2f46

fix test

1cdf545

mufeili requested review from BarclayII and rudongyu February 27, 2023 07:29

frozenbugs approved these changes Feb 27, 2023

mufeili changed the title ~~[Utils] Edge and LINKX homophily measure~~ [DoNotMerge] [Utils] Edge and LINKX homophily measure Feb 27, 2023

update

d359ba5

jermainewang requested changes Feb 28, 2023

rudongyu reviewed Mar 2, 2023

Ubuntu and others added 2 commits March 2, 2023 09:29

review

58a7153

Merge branch 'master' into homophily

2058918

mufeili requested review from jermainewang and rudongyu March 2, 2023 09:30

Ubuntu added 2 commits March 2, 2023 09:31

lint

9f179e5

Merge branch 'homophily' of https://github.com/mufeili/dgl into homophily

83b6677

frozenbugs reviewed Mar 2, 2023

jermainewang requested changes Mar 2, 2023

update

abe7381

mufeili requested a review from jermainewang March 3, 2023 05:46

lint

c1e54e3

update

e35d875

jermainewang approved these changes Mar 3, 2023

CI

558a07e

mufeili changed the title ~~[DoNotMerge] [Utils] Edge and LINKX homophily measure~~ [Utils] Edge and LINKX homophily measure Mar 3, 2023

Merge branch 'master' into homophily

8714d4a

mufeili merged commit f00cd6e into dmlc:master Mar 3, 2023

mufeili deleted the homophily branch March 3, 2023 08:58

This pull request was closed.
