[Sparse] Add relabel python API #6323

Merged (2 commits) on Sep 15, 2023
68 changes: 68 additions & 0 deletions python/dgl/sparse/sparse_matrix.py
@@ -680,6 +680,74 @@ def sample(
self.c_sparse_matrix.sample(dim, fanout, ids, replace, bias)
)

def relabel(
self,
dim: int,
leading_indices: Optional[torch.Tensor] = None,
):
"""Relabels indices of a dimension and removes rows or columns without
non-zero elements in the sparse matrix.

This function serves a dual purpose: it allows you to reorganize the
indices within a specific dimension (rows or columns) of the sparse
matrix and, if needed, place certain 'leading_indices' at the beginning
of the relabeled dimension.

In the absence of 'leading_indices' (when it's set to `None`), the order
of relabeled indices remains the same as the original order, except that
rows or columns without non-zero elements are removed. When
'leading_indices' are provided, they are positioned at the start of the
relabeled dimension.

This function mimics 'dgl.to_block', a method used to compress a sampled
subgraph by eliminating redundant nodes. The 'leading_indices' parameter
replicates the behavior of 'include_dst_in_src' in 'dgl.to_block',
adding destination node information for message passing.
Setting 'leading_indices' to column IDs when relabeling the row
dimension, for example, achieves the same effect as including destination
nodes in source nodes.

Parameters
----------
dim : int
The dimension to relabel. Should be 0 or 1. Use `dim = 0` for rowwise
relabeling and `dim = 1` for columnwise relabeling.
leading_indices : torch.Tensor, optional
An optional tensor containing row or column ids that should be placed
at the beginning of the relabeled dimension.

Returns
-------
Tuple[SparseMatrix, torch.Tensor]
A tuple containing the relabeled sparse matrix and the index mapping
of the relabeled dimension from the new index to the original index.

Examples
--------
>>> indices = torch.tensor([[0, 2],
[1, 2]])
>>> A = dglsp.spmatrix(indices)

Case 1: Relabel rows without leading indices.

>>> B, original_rows = A.relabel(dim=0, leading_indices=None)
>>> print(B)
SparseMatrix(indices=tensor([[0, 1], [1, 2]]),
shape=(2, 3), nnz=2)
>>> print(original_rows)
tensor([0, 2])

Case 2: Relabel rows with leading indices.

>>> B, original_rows = A.relabel(dim=0, leading_indices=torch.tensor([1, 2]))
>>> print(B)
SparseMatrix(indices=tensor([[1, 2], [2, 1]]),
shape=(3, 3), nnz=2)
>>> print(original_rows)
tensor([1, 2, 0])
"""
raise NotImplementedError


def spmatrix(
indices: torch.Tensor,
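Since `relabel` only raises `NotImplementedError` in this commit, the semantics described in the docstring can be checked against a small reference model. The sketch below is not DGL code: `relabel_reference` is a hypothetical pure-Python helper that works on plain coordinate lists instead of a `SparseMatrix`. It assumes that "original order" means ascending index order and that leading indices are kept even when they own no non-zero element, which is what the docstring examples suggest.

```python
from typing import List, Optional, Sequence, Tuple


def relabel_reference(
    rows: Sequence[int],
    cols: Sequence[int],
    shape: Tuple[int, int],
    dim: int,
    leading_indices: Optional[Sequence[int]] = None,
) -> Tuple[List[int], List[int], Tuple[int, int], List[int]]:
    """Hypothetical model of the documented relabel semantics (not DGL's API)."""
    if dim not in (0, 1):
        raise ValueError("dim should be 0 or 1")
    old_ids = rows if dim == 0 else cols

    # New-to-original mapping: leading indices first (in the given order),
    # then every remaining ID that owns a non-zero, in ascending order
    # (assumption: this is what the docstring calls the "original order").
    original_ids: List[int] = []
    seen = set()
    for i in list(leading_indices or []) + sorted(set(old_ids)):
        if i not in seen:
            seen.add(i)
            original_ids.append(i)

    # Original-to-new lookup, applied to every stored coordinate.
    new_id = {old: new for new, old in enumerate(original_ids)}
    relabeled = [new_id[i] for i in old_ids]

    if dim == 0:
        new_rows, new_cols = relabeled, list(cols)
        new_shape = (len(original_ids), shape[1])
    else:
        new_rows, new_cols = list(rows), relabeled
        new_shape = (shape[0], len(original_ids))

    # Sort coordinates row-major so the result matches the docstring layout.
    order = sorted(range(len(new_rows)), key=lambda k: (new_rows[k], new_cols[k]))
    return (
        [new_rows[k] for k in order],
        [new_cols[k] for k in order],
        new_shape,
        original_ids,
    )


# The 3x3 matrix from the docstring, with non-zeros at (0, 1) and (2, 2).
case1 = relabel_reference([0, 2], [1, 2], (3, 3), dim=0)
case2 = relabel_reference([0, 2], [1, 2], (3, 3), dim=0, leading_indices=[1, 2])
```

Under these assumptions, `case1` and `case2` reproduce the indices, shapes and index mappings shown in Case 1 and Case 2 of the docstring, so the model could serve as an oracle in unit tests once the C++ binding lands.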