`dgl.lap_pe` cannot scale to large graphs due to materialization of dense adjacency matrix #5854

BarclayII · 2023-06-12T02:41:22Z

🐛 Bug

`dgl.lap_pe` cannot scale to large graphs because it converts the sparse Laplacian `L` to a dense array (`L.toarray()`) before the eigendecomposition.

To Reproduce

@vijaydwivedi75 told me that he would like to scale `dgl.lap_pe` to larger graphs such as OGB products, but the current implementation throws a memory error:

import dgl 
from ogb.nodeproppred import DglNodePropPredDataset
dataset = DglNodePropPredDataset(name='ogbn-products')
graph = dataset[0][0]

dgl.lap_pe(graph, 5)

Traceback (most recent call last): 
  File "<stdin>", line 1, in <module>
  File "/home/vdwivedi/miniconda3/envs/transformers/lib/python3.10/site-packages/dgl/transforms/functional.py", line 3674, in lap_pe
    EigVal, EigVec = np.linalg.eig(L.toarray())
  File "/home/vdwivedi/miniconda3/envs/transformers/lib/python3.10/site-packages/scipy/sparse/_compressed.py", line 1051, in toarray
    out = self._process_toarray_args(order, out)
  File "/home/vdwivedi/miniconda3/envs/transformers/lib/python3.10/site-packages/scipy/sparse/_base.py", line 1298, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 43.6 TiB for an array with shape (2449029, 2449029) and data type float64
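The size in the error message follows directly from the node count: a dense float64 matrix needs `n * n * 8` bytes. A minimal sketch of that arithmetic (the helper name `dense_bytes` is hypothetical, introduced only for illustration):

```python
def dense_bytes(num_nodes: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense (num_nodes x num_nodes) array; itemsize=8 is float64."""
    return num_nodes * num_nodes * itemsize


n = 2_449_029  # number of nodes, taken from the array shape in the traceback
tib = dense_bytes(n) / 2**40  # bytes -> TiB
print(f"{tib:.1f} TiB")
```

The memory grows quadratically with the number of nodes, so no realistic amount of RAM helps here; the dense conversion itself has to go.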

Expected behavior

`dgl.lap_pe` should work on graphs the size of OGB products (see the suggested fix below).

Environment

DGL Version (e.g., 1.0): master
Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3):
OS (e.g., Linux):
How you installed DGL (conda, pip, source):
Build command you used (if compiling from source):
Python version:
CUDA/cuDNN version (if applicable):
GPU models and configuration (e.g. V100):
Any other relevant information:

Additional context

@vijaydwivedi75 also suggested a fix. Instead of materializing the dense adjacency matrix in

dgl/python/dgl/transforms/functional.py

Line 3675 in df97f2e

EigVal, EigVec = np.linalg.eig(L.toarray())

we could use SciPy's sparse eigensolver, which works directly on sparse matrices:

EigVal, EigVec = sp.linalg.eigs(L, k=pos_enc_dim+1, which='SR', tol=1e-2) # works fine for ogbn-products
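For illustration, here is a minimal, self-contained sketch of the sparse approach outside of DGL. It is not the DGL implementation: the helper name `sparse_lap_pe`, the path-graph test input, and the choice of the symmetric normalized Laplacian `I - D^{-1/2} A D^{-1/2}` are assumptions. Only the `eigs(..., k=pos_enc_dim+1, which='SR', tol=1e-2)` call mirrors the suggestion above.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg  # ensures sp.linalg is importable as an attribute


def sparse_lap_pe(adj, k, tol=1e-2):
    """Hypothetical helper: k Laplacian positional encodings from a sparse adjacency.

    Returns (eigenvalues, eigenvectors) for the k smallest eigenvalues after
    the first, without ever building a dense n x n matrix.
    """
    adj = sp.csr_matrix(adj, dtype=np.float64)
    n = adj.shape[0]
    deg = np.asarray(adj.sum(axis=1)).ravel()
    # D^{-1/2}; degrees are clipped to 1 so isolated nodes do not divide by zero.
    d_inv_sqrt = sp.diags(np.clip(deg, 1, None) ** -0.5)
    lap = sp.eye(n) - d_inv_sqrt @ adj @ d_inv_sqrt
    # Ask ARPACK for the k + 1 eigenvalues with the smallest real part.
    eigval, eigvec = sp.linalg.eigs(lap, k=k + 1, which="SR", tol=tol)
    order = np.argsort(eigval.real)
    # Drop the first (trivial, eigenvalue ~0) eigenpair.
    return eigval.real[order][1:], eigvec.real[:, order][:, 1:]


# Small undirected path graph on 30 nodes as a toy input.
n = 30
rows = np.arange(n - 1)
path = sp.coo_matrix((np.ones(n - 1), (rows, rows + 1)), shape=(n, n))
adj = path + path.T

vals, pe = sparse_lap_pe(adj, k=3)
print(pe.shape)  # (30, 3)
```

Since this Laplacian is symmetric, `scipy.sparse.linalg.eigsh` would also be a natural choice; the sketch keeps `eigs` to stay close to the suggested one-line change. Note that `tol=1e-2` trades accuracy for speed, so the resulting encodings are approximate.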


github-actions · 2023-07-15T01:35:40Z

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

github-actions · 2023-08-20T01:26:04Z

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

BarclayII assigned rudongyu Jun 12, 2023

BarclayII mentioned this issue Jun 12, 2023

[Optimization] Use scipy's eigs instead of numpy in lap_pe #5855

Merged

BarclayII unassigned rudongyu Jun 14, 2023

github-actions bot added the stale-issue label Jul 15, 2023

rudongyu removed the stale-issue label Jul 20, 2023

rudongyu self-assigned this Jul 20, 2023

github-actions bot added the stale-issue label Aug 20, 2023

BarclayII closed this as completed in #5855 Aug 24, 2023
