This work was presented at the ICLR 2019 workshop session Representation Learning on Graphs and Manifolds.
Our paper can be found here: https://rlgm.github.io/papers/9.pdf
The notebook VRGC.ipynb requires the data that can be downloaded here.
We address the problem of graph classification based only on structural information. Inspired by natural language processing (NLP) techniques, our model sequentially embeds information to estimate class membership probabilities. In addition, we experiment with NLP-like variational regularization, making the model predict the next node in the sequence as it reads it. We experimentally show that our model achieves state-of-the-art classification results on several standard molecular datasets. Finally, we perform a qualitative analysis and give some insights on whether node prediction helps the model classify graphs better.
Multi-task learning is a powerful lever for learning rich representations in NLP [1]. We propose to use it for our problem.
We use a BFS node ordering procedure to transform a graph into a sequence of nodes, as in [2].
Figure 2: Example of a BFS node ordering.
Each node is related only to its two closest neighbors in the BFS order, yielding a low-dimensional sequence of nodes.
Figure 3: Example of a BFS node ordering.
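The ordering and truncation steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the helper names `bfs_order` and `truncated_adjacency` are hypothetical, the truncation window of 2 matches the "two closest neighbors" described above, and neighbors are visited in sorted order to make the example deterministic.

```python
from collections import deque

def bfs_order(adj, root=0):
    """Return the nodes of an adjacency-dict graph in BFS order from `root`."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in sorted(adj[u]):  # sorted only to keep the toy example deterministic
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def truncated_adjacency(adj, order, m=2):
    """For each node, keep connectivity to its m closest predecessors in BFS order."""
    rows = []
    for i, u in enumerate(order):
        # 1 if u is linked to the node j steps before it in the ordering, else 0
        row = [1 if order[i - j] in adj[u] else 0
               for j in range(1, m + 1) if i - j >= 0]
        rows.append(row + [0] * (m - len(row)))  # pad the first nodes with zeros
    return rows

# Toy graph: a 4-cycle 0-1-2-3-0
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
order = bfs_order(adj)
print(order)                            # [0, 1, 3, 2]
print(truncated_adjacency(adj, order))  # [[0, 0], [1, 0], [0, 1], [1, 1]]
```

Each row of the truncated adjacency is a fixed-size vector, so the graph becomes a sequence that a recurrent model can consume.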
Each node embedding contains both the current and previous nodes' BFS-related adjacency information, thanks to the RNN's memory structure.
Figure 4: Example of a BFS node ordering.
The fully connected (FC) classifier is fed with the truncated, BFS-ordered sequence of embedded nodes.
Figure 5: Recurrent classifier for sequence classification.
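The recurrent classification step can be illustrated with a forward pass in NumPy: a tanh RNN reads the sequence of truncated adjacency vectors, and an FC softmax head maps the final hidden state to class probabilities. This is a hedged sketch with randomly initialized weights standing in for trained parameters; the actual model in the repository may use a different recurrent cell and sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_classify(x_seq, d_hidden=8, n_classes=2):
    """Minimal tanh-RNN + fully connected softmax head (forward pass only).
    x_seq: (T, d_in) sequence of truncated adjacency vectors for one graph."""
    d_in = x_seq.shape[1]
    # Random weights stand in for trained parameters (illustration only).
    W_xh = rng.normal(scale=0.1, size=(d_in, d_hidden))
    W_hh = rng.normal(scale=0.1, size=(d_hidden, d_hidden))
    W_hy = rng.normal(scale=0.1, size=(d_hidden, n_classes))
    h = np.zeros(d_hidden)
    for x in x_seq:                       # sequentially embed the node sequence
        h = np.tanh(x @ W_xh + h @ W_hh)  # hidden state accumulates BFS context
    logits = h @ W_hy                     # FC classifier on the final embedding
    e = np.exp(logits - logits.max())
    return e / e.sum()                    # class membership probabilities

# Sequence from the toy 4-cycle example (4 nodes, window of 2)
probs = rnn_classify(np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]]))
print(probs)  # two class probabilities summing to 1
```

Because the whole graph is summarized in the final hidden state, classification reduces to a standard sequence-classification setup.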
A node prediction task is added to help the classifier. This task is performed by a variational autoencoder fed with the same sequence of embedded nodes as the recurrent classifier.
Figure 6: Variational autoregressive node prediction.
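The variational node-prediction step can be sketched as a standard reparameterized sampling plus decoding: from the current hidden state, an encoder produces a Gaussian latent code, and a decoder predicts the next node's truncated adjacency bits. This is an assumption-laden illustration (random weights, hypothetical names, sigmoid decoder); it shows the mechanism, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def variational_next_node(h, d_latent=4, d_out=2):
    """Sketch of the auxiliary task: from hidden state h, sample a latent
    code z via the reparameterization trick, then decode a prediction of
    the next node's truncated adjacency vector."""
    d_h = h.shape[0]
    # Random weights stand in for the trained encoder/decoder (illustration only).
    W_mu = rng.normal(scale=0.1, size=(d_h, d_latent))
    W_logvar = rng.normal(scale=0.1, size=(d_h, d_latent))
    W_dec = rng.normal(scale=0.1, size=(d_latent, d_out))
    mu, logvar = h @ W_mu, h @ W_logvar
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=d_latent)  # z ~ N(mu, sigma^2)
    x_next = 1 / (1 + np.exp(-(z @ W_dec)))  # Bernoulli probs for adjacency bits
    # KL(N(mu, sigma^2) || N(0, 1)) regularizer, summed over latent dimensions
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1 - logvar)
    return x_next, kl

x_next, kl = variational_next_node(np.ones(8))
print(x_next.shape, kl >= 0)
```

The KL term regularizes the latent space while the reconstruction of the next node provides the auxiliary training signal alongside the classification loss.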
- Like our previous graph classification work, VRGC is not structurally invariant to node indexing. However, our model learns node-indexing invariance through numerous training iterations on randomly rooted, BFS-ordered sequential graph embeddings.
Figure 7: t-SNE projection of the latent state preceding classification for five graphs from four distinct classes, each initialized with 20 different BFS orderings. Colors and markers represent the respective classes of the graphs.
- VAR helps the model find a more meaningful latent representation for classification as the graph dataset grows larger, at marginal extra computational cost with respect to plain RNNs.
Figure 8: Classification results.
@article{pineau2019variational,
  title={Variational recurrent neural networks for graph classification},
  author={Pineau, Edouard and de Lara, Nathan},
  journal={Representation Learning on Graphs and Manifolds Workshop (ICLR)},
  year={2019}
}
All datasets can be found here: https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets
[1] V. Sanh, T. Wolf, and S. Ruder. A hierarchical multi-task approach for learning embeddings from semantic tasks. arXiv preprint arXiv:1811.06031, 2018.
[2] J. You, R. Ying, X. Ren, W. Hamilton, and J. Leskovec. GraphRNN: Generating realistic graphs with deep auto-regressive models. In Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5708-5717, 2018.