ShawXh/DeepWalk-dgl

Note

This is a test repo. The released version is here.

DeepWalk

The implementation supports multi-process training on CPU as well as mixed training with CPU and multiple GPUs.

Dependencies

  • PyTorch 1.0.1+

Tested version

  • PyTorch 1.5.0
  • DGL 0.4.3

How to run the code

Format of a network file (one edge per line, whitespace-separated node ids):

```
1 2
1 3
...
```
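DeepWalk trains on random walks sampled from this edge list. A minimal plain-Python sketch of parsing the format above and sampling a uniform random walk (the helper names here are illustrative; the repo's actual loader and walker live in `deepwalk.py` and use DGL):

```python
import random
from collections import defaultdict

def load_edge_list(lines):
    """Parse 'src dst' pairs into an undirected adjacency map."""
    adj = defaultdict(list)
    for line in lines:
        u, v = line.split()
        adj[u].append(v)
        adj[v].append(u)
    return adj

def random_walk(adj, start, walk_length):
    """Uniform random walk of up to `walk_length` nodes starting at `start`."""
    walk = [start]
    for _ in range(walk_length - 1):
        nbrs = adj[walk[-1]]
        if not nbrs:  # dead end: stop early
            break
        walk.append(random.choice(nbrs))
    return walk

edges = ["1 2", "1 3", "2 3"]
adj = load_edge_list(edges)
walk = random_walk(adj, "1", walk_length=5)
```

Each walk is then treated as a "sentence" for skip-gram training, exactly as in word2vec.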

To run the code:

```
python3 deepwalk.py --net_file net.txt --emb_file emb.txt --adam --mix --lr 0.2 --num_procs 4 --batch_size 100 --negative 5
```

How to save the embedding

Functions:

```
SkipGramModel.save_embedding(dataset, file_name)
SkipGramModel.save_embedding_txt(dataset, file_name)
```
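The README does not spell out the text layout; a stand-alone sketch assuming the common word2vec text format (a `num_nodes dim` header, then one `node_id v1 v2 ...` line per node). This is an illustration only, not the repo's actual implementation, whose methods take `(dataset, file_name)`:

```python
def save_embedding_txt(node_ids, embeddings, file_name):
    """Write embeddings in word2vec-style text format:
    header 'num_nodes dim', then one 'node_id v1 v2 ...' line per node."""
    dim = len(embeddings[0])
    with open(file_name, "w") as f:
        f.write(f"{len(embeddings)} {dim}\n")
        for nid, vec in zip(node_ids, embeddings):
            f.write(str(nid) + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

# Two 3-dimensional embeddings for nodes 1 and 2.
save_embedding_txt([1, 2], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "emb.txt")
```

Files in this format can be loaded directly by tools such as gensim's `KeyedVectors.load_word2vec_format`.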

Evaluation

To evaluate the embedding on multi-label classification, please refer to here

YouTube (1M nodes).

Macro-F1 (%):

| Implementation       | 1%    | 3%    | 5%    | 7%    | 9%    |
|----------------------|-------|-------|-------|-------|-------|
| gensim.word2vec(hs)  | 28.73 | 32.51 | 33.67 | 34.28 | 34.79 |
| gensim.word2vec(ns)  | 28.18 | 32.25 | 33.56 | 34.60 | 35.22 |
| ours                 | 24.58 | 31.23 | 33.97 | 35.41 | 36.48 |

Micro-F1 (%):

| Implementation       | 1%    | 3%    | 5%    | 7%    | 9%    |
|----------------------|-------|-------|-------|-------|-------|
| gensim.word2vec(hs)  | 35.73 | 38.34 | 39.37 | 40.08 | 40.77 |
| gensim.word2vec(ns)  | 35.35 | 37.69 | 38.08 | 40.24 | 41.09 |
| ours                 | 38.93 | 43.17 | 44.73 | 45.42 | 45.92 |

A comparison of running times is shown below, where the numbers in brackets denote the time spent on random walks.

| Implementation | gensim.word2vec(hs) | gensim.word2vec(ns) | Ours   |
|----------------|---------------------|---------------------|--------|
| Time (s)       | 27119.6 (1759.8)    | 10580.3 (1704.3)    | 428.89 |

Parameters:

  • walk_length = 80, number_walks = 10, window_size = 5
  • Ours: 4 GPUs (Tesla V100), lr = 0.2, batch_size = 128, neg_weight = 5, negative = 1, num_thread = 4
  • Others: workers = 8, negative = 5
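With `negative = 1` and `neg_weight = 5`, each positive pair is trained against one negative sample whose loss term is up-weighted by 5. A pure-Python sketch of the skip-gram negative-sampling objective under that reading of the flags (this is our interpretation, not code from the repo):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skipgram_ns_loss(u, v_pos, v_negs, neg_weight=5.0):
    """-log sigma(u . v_pos) - neg_weight * sum over negatives of
    log sigma(-u . v_neg); neg_weight scaling is an assumption."""
    pos = -math.log(sigmoid(dot(u, v_pos)))
    neg = -sum(math.log(sigmoid(-dot(u, v))) for v in v_negs)
    return pos + neg_weight * neg

# With all-zero vectors every sigmoid is 0.5, so the loss is
# (1 + neg_weight) * log(2) for a single negative sample.
u = [0.0] * 4
loss = skipgram_ns_loss(u, [0.0] * 4, [[0.0] * 4])
```

In the actual code this loss is minimized with Adam (`--adam`) over mini-batches of walks, with the embedding matrices sharded across the GPUs in the mixed CPU/multi-GPU mode.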

Speed-up with mixed CPU & multi-GPU training. The parameters used are the same as above.

| #GPUs    | 1       | 2      | 4      |
|----------|---------|--------|--------|
| Time (s) | 1419.64 | 952.04 | 428.89 |
