Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[example] Create EEG-GCNN example. #3186

Merged
merged 29 commits into from
Aug 11, 2021
Merged

[example] Create EEG-GCNN example. #3186

merged 29 commits into from
Aug 11, 2021

Conversation

JOHNW02
Copy link
Contributor

@JOHNW02 JOHNW02 commented Jul 26, 2021

Description

This is an EEG-GCNN example using DGL library. The original code using pytorch_geometric can be found here.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the my best knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 26, 2021

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@mufeili mufeili self-requested a review July 27, 2021 02:34
This DGL example implements the EEG-GCNN model proposed in the paper [EEG-GCNN](https://arxiv.org/abs/2011.12107). The original code is [here](https://github.com/neerajwagh/eeg-gcnn).

## All References
- Paper can also be found on [PMLR](http://proceedings.mlr.press/v136/wagh20a.html).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this one the same as the ArXiv version? If so, we can probably remove it and just mention it in the citation entry at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Mufei,

This is a simplified version. The original code has precomputed patient indices. In this example, I removed the dependency on the indices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. Then perhaps you can mention explicitly that this is a simplified version.

First, download the precomputed data, labels, indices and put them in the repo. <br>
Then run
```python
python main.py --num_feats --num_nodes --gpu_idx --num_epochs --exp_name --batch_size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to add --num_feats --num_nodes --gpu_idx --num_epochs --exp_name --batch_size as you have a default value for them.

num_feats = args.num_feats

# set up input and targets from files
memmap_x = f'norm_X'
Copy link
Member

@mufeili mufeili Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the preprocessed dataset here, which does not seem to have the file norm_X.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We haven't updated Figshare yet. We were dealing with a bug in the original code. So, some files need to be updated. We will do that next month, including norm_X. Also, norm_X is the normalized version of psd_featureds_data_X.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we will need to wait for the files to be updated before we merge the PR.

@mufeili
Copy link
Member

mufeili commented Jul 27, 2021

Can you add an entry for this example in the indexing page here?

| DGL | AUC | Precision | Recall | F-1 | Bal. Accuracy |
|-------------------|-------------|---------------|--------------|--------------|---------------|
| Shallow EEG-GCNN | 0.875(0.036)| 0.980(0.013) | 0.735(0.055) | 0.839(0.035) | 0.811(0.035) |
| Deep EEG-GCNN | 0.890(0.004)| 0.988(0.004) | 0.723(0.035) | 0.834(0.022) | 0.829(0.005) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can a user reproduce the numbers here using python main.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they can. We are planning to update the original repo with both pytorch_geometric and dgl implementations, and release the trainning code as well. This should be done sometime next month too. For this example, can I just point people to the original repo for reproducing the stats?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We expect a DGL example to be self-contained and able to reproduce the results reported in README. In that case, either we can wait till you get everything ready for this PR or we can simply close the PR and add a pointer to your own repo once you get everything ready there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let me talk to Neeraj about this.

Copy link

@neerajwagh neerajwagh Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mufeili, maybe the following might clarify the design of the PR:

  1. We built the PR with the following assumption: that people can go to https://github.com/neerajwagh/eeg-gcnn for reproducing published results on the original data while the DGL example here would enable others to bring their own datasets (X, y) and train our model using the data they care about. Compared to our original PyT implementation, here we have skipped: a) subject-level data splitting, b) subject-level aggregate evaluation, c) cross-validation. This makes the example a "simplified" version and therefore will not be able to exactly reproduce published results. Nevertheless, we can update main.py and the results table so that main.py can reproduce the new "simplified" table made from a simpler model evaluation scheme (as long as users pull X, y from FigShare). We can close the PR after that change while keeping a pointer to the PyT repo for the full/"complex" implementation. We'll mention this simplification in the README so users are aware of the difference.

  2. We can fix the absence of norm_X in FigShare by adding normalization after X is read in the code. (instead of updating the uploaded FigShare files).

Let me know if that works for you.

Copy link

@neerajwagh neerajwagh Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meanwhile, the updates we make to our original PyT repo will remain independent of this example and this example will remain self-contained.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neerajwagh, the proposal sounds good to me in general. To be more specific, this example does not need to fully reproduce the experiments reported in your paper and I'm good as long as a user can run the script following the instructions in the README file and get similar numbers reported there.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mufeili Sounds good! We'll make changes accordingly then. Thanks.


### Contact

- Issues regarding non-reproducibility of results or support with the codebase should be emailed to _wei33@illinois.edu_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

emailed to John(wei33@illinois.edu)

You may also contact the authors:

@mufeili
Copy link
Member

mufeili commented Aug 10, 2021

Thank you for the great job! This PR is ready to merge once we hear back from Neeraj and resolve the last conversation.

Copy link
Member

@mufeili mufeili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mufeili mufeili merged commit 738b75f into dmlc:master Aug 11, 2021
BarclayII pushed a commit that referenced this pull request Aug 26, 2021
* Create EEG-GCNN example.

* Update README.md

* Remove gitignore file.

* Update README.md

* change 'datas' to 'datasets'.

* Change train.py to main.py

* Added an entry in the indexing page.

* State "simplified version"; change how to run.

* Fix bug in contact

* Remove paper link in reference.

* Create working branch

* Add normalization of x.

* Update paper link and tags

* Update paper link in readme

* Update readme; add patient level indices

* Update readme. Add comments to models

* Update README.md

* change to with; specify location for ch and el; move note

* fix bug for note

* Add args for models; clean code.

* delete = in readme

* Add reference for spec_coh_values
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants