Asking for the numerical dataset of FB15k-num, FB15k-237-num (KBLRN paper) #2

Open
phucty opened this issue Jul 6, 2019 · 5 comments

Comments


phucty commented Jul 6, 2019

Could you please share the FB15k-num and FB15k-237-num datasets used in the paper "KBLRN: End-to-End Learning of Knowledge Base Representation with Latent, Relational, and Numerical Features"?

I tried to reproduce the experimental results in Table 4 of the paper, but I cannot recreate the validation and test sets described in Table 1. (".. where numerical features are never used for the triples.")

Thank you very much and I am looking forward to hearing from you.

Best regards,

Phuc

Author

phucty commented Jul 7, 2019

Regarding the numerical data, the statistics reported in the paper are: "This resulted in 116 different numerical features and 12,826 entities".
However, [File 1] in your GitHub repo seems to contain only 12,493 entities. Is [File 1] the file that was used in the paper?
[File 1]: https://github.com/nle-ml/mmkb/blob/master/FB15K/FB15K_NumericalTriples.txt
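
For readers who want to check the entity count themselves, a minimal sketch follows. It assumes each line of the numerical triples file is tab-separated as `entity<TAB>attribute<TAB>value`; that layout is an assumption and should be checked against the actual file. The function name and the sample identifiers are hypothetical.

```python
from typing import Iterable, Set


def entities_with_numerical_features(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct entities that appear in numerical triples.

    Assumes each useful line is `entity<TAB>attribute<TAB>value`
    (an assumption about the file layout, not a confirmed format).
    """
    entities: Set[str] = set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        entities.add(parts[0])
    return entities


# Made-up sample lines standing in for the real file contents.
sample = [
    "/m/aaa\theight\t1.80\n",
    "/m/aaa\tbirth_year\t1970\n",
    "/m/bbb\tpopulation\t1000\n",
    "\n",
]
print(len(entities_with_numerical_features(sample)))  # prints 2
```

To run it on a local copy of the file, pass an open file handle instead of `sample`, for example `with open("FB15K_NumericalTriples.txt", encoding="utf-8") as f: print(len(entities_with_numerical_features(f)))`.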

Collaborator

AGDuran commented Jul 8, 2019

Hi Phuc,

i) You are right. There are actually numerical features for only 12,493 entities, not 12,826. I had a dictionary of that size, but only 12,493 entities had values for this type of feature.

ii) The only difference between FB15k-num and FB15k-237-num and the standard versions is the validation and test sets. There, we only perform link prediction on triples where both head and tail have numerical attributes. We wanted to isolate the effect of the numerical expert, so we restricted link prediction to that subset. The training set, however, is the standard one.

Alberto

Author

phucty commented Jul 8, 2019

Hi Alberto,
Thank you for your response.

Regarding ii), could you share the validation and test sets of FB15k-num and FB15k-237-num? I mean the sets with these sizes:
FB15k-num: Valid: 5156 triples, Test: 6012 triples
FB15k-237-num: Valid: 1058 triples, Test: 1215 triples

I removed those triples from the validation and test sets of FB15k-237 and FB15k as you described, but I seem to get a different list of triples.
For FB15k-237-num, this is what I got:
Valid: 9384 triples, Test: 10934 triples

Thank you.
Phuc

Collaborator

AGDuran commented Jul 9, 2019

Hi,

Note that, to keep a triple, both head and tail need to have at least one numerical feature in common.

Please find attached the test sets of FB15k-num and FB15k-237-num. Sorry, but I can't find the validation files right now... I have left NEC and it is hard for me to locate these files now.

Alberto

NF_15ktest_triples.txt
NF_15k237test_triples.txt
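
The filtering rule stated in this comment (keep a triple only when head and tail share at least one numerical feature) can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the `require_shared` flag, and the toy data are all hypothetical, and numerical triples are assumed to be already parsed into `(entity, attribute, value)` tuples. The flag also allows the looser reading (both entities merely have some numerical feature) for comparison, since the looser rule can keep more triples.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def load_features(numerical_triples: Iterable[Triple]) -> Dict[str, Set[str]]:
    """Map each entity to the set of numerical attributes it has a value for."""
    features: Dict[str, Set[str]] = defaultdict(set)
    for entity, attribute, _value in numerical_triples:
        features[entity].add(attribute)
    return dict(features)


def filter_triples(
    triples: Iterable[Triple],
    features: Dict[str, Set[str]],
    require_shared: bool = True,
) -> List[Triple]:
    """Keep (head, relation, tail) triples according to the chosen rule.

    require_shared=True:  head and tail share at least one numerical attribute.
    require_shared=False: head and tail each have at least one attribute.
    """
    kept: List[Triple] = []
    for head, relation, tail in triples:
        head_feats = features.get(head, set())
        tail_feats = features.get(tail, set())
        if require_shared:
            keep = bool(head_feats & tail_feats)
        else:
            keep = bool(head_feats) and bool(tail_feats)
        if keep:
            kept.append((head, relation, tail))
    return kept


# Toy data (made up) to show how the two rules differ.
numerical = [
    ("a", "height", "1.8"),
    ("a", "birth_year", "1970"),
    ("b", "height", "1.6"),
    ("c", "population", "1000"),
]
test_triples = [("a", "knows", "b"), ("a", "lives_in", "c"), ("a", "likes", "d")]

features = load_features(numerical)
strict = filter_triples(test_triples, features, require_shared=True)
loose = filter_triples(test_triples, features, require_shared=False)
print(len(strict), len(loose))  # prints: 1 2
```

In the toy data, `a` and `c` both have numerical features but none in common, so the pair survives only under the looser rule.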

Author

phucty commented Jul 9, 2019

Hi Alberto,

That is great. (The test files are enough for me to run the verification experiments.)

Thank you very much for your time and effort.

Phuc


2 participants