[Feature] x_dot_x builtin kernel support #831
Conversation
Add a note that the PinSage model example under example/pytorch/recommendation only works with Python 3.6+, as its dataset loader depends on the stanfordnlp package, which works only with Python 3.6+.
…ide. 1. Make dgl.nn.xxx framework-agnostic. 2. Make test.backend include the dgl.nn modules. 3. Modify test_edge_softmax in test/mxnet/test_nn.py and test/pytorch/test_nn.py to work on both CPU and GPU.
1. Clear all remaining agnostic-related code in dgl.nn. 2. Make test_graph_conv agnostic so that it works on both CPU and GPU.
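For illustration, a CPU/GPU-agnostic test along these lines could look like the following minimal sketch (assuming pytest and a recent DGL/PyTorch API; the actual tests in test/pytorch/test_nn.py may be structured differently):

import dgl
import pytest
import torch
from dgl.nn.pytorch import edge_softmax

@pytest.mark.parametrize('dev', ['cpu', 'cuda'])
def test_edge_softmax(dev):
    # Skip the GPU case on machines without CUDA.
    if dev == 'cuda' and not torch.cuda.is_available():
        pytest.skip('CUDA not available')
    g = dgl.rand_graph(30, 100).to(dev)  # random graph on the target device
    logits = torch.randn(g.num_edges(), 1, device=dev)
    out = edge_softmax(g, logits)
    assert out.shape == logits.shape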
Add base control flow code.
TODO:
1. Make sure x_add_x, x_sub_x, x_mul_x, and x_div_x work.
2. Let x_dot_x work.
3. Make sure the backward of x_add_x, x_sub_x, x_mul_x, and x_div_x works.
4. Let the backward of x_dot_x work.
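Once an op lands, a quick end-to-end check of both its forward and backward paths is to backprop through it; a minimal sketch using u_mul_v from the x_mul_x family (assuming a recent dgl.graph / dgl.function API):

import dgl
import torch
import dgl.function as fn

g = dgl.graph(([0, 1, 2], [1, 2, 0]))
h = torch.randn(3, 8, requires_grad=True)
g.ndata['h'] = h
# Forward pass through the builtin binary op.
g.apply_edges(fn.u_mul_v('h', 'h', 'm'))
# Reducing to a scalar and calling backward() exercises the backward kernel.
g.edata['m'].sum().backward()
assert h.grad is not None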
The MXNet CI test may have some problems due to adapting to the new version. I'll try to fix this tomorrow.
…forward and backward
@jermainewang, I've verified the correctness with STT. The GPU memory footprint is about the same, but the builtin function is 2x slower than my custom kernels (node-parallel strategy).
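For reference, the builtin's forward pass can be timed on GPU roughly as follows (a sketch only: the random graph and feature size below are placeholders, not the STT setup mentioned above, and it assumes a CUDA build of DGL):

import dgl
import torch
import dgl.function as fn

g = dgl.rand_graph(10000, 200000).to('cuda')
g.ndata['h'] = torch.randn(g.num_nodes(), 64, device='cuda')
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
torch.cuda.synchronize()
start.record()
g.apply_edges(fn.u_dot_v('h', 'h', 'score'))  # builtin dot kernel
end.record()
torch.cuda.synchronize()
print('u_dot_v forward: %.3f ms' % start.elapsed_time(end))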
Backward is still slow for dot
I'm ok with this PR.
static DGLDEVICE DGLINLINE DType Call(DType *lhs, DType *rhs, int64_t len) {
  DType out = 0;
  // Simple vector-dot-vector: accumulate elementwise products over len.
#pragma unroll
  for (int64_t i = 0; i < len; ++i)
    out += lhs[i] * rhs[i];
  return out;
}
There is already a pragma unroll at the graph level in minigun. A nested pragma unroll usually does not give any benefit. Consider removing this.
LGTM. Thanks!
Description
This PR adds support for u_dot_v, u_dot_e, v_dot_e, v_dot_u, e_dot_u, and e_dot_v as builtin kernels. The implementation is based on the current binary_reduce structure.
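A minimal usage sketch of one of the new builtins (assuming the dgl.function interface; the graph and shapes are illustrative):

import dgl
import torch
import dgl.function as fn

g = dgl.graph(([0, 1, 2], [1, 2, 0]))
g.ndata['h'] = torch.randn(3, 8)
# u_dot_v reduces the feature dimension, yielding one score per edge.
g.apply_edges(fn.u_dot_v('h', 'h', 'score'))
print(g.edata['score'].shape)  # torch.Size([3, 1])

The other variants (u_dot_e, v_dot_e, v_dot_u, e_dot_u, e_dot_v) follow the same pattern with edge or destination features as the operands.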
#659
Tasks
Checklist
Please feel free to remove inapplicable items for your PR.
Related examples are either not affected by this change, or have been fixed to be compatible with this change.
Changes