cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU #39437
Commits on Jan 14, 2022
- Commit 4c7ee94
- Commit a82c0a8
  1. Added the fused_gemm_epilogue op to leverage the cuBLASLt epilogue.
  2. Supports the fusion Act(X*Y + bias), where X must have at least 2 dimensions and Y must have exactly 2.
  3. Act currently supports only ReLU (GeLU will be added in the future).
- Commit 26e6411
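For orientation, the computation that the fused op is described as covering, Act(X*Y + bias), can be written as an unfused reference in plain Python. This is only a sketch of the math, not the operator's implementation: the function names are hypothetical, and it assumes that an X with more than 2 dimensions is handled by collapsing its leading dimensions into rows.

```python
from typing import Callable, List

Matrix = List[List[float]]


def relu(v: float) -> float:
    """ReLU applied to a single value."""
    return v if v > 0.0 else 0.0


def flatten_leading_dims(x):
    """Collapse all leading dimensions of a nested list so the result is 2-D."""
    rows = x
    while isinstance(rows[0][0], list):
        rows = [inner for outer in rows for inner in outer]
    return rows


def linear_act_reference(x: Matrix, y: Matrix, bias: List[float],
                         act: Callable[[float], float] = relu) -> Matrix:
    """Unfused reference for Act(X*Y + bias): x is M x K, y is K x N, bias has N entries."""
    k, n = len(y), len(y[0])
    if any(len(row) != k for row in x):
        raise ValueError("inner dimensions of X and Y must match")
    if len(bias) != n:
        raise ValueError("bias length must equal the number of columns of Y")
    return [
        [act(sum(row[p] * y[p][j] for p in range(k)) + bias[j]) for j in range(n)]
        for row in x
    ]
```

A fused kernel would produce the same values in one GEMM call instead of three separate ops; the reference above is mainly useful as a correctness oracle in tests.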
Commits on Jan 17, 2022
- Commit 41b701a
  1. Added LinearAct to graph_pattern_detector.* to define the pattern described in item 2.
  2. LinearAct detects act(element_add(matmul_v2(x, w), bias)).
  3. act currently supports only ReLU (GeLU will be supported in the future).
- Commit 6349809
  1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU} fusion (GeLU will be supported in the future).
  2. Only matmul_v2 ops that come from nn.Linear are supported.
- Commit a0c0f48
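The pattern named in these commits, act(element_add(matmul_v2(x, w), bias)), can be illustrated on a toy op list. The sketch below is not Paddle's graph_pattern_detector API: the `Op` record and `find_linear_act` are hypothetical, the op type strings are taken from the commit message, and the single-consumer check is an assumption about when a chain would be safe to fuse.

```python
from typing import Dict, List, NamedTuple, Tuple


class Op(NamedTuple):
    """Toy stand-in for a graph op node (hypothetical, for illustration only)."""
    kind: str
    inputs: Tuple[str, ...]
    output: str


def find_linear_act(ops: List[Op], acts: Tuple[str, ...] = ("relu",)):
    """Return (matmul, add, act) triples matching act(element_add(matmul_v2(x, w), bias))."""
    producer: Dict[str, Op] = {op.output: op for op in ops}
    consumers: Dict[str, List[Op]] = {}
    for op in ops:
        for name in op.inputs:
            consumers.setdefault(name, []).append(op)

    matches = []
    for act in ops:
        if act.kind not in acts:
            continue
        add = producer.get(act.inputs[0])
        if add is None or add.kind != "element_add":
            continue
        matmul = producer.get(add.inputs[0])
        if matmul is None or matmul.kind != "matmul_v2":
            continue
        # Assumption: intermediates must feed only the next op in the chain,
        # otherwise removing them during fusion would break other consumers.
        if len(consumers[matmul.output]) != 1 or len(consumers[add.output]) != 1:
            continue
        matches.append((matmul, add, act))
    return matches
```

Passing `acts=("relu", "gelu")` would extend the same matcher to a second activation without changing the traversal.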
Commits on Jan 19, 2022
- Commit cb1f790
- Commit f001541: GeLU support and EpilogueSingleton
  1. Added GeLU support to the fused_gemm_epilogue op.
  2. Added EpilogueSingleton to cache the auxiliary pointer.
  3. Added related UTs.
- Commit 51e6a36
- Commit 2c24ad7: Added both train and infer patterns to LinearAct.
  1. Added support for forward graphs whose grad_ops link to LinearAct.
  2. Made the corresponding changes to fuse_gemm_epilogue_pass.
- Commit 6919ce7
- Commit a65ab08
- Commit 1b7541b: Added Linear Fusion (matmul_v2 + ele_add)
  1. Added the matmul_v2 + ele_add pattern to LinearActPattern.
  2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.
- Commit ac1a8ca
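Since GeLU joins ReLU as a supported activation in this group of commits, a small reference for the function may help when writing or reading the unit tests. The sketch shows the exact erf-based definition and the common tanh approximation; which of the two the cuBLASLt epilogue evaluates is not stated in the commit messages, so neither is presented as the fused op's behavior.

```python
import math


def gelu_exact(x: float) -> float:
    """GeLU via the Gaussian CDF: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Widely used tanh approximation of GeLU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

Unlike ReLU, GeLU is nonzero for negative inputs, so test data restricted to non-negative values would not exercise the part of the curve where the two activations differ most.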
Commits on Jan 21, 2022
- Commit 9cdf442: Add fused_gemm_epilogue_grad op.
  1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.
- Commit fbda512
- Commit 64a43ea
- Commit 0369fb4
Commits on Jan 25, 2022
- Commit 88c9ecb
Commits on Jan 26, 2022
- Commit 009eea2: Fuse backward of Linear(Act(x))
  1. Added a backward fusion pass for Linear(Act(x)).
  2. Added a backward fusion pass for Linear(x).
- Commit a8076a9
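As background for the backward fusion, the standard gradients of Out = X*W + bias are dX = dOut*W^T, dW = X^T*dOut, and dBias = the column sums of dOut. The plain-Python sketch below computes them as an unfused reference. The function names and the `need_x_grad` flag are hypothetical, and the commit messages do not say which of these outputs the fused grad op produces in a single call.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def transpose(a: Matrix) -> Matrix:
    """Swap rows and columns of a 2-D nested list."""
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive matrix product of a (M x K) and b (K x N)."""
    b_cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in b_cols] for row in a]


def linear_backward(x: Matrix, w: Matrix, d_out: Matrix,
                    need_x_grad: bool = True
                    ) -> Tuple[Optional[Matrix], Matrix, List[float]]:
    """Gradients of Out = X*W + bias with respect to X, W, and bias."""
    d_w = matmul(transpose(x), d_out)            # K x N
    d_bias = [sum(col) for col in zip(*d_out)]   # N entries
    # The first layer of a network has no upstream consumer for dX,
    # so that product can be skipped (assumed optimization).
    d_x = matmul(d_out, transpose(w)) if need_x_grad else None
    return d_x, d_w, d_bias
```

For Linear(Act(x)), the gradient flowing into the activation would be dX multiplied elementwise by the activation's derivative, which is the step a backward epilogue can absorb.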
Commits on Jan 28, 2022
- Commit 1268d48
Commits on Feb 8, 2022
- Commit dbed64f
- Commit d8a862e: Modify code per review comments.
  1. Changed some function arguments to be passed by reference.
  2. Removed redundant code.
  3. Reformatted the code to follow the Google code style.
- Commit 54a8588
- Commit 06f4240
Commits on Feb 10, 2022
- Commit fba452e
  1. Changed how the cuBLASLt handle is obtained in device_context, to be consistent with the latest changes in develop.
Commits on Feb 11, 2022
- Commit fe8a560: Set compile-time constraints for cuBLASLt
  1. Require CUDA 11.6+.
  2. Remove fuse_gemm_epilogue-related tests when CUDA < 11.6.
Commits on Feb 18, 2022
- Commit dcdab08
  1. Renamed the argument is_first_gemm to without_x_gradient for clarity.
  2. Applied PADDLE_THROW in fused_gemm_epilogue_op.
Commits on Feb 22, 2022
- Commit 02c007f
  1. Used ReserveSpace instead of Epilogue to pass auxiliary pointers between FWD and BWD.
- Commit 84fd06a: Fix a logical error and enhance UTs.
  1. Added act op count checking in UTs.
  2. Fixed an issue in fusing the backward of ReLU(Linear(X)).
  3. TODO: solve GeLU fusion issues.
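The idea of handing an auxiliary buffer from the forward pass to the backward pass can be sketched for ReLU as follows. This is an illustration under stated assumptions, not the PR's code: the function names are hypothetical, and the sketch assumes the reserved buffer holds a boolean mask recording where the pre-activation was positive. The actual contents and layout of ReserveSpace are not described in the commit messages.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Mask = List[List[bool]]


def relu_forward_with_reserve(pre_act: Matrix) -> Tuple[Matrix, Mask]:
    """Apply ReLU and also return a mask to be reused by the backward pass."""
    reserve_space = [[v > 0.0 for v in row] for row in pre_act]
    out = [
        [v if keep else 0.0 for v, keep in zip(row, mask_row)]
        for row, mask_row in zip(pre_act, reserve_space)
    ]
    return out, reserve_space


def relu_backward_from_reserve(d_out: Matrix, reserve_space: Mask) -> Matrix:
    """Propagate gradients only where the forward pre-activation was positive."""
    return [
        [g if keep else 0.0 for g, keep in zip(g_row, mask_row)]
        for g_row, mask_row in zip(d_out, reserve_space)
    ]
```

Carrying the buffer as an explicit output of the forward op and input of the backward op keeps the dependency visible in the graph, which a process-wide cache does not.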
Commits on Feb 23, 2022
- Commit 30b20da: Fix Linear and GeLU fusion issues.
  1. Modified graph_pattern_detector to handle Linear with either GeLU or ReLU.
  2. Modified the data range in UTs to allow negative values.
- Commit 1510a96
- Commit a421be8: Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cublaslt_epilogue
- Commit 2768d2a
Commits on Mar 1, 2022
- Commit 3a27015: Rename arguments in fused_gemm_epilogue_op
  1. bias -> Bias.
  2. out -> Out.
  3. reserve_space -> ReserveSpace.
- Commit 2f23475: Make EpiloguePassActivationCache a local variable.
  1. Removed the singleton in EpiloguePassActivationCache.
  2. Passed EpiloguePassActivationCache as an argument to each pass function.
- Commit 5c47882: Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cublaslt_epilogue