Modified code for using the ROCm backend within the PyTorch framework #1918
mmcv/ops/csrc/pytorch/info.cpp
@@ -4,15 +4,22 @@
#include "pytorch_cpp_helper.hpp"

#ifdef MMCV_WITH_CUDA
#ifndef HIP_DIFF
#ifdef MMCV_WITH_HIP
#include <cuda_runtime_api.h>
Should we include the HIP runtime header here?
Yeah, I thought about that when I added this code. It works as is because hipify rewrites this header to the HIP runtime header. Still, since the include sits just below `#ifdef MMCV_WITH_HIP`, writing `#include <hip/hip_runtime_api.h>` explicitly is less confusing. I'll make that change now!
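For illustration, here is a minimal sketch of the explicit-include pattern discussed in this thread. The macro names `MMCV_WITH_HIP` and `MMCV_WITH_CUDA` come from the diff; the flat `#if`/`#elif` layout is an assumption and may not match how the real file nests its guards. `backend_name` is a hypothetical helper that is not part of mmcv and only makes the selected branch observable.

```cpp
#include <string>

// Sketch only: MMCV_WITH_HIP and MMCV_WITH_CUDA are the macros seen in the
// diff. If both are defined, the HIP branch wins in this layout (assumption).
#if defined(MMCV_WITH_HIP)
#include <hip/hip_runtime_api.h>  // named explicitly instead of relying on hipify
#elif defined(MMCV_WITH_CUDA)
#include <cuda_runtime_api.h>
#endif

// Hypothetical helper (not in mmcv): reports which branch was compiled in.
inline std::string backend_name() {
#if defined(MMCV_WITH_HIP)
  return "HIP";
#elif defined(MMCV_WITH_CUDA)
  return "CUDA";
#else
  return "CPU";  // neither macro defined, so no GPU runtime header is pulled in
#endif
}
```

Compiled without either macro, the sketch includes no GPU header at all, which makes it easy to check the guard logic on a machine without ROCm or CUDA installed.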
Do all the unit tests pass for you? pytest tests/test_ops
It looks like those tests ran fine on my end. I only see a number of warnings like the following:
DeprecationWarning: "out_size" is deprecated in `RoIAlignRotated.__init__`, please use "output_size" instead
'instead', DeprecationWarning)
Here is my summary line: 131 passed, 35 skipped, 94 warnings in 55.46s
Note that the particular tests you mentioned seem to be fine...
Sure, I see. Maybe there is something wrong with my environment. I will ask someone else to try.
My environment is gfx906, ROCm 4.0.0, PyTorch 1.8.0.
Can you inspect the core dump file? Also, can you try a newer version of ROCm in a container? Hopefully it's not a driver issue... I see #1704 used this version of ROCm, and that work was merged into main. However, that code wouldn't compile on the newer, now-supported versions of ROCm, which is what this PR attempts to address. This may introduce a version requirement, which is not ideal, unfortunately.
Hi there, why are the unit tests on hold?
Sorry, my old ROCm environment is no longer available. I will find a new one and continue this review ASAP.
Looks like we have a green checkmark :)
I still hit the same error. But since ROCm support is currently broken anyway, we can merge this and fix the random core dump in the future.
Great! Looking forward to the merge 👍
Hi @zstreet87! First of all, we want to express our gratitude for your significant PR in the mmcv project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR. We would also like to invite you to join our Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/raweFPmdzG If you have WeChat, you are welcome to join our community there. You can add our assistant: openmmlabwx. Please add "mmsig + Github ID" as a remark when adding friends :)
Motivation
Related issue: #1898
This PR enables HIP as a backend for PyTorch only. It uses hipify to generate the HIP source and include folders via JIT compilation within ROCm PyTorch. I tested these changes with multiple Docker containers, e.g., 'rocm/pytorch:rocm5.0_ubuntu18.04_py3.7_pytorch_1.10.0'.
Modification
BC-breaking (Optional)
N/A
Use cases (Optional)
Using MMCV with the PyTorch framework on AMD GPUs.
Checklist
Before PR:
After PR: