How to solve the error(4103) when profiling LLM training with MI250? #119

Open
lingjiew93 opened this issue Jul 24, 2023 · 1 comment

Comments

@lingjiew93

I'm running LLM training on an MI250. I followed the instructions at https://www.mosaicml.com/blog/amd-mi250 and used the code from https://github.com/mosaicml/llm-foundry
Training runs fine without profiling, but when I try to profile it, the following errors are shown:
error(4103) "InterceptQueueCreate(), ProxyQueue::Create()"
HSA_STATUS_ERROR_INVALID_QUEUE: The queue is invalid.

@lingjiew93 changed the title from "How to solve the error(4103)?" to "How to solve the error(4103) when profiling LLM training with MI250?" on Jul 24, 2023
@harkgill-amd

Hi @lingjiew93, apologies for the lack of response. Are you still experiencing this issue with the latest ROCm 6.2.0 release? If so, could you please provide the steps to reproduce it, including the command you ran to enable profiling?
