Creating COSI user causes the object store reconcile to fail several times before finally succeeding #13904
Comments
Even though the rgw pod is up and running, it may take a few seconds to become ready, which is why I think this request is failing. Maybe something like this needs to be added before creating the COSI user
45s is a "long" time to wait since the rgw pod is already up. We need to understand why so long and if anything can be improved with this. In the normal install flow we want to avoid reconcile failures if possible. I believe other users can be created immediately after that without waiting this long, but will check for sure...
The object store controller creates the COSI user before the object store is ready, which produces unnecessary error logs saying that the COSI user could not be created. Fixes: rook#13904 Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
The object store controller creates the COSI user before the object store is ready. It takes some time for the RGW server to come up and be ready to receive requests via the REST API, so creating the COSI user fails until RGW is ready. Other users, such as adminops and dashboard, are created with the `radosgw-admin` command and never fail, so use the same approach for the COSI user. Fixes: rook#13904 Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
I am running a Ceph object store named s3 on HDD drives.
@huv95 Does the object store reconcile never succeed? Or do you see that message continuously, even after restarting the operator pod? If it is continuous, there must be some other error configuring the object store. For example, do you have at least three OSDs on different nodes? What does
Is this a bug report or feature request?
Deviation from expected behavior:
The COSI user is created with each object store creation. After the object store creation is completed, the controller attempts to create the COSI user and fails the reconcile. After repeated reconciles, finally on the fifth reconcile and a total of 45 seconds, the COSI user is created. This timing is very consistent in my minikube environment every time I create a test object store.
Here are the failure logs that cause the reconcile to restart, then finally succeed. See attached full operator.log
Expected behavior:
In the common case, the COSI user should be created successfully without causing so many retries. If there is a known reason the user cannot be created for some time after the object store is created, let's add a check for that condition in the reconcile instead of filling the logs with failed reconciles.
How to reproduce it (minimal and precise):