
Fix race condition bug with deferred credentials #1405

Merged — 3 commits merged into boto:develop on Mar 19, 2018

Conversation

@kyleknap (Contributor) commented Mar 14, 2018

Related to: aws/aws-cli#3194

The issue occurred for the following scenario:

  1. One thread acquires the lock and begins the initial refresh.
  2. The thread holding the lock sets self._access_key and self._secret_key, but has not yet set self._frozen_credentials.
  3. The other threads call refresh_needed() and, because self._access_key and self._secret_key are set and the credentials are not expired, decide no refresh is needed and simply return self._frozen_credentials.
  4. However, self._frozen_credentials starts off as None and has not yet been set by the thread holding the lock, so DeferredRefreshableCredentials returns None, causing those threads to raise a NoCredentialsError.

To fix this, refresh_needed() was updated to check self._frozen_credentials to determine whether credentials have been set.
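A minimal sketch of the check-then-publish pattern involved. The class and attribute names are modeled on botocore's internals, but this is a simplified illustration, not the actual implementation:

```python
import threading

class ReadOnlyCredentials:
    """Simplified stand-in for botocore's frozen credential tuple."""
    def __init__(self, access_key, secret_key, token):
        self.access_key = access_key
        self.secret_key = secret_key
        self.token = token

class DeferredRefreshableCredentials:
    def __init__(self, refresh_using):
        self._refresh_using = refresh_using
        self._refresh_lock = threading.Lock()
        self._access_key = None
        self._secret_key = None
        self._frozen_credentials = None

    def refresh_needed(self):
        # Before the fix, this checked self._access_key/self._secret_key,
        # which the refreshing thread sets slightly before
        # self._frozen_credentials. That opened a window where other
        # threads skipped the refresh but still read a None
        # _frozen_credentials. Checking _frozen_credentials itself,
        # the last attribute to be published, closes that window.
        return self._frozen_credentials is None

    def get_frozen_credentials(self):
        if self.refresh_needed():
            with self._refresh_lock:
                if self.refresh_needed():
                    creds = self._refresh_using()
                    self._access_key = creds['access_key']
                    self._secret_key = creds['secret_key']
                    # Publish the frozen credentials last.
                    self._frozen_credentials = ReadOnlyCredentials(
                        self._access_key, self._secret_key,
                        creds.get('token'))
        return self._frozen_credentials
```

The key point is that refresh_needed() now tests the same attribute that callers ultimately return, so a thread can never conclude "no refresh needed" while the returned value is still None.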

I added a test that reproduces the issue. Without the fix, running it in a loop (about 100 iterations) failed roughly 10% of the time. With the fix, it never failed no matter how many times I ran it.

@codecov-io commented Mar 14, 2018

Codecov Report

Merging #1405 into develop will not change coverage.
The diff coverage is 100%.


@@           Coverage Diff            @@
##           develop    #1405   +/-   ##
========================================
  Coverage    80.61%   80.61%           
========================================
  Files           87       87           
  Lines        12123    12123           
========================================
  Hits          9773     9773           
  Misses        2350     2350
Impacted Files Coverage Δ
botocore/credentials.py 98.8% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1048fc4...af4e165. Read the comment docs.

@JordonPhillips (Contributor) left a comment

Nice catch!

@joguSD (Contributor) commented Mar 15, 2018

If I'm understanding correctly, the added test will only fail about 10% of the time if we regress? I'm not sure how I feel about that.

@kyleknap (Contributor, Author) commented

I was thinking about that too. I could run the test in a for loop. Increasing the thread count does not really help: I tried spinning up hundreds of threads, and getting it to fail was still spotty. In general, these stress-test/race-condition tests are difficult to make fail 100% of the time; we catch regressions because we run the tests very often across different Python versions.
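A hedged sketch of the kind of repeated stress run discussed here: hammer the credentials object from many threads, then repeat the whole run, since any single run only trips a race some fraction of the time. The provider used is a generic stand-in, not botocore's actual class:

```python
import threading

def stress_once(provider, num_threads=50):
    """Fetch credentials from many threads at once; return True if every
    thread observed fully-published (non-None) credentials."""
    results = []
    def worker():
        # list.append is atomic in CPython, so no extra lock is needed here.
        results.append(provider.get_frozen_credentials())
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(r is not None for r in results)

def stress(provider_factory, iterations=100):
    """Repeat the whole run with a fresh provider each time, because the
    race only exists during the very first refresh."""
    return all(stress_once(provider_factory()) for _ in range(iterations))
```

Repeating with a fresh provider per iteration matters: the race window only exists on the initial refresh, so reusing one already-initialized provider would never exercise it again.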

@kyleknap kyleknap merged commit 5e044f7 into boto:develop Mar 19, 2018
@kyleknap kyleknap deleted the race-cond-bug branch March 19, 2018 20:17
eaon added a commit to freedomofpress/securedrop that referenced this pull request Jul 19, 2023
boto3 1.6.12 is the first release to update its botocore dependency, which in turn fixes boto/botocore#1405
5 participants