Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MemoryCacheSetAndRemoveTests.AddAndReplaceEntries_AreThreadSafe failed #50270

Closed
danmoseley opened this issue Mar 26, 2021 · 9 comments
Closed
Labels
area-Extensions-Caching help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@danmoseley
Copy link
Member

Console log: 'Microsoft.Extensions.Caching.Memory.Tests' from job 2811b660-60e1-4db5-bc89-e99bd328af79 (ubuntu.1604.amd64.open.rt) using docker image mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 on a0045W9
...
/root/helix/work/workitem /root/helix/work/workitem
  Discovering: Microsoft.Extensions.Caching.Memory.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  Microsoft.Extensions.Caching.Memory.Tests (found 91 of 98 test cases)
  Starting:    Microsoft.Extensions.Caching.Memory.Tests (parallel test collections = on, max threads = 2)
    Microsoft.Extensions.Caching.Memory.MemoryCacheSetAndRemoveTests.AddAndReplaceEntries_AreThreadSafe [FAIL]
      Assert.Equal() Failure
      Expected: 0
      Actual:   2
      Stack Trace:
        /_/src/libraries/Microsoft.Extensions.Caching.Memory/tests/MemoryCacheSetAndRemoveTests.cs(652,0): at Microsoft.Extensions.Caching.Memory.MemoryCacheSetAndRemoveTests.AddAndReplaceEntries_AreThreadSafe()
  Finished:    Microsoft.Extensions.Caching.Memory.Tests
=== TEST EXECUTION SUMMARY ===

https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-50230-merge-2811b66060e14db5bc/Microsoft.Extensions.Caching.Memory.Tests/console.29893e8c.log?sv=2019-07-07&se=2021-04-15T01%3A34%3A28Z&sr=c&sp=rl&sig=ikZz%2Fo9JRruRy5gBk1P0IeEpuYpzmcQqzCcp%2B3fntRk%3D

@dotnet-issue-labeler dotnet-issue-labeler bot added area-Extensions-Caching untriaged New issue has not been triaged by the area owner labels Mar 26, 2021
@ghost
Copy link

ghost commented Mar 26, 2021

Tagging subscribers to this area: @eerhardt, @maryamariyan, @michaelgsharp
See info in area-owners.md if you want to be subscribed.

Issue Details
Console log: 'Microsoft.Extensions.Caching.Memory.Tests' from job 2811b660-60e1-4db5-bc89-e99bd328af79 (ubuntu.1604.amd64.open.rt) using docker image mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 on a0045W9
...
/root/helix/work/workitem /root/helix/work/workitem
  Discovering: Microsoft.Extensions.Caching.Memory.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  Microsoft.Extensions.Caching.Memory.Tests (found 91 of 98 test cases)
  Starting:    Microsoft.Extensions.Caching.Memory.Tests (parallel test collections = on, max threads = 2)
    Microsoft.Extensions.Caching.Memory.MemoryCacheSetAndRemoveTests.AddAndReplaceEntries_AreThreadSafe [FAIL]
      Assert.Equal() Failure
      Expected: 0
      Actual:   2
      Stack Trace:
        /_/src/libraries/Microsoft.Extensions.Caching.Memory/tests/MemoryCacheSetAndRemoveTests.cs(652,0): at Microsoft.Extensions.Caching.Memory.MemoryCacheSetAndRemoveTests.AddAndReplaceEntries_AreThreadSafe()
  Finished:    Microsoft.Extensions.Caching.Memory.Tests
=== TEST EXECUTION SUMMARY ===

https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-50230-merge-2811b66060e14db5bc/Microsoft.Extensions.Caching.Memory.Tests/console.29893e8c.log?sv=2019-07-07&se=2021-04-15T01%3A34%3A28Z&sr=c&sp=rl&sig=ikZz%2Fo9JRruRy5gBk1P0IeEpuYpzmcQqzCcp%2B3fntRk%3D

Author: danmoseley
Assignees: -
Labels:

area-Extensions-Caching, untriaged

Milestone: -

@danmoseley
Copy link
Member Author

danmoseley commented Mar 26, 2021

This test sets a small max size then adds lots of items quickly so it stresses compaction. Compaction is by QUWI

ThreadPool.QueueUserWorkItem(s => OvercapacityCompaction((MemoryCache)s), this);

the test only gives it 7 seconds to run
https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Caching.Memory/tests/MemoryCacheSetAndRemoveTests.cs#L637
after which it sums the sizes of the items (non atomically) and compares with the cache's declared size.

However in this case, we summed the sizes of the individual entries and they came to zero, sizes are non negative so this means that if further entries (if any) are removed, the size cannot go up. Yet we then retrieved a cache.Size of 2. So apparently it's not as simple as the compaction/expiration happening during the measurements.

Note there is also non determinism from Expiration, which is performed on a Task, and this test sets it to run every possible time. When compaction runs, it marks entries it plans to remove as expired; when remove runs, it also marks them as expired. There is also another Task started in ExpirationTokensExpired but I think (?) that isn't involved in this test.

@danmoseley
Copy link
Member Author

It might be necessary to loop the test with some tracing to figure this out. In any case, the test is fragile as is, eg., if it fails because of compaction not being complete, it could wait 30 seconds and check again. That might also fix/mask whatever this issue is.

@danmoseley
Copy link
Member Author

Incidentally I wonder whether there are some more debug-only correctness asserts we should add into MemoryCache, because it's difficult to prove its correctness by just reading it (at least for me)

@danmoseley
Copy link
Member Author

There are many other reports in issues I duped against this. I see eg it's falling 0.03% of Alpine runs alone so significantly more PRs will be affected

@danmoseley
Copy link
Member Author

This test should be disable meantime..

@danmoseley
Copy link
Member Author

OK, closing as dupe of #50270

@eerhardt
Copy link
Member

@danmoseley - you are closing this as a dupe of itself?

@danmoseley
Copy link
Member Author

Oops, I meant #33993

@ghost ghost locked as resolved and limited conversation to collaborators Jul 22, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Extensions-Caching help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

No branches or pull requests

3 participants