Fix readers cache truncation deadlock #5681

Commits on Jul 27, 2022

  1. s/readers_cache: made readers_cache::range_lock_holder moveable

    Fixed the move constructor and move assignment operator of
    `readers_cache::range_lock_holder`. Previously, moving the
    `range_lock_holder` would result in the lock being released.
    
    Signed-off-by: Michal Maslanka <michal@redpanda.com>
    mmaslankaprv committed Jul 27, 2022
    cafa7dd
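
The move fix described in this commit can be illustrated with a minimal sketch of a movable RAII lock holder. This is not the Redpanda implementation: `toy_cache`, `toy_range_lock_holder`, `owns_lock` and `demo_move` are hypothetical names, and the cache is reduced to a counter of held locks. The point it shows is that a moved-from holder must give up ownership so that its destructor does not release the lock a second owner still holds.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the cache: it only counts currently held locks.
struct toy_cache {
    int locked_ranges = 0;
};

// Hypothetical RAII holder: acquires on construction, releases on destruction.
class toy_range_lock_holder {
public:
    explicit toy_range_lock_holder(toy_cache& c) : _cache(&c) {
        ++_cache->locked_ranges;
    }

    toy_range_lock_holder(const toy_range_lock_holder&) = delete;
    toy_range_lock_holder& operator=(const toy_range_lock_holder&) = delete;

    // Move constructor: take ownership and null out the source, so the
    // source's destructor becomes a no-op instead of releasing the lock.
    toy_range_lock_holder(toy_range_lock_holder&& o) noexcept
      : _cache(std::exchange(o._cache, nullptr)) {}

    // Move assignment: release what we hold, then take over the source's lock.
    toy_range_lock_holder& operator=(toy_range_lock_holder&& o) noexcept {
        if (this != &o) {
            release();
            _cache = std::exchange(o._cache, nullptr);
        }
        return *this;
    }

    ~toy_range_lock_holder() { release(); }

    bool owns_lock() const { return _cache != nullptr; }

private:
    void release() {
        if (_cache != nullptr) {
            --_cache->locked_ranges;
            _cache = nullptr;
        }
    }

    toy_cache* _cache;
};

// Usage: the lock survives moves and is released exactly once.
inline void demo_move() {
    toy_cache cache;
    {
        toy_range_lock_holder a(cache);
        toy_range_lock_holder b(std::move(a)); // ownership moves to b
        assert(cache.locked_ranges == 1);
        assert(!a.owns_lock() && b.owns_lock());

        toy_range_lock_holder c(cache);        // a second, independent lock
        assert(cache.locked_ranges == 2);
        c = std::move(b);                      // c drops its lock, takes b's
        assert(cache.locked_ranges == 1);
    }
    assert(cache.locked_ranges == 0);          // released once at scope exit
}
```

Without the `std::exchange` reset, the moved-from object would still point at the cache and its destructor would release the lock early, which matches the symptom the commit message describes.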

Commits on Jul 28, 2022

  1. s/readers_cache: fixed incorrect condition checking if range is locked

    Fixed the condition that checks whether a range is locked. The
    incorrect check resulted in situations in which log truncation was
    blocked by pending readers.
    
    Log truncation is a multi-step process that ends with grabbing the
    necessary segment write locks and deleting data. In order for
    truncation to grab a write lock, all readers which own read lock
    units from the range being truncated must be evicted from
    `readers_cache`. Since log truncation contains multiple scheduling
    points, it may interleave with another fiber creating a reader for
    the log that is currently being truncated. This reader MUST NOT be
    cached, as truncation would need to wait for it to be evicted.
    Additionally, no new readers can be created because the
    truncation-related write lock request is waiting in the waiters
    queue of the semaphore underlying the `read_write_lock`.
    
    In order to prevent readers requesting the truncated range from being
    cached, the readers cache maintains a list of locked ranges, i.e.
    ranges for which readers cannot be cached.
    
    Previously, an incorrect condition checking whether a reader belongs
    to a locked range allowed it to be cached, preventing the `truncate`
    action from continuing. This stopped all other writes and truncation
    for 60 seconds. After this duration the reader was evicted from the
    cache, its lease was released, and truncation was able to finish.
    
    Fixed incorrect condition checking if reader is within the locked
    range.
    
    Fixes: redpanda-data#5510
    
    Signed-off-by: Michal Maslanka <michal@redpanda.com>
    mmaslankaprv committed Jul 28, 2022
    c230802
  2. s/tests: added readers_cache range locking test

    Signed-off-by: Michal Maslanka <michal@redpanda.com>
    mmaslankaprv committed Jul 28, 2022
    1c31f91
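
The range locking idea from the commits above can be sketched as follows. The actual condition used in Redpanda is not shown on this page, so the inclusive interval-overlap test below is an assumption, and `offset_range`, `intersects`, `toy_readers_cache` and `demo_range_lock` are hypothetical names chosen for illustration. The sketch only shows the intended behaviour: once a range is locked for truncation, readers touching it are evicted and no new reader touching it is admitted to the cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical inclusive offset range [first, last].
struct offset_range {
    std::int64_t first;
    std::int64_t last;
};

// Assumed overlap rule: two inclusive ranges intersect when each one
// starts no later than the other one ends.
inline bool intersects(const offset_range& a, const offset_range& b) {
    return a.first <= b.last && b.first <= a.last;
}

// Hypothetical, single-threaded model of a cache with locked ranges.
class toy_readers_cache {
public:
    void lock_range(offset_range r) { _locked.push_back(r); }
    void unlock_all() { _locked.clear(); }

    // A reader is cached only if it touches no locked range.
    bool try_cache(offset_range reader) {
        if (is_locked(reader)) {
            return false;
        }
        _cached.push_back(reader);
        return true;
    }

    // Drop cached readers that intersect `r`; returns how many were dropped.
    std::size_t evict(offset_range r) {
        auto it = std::remove_if(
          _cached.begin(), _cached.end(), [&r](const offset_range& c) {
              return intersects(c, r);
          });
        auto n = static_cast<std::size_t>(_cached.end() - it);
        _cached.erase(it, _cached.end());
        return n;
    }

    std::size_t cached_count() const { return _cached.size(); }

private:
    bool is_locked(const offset_range& reader) const {
        return std::any_of(
          _locked.begin(), _locked.end(), [&reader](const offset_range& l) {
              return intersects(l, reader);
          });
    }

    std::vector<offset_range> _locked;
    std::vector<offset_range> _cached;
};

// Usage: truncating from offset 150 evicts and then rejects overlapping readers.
inline void demo_range_lock() {
    toy_readers_cache cache;
    assert(cache.try_cache({0, 99}));
    assert(cache.try_cache({100, 199}));

    offset_range truncated{150, 1000};
    cache.lock_range(truncated);
    assert(cache.evict(truncated) == 1);   // only {100, 199} overlaps
    assert(!cache.try_cache({120, 160}));  // touches the locked range
    assert(cache.try_cache({0, 149}));     // entirely below the locked range
}
```

A check that misses any overlapping case, for example one that only tests the reader's first offset, would let a reader into the cache while its range is being truncated, which is the kind of failure the second commit describes.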