Skip to content

Commit

Permalink
Fix assertion error by doing comparison with mutex (#9717)
Browse files Browse the repository at this point in the history
Summary:
On CircleCI MacOS instances, we have been seeing the following assertion error:
```
Assertion failed: (alive_log_files_tail_ == alive_log_files_.rbegin()), function WriteToWAL, file /Users/distiller/project/db/db_impl/db_impl_write.cc, line 1213.
Received signal 6 (Abort trap: 6)
#0   0x1
#1   abort (in libsystem_c.dylib) + 120
#2   err (in libsystem_c.dylib) + 0
#3   rocksdb::DBImpl::WriteToWAL(rocksdb::WriteBatch const&, rocksdb::log::Writer*, unsigned long long*, unsigned long long*, rocksdb::Env::IOPriority, bool, bool) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:1213)
#4   rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long long*, bool, bool, unsigned long long) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:1251)
#5   rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long long*, unsigned long long, bool, unsigned long long*, unsigned long, rocksdb::PreReleaseCallback*) (in librocksdb.7.0.0.dylib) (db_impl_	rite.cc:421)
#6   rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:109)
#7   rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:2159)
#8   rocksdb::DBImpl::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:37)
#9   rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db.h:382)
#10  rocksdb::DBBasicTestWithTimestampPrefixSeek_IterateWithPrefix_Test::TestBody() (in db_with_timestamp_basic_test) (db_with_timestamp_basic_test.cc:2926)
#11  void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3899)
#12  void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3935)
#13  testing::Test::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:3980)
#14  testing::TestInfo::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:4153)
#15  testing::TestCase::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:4266)
#16  testing::internal::UnitTestImpl::RunAllTests() (in db_with_timestamp_basic_test) (gtest-all.cc:6632)
#17  bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3899)
#18  bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3935)
#19  testing::UnitTest::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:6242)
#20  RUN_ALL_TESTS() (in db_with_timestamp_basic_test) (gtest.h:22110)
#21  main (in db_with_timestamp_basic_test) (db_with_timestamp_basic_test.cc:3150)
#22  start (in libdyld.dylib) + 1
```

It's likely caused by concurrent, unprotected access to the deque, even though `back()` is never popped,
and we are comparing `rbegin()` with a cached `riterator`. To be safe, do the comparison only if we have mutex.

Pull Request resolved: #9717

Test Plan:
One example
Ssh to one CircleCI MacOS instance.
```
gtest-parallel -r 1000 -w 8 ./db_test --gtest_filter=DBTest.FlushesInParallelWithCompactRange
```

Reviewed By: pdillinger

Differential Revision: D34990696

Pulled By: riversand963

fbshipit-source-id: 62dd48ae6fedbda53d0a64d73de9b948b4c26eee
  • Loading branch information
riversand963 authored and ajkr committed Mar 29, 2022
1 parent 00191c8 commit aadbf86
Showing 1 changed file with 0 additions and 10 deletions.
10 changes: 0 additions & 10 deletions db/db_impl/db_impl_write.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1119,18 +1119,8 @@ IOStatus DBImpl::WriteToWAL(const WriteBatch& merged_batch,
*log_used = logfile_number_;
}
total_log_size_ += log_entry.size();
#if defined(__has_feature)
#if __has_feature(thread_sanitizer)
if (with_db_mutex || with_log_mutex) {
#endif // __has_feature(thread_sanitizer)
#endif // defined(__has_feature)
assert(alive_log_files_tail_ == alive_log_files_.rbegin());
#if defined(__has_feature)
#if __has_feature(thread_sanitizer)
}
#endif // __has_feature(thread_sanitizer)
#endif // defined(__has_feature)
if (with_db_mutex || with_log_mutex) {
assert(alive_log_files_tail_ != alive_log_files_.rend());
}
LogFileNumberSize& last_alive_log = *alive_log_files_tail_;
Expand Down

0 comments on commit aadbf86

Please sign in to comment.