Hitting assertion in 3.6.1 compaction code #365

ghost · 2014-10-24T11:04:18Z

This is not easy to reproduce, it happened after hours of stress

bt
#0 0x0000000000425bdb in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x0000000000599768 in abort ()
#2 0x0000000000593eb4 in __assert_fail_base ()
#3 0x0000000000593f0e in __assert_fail ()
#4 0x000000000048a71d in rocksdb::InternalKey::Encode (this=) at ./db/dbformat.h:146
#5 0x000000000049b6ed in Encode (this=) at db/version_set.cc:80
#6 Compare (b=..., a=..., this=) at ./db/dbformat.h:164
#7 BySmallestKey (cmp=, b=, a=) at db/version_set.cc:85
#8 operator() (this=, this=, f2=, f1=) at db/version_set.cc:1382
#9 _M_get_insert_unique_pos (__k=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=0x7faab4c8d2f0) at /usr/include/c++/4.9/bits/stl_tree.h:1445
#10 std::_Rb_tree<rocksdb::FileMetaData*, rocksdb::FileMetaData*, std::_Identityrocksdb::FileMetaData*, rocksdb::VersionSet::Builder::FileComparator, std::allocatorrocksdb::FileMetaData* >::_M_insert_unique<rocksdb::FileMetaData* const&> (this=0x7faab4c8d2f0, __v=@0x7fab99d6c6f0: 0x7faa0bb0a1b0) at /usr/include/c++/4.9/bits/stl_tree.h:1498
#11 0x000000000049b995 in insert (__x=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=) at /usr/include/c++/4.9/bits/stl_set.h:502
#12 rocksdb::VersionSet::Builder::Apply (this=this@entry=0x7faa0bb0af90, edit=0x7faab4527100) at db/version_set.cc:1538
#13 0x0000000000490167 in rocksdb::VersionSet::LogAndApplyHelper (this=this@entry=0xeeff80, cfd=cfd@entry=0xefa8a0, builder=builder@entry=0x7faa0bb0af90, v=v@entry=0x7faa0bb0a410, edit=, mu=mu@entry=0xeedd50) at db/version_set.cc:1948
#14 0x00000000004936d3 in rocksdb::VersionSet::LogAndApply (this=0xeeff80, column_family_data=0xefa8a0, mutable_cf_options=..., edit=0x7faab4527100, mu=mu@entry=0xeedd50, db_directory=0xefa760, new_descriptor_log=false, new_cf_options=0x0) at db/version_set.cc:1716
#15 0x000000000044a544 in rocksdb::DBImpl::InstallCompactionResults (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2633
#16 0x000000000045292a in rocksdb::DBImpl::DoCompactionWork (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:3386
#17 0x0000000000454bca in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0xeedc20, madeProgress=madeProgress@entry=0x7fab99d6e1af, deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2391
#18 0x000000000045fbe7 in rocksdb::DBImpl::BackgroundCallCompaction (this=0xeedc20) at db/db_impl.cc:2168
#19 0x00000000004bcb68 in BGThread (thread_id=11, this=0xee9940) at util/env_posix.cc:1616
#20 rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThreadWrapper (arg=) at util/env_posix.cc:1633
#21 0x000000000041e4a5 in start_thread (arg=0x7fab99d6f700) at pthread_create.c:309
#22 0x00000000005f52b9 in clone ()

ghost · 2014-10-24T22:10:09Z

This smells like the same issue I reported in #359 but in this case it was a debug build.

igorcanadi · 2014-10-26T17:40:35Z

Can you c/p your RocksDB configuration? Are you using custom comparator?

igorcanadi · 2014-10-26T17:41:43Z

Can you also c/p rocksdb LOG file just before the crash?

cc @yhchiang this smells like an issue you debugged on Thursday.

ghost · 2014-10-26T21:09:02Z

I've found out the way to repro the crash. The problem happens when compaction code fails to open files due to too many open files. Unfortunately, Linux sets the maximum to 1024 by default (ulimit -n), while RocksDB default is max_open_files=5000.
By setting a higher value in ulimit the problem goes away and my stress test passes fine. On the other hand, setting it to something quite low like 50 makes RocksDB crash much more easily.
Besides fixing the crash, maybe the default value for RocksDB should be reconsidered.

fyrz · 2014-10-26T21:47:39Z

@dlezama its not linux. Every distribution can define this limit. But of course hitting a crash without getting significant info about what went wrong is not what someone would expect. At least with this one.

ghost · 2014-10-26T22:31:08Z

Sorry for oversimplifying the ulimit issue. The distributions I've used for development are Ubuntu based, and my deployment is in Scientific Linux, a quite different distribution. Both have it set to 1024 by default, because of that I'm guessing it's a common default, but you're totally right, other distributions may set it to different values.
In any case, it's impossible to know what other handles are open in the process, so maybe it's not worth it to change RocksDB default, but it shouldn't crash either :)

fyrz · 2014-10-26T22:37:41Z

@dlezama im not a fb guy so it's only a opinion :-) but from what i've seen usual behavior is that there is some sort of message "Can`t open file" or whatever.

igorcanadi · 2014-10-27T04:29:47Z

This definitely sounds like an issue that @yhchiang debugged few days ago.

@dlezama usually we should return "Can't open file" error, but it looks like there's something going wrong here. We're working on fixing it.

igorcanadi · 2014-10-27T04:31:30Z

@dlezama if it's easy for you to repro, can you please try reproing with RocksDB 3.5.1 or 3.4?

ghost · 2014-10-27T07:44:49Z

It does happen in 3.5.1, please see #359 for stack traces on how this looks like in that version and in release build (no assertion, but it crashes anyway).

igorcanadi · 2014-10-27T15:05:17Z

Thanks. Can you share your LOG file with us (you can find it in your rocksdb directory)? Stack traces only verify that there was a corruption. We need to find what caused the corruption.

ghost · 2014-10-27T17:05:48Z

https://drive.google.com/file/d/0B5b62mktIEdjdG91LVgzcmhyN2s/view?usp=sharing

igorcanadi · 2014-10-28T00:32:51Z

@dlezama I'm happy to say that @yhchiang was able to reproduce this and the fix is coming!

igorcanadi · 2014-10-29T01:15:31Z

This should be fixed with 3772a3d

Can you try reproducing now?

We'll probably be releasing 3.6.2 with this patch.

ghost · 2014-10-30T18:02:51Z

3.6.2 doesn't crash, but I'm getting kIOError "Too many open files" in a write right after I see this in the LOG:
2014/10/30-10:55:27.441455 7f8ce4968700 [default] Level-0 flush table #44: started
2014/10/30-10:55:27.441485 7f8ce4968700 [default] Level-0 flush table #44: 0 bytes IO error: csp.db/000044.sst: Too many open files
2014/10/30-10:55:27.441558 7f8ced9b2700 (Original Log Time 2014/10/30-10:55:27.441551) Compaction nothing to do
2014/10/30-10:55:27.814053 7f8ce4968700 Waiting after background flush error: IO error: csp.db/000044.sst: Too many open filesAccumulated background error counts: 1

Is this expected?

igorcanadi · 2014-10-30T18:40:21Z

What is your num_open_files? What is your write_buffer_size? What is number of files in rocksdb directory? What's your ulimit for number of files?

ghost · 2014-10-30T19:12:20Z

I used a very low ulimit and left max_open_files=5000, so I could reproduce the crash, that didn't happen in 3.6.2. It surprised me a little that a write operation can fail with "too many open files" since my understanding was that writes only needed to access memory and append to the log. So I just wanted to verify that this was an expected thing to happen.
In any case, I'll make sure that in a production environment ulimit is bigger than max_open_files and this won't be an issue.

igorcanadi · 2014-10-30T20:14:25Z

@dlezama as soon as we get any IO error (even in the background) we mark the DB read only and fail the Put() with that error (that could have happened in the background). However, even a Write() process can try creating a new file, for example when it rolls the WAL file (delete old one, create new one).

So this is expected. Thanks for reporting! Interestingly, this issue lived in our codebase for at least a year and we found it at the same time both internally and externally :)

ghost mentioned this issue Oct 26, 2014

Crash in compaction. rocksdb-3.5.1 #359

Closed

igorcanadi closed this as completed Oct 29, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hitting assertion in 3.6.1 compaction code #365

Hitting assertion in 3.6.1 compaction code #365

ghost commented Oct 24, 2014

ghost commented Oct 24, 2014

igorcanadi commented Oct 26, 2014

igorcanadi commented Oct 26, 2014

ghost commented Oct 26, 2014

fyrz commented Oct 26, 2014

ghost commented Oct 26, 2014

fyrz commented Oct 26, 2014

igorcanadi commented Oct 27, 2014

igorcanadi commented Oct 27, 2014

ghost commented Oct 27, 2014

igorcanadi commented Oct 27, 2014

ghost commented Oct 27, 2014

igorcanadi commented Oct 28, 2014

igorcanadi commented Oct 29, 2014

ghost commented Oct 30, 2014

igorcanadi commented Oct 30, 2014

ghost commented Oct 30, 2014

igorcanadi commented Oct 30, 2014

Hitting assertion in 3.6.1 compaction code #365

Hitting assertion in 3.6.1 compaction code #365

Comments

ghost commented Oct 24, 2014

ghost commented Oct 24, 2014

igorcanadi commented Oct 26, 2014

igorcanadi commented Oct 26, 2014

ghost commented Oct 26, 2014

fyrz commented Oct 26, 2014

ghost commented Oct 26, 2014

fyrz commented Oct 26, 2014

igorcanadi commented Oct 27, 2014

igorcanadi commented Oct 27, 2014

ghost commented Oct 27, 2014

igorcanadi commented Oct 27, 2014

ghost commented Oct 27, 2014

igorcanadi commented Oct 28, 2014

igorcanadi commented Oct 29, 2014

ghost commented Oct 30, 2014

igorcanadi commented Oct 30, 2014

ghost commented Oct 30, 2014

igorcanadi commented Oct 30, 2014