Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hitting assertion in 3.6.1 compaction code #365

Closed
ghost opened this issue Oct 24, 2014 · 18 comments
Closed

Hitting assertion in 3.6.1 compaction code #365

ghost opened this issue Oct 24, 2014 · 18 comments

Comments

@ghost
Copy link

ghost commented Oct 24, 2014

This is not easy to reproduce, it happened after hours of stress

bt
#0 0x0000000000425bdb in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x0000000000599768 in abort ()
#2 0x0000000000593eb4 in __assert_fail_base ()
#3 0x0000000000593f0e in __assert_fail ()
#4 0x000000000048a71d in rocksdb::InternalKey::Encode (this=) at ./db/dbformat.h:146
#5 0x000000000049b6ed in Encode (this=) at db/version_set.cc:80
#6 Compare (b=..., a=..., this=) at ./db/dbformat.h:164
#7 BySmallestKey (cmp=, b=, a=) at db/version_set.cc:85
#8 operator() (this=, this=, f2=, f1=) at db/version_set.cc:1382
#9 _M_get_insert_unique_pos (__k=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=0x7faab4c8d2f0) at /usr/include/c++/4.9/bits/stl_tree.h:1445
#10 std::_Rb_tree<rocksdb::FileMetaData*, rocksdb::FileMetaData*, std::_Identityrocksdb::FileMetaData*, rocksdb::VersionSet::Builder::FileComparator, std::allocatorrocksdb::FileMetaData* >::_M_insert_unique<rocksdb::FileMetaData* const&> (this=0x7faab4c8d2f0, __v=@0x7fab99d6c6f0: 0x7faa0bb0a1b0) at /usr/include/c++/4.9/bits/stl_tree.h:1498
#11 0x000000000049b995 in insert (__x=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=) at /usr/include/c++/4.9/bits/stl_set.h:502
#12 rocksdb::VersionSet::Builder::Apply (this=this@entry=0x7faa0bb0af90, edit=0x7faab4527100) at db/version_set.cc:1538
#13 0x0000000000490167 in rocksdb::VersionSet::LogAndApplyHelper (this=this@entry=0xeeff80, cfd=cfd@entry=0xefa8a0, builder=builder@entry=0x7faa0bb0af90, v=v@entry=0x7faa0bb0a410, edit=, mu=mu@entry=0xeedd50) at db/version_set.cc:1948
#14 0x00000000004936d3 in rocksdb::VersionSet::LogAndApply (this=0xeeff80, column_family_data=0xefa8a0, mutable_cf_options=..., edit=0x7faab4527100, mu=mu@entry=0xeedd50, db_directory=0xefa760, new_descriptor_log=false, new_cf_options=0x0) at db/version_set.cc:1716
#15 0x000000000044a544 in rocksdb::DBImpl::InstallCompactionResults (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2633
#16 0x000000000045292a in rocksdb::DBImpl::DoCompactionWork (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:3386
#17 0x0000000000454bca in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0xeedc20, madeProgress=madeProgress@entry=0x7fab99d6e1af, deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2391
#18 0x000000000045fbe7 in rocksdb::DBImpl::BackgroundCallCompaction (this=0xeedc20) at db/db_impl.cc:2168
#19 0x00000000004bcb68 in BGThread (thread_id=11, this=0xee9940) at util/env_posix.cc:1616
#20 rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThreadWrapper (arg=) at util/env_posix.cc:1633
#21 0x000000000041e4a5 in start_thread (arg=0x7fab99d6f700) at pthread_create.c:309
#22 0x00000000005f52b9 in clone ()

@ghost
Copy link
Author

ghost commented Oct 24, 2014

This smells like the same issue I reported in #359 but in this case it was a debug build.

@igorcanadi
Copy link
Collaborator

Can you c/p your RocksDB configuration? Are you using custom comparator?

@igorcanadi
Copy link
Collaborator

Can you also c/p rocksdb LOG file just before the crash?

cc @yhchiang this smells like an issue you debugged on Thursday.

@ghost
Copy link
Author

ghost commented Oct 26, 2014

I've found out the way to repro the crash. The problem happens when compaction code fails to open files due to too many open files. Unfortunately, Linux sets the maximum to 1024 by default (ulimit -n), while RocksDB default is max_open_files=5000.
By setting a higher value in ulimit the problem goes away and my stress test passes fine. On the other hand, setting it to something quite low like 50 makes RocksDB crash much more easily.
Besides fixing the crash, maybe the default value for RocksDB should be reconsidered.

@fyrz
Copy link
Contributor

fyrz commented Oct 26, 2014

@dlezama its not linux. Every distribution can define this limit. But of course hitting a crash without getting significant info about what went wrong is not what someone would expect. At least with this one.

@ghost
Copy link
Author

ghost commented Oct 26, 2014

Sorry for oversimplifying the ulimit issue. The distributions I've used for development are Ubuntu based, and my deployment is in Scientific Linux, a quite different distribution. Both have it set to 1024 by default, because of that I'm guessing it's a common default, but you're totally right, other distributions may set it to different values.
In any case, it's impossible to know what other handles are open in the process, so maybe it's not worth it to change RocksDB default, but it shouldn't crash either :)

@fyrz
Copy link
Contributor

fyrz commented Oct 26, 2014

@dlezama im not a fb guy so it's only a opinion :-) but from what i've seen usual behavior is that there is some sort of message "Can`t open file" or whatever.

@igorcanadi
Copy link
Collaborator

This definitely sounds like an issue that @yhchiang debugged few days ago.

@dlezama usually we should return "Can't open file" error, but it looks like there's something going wrong here. We're working on fixing it.

@igorcanadi
Copy link
Collaborator

@dlezama if it's easy for you to repro, can you please try reproing with RocksDB 3.5.1 or 3.4?

@ghost
Copy link
Author

ghost commented Oct 27, 2014

It does happen in 3.5.1, please see #359 for stack traces on how this looks like in that version and in release build (no assertion, but it crashes anyway).

@igorcanadi
Copy link
Collaborator

Thanks. Can you share your LOG file with us (you can find it in your rocksdb directory)? Stack traces only verify that there was a corruption. We need to find what caused the corruption.

@ghost
Copy link
Author

ghost commented Oct 27, 2014

@igorcanadi
Copy link
Collaborator

@dlezama I'm happy to say that @yhchiang was able to reproduce this and the fix is coming!

@igorcanadi
Copy link
Collaborator

This should be fixed with 3772a3d

Can you try reproducing now?

We'll probably be releasing 3.6.2 with this patch.

@ghost
Copy link
Author

ghost commented Oct 30, 2014

3.6.2 doesn't crash, but I'm getting kIOError "Too many open files" in a write right after I see this in the LOG:
2014/10/30-10:55:27.441455 7f8ce4968700 [default] Level-0 flush table #44: started
2014/10/30-10:55:27.441485 7f8ce4968700 [default] Level-0 flush table #44: 0 bytes IO error: csp.db/000044.sst: Too many open files
2014/10/30-10:55:27.441558 7f8ced9b2700 (Original Log Time 2014/10/30-10:55:27.441551) Compaction nothing to do
2014/10/30-10:55:27.814053 7f8ce4968700 Waiting after background flush error: IO error: csp.db/000044.sst: Too many open filesAccumulated background error counts: 1

Is this expected?

@igorcanadi
Copy link
Collaborator

What is your num_open_files? What is your write_buffer_size? What is number of files in rocksdb directory? What's your ulimit for number of files?

@ghost
Copy link
Author

ghost commented Oct 30, 2014

I used a very low ulimit and left max_open_files=5000, so I could reproduce the crash, that didn't happen in 3.6.2. It surprised me a little that a write operation can fail with "too many open files" since my understanding was that writes only needed to access memory and append to the log. So I just wanted to verify that this was an expected thing to happen.
In any case, I'll make sure that in a production environment ulimit is bigger than max_open_files and this won't be an issue.

@igorcanadi
Copy link
Collaborator

@dlezama as soon as we get any IO error (even in the background) we mark the DB read only and fail the Put() with that error (that could have happened in the background). However, even a Write() process can try creating a new file, for example when it rolls the WAL file (delete old one, create new one).

So this is expected. Thanks for reporting! Interestingly, this issue lived in our codebase for at least a year and we found it at the same time both internally and externally :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants