SIGSEGV during BackgroundFlush #13008

Open
prabhjotlalli opened this issue Sep 12, 2024 · 4 comments
Comments

@prabhjotlalli

Expected behavior

Background flush should always succeed.

Actual behavior

Intermittently (anywhere from once a week to once a month), 1 out of a few hundred containers running RocksDB crashes during a background flush.

Steps to reproduce the behavior

No steps to reproduce, since I'm not sure what's causing it. I have included a high-severity error log below (and can give more information about the process).

RocksDB version: 8.11.4

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fb74483841d, pid=1, tid=124
#
# JRE version: OpenJDK Runtime Environment Temurin-19.0.2+7 (19.0.2+7) (build 19.0.2+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-19.0.2+7 (19.0.2+7, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# C  [librocksdbjni10392120128203082722.so+0x43841d]  std::_Hashtable<std::string, std::string, std::allocator<std::string>, std::__detail::_Identity, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_find_before_node(unsigned long, std::string const&, unsigned long) const+0x1d
#
# Core dump will be written. Default location: /var/crash/core.%e.1.%h.%t
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#

---------------  S U M M A R Y ------------

Command Line: -javaagent:/usr/local/lib/dd-java-agent.jar -XX:+UseZGC -Xmx20g -Xms20g (some flags and params hidden)

Host: AMD EPYC 7763 64-Core Processor, 256 cores, 160G, Debian GNU/Linux 12 (bookworm)
Time: Thu Aug 15 14:34:02 2024 UTC elapsed time: 1256452.048050 seconds (14d 13h 0m 52s)

---------------  T H R E A D  ---------------

Current thread is native thread

Stack: [0x00007fb6ddffe000,0x00007fb6de7fd000],  sp=0x00007fb6de7f7a50,  free space=8166k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni10392120128203082722.so+0x43841d]  std::_Hashtable<std::string, std::string, std::allocator<std::string>, std::__detail::_Identity, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_find_before_node(unsigned long, std::string const&, unsigned long) const+0x1d
C  [librocksdbjni10392120128203082722.so+0x68b7b0]  rocksdb::BlockBasedTable::PrefetchIndexAndFilterBlocks(rocksdb::ReadOptions const&, rocksdb::FilePrefetchBuffer*, rocksdb::InternalIteratorBase<rocksdb::Slice>*, rocksdb::BlockBasedTable*, bool, rocksdb::BlockBasedTableOptions const&, int, unsigned long, unsigned long, rocksdb::BlockCacheLookupContext*)+0x790
C  [librocksdbjni10392120128203082722.so+0x68ced8]  rocksdb::BlockBasedTable::Open(rocksdb::ReadOptions const&, rocksdb::ImmutableOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, unsigned char, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, unsigned long, std::shared_ptr<rocksdb::CacheReservationManager>, std::shared_ptr<rocksdb::SliceTransform const> const&, bool, bool, int, bool, unsigned long, bool, rocksdb::TailPrefetchStats*, rocksdb::BlockCacheTracer*, unsigned long, std::string const&, unsigned long, std::array<unsigned long, 2ul>)+0x1038
C  [librocksdbjni10392120128203082722.so+0x675cda]  rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::ReadOptions const&, rocksdb::TableReaderOptions const&, std::unique_ptr<rocksdb::RandomAccessFileReader, std::default_delete<rocksdb::RandomAccessFileReader> >&&, unsigned long, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, bool) const+0xda
C  [librocksdbjni10392120128203082722.so+0x4f6b5b]  rocksdb::TableCache::GetTableReader(rocksdb::ReadOptions const&, rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, bool, bool, unsigned char, rocksdb::HistogramImpl*, std::unique_ptr<rocksdb::TableReader, std::default_delete<rocksdb::TableReader> >*, std::shared_ptr<rocksdb::SliceTransform const> const&, bool, int, bool, unsigned long, rocksdb::Temperature)+0xa5b
C  [librocksdbjni10392120128203082722.so+0x4f7f75]  rocksdb::TableCache::FindTable(rocksdb::ReadOptions const&, rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, rocksdb::BasicTypedCacheInterface<rocksdb::TableReader, (rocksdb::CacheEntryRole)13, rocksdb::Cache*>::TypedHandle**, unsigned char, std::shared_ptr<rocksdb::SliceTransform const> const&, bool, bool, rocksdb::HistogramImpl*, bool, int, bool, unsigned long, rocksdb::Temperature)+0x525
C  [librocksdbjni10392120128203082722.so+0x4fa5c3]  rocksdb::TableCache::NewIterator(rocksdb::ReadOptions const&, rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, rocksdb::RangeDelAggregator*, std::shared_ptr<rocksdb::SliceTransform const> const&, rocksdb::TableReader**, rocksdb::HistogramImpl*, rocksdb::TableReaderCaller, rocksdb::Arena*, bool, int, unsigned long, rocksdb::InternalKey const*, rocksdb::InternalKey const*, bool, unsigned char, rocksdb::TruncatedRangeDelIterator**)+0x5d3
C  [librocksdbjni10392120128203082722.so+0x34b58b]  rocksdb::BuildTable(std::string const&, rocksdb::VersionSet*, rocksdb::ImmutableDBOptions const&, rocksdb::TableBuilderOptions const&, rocksdb::FileOptions const&, rocksdb::ReadOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase<rocksdb::Slice>*, std::vector<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> >, std::allocator<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> > > >, rocksdb::FileMetaData*, std::vector<rocksdb::BlobFileAddition, std::allocator<rocksdb::BlobFileAddition> >*, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, unsigned long, rocksdb::SnapshotChecker*, bool, rocksdb::InternalStats*, rocksdb::IOStatus*, std::shared_ptr<rocksdb::IOTracer> const&, rocksdb::BlobFileCreationReason, rocksdb::SeqnoToTimeMapping const&, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, rocksdb::Env::WriteLifeTimeHint, std::string const*, rocksdb::BlobFileCompletionCallback*, rocksdb::Version*, unsigned long*, unsigned long*, unsigned long*)+0x344b
C  [librocksdbjni10392120128203082722.so+0x4973af]  rocksdb::FlushJob::WriteLevel0Table()+0xf0f
C  [librocksdbjni10392120128203082722.so+0x499142]  rocksdb::FlushJob::Run(rocksdb::LogsWithPrepTracker*, rocksdb::FileMetaData*, bool*)+0x732
C  [librocksdbjni10392120128203082722.so+0x41723e]  rocksdb::DBImpl::AtomicFlushMemTablesToOutputFiles(rocksdb::autovector<rocksdb::DBImpl::BGFlushArg, 8ul> const&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority)+0xe8e
C  [librocksdbjni10392120128203082722.so+0x4198ad]  rocksdb::DBImpl::FlushMemTablesToOutputFiles(rocksdb::autovector<rocksdb::DBImpl::BGFlushArg, 8ul> const&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::Env::Priority)+0x17d
C  [librocksdbjni10392120128203082722.so+0x41a74c]  rocksdb::DBImpl::BackgroundFlush(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::FlushReason*, rocksdb::Env::Priority)+0xe6c
C  [librocksdbjni10392120128203082722.so+0x41dc08]  rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority)+0xc8
C  [librocksdbjni10392120128203082722.so+0x775bcb]  rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x24b
C  [librocksdbjni10392120128203082722.so+0x775da2]  rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x62
@alanpaxton
Contributor

Thanks for the report, @prabhjotlalli. I have had a quick look and I think this problem might need the attention of the core team (FAO @jaykorean @pdillinger). It's initiated from Java, but the problem is deep in the C++ code.
It is interesting that you are running a large multithreaded system and that the problem is "random". Is it correct that the problem occurs on different containers, i.e. not always the same one out of the hundreds that you run? The SIGSEGV itself suggests that the hash table behind an STL unordered_map is corrupted; possibly it is trying to follow a field (a next pointer?) from a null pointer. So I suspect that the STL concurrency rules (one writer, or n readers, but not both) are not being enforced somewhere in the flush code. I don't know my way round the core well enough to figure out where it is happening.
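
For readers unfamiliar with the rule referred to here, the following is a minimal C++17 sketch of "one writer, or n readers, but not both" applied to a standard unordered container. It is an illustration only, not RocksDB code: the class `GuardedNameSet` and its methods are hypothetical names invented for this example, and nothing in this thread establishes that the flush path lacks such a guard.

```cpp
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_set>

// Hypothetical wrapper (not part of RocksDB): every access to the container
// goes through a std::shared_mutex, so a writer never overlaps with readers.
class GuardedNameSet {
 public:
  // Writers take the lock exclusively: at most one at a time, no readers.
  void Insert(const std::string& name) {
    std::unique_lock<std::shared_mutex> lock(mu_);
    names_.insert(name);
  }

  // Readers take the lock in shared mode: any number may run concurrently,
  // but never while a writer holds the exclusive lock.
  bool Contains(const std::string& name) const {
    std::shared_lock<std::shared_mutex> lock(mu_);
    return names_.find(name) != names_.end();
  }

 private:
  mutable std::shared_mutex mu_;  // mutable so const readers can lock it
  std::unordered_set<std::string> names_;
};
```

Without such a guard, a `find` that runs while another thread inserts (and possibly rehashes) can follow a dangling bucket pointer, which is the kind of failure a crash inside `_M_find_before_node` would be consistent with.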

@prabhjotlalli
Author

@alanpaxton - Correct, it happens on different containers across multiple different clusters. Sounds good; let me know if there's more info I can provide to help debug this issue.

@pdillinger
Contributor

The PrefetchIndexAndFilterBlocks function is accessing a static const unordered_map. That suggests either memory corruption or that this occurs while some other thread is running static destructors, perhaps in the process of cleaning up from some other error. Ideally we would wrap more such things in STATIC_AVOID_DESTRUCTION to minimize false attribution of the genesis of crashes and better expose the root cause.
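
The idea behind avoiding static destruction can be sketched in plain C++ as below. This is a hedged illustration of the general "never destroyed static" idiom, not the definition of the `STATIC_AVOID_DESTRUCTION` macro and not the RocksDB source: the functions `LeakedNames` and `IsKnownName` and the placeholder strings are hypothetical. The sketch uses `std::unordered_set` on the assumption that the `std::__detail::_Identity` parameter in the crashing frame indicates a set-like container in libstdc++; that is an inference from the stack trace.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: the set is heap-allocated once on first use and is
// intentionally never deleted. Because only a raw pointer has static storage,
// no destructor runs for the set during process shutdown, so a background
// thread that is still reading it cannot observe a destroyed object.
const std::unordered_set<std::string>& LeakedNames() {
  static const auto* names = new std::unordered_set<std::string>{
      "placeholder.name.one", "placeholder.name.two", "placeholder.name.three"};
  return *names;
}

// Read-only lookup, safe to call from any thread once initialized
// (function-local static initialization is thread-safe since C++11).
bool IsKnownName(const std::string& key) {
  const auto& names = LeakedNames();
  return names.find(key) != names.end();
}
```

The trade-off is a deliberate one-time leak in exchange for lookups that stay valid for the whole life of the process, including while other threads are running exit handlers.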

@alanpaxton
Contributor

Do you mean kBuiltinNameAndAliases? Is it as simple as just wrapping it with the macro? I had inferred from the comments in the code that by now that field could be removed entirely, but maybe we're not living in the future yet...
