fix(profiling): guard against invalid access on exit #17180

Draft
KowalskiThomas wants to merge 1 commit into main from dd/kowalski/fix/profiling-segv-sampling-thread-exit

@KowalskiThomas KowalskiThomas commented Mar 30, 2026

Description

This PR addresses a crash that can occur at process exit when the sampling thread fails to stop within the allowed time. Previously, we would still free the Sampler object in that case (through implicit destruction of the std::unique_ptr), and the sampling thread could then try to read data from, e.g., the StringTable while it was being deleted or after it had been deleted.

Note: I don't have any proof that the crash is caused by this specific code path, but it is plausible, because this path would definitely crash. Regardless, it's probably worth "fixing" (even though we're working around the problem rather than properly fixing it here; I just don't see another way).

#0 0x000070c4fae4d0b0 std::_Hashtable<unsigned long, std::pair<unsigned long const, std::string>, std::allocator<std::pair<unsigned long const, std::string> >, std::__detail::_Select1st, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node(unsigned long, unsigned long const&, unsigned long) const [clone .constprop.0]
#1 0x000070c4fae4d8e5 std::_Hashtable<unsigned long, std::pair<unsigned long const, std::string>, std::allocator<std::pair<unsigned long const, std::string> >, std::__detail::_Select1st, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_node(unsigned long, unsigned long const&, unsigned long) const [clone .isra.0]
#2 0x000070c4fae4699f StringTable::key(_object*, StringTag)
#3 0x000070c4fae46cb4 Frame::create(EchionSampler&, PyCodeObject*, int) [clone .localalias]
#4 0x000070c4fae46f87 Frame::get(EchionSampler&, PyCodeObject*, int) [clone .localalias]
#5 0x000070c4fae47101 Frame::read(EchionSampler&, _PyInterpreterFrame*, _PyInterpreterFrame**)
#6 0x000070c4fae47226 unwind_frame(EchionSampler&, _object*, FrameStack&, unsigned long) [clone .localalias]
#7 0x000070c4fae496b7 ThreadInfo::unwind(EchionSampler&, _ts*) [clone .localalias]
#8 0x000070c4fae4cb01 ThreadInfo::sample(EchionSampler&, _ts*, long)
#9 0x000070c4fae4cc36 std::_Function_handler<void (_ts*, ThreadInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}::operator()(InterpreterInfo&) const::{lambda(_ts*, ThreadInfo&)#1}>::_M_invoke(std::_Any_data const&, _ts*&&, ThreadInfo&)
#10 0x000070c4fae49942 for_each_thread(EchionSampler&, InterpreterInfo&, std::function<void (_ts*, ThreadInfo&)> const&)
#11 0x000070c4fae499da std::_Function_handler<void (InterpreterInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}>::_M_invoke(std::_Any_data const&, InterpreterInfo&)
#12 0x000070c4fae46288 for_each_interp(pyruntimestate*, std::function<void (InterpreterInfo&)> const&)
#13 0x000070c4fae49d8d Datadog::Sampler::sampling_thread(unsigned long) [clone .localalias]
#14 0x000070c4fae49ff9 call_sampling_thread(void*)
#15 0x000070c4fdf321f5 start_thread (nptl/nptl/pthread_create.c:442)
#16 0x000070c4fdfb1b40 __clone (sysdeps/unix/sysv/linux/x86_64/clone.S:102)
#17 0x0000000000000000
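
For illustration only (this is not the PR's diff): a minimal C++ sketch of one way such a guard could look, assuming the workaround is to skip destruction of the sampler when its thread did not stop in time. `ToySampler`, `stop`, and `shutdown_sampler` are hypothetical names invented for this sketch and do not come from the ddtrace code base.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the real Sampler; all names are illustrative.
class ToySampler {
public:
    explicit ToySampler(std::chrono::milliseconds tick) : tick_(tick) {}

    void start() {
        worker_ = std::thread([this] { run(); });
    }

    // Ask the worker to stop and wait up to `timeout` for it to finish.
    // Returns true if the worker exited in time (or was never started).
    bool stop(std::chrono::milliseconds timeout) {
        if (!worker_.joinable()) {
            return true;
        }
        stop_requested_.store(true);
        bool exited = false;
        {
            std::unique_lock<std::mutex> lock(mutex_);
            exited = cv_.wait_for(lock, timeout, [this] { return exited_; });
        }
        if (exited) {
            worker_.join();    // the worker is done touching our members
        } else {
            worker_.detach();  // the worker may still be using this object
        }
        return exited;
    }

private:
    void run() {
        // Each pass stands in for one sampling iteration; at least one runs.
        do {
            std::this_thread::sleep_for(tick_);
        } while (!stop_requested_.load());
        {
            std::lock_guard<std::mutex> lock(mutex_);
            exited_ = true;
        }
        cv_.notify_one();
    }

    std::chrono::milliseconds tick_;
    std::atomic<bool> stop_requested_{false};
    std::mutex mutex_;
    std::condition_variable cv_;
    bool exited_ = false;
    std::thread worker_;
};

// Returns true if the sampler was destroyed, false if it was leaked on purpose.
bool shutdown_sampler(std::unique_ptr<ToySampler>& sampler,
                      std::chrono::milliseconds timeout) {
    if (!sampler) {
        return true;
    }
    if (sampler->stop(timeout)) {
        sampler.reset();  // thread is gone, so destruction is safe
        return true;
    }
    // The thread may still read sampler state: give up ownership without
    // running the destructor, trading a leak at exit for a use-after-free.
    (void)sampler.release();
    return false;
}
```

The trade-off in this sketch matches the one described above: leaking at process exit is harmless because the OS reclaims the memory, whereas destroying state a running thread still reads is not.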

@datadog-official

I can only run on private repositories.

@cit-pr-commenter-54b7da

Codeowners resolved as

ddtrace/internal/datadog/profiling/stack/include/sampler.hpp            @DataDog/profiling-python
ddtrace/internal/datadog/profiling/stack/src/sampler.cpp                @DataDog/profiling-python
releasenotes/notes/profiling-fix-segv-sampling-thread-exit-e909f07be050246c.yaml  @DataDog/apm-python

@KowalskiThomas KowalskiThomas added the Profiling Continuous Profiling label Mar 30, 2026
@KowalskiThomas KowalskiThomas force-pushed the dd/kowalski/fix/profiling-segv-sampling-thread-exit branch from bbd8d23 to ebd369e Compare March 30, 2026 08:21
@KowalskiThomas KowalskiThomas added the identified-by:crashtracking Identified by Crash Tracking label Mar 30, 2026
Co-authored-by: KowalskiThomas <14239160+KowalskiThomas@users.noreply.github.com>
@KowalskiThomas KowalskiThomas force-pushed the dd/kowalski/fix/profiling-segv-sampling-thread-exit branch from ebd369e to 8dc485a Compare March 30, 2026 09:00
Labels

Bits AI identified-by:crashtracking Identified by Crash Tracking Profiling Continuous Profiling
