I'm playing around with HyperQueue and ran into a server crash while submitting lots of buggy jobs.
Here's a (somewhat flaky) reproducer:
$ ./repro.sh
+ sleep 0.1
+ hq server start
2026-05-20T16:51:42Z INFO HyperQueue 0.26.0
2026-05-20T16:51:42Z INFO No online server found, starting a new server
2026-05-20T16:51:42Z INFO Storing access file as '~/.hq-server/027/access.json'
+------------------+-----------------------------+
| Server directory | ~/.hq-server |
| Server UID | 1uTywd |
| Client host | host |
| Client port | 37617 |
| Worker host | host |
| Worker port | 46671 |
| Version | 0.26.0 |
| Pid | 973563 |
| Start date | 2026-05-20 16:51:42 UTC |
| Journal path | |
+------------------+-----------------------------+
+ sleep 0.1
+ hq worker start --cpus 400
2026-05-20T16:51:42Z INFO HyperQueue 0.26.0
2026-05-20T16:51:42Z INFO Detected 1 NVIDIA GPUs from procs
2026-05-20T16:51:42Z INFO Detected 33587306496B of memory (31.28 GiB)
2026-05-20T16:51:42Z INFO Starting hyperqueue worker
2026-05-20T16:51:42Z INFO Connecting to: host:46671
2026-05-20T16:51:42Z INFO Listening on port 44595
2026-05-20T16:51:42Z INFO Connecting to server (candidate addresses = [[fe80::52b1:9516:6577:8f%2]:46671, [fe80::b1e0:d113:324f:fa8c%2]:46671, [2a02:810b:d86:2900::754d]:46671, [2a02:810b:d86:2900:4768:b0cc:89b9:bef1]:46671, [2a02:810b:d86:2900:bf1d:e644:293e:205a]:46671, [fd7a:115c:a1e0:ab12:4843:cd96:627b:1f80]:46671, 192.168.0.121:46671, 100.123.31.128:46671, 192.168.0.120:46671])
2026-05-20T16:51:42Z INFO Worker 1 registered from 192.168.0.121:45548
+-------------------+----------------------------------+
| Worker | 1 |
| State | RUNNING |
| Hostname | host |
| Started | "2026-05-20T16:51:42.230781416Z" |
| Data provider | host:44595 |
| Working directory | /tmp/hq-workerWyFp8A/work |
| Heartbeat | 8s |
| Idle timeout | None |
| Overview interval | None |
| Resources | cpus: 400 |
| | gpus/nvidia: 1 |
| | mem: 31.28 GiB |
| Time Limit | None |
| Process pid | 973565 |
| Group | default |
| Manager | None |
| Manager Job ID | N/A |
| Last task started | |
+-------------------+----------------------------------+
+ hq submit --stdout /dev/null --stderr /dev/null --array 1-5000 true
2026-05-20T16:51:42Z WARN The job will create 10000 files for stdout and stderr. Consider using the `--stream` option to stream all outputs into a single file
2026-05-20T16:51:42Z WARN You have submitted an array job, but the `stdout` path does not contain the task ID placeholder.
Individual tasks might thus overwrite the file. Consider adding `%{TASK_ID}` to the `--stdout` value.
2026-05-20T16:51:42Z WARN You have submitted an array job, but the `stderr` path does not contain the task ID placeholder.
Individual tasks might thus overwrite the file. Consider adding `%{TASK_ID}` to the `--stderr` value.
Job submitted successfully, job ID: 1
+ hq job progress all
2026-05-20T16:51:42Z INFO Waiting for 1 job with 5000 tasks
[########################################] 0/1 jobs, 4598/5000 tasks 322 RUNNING 4534 FINISHED 64 FAILED
thread 'main' (973563) panicked at crates/tako/src/internal/server/worker.rs:192:17:
assertion failed: a.assigned_tasks.insert(task_id)
stack backtrace:
0: 0x582931c91e02 - std::backtrace_rs::backtrace::libunwind::trace::h73aabaf37ceb5073
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
1: 0x582931c91e02 - std::backtrace_rs::backtrace::trace_unsynchronized::h30862f246760437f
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14
2: 0x582931c91e02 - std::sys::backtrace::_print_fmt::h2d1afd8848eb5d7a
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/sys/backtrace.rs:68:9
3: 0x582931c91e02 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h1851ca2a850bd9a9
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/sys/backtrace.rs:38:26
4: 0x582931990b67 - core::fmt::rt::Argument::fmt::he8640bda190d4d38
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/fmt/rt.rs:152:76
5: 0x582931990b67 - core::fmt::write::h22467d3ad5dd5554
6: 0x582931c90fb5 - std::io::default_write_fmt::h351a88ae8ee5bcc5
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/io/mod.rs:639:11
7: 0x582931c90fb5 - std::io::Write::write_fmt::h5e3b6a876f7a20bf
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/io/mod.rs:1994:13
8: 0x582931c9129a - std::sys::backtrace::BacktraceLock::print::hc25d10722ea4032d
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/sys/backtrace.rs:41:9
9: 0x582931c9129a - std::panicking::default_hook::{{closure}}::he43c3ac33dfa4b50
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:292:27
10: 0x582931c9129a - std::panicking::default_hook::hd124da54acf1152f
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:319:9
11: 0x5829318847ab - call<(&std::panic::PanicHookInfo), (dyn core::ops::function::Fn<(&std::panic::PanicHookInfo), Output=()> + core::marker::Send + core::marker::Sync), alloc::alloc::Global>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/alloc/src/boxed.rs:2220:9
12: 0x5829318847ab - {closure#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:437:9
13: 0x582931c90c74 - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h1713579b981531f7
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/alloc/src/boxed.rs:2220:9
14: 0x582931c90c74 - std::panicking::panic_with_hook::h9b5f1f19954f65a8
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:833:13
15: 0x582931cb9fa8 - std::panicking::panic_handler::{{closure}}::hf431df8c849ee0d6
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:691:13
16: 0x582931cb9f29 - std::sys::backtrace::__rust_end_short_backtrace::hf97362b31a346cc0
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/sys/backtrace.rs:176:18
17: 0x582931cb9f1c - __rustc[9e6a08e89e4b9111]::rust_begin_unwind
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:689:5
18: 0x582931991b1b - core::panicking::panic_fmt::ha4414e4328fe24a0
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/panicking.rs:80:14
19: 0x582931991dc1 - core::panicking::panic::ha2e20a73227bb72e
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/panicking.rs:150:5
20: 0x582931d1fbf4 - insert_sn_task
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/worker.rs:192:17
21: 0x5829318be614 - task_running<tako::internal::server::comm::CommSender>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/reactor.rs:316:18
22: 0x5829318be614 - on_task_update<tako::internal::server::comm::CommSender>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/reactor.rs:244:36
23: 0x5829318be614 - {async_fn#0}<futures_util::stream::stream::split::SplitStream<tokio_util::codec::framed::Framed<tokio::net::tcp::stream::TcpStream, tokio_util::codec::length_delimited::LengthDelimitedCodec>>>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:298:17
24: 0x5829318be614 - {closure#3}
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/macros/select.rs:705:49
25: 0x5829318b5380 - poll<tako::internal::server::rpc::worker_rpc_loop::{async_fn#0}::__tokio_select_util::Out<core::result::Result<core::option::Option<tako::internal::messages::worker::WorkerStopReason>, tako::internal::common::error::DsError>, core::result::Result<(), std::io::error::Error>, tako::gateway::LostWorkerReason>, tako::internal::server::rpc::worker_rpc_loop::{async_fn#0}::{closure_env#3}>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/future/poll_fn.rs:151:9
26: 0x5829318b5380 - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:242:18
27: 0x5829318d4784 - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:63:83
28: 0x5829318d4784 - {closure#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/core.rs:375:24
29: 0x5829318d4784 - with_mut<tokio::runtime::task::core::Stage<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}>, core::task::poll::Poll<()>, tokio::runtime::task::core::{impl#6}::poll::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/loom/std/unsafe_cell.rs:16:9
30: 0x5829318d4784 - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/core.rs:364:30
31: 0x5829318d4784 - {closure#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/harness.rs:535:30
32: 0x5829318d4784 - call_once<core::task::poll::Poll<()>, tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/panic/unwind_safe.rs:274:9
33: 0x5829318d4784 - do_call<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>, core::task::poll::Poll<()>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:581:40
34: 0x5829318d4784 - catch_unwind<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panicking.rs:544:19
35: 0x5829318d4784 - catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future::{closure_env#0}<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>>, core::task::poll::Poll<()>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/panic.rs:359:14
36: 0x5829318d4784 - poll_future<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/harness.rs:523:18
37: 0x5829318d4784 - poll_inner<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/harness.rs:210:27
38: 0x5829318d4784 - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/harness.rs:155:20
39: 0x5829318d4784 - poll<tako::internal::server::rpc::connection_initiator::{async_fn#0}::{async_block_env#0}, alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/raw.rs:337:13
40: 0x582931d54245 - poll
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/raw.rs:267:18
41: 0x582931d54245 - run<alloc::sync::Arc<tokio::task::local::Shared, alloc::alloc::Global>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/task/mod.rs:510:13
42: 0x582931d54245 - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:773:65
43: 0x582931d54245 - with_budget<(), tokio::task::local::{impl#4}::tick::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/coop/mod.rs:167:5
44: 0x582931d54245 - budget<(), tokio::task::local::{impl#4}::tick::{closure_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/coop/mod.rs:133:5
45: 0x582931d54245 - tick
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:773:31
46: 0x5829318b50e3 - {closure#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:1080:29
47: 0x5829318b50e3 - {closure#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:827:13
48: 0x5829318b50e3 - try_with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>, core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:513:12
49: 0x5829318b50e3 - with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>, core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:477:20
50: 0x5829318b50e3 - with<core::task::poll::Poll<core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:825:17
51: 0x5829318b50e3 - poll<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:1066:22
52: 0x5829318b50e3 - {async_fn#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:723:19
53: 0x5829318b50e3 - {async_fn#0}<tokio::net::tcp::listener::{impl#0}::accept::{async_fn_env#0}, core::result::Result<(tokio::net::tcp::stream::TcpStream, core::net::socket_addr::SocketAddr), std::io::error::Error>>
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/common/taskgroup.rs:15:36
54: 0x582931836d6c - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/server/rpc.rs:47:68
55: 0x582931836d6c - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/macros/select.rs:705:49
56: 0x582931836d6c - poll<tako::control::server_start::{async_block#0}::__tokio_select_util::Out<(), core::result::Result<(), tako::internal::common::error::DsError>>, tako::control::server_start::{async_block#0}::{closure_env#0}>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/future/poll_fn.rs:151:9
57: 0x582931836d6c - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/tako/src/control.rs:264:9
58: 0x582931836d6c - {closure#0}
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/macros/select.rs:705:49
59: 0x582931836d6c - poll<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block#2}::__tokio_select_util::Out<(), (), (), core::result::Result<(), tako::internal::common::error::DsError>>, hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block#2}::{closure_env#0}>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/future/poll_fn.rs:151:9
60: 0x582931836d6c - {async_block#2}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:271:22
61: 0x582931836d6c - {closure#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:1076:44
62: 0x582931836d6c - {closure#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:827:13
63: 0x582931836d6c - try_with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>>, core::task::poll::Poll<core::result::Result<(), anyhow::Error>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:513:12
64: 0x582931836d6c - with<tokio::task::local::LocalData, tokio::task::local::{impl#4}::with::{closure_env#0}<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>>, core::task::poll::Poll<core::result::Result<(), anyhow::Error>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:477:20
65: 0x582931836d6c - with<core::task::poll::Poll<core::result::Result<(), anyhow::Error>>, tokio::task::local::{impl#10}::poll::{closure_env#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:825:17
66: 0x582931836d6c - poll<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:1066:22
67: 0x582931836d6c - {async_fn#0}<hyperqueue::server::bootstrap::initialize_server::{async_fn#0}::{async_block_env#2}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/local.rs:723:19
68: 0x582931836d6c - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:394:30
69: 0x582931875acd - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/server/bootstrap.rs:69:49
70: 0x582931875acd - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/commands/server.rs:227:43
71: 0x582931875acd - {async_fn#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/commands/server.rs:163:69
72: 0x582931875acd - {async_block#0}
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:510:70
73: 0x582931860ed5 - poll<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/future/future.rs:133:9
74: 0x582931860ed5 - poll<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/future/future.rs:133:9
75: 0x582931860ed5 - {closure#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:769:70
76: 0x582931860ed5 - with_budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/coop/mod.rs:167:5
77: 0x582931860ed5 - budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/task/coop/mod.rs:133:5
78: 0x582931860ed5 - {closure#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:769:25
79: 0x582931860ed5 - enter<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure#0}::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:446:19
80: 0x582931860ed5 - {closure#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:768:44
81: 0x582931860ed5 - {closure#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:856:68
82: 0x582931860ed5 - set<tokio::runtime::scheduler::Context, tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/context/scoped.rs:40:9
83: 0x582931860ed5 - {closure#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/context.rs:176:38
84: 0x582931860ed5 - try_with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:513:12
85: 0x582931860ed5 - with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/thread/local.rs:477:20
86: 0x582931860ed5 - set_scheduler<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/context.rs:176:17
87: 0x582931860ed5 - enter<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:856:27
88: 0x582931860ed5 - block_on<core::pin::Pin<&mut core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:756:24
89: 0x582931860ed5 - {closure#0}<core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:200:33
90: 0x582931860ed5 - enter_runtime<tokio::runtime::scheduler::current_thread::{impl#0}::block_on::{closure_env#0}<core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>, core::result::Result<(), hyperqueue::common::error::HqError>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/context/runtime.rs:65:16
91: 0x582931860ed5 - block_on<core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/scheduler/current_thread/mod.rs:188:9
92: 0x582931860ed5 - block_on_inner<core::pin::Pin<alloc::boxed::Box<hq::main::{async_block_env#0}, alloc::alloc::Global>>>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/runtime.rs:371:52
93: 0x582931860ed5 - block_on<hq::main::{async_block_env#0}>
at /github/home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.50.0/src/runtime/runtime.rs:343:18
94: 0x582931860ed5 - main
at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:560:12
95: 0x582931884d13 - call_once<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, ()>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/core/src/ops/function.rs:250:5
96: 0x582931884d13 - __rust_begin_short_backtrace<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), hyperqueue::common::error::HqError>>
at /rustc/01f6ddf7588f42ae2d7eb0a2f21d44e8e96674cf/library/std/src/sys/backtrace.rs:160:18
97: 0x5829318f2281 - main
98: 0x784a4da2a601 - <unknown>
99: 0x784a4da2a718 - __libc_start_main
100: 0x5829317fc02e - _start
101: 0x0 - <unknown>
Oops, HyperQueue has crashed. This is a bug, sorry for that.
If you would be so kind, please report this issue at the HQ issue tracker: https://github.com/It4innovations/hyperqueue/issues/new?title=HQ%20crashes
Please include the above error (starting from "thread ... panicked ...") and the stack backtrace in the issue contents, along with the following information:
HyperQueue version: v0.26.0
You can also re-run HyperQueue server (and its workers) with the `RUST_LOG=hq=debug,tako=debug`
environment variable, and attach the logs to the issue, to provide us more information.
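For reference, here is a minimal sketch of what the failed assertion `a.assigned_tasks.insert(task_id)` implies. It assumes `assigned_tasks` behaves like a standard hash set, whose `insert` returns `false` when the value is already present. The `WorkerAssignment` type, its methods, and the `u32` task ID are hypothetical stand-ins for illustration, not the actual tako code. Under that assumption, the panic would mean that a task-running update arrived for a task the server had already recorded as assigned to that worker.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the server-side bookkeeping of one worker.
/// The field name mirrors the panic message; the concrete type is an assumption.
#[derive(Default)]
struct WorkerAssignment {
    assigned_tasks: HashSet<u32>,
}

impl WorkerAssignment {
    /// Records a task as assigned to this worker.
    /// Returns `false` if the task ID was already in the set.
    fn try_insert_task(&mut self, task_id: u32) -> bool {
        self.assigned_tasks.insert(task_id)
    }

    /// Same bookkeeping, but panics on a duplicate ID,
    /// like the assertion shown in the backtrace.
    fn insert_task(&mut self, task_id: u32) {
        assert!(self.assigned_tasks.insert(task_id));
    }
}
```

Calling `insert_task` twice with the same ID on one `WorkerAssignment` panics with `assertion failed: self.assigned_tasks.insert(task_id)`, while `try_insert_task` reports the duplicate by returning `false`.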
The crash reproduces when lots of tasks are scheduled and `hq job progress` observes the job statuses. Streaming the job progress seems to be required, as the issue doesn't reproduce when using `hq job wait all` instead. I've reproduced it on both the v0.26.0 release and the latest nightly build. It also seems to be important that some of the tasks fail; in my reproducer this happens naturally due to a low `ulimit -n` 🙃

Let me know if you're able to reproduce the crash. Running with `RUST_LOG` didn't seem particularly helpful in this case.

Thanks!