Description
Problem
When a network interface goes down during an active upload (e.g., WiFi disconnect), the underlying TCP connection becomes a zombie. The Linux kernel continues retransmitting unacknowledged data with exponential backoff. By default (tcp_retries2=15), this can take ~15 minutes before the kernel gives up and returns ETIMEDOUT.
During this period, send() blocks in kernel space. CURL-level timeouts (CURLOPT_TIMEOUT, CURLOPT_LOW_SPEED_TIME) cannot interrupt a blocked send() — the thread is stuck in the kernel TCP stack.
As a result:
TosClientV2::uploadFile() hangs for ~15 minutes
The upload queue is completely blocked
All pending uploads stack up with no progress
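The ~15 minute figure can be reproduced with a rough calculation. The sketch below assumes the usual Linux bounds of 200 ms (minimum) and 120 s (maximum) for the retransmission timeout (RTO), and a connection whose RTO starts at the minimum. On a real link the RTO is derived from the measured round-trip time, so the actual wait can differ. The name estimateRetransmitTimeoutMs is hypothetical and not part of the SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: estimates how long the kernel keeps retransmitting
// before giving up, assuming the RTO starts at rtoMinMs and doubles after
// every retransmission until it is capped at rtoMaxMs.
std::int64_t estimateRetransmitTimeoutMs(int retries,
                                         std::int64_t rtoMinMs = 200,
                                         std::int64_t rtoMaxMs = 120000) {
    assert(retries >= 0 && rtoMinMs > 0 && rtoMaxMs >= rtoMinMs);
    std::int64_t totalMs = 0;
    std::int64_t rtoMs = rtoMinMs;
    // `retries` retransmissions plus the final wait give retries + 1 intervals.
    for (int i = 0; i <= retries; ++i) {
        totalMs += rtoMs;
        rtoMs = std::min(rtoMs * 2, rtoMaxMs);
    }
    return totalMs;
}
```

With retries = 15 this yields 924600 ms, about 15.4 minutes, which is in line with the hang described above.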
Root Cause
The SDK does not set SO_SNDTIMEO or TCP_USER_TIMEOUT on its sockets. CURLOPT_TIMEOUT only covers HTTP-level timeouts, which cannot interrupt a kernel-blocked send() system call. The only way to enforce a hard deadline on unacknowledged TCP data is TCP_USER_TIMEOUT (Linux) or SO_SNDTIMEO.
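For comparison, here is a minimal sketch of the SO_SNDTIMEO alternative mentioned above. SO_SNDTIMEO limits how long a single blocking send() may wait, whereas TCP_USER_TIMEOUT limits how long transmitted data may stay unacknowledged. The helper names setSendTimeoutMs and getSendTimeoutMs are hypothetical and do not exist in the SDK.

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: limits how long a blocking send() on fd may wait.
// Returns true if the option was applied.
bool setSendTimeoutMs(int fd, long timeoutMs) {
    timeval tv{};
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0;
}

// Hypothetical helper: reads the send timeout back in milliseconds,
// or returns -1 if getsockopt() fails.
long getSendTimeoutMs(int fd) {
    timeval tv{};
    socklen_t len = sizeof(tv);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, &len) != 0) {
        return -1;
    }
    return static_cast<long>(tv.tv_sec) * 1000L + static_cast<long>(tv.tv_usec) / 1000L;
}
```

When the timeout expires, a blocking send() returns -1 with errno set to EAGAIN or EWOULDBLOCK (or a short count if some data was already transferred), so the caller still has to decide whether to abort the connection.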
Proposed Solution
Add a tcpUserTimeout field to ClientConfig (and propagate through HttpConfig → HttpClient). In HttpClient.cc, use CURLOPT_OPENSOCKETFUNCTION to intercept socket creation and apply setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, ...):
1. sdk/include/ClientConfig.h — add field
class ClientConfig {
public:
    // ... existing fields ...
    int tcpUserTimeout = 0;  // ms, 0 = use kernel default (tcp_retries2)
};
2. sdk/include/transport/http/HttpClient.h — add to HttpConfig
struct HttpConfig {
    // ... existing fields ...
    int tcpUserTimeout = 0;
};
3. sdk/src/transport/http/HttpClient.cc — register socket callback
#include <netinet/tcp.h>  // TCP_USER_TIMEOUT
#include <sys/socket.h>   // socket(), setsockopt()

static curl_socket_t openSocketCallback(void* clientp, curlsocktype /*purpose*/,
                                        struct curl_sockaddr* address) {
    // clientp is the pointer registered with CURLOPT_OPENSOCKETDATA.
    const int tcpUserTimeout = *static_cast<const int*>(clientp);
    curl_socket_t sockfd = socket(address->family, address->socktype, address->protocol);
    if (sockfd != CURL_SOCKET_BAD && tcpUserTimeout > 0) {
        // Best effort: if this fails, the kernel default (tcp_retries2) still applies.
        setsockopt(sockfd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &tcpUserTimeout, sizeof(tcpUserTimeout));
    }
    return sockfd;
}

// In the CURL setup section (around line 340):
if (config.tcpUserTimeout > 0) {
    curl_easy_setopt(curl, CURLOPT_OPENSOCKETFUNCTION, openSocketCallback);
    // The pointed-to value must stay alive for as long as the handle uses the callback.
    curl_easy_setopt(curl, CURLOPT_OPENSOCKETDATA, &tcpUserTimeout_);
}
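The callback above ignores the result of setsockopt(). If the SDK should log or surface a failure, the option can be applied through a small checked helper and read back with getsockopt(). This is only a sketch for Linux: applyTcpUserTimeout and readTcpUserTimeout are hypothetical names, not existing SDK or libcurl functions.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper: sets TCP_USER_TIMEOUT (in milliseconds) on fd.
// Returns 0 on success, otherwise the errno value reported by setsockopt().
int applyTcpUserTimeout(int fd, unsigned int timeoutMs) {
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &timeoutMs, sizeof(timeoutMs)) != 0) {
        return errno;
    }
    return 0;
}

// Hypothetical helper: reads TCP_USER_TIMEOUT back from fd.
// Returns the value in milliseconds, or -1 if getsockopt() fails.
long readTcpUserTimeout(int fd) {
    unsigned int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &value, &len) != 0) {
        return -1;
    }
    return static_cast<long>(value);
}
```

Inside openSocketCallback, a non-zero return from applyTcpUserTimeout could be logged while still returning the socket, so that a kernel without TCP_USER_TIMEOUT support falls back to the default behaviour instead of failing the transfer.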
Usage Example
ClientConfig config;
config.tcpUserTimeout = 5000; // 5 seconds
TosClientV2 client(region, ak, sk, config);
// Uploads on this client will abort after 5s of unacknowledged TCP data
Environment
OS: Linux (RK3588 ARM64)
SDK version: current master