fix: reduce tcp_retries2 to 3 #2912

Open
merlin-northern wants to merge 2 commits into
mendersoftware:master from
merlin-northern:qa_1625__mender_connect_test_in_poor_network_conditions_fail_with_404_from_time_to_time_tcp-retries2

@merlin-northern
Contributor

@merlin-northern merlin-northern commented May 30, 2026

fix: reduce tcp_retries2 to 3

Some of the mender_connect tests, test_in_poor* among them, expect the
TCP/IP stack to react within a given time. With the current Linux kernel
defaults, this is not the case.

To make the tests pass every time, we have to make the kernel give up on
retransmissions sooner, so that the TCP stack notices the broken
connectivity and treats the mender-connect link to deviceconnect as
broken within the times hardcoded in the test. Each retransmission waits
for an exponentially growing retransmission timeout (RTO), so one way of
doing that is to lower tcp_retries2 from the default of 15, with which
detection may take (and, as the failures of this test prove, does take)
more than 10 minutes, to a new, far more sensitive 3.

Refs:
https://docs.kernel.org/networking/ip-sysctl.html
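
For a rough sense of the numbers: the kernel documentation referenced above notes that the default of 15 corresponds to a hypothetical timeout of 924.6 seconds. Below is a minimal sketch of that arithmetic in Python. It assumes the RTO starts at the kernel's 200 ms minimum and doubles after every retransmission up to the 120 s maximum. Real connections start from an RTT-based RTO, so actual times are longer. The function name is illustrative and not part of this PR.

```python
TCP_RTO_MIN_MS = 200      # kernel's minimum RTO, assumed here as the starting RTO
TCP_RTO_MAX_MS = 120_000  # kernel's maximum RTO, the cap for exponential backoff


def hypothetical_timeout_ms(retries: int) -> int:
    """Lower bound (in ms) on how long TCP keeps retrying for a given tcp_retries2."""
    # One wait after the initial transmission plus one after each retransmission;
    # every wait doubles the previous one until it hits the cap.
    return sum(min(TCP_RTO_MIN_MS << i, TCP_RTO_MAX_MS) for i in range(retries + 1))


if __name__ == "__main__":
    for retries in (15, 3):
        print(retries, hypothetical_timeout_ms(retries) / 1000, "s")
```

Under these assumptions the sketch yields 924.6 s for tcp_retries2=15 and 3.0 s for tcp_retries2=3, which is only a lower bound on what the tests would observe.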

A note to reviewers @danielskinstad @kjaskiewiczz: I started 4 independent pipelines in 4.1.x listed below; I restarted integration:3 many times, and it never failed. You can see all the logs from all the jobs and how other failures were distributed (this change does not introduce additional instability).

Changelog: Title
Ticket: QA-1625
Signed-off-by: Peter Grzybowski <peter@northern.tech>

@mender-test-bot
Contributor

@merlin-northern, start a full integration test pipeline with:

  • mentioning me and start integration pipeline

my commands and options

You can prevent me from automatically starting CI pipelines:

  • if your pull request title starts with "[NoCI] ..."

You can trigger a client pipeline on multiple prs with:

  • mentioning me and start client pipeline --pr mender/127 --pr mender-connect/255

You can trigger a client pipeline for a specific Mender Client release with:

  • mentioning me and start client pipeline --release 6.0.x (can be given multiple times)
  • by default, a pipeline is triggered for each supported release the component is a part of

You can trigger GitHub->GitLab branch sync with:

  • mentioning me and sync

You can print PR statistics for a repository with:

  • mentioning me and print fast pr stats (Team stats only)
  • mentioning me and print full pr stats (Detailed report)
  • options: --repo <repo>, --team <name>, --all-repos, --exclude-drafts, --exclude-user <user>
  • mentioning me and print full pr stats --repo mender --all-repos --exclude-drafts

You can cherry pick to a given branch or branches with:

  • mentioning me and:
 cherry-pick to:
 * 1.0.x
 * 2.0.x

@merlin-northern merlin-northern force-pushed the qa_1625__mender_connect_test_in_poor_network_conditions_fail_with_404_from_time_to_time_tcp-retries2 branch from ef6e6e2 to 666d827 Compare May 30, 2026 05:53
@merlin-northern merlin-northern force-pushed the qa_1625__mender_connect_test_in_poor_network_conditions_fail_with_404_from_time_to_time_tcp-retries2 branch from 35af818 to e7a8b48 Compare May 30, 2026 14:23
@mender-test-bot
Contributor

mender-test-bot commented May 30, 2026

Merging these commits will result in the following changelog entries:

Changelogs

integration (qa_1625__mender_connect_test_in_poor_network_conditions_fail_with_404_from_time_to_time_tcp-retries2)

New changes in integration since master:

Bug Fixes
  • reduce tcp_retries2 to 3
    (QA-1625)

@merlin-northern
Contributor Author

@mender-test-bot start integration pipeline please

@mender-test-bot
Contributor

Hello 😺 I created a pipeline for you here: Pipeline-2564031608

Build Configuration Matrix

Key Value
INTEGRATION_REV pull/2912/head
RUN_TESTS_FULL_INTEGRATION true
