Call MinimumInstanceChecker when an agent is bound#1124

Merged: res0nance merged 2 commits into jenkinsci:master from jglick:replenish on Aug 12, 2025

jglick (Member) commented Jul 31, 2025

Extracted from #1121.

I was testing a cloud config like this:

amazonEC2:
  name: 
  region: 
  cleanUpOrphanedNodes: true
  noDelayProvisioning: true
  sshKeysCredentialsId: 
  templates:
  - ami: 
    amiType:
      unixData:
        sshPort: "22"
    associatePublicIp: false
    avoidUsingOrphanedNodes: true
    connectBySSHProcess: false
    connectionStrategy: PRIVATE_IP
    deleteRootOnTermination: false
    description: 
    ebsEncryptRootVolume: DEFAULT
    ebsOptimized: false
    enclaveEnabled: false
    hostKeyVerificationStrategy: 'OFF'
    idleTerminationMinutes: "0" # ℹ️
    javaPath: java
    labelString: 
    maxTotalUses: 1 # ℹ️
    metadataEndpointEnabled: true
    metadataHopsLimit: 1
    metadataSupported: true
    metadataTokensRequired: false
    minimumNumberOfInstances: 0
    minimumNumberOfSpareInstances: 2 # ℹ️
    mode: EXCLUSIVE
    monitoring: false
    numExecutors: 1 # ℹ️
    remoteAdmin: 
    remoteFS: 
    securityGroups: 
    stopOnTerminate: false
    subnetId: 
    t2Unlimited: false
    tenancy: Default
    type: 
    useEphemeralDevices: false
  useInstanceProfileForCredentials: false

The intention is to keep two instances booted, connected, and ready to use at all times, treating them as “one-shot” (destroyed after use, so there is no chance of cross-build state contamination). However, after a build ran and consumed one instance, that instance was not replaced with a fresh one until a periodic task, which runs every ten minutes, next fired. This is not nearly fast enough to keep up with even a moderately frequent build schedule. With this patch, the process of launching a fresh instance starts the moment an instance is assigned to a build, so the pool stays sufficiently warm as long as you do not trigger more builds than a pool of that size can keep up with.
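
To make the difference concrete, here is a minimal Java sketch of a one-shot pool that is topped up either only by a later periodic check or immediately when an agent is bound. It is a toy model, not the plugin's code: `WarmPoolSketch`, `bind`, and `replenish` are hypothetical names, and launching is modeled as instantaneous, whereas a real instance takes time to boot and connect.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a one-shot agent pool; all names here are hypothetical. */
public class WarmPoolSketch {
    // Plays the role of minimumNumberOfSpareInstances in the config above.
    private final int spareTarget;
    private final Deque<String> idle = new ArrayDeque<>();
    private int launched = 0;

    public WarmPoolSketch(int spareTarget) {
        this.spareTarget = spareTarget;
        replenish(); // start with a full pool
    }

    /** Launch agents until the idle count reaches the spare target. */
    public final void replenish() {
        while (idle.size() < spareTarget) {
            launched++;
            idle.addLast("agent-" + launched);
        }
    }

    /**
     * Hand an idle agent to a build. With replenishOnBind set, the pool is
     * topped up right away instead of waiting for a later periodic check.
     */
    public String bind(boolean replenishOnBind) {
        String agent = idle.pollFirst();
        if (agent == null) {
            throw new IllegalStateException("no idle agent available");
        }
        if (replenishOnBind) {
            replenish();
        }
        return agent;
    }

    public int idleCount() {
        return idle.size();
    }

    public int launchedCount() {
        return launched;
    }

    public static void main(String[] args) {
        WarmPoolSketch periodicOnly = new WarmPoolSketch(2);
        periodicOnly.bind(false);
        // Pool stays short until something calls replenish() later.
        System.out.println("periodic only: " + periodicOnly.idleCount() + " idle");

        WarmPoolSketch eager = new WarmPoolSketch(2);
        eager.bind(true);
        System.out.println("replenish on bind: " + eager.idleCount() + " idle");
    }
}
```

In this model the first pool is left with one idle agent after a build takes one, while the second is back at two. With real boot times the second pool would also dip briefly, which is why it only stays warm if builds arrive no faster than replacements come up.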

There may well be other situations in which instances need to be refreshed; this is just the one I found in this scenario.

jglick (Member, Author) commented Aug 1, 2025

https://github.com/jenkinsci/ec2-plugin/pull/1124/checks?check_run_id=47157908661

org.opentest4j.AssertionFailedError: expected: <2> but was: <3>
	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
	at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
	at hudson.plugins.ec2.EC2RetentionStrategyTest.testRetentionDespiteIdleWithMinimumInstanceActiveTimeRange(EC2RetentionStrategyTest.java:928)

According to CloudBees PCT testing, this test is known to be flaky. I ran it locally on this branch and it passed five times, so I am guessing the failure is unrelated to my changes.

res0nance merged commit 94a4ed1 into jenkinsci:master on Aug 12, 2025 (17 checks passed)
jglick deleted the replenish branch on August 12, 2025 11:41
