Skip to content

Resource runs its finalizers when timed out#4617

Open
fommil wants to merge 8 commits into
typelevel:series/3.7.xfrom
fommil:resource-override-timeout
Open

Resource runs its finalizers when timed out#4617
fommil wants to merge 8 commits into
typelevel:series/3.7.xfrom
fommil:resource-override-timeout

Conversation

@fommil

@fommil fommil commented Jun 16, 2026

Copy link
Copy Markdown

This fixes #4489 that was introduced by #4059 using an approach documented by @durban in #4489 (comment) and admitted reluctantly by @djspiewak in #4489 (comment) who suggests that there is perhaps a more general solution by rethinking Resource itself.

@djspiewak

djspiewak commented Jun 20, 2026

Copy link
Copy Markdown
Member

Can we make this so that it doesn't regress #4059? The reason timeout is implemented in terms of racePair in the first place is it's possible for the effect to complete at the exact moment of the timeout, and in that case, the value can be leaked because the finalizers on the value side have already popped. Something like this comes to mind:

IO.ref(false) flatMap { fired =>
  val program = Resource.eval(IO.sleep(100.millis).onCancel(fired.set(true)).timeout(100.millis).use_

  program.attempt flatMap {
    case Left(()) => program  // worked but we didn't reproduce the error
    case Right(_) => fired.get.flatMap(f => IO(assert(f == true)))
  }
}

The above will almost certainly fail with your fix, indicating a memory leak in the case where the timeout executes but we don't run the finalizer. (also note the above is a handwave due to the way I encoded the repetition; with a proper implementation it would loop forever, so this isn't a real unit test)

Edit: Additionally, this only fixes half of the problem. We also need to address the bug in start (described here: #4489 (comment))

@mergify

mergify Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Tick the box to add this pull request to the merge queue (same as @mergifyio queue).

  • Queue this pull request

@djspiewak

Copy link
Copy Markdown
Member

Oh, secondarily: can we retarget this at series/3.7.x please? That will allow us to release it faster. It's also fine if you need to target it at series/3.6.x. We're almost certainly going to do another release in that lineage.

@fommil fommil changed the base branch from series/3.x to series/3.7.x June 23, 2026 11:29
@fommil

This comment was marked as outdated.

@fommil fommil force-pushed the resource-override-timeout branch from 4588cc5 to 767ba8a Compare June 23, 2026 11:42
@fommil

fommil commented Jun 23, 2026

Copy link
Copy Markdown
Author

I added a test for the simultaneous success/timeout but I can't get it to fail even with the original proposed fix 🤷🏻‍♂️ ... I'm not sure what to do to force it to be more likely to happen.

@durban durban left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something wrong with this PR, it contains completely unrelated changes from series/3.x. (I'd guess something went wrong when it was retargeted.)

Comment thread kernel/shared/src/main/scala/cats/effect/kernel/Resource.scala Outdated
@fommil fommil force-pushed the resource-override-timeout branch 3 times, most recently from ac1e313 to f74851d Compare June 23, 2026 13:51
@fommil

fommil commented Jun 23, 2026

Copy link
Copy Markdown
Author

@durban re: unrelated commits yeah that was an accident, I rebased.

BTW, "cats eye nebula" 😆 nice... is the "cats" part an intentional cats-effect joke? I have a good view of C6 from my backyard, I got eyes on it this past winter; it's really small but stunning once you find it.

@fommil fommil force-pushed the resource-override-timeout branch from f74851d to 778de6d Compare June 23, 2026 14:31
@fommil

fommil commented Jun 23, 2026

Copy link
Copy Markdown
Author

I reverted the verbose pattern match, and added a few more tests. It would be good if these used ticked (I'll have a think about that) but it looks like only the "losing a timeout" fails on the mainline.

This PR makes all tests go green, so I still can't construct a test that lets us see the other theoretical failure case.

It's highly likely my tests are junk, though, so reviewing those would be a good first step.

@fommil fommil force-pushed the resource-override-timeout branch from 778de6d to 6c8b38a Compare June 23, 2026 14:35
@fommil fommil force-pushed the resource-override-timeout branch 2 times, most recently from 5bbaf6e to d7f6ec7 Compare June 23, 2026 16:26
@fommil fommil force-pushed the resource-override-timeout branch from d7f6ec7 to c2f8156 Compare June 23, 2026 16:27
@fommil

fommil commented Jun 23, 2026

Copy link
Copy Markdown
Author

huzzah, this latest test change seems to trigger the "leaked value" scenario. It is very non-deterministc. It fails with my change AND in the baseline. So there is always the possibility the test is junk, so please take a look at it.

Sorry for all these force pushes, I keep forgetting to scalafmt, and I wanted to include the unit tests at the start of the history.

@durban durban left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fommil No, it is not an intentional Cats (Effect) reference. (I'm not even sure there was a Cats back then.) I just wanted a nice nebula for my profile picture 🤷‍♂️

real("run nested finalisers when succeeding at the same time as the timeout") {
val go = IO.ref(false).flatMap { ref =>
val inner = Resource.make(ref.set(true))(_ => ref.set(false))
val res = Resource.make(IO.sleep(99.millis).flatMap(_ => inner.use_))(_ => IO.unit)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is strange to me. Specifically use_ing inner in the acquire part of res. Don't get me wrong, I'd expect this test to pass (and it doesn't). But the test itself is unclear to me. Was @armanbilge's minimization here not enough to reproduce the issue?

@fommil fommil Jun 24, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

His minimisation is one of the three tests I added. (The other) Daniel said he expected my change to cause the value to be leaked when it completes at the same time as the timeout so I wrote this test and was surprised to find that it fails for both the mainline and my branch. The third test is a regular "timeout wins" test which seems fine for both mainline and my suggestion.

With this test I wanted a way to tell if the value has been leaked, which is why the inner Resource modifies then cleans up the ref. If you have another way of testing a leaked value that might be cleaner because there's a double use of Resource here that can be confusing.

@fommil fommil force-pushed the resource-override-timeout branch from 5ab9f61 to adc8e66 Compare June 25, 2026 10:58
@fommil

fommil commented Jun 25, 2026

Copy link
Copy Markdown
Author

I ignored the failing test, it's somewhat concerning.

@fommil

fommil commented Jun 25, 2026

Copy link
Copy Markdown
Author

@durban well if you ever want to take your own pic of a nebula, check out https://github.com/fommil/luddcam 😄

@fommil

fommil commented Jun 25, 2026

Copy link
Copy Markdown
Author

one of the tests I added is failing the windows CI build. I can take a guess at that, I bet it's because it's doing a bunch of stuff in parallel. I'll disable one more.

@fommil

fommil commented Jun 26, 2026

Copy link
Copy Markdown
Author

@djspiewak your test is failing even on main

  real("ensure a timed out resource runs its onCancel") {
    IO.ref(true) flatMap { fired =>
      Resource
        .eval(IO.sleep(100.millis).onCancel(fired.set(false)).timeout(100.millis))
        .use_
        .attempt
        .flatMap {
          case Left(_) => IO.unit
          case Right(_) =>
            fired.get.ifM(IO.raiseError(new Exception("didn't run the cancelation")), IO.unit)
        }
    }
  }

I've added this but flagged it .ignore. Note it passes if I use a ticker. It might benefit from replicate.

@durban

durban commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Shouldn't the two cases be the other way around for that test? If it's a Right, then it did not time out, thus I wouldn't expect the onCancel to execute.

@fommil

fommil commented Jun 26, 2026

Copy link
Copy Markdown
Author

The test should probably be on both cases, but you're right. We maybe need to add some more waiting to allow the onCancel to run, I presume there's no guarantee that it runs before anything can access the value. I'd intuitively expect the cancel to run before the use completed but i guess that's not what we're seeing here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Possible Issue with Timeout in Cats-Effect 3.6.x which resulted in Http4s Connection leaks

3 participants