Skip to content

[herd] Global count of useful hardware update#1733

Open
maranget wants to merge 5 commits intomasterfrom
global-hwupdates
Open

[herd] Global count of useful hardware update#1733
maranget wants to merge 5 commits intomasterfrom
global-hwupdates

Conversation

@maranget
Copy link
Copy Markdown
Member

@maranget maranget commented Mar 2, 2026

This PR is an evolution of PR #1730. The number of useful hardware updates is evaluated by a global scan of events at the very end of execution candidate construction. The scan counts the number of explicit write effects to locations that can be page table entries of values whose AF flag can be zero.

The technique sounds more general and should yield a better upper bound than the local technique of PR #1730.

Additionally , we perform the simple identification of the case of a failing CAS that nevertheless writes the value read from memory into memory again. In such a case, there is no need to count another hardware update.

@maranget maranget marked this pull request as ready for review March 30, 2026 08:15
maranget added 5 commits April 1, 2026 10:51
The function [delay_kont] can be used to extract the value returned
by a monad (second argument below, type `'a`).
```
  val delay_kont : string -> 'a t -> ('a ->  'a t -> 'b t) -> 'b t
```

The continuation function `(fun v mv -> ... )` can then examine
the returned value `v` and combine the monad `mv` independently,
which proves very convenient in many occasion.

The change performed by this commit permits affine (_i.e._ one or zero
effective occurrence), while linear usage (exactly one occurrence) was
mandatory before.
  + Efficient group function: sort, then group.
  + Suffix based generators:
     - generate all suffixes,
     - cross product of suffixes.
When TTHM=HA or TTHM=HD are active, HW update of the AF flag
is performed. This include the so-called "spurious" updates
that are performed independently of test code.

For efficiency reason we limit the number of such spurious
updates to what is necessary. We do so by a global
scan of the execution candidates  counting the
writes that may unset the AF flag in the final
set of effects. Notice that we also consider the
initial writes in this scan.

We perform one optimisation: by exception, when a
write effect value is the same as the value
read by the same instruction from the same location,
there is no need to add a supplementary spurious update
as the (potential) update associated to the write that
stored the value has already been counted and is sufficient.
maranget added a commit that referenced this pull request Apr 1, 2026
As introduced by PR #1733 each (potential) spurious update
of the AF flag is "motivated" by a write that can unset this AF flag.

This PR reinforces the link by adding an equation that equates
the value written and the value read by the update, resulting
in better performance in some cases.
maranget added a commit that referenced this pull request Apr 1, 2026
As introduced by PR #1733 each (potential) spurious update
of the AF flag is "motivated" by a write that can unset this AF flag.

This PR reinforces the link by adding an equation that equates
the value written and the value read by the update, resulting
in better performance in some cases.
@maranget
Copy link
Copy Markdown
Member Author

maranget commented Apr 2, 2026

Hi @relokin and @HadrienRenaud, do I have your approval for merging?

Copy link
Copy Markdown
Member

@relokin relokin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me, thanks Luc!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants