On Sun, 24 Mar 2024 20:43:37 GMT
Post by Scott Lurndal
Post by Bonita Montero
I've got a nice idea for a new processor-extension for spin-wait-loops.
The idea is that a thread of a processor enters a sleep
state if a word in memory is equal to a certain register until
A processor which doesn't own (or have a shared copy of) the
cacheline which would contain that word in memory will never know
if it was modified, as it won't see the invalidate messages in
a directory-based cache subsystem (leaving aside noncachable
accesses to the word in memory, of course).
It seems, I didn't understand the idea.
Of course, the waiting thread/core has the word in question in its
L1D cache when it enters the wait loop.
Of course, it is awakened if/when the word is evicted from the cache
for an unrelated reason, i.e. in practice because of capacity conflicts
caused by the activity of other threads running on the same
core. There is nothing wrong with spurious awakenings as long as they
are rare.
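
For concreteness, here is a minimal software sketch of the semantics as I read them, not of any real instruction. The names wait_while_equal_once and wait_until_changed and the max_polls parameter are made up for illustration. The bounded polling only stands in for a spurious wake-up such as the eviction described above, and the yield stands in for the hardware sleep.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the proposed primitive: returns when `word` no
// longer equals `expected`, or "spuriously" once `max_polls` polls are used up
// (modelling e.g. eviction of the cache line). Returns the last value seen.
inline std::uint32_t wait_while_equal_once(const std::atomic<std::uint32_t>& word,
                                           std::uint32_t expected,
                                           unsigned max_polls = 1000) {
    assert(max_polls > 0);
    std::uint32_t seen = word.load(std::memory_order_acquire);
    for (unsigned i = 0; i < max_polls && seen == expected; ++i) {
        std::this_thread::yield();  // real hardware would sleep here instead
        seen = word.load(std::memory_order_acquire);
    }
    return seen;
}

// Caller side: a spurious wake-up is harmless because the condition is
// rechecked and the wait is simply re-entered.
inline std::uint32_t wait_until_changed(const std::atomic<std::uint32_t>& word,
                                        std::uint32_t expected) {
    std::uint32_t seen;
    do {
        seen = wait_while_equal_once(word, expected);
    } while (seen == expected);
    return seen;
}
```

The only cost of a spurious return is one extra trip around the outer loop, which is why rare spurious awakenings are acceptable.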
Post by Scott Lurndal
This sounds like a solution to a problem that doesn't exist,
and there would be no incentive for a processor designer
to include the substantial additional complexity required
to support your feature.
The problem does exist, and the primitive proposed by Bonita is not new.
It is a minor modification of MONITOR/MWAIT.
On current Intel and AMD processors this sort of thing is
relatively unattractive: with 2 threads per core and measurable
throughput gains from running 2 threads instead of
one (for AMD up to 30%, for Intel a little less, but often measurable),
each thread is a valuable resource, so you don't really want to keep it
paused for too long. And the whole point of Bonita's amendment to the
existing mechanism is that software gets more control over long waits.
IBM POWER and a few of the Sun/Oracle chips have up to 8 threads
per core, so each thread is not that valuable. That means a longer
uninterrupted wait makes more sense, and control over the duration of
the timeout is more desirable. But maybe IBM's and Oracle's variants of
MWAIT already have it?
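
To make "control over the duration of the timeout" concrete, here is a rough sketch of what a software-visible timeout could look like. It is emulated with a polling loop and is not an existing instruction or API: wait_while_equal_for and WakeReason are hypothetical names, and the yield again stands in for the hardware sleep.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical result type: why the wait ended.
enum class WakeReason { ValueChanged, TimedOut };

// Emulates "sleep while word == expected, but no longer than `timeout`".
// The caller picks the timeout, e.g. short on a 2-thread core and longer
// on an 8-thread core.
inline WakeReason wait_while_equal_for(const std::atomic<std::uint32_t>& word,
                                       std::uint32_t expected,
                                       std::chrono::nanoseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (word.load(std::memory_order_acquire) == expected) {
        if (std::chrono::steady_clock::now() >= deadline)
            return WakeReason::TimedOut;
        std::this_thread::yield();  // real hardware would sleep here instead
    }
    return WakeReason::ValueChanged;
}
```

On TimedOut the caller can decide to fall back to an OS-level block instead of continuing to hold the hardware thread.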