[OpenPOWER-HDL-Cores] load/store conditional
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Tue May 25 20:15:43 UTC 2021
On Tuesday, May 25, 2021, Jacob Lifshay <programmerjake at gmail.com> wrote:
> RISC-V has an architectural forward-progress guarantee for the equivalent
> loop, suggesting that a cpu implementation prevents live-lock by blocking
> other cpus from taking the cache block associated with a reservation for a
> few cycles (enough for at least 16 simple integer instructions), giving the
> cpu enough time to get to the store-conditional and successfully store.
they call it "constraints".
the RV spec states that anything that falls outside of these constraints
*requires* a counter on the loop, to avoid live-lock. that in turn
introduces inefficiency (additional instructions inside the critical loop).
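
to make the cost of that counter concrete, here is a minimal sketch in C11, not taken from either spec. it assumes a compiler that lowers the weak compare-exchange to an LR/SC pair (RISC-V) or lwarx/stwcx. (Power), which is typical but not guaranteed. `bounded_atomic_add` and `max_attempts` are hypothetical names for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically add `delta` to `*target`, giving up
 * after `max_attempts` tries instead of spinning forever. */
bool bounded_atomic_add(_Atomic uint32_t *target, uint32_t delta,
                        unsigned max_attempts)
{
    uint32_t seen = atomic_load_explicit(target, memory_order_relaxed);

    /* The counter and its compare are the extra instructions that sit
     * inside the critical loop. */
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        /* The weak form may fail spuriously, e.g. when the reservation
         * is lost. On failure it writes the current value into `seen`. */
        if (atomic_compare_exchange_weak_explicit(
                target, &seen, seen + delta,
                memory_order_acq_rel, memory_order_relaxed)) {
            return true;
        }
    }
    return false; /* caller must fall back, e.g. back off and retry */
}
```

the caller now has to handle the failure case as well, which is more code again outside the loop.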
for a Hybrid GPU dealing with Vulkan Shader data: Jacob, how many atomic
data-structure lock operations per second did you estimate? i can't recall
whether you said 100,000 LR/SC locks per second or 1,000,000.
the bottom line is that live-lock, or even going a few hundred times round
a loop before a lock is acquired, is not viable: that is thousands of cycles
wasted, representing an unacceptably high percentage of CPU time compared
to other standard general-purpose compute workloads.
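
a rough back-of-envelope version of that argument, as a sketch only. the lock rates are the 10^5 to 10^6 per second range discussed in this thread, but the retries per lock, cycles per retry and clock frequency used in the comments are purely illustrative assumptions, and `retry_overhead_fraction` is a hypothetical name.

```c
/* Fraction of one core's cycles spent going round failed LR/SC loops.
 * All inputs are assumptions supplied by the caller. */
double retry_overhead_fraction(double locks_per_sec,
                               double retries_per_lock,
                               double cycles_per_retry,
                               double clock_hz)
{
    /* wasted cycles per second, divided by available cycles per second */
    return locks_per_sec * retries_per_lock * cycles_per_retry / clock_hz;
}

/* Example with assumed figures: 200 retries per lock, 10 cycles per
 * retry, 1 GHz clock.
 *   retry_overhead_fraction(1e5, 200.0, 10.0, 1e9)  is about 0.2
 *   retry_overhead_fraction(1e6, 200.0, 10.0, 1e9)  is about 2.0
 * i.e. between a fifth of a core and more than the whole core. */
```

under those assumed figures the retry loops alone would eat anywhere from 20% of a core to more than 100% of it.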
the "constraints", as they are called, which allow for backwards-branches
only, are designed to let implementors cut out any paths that might
involve complexity: branch prediction, speculation, TLBs, illegal
instruction traps. that way, as Jacob says, the job gets done atomically,
all the way from LR to SC, on a single (2 at most) cache-line exclusive
lock-and-block, *guaranteeing* completion in the process, without
disruption of other cores when those cache lines are locked.
now, there *may* have been such optimisations put into IBM POWER
processors, but if there are, they didn't actually end up in the actual
spec.
the RV spec shows, in combination with the extreme use-case of a 3D GPU
(10^5 to 10^6 LR/SC operations per second), that these things are important.
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68