[OpenPOWER-HDL-Cores] Fwd: [RFC] SVP64 Branches (contd)

lkcl luke.leighton at gmail.com
Sun Sep 12 13:27:46 UTC 2021


https://libre-soc.org/openpower/sv/branches/

the number of modes available in SVP64 Branches is considerable and
they have significant cross-interaction.  SVP64 Branch-Conditional is
also a strategically critical instruction for High Performance Vector
Supercomputing, being quite literally the fundamental basis of
critical inner loops.

it therefore needs extremely thorough review. this is a request for
help in carrying out a public review. all public comments from any
source are welcome, and discussion is to take place on the
libre-soc-dev public mailing list.

fundamentally, as usual with SVP64, there is the "Base Scalar v3.0B"
and there is the Vectorisation Prefix.  normally there is a hard rule
set which prohibits SVP64 from deviating from underlying Scalar v3.0B
behaviour: in this particular case however there is good reason to
consider doing so.

however given how critically important it is that SVP64 *NOT* alter
Scalar v3.0B in any way, or be *misconstrued* as altering Scalar v3.0B
in any way, we consider SVP64 Branch-Conditional to be completely
separate instructions.

a quick summary of the modes available:

* there are the usual three bits altering the base scalar v3.0B
behaviour (BO[0] to BO[2])
* predication interacts closely with the Condition Test
* the opportunity to make LR only update if the branch also takes
place seems to have been overlooked in Scalar v3.0B: this is added as
an SVP64-only option
* Horizontal and Vertical Vector Modes slightly alter the behaviour
(ALL mode is not relevant to Vertical-First)
* Horizontal ALL or ANY testing combined with BO[1] results in AND,
OR, NAND or NOR of Condition Tests
* CTR Mode has four separate *additional* sub-modes
* Vector Truncation to the Branch Point is also optional.
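
as a rough illustration of the ALL/ANY item above, here is a minimal
sketch of how the four combinations could fall out. it is not taken
from the spec: branch_decision, cr_bits, all_mode and bo1 are names
made up for this example, the per-element test is assumed to be the
same "CR bit equals BO[1]" comparison that scalar v3.0B uses, and
predication, BO[0], CTR Mode and the other options are ignored.

```python
def branch_decision(cr_bits, all_mode, bo1):
    """Hypothetical reduction of per-element Condition Tests.

    cr_bits:  one CR bit (0 or 1) per vector element
    all_mode: True for ALL testing, False for ANY testing
    bo1:      value of BO[1], the CR bit value that counts as a pass
    """
    # per-element test: an element passes when its CR bit equals BO[1]
    passed = [bit == bo1 for bit in cr_bits]
    # ALL needs every element to pass, ANY needs at least one
    return all(passed) if all_mode else any(passed)


cr_bits = [1, 1, 0]  # made-up CR bits for a 3-element vector

and_of_bits = branch_decision(cr_bits, all_mode=True, bo1=1)    # every bit set
or_of_bits = branch_decision(cr_bits, all_mode=False, bo1=1)    # some bit set
nor_of_bits = branch_decision(cr_bits, all_mode=True, bo1=0)    # every bit clear
nand_of_bits = branch_decision(cr_bits, all_mode=False, bo1=0)  # some bit clear
```

under those assumptions ALL with BO[1]=1 behaves as AND, ANY with
BO[1]=1 as OR, ALL with BO[1]=0 as NOR and ANY with BO[1]=0 as NAND
of the CR bits. the actual pseudocode on the wiki page is the
reference, not this.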

this brings the total combined number of options to somewhere around
2^8 (256 possible behaviours) which is far beyond anything i have ever
seen in any Vector Supercomputing or 3D GPU ISA of the past 50 years.

interestingly, much of this comprehensiveness is down to the fact that
Scalar v3.0B Branches are themselves quite comprehensive (CTR Mode).
without CTR Mode, SVP64 Conditional Branches would be drastically
reduced in functionality and usefulness for Supercomputing purposes.

given that the Power ISA has a reputation as a long-term stable ISA we
would clearly like, and expect, that to continue.

therefore proper and thorough review with proper feedback and open
discussion even at an early stage is critical.

SVP64 is an *extremely* comprehensive ISA that takes considerable
prior knowledge of 3D GPU and Vector Supercomputer ISAs of the past 50
years to appreciate why it is the way that it is.

* Mitch Alsup's MyISA 66000, a comparative peer, has been in draft
form for a similar timeframe (over 3 years).
* the author of MRISC32 has been developing the MRISC Vector
Processing ISA for over 18 months and is still catching up with modern
and historic Vector Processing techniques and background.

the absolute last thing anyone needs is a last minute scramble to gain
sufficient working knowledge in order to be able to assess SVP64 as
part of a formal OPF ISA WG RFC.  based on how long it has taken to
develop, this will be flat-out impractical.

given that SVP64 has taken over 3 years to develop (so far), working
knowledge of 3D GPU ISAs such as Broadcom VideoCore IV, MALI Midgard,
Vivante, AMDGPU and Intel GMA, as well as Vector Processing ISAs such
as Cray, NEC SX Aurora, RVV and Mitch Alsup's MyISA 66000, is
absolutely essential.

i therefore cannot emphasise enough how critical it is that
OpenPOWER Foundation Members and Power ISA Hardware Engineers be
actively involved in SVP64 development.

NLnet funding is available and it is also possible to apply to
StandICT.eu for additional Horizon 2023 grant funding.  review
assistance therefore need not be unpaid work.

l.

