[OpenPOWER-HDL-Cores] introducing libre-soc: nmigen hybrid cpu-vpu-gpu

Luke Kenneth Casson Leighton lkcl at lkcl.net
Fri Mar 27 13:40:15 UTC 2020


On Fri, Mar 27, 2020 at 12:46 PM Jeffrey Scheel <scheel at us.ibm.com> wrote:
>
> FWIW, we expect that a parseable version of the ISA will be available in the future.  While just having such a form will allow for mapping to any format, if you have a strong argument for a particular form (json, XML, ...), I'd be interested in understanding.
> -Jeff

that's fantastic to hear, jeffrey.  back in 1999 i wrote an SMB / CIFS
parser that literally took the text version of the CIFS specification
from Microsoft - unmodified - as-is, and, searching for packet formats
(which came originally from c structs in Microsoft's own source code)
spewed out auto-generated c code for parsing network packets.  i think
it took me under two weeks to get a working client and basic server
operational.

XML is, generally, an absolute pig for humans to read.  it's truly
dreadful, and hurts the eyes.  additionally, the libraries that parse
it, whilst they exist and are functional, are really quite large.  not
only that but its "support" for character sets includes HTML
escape-sequences when tags are in quotes, and not otherwise, oh but
some still are, but in a different format from those in quotes.  this
generally does people's heads in, and you need special text-editors to
view (and change) them.

JSON... it's popular, and has significant cross-platform adoption.
again however there tends to be a habit of not formatting it for human
consumption: everything on a "single line" rather than formatted in an
indented fashion with newlines and line-breaks.

both therefore have strong disadvantages when it comes to putting
revisions of the document into git repositories (and if IBM doesn't do
that, other people *will* do it, including us).

git diffs, git commits, these are all completely messed-up by a common
failure to add human-friendly line-breaks: XML is particularly bad as
the lines tend to include the start-end tags (which get re-indented
whenever the *parent* tag changes) and, critically, tend to wrap
well beyond 80 characters (an acceptable and extremely important
lowest-common-denominator for online collaborative contributions: see
linus torvalds' comments:
https://www.linuxjournal.com/content/line-length-limits).
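
as a rough illustration of the diff problem (a minimal sketch: the
record and its field names are made up for the example, not taken from
any ISA file), python's standard json and difflib modules can show how
much of a file a one-field change touches when the JSON is written on a
single line versus one key per line:

```python
import difflib
import json

# hypothetical record for one instruction: field names are illustrative
old = {"opcode": 58, "form": "DS", "mnemonic": "ld", "unit": "LDST"}
new = dict(old, unit="LDST64")


def compact(record):
    # everything on a single line: any change rewrites the whole line
    return json.dumps(record, separators=(",", ":"))


def friendly(record):
    # one key per line, stable key order: a change touches one line only
    return json.dumps(record, indent=2, sort_keys=True)


def removed_lines(before, after):
    # lines that a unified diff marks as removed ("-" prefix),
    # skipping the "---" file header line
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="")
    return [l for l in diff if l.startswith("-") and not l.startswith("---")]


if __name__ == "__main__":
    print(removed_lines(compact(old), compact(new)))
    print(removed_lines(friendly(old), friendly(new)))
```

with the compact form the single removed line is the entire record;
with the indented form it is only the line holding the changed key.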

i'm a huge fan of dead-simple file-formats.  CSV is so trivial that
even Excel has had support for it for decades.  in combination with
markdown we got the ISA Tables into human-readable *and*
machine-parseable form in around a day:

https://libre-riscv.org/openpower/isatables/
https://git.libre-riscv.org/?p=libreriscv.git;a=tree;f=openpower;h=0e88580e82bb0e7b8df9cec850e43eae1bca27d2;hb=HEAD
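
a minimal sketch of why CSV plus markdown is so little work, using only
python's standard csv module.  the column names and rows below are
assumptions made up for the example, not copied from the isatables
files, and load_table / to_markdown are hypothetical helper names:

```python
import csv
import io

# illustrative rows only: column names and values are assumptions
SAMPLE = """opcode,unit,form,comment
14,ALU,D,addi
32,LDST,D,lwz
58,LDST,DS,ld
"""


def load_table(text):
    # DictReader takes the first row as the column names
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["opcode"]): row for row in reader}


def to_markdown(text):
    # same data, rendered as a markdown table for human readers
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    lines = ["| " + " | ".join(header) + " |",
             "|" + "|".join(" --- " for _ in header) + "|"]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)


if __name__ == "__main__":
    print(load_table(SAMPLE)[58]["comment"])
    print(to_markdown(SAMPLE))
```

the same file therefore serves both the decoder (via load_table) and
the documentation (via to_markdown) without a second copy of the data.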

fields.text was particularly important, however here i had to "invent" a format:
https://git.libre-riscv.org/?p=libreriscv.git;a=blob;f=openpower/isatables/fields.text;h=e651b830b2d5af0f0ec4800daf6d10057f0d15bd;hb=HEAD

where the vertical bars line up, that indicates in that column a
subdivision between fields.  the code that "recognises" this format is
here:
https://git.libre-riscv.org/?p=soc.git;a=blob;f=src/soc/decoder/power_fields.py;h=36ad6d68905db9cb854221ca036f8cf930fed3b9;hb=9d09af12cdf661cd8ddb24853be8d11f6439c0f7

it's... obtuse (written very rapidly), uncommented (for which i
apologise profusely), but "does the job".
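
the idea of "bars that line up" can be sketched in a few lines.  this is
not the power_fields.py code: it assumes a simplified two-line layout
(one line of starting bit numbers, one line of field names, bars in the
same character columns on both) and the sample lines are typed in for
the example rather than copied from fields.text:

```python
# assumed layout: bit numbers on one line, field names on the next,
# with the vertical bars in identical character columns
BITS_LINE = "|0     |6       |30 |31 |"
NAME_LINE = "| OPCD |   LI   | AA| LK|"


def bar_columns(line):
    # character positions of every vertical bar in the line
    return [i for i, ch in enumerate(line) if ch == "|"]


def parse_form(bits_line, name_line, width=32):
    cols = bar_columns(bits_line)
    if bar_columns(name_line) != cols:
        raise ValueError("vertical bars do not line up")
    spans = list(zip(cols, cols[1:]))
    # int() tolerates the padding spaces around each bit number
    starts = [int(bits_line[l + 1:r]) for l, r in spans]
    names = [name_line[l + 1:r].strip() for l, r in spans]
    # a field ends one bit before the next one starts
    ends = [s - 1 for s in starts[1:]] + [width - 1]
    return {n: (s, e) for n, s, e in zip(names, starts, ends)}


if __name__ == "__main__":
    print(parse_form(BITS_LINE, NAME_LINE))
```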

unfortunately i cannot think of an ideal format - JSON, XML, or other
- which would "properly" do justice to the multi-column "spanning"
necessary to effectively capture this information, with the possible
exception of HTML "Tables" (with colspan).  not even markdown tables
will properly capture the subtle inter-row relationships needed, and i
am a *really big* fan of markdown.

at least HTML tables would capture the left-right alignment (in each
cell). however its free-form nature and lack of focus tends to again
make me extremely wary of recommending it.


the only other format that i can think of that would properly express
these tables (and still respect the implicit requirements that come
from use of git and the collaboration that comes with git patches,
and is reasonably machine-readable) is: latex.

you can see from these examples:
https://github.com/riscv/riscv-isa-manual/blob/master/src/c.tex#L374
https://github.com/riscv/riscv-isa-manual/blob/master/src/c.tex#L1225

it is pretty obvious in that source what's a table, what's a heading, what's a
line-break and what's a column.  the use of backslash for symbols is
initially pretty obtuse (but at least it's not &#xgobbledeook;)
however the fact that it is a (30?) year old scientific standard and
the fact that its output was *specifically* written by someone who
went all-out on being able to present scientific formulae in a
typographically-correct way makes its use for the purpose of writing
and collaboratively maintaining standards particularly compelling.
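
to give a feel for how regular that markup is, here is a deliberately
naive sketch that pulls rows, cells and \multicolumn spans out of a
tabular environment with regular expressions.  it is not a real latex
parser (no nested braces, no macros), the sample table is invented for
the example rather than copied from the RISC-V manual, and
parse_tabular is a hypothetical name:

```python
import re

# invented fragment in the general style of an instruction-format table
SAMPLE = r"""
\begin{tabular}{|c|c|c|}
\hline
\multicolumn{2}{|c|}{opcode} & rd \\
\hline
funct3 & rs1 & rd \\
\hline
\end{tabular}
"""

# \multicolumn{span}{colspec}{text}, without nested braces
MULTICOL = re.compile(r"\\multicolumn\{(\d+)\}\{[^{}]*\}\{([^{}]*)\}")
TABULAR = re.compile(
    r"\\begin\{tabular\}\{[^{}]*\}(.*?)\\end\{tabular\}", re.S)


def parse_tabular(src):
    # returns a list of rows, each row a list of (text, colspan) pairs
    body = TABULAR.search(src).group(1).replace(r"\hline", "")
    rows = []
    for raw in body.split(r"\\"):      # "\\" ends a table row
        if not raw.strip():
            continue
        cells = []
        for cell in raw.split("&"):    # "&" separates columns
            cell = cell.strip()
            m = MULTICOL.fullmatch(cell)
            cells.append((m.group(2), int(m.group(1))) if m else (cell, 1))
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in parse_tabular(SAMPLE):
        print(row)
```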

i did give serious consideration to writing a latex table parser in
python, for auto-generation purposes, however at least five, possibly
as many as ten, people have beaten me to it:
https://www.google.com/search?q=python+latex+parser

even perl has half a dozen latex parsers!
https://www.google.com/search?q=perl+latex+parser

this library in particular can *write* latex-formatted documents:
https://github.com/alvinwan/TexSoup

conceivably therefore, it may be possible for you to *use* that
library to perform a near-automated (or completely automated)
conversion from the (current) proprietary XML format *into* latex with
very little work... oh and not introduce any critical typographical or
transliteration errors in the process.
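
purely as a sketch of the shape such a conversion could take: the
proprietary XML layout is not known here, so the element and attribute
names below (form, field, name, bits) are invented, and the code uses
python's standard xml.etree module rather than TexSoup so that it runs
without extra packages:

```python
import xml.etree.ElementTree as ET

# invented schema: the real XML format is not public and will differ
SAMPLE = """
<form name="DS">
  <field name="OPCD" bits="6"/>
  <field name="RT" bits="5"/>
  <field name="RA" bits="5"/>
  <field name="DS" bits="14"/>
  <field name="XO" bits="2"/>
</form>
"""

# a few characters that latex treats specially in running text
LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "#": r"\#", "_": r"\_"}


def tex_escape(text):
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def form_to_latex(xml_text):
    fields = ET.fromstring(xml_text).findall("field")
    # starting bit of each field = sum of the widths before it
    starts, pos = [], 0
    for f in fields:
        starts.append(str(pos))
        pos += int(f.get("bits"))
    names = [tex_escape(f.get("name")) for f in fields]
    return "\n".join([
        r"\begin{tabular}{|" + "c|" * len(fields) + "}",
        r"\hline",
        " & ".join(starts) + r" \\",
        r"\hline",
        " & ".join(names) + r" \\",
        r"\hline",
        r"\end{tabular}",
    ])


if __name__ == "__main__":
    print(form_to_latex(SAMPLE))
```

because nothing is retyped by hand, field names and bit positions in
the output can only be wrong if they were wrong in the input.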

from there, when it comes to actual editing, i am a huge fan of
texstudio (particularly on hi-res screens, 2600x1600 and 3840x2160).
left-hand-side for editing, convenient menus for when you forget (or
don't want to remember) the various different voodoo-incantations, and
press F6 and it's *immediately* turned into a PDF document that is
embedded on the RHS.

i would offer to help collaborate on writing an XML-to-latex converter
unconditionally, if it weren't for the ridiculously tight deadlines
we're under.  i *might* be able to at least get you started (as long
as there's no expectation that i should register an account on github
to do so), particularly if other people are willing to help pitch in
on this?

it's in everyone's best interests because writing parsers using
by-hand transliteration of content from PDF or other documents is an
absolute sure-fire way both to end up numb from the neck down (and
up) and to introduce inconsistency errors between implementations.

ultimately, i see no reason why we shouldn't actually turn the
(extremely well-written) pseudo-code snippets into *actual code* for
emulation and actual HDL purposes.  example which is brain-dead-easy
to turn into actual computer-executable code:

if RA = 0 then b <= 0
else           b <= (RA)
EA <= b + EXTS(DS || 0b00)
RT <= MEM(EA, 8)
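
a minimal python sketch of how directly that maps onto executable code.
the helper names (exts, mem_read, ld), the dict-based memory model and
the big-endian byte order are assumptions made for the example, not
part of the spec text quoted above:

```python
MASK64 = (1 << 64) - 1


def exts(value, bits):
    # EXTS: sign-extend a `bits`-wide value into a python int
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def mem_read(mem, ea, nbytes):
    # MEM(EA, n): `mem` is a dict of address -> byte (an assumption),
    # missing bytes read as zero, byte order assumed big-endian
    data = bytes(mem.get(ea + i, 0) for i in range(nbytes))
    return int.from_bytes(data, "big")


def ld(regs, mem, RT, RA, DS):
    # if RA = 0 then b <= 0 else b <= (RA)
    b = 0 if RA == 0 else regs[RA]
    # EA <= b + EXTS(DS || 0b00): DS is 14 bits, so DS||0b00 is 16 bits
    ea = (b + exts(DS << 2, 16)) & MASK64
    # RT <= MEM(EA, 8)
    regs[RT] = mem_read(mem, ea, 8)


if __name__ == "__main__":
    regs = [0] * 32
    regs[1] = 0x1000
    mem = {0x1008 + i: i + 1 for i in range(8)}
    ld(regs, mem, RT=3, RA=1, DS=2)
    print(hex(regs[3]))
```

each line of the pseudo-code becomes one line of python, which is what
makes mechanical translation (rather than hand-written code) plausible.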

for things like the FP section (4.6.8 page 167 book I 3.0B), i would
*particularly* like to transliterate the FP Compare example
pseudo-code directly into actual code, thus *guaranteeing* that
we have a standards-compliant implementation.

yes, really!

we haven't got time or manpower to spend on manual by-rote
hand-written transliteration of critically-important parts of the
POWER Spec.  *anything* that saves us time and gives us accuracy is
*really* important.

l.

