[OpenPOWER-HDL-Cores] [Libre-soc-dev] microwatt / libresoc dcache

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat May 8 10:39:42 UTC 2021



On Fri, May 7, 2021 at 6:47 AM Paul Mackerras <paulus at ozlabs.org> wrote:

> The other point, which you don't seem to have taken in yet, is that
> this is NOT the critical path.  There is no point getting the data out
> substantially before the hit_way is known, and for the sake of timing,
> that has a register (r1.hit_way) in the path.  So r1.hit_way is not
> valid until cycle 2 (counting cycle 0 as the one where the address is
> presented to the dcache).

so my first instincts were:

* i am advocating setting up everything that's "input" to writeback_control
  as a separate variable (combinatorially written to)
* all of dcache_request which calcs req_hit_way which goes in r1.hit_way
  is combinatorial, agreed.
* r1.hit_way is used to index cache_out, so it would be bad to make this
  combinatorial as well:  data_out := cache_out(r1.hit_way);

https://github.com/antonblanchard/microwatt/blob/master/dcache.vhdl#L1181

but then i noticed that in dcache_fast_hit, r1.hit_way is set up inside a
rising_edge() block.

so the capture of req_hit_way at cycle 2 (using the definition above, where
cycle 0 is the address) would still be in that rising_edge() block in
dcache_fast_hit.

except... what i am effectively saying is that req_hit_way would
combinatorially propagate through to writeback_control (the two paths now
being connected through a proposed alternative data structure), and that
would be bad.

yep, agree with your assessment, paul, i'm all caught up now.
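
to make the timing concrete, here is a toy cycle-level model in python. it is
only a sketch and NOT the microwatt dcache: the names req_hit_way, r1.hit_way
and cache_out mirror the ones in this thread, but the class, the r0_addr
address register and the two-way layout are assumptions made purely for
illustration. it shows that because the way comparison is combinatorial but
its result is captured on a clock edge, the data selected by r1.hit_way only
becomes visible in cycle 2.

```python
# toy cycle-level model, NOT the real dcache.vhdl. everything here apart
# from the names req_hit_way / r1_hit_way / data_out is hypothetical.

class ToyDCache:
    def __init__(self, tags, data):
        self.tags = tags          # tags[way] -> address held in that way
        self.data = data          # data[way] -> word held in that way
        self.r0_addr = None       # assumed address register (set at end of cycle 0)
        self.r1_hit_way = None    # registered hit way (set at end of cycle 1)

    def req_hit_way(self):
        """combinatorial: compare the registered address against every tag."""
        for way, tag in enumerate(self.tags):
            if tag == self.r0_addr:
                return way
        return None

    def data_out(self):
        """combinatorial read of the data, indexed by the *registered* way."""
        if self.r1_hit_way is None:
            return None
        return self.data[self.r1_hit_way]

    def rising_edge(self, addr_in):
        """clock edge: every register samples the values from before the edge."""
        self.r1_hit_way = self.req_hit_way()   # uses the old r0_addr
        self.r0_addr = addr_in


def trace(addr):
    """present addr in cycle 0 and record what is visible in cycles 0..2."""
    dc = ToyDCache(tags=[0x100, 0x200], data=["A", "B"])
    seen = []
    for cycle in range(3):
        seen.append((cycle, dc.r1_hit_way, dc.data_out()))
        dc.rising_edge(addr if cycle == 0 else None)
    return seen


if __name__ == "__main__":
    for row in trace(0x200):
        print(row)
```

if data_out() indexed with req_hit_way() directly instead of r1_hit_way, the
tag compare and the data select would sit in one combinatorial chain, which is
the path that the register is there to break.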

the solution that i have seen to this, used by intel, is multi-level PTE
caches: an 8-entry single-cycle level, followed by (guessing) a 256-entry
two-cycle level, followed by (guessing) a 4k-entry three-cycle level.
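
as a rough illustration of the multi-level idea, here is a small python
sketch. the sizes and latencies are the (partly guessed) figures above, and
everything else is an assumption: the LRU replacement, the fill-on-miss
policy, the 20-cycle cost of the fallback walk, and the names Level,
make_levels and lookup are all hypothetical and not taken from any real
design.

```python
from collections import OrderedDict

class Level:
    """one cache level: small LRU store with a fixed hit latency in cycles."""
    def __init__(self, capacity, latency):
        self.capacity = capacity
        self.latency = latency
        self.entries = OrderedDict()   # oldest entry first

    def lookup(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            return self.entries[key]
        return None

    def fill(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used


def make_levels():
    # 8-entry 1-cycle, 256-entry 2-cycle, 4k-entry 3-cycle (last two guessed)
    return [Level(8, 1), Level(256, 2), Level(4096, 3)]


def lookup(levels, key, walk, walk_cycles=20):
    """return (value, cycles). walk_cycles is an arbitrary assumed miss cost."""
    for i, level in enumerate(levels):
        value = level.lookup(key)
        if value is not None:
            for upper in levels[:i]:           # promote into the faster levels
                upper.fill(key, value)
            return value, level.latency
    value = walk(key)                          # missed everywhere: slow path
    for level in levels:
        level.fill(key, value)
    return value, walk_cycles


if __name__ == "__main__":
    levels = make_levels()
    walk = lambda k: k + 0x1000                # stand-in for a real table walk
    print(lookup(levels, 5, walk))             # first access: slow path
    print(lookup(levels, 5, walk))             # second access: fastest level
```

the point is that the hot entries get the single-cycle path, and only the
less frequently used ones pay for the larger, slower lookup.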

l.

