Merge Dev Into Master #171
Merged
* wip load/store
* created & parameterized bundles
* fix config ex rs1
* optimize loopconv & loopmatmul with bundles; add bundles for preload and compute
* move assignments to pipeline output

Co-authored-by: Ruohan Yan <yrh@a5.Millennium.Berkeley.EDU>
Added new convenience scripts and config files
By default, there will be just one TLB shared by both the read and write DMAs
* Support single-porting accumulator through the use of accumulator "sub-banks"
* Support clock-gating Gemmini modules
* Support sharing SPAD/ACC between Int8 and FP gemminis
* Reduce bitwidths of loop unroller multipliers and adders
* Fix error where small portion of scratchpad was unusable when double-buffering in the loop unrollers

When single-porting the accumulator banks, input-dilated convs will sometimes fail because they keep writing to the same accumulator banks. A different write pattern will have to be found eventually for those cases, but that's outside the scope of this PR.
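The bank-conflict concern above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Gemmini's actual design: it assumes each single-ported sub-bank serves one access per cycle, that an address maps to a sub-bank by `addr % n_sub_banks`, and that a dilated access pattern shows up as a strided write stream. The names `sub_bank` and `count_conflicts` are hypothetical.

```python
def sub_bank(addr, n_sub_banks):
    # ASSUMPTION: low-order interleaving; the real mapping may differ.
    return addr % n_sub_banks


def count_conflicts(reads, writes, n_sub_banks):
    """Count cycles in which the read and the write target the same sub-bank.

    reads[i] and writes[i] are the addresses accessed in cycle i.
    A single-ported sub-bank cannot serve both in the same cycle.
    """
    return sum(
        1
        for r, w in zip(reads, writes)
        if sub_bank(r, n_sub_banks) == sub_bank(w, n_sub_banks)
    )


N_SUB_BANKS = 2
READS = [0, 1, 2, 3]            # sequential reads
DENSE_WRITES = [1, 2, 3, 4]     # sequential writes, offset by one row
STRIDED_WRITES = [0, 2, 4, 6]   # stride-2 writes always hit sub-bank 0

dense_conflicts = count_conflicts(READS, DENSE_WRITES, N_SUB_BANKS)
strided_conflicts = count_conflicts(READS, STRIDED_WRITES, N_SUB_BANKS)
```

In this toy model the offset sequential streams alternate sub-banks and never collide, while the strided write stream stays in one sub-bank and collides with every read that lands there.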
Also, make it easier to pipeline reduction trees by pipelining tiles rather than PEs
We add a new experimental "pixel_repeats" feature to optimize conv layers with few input channels (like the first layer of most CNNs).
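The commit message does not spell out how "pixel_repeats" works, so the following Python arithmetic is a sketch under an assumption: that the benefit comes from packing several pixels into one scratchpad row when the input-channel count is far smaller than the array dimension. The function names and the example dimension of 16 are hypothetical and not taken from the PR.

```python
def max_pixels_per_row(in_channels, dim):
    """How many pixels fit in one row of `dim` elements (at least one)."""
    return max(1, dim // in_channels)


def row_utilization(in_channels, dim, pixels_per_row=1):
    """Fraction of a `dim`-wide row filled by `pixels_per_row` pixels."""
    used = in_channels * pixels_per_row
    if used > dim:
        raise ValueError("pixels do not fit in one row")
    return used / dim


DIM = 16          # ASSUMPTION: example array/row width
IN_CHANNELS = 3   # e.g. an RGB first layer

unpacked = row_utilization(IN_CHANNELS, DIM)
packed = row_utilization(IN_CHANNELS, DIM, max_pixels_per_row(IN_CHANNELS, DIM))
```

Under these assumptions, a 3-channel input fills 3 of 16 elements per row when unpacked and 15 of 16 when five pixels share a row.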
* Reduce a few bitwidths in ReservationStation.scala
Fixes bug where TLB hits were being counted incorrectly. Prior to this PR, we were using RegNext(io.req.fire()) to match TLB requests to TLB responses. However, we made our interface to the TLB combinational months ago, so the RegNext is no longer necessary (and is actually incorrect).
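The mismatch described above can be reproduced with a cycle-level Python model. This is a sketch, not the Chisel code from the PR: `count_hits`, `REQ_FIRE`, and `RESP_HIT` are hypothetical names, and the traces are made up for illustration. `delay=1` models sampling the hit signal one cycle after the request fires (the effect of `RegNext(io.req.fire())`), while `delay=0` models a combinational TLB whose response is valid in the same cycle as the request.

```python
def count_hits(req_fire, resp_hit, delay):
    """Count TLB hits, sampling resp_hit `delay` cycles after each fired request.

    req_fire[t] is True when a request fires in cycle t.
    resp_hit[t] is the combinational hit signal visible in cycle t.
    """
    hits = 0
    for cycle, fired in enumerate(req_fire):
        sample = cycle + delay
        if fired and sample < len(resp_hit) and resp_hit[sample]:
            hits += 1
    return hits


# Hypothetical five-cycle trace.
REQ_FIRE = [True, False, True, True, False]
RESP_HIT = [True, False, False, True, False]

correct_hits = count_hits(REQ_FIRE, RESP_HIT, delay=0)  # same-cycle match
stale_hits = count_hits(REQ_FIRE, RESP_HIT, delay=1)    # one-cycle-late match
```

With this trace the same-cycle count is 2, while the delayed count is 1: the delayed version misses the hit in cycle 0 and credits the cycle-2 request with the hit that belongs to the cycle-3 request.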
…ementations (#169) Rename tiled_conv_A_stride to tiled_conv and clean up unused conv implementations