-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
modules/zstd: Add ZstdDecoder top level proc #1315
Conversation
71be328
to
84b31f3
Compare
@lpawelcz can you add some markdown documentation on how to test the implementation against |
is the conflict in |
Would it make sense to add a README.md w/ a known limitation section in the zstd directory to document this? |
We added a
Yes, the conflict was caused by changes introduced with regards to #1308, #1202 and #1204. Fixed with rebase EDIT: |
) | ||
|
||
cc_test( | ||
name = "zstd_dec_cc_test", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this currently PASS
👍 but with the following warning:
WARNING: //xls/modules/zstd:zstd_dec_cc_test: Test execution time (19.1s excluding execution overhead) outside of range for MODERATE tests. Consider setting timeout="short" or size="small".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test execution time will surely change after enabling back the ZstdDecoderSeededTest
. Thanks for pointing that out. We will make sure to set those attributes correctly.
|
||
### Top level Proc | ||
This state machine is responsible for receiving encoded ZSTD frames, buffering the input and passing it to decoder's internal components based on the state of the proc. | ||
The states defined for the processing of ZSTD frame are as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state diagram with https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-diagrams#creating-mermaid-diagrams (instead of image files)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the state diagram of the Top Level Proc
After transmitting all data required for current block, it loops around to the block header decoding state and when next block header is not found it decodes checksum when it was requested in frame header or finishes ZSTD frame decoding and loops around to magic number decoding. | ||
|
||
### ZSTD frame header decoder | ||
This part of the design starts with detecting the ZSTD magic number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/This part of the design/This module/ ? to match with other description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frame header decoding is actually implemented as a series of functions that are called in the top level proc. I'm not sure if module
would be a good word here.
net_zstd dependency is not referenced anywhere in XLS. It was introduced as a transitive dependency of fuzztest. In fuzztest workspace net_zstd is also a transitive dependency of riegel. Riegel version used in fuzztest bazel workspace is WORKSPACE-based so transitive dependencies must be defined in workspaces that use that version of riegel. XLS however uses newer, bzlmod-based version of riegel which does not require that. That means there is no need for @net_zstd dependency in XLS workspace and it can be removed. Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
This dependency was unnecessarily removed in google@5b16a18. It is required for testing the ZSTD Decoder Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
This commit adds an initial implementation of the ZSTD Decoder. It is capable of decoding simple ZSTD frames containing raw and rle blocks. This is a squashed commit that was created from the following changes: modules/zstd: Add buffer library modules/zstd: Add Buffer use-case example modules/zstd: Add library for parsing magic number modules/zstd: Add library for parsing frame header dependency_support/libzstd: Make zstd_errors.h public dependency_support: Add decodecorpus binary modules/zstd: Add data generator library modules/zstd: Add zstd frame header tests modules/zstd: Add common zstd definitions modules/zstd: Add raw block decoder modules/zstd: Add rle block decoder modules/zstd: Add block header parsing library modules/zstd: Add SequenceExecutorPacket to common definitions modules/zstd: Add block data muxer library modules/zstd: Add block demuxer library modules/zstd: Add block decoder module modules/zstd/common: Specify decoder output format examples/ram: Export internal RAM API to other modules modules/zstd: Add Offset type to common zstd definitions modules/zstd: Add RamPrinter Proc modules/zstd: Add SequenceExecutor Proc modules/zstd: Add repacketizer modules/zstd: Add ZSTD decoder modules/zstd: Add ZSTD Decoder documentation CI: Add custom ZSTD module workflow Co-authored-by: Maciej Dudek <mdudek@antmicro.com> Co-authored-by: Pawel Czarnecki <pczarnecki@antmicro.com> Co-authored-by: Robert Winkler <rwinkler@antmicro.com> Co-authored-by: Roman Dobrodii <rdobrodii@antmicro.com> Internal-tag: [#52186] Signed-off-by: Maciej Dudek <mdudek@antmicro.com> Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com> Signed-off-by: Robert Winkler <rwinkler@antmicro.com> Signed-off-by: Roman Dobrodii <rdobrodii@antmicro.com>
5b43fbf
to
76e650a
Compare
|
||
namespace xls::zstd { | ||
|
||
static absl::StatusOr<std::vector<uint8_t>> ReadFileAsRawData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using xls::GetFileContents
?
} | ||
|
||
static std::string CreateNameForGeneratedFile( | ||
absl::Span<std::string> args, std::string_view ext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe:
Nit: span should have const elements.
std::optional<std::string_view> prefix) { | ||
std::string output; | ||
|
||
if (prefix.has_value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
StrCat/StrJoin better
|
||
namespace xls::zstd { | ||
|
||
enum BlockType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
enum class preferred
} | ||
|
||
// Parse input buffer with libzstd and write it as ZSTD_frameHeader | ||
ASSERT_TRUE(!buffer.empty() && buffer.data() != nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
I think this could return an absl::StatusOr and CreateExpectedFrameHeaderResult could be refactored into a matcher, so you'd do something like
EXPECT_THAT(DslxHeaderParse(buffer), Eq(ZstdHeaderParse(buffer)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grebe, you mean instead of using the IrTestBase
helper w/ RunAndExpectEq
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grebe what about keeping the RunAndExpectEq
interface and adding a ZstdFrameHeader
dedicated type that can be used to construct valid test input?
// https://github.com/facebook/zstd/blob/v1.5.6/lib/decompress/zstd_decompress.c#L515 | ||
// Use only in C++ tests when comparing DSLX ZSTD Decoder with libzstd | ||
// Must be in sync with TEST_WINDOW_LOG_MAX_LIBZSTD in frame_header_test.x | ||
const uint64_t TEST_WINDOW_LOG_MAX_LIBZSTD = 30; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
should be constexpr uint32_t kDslxBufferSize...;
similar for other UPPER_CASE below
return window_size <= max_window_size; | ||
} | ||
|
||
void PrintZSTDFrameHeader(ZSTD_frameHeader* fh) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
Looks unused, if used somehow should probably be LOG(...) <<
if (expected_status == FrameHeaderStatus::CORRUPTED || | ||
expected_status == FrameHeaderStatus::UNSUPPORTED_WINDOW_SIZE) { | ||
return Value::Tuple({ | ||
/*window_size:*/ Value(UBits(0, 64)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
should be /window_size=/ and similar later
std::vector<uint8_t> data = packet.element(0).bits().ToBytes(); | ||
absl::StatusOr<uint64_t> len = packet.element(1).bits().ToUint64(); | ||
if (!len.ok()) { | ||
return std::nullopt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
Would be good to log the status first. Is this something that's supposed to happen? A CHECK() could be appropriate too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grebe, what about switching to a StatusOr
instead of an std::optional
?
|
||
bool IsLast() { return last; } | ||
|
||
std::string PrintData() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
This seems like ToString() would be a better name,
} | ||
} | ||
|
||
const char* proc_name = "__zstd_dec__ZstdDecoderTest_0_next"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
Nit: I like constexpr std::string_view for string constants.
|
||
void PrintVector(absl::Span<uint8_t> vec) { | ||
for (int i = 0; i < vec.size(); i += 8) { | ||
std::cout << "0x" << std::hex << std::setw(3) << std::left << i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
Maybe cerr or LOG()?
|
||
if (!events.trace_msgs.empty()) { | ||
for (const auto& tm : events.trace_msgs) { | ||
std::cout << "[TRACE] " << tm.message << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @grebe
Maybe cerr or LOG()?
did you try without |
) | ||
|
||
xls_dslx_verilog( | ||
name = "block_dec_verilog", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious if those are still needed (now that you have a verilog target for the top-level)
# Force proc inlining and set last internal proc as top proc for IR optimization | ||
opt_ir_args = { | ||
"inline_procs": "true", | ||
"top": "__xls_modules_zstd_dec_mux__BlockDecoder__DecoderMux_0_next", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the xls_modules_
prefix required, other target don't seem to have it.
|
||
// Send compressed frame to decoder simulation | ||
for (int i = 0; i < frame.size(); i += 8) { | ||
auto span = absl::MakeSpan(frame.data() + i, 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this could access out-of-bounds memory if frame size is not a multiple of 8.
spawn demux::DecoderDemux(input_r, demux_raw_s, demux_rle_s, demux_cmp_s); | ||
spawn raw::RawBlockDecoder(demux_raw_r, mux_raw_s); | ||
spawn rle::RleBlockDecoder(demux_rle_r, mux_rle_s); | ||
// TODO(lpawelcz): 2023-11-28 change to compressed block decoder proc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use TODO(antmicro)?
Ended up merging the PR internally after addressing @grebe comment on our end, feel free to rebase and send any follow-up changes you may have. |
This PR supersedes #1169
It is a part of #1211.
This PR adds a top level proc for ZSTD Decoder which integrates all its components in order to allow decoding of ZSTD frames. It includes C++ tests which:
decodecorpus
tool valid ZSTD frameslibzstd
andZstdDecoder
NOTE: currently it is possible to decode frames with simple block types as RAW and RLE.