Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimental serializer #7594

Merged
merged 138 commits into from
Aug 11, 2023
Merged

Conversation

TH3CHARLie
Copy link
Contributor

An attempt to add serialization/deserialization to the Halide frontend. Using flatbuffers (https://flatbuffers.dev/) as the binary format.

Work in progress right now, will summarize all unsupported features/design decisions in a bit.

Shout out to @zvookin 's previous attempt (main...halide_ir_flatbuffer) which I take heavy reference from.

Copy link
Contributor

@steven-johnson steven-johnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really nice start! My comments are mostly regarding code style and structure. I suspect @zvookin will have more substantive comments about the Flatbuffer usage, as he has far more experience than I with that.

apps/serdes/Deserializer.h Outdated Show resolved Hide resolved
apps/serdes/Deserializer.h Outdated Show resolved Hide resolved
#include <fstream>
#include <iostream>

std::string Deserializer::deserialize_string(const flatbuffers::String *str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If str can't ever (legally) be null, we should pass it by const-ref rather than const-ptr.

apps/serdes/Deserializer.h Outdated Show resolved Hide resolved
apps/serdes/Deserializer.cpp Outdated Show resolved Hide resolved
apps/serdes/Printer.h Outdated Show resolved Hide resolved
apps/serdes/Serializer.cpp Outdated Show resolved Hide resolved
case Halide::Internal::ForType::GPULane: {
for_type = Halide::Serdes::ForType::ForType_GPULane;
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere: how do we ensure that additions to one of the enums (in either the .fbs or the Halide headers files) are correctly handled? I don't know the answer to this (and I'm not expecting you to have the answer, either), but I hope there exists an answer that is more robust than "hope we don't forget someday".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can alleviate this by removing the default case of the switch statement, and do the error handling outside of the switch. Then whenever we have a new addition to the enums in Halide headers (I think the workflow will be starting with adding to Halide files and then serialization protocol fbs file), the compiler will give an error here because of non-exhaustive enumeration

apps/serdes/halide_ir.fbs Outdated Show resolved Hide resolved
src/Function.h Outdated
@@ -68,6 +68,10 @@ class Function {
/** Construct a Function from an existing FunctionContents pointer. Must be non-null */
explicit Function(const FunctionPtr &);

/** Construct a function from deserializing */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arguments to this need documentation:

  • How is origin_name different from name?
  • How are required_types different from output_types?
  • What kind of strings are ok for args?
  • What happens if I pass empty vectors for any of the arguments?

@TH3CHARLie
Copy link
Contributor Author

@steven-johnson Thanks for the review comments! I'm still merging some code related to function schedules, I'll have a commit later addressing the comments.

Copy link
Contributor

@steven-johnson steven-johnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally LGTM pending green.

@steven-johnson
Copy link
Contributor

Just waiting for @alexreinking to approve since there are pending "requested changes"

@steven-johnson
Copy link
Contributor

Stil waiting for @alexreinking to approve since there are pending "requested changes"

# fail on case-sensitive file systems. We'll try this as a hack workaround:
# just try all three. (Note that the internal CMake library name appears to be
# `flatbuffers` in all cases.)
set(FB_NAME "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using the NAMES argument to find_package? That would simplify this considerably

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...because I didn't know about it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, sometimes I forget the abyss stared back into me

VERBATIM
)
add_custom_target(generate_fb_header DEPENDS "${fb_header}")
set_source_files_properties("${fb_header}" PROPERTIES GENERATED TRUE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Add custom command sets it automatically

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that.


# flatbuffers is small and compiles quickly, but if you want/need to use
# a local version (via find_package), configure with FLATBUFFERS_USE_FETCHCONTENT=OFF
option(FLATBUFFERS_USE_FETCHCONTENT "Enable to download the Flatbuffers library via FetchContent" ON)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole FetchContent thing makes me sad. We'll (I'll) be able to do much better with CMake 3.24+ and its tighter integration between the two methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, an upgrade would give us so much leverage that I think we should poll our users to see if they can tolerate an early upgrade to 3.24.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I can fix the NAMES thing in a follow up. Let's land this.

@steven-johnson steven-johnson merged commit f75f68d into halide:main Aug 11, 2023
3 checks passed
@TH3CHARLie
Copy link
Contributor Author

Thanks everyone for the help!

std::vector<Argument> infer_arguments(const Internal::Stmt &body);

/** Get the Funcs this pipeline outputs. */
std::vector<Func> outputs() const;

/** Get the requirements of this pipeline. */
std::vector<Internal::Stmt> requirements() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably return a const ref to a vector

@@ -184,11 +184,17 @@ class Pipeline {
* outputs. Schedules the Funcs compute_root(). */
Pipeline(const std::vector<Func> &outputs);

/** Make a pipeline from deserialization. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is confusing, because the args are not a "deserialization". A better comment would be something like: "Make a Pipeline with a vector of outputs and a list of requirement assertions. Used by deserialization."

@@ -54,6 +54,9 @@ class ReductionDomain {
* the vector being outermost. */
ReductionDomain(const std::vector<ReductionVariable> &domain);

/** Construct a reduction domain from deserialization */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't very good either

@@ -652,6 +669,10 @@ class StageSchedule {
}
StageSchedule(const StageSchedule &other) = default;
StageSchedule();
StageSchedule(const std::vector<ReductionVariable> &rvars, const std::vector<Split> &splits,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment explaining what this constructor is for


namespace Halide {

void serialize_pipeline(const Pipeline &pipeline, const std::string &filename, std::map<std::string, Internal::Parameter> &params);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Badly needs a long comment explaining serialization.

namespace Halide {

/**
* Deserialize a Halide pipeline from a file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment needs to be longer, explaining in more detail what the map is for.

*/
Pipeline deserialize_pipeline(const std::string &filename, const std::map<std::string, Internal::Parameter> &external_params);

Pipeline deserialize_pipeline(std::istream &in, const std::map<std::string, Internal::Parameter> &external_params);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function needs a doc string

@@ -23,6 +23,11 @@ namespace Internal {

struct ParameterContents;

struct BufferConstraint {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this into the header moved into the public API. It needs documentation that explains what each field is and explains that it's only for serialization.

const Buffer<void> &buffer, int host_alignment, const std::vector<BufferConstraint> &buffer_constraints,
MemoryType memory_type);

Parameter(const Type &t, bool is_buffer, int dimensions, const std::string &name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a doc string. Note that in doxygen output this won't necessarily appear in a way that makes it clear that the comment on the previous constructor also applies.

@abadams
Copy link
Member

abadams commented Aug 15, 2023

If I understand correctly, there's going to be a follow-up PR that adds testing and fixes any issues that adding testing finds. Please also address my review comments above in that follow-up PR.

@TH3CHARLie
Copy link
Contributor Author

I can send a PR today adding the JIT test we talked about in the dev meeting, enclosed in #ifdef blocks. Some docstring are addressed in Derek's recent PR #7760

ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* init

* sync

* single func pipeline round-trip test

* roundtrip test framework completed, single output function tested, no Dag yet

* serialize Stmt, partially done (cuz no support of Expr yet), not fully
tested

* deviceAPI MemoryType ForType

* Expr, with a grain of salt

* fix exprs in stmts

* format everything

* Range

* fix undefined exprs and stmts

* address some review comments:
- proper using
- proper includes
- rename Serdes -> Serialize

* address more review comments
- rename .hlb/.hlr to .hlpipe
- reserve vectors
- proper memory management

* deserialize_expr_vector

* support bound, storageDim, loopLevel and funcSchedule

* Specialization, Definition

* sync commit

* temporarily comment out func mapping stuff to remove blockers

* helper funcs

* call_type and reduction_domain

* ModulusRemainder and VectorReduceOp, some minor refactoring

* prefetch directive

* name mangling and closing on function's odds and ends

* split

* dim

* stage schedule

* tidy

* parameter

* more parameter

* check nullptr and some minor fix

* fix crashing

* func index replacing func ptr during serialization

* extern func arg, some minor cleanup

* replace cerr with halide assert

* buffer??

* remove printer

* fix

* wrappers in func_schedule

* clear func mapping to use serializer for more than 1 pipelines, use unordered_map also

* attempt to move serialization into core, get cmake working for now

* fix

* we maybe don't need submodule

* fix cmake

* make headers work again, with some hacks ofc

* serialization now lives in libHalide

* testing 101

* don't include flatbuffers header in Halide.h

* fix

* namespace adjust

* user_assert

* fix a missing field

* fix missing type info in some exprs

* fix bug in function mapping

* fix function DAG broken issue

* format

* rm cout in cpp files and change test group name

* fix the case func ptr is not defined

* add a missing call type deserialization

* serialize unique parameters

* serialize unique buffers

* fix missing type in parameter

* fix a missing tail stra

* change find_transive_call to build_enviroment to include wrappers in the DAG

* upstream current test strategy, intercept JIT compilation for each pipeline, serdes ronudtrip and back

* make sure buffer memory layout are the same

* don't use ir comparator to compare pipelines, we will use jit tests from now

* don't serialize Parameter's buffer, compute external buffers from Call, Variable and ExternFuncArgument and don't serialize them as well

* fix, 35 tests remaining

* fix output function orders

* reuse jit_externs since we cannot really serialize it, 29 tests to go

* fix that buffer_constraints, host_alignment and memory_type are incorrectly removed, also add missing exact in split

* only use outputs and requirements from deserialized pipeline during testing

* nits

* add missing requirements during deserialization

* restore original pipeline's contents after lowering

* address some review comments

* Install flatbuffers for clang-tidy

* use std::map to make results the same on different compiler

* proper way to handle cropped buffers

* fix cmake build using alex's branch

* try set flatbuffers_DIR explicitly

* case sensitive?

* rename serialization test env var

* cleanup Serialization.cpp

* format

* have halide version embedded in the file identifier

* nits and comments

* format

* try make clang-tidy happy and const a lot of things

* const more things

* support istream input

* nit

* add template function deserialize_vector

* nit

* attempt to integrate serialization test

* line breaks

* remove hack in compile_jit, at least for now

* fix

* add #ifdef guards

* format

* try nolint

* special case two files so clang-tidy will be happy

* Make Flatbuffers-missing error more useful

* Make a few final changes

- change BUILD_SERIALIZATION -> WITH_SERIALIZATION to match other flags better
- fix capitalization of the CMake package (must be `FlatBuffers` for some Linux usage)
- add stub calls to the de/serialization calls when building without Flatbuffers

* Oops addition

* clang-format

* Add temporary debug hackery

* more hackery

* grr

* sdfsdf

* sigh, capitalization

* One more try

* Update presubmit.yml

* No more mr nice guy

* Update CMakeLists.txt

* Revise build rules & script to allow clang-tidy for the new files

* Update CMakeLists.txt

* Apply clang-tidy fixes

* Fix target for generated header

* Prefer to use FetchContent for flatbuffers

* Fixes

* set PIC on

* more pic

* fix attempt

* fix attempt

* try macos

* coreutils

* Update run-clang-tidy.sh

* noquiet

* final again?

---------

Co-authored-by: Steven Johnson <srj@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_notes For changes that may warrant a note in README for official releases.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants