[Parser] Parser 2.0 part 2 #6162
Conversation
Thanks @jroesch for the PR.
As I missed your part-1 PR, I think I have a lot to catch up on :)
Please find some high-level comments below, hope it helps!
@@ -85,16 +85,26 @@ class SpanNode : public Object {
  int line;
I believe what you are trying to achieve here is the range for line [begin, end], column [begin, end].
nit: Maybe we can rename the variables to reflect that.
Yeah, I think I agree with @ANSHUMAN87. Maybe it's better to use "begin_line", "begin_column", "end_line", "end_column"?
I think it should be begin_line, end_line, begin_column, end_column, as both values are related. Maybe even make a small range struct.
This is kind of painful to refactor due to some of the Python code. I still need to update the Python bindings; how about I add it to the tracking issue and we can do it in follow-up work?
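For reference, a minimal sketch of the small range struct idea mentioned above (names are hypothetical, not part of this PR):

// Hypothetical grouping of the four span fields into begin/end positions.
struct Position {
  int line;
  int column;
};

struct LineColumnRange {
  Position begin;
  Position end;
};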
  return true;
}

RELAY_REGISTER_OP("parser.MetaRef")
nit: As "parser.MetaRef" carries special meaning in this context, I think we can define this key in one place and reuse it everywhere. This helps if we want to change it at a later point in time.
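For illustration, a hedged sketch of defining the key in one place (the constant name is hypothetical):

// Hypothetical: single definition of the meta-reference op name, shared by all usages.
static constexpr const char* kMetaRefOpName = "parser.MetaRef";

// Call sites and the registration would then refer to the constant instead of
// repeating the string literal, e.g. Op::Get(kMetaRefOpName) or
// RELAY_REGISTER_OP(kMetaRefOpName).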
src/parser/meta_ref.h
 */
Expr MetaRef(std::string type_key, uint64_t node_index);

relay::Function ExpandMetaRefs(const MetaTable& meta_table, const relay::Function& mod);
mod -> func ?
yeah I will do this.
This PR is now ready for full review; if possible, can people please chime in about the meta-table and related decisions. I know there are still some incomplete aspects, but this is already more robust than the previous parser and I would like to incrementalize instead of sending many 4kloc PRs.
 * Could represent the source from an ML framework or the internal
 * source of a TVM program.
 */
struct Source {
If it is not a POD-like aggregate, I personally prefer to use class instead.
Suggested change:
-struct Source {
+class Source {
+ public:
   */
-  TVM_DLL Match(Expr data, tvm::Array<Clause> clauses, bool complete = true);
+  TVM_DLL Match(Expr data, tvm::Array<Clause> clauses, bool complete = true, Span span = Span());
Maybe we should think a bit about Span(nullptr) vs Optional<Span>?
I actually don't want spans to be optional if possible, but currently we need to migrate to setting them everywhere.
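For comparison, a sketch of the two alternatives being discussed, adapted from the signature in the diff above:

// Current form: a default-constructed (null) Span stands in for "no span".
TVM_DLL Match(Expr data, tvm::Array<Clause> clauses, bool complete = true,
              Span span = Span());

// Alternative: make the absence explicit in the type.
TVM_DLL Match(Expr data, tvm::Array<Clause> clauses, bool complete = true,
              Optional<Span> span = NullOpt);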
include/tvm/parser/source_map.h
  /*! \brief Construct a source from a string. */
  TVM_DLL explicit Source(const std::string& source);

  TVM_DLL Source(const Source& source) : source(source.source), line_map(source.line_map) {}
Shall we just use = default?
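i.e., something along the lines of (a sketch of the suggestion):

// Let the compiler generate the memberwise copy.
TVM_DLL Source(const Source& source) = default;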
class SourceMapNode : public Object {
 public:
  /*! \brief The source mapping. */
  Map<SourceName, tvm::String> source_map;
nit: not much difference, though.
Suggested change:
-  Map<SourceName, tvm::String> source_map;
+  Map<SourceName, tvm::runtime::String> source_map;
src/parser/diagnostic.h
    out << std::endl;
  }
};

/*! \brief The diagnostic level, controls the printing of the message. */
enum DiagnosticLevel {
Use enum class if possible
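For illustration, a hedged sketch of the scoped-enum form (member names and values are assumptions, not taken from this PR):

/*! \brief The diagnostic level, controls the printing of the message. */
enum class DiagnosticLevel : int {
  kBug = 10,       // hypothetical members; the actual levels may differ
  kError = 20,
  kWarning = 30,
  kNote = 40,
  kHelp = 50,
};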
#include "./token.h" | ||
|
||
namespace tvm { | ||
namespace parser { | ||
|
||
using namespace runtime; | ||
|
||
// trim from start (in place) | ||
static inline void ltrim(std::string& s) { // NOLINT(*) | ||
s.erase(s.begin(), std::find_if(s.begin(), s.end(), [](int ch) { return !std::isspace(ch); })); |
Do you prefer to use the IsWhitespace we defined below?
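For example, a sketch of what that could look like (assuming an IsWhitespace(char) predicate defined in the tokenizer):

// Hypothetical: reuse the tokenizer's IsWhitespace predicate instead of std::isspace.
static inline void ltrim(std::string& s) {  // NOLINT(*)
  s.erase(s.begin(),
          std::find_if(s.begin(), s.end(), [](char ch) { return !IsWhitespace(ch); }));
}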
Some general thoughts after a first pass through the PR; I'll do a more detailed review tonight or tomorrow:
- Yay for getting rid of Antlr! This looks much easier to maintain.
- You've added diagnostic information via the Span object and attempted to propagate it through the ExprMutator and ExprVisitor classes. There are a lot of passes that extend those classes without keeping span information, or that directly use ExprFunctor; as soon as we hit one of those passes, the diagnostic information will disappear, correct? Do we need to enforce valid spans at compile time to catch all of the passes?
- Since the diagnostic machinery is all based on source, I guess we need a pass to convert Python-constructed graphs into Relay source code for diagnostic information? Does this already exist somewhere in the compilation flow?
lgtm. only a few nitpicks
A few nitpicks. I'm happy to leave the completion of the diagnostic system discussed in my last comment for another PR; let's make sure those issues make it into the tracking issue.
… metadata section references.
@jroesch please follow up to address the CI error.
@tqchen this should be gtg; looks like I got past the last few CPU tests.
lgtm
* Add code from livestream with JK
* Fix errors parsing ResNet
* Parse metadata section efficiently and do most of plumbing to resolve metadata section references.
* WIP
* Change meta reference to an operator
* Meta references now work
* MetaReference expansion now works
* Start working on source map and move diagnostic context
* Convert tokenizer and parser to use new machinery
* Kill to_json
* Fix comment in type_infer.cc
* Remove old parser
* Rename parser tests and remove old ones
* Record span end information
* Convert to using spans everywhere
* Add span fields back to all Relay constructors
* Start passing spans
* Pass spans around visitors
* Format
* Fix
* Fix
* disable reference lint from string helpers
* Fix tokenizer
* Fix issue with empty metadata section
* Document new span fields and small tweaks
* Formatting
* Add span doc fields
* Add format tweak
* Improve errors and fix the semantic version tags in Prelude
* Update gradient.rly
* Clean up broken spans
* Clean up parser tests and turn on previously skipped tests
* Update errors to handle skipped cases
* Tweak
* Tweak
* Format
* Fix some minor issues with ADT tests
* Format
* Fix path
* WIP
* WIP
* Fix ir_text_printer
* format
* Formatted
* More formatting
* Repair test cases
* Fix CI
* Retrigger CI
This PR both implements metadata parsing and converts everything to the new diagnostic machinery.

I initially believed that the right way to parse the metadata table is to generate delayed references to the metadata table and then expand them after parsing is complete. In order to do this we need a piece of syntax, i.e., Expr, Type, Attrs, and so on, for each meta reference. Currently this only works for Expr, as I tried to define MetaRef as an op right now in order to minimize changes.

This brings up a bigger long-term question of how much we should rely on the meta table outside of things such as constants or other "unprintable" values. I realize that with my tokenization approach we could eagerly expand the meta references, but this requires that we have the full program on hand in order to successfully parse any program fragment.
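For context, a hedged sketch of the delayed-expansion flow described above, using the MetaRef and ExpandMetaRefs signatures from the diff (the variable names are illustrative):

// While parsing a reference such as meta[relay.Constant][0], emit a placeholder
// call rather than looking up the node immediately.
Expr placeholder = MetaRef("relay.Constant", 0);

// Once the whole program and its metadata section have been parsed, rewrite every
// placeholder into the real node from the metadata table.
relay::Function resolved = ExpandMetaRefs(meta_table, parsed_func);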
Would love to get some more thoughts and feedback cc @MarisaKirisame @joshpoll @tqchen @zhiics @electriclilies @mbrookhart @junrushao1994.
FYI cc @tmoreau89