Test file parser. #5860

Merged
merged 4 commits into from
Feb 7, 2019

erak (Collaborator) commented Jan 24, 2019

Description

This PR adds a parser (part of #4223) that will be used in our file-based unit test environment (esp. for semantic tests run by isoltest: #5872). It can parse function call definitions and the expected results produced when those calls are executed later on:

contract test {
    uint public balance;
    function f(uint a, uint b) public returns(uint d) { return a ** 0; }
    function vote() public { require(msg.value >= balance, "Not enough ether."); }
}
// ----
// f(uint256): 1 -> 1 # Takes one parameter and is expected to return one value. #
// f(uint256, uint8): 
// 1,
// 2
// -> 1
// i_am_not_there() 
// # This function is not available and the transaction will revert. #
// -> FAILURE
// vote(), 314 ether -> # Sending ether is also supported. #

Core ideas

Scanner

  • The parser relies on a scanner and pulls through the character stream
  • Tokens are defined by SoltToken (meaning sol_t or SOLT which stands for "Solidity Test (Language)")
  • Scanned tokens: (, ), :, ->, ,, #, [, ] , identifiers, type keyword: ether, special keyword FAILURE
  • Parser takes tokens and creates objects that are picked up by the semantic tests (Semantic test infrastructure #5872): FunctionCall, FunctionCallArgs and FunctionCallExpectations.
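
The token set listed above can be pictured as a small hand-written scanner. The following is only a minimal sketch under stated assumptions, not the code from this PR: the names `Token`, `Lexeme` and `scan` are hypothetical, number literals are lumped into a single `Number` kind, and `#` is emitted as a plain token rather than used to skip comments.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token kinds mirroring the list in the description above.
enum class Token
{
    LParen, RParen, Colon, Arrow, Comma, Hash, LBrack, RBrack,
    Identifier, Number, Ether, Failure
};

struct Lexeme
{
    Token kind;
    std::string text;
};

// Splits one expectation line (without the leading "//") into tokens.
std::vector<Lexeme> scan(std::string const& _line)
{
    std::vector<Lexeme> result;
    std::size_t i = 0;
    while (i < _line.size())
    {
        unsigned char const c = static_cast<unsigned char>(_line[i]);
        if (std::isspace(c))
        {
            ++i;
            continue;
        }
        if (c == '-' && i + 1 < _line.size() && _line[i + 1] == '>')
        {
            result.push_back({Token::Arrow, "->"});
            i += 2;
            continue;
        }
        if (std::isalnum(c) || c == '_')
        {
            // Consume a whole word: identifier, keyword or number literal.
            std::size_t const start = i;
            while (i < _line.size() && (std::isalnum(static_cast<unsigned char>(_line[i])) || _line[i] == '_'))
                ++i;
            std::string const word = _line.substr(start, i - start);
            assert(!word.empty());
            Token kind = Token::Identifier;
            if (word == "ether")
                kind = Token::Ether;
            else if (word == "FAILURE")
                kind = Token::Failure;
            else if (std::isdigit(static_cast<unsigned char>(word[0])))
                kind = Token::Number;
            result.push_back({kind, word});
            continue;
        }
        // Everything else must be one of the single-character tokens.
        Token kind;
        switch (c)
        {
        case '(': kind = Token::LParen; break;
        case ')': kind = Token::RParen; break;
        case ':': kind = Token::Colon; break;
        case ',': kind = Token::Comma; break;
        case '#': kind = Token::Hash; break;
        case '[': kind = Token::LBrack; break;
        case ']': kind = Token::RBrack; break;
        default:
            throw std::runtime_error(std::string("Unexpected character: ") + static_cast<char>(c));
        }
        result.push_back({kind, std::string(1, static_cast<char>(c))});
        ++i;
    }
    return result;
}
```

Calling `scan("f(uint256): 1 -> 1")` on this sketch yields eight lexemes, with the arrow as the seventh. Negative literals and comment text are deliberately left out to keep the example short.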

Single-line / multi-line mode

  • Single-line / multi-line mode is detected as follows: every newline (//) in the source results in a function call that has its display mode set to multi-line. Function parameter and result value (or expectation parameter) lists are an exception: each single parameter stores format information that includes a newline definition.
  • The mode is used for formatting the command-line output as well as the updated expectations printed to the test file
  • This solution has the downside that any special formatting will be lost if the expectations are updated at some point by isoltest.

Formatting information for semantic tests #5872:

  • While a test is parsed and its parameters / expectation objects are created, the formats defined by the used literals are retained
  • When a function call object is printed, the given formats are applied. If values do not fit the given format, some fallback formats depending on the type are applied
  • Since function call parameters can contain newlines //, these are also retained, such that they can be printed correctly afterwards.
  • Formats also contain an ABIType, which can be Signed, Unsigned, Failure and None for now. These are the types that are used to encode the actual number literals and the special type Failure. None is used to encode no parameters and no return values, resulting in an empty bytes. Failure results in an empty bytes as well, but has different semantics and needs to be handled accordingly.
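
As a rough illustration of the last bullet, the sketch below shows how a format carrying an `ABIType` could drive encoding. It is a simplification under stated assumptions, not the PR's implementation: `Format` and `encode` are hypothetical names, and a 64-bit integer stands in for the `u256` values used in the real code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using bytes = std::vector<uint8_t>;

enum class ABIType { Signed, Unsigned, Failure, None };

// Hypothetical format record: the ABI type plus the retained newline flag.
struct Format
{
    ABIType type;
    bool newline;
};

// Encodes a value as one 32-byte big-endian word.
// None and Failure both produce empty bytes; the caller has to tell them
// apart by looking at the format, since their semantics differ.
bytes encode(int64_t _value, Format const& _format)
{
    if (_format.type == ABIType::None || _format.type == ABIType::Failure)
        return {};

    // In this sketch, unsigned literals must not be negative.
    assert(_format.type == ABIType::Signed || _value >= 0);

    // Negative signed values are sign-extended with 0xff bytes.
    bool const negative = _format.type == ABIType::Signed && _value < 0;
    bytes word(32, static_cast<uint8_t>(negative ? 0xff : 0x00));
    uint64_t raw = static_cast<uint64_t>(_value);
    for (int i = 0; i < 8; ++i)
    {
        word[31 - i] = static_cast<uint8_t>(raw & 0xff);
        raw >>= 8;
    }
    return word;
}
```

With this sketch, `encode(1, Format{ABIType::Unsigned, false})` gives 32 bytes ending in `0x01`, while a `Failure` format gives an empty byte vector regardless of the value.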

Further development ideas

  • create a parser with boost::spirit that does not change the current interface

erak changed the title from "Test file parser" to "[WIP] Test file parser." on Jan 24, 2019

codecov bot commented Jan 24, 2019

Codecov Report

Merging #5860 into develop will increase coverage by 0.09%.
The diff coverage is 95.17%.

@@             Coverage Diff             @@
##           develop    #5860      +/-   ##
===========================================
+ Coverage    88.35%   88.44%   +0.09%     
===========================================
  Files          353      356       +3     
  Lines        33692    34127     +435     
  Branches      4029     4056      +27     
===========================================
+ Hits         29767    30184     +417     
- Misses        2559     2564       +5     
- Partials      1366     1379      +13
Flag Coverage Δ
#all 88.44% <95.17%> (+0.09%) ⬆️
#syntax 27.96% <8.75%> (-0.25%) ⬇️

namespace test
{

class TestFileParser
Contributor

Please document.

Contributor

Also, is it used for more than the semantics tests? If not, please add that into the name.

Collaborator Author

It should also be used for the syntax tests (upcoming refactoring) which is why I've chosen this name. I'll think of a better name, though.

Contributor

In this case, it's of course fine!

std::vector<FunctionCall> parseFunctionCalls();

private:
class Scanner {
Contributor

Suggested change
class Scanner {
class Scanner
{

u256 value;
};

static std::string bytesToString(bytes const& _bytes, ByteFormat const& _format);
Contributor

Please document. Also, the name sounds too generic. It might also be an option to add that to CommonIO.h.

};

static std::string bytesToString(bytes const& _bytes, ByteFormat const& _format);
static std::pair<bytes, ByteFormat> stringToBytes(std::string _string);
Contributor

Same here.

result += newBytes;
}
else
BOOST_THROW_EXCEPTION(std::runtime_error("Test expectations contain invalidly formatted data."));
Contributor

solAssert?

Member

I think in testing we don't use solAssert. BOOST_FAIL_MSG or BOOST_REQUIRE(false, "Test...") would be more appropriate?


vector<TestFileParser::FunctionCall> TestFileParser::parseFunctionCalls()
{
vector<TestFileParser::FunctionCall> calls;
Contributor

Suggested change
vector<TestFileParser::FunctionCall> calls;
vector<FunctionCall> calls;

Contributor

Or would that be ambiguous?

void advance() { ++m_char; }
bool advanceLine()
{
auto& line = getline(m_stream, m_line);
Contributor

Where is getline defined?

Collaborator Author

This is declared in <basic_string.h> and forwarded by <string>. After reading it again, I'm not sure why the compiler is not complaining about the missing namespace.

{
auto& line = getline(m_stream, m_line);
m_char = m_line.begin();
return line ? true : false;
Contributor

What is the type of line?

Collaborator Author

Well, I think I get your point :) Declared the type explicitly.
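
For context on the two questions in this thread: `std::getline` returns a reference to the stream it reads from, and that stream converts to `bool` when tested. An unqualified `getline` call can still compile because argument-dependent lookup finds the `std` overload through its `std::istream` and `std::string` arguments, which is one likely explanation for the missing compiler complaint. Below is a minimal sketch with the type spelled out; `LineReader` and `countLines` are hypothetical names, not the scanner from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

class LineReader
{
public:
    explicit LineReader(std::istream& _stream):
        m_stream(_stream), m_char(m_line.cend()) {}

    // Reads the next line; returns false once the stream is exhausted.
    bool advanceLine()
    {
        std::istream& line = std::getline(m_stream, m_line);
        m_char = m_line.cbegin();
        return static_cast<bool>(line);
    }

    // Current character, or '\0' at the end of the line.
    char current() const { return m_char == m_line.cend() ? '\0' : *m_char; }

    void advance()
    {
        assert(m_char != m_line.cend());
        ++m_char;
    }

    std::string const& currentLine() const { return m_line; }

private:
    std::istream& m_stream;
    std::string m_line;
    std::string::const_iterator m_char;
};

// Small helper showing the reader in use: counts the lines of a source text.
std::size_t countLines(std::string const& _source)
{
    std::istringstream stream(_source);
    LineReader reader(stream);
    std::size_t count = 0;
    while (reader.advanceLine())
        ++count;
    return count;
}
```

The explicit `static_cast<bool>` replaces the `line ? true : false` construct quoted above and makes the stream-to-bool conversion visible at the call site.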

call.arguments = parseFunctionCallArgument();

if (!advanceLine())
throw runtime_error("Invalid test expectation. No result specified.");
Contributor

Perhaps we should define a new exception here? What does the syntax tests parser do?

Collaborator Author

This should even be BOOST_THROW_EXCEPTION(runtime_error(...)). And that's what the syntax test also does. They are then being handled by isoltest and boostTest, respectively. I think introducing a new exception is a good idea.

erak force-pushed the test-file-parser branch from 2a5ebeb to 95f1063 on January 24, 2019 11:53

axic (Member) commented Jan 24, 2019

// vote() [0.314] # Sending ether is also supported.
// ->

What is the syntax for sending ether and arguments?

axic (Member) commented Jan 24, 2019

// f(uint256):
// 1
// -> 1

If it is split like this, I suggest:

// f(uint256):
// <- 1
// -> 1

void expect(std::string::iterator& _it, std::string::iterator _end, std::string::value_type _c)
{
if (_it == _end || *_it != _c)
throw std::runtime_error(std::string("Invalid test expectation. Expected: \"") + _c + "\".");
Member

No need to have explicit std prefixes in this file since it is imported above.

Collaborator Author

Thanks.

chriseth (Contributor) commented

I previously proposed to split the function arguments. What I meant was that it is possible to add newlines whenever space is allowed and that a line that starts with -> marks the transition between arguments and return values.

Example:

f(uint, uint, uint, uint): 4, 5,
6,
7
-> 9
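
The rule proposed here (newlines count as ordinary whitespace, and the arrow separates arguments from return values) can be sketched as follows. This is an illustration under stated assumptions only: `trim`, `splitList`, `ParsedCall` and `parseCall` are hypothetical names, the leading `//` is assumed to be stripped already, and the first `->` anywhere in the text is taken as the separator rather than only one at the start of a line.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Removes leading and trailing spaces, tabs and newlines.
std::string trim(std::string const& _text)
{
    std::size_t const first = _text.find_first_not_of(" \t\n");
    if (first == std::string::npos)
        return "";
    std::size_t const last = _text.find_last_not_of(" \t\n");
    assert(last >= first);
    return _text.substr(first, last - first + 1);
}

// Splits a comma-separated list; newlines are treated like any other whitespace.
std::vector<std::string> splitList(std::string const& _list)
{
    std::vector<std::string> items;
    std::istringstream stream(_list);
    std::string item;
    while (std::getline(stream, item, ','))
    {
        std::string const trimmed = trim(item);
        if (!trimmed.empty())
            items.push_back(trimmed);
    }
    return items;
}

struct ParsedCall
{
    std::string signature;
    std::vector<std::string> arguments;
    std::vector<std::string> expectations;
};

ParsedCall parseCall(std::string const& _text)
{
    std::size_t const arrow = _text.find("->");
    if (arrow == std::string::npos)
        throw std::runtime_error("Invalid test expectation. No result specified.");

    // Everything before the arrow is "signature[: arguments]".
    std::string const call = _text.substr(0, arrow);
    std::size_t const colon = call.find(':');

    ParsedCall result;
    result.signature = trim(call.substr(0, colon));
    if (colon != std::string::npos)
        result.arguments = splitList(call.substr(colon + 1));
    result.expectations = splitList(_text.substr(arrow + 2));
    return result;
}
```

Fed the example above, this sketch produces the signature `f(uint, uint, uint, uint)`, the four arguments `4`, `5`, `6`, `7` and the single expectation `9`.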

erak (Collaborator Author) commented Jan 24, 2019

@axic Oh, the syntax is even wrong. Sorry about that. It would be (note the colon):

// vote(uint256, uint256): 1,1 [0.314]
// -> 1

or following what @chriseth proposed:

// vote(uint256, uint256): [0.314]
// 1, 1
// -> 1

I fear that this might become hard to read. What do you think about making the ether value a "second optional parameter" like below? Something I also considered is potential support for arrays, given the current use of [ and ] for declaring the value.

// vote(uint256, uint256, uint256[]):
// 1, 1, [1..2]
// : 0.314
// -> 1

erak force-pushed the test-file-parser branch 2 times, most recently from 12bd280 to 980aaf2, on January 27, 2019 17:04

stackenbotten commented

There was an error when running test_check_style for commit 980aaf24bc08485956c1dab8bcdb61971f0d2588:

Error: Trailing whitespace found:
 test/libsolidity/util/TestFileParser.cpp:207: { 

Please check that your changes are working as intended.

erak force-pushed the test-file-parser branch from 980aaf2 to 8f8f6f8 on January 27, 2019 17:07
erak changed the title from "[WIP] Test file parser." to "Test file parser." on Jan 27, 2019

erak (Collaborator Author) commented Jan 27, 2019

Updated. It now parses the following syntax:

 * // f(uint256, uint256): 1, 1 # Signature and comma-separated list of arguments
 * // : 0                       # (Optional) Ether value to be sent with the call
 * // -> 1, 1                   # Expected result value

I'd like to get some feedback on the ether value. This was my first shot at separating it from the other arguments. Here, I'd now vote for taking @chriseth's proposal: #5860 (comment)

erak self-assigned this on Jan 28, 2019
erak force-pushed the test-file-parser branch 7 times, most recently from 83eb987 to 471908e, on February 4, 2019 03:36

ekpyron (Member) left a comment

A few comments, but I'd say this looks good already!

erak force-pushed the test-file-parser branch from ad80816 to f5fa62d on February 5, 2019 12:57

ekpyron (Member) left a comment

There's at least the typo to fix - the other comment would be great, but could be optional for now.

return true;
}

string TestFileParser::parseFunctionSignature()
Member

Well, this now won't work for signatures involving structs... In the ABI structs are considered tuples, so for f(S memory s, uint256 b) public pure {} with struct S { uint256 a; bytes32 b; } the signature would be f((uint256,bytes32),uint256).

I would be fine with skipping this for now, but we should at least add a TODO comment about that here. It would be straightforward to count opening and closing parentheses, though, so we could just do it now, as you like... It's not that important, because I think we don't have any test case at the moment that would use it - on the other hand it would be really handy for #1603 of course :-D.

erak (Collaborator Author) commented Feb 5, 2019

Well, function signatures do now allow recursive tuples e.g. f(((((bytes, bytes, (bytes)), bytes), bytes), (bytes, bytes)), (bytes, bytes)). I thought that might be useful.
Let's get this merged soon, #5872 after and then you'll be able to run those tests ;)
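
Counting opening and closing parentheses, as suggested in this thread, could look roughly like the sketch below. It is not the PR's `parseFunctionSignature`: `readSignature` is a hypothetical name, and dropping spaces from the result (instead of rejecting them) is a choice made only for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Reads a signature such as "f((uint256,bytes32),uint256)" from the start of
// _input and returns it with spaces removed. Nested tuples are handled by
// tracking the parenthesis depth; reading stops when the depth returns to zero.
std::string readSignature(std::string const& _input)
{
    std::string signature;
    int depth = 0;
    bool opened = false;
    for (char c: _input)
    {
        if (c == ' ' || c == '\t')
            continue;
        if (c == '(')
        {
            ++depth;
            opened = true;
        }
        else if (c == ')')
        {
            if (depth == 0)
                throw std::runtime_error("Unbalanced ')' in signature: " + signature);
            --depth;
        }
        // Invariant of the loop: the depth never goes negative.
        assert(depth >= 0);
        signature += c;
        if (opened && depth == 0)
            return signature;
    }
    throw std::runtime_error("Invalid signature detected: " + signature);
}
```

For the struct example mentioned above, `readSignature("f((uint256, bytes32), uint256): 1")` returns `f((uint256,bytes32),uint256)`, and input that ends before the outermost parenthesis is closed raises an error.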

erak force-pushed the test-file-parser branch from 6fcb42b to f07e4a2 on February 5, 2019 21:17
throw Error(Error::Type::ParserError, "Invalid signature detected: " + signature);

signature += parameters;
/// The last `)` could have been consumed already by
Member

I'm not sure why that could happen...

{
char const* source = R"(
// f(((((bytes, bytes, bytes), bytes), bytes), bytes), bytes) ->
// f(((((bytes, bytes, (bytes)), bytes), bytes), (bytes, bytes)), (bytes, bytes)) ->
Member

Arguably signatures with spaces in them could/should actually be an error... but since that will probably in fact complicate parsing, I guess we can leave it as it is - we can just not put spaces in there in the tests...

Collaborator Author

Ah, that's a good point. That should go into the documentation.

Contributor

A very very common source of problems is using a function that does not exist because of a problem in the signature. The new semantics test infrastructure should detect that, but perhaps not as part of this pull request.

axic (Member) commented Feb 6, 2019

@bit-shift what is the current design? Is the description on the top updated and reflecting what is being implemented?

ekpyron (Member) previously approved these changes on Feb 6, 2019 and left a comment

I think this is nice now! And with this nice parser we can easily change the actual formatting later on, if we want, so I think we should merge it.

NUM_TOKENS
#undef T
};
namespace solt
Member

Why not soltest?

erak (Collaborator Author) commented Feb 6, 2019

It could also be soltest :) I don't have a strong opinion on that.

ekpyron (Member) commented Feb 6, 2019

Could we just not use any further namespace and still define Token? We're in namespace dev::solidity::test already and the other Token is in dev::solidity, isn't it? No, it's actually even in langutil now - in fact I'm not sure whether that's the right place for the other one, but it means there's even less danger of conflict.

But to me this doesn't seem too important - I'm fine with solt, soltest or no sub-namespace.

erak force-pushed the test-file-parser branch 2 times, most recently from 9c97937 to a10982e, on February 6, 2019 15:59
erak force-pushed the test-file-parser branch from a10982e to c9c4578 on February 6, 2019 16:25

ekpyron (Member) left a comment

Should probably be squashed before merging, but I'd still say it's fine!

ekpyron (Member) commented Feb 7, 2019

Ah damn... looking through #5872 I just realized:
Actually we don't want or need to store the type information of the function arguments at all - instead we should just store the whole original, unparsed string of the function call right up to the ->... In #5872 we don't ever need nor want to reformat anything before the expected result - we can and should just re-emit the original, unparsed string up to the arrow when printing updated expectations... Only the result of the call should ever change when updating, but the way #5872 works now, the formatting of the arguments will change all the time as well, which will be annoying... But fortunately it will be easy to just store the original string encoding as well during scanning and I think it'll be fine to deal with all that in #5872, even though it actually belongs here.
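
The idea in this comment, keeping the unparsed text up to the arrow and regenerating only the result, might be sketched like this. The names `StoredCall`, `splitAtArrow` and `withUpdatedResult` are hypothetical and do not come from this PR or from #5872.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

struct StoredCall
{
    std::string rawCall;      // everything before "->", kept byte for byte
    std::string expectations; // the text after "->"
};

// Splits an expectation at the first arrow without interpreting the call part.
StoredCall splitAtArrow(std::string const& _text)
{
    std::size_t const arrow = _text.find("->");
    if (arrow == std::string::npos)
        throw std::runtime_error("Invalid test expectation. No result specified.");
    assert(arrow + 2 <= _text.size());
    return {_text.substr(0, arrow), _text.substr(arrow + 2)};
}

// Re-emits the call with a new result; the original argument formatting,
// including comment markers and line breaks, is left untouched.
std::string withUpdatedResult(StoredCall const& _call, std::string const& _newResult)
{
    return _call.rawCall + "-> " + _newResult;
}
```

Because only the part after the arrow is rebuilt, a multi-line argument list written by hand survives an expectation update unchanged in this sketch.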

ekpyron merged commit 9203363 into develop on Feb 7, 2019
chriseth deleted the test-file-parser branch on April 30, 2019 14:01