Test file parser. #5860

erak · 2019-01-24T10:00:43Z

Description

This PR adds a parser (part of #4223) that will be used in our file-based unit test environment (esp. for semantic tests run by isoltest: #5872). It can parse function call definitions and the expected results produced when those calls are executed later on:

contract test {
    uint public balance;
    function f(uint a, uint b) public returns(uint d) { return a ** 0; }
    function vote() public { require(msg.value >= balance, "Not enough ether."); }
}
// ----
// f(uint256): 1 -> 1 # Takes one parameter and is expected to return one value. #
// f(uint256, uint8): 
// 1,
// 2
// -> 1
// i_am_not_there() 
// # This function is not available and the transaction will revert. #
// -> FAILURE
// vote(), 314 ether -> # Sending ether is also supported. #

Core ideas

Scanner

The parser relies on a scanner to work through the character stream.
Tokens are defined by SoltToken (meaning sol_t or SOLT, which stands for "Solidity Test (Language)").
Scanned tokens: (, ), :, ->, ,, #, [, ], identifiers, the type keyword ether, and the special keyword FAILURE.
The parser takes tokens and creates objects that are picked up by the semantic tests (Semantic test infrastructure #5872): FunctionCall, FunctionCallArgs and FunctionCallExpectations.
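
To make the token list above concrete, here is a minimal scanner sketch. It is not the code from this PR: the `Token` enum, `ScannedToken` and `scanLine` are hypothetical names chosen for illustration, and negative number literals and `# ... #` comment bodies are deliberately not handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds modelled on the list above; the real SoltToken may differ.
enum class Token
{
	LParen, RParen, LBrack, RBrack, Colon, Comma, Hash, Arrow,
	Ether, Failure, Identifier, Number, Invalid
};

struct ScannedToken
{
	Token kind;
	std::string text;
};

// Splits one expectation line into tokens. Whitespace is skipped.
std::vector<ScannedToken> scanLine(std::string const& _line)
{
	std::vector<ScannedToken> tokens;
	std::size_t i = 0;
	while (i < _line.size())
	{
		unsigned char const c = static_cast<unsigned char>(_line[i]);
		if (std::isspace(c))
		{
			++i;
			continue;
		}
		// The only two-character token is the arrow.
		if (c == '-' && i + 1 < _line.size() && _line[i + 1] == '>')
		{
			tokens.push_back({Token::Arrow, "->"});
			i += 2;
			continue;
		}
		if (std::isalnum(c) || c == '_')
		{
			// Identifiers, keywords and number literals: consume the whole word.
			std::size_t const start = i;
			while (i < _line.size() && (std::isalnum(static_cast<unsigned char>(_line[i])) || _line[i] == '_'))
				++i;
			std::string const word = _line.substr(start, i - start);
			Token kind = std::isdigit(c) ? Token::Number : Token::Identifier;
			if (word == "ether")
				kind = Token::Ether;
			else if (word == "FAILURE")
				kind = Token::Failure;
			tokens.push_back({kind, word});
			continue;
		}
		// Single-character punctuation; anything unknown is reported as Invalid.
		Token kind = Token::Invalid;
		switch (c)
		{
		case '(': kind = Token::LParen; break;
		case ')': kind = Token::RParen; break;
		case '[': kind = Token::LBrack; break;
		case ']': kind = Token::RBrack; break;
		case ':': kind = Token::Colon; break;
		case ',': kind = Token::Comma; break;
		case '#': kind = Token::Hash; break;
		default: break;
		}
		tokens.push_back({kind, std::string(1, _line[i])});
		++i;
	}
	return tokens;
}
```

A parser built on top of this would consume the vector front to back, which keeps the character-level details out of the code that builds the function call objects.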

Single-line / multi-line mode

Single-line or multi-line mode is detected as follows: every newline (//) in the source results in a function call that has its display mode set to multi-line. Lists of function parameters and result values (expectation parameters) are an exception: each parameter stores format information that records a newline definition.
The mode is used for formatting the command-line output as well as the updated expectations printed to the test file.
This solution has the downside that any special formatting will be lost if the expectations are updated at some point by isoltest.

Formatting information for semantic tests #5872:

While a test is parsed and its parameter / expectation objects are created, the formats defined by the literals used are retained.
When a function call object is printed, the given formats are applied. If values do not fit the given format, fallback formats depending on the type are applied.
Since function call parameters can contain newlines (//), these are also retained, so that they can be printed correctly afterwards.
Formats also contain an ABIType, which can be Signed, Unsigned, Failure or None for now. These are the types that are used to encode the actual number literals and the special type Failure. None is used to encode no parameters and no return values, resulting in empty bytes. Failure results in empty bytes as well, but has different semantics and needs to be handled accordingly.
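
The relationship between a format and the encoded bytes can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `Format` and `encode` are hypothetical names, and values are limited to 64 bits instead of the u256 the test framework works with.

```cpp
#include <cstdint>
#include <vector>

using bytes = std::vector<std::uint8_t>;

// The four kinds named in the description above.
enum class ABIType { None, Failure, Unsigned, Signed };

// Hypothetical format record: the type plus the retained newline information.
struct Format
{
	ABIType type = ABIType::None;
	bool newline = false;
};

// Encodes a value as one 32-byte big-endian word.
// None and Failure both yield empty bytes; callers must tell them apart via the type.
bytes encode(std::int64_t _value, Format const& _format)
{
	if (_format.type == ABIType::None || _format.type == ABIType::Failure)
		return {};
	// Negative signed values are sign-extended with 0xff (two's complement).
	std::uint8_t const fill = (_format.type == ABIType::Signed && _value < 0) ? 0xff : 0x00;
	bytes word(32, fill);
	std::uint64_t const raw = static_cast<std::uint64_t>(_value);
	for (int i = 0; i < 8; ++i)
		word[31 - i] = static_cast<std::uint8_t>(raw >> (8 * i));
	return word;
}
```

Because None and Failure are indistinguishable at the byte level, the type has to travel alongside the bytes, which is the reason the format is kept on the parameter object.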

Further development ideas

create a parser with boost::spirit that does not change the current interface

codecov · 2019-01-24T10:30:14Z

Codecov Report

Merging #5860 into develop will increase coverage by 0.09%.
The diff coverage is 95.17%.

@@             Coverage Diff             @@
##           develop    #5860      +/-   ##
===========================================
+ Coverage    88.35%   88.44%   +0.09%     
===========================================
  Files          353      356       +3     
  Lines        33692    34127     +435     
  Branches      4029     4056      +27     
===========================================
+ Hits         29767    30184     +417     
- Misses        2559     2564       +5     
- Partials      1366     1379      +13

Flag	Coverage Δ
#all	`88.44% <95.17%> (+0.09%)`	⬆️
#syntax	`27.96% <8.75%> (-0.25%)`	⬇️

chriseth · 2019-01-24T11:28:40Z

test/libsolidity/util/TestFileParser.h

+namespace test
+{
+
+class TestFileParser


Please document.

Also, is it used for more than the semantics tests? If not, please add that into the name.

It should also be used for the syntax tests (upcoming refactoring) which is why I've chosen this name. I'll think of a better name, though.

In this case, it's of course fine!

chriseth · 2019-01-24T11:29:58Z

test/libsolidity/util/TestFileParser.h

+	std::vector<FunctionCall> parseFunctionCalls();
+
+private:
+	class Scanner {


Suggested change

class Scanner {

class Scanner

{

chriseth · 2019-01-24T11:32:53Z

test/libsolidity/util/TestFileParser.h

+		u256 value;
+	};
+
+	static std::string bytesToString(bytes const& _bytes, ByteFormat const& _format);


Please document. Also, the name sounds too generic. It might also be an option to add that to CommonIO.h.

chriseth · 2019-01-24T11:32:59Z

test/libsolidity/util/TestFileParser.h

+	};
+
+	static std::string bytesToString(bytes const& _bytes, ByteFormat const& _format);
+	static std::pair<bytes, ByteFormat> stringToBytes(std::string _string);


chriseth · 2019-01-24T11:33:21Z

test/libsolidity/util/TestFileParser.cpp

+			result += newBytes;
+		}
+		else
+			BOOST_THROW_EXCEPTION(std::runtime_error("Test expectations contain invalidly formatted data."));


I think in testing we don't use solAssert. BOOST_FAIL_MSG or BOOST_REQUIRE(false, "Test...") would be more appropriate?

chriseth · 2019-01-24T11:33:46Z

test/libsolidity/util/TestFileParser.cpp

+
+vector<TestFileParser::FunctionCall> TestFileParser::parseFunctionCalls()
+{
+	vector<TestFileParser::FunctionCall> calls;


Suggested change

vector<TestFileParser::FunctionCall> calls;

vector<FunctionCall> calls;

Or would that be ambiguous?

chriseth · 2019-01-24T11:35:04Z

test/libsolidity/util/TestFileParser.h

+		void advance() { ++m_char; }
+		bool advanceLine()
+		{
+			auto& line = getline(m_stream, m_line);


Where is getline defined?

This is declared in <basic_string.h> and forwarded by <string>. After reading it again, I'm not sure why the compiler is not complaining about the missing namespace.
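
On the missing namespace: the unqualified call most likely compiles because of argument-dependent lookup, since both arguments are types from namespace std. A minimal standalone sketch, with a free function standing in for the member function under review:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the next line into _line and reports whether the read succeeded.
bool advanceLine(std::istream& _stream, std::string& _line)
{
	// No std:: prefix: the arguments are std types, so argument-dependent
	// lookup also searches namespace std and finds std::getline.
	std::istream& result = getline(_stream, _line);
	// A stream converts to true as long as no failure flag is set.
	return static_cast<bool>(result);
}
```

Spelling the call as `std::getline` makes the origin explicit and does not rely on lookup rules, and naming the result type avoids hiding a stream reference behind `auto&`.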

chriseth · 2019-01-24T11:35:44Z

test/libsolidity/util/TestFileParser.h

+		{
+			auto& line = getline(m_stream, m_line);
+			m_char = m_line.begin();
+			return line ? true : false;


What is the type of line?

Well, I think I get your point :) Declared the type explicitly.

chriseth · 2019-01-24T11:37:10Z

test/libsolidity/util/TestFileParser.cpp

+		call.arguments = parseFunctionCallArgument();
+
+		if (!advanceLine())
+			throw runtime_error("Invalid test expectation. No result specified.");


Perhaps we should define a new exception here? What does the syntax tests parser do?

This should even be BOOST_THROW_EXCEPTION(runtime_error(...)). And that's what the syntax test also does. They are then being handled by isoltest and boostTest, respectively. I think introducing a new exception is a good idea.
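
A dedicated exception could look roughly like the sketch below. `TestParserError` and `expectChar` are hypothetical names used only for illustration; the sketch derives from std::runtime_error so that existing handlers for runtime_error would still catch it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical parser-specific exception; inherits the constructors of std::runtime_error.
struct TestParserError: std::runtime_error
{
	using std::runtime_error::runtime_error;
};

// Consumes _c at the current position or throws the dedicated exception.
void expectChar(std::string::const_iterator& _it, std::string::const_iterator _end, char _c)
{
	if (_it == _end || *_it != _c)
		throw TestParserError(std::string("Invalid test expectation. Expected: \"") + _c + "\".");
	++_it;
}
```

Callers such as isoltest could then catch the specific type and report a parse problem separately from other runtime errors.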

axic · 2019-01-24T13:27:35Z

// vote() [0.314] # Sending ether is also supported.
// ->

What is the syntax for sending ether and arguments?

axic · 2019-01-24T13:28:02Z

// f(uint256):
// 1
// -> 1

If it is split like this, I suggest:

// f(uint256):
// <- 1
// -> 1

axic · 2019-01-24T13:29:57Z

test/libsolidity/util/TestFileParser.cpp

+	void expect(std::string::iterator& _it, std::string::iterator _end, std::string::value_type _c)
+	{
+		if (_it == _end || *_it != _c)
+			throw std::runtime_error(std::string("Invalid test expectation. Expected: \"") + _c + "\".");


No need to have explicit std prefixes in this file since it is imported above.

chriseth · 2019-01-24T15:18:43Z

I previously proposed to split the function arguments. What I meant was that it is possible to add newlines whenever space is allowed and that a line that starts with -> marks the transition between arguments and return values.

Example:

f(uint, uint, uint, uint): 4, 5,
6,
7
-> 9
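
One way to read this rule in code is sketched below. The function name `splitCall` is an assumption, and the sketch only covers the case where the arrow starts a line; an arrow on the same line as the arguments would need the scanner.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Returns {call with arguments, expected results}. Line breaks are treated
// like spaces; the first line that starts with "->" switches to the results.
std::pair<std::string, std::string> splitCall(std::string const& _source)
{
	std::istringstream stream(_source);
	std::string line, call, expectations;
	bool inExpectations = false;
	while (std::getline(stream, line))
	{
		// Skip blank lines and strip leading whitespace.
		std::size_t first = line.find_first_not_of(" \t");
		if (first == std::string::npos)
			continue;
		line.erase(0, first);
		if (!inExpectations && line.compare(0, 2, "->") == 0)
		{
			inExpectations = true;
			// Drop the arrow and any whitespace that follows it.
			line.erase(0, 2);
			line.erase(0, line.find_first_not_of(" \t"));
		}
		std::string& target = inExpectations ? expectations : call;
		if (!target.empty() && !line.empty())
			target += ' ';
		target += line;
	}
	return {call, expectations};
}
```

For the example above, this yields the call part "f(uint, uint, uint, uint): 4, 5, 6, 7" and the expectation part "9".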

erak · 2019-01-24T16:38:17Z

@axic Oh, the syntax is even wrong. Sorry about that. It would be (note the colon):

// vote(uint256, uint256): 1,1 [0.314]
// -> 1

or following what @chriseth proposed:

// vote(uint256, uint256): [0.314]
// 1, 1
// -> 1

I fear that this might become hard to read. What do you think about making the ether value a "second optional parameter" like below? Something that I also considered is potential support for arrays, which would clash with the current use of [ and ] for declaring the value.

// vote(uint256, uint256, uint256[]):
// 1, 1, [1..2]
// : 0.314
// -> 1

stackenbotten · 2019-01-27T17:04:57Z

There was an error when running test_check_style for commit 980aaf24bc08485956c1dab8bcdb61971f0d2588:

Error: Trailing whitespace found:
 test/libsolidity/util/TestFileParser.cpp:207: {

Please check that your changes are working as intended.

erak · 2019-01-27T17:12:29Z

Updated. It now parses the following syntax:

 * // f(uint256, uint256): 1, 1 # Signature and comma-separated list of arguments
 * // : 0                       # (Optional) Ether value to be sent with the call
 * // -> 1, 1                   # Expected result value

I'd like to get some feedback on the ether value. This was my first shot at separating it from the other arguments. Here, I'd now vote for taking @chriseth's proposal: #5860 (comment)

ekpyron

A few comments, but I'd say this looks good already!

ekpyron

There's at least the typo to fix - the other comment would be great, but could be optional for now.

ekpyron · 2019-02-05T13:12:35Z

test/libsolidity/util/TestFileParser.cpp

+	return true;
+}
+
+string TestFileParser::parseFunctionSignature()


Well, this now won't work for signatures involving structs... In the ABI structs are considered tuples, so for f(S memory s, uint256 b) public pure {} with struct S { uint256 a; bytes32 b; } the signature would be f((uint256,bytes32),uint256).

I would be fine with skipping this for now, but we should at least add a TODO comment about that here. It would be straightforward to count opening and closing parentheses, though, so we could just do it now, as you like... It's not that important, because I think we don't have any test case at the moment that would use it - on the other hand it would be really handy for #1603 of course :-D.

Well, function signatures do now allow recursive tuples e.g. f(((((bytes, bytes, (bytes)), bytes), bytes), (bytes, bytes)), (bytes, bytes)). I thought that might be useful.
Let's get this merged soon, #5872 after and then you'll be able to run those tests ;)
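
The parenthesis-counting idea mentioned above can be sketched like this. It is not the code from this PR; `endOfParameterList` and `extractSignature` are hypothetical helpers that only find the end of a possibly nested parameter list and do not validate the type names inside it.

```cpp
#include <string>

// Returns the index one past the `)` that closes the `(` at _open,
// or std::string::npos if the parentheses are unbalanced.
std::size_t endOfParameterList(std::string const& _text, std::size_t _open)
{
	if (_open >= _text.size() || _text[_open] != '(')
		return std::string::npos;
	int depth = 0;
	for (std::size_t i = _open; i < _text.size(); ++i)
	{
		if (_text[i] == '(')
			++depth;
		else if (_text[i] == ')' && --depth == 0)
			return i + 1;
	}
	return std::string::npos;
}

// Extracts "name(...)" from the start of a call line; empty string on error.
std::string extractSignature(std::string const& _call)
{
	std::size_t const open = _call.find('(');
	if (open == std::string::npos)
		return "";
	std::size_t const end = endOfParameterList(_call, open);
	if (end == std::string::npos)
		return "";
	return _call.substr(0, end);
}
```

With this, a struct argument written as a tuple, as in f((uint256,bytes32),uint256), is kept intact instead of being cut at the first closing parenthesis.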

ekpyron · 2019-02-06T09:12:30Z

test/libsolidity/util/TestFileParser.cpp

+		throw Error(Error::Type::ParserError, "Invalid signature detected: " + signature);
+
+	signature += parameters;
+	/// The last `)` could have been consumed already by


I'm not sure why that could happen...

ekpyron · 2019-02-06T09:14:07Z

test/libsolidity/util/TestFileParserTests.cpp

+{
+	char const* source = R"(
+		// f(((((bytes, bytes, bytes), bytes), bytes), bytes), bytes) ->
+		// f(((((bytes, bytes, (bytes)), bytes), bytes), (bytes, bytes)), (bytes, bytes)) ->


Arguably signatures with spaces in them could/should actually be an error... but since that will probably in fact complicate parsing, I guess we can leave it as it is - we can just not put spaces in there in the tests...

Ah, that's a good point. That should go into the documentation.

A very very common source of problems is using a function that does not exist because of a problem in the signature. The new semantics test infrastructure should detect that, but perhaps not as part of this pull request.

axic · 2019-02-06T12:53:12Z

@bit-shift what is the current design? Is the description on the top updated and reflecting what is being implemented?

ekpyron

I think this is nice now! And with this nice parser we can easily change the actual formatting later on, if we want, so I think we should merge it.

axic · 2019-02-06T15:41:08Z

test/libsolidity/util/TestFileParser.h

-	NUM_TOKENS
-#undef T
-};
+namespace solt


Why not soltest?

Could also be soltest :) I don't have a strong opinion on that.

Could we just not use any further namespace and still define Token? We're in namespace dev::solidity::test already and the other Token is in dev::solidity, isn't it? No, it's actually even in langutil now - in fact I'm not sure whether that's the right place for the other one, but it means there's even less danger of conflict.

But to me this doesn't seem too important - I'm fine with solt, soltest or no sub-namespace.

ekpyron

Should probably be squashed before merging, but I'd still say it's fine!

ekpyron · 2019-02-07T11:33:40Z

Ah damn... looking through #5872 I just realized:
Actually we don't want or need to store the type information of the function arguments at all - instead we should just store the whole original, unparsed string of the function call right up to the ->... In #5872 we don't ever need nor want to reformat anything before the expected result - we can and should just re-emit the original, unparsed string up to the arrow when printing updated expectations... Only the result of the call should ever change when updating, but the way #5872 works now, the formatting of the arguments will change all the time as well, which will be annoying... But fortunately it will be easy to just store the original string encoding as well during scanning and I think it'll be fine to deal with all that in #5872, even though it actually belongs here.
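
A minimal sketch of that idea, assuming a one-line call and ignoring the case of an arrow inside a `# ... #` comment; `RawCall`, `splitAtArrow` and `withUpdatedResult` are hypothetical names:

```cpp
#include <string>

// Hypothetical record: the untouched text before the arrow plus the text after it.
struct RawCall
{
	std::string rawCall;
	std::string expectations;
};

// Splits a line at the first "->" without normalising anything before it.
RawCall splitAtArrow(std::string const& _line)
{
	std::size_t const arrow = _line.find("->");
	if (arrow == std::string::npos)
		return {_line, ""};
	return {_line.substr(0, arrow), _line.substr(arrow + 2)};
}

// Re-emits the original call text verbatim and replaces only the result.
std::string withUpdatedResult(RawCall const& _call, std::string const& _newResult)
{
	return _call.rawCall + "-> " + _newResult;
}
```

Since everything left of the arrow is copied byte for byte, updating an expectation cannot disturb the author's spacing or line layout of the arguments.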

erak changed the title ~~Test file parser~~ [WIP] Test file parser. Jan 24, 2019

erak mentioned this pull request Jan 24, 2019

Extract semantic tests Pt. 1 #5736

Closed

3 tasks

chriseth reviewed Jan 24, 2019

erak force-pushed the test-file-parser branch from 2a5ebeb to 95f1063 on January 24, 2019 11:53

axic reviewed Jan 24, 2019

erak force-pushed the test-file-parser branch 2 times, most recently from 12bd280 to 980aaf2, on January 27, 2019 17:04

erak force-pushed the test-file-parser branch from 980aaf2 to 8f8f6f8 on January 27, 2019 17:07

erak changed the title ~~[WIP] Test file parser.~~ Test file parser. Jan 27, 2019

erak mentioned this pull request Jan 27, 2019

Semantic test infrastructure #5872

Merged

2 tasks

erak self-assigned this Jan 28, 2019