Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider inverting the messages dependency #1614

Open
mpkorstanje opened this issue Jun 15, 2021 · 5 comments
Open

Consider inverting the messages dependency #1614

mpkorstanje opened this issue Jun 15, 2021 · 5 comments

Comments

@mpkorstanje
Copy link
Contributor

mpkorstanje commented Jun 15, 2021

Currently everything depends on messages.

  • This happens partially because messages contains the entire domain.
  • And partially because we use messages as the Gherkin AST.

This is not great because:

  • We have a serialization format at the core of our domain rather then in an adaptor.
  • Because the Gherkin AST is the serialized format we can not utilize any patterns that make working with ASTs easier.
  • When anything in the domain changes, everything has to be updated.
  • Messages depends on Jackson. Due to it's nature as a de-serialization library Jackson gets frequent security updates.

As a visual aid (note note quite accurate, I was sketching out something else):

image

As such I would propose inverting the dependency and splitting messages into several smaller submodules. The modules are already hinted at by the comments in the proto file.

message Envelope {
oneof message {
// Gherkin
Source source = 1;
GherkinDocument gherkin_document = 2;
// Compiler(s)
Pickle pickle = 3;
//
StepDefinition step_definition = 4;
Hook hook = 5;
ParameterType parameter_type = 6;
TestCase test_case = 7;
UndefinedParameterType undefined_parameter_type = 8;
// Execution
TestRunStarted test_run_started = 9;
TestCaseStarted test_case_started = 10;
TestStepStarted test_step_started = 11;
Attachment attachment = 12;
TestStepFinished test_step_finished = 13;
TestCaseFinished test_case_finished = 14;
TestRunFinished test_run_finished = 15;
// Parsing
ParseError parse_error = 16;
Meta meta = 17;
}
}

By inverting the dependency:

  • We push the serialization format out to the edges of the architecture. This allows for proper ports and adaptors design. Only root dependents (e.g. the Cucumber CLI) would have a dependency on messages.
  • The impact of updating external dependency changes is limited to the root dependents (e.g. the Cucumber CLI).
  • Changes that impact Gherkin won't impact execution related code.and vice-versa.

image

Now I imagine the biggest perceived obstacle will be additional copying the domain objects into messages. However for example if the Gherkin AST was modelled as a tree of nodes using the visitor pattern it would be relatively straightforward to convert it to a message object.

@mattwynne
Copy link
Member

mattwynne commented Jun 15, 2021

I agree with the thrust of these observations and the proposal. I want to think a bit more about whether I agree that we need to model this as multiple subdomains (parser, compiler, glue, execution, results) in separate modules or whether those stable types at the core of our domain (and, as you say, not depend on things like Jackson) could live in a single module.

@laeubi
Copy link

laeubi commented Jun 16, 2021

But isn't it the idea of Jackson and similar frameworks to have domain objects that could be (optionally) be serialized, just to prevent creating additional domain objects for serialization?
The consequence of your proposal would be that we have some kind of layer above messages? I'm just asking because Cucumber Plugins already has such a layer but it does not contains all information there and changes seem to be hard because of the discussion about what should be API or not, so in the end I ended up using the raw messages as they are offering the greatest flexibility.
This leads me to another point where I think if this is done and should be useful for different consumers (IDE, CLI, ...) the parsers and converter has all to be public accessible.

@mpkorstanje
Copy link
Contributor Author

@mattwynne a single library would work too for the messages. As long as we don't treat the messages as our model.

@mattwynne
Copy link
Member

@mpkorstanje I agree about not treating the messages as our model, they're just DTOs. What your post has given me pause for thought about is whether we should try to define a shared core domain model/models for Cucumber though. Like that maybe messages is just filling a gap right now, that should actually be filled by a proper domain model.

@mpkorstanje
Copy link
Contributor Author

mpkorstanje commented Jun 16, 2021

Ah. Yes you are definitely right that the messages are filling in a gap of domain objects. However we should not introduce a monolithic domain. Rather each module has it's own public API and only uses another modules public API. The domain objects can be part of this public API. For example:

image

In this case we have two modules. A compiler and parser module. The Parser, Compiler, AST and Pickle classes are all part of it's public API. The AST and Pickle are also domain objects.

Internally each module can use patterns such as "Ports and Adapters" but I don't think this pattern is efficient for small libraries that do not have any input/output. Rather this is something you'd apply to utilities that use these libraries such as the Gherkin CLI because these have to deal with input and output where as the parser and compiler modules do not.

For example:

image

Now this brings us to the next problem. The current implementation has to deal with input and output just to pass the shared test suite. This was a pragmatic choice, but it also means that as a module gherkin drags in more then needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants