Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update storage api to support query operations by spanKind #1

Closed
wants to merge 3 commits into from

Conversation

guo0693
Copy link
Owner

@guo0693 guo0693 commented Nov 21, 2019

Which problem is this PR solving?

Short description of the changes

  • storage API changes to support new query parameter and return type (across all storage engines)

Copy link

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall changes look ok-ish, but I still think we should limit the blast raduis and not require filtering logic to be done in the storage, just let it return all ops for a service, and do the filtering in the QueryService only (one place).

cmd/query/app/http_handler_test.go Outdated Show resolved Hide resolved
cmd/query/app/querysvc/query_service.go Show resolved Hide resolved
plugin/storage/cassandra/spanstore/reader_test.go Outdated Show resolved Hide resolved
plugin/storage/cassandra/spanstore/reader_test.go Outdated Show resolved Hide resolved
plugin/storage/grpc/proto/storage.proto Show resolved Hide resolved
plugin/storage/grpc/shared/grpc_server.go Outdated Show resolved Hide resolved
@guo0693
Copy link
Owner Author

guo0693 commented Nov 21, 2019

overall changes look ok-ish, but I still think we should limit the blast raduis and not require filtering logic to be done in the storage, just let it return all ops for a service, and do the filtering in the QueryService only (one place).

Just want to make sure your proposal is for avoiding potential duplicate filtering logic, not for reducing the number of affected files (I assume blast radius means the number of affected files). Because as long as we touch the interface (since spanKind has to be returned in the Operation struct), all the implementations need to be changed.

GetOperations(ctx context.Context, string service) ([]*Operation, error)
GetOperations(ctx context.Context, query *OperationQueryParameters) ([]*Operation, error)
The affected files will be the same for the above two versions.
The benefit of the second method is:

  1. we can easily add more param into OperationQueryParameters without modifying so many files in the future
  2. we avoid storage plugin returning un-necessary kind of operations followed by filtering. Although the performance gain is tiny given the number of operations won't be too many, I would prefer this because filtering in query_service does not reduce the complexity of the changes.

Regarding logic duplication, if the storage support filtering in the query statement, then we are not really duplicate the filtering logic in the code.

@yurishkuro
Copy link

My point is that you're increasing the scope by requiring storage to implement filtering, which forces you to change memory storage implementation, for example. If you implement it once in the QueryService then all storage impls benefit automatically, whether they support internal filtering or not (e.g. do you know how to implement query-time filtering in Badger?)

but ok, let's do it your way.

@guo0693
Copy link
Owner Author

guo0693 commented Nov 21, 2019

My point is that you're increasing the scope by requiring storage to implement filtering, which forces you to change memory storage implementation, for example. If you implement it once in the QueryService then all storage impls benefit automatically, whether they support internal filtering or not (e.g. do you know how to implement query-time filtering in Badger?)

but ok, let's do it your way.

Those storage plugins have to be changed anyway in order to write & read the extra spanKind, they might just as well add the filtering when they make the changes since it's pretty easy to them. I will make the changes in proto and break long lines then push another commit.

Btw, I think we might need to keep the following so that old proto users don't need to change their code, otherwise they need to use the corresponding updated get or set method:

message GetOperationsResponse {
    repeated string operation = 1; // deprecated
    repeated Operation operations_v2 = 2;
}

IIRC, the backward compatibility when change names only kicks in when you unmarshall saved data into new structure.

@yurishkuro
Copy link

they might just as well add the filtering when they make the changes since it's pretty easy.

you don't know if it's easy or not, it depends on storage implementation (I wouldn't know how to make Badger return only filtered records). And if you're talking about filtering in Go code, then that's exactly the duplication I am referring to.

@@ -160,17 +167,14 @@ func TestOperationNamesStorageGetServices(t *testing.T) {
query.On("Iter").Return(iter)

s.session.On("Query", mock.AnythingOfType("string"), mock.Anything).Return(query)
services, err := s.storage.GetOperations("service-a")
services, err := s.storage.GetOperations(&spanstore.OperationQueryParameters{ServiceName: "service-a"})
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, this line of code is within the vertical 120 char line in my local intelliJ, somehow github still shows scroll bar.

Signed-off-by: Jun Guo <guo0693@gmail.com>
Signed-off-by: Jun Guo <guo0693@gmail.com>
Signed-off-by: Jun Guo <guo0693@gmail.com>
@guo0693
Copy link
Owner Author

guo0693 commented Nov 22, 2019

I need to reopen the PR with same changes against master since jaegertracing#1921-2 is merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants