Skip to content

Conversation

@wolframhaussig
Copy link
Contributor

support for CALL using JDBC escape syntax
support for MERGE

MERGE support also contains support for DB links

support for CALL using JDBC escape syntax
support for MERGE

MERGE support also contains support for DB links
@eyalkoren
Copy link
Contributor

Closes #676

support DB links on CALL, DELETE, INSERT, SELECT, UPDATE
MERGE now shows MERGE INTO instead of MERGE
@eyalkoren
Copy link
Contributor

@wolframhaussig Thanks for your contribution!!
Please let me know when you feel this is ready for review

@wolframhaussig
Copy link
Contributor Author

@eyalkoren This can be reviewed: I have added support for DB links on the other types like UPDATE, SELECT now so this is all I can do for now.

@eyalkoren
Copy link
Contributor

@axw please take a look. In general, and specifically about the @ stuff- we want to treat these as sensitive info?

@eyalkoren eyalkoren requested a review from axw June 27, 2019 13:18
@axw
Copy link
Member

axw commented Jun 28, 2019

Thanks @wolframhaussig!

MERGE looks good!

What's the reason for including the DB link in the span name? Is it necessary, given that you can still find the same information in the span context? Is it likely that you would query the same schema/table pair with multiple DB links, and not want to aggregate them?

I don't think we need to consider the DB link info as sensitive. It's just a DB link name and optional user name, and we already record those kinds of things in span context.

Regarding the JDBC escape syntax: I think it might be tidier if we do this as a pre-processing step. This is (obviously) JDBC specific, so separating it out would allow us to keep the SQL-parsing code the same for all the agents excluding the pre-processing step. For the pre-processing step, I think we'd just skip over the leading "{" and possible "? =". What do you think?

@wolframhaussig
Copy link
Contributor Author

What's the reason for including the DB link in the span name? Is it necessary, given that you can still find the same information in the span context? Is it likely that you would query the same schema/table pair with multiple DB links, and not want to aggregate them?

Our UseCase contains db links to a lot of different databases located in different locations(in germany but also worldwide). My reasoning is that accessing the same table from different locations might take longer. Also, not all locations have the same amount of data as every location has only data for their plant. So a performance issue might be not reproducable on another database which has different/less data.

As far as I know the span context cannot be searched, right? So I could not search for access to the same table in the same db link. Can we come to a compromise? Are you concerned about the parsing performance or about the DB link in the span name? It would be fine for me if the information is in the Span but not in the name so I could - in theory - filter by tag/label. Of course I could add the data manually by using the public API but maybe we can add the data as label and if performance is an issue make it configurable for the user?

Regarding the JDBC escape syntax: I am not completely against the idea of taking the code out of the sql parsing as it is jdbc specific as you said. But I do not completely agree either: I know we currently need this syntax parser only for the sql CALL but there are more uses of it so it would be more future aware if it was in the sql parser itself. I guess the most interesting after the call syntax are the outer joins. I do not have the usecase so I didn't implement it but the jdbc plugin currently fails on the following JDBC escape sequence:

  {
    "input": "SELECT * FROM {oj Countries JOIN Cities ON (Countries.country_ISO_code=Cities.country_ISO_code)}",
    "output": "SELECT FROM Countries"
  }

My code currently gives "oj" as table name, your code would give "{oj" but the real table name is of course "Countries".

What is your opinion on that?

@axw
Copy link
Member

axw commented Jul 1, 2019

As far as I know the span context cannot be searched, right? So I could not search for access to the same table in the same db link.

Right, good point. I take it then that you're performing aggregations over the span data, and not just using them in the APM UI?

Are you concerned about the parsing performance or about the DB link in the span name? It would be fine for me if the information is in the Span but not in the name so I could - in theory - filter by tag/label. Of course I could add the data manually by using the public API but maybe we can add the data as label and if performance is an issue make it configurable for the user?

I'm not concerned about performance in this instance, but usability. Perhaps not everyone will want the link in the name, and having it in the name makes it more difficult to aggregate spans for multiple locations/whatever. Perhaps that's not that important. I'm not really opposed to including it in the span name, just wanted to consider our options. @roncohen WDYT?

Regarding the JDBC escape syntax: I am not completely against the idea of taking the code out of the sql parsing as it is jdbc specific as you said. But I do not completely agree either: I know we currently need this syntax parser only for the sql CALL but there are more uses of it so it would be more future aware if it was in the sql parser itself. I guess the most interesting after the call syntax are the outer joins. I do not have the usecase so I didn't implement it but the jdbc plugin currently fails on the following JDBC escape sequence:

Ah yes, I was aware of the other cases but hadn't considered the oj case properly. I think we could just skip over the { and oj tokens, so this could be implemented as a filter on the token source rather than modifying the parser. We could deal with that if/when we handle that escape syntax, or we could just implement it like that now: wrap the Scanner with another type that is escape-syntax aware, which will keep track of the escape sequences and filter out those tokens.

@wolframhaussig
Copy link
Contributor Author

Right, good point. I take it then that you're performing aggregations over the span data, and not just using them in the APM UI?

Currently, we only want the Span name to be searchable in case of a performance problem so we can see if it is a one time problem for this db link or not. But in future we also want to show the performance data from the APM agent grouped by destinations in an own dashboard.

Ah yes, I was aware of the other cases but hadn't considered the oj case properly. I think we could just skip over the { and oj tokens, so this could be implemented as a filter on the token source rather than modifying the parser. We could deal with that if/when we handle that escape syntax, or we could just implement it like that now: wrap the Scanner with another type that is escape-syntax aware, which will keep track of the escape sequences and filter out those tokens.

I am fine with that. I will wait with the change until we have completed the discussion regarding the db link topic.

@roncohen
Copy link

roncohen commented Jul 1, 2019

Maybe a good compromise would be to parse the SQL so the span names don't contain the DB LINK parts and then instead store that in something like the destination fields? They should get indexed and are searchable.

That should make it possible to create visualizations that shows whether a specific DB LINK location has problems for each SQL query or as a whole.

remove db link from Span name

TODO: define field where db link should be stored
remove JDBC escape logic from Scanner
@wolframhaussig
Copy link
Contributor Author

Maybe a good compromise would be to parse the SQL so the span names don't contain the DB LINK parts and then instead store that in something like the destination fields? They should get indexed and are searchable.

Sounds great but how do I do that? I have removed the db link from the Span name but I am lost where I should store it instead. Do we need to add a field to the co.elastic.apm.agent.impl.transaction.Db class?

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great but how do I do that? I have removed the db link from the Span name but I am lost where I should store it instead. Do we need to add a field to the co.elastic.apm.agent.impl.transaction.Db class?

I think that's what we'll need to do, but we will also need to come up with changes to the schema to make room for it.

I would suggest going ahead with the code changes apart from actually recording the db link name. We'll open an issue to discuss where it should go, and then we can come back and update the code to record it. That way the hard bits are done when we come to the decision, and it'll be quick to update.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wolframhaussig I'm happy with this approach now!

My Java is rusty, so I'll leave it to @eyalkoren or @felixbarny to finish off the review. I'll open an issue soon to discuss where the dblink value should be stored, and then we can follow up to use the extracted value.

…/agent/jdbc/signature/SignatureParser.java

Co-Authored-By: Felix Barnsteiner <felixbarny@users.noreply.github.com>
@codecov-io
Copy link

codecov-io commented Jul 12, 2019

Codecov Report

Merging #696 into master will increase coverage by 1.49%.
The diff coverage is 77.3%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #696      +/-   ##
============================================
+ Coverage     62.32%   63.82%   +1.49%     
+ Complexity     1378       68    -1310     
============================================
  Files           199      205       +6     
  Lines          7838     8254     +416     
  Branches        973     1056      +83     
============================================
+ Hits           4885     5268     +383     
- Misses         2652     2673      +21     
- Partials        301      313      +12
Impacted Files Coverage Δ Complexity Δ
...ent/jaxws/JaxWsTransactionNameInstrumentation.java 70.58% <ø> (ø) 0 <0> (ø) ⬇️
...lastic/apm/agent/report/ReporterConfiguration.java 100% <ø> (ø) 0 <0> (-12) ⬇️
.../plugin/api/CaptureTransactionInstrumentation.java 39.13% <ø> (ø) 0 <0> (-6) ⬇️
...duled/ScheduledTransactionNameInstrumentation.java 46.15% <ø> (ø) 0 <0> (ø) ⬇️
...tic/apm/agent/configuration/CoreConfiguration.java 97.64% <ø> (-0.07%) 0 <0> (-20)
...co/elastic/apm/agent/jaxrs/JaxRsConfiguration.java 100% <100%> (ø) 0 <0> (ø) ⬇️
.../co/elastic/apm/agent/impl/payload/SystemInfo.java 79.41% <100%> (ø) 0 <0> (-26) ⬇️
...lastic/apm/agent/impl/transaction/Transaction.java 85.71% <100%> (-0.65%) 0 <0> (-27)
.../apm/agent/report/serialize/DslJsonSerializer.java 86.48% <100%> (+2.95%) 0 <0> (-144) ⬇️
...m/agent/jms/JmsMessageProducerInstrumentation.java 44.18% <100%> (+44.18%) 0 <0> (ø) ⬇️
... and 47 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4af1d3e...b73fdd0. Read the comment docs.

@felixbarny
Copy link
Member

Could you format the code you edited? It's a mix of tabs and spaces (we use only spaces) and there are some spaces missing after keywords like if( instead of if (. Try not to format code you did not touch, so that git blame still works.

After that, we can merge 🙂

@felixbarny
Copy link
Member

Sorry to bother you with formatting again but SignatureParser and the other classes you have modified would need some formatting too. Just make sure you don't format lines you didn't work on. If it's too fiddly with your formatter it's no big deal. Just tell me and I'll do the formatting 🙂

@wolframhaussig
Copy link
Contributor Author

I think I fixed all places I modified now

@felixbarny
Copy link
Member

I found some more places. Can you merge https://github.com/wolframhaussig/apm-agent-java/pull/2?

@felixbarny felixbarny merged commit 8e44cb5 into elastic:master Jul 18, 2019
@felixbarny
Copy link
Member

Thanks again, great work!

@wolframhaussig wolframhaussig deleted the 676-further-SQL-parsing-improvement branch July 19, 2019 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants