further SQL parsing improvement (#676) #696

wolframhaussig · 2019-06-26T12:58:57Z

support for CALL using JDBC escape syntax
support for MERGE

MERGE support also contains support for DB links

eyalkoren · 2019-06-26T14:21:06Z

Closes #676

support DB links on CALL, DELETE, INSERT, SELECT, UPDATE MERGE now shows MERGE INTO instead of MERGE

eyalkoren · 2019-06-27T10:45:36Z

@wolframhaussig Thanks for your contribution!!
Please let me know when you feel this is ready for review

wolframhaussig · 2019-06-27T10:55:17Z

@eyalkoren This is ready for review. I have now added support for DB links on the other statement types, such as UPDATE and SELECT, so this is all I can do for now.

eyalkoren · 2019-06-27T13:17:58Z

@axw please take a look. In general, and specifically regarding the @ (DB link) handling: do we want to treat these as sensitive info?

axw · 2019-06-28T09:10:15Z

Thanks @wolframhaussig!

MERGE looks good!

What's the reason for including the DB link in the span name? Is it necessary, given that you can still find the same information in the span context? Is it likely that you would query the same schema/table pair with multiple DB links, and not want to aggregate them?

I don't think we need to consider the DB link info as sensitive. It's just a DB link name and optional user name, and we already record those kinds of things in span context.

Regarding the JDBC escape syntax: I think it might be tidier if we do this as a pre-processing step. This is (obviously) JDBC specific, so separating it out would allow us to keep the SQL-parsing code the same for all the agents excluding the pre-processing step. For the pre-processing step, I think we'd just skip over the leading "{" and possible "? =". What do you think?
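
A minimal sketch of the pre-processing step described above, assuming a hypothetical helper class `JdbcEscapePreprocessor` (the name and method are illustrative and not part of the agent). It only finds the offset at which SQL parsing could start, skipping a leading "{" and an optional "? =":

```java
// Hypothetical helper for illustration only; not the agent's actual code.
final class JdbcEscapePreprocessor {

    private JdbcEscapePreprocessor() {
    }

    /**
     * Returns the index at which regular SQL parsing can start.
     * Skips a leading '{' and an optional "? =" (as in "{? = call proc(?)}").
     * Returns 0 when the statement does not start with a JDBC escape.
     */
    static int skipEscapePrefix(String sql) {
        int i = skipWhitespace(sql, 0);
        if (i >= sql.length() || sql.charAt(i) != '{') {
            return 0; // plain SQL, nothing to skip
        }
        i = skipWhitespace(sql, i + 1);
        // Optional return-value placeholder: '?' followed by '='
        if (i < sql.length() && sql.charAt(i) == '?') {
            int j = skipWhitespace(sql, i + 1);
            if (j < sql.length() && sql.charAt(j) == '=') {
                i = skipWhitespace(sql, j + 1);
            }
        }
        return i;
    }

    private static int skipWhitespace(String sql, int from) {
        int i = from;
        while (i < sql.length() && Character.isWhitespace(sql.charAt(i))) {
            i++;
        }
        return i;
    }
}
```

For `{? = call get_total(?)}` the returned offset points at `call`, so the SQL parser itself never sees the escape prefix. The closing brace is left in place; this sketch only handles the prefix.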

wolframhaussig · 2019-06-28T09:33:22Z

> What's the reason for including the DB link in the span name? Is it necessary, given that you can still find the same information in the span context? Is it likely that you would query the same schema/table pair with multiple DB links, and not want to aggregate them?

Our use case involves DB links to many different databases in different locations (in Germany but also worldwide). My reasoning is that accessing the same table from different locations might take longer. Also, the locations do not all hold the same amount of data, as each one only has the data for its own plant. So a performance issue might not be reproducible on another database that has different or less data.

As far as I know, the span context cannot be searched, right? So I could not search for accesses to the same table over the same DB link. Can we come to a compromise? Are you concerned about the parsing performance or about the DB link in the span name? It would be fine for me if the information is in the span but not in the name, so that I could, in theory, filter by tag/label. Of course, I could add the data manually using the public API, but maybe we can add the data as a label and, if performance is an issue, make it configurable for the user?

Regarding the JDBC escape syntax: I am not completely against moving the code out of the SQL parsing, since it is JDBC-specific, as you said. But I do not completely agree either: I know we currently need this syntax parser only for the SQL CALL, but the escape syntax has more uses, so it would be more future-proof if it were in the SQL parser itself. I guess the most interesting one after the call syntax is outer joins. I do not have the use case, so I didn't implement it, but the JDBC plugin currently fails on the following JDBC escape sequence:

  {
    "input": "SELECT * FROM {oj Countries JOIN Cities ON (Countries.country_ISO_code=Cities.country_ISO_code)}",
    "output": "SELECT FROM Countries"
  }

My code currently gives "oj" as the table name and your code would give "{oj", but the real table name is of course "Countries".

What is your opinion on that?

axw · 2019-07-01T02:07:09Z

> As far as I know, the span context cannot be searched, right? So I could not search for accesses to the same table over the same DB link.

Right, good point. I take it then that you're performing aggregations over the span data, and not just using them in the APM UI?

> Are you concerned about the parsing performance or about the DB link in the span name? It would be fine for me if the information is in the span but not in the name, so that I could, in theory, filter by tag/label. Of course, I could add the data manually using the public API, but maybe we can add the data as a label and, if performance is an issue, make it configurable for the user?

I'm not concerned about performance in this instance, but usability. Perhaps not everyone will want the link in the name, and having it in the name makes it more difficult to aggregate spans for multiple locations/whatever. Perhaps that's not that important. I'm not really opposed to including it in the span name, just wanted to consider our options. @roncohen WDYT?

> Regarding the JDBC escape syntax: I am not completely against moving the code out of the SQL parsing, since it is JDBC-specific, as you said. But I do not completely agree either: I know we currently need this syntax parser only for the SQL CALL, but the escape syntax has more uses, so it would be more future-proof if it were in the SQL parser itself. I guess the most interesting one after the call syntax is outer joins. I do not have the use case, so I didn't implement it, but the JDBC plugin currently fails on the following JDBC escape sequence:

Ah yes, I was aware of the other cases but hadn't considered the oj case properly. I think we could just skip over the { and oj tokens, so this could be implemented as a filter on the token source rather than modifying the parser. We could deal with that if/when we handle that escape syntax, or we could just implement it like that now: wrap the Scanner with another type that is escape-syntax aware, which will keep track of the escape sequences and filter out those tokens.
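
The token-filter idea can be sketched roughly as follows. Everything here is an assumption for illustration: `TokenSource`, `SimpleTokenSource` and `EscapeFilteringTokenSource` are hypothetical names, and the whitespace-based tokenizer is only a stand-in for the real Scanner. The filter drops the braces and an `oj` keyword that directly follows an opening brace, so the parser downstream needs no changes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical token source; returns null when the input is exhausted.
interface TokenSource {
    String nextToken();
}

// Stand-in tokenizer: splits on whitespace and treats braces as separate tokens.
final class SimpleTokenSource implements TokenSource {
    private final Iterator<String> tokens;

    SimpleTokenSource(String sql) {
        List<String> list = new ArrayList<>();
        String spaced = sql.replace("{", " { ").replace("}", " } ").trim();
        for (String part : spaced.split("\\s+")) {
            if (!part.isEmpty()) {
                list.add(part);
            }
        }
        this.tokens = list.iterator();
    }

    @Override
    public String nextToken() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}

// Wraps another token source and hides JDBC escape tokens from the parser.
final class EscapeFilteringTokenSource implements TokenSource {
    private final TokenSource delegate;
    private boolean afterOpenBrace;

    EscapeFilteringTokenSource(TokenSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public String nextToken() {
        String token;
        while ((token = delegate.nextToken()) != null) {
            if (token.equals("{")) {
                afterOpenBrace = true;   // remember that an escape just started
                continue;
            }
            if (token.equals("}")) {
                afterOpenBrace = false;
                continue;
            }
            if (afterOpenBrace) {
                afterOpenBrace = false;
                if (token.equalsIgnoreCase("oj")) {
                    continue;            // skip the outer-join escape keyword
                }
            }
            return token;
        }
        return null;
    }
}
```

With the outer-join example from above, the token that follows `FROM` is `Countries` rather than `{oj` or `oj`.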

wolframhaussig · 2019-07-01T06:26:52Z

> Right, good point. I take it then that you're performing aggregations over the span data, and not just using them in the APM UI?

Currently, we only want the span name to be searchable so that, when a performance problem occurs, we can see whether it is a one-time problem for this DB link or not. In the future, we also want to show the performance data from the APM agent, grouped by destination, in a dedicated dashboard.

> Ah yes, I was aware of the other cases but hadn't considered the oj case properly. I think we could just skip over the { and oj tokens, so this could be implemented as a filter on the token source rather than modifying the parser. We could deal with that if/when we handle that escape syntax, or we could just implement it like that now: wrap the Scanner with another type that is escape-syntax aware, which will keep track of the escape sequences and filter out those tokens.

I am fine with that. I will hold off on the change until we have finished the discussion about the DB link topic.

roncohen · 2019-07-01T09:11:12Z

Maybe a good compromise would be to parse the SQL so the span names don't contain the DB LINK parts and then instead store that in something like the destination fields? They should get indexed and are searchable.

That should make it possible to create visualizations that show whether a specific DB LINK location has problems, either for each SQL query or as a whole.
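
To illustrate the compromise, here is a minimal sketch that separates the DB link from the table name so the span name stays free of it. `TableRef` is a hypothetical class, and the sketch assumes the link is written as `table@dblink`. Where the extracted link would finally be stored is not decided in this thread, so the sketch only keeps it in a nullable field:

```java
// Hypothetical value class for illustration; not part of the agent.
final class TableRef {
    final String table;
    final String dbLink; // null when the identifier has no DB link

    private TableRef(String table, String dbLink) {
        this.table = table;
        this.dbLink = dbLink;
    }

    /** Splits "table@dblink" into its two parts; the link part is optional. */
    static TableRef parse(String identifier) {
        int at = identifier.indexOf('@');
        if (at < 0) {
            return new TableRef(identifier, null);
        }
        return new TableRef(identifier.substring(0, at), identifier.substring(at + 1));
    }

    /** Builds a span name that deliberately omits the DB link. */
    String spanName(String operation) {
        return operation + " " + table;
    }
}
```

For example, `TableRef.parse("employees@remote_db").spanName("SELECT FROM")` yields `SELECT FROM employees`, while the `dbLink` field holds `remote_db` and can be recorded in a separate, searchable field once one is defined.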

remove db link from Span name TODO: define field where db link should be stored

remove JDBC escape logic from Scanner

wolframhaussig · 2019-07-01T13:49:34Z

> Maybe a good compromise would be to parse the SQL so the span names don't contain the DB LINK parts and then instead store that in something like the destination fields? They should get indexed and are searchable.

Sounds great but how do I do that? I have removed the db link from the Span name but I am lost where I should store it instead. Do we need to add a field to the co.elastic.apm.agent.impl.transaction.Db class?

axw

> Sounds great but how do I do that? I have removed the db link from the Span name but I am lost where I should store it instead. Do we need to add a field to the co.elastic.apm.agent.impl.transaction.Db class?

I think that's what we'll need to do, but we will also need to come up with changes to the schema to make room for it.

I would suggest going ahead with the code changes apart from actually recording the db link name. We'll open an issue to discuss where it should go, and then we can come back and update the code to record it. That way the hard bits are done when we come to the decision, and it'll be quick to update.

reworked db link logic

axw

Thanks @wolframhaussig, I'm happy with this approach now!

My Java is rusty, so I'll leave it to @eyalkoren or @felixbarny to finish off the review. I'll open an issue soon to discuss where the dblink value should be stored, and then we can follow up to use the extracted value.

codecov-io · 2019-07-12T13:48:13Z

Codecov Report

Merging #696 into master will increase coverage by 1.49%.
The diff coverage is 77.3%.

@@             Coverage Diff              @@
##             master     #696      +/-   ##
============================================
+ Coverage     62.32%   63.82%   +1.49%     
+ Complexity     1378       68    -1310     
============================================
  Files           199      205       +6     
  Lines          7838     8254     +416     
  Branches        973     1056      +83     
============================================
+ Hits           4885     5268     +383     
- Misses         2652     2673      +21     
- Partials        301      313      +12

Impacted Files	Coverage Δ	Complexity Δ
...ent/jaxws/JaxWsTransactionNameInstrumentation.java	`70.58% <ø> (ø)`	`0 <0> (ø)`	⬇️
...lastic/apm/agent/report/ReporterConfiguration.java	`100% <ø> (ø)`	`0 <0> (-12)`	⬇️
.../plugin/api/CaptureTransactionInstrumentation.java	`39.13% <ø> (ø)`	`0 <0> (-6)`	⬇️
...duled/ScheduledTransactionNameInstrumentation.java	`46.15% <ø> (ø)`	`0 <0> (ø)`	⬇️
...tic/apm/agent/configuration/CoreConfiguration.java	`97.64% <ø> (-0.07%)`	`0 <0> (-20)`
...co/elastic/apm/agent/jaxrs/JaxRsConfiguration.java	`100% <100%> (ø)`	`0 <0> (ø)`	⬇️
.../co/elastic/apm/agent/impl/payload/SystemInfo.java	`79.41% <100%> (ø)`	`0 <0> (-26)`	⬇️
...lastic/apm/agent/impl/transaction/Transaction.java	`85.71% <100%> (-0.65%)`	`0 <0> (-27)`
.../apm/agent/report/serialize/DslJsonSerializer.java	`86.48% <100%> (+2.95%)`	`0 <0> (-144)`	⬇️
...m/agent/jms/JmsMessageProducerInstrumentation.java	`44.18% <100%> (+44.18%)`	`0 <0> (ø)`	⬇️
... and 47 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4af1d3e...b73fdd0. Read the comment docs.

make dbLink nullable minor formatting fix

felixbarny · 2019-07-17T08:23:00Z

Could you format the code you edited? It's a mix of tabs and spaces (we use only spaces) and there are some spaces missing after keywords like if( instead of if (. Try not to format code you did not touch, so that git blame still works.

After that, we can merge 🙂

felixbarny · 2019-07-17T09:15:18Z

Sorry to bother you with formatting again but SignatureParser and the other classes you have modified would need some formatting too. Just make sure you don't format lines you didn't work on. If it's too fiddly with your formatter it's no big deal. Just tell me and I'll do the formatting 🙂

wolframhaussig · 2019-07-17T11:12:19Z

I think I have now fixed all the places I modified.

felixbarny · 2019-07-18T09:24:16Z

I found some more places. Can you merge https://github.com/wolframhaussig/apm-agent-java/pull/2?

felixbarny · 2019-07-18T09:47:10Z

Thanks again, great work!

further SQL parsing improvement (#676)

5e65a48

support for CALL using JDBC escape syntax support for MERGE MERGE support also contains support for DB links

eyalkoren added the [zube]: In Progress label Jun 26, 2019

Merge remote-tracking branch 'upstream/master' into 676-further-SQL-p…

14b3c0c

…arsing-improvement

further SQL parsing improvement (#676)

af23e5d

support DB links on CALL, DELETE, INSERT, SELECT, UPDATE MERGE now shows MERGE INTO instead of MERGE

eyalkoren requested a review from axw June 27, 2019 13:18

eyalkoren added [zube]: In Review and removed [zube]: In Progress labels Jun 30, 2019

wolframhaussig added 2 commits July 1, 2019 14:17

further SQL parsing improvement (#676)

ac11700

remove db link from Span name TODO: define field where db link should be stored

further SQL parsing improvement (#676)

a8b4266

remove JDBC escape logic from Scanner

axw requested changes Jul 2, 2019

...ugins/apm-jdbc-plugin/src/main/java/co/elastic/apm/agent/jdbc/signature/SignatureParser.java (outdated, resolved)

...ugins/apm-jdbc-plugin/src/main/java/co/elastic/apm/agent/jdbc/signature/SignatureParser.java (outdated, resolved)

further SQL parsing improvement (#676)

d8f6395

reworked db link logic

axw approved these changes Jul 4, 2019

This was referenced Jul 4, 2019

Proposal: add Database Link to span context elastic/apm#107

Open

[Agents] Oracle SQL parsing improvements elastic/apm#108

Open

eyalkoren assigned felixbarny Jul 8, 2019

felixbarny reviewed Jul 12, 2019

Update apm-agent-plugins/apm-jdbc-plugin/src/main/java/co/elastic/apm…

6a0a0b4

…/agent/jdbc/signature/SignatureParser.java Co-Authored-By: Felix Barnsteiner <felixbarny@users.noreply.github.com>

wolframhaussig and others added 2 commits July 15, 2019 08:19

further SQL parsing improvement (#676)

14486fb

make dbLink nullable minor formatting fix

Simplify JDBC escape filtering and eliminate allocations

e323eb2

felixbarny and others added 5 commits July 16, 2019 18:18

Make filter final

705c49c

Merge pull request #1 from felixbarny/676-further-SQL-parsing-improve…

10e9ff2

…ment Simplify JDBC escape filtering and eliminate allocations

added edge case testcase

183bc0c

Merge branch '676-further-SQL-parsing-improvement' of https://github.…

c08dd89

…com/wolframhaussig/apm-agent-java into 676-further-SQL-parsing-improvement

fix edge case when detecting procedure name using jdbc escape syntax

19b0a3b

formatting

15cacb9

formatting

0ce009b

formatting

20515ba

Merge pull request #2 from felixbarny/676-further-SQL-parsing-improve…

b73fdd0

…ment formatting

felixbarny approved these changes Jul 18, 2019

felixbarny merged commit 8e44cb5 into elastic:master Jul 18, 2019

zube bot added [zube]: Done and removed [zube]: In Review labels Jul 18, 2019

wolframhaussig deleted the 676-further-SQL-parsing-improvement branch July 19, 2019 12:47

felixbarny removed the [zube]: Done label Jul 22, 2019

6 participants
