Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtins: add builtin to retrieve the payload(s) for a span. #60616

Merged
merged 1 commit into from
Feb 18, 2021

Conversation

angelapwen
Copy link

@angelapwen angelapwen commented Feb 16, 2021

Part of addressing #55733

The payloads_for_span builtin retrieves all payloads for
a given span ID, given that the span is part of an active trace.
The payloads are returned in JSONB format. If the span is not
found, or if the span does not have any payloads, the builtin
returns an empty JSON object.

With the appropriate usage of this builtin and the
crdb_internal.trace_id builtin as shown in the contention_event
logic test, all payloads for the current trace may be surfaced.

Release note (sql change): add payloads_for_span builtin that
takes in a span ID and returns its paylods in JSONB format. If
the span is not found, or if the span does not have any payloads,
the builtin returns an empty JSON object.

@angelapwen angelapwen added the do-not-merge bors won't merge a PR with this label. label Feb 16, 2021
@angelapwen angelapwen self-assigned this Feb 16, 2021
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@angelapwen angelapwen changed the title [WIP] Add payload builtin builtins: add builtin to retrieve the payload(s) for a span. Feb 16, 2021
@angelapwen angelapwen removed the do-not-merge bors won't merge a PR with this label. label Feb 16, 2021
@angelapwen angelapwen requested review from tbg and knz February 16, 2021 17:18
craig bot pushed a commit that referenced this pull request Feb 17, 2021
60512: backupccl: update backup detached tests with retry execution under transaction r=pbardea a=Elliebababa

update backup and restore detached tests with retry execution under transaction

Previously, backup and restore detached tests do not retry if transactions fail. They could be flaky tests and fail in CI check.

With this PR, backup and restore detached tests under transaction will not be aborted by retry errors.

Release notes: None.

60594: build: explicitly add unzip as a dependency r=rickystewart a=ulfjack

The `unzip` package is not installed by default in Ubuntu. It's currently pulled in as an implicit dependency of the Bazel package, but that breaks if Bazel ever removes that dependency (or when changing the Dockerfile to not install Bazel).

60648: distsql: improve test harness for processors against operators r=yuzefovich a=yuzefovich

Previously, when comparing the output of processors and operators, if
any order is allowed, we would perform a quadratic search. This is
suboptimal, and this commit improves the situation by sorting both
outputs first and then using the ordered comparison.

Additionally, this commit fixes a rare flake when seemingly the same
outputs would result in a mismatch. I could somewhat reliably reproduce
it before the patch and cannot reproduce it after the patch. It is
likely to do something with the float comparison, but I didn't get to
the bottom of the flake, yet I don't think it's worth it given that this
commit improves the situation anyway.

Fixes: #60608.

Release note: None

60666: util/tracing: re-key activeSpans map on span ID r=irfansharif a=angelapwen

Note that the changes in this PR should be used in #60616 to more easily retrieve a span given its span ID, without having to visit all spans and check for a match on span ID. Whichever PR lands later should update the sections of code marked as #TODOs in #60616.

Previously, a tracer's activeSpans map was keyed on the memory
address of the span itself. This commit keys the map on the span ID
(which is deterministically unique) and sets the corresponding
value in the map to be the memory address of the span itself. This
allows us to continue to visit all active spans via the map, but
also easily retrieve a span from its ID using the map.

Release note: None

60679: licenses: Update CCL text r=petermattis a=bdarnell

Remove a dead link and clarify related text

Release note: None


60681: sql: fix spacing for comments around virtual schema r=irfansharif a=irfansharif

Release note: None

Co-authored-by: elliebababa <ellie24.huang@gmail.com>
Co-authored-by: Ulf Adams <ulfjack@users.noreply.github.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: angelapwen <angelaw@cockroachlabs.com>
Co-authored-by: Ben Darnell <ben@cockroachlabs.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Holding off on further review until we introduce a tabular for for span payloads)

pkg/sql/sem/builtins/builtins.go Outdated Show resolved Hide resolved
pkg/sql/sem/builtins/builtins.go Outdated Show resolved Hide resolved
pkg/sql/sem/builtins/builtins.go Outdated Show resolved Hide resolved
pkg/sql/sem/builtins/builtins.go Outdated Show resolved Hide resolved
@angelapwen
Copy link
Author

I need to rebase on top of #60666 and then push up my newest changes that use the activeSpans map rather than VisitSpans(). Please hold 😸

@angelapwen
Copy link
Author

Ok, have rebased and resolved comments 🙆‍♀️ if #59992 is merged soon I'm inclined to fix up the commits here and merge what we have, then work on improving the output of the built-in to return a table separately. Otherwise I'll also do that in this PR.

@irfansharif
Copy link
Contributor

I'd personally prefer to iterate right here, especially if we're considering changing the output form anyway. I imagine it'll end up undoing most of what was introduced in the last commit.

@angelapwen
Copy link
Author

angelapwen commented Feb 18, 2021

@irfansharif ah, ignore the previous comment here — I'd thought you meant iterate over the spans but you meant iterate on the PR. 😸 That sounds fine to me.

@angelapwen angelapwen force-pushed the add-payload-builtin branch 2 times, most recently from 33d3166 to b454338 Compare February 18, 2021 10:27
@angelapwen
Copy link
Author

I've now rebased off of master now that #59992 is merged ⚡ and removed the num_payloads column of the node_inflight_trace_spans vtable that #59992 introduced.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is OK as a transition point. Ok to merge this after you spell out what's going to happen next.

Another confirmation that the JSON array approach is defective is that it does not perform memory monitoring and so the accumulation in RAM can cause the node to crash with an OOM error.

This would be alleviated by a streaming approach, or adding memory monitoring to the current implementation).

Reviewed 15 of 15 files at r12.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)

@angelapwen
Copy link
Author

angelapwen commented Feb 18, 2021

Thanks @knz ✨ To make sure I understand correctly, when I make the modification so that the built-in returns a table, there should not be an OOM error because the value generator functions stream values into rows default?

As for what's going to happen next:

  • Update the payloads_for_spans builtin to return not a JSONB Array but instead a table using the generator functions. Each row of the table would represent a single payload attached to the span. This table should include columns including at least type and JSON so that a user would be able to filter off of a certain type (or other attribute).
  • After this, add a pretty column to the table returned and have the JSON pretty-printed as the value.

Copy link
Author

@angelapwen angelapwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)


pkg/sql/sem/builtins/builtins.go, line 3611 at r11 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

Not anymore! Thanks, removing locally.

Done.


pkg/sql/sem/builtins/builtins.go, line 3625 at r11 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

Yep, you're right, this was a vestige of the string version we had played around with! Removing.

Done.


pkg/sql/sem/builtins/builtins.go, line 3629 at r11 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

You're right, though now that I am not going to use the VisitSpans method (because I can use the activeSpans map) this is no longer relevant.

Done.


pkg/sql/sem/builtins/builtins.go, line 3634 at r11 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

It doesn't ever evaluate to true — this is why I had never hit the panic when NewDJSON is passed nil yesterday. I'm removing this and in the next version where I get the span from its ID in the map, will return an empty JSON array if the span ID isn't found.

Done.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a pretty column to the table returned and have the JSON pretty-printed as the value.

thanks, can you explain this point more?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @knz, and @tbg)

@angelapwen
Copy link
Author

@knz I meant that the built-in's table returned could also have a column called pretty whose value is the pretty-printed version of the JSON blob (in text format). We would want both this column and the column that returns the actual JSONB value so that a user could also use the JSONB functions on this column.

By the way, I was thinking — should the builtins for trace_id and payloads_for_span require that a user is admin? If so I can add those checks to this PR.

@irfansharif
Copy link
Contributor

By the way, I was thinking — should the builtins for trace_id and payloads_for_span require that a user is admin? If so I can add those checks to this PR.

Yup.

The `crdb_internal.payloads_for_span` builtin retrieves all
payloads for a given span ID, given that the span is part of an
active trace. The payloads are returned in JSONB format. If the
span is not found, or if the span does not have any payloads, the
builtin returns an empty JSON object.

With the appropriate usage of this builtin and the
`crdb_internal.trace_id` builtin as shown in the `contention_event`
logic test, all payloads for the current trace may be surfaced.

Release note (sql change): add `payloads_for_span` builtin that
takes in a span ID and returns its paylods in JSONB format. If
the span is not found, or if the span does not have any payloads,
the builtin returns an empty JSON object.
@angelapwen
Copy link
Author

Added admin checks for the trace_id and payloads_for_span builtins 😺

@angelapwen
Copy link
Author

bors r=knz,irfansharif

Thanks for the reviews 😄

@craig
Copy link
Contributor

craig bot commented Feb 18, 2021

This PR was included in a batch that was canceled, it will be automatically retried

@craig
Copy link
Contributor

craig bot commented Feb 18, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 18, 2021

Build succeeded:

@craig craig bot merged commit d67d6cc into cockroachdb:master Feb 18, 2021
@angelapwen angelapwen deleted the add-payload-builtin branch February 19, 2021 09:52
craig bot pushed a commit that referenced this pull request Feb 22, 2021
60784: builtins: add generator builtin for span payloads r=irfansharif,knz a=angelapwen

Following up from #60616. Part of addressing #55733.

Previously we introduced a `crdb_internal.payloads_for_span`
builtin that returned a JSONB array representing all payloads for
a particular span. To improve the user experience of the builtin
as well as prevent OOMs resulting from builtin usage, we replace it
with a generator builtin that returns a table representing all payloads
for a span, where each row represents a single payload.

The new builtin, also called `crdb_internal.payloads_for_span`, has
columns for the `payload_type` so that the user has the ability to
easily filter on the type, and `payload_jsonb` so the user can use
jsonb builtins to filter further.

Release note (sql change): Update `crdb_internal.payloads_for_span`
builtin to return a table instead of a JSONB array. Each row of the
table represents one payload for the given span. It has columns for
`payload_type` and `payload_jsonb`.

Co-authored-by: angelapwen <angelaw@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants