Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Analytics): Support for Timeseries Aggregated Statistics #3207

Merged
merged 33 commits into from
Sep 15, 2021

Conversation

rslanka
Copy link
Contributor

@rslanka rslanka commented Sep 7, 2021

Adds support for doing aggregations over time-series aspect fields, and migrates the existing usage stats implementation on top this feature. The key changes include

  • two new annotations @TimeseriesField and @TimeseriesFieldCollection to the PDL for defining the aggregatable fields
  • necessary elastic index generation logic
  • a generalized query end-point via the new analytics end-point.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

Copy link
Contributor

@dexter-mh-lee dexter-mh-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a string grouping and then date grouping, will it work?

In general, we need more docs (Logic overview, API examples, etc.). Maybe add them in the next PR!

import static com.linkedin.metadata.entity.ebean.EbeanUtils.parseSystemMetadata;
import static com.linkedin.metadata.entity.ebean.EbeanUtils.toAspectRecord;
import static com.linkedin.metadata.entity.ebean.EbeanUtils.toJsonAspect;
import static com.linkedin.metadata.entity.ebean.EbeanUtils.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert this file? Seems like the only change is on the comments. Let's just change that and make it clear what we are changing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static imports change in this file is a result of running the linkedin code formatter on this file!

@rslanka
Copy link
Contributor Author

rslanka commented Sep 8, 2021

If we have a string grouping and then date grouping, will it work?
@dexter-mh-lee, Yes, it would work! The only limitation on the grouping buckets is that if a date grouping bucket is specified, it is allowed only on the timestampMillis field because of the way the index is built.

In general, we need more docs (Logic overview, API examples, etc.). Maybe add them in the next PR!
Sure, will ship them in the next PR.

DEFAULT,
/**
* Skip annotations on aspect record fields, only
* parse entity + aspect annotations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: might be useful to describe briefly why we need this mode

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit c418bc8 into datahub-project:master Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants