Web console: use new sampler features #14017

Merged: 14 commits, Apr 7, 2023


@vogievetsky vogievetsky commented Apr 3, 2023

This is the UI part that follows #13653 and #13900.

This simplifies the data loader code by using schema discovery and also adds support for the Kafka input format.

This PR also:

  • fixes the docs, replacing the incorrect usage of headerLabelPrefix with headerColumnPrefix (that one tripped me up)
  • adds much better mock data to the data loader tables, thereby greatly increasing the size of the snapshots
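
For readers unfamiliar with the Kafka input format mentioned above, here is a minimal sketch of what such a format object might look like and how headerColumnPrefix is applied. The KafkaInputFormat interface and the headerColumnName helper are hypothetical names made up for illustration, not the web console's own types; the property names and the 'kafka.header.' default follow my reading of the Druid docs and should be checked against them.

```typescript
// Hypothetical, simplified shape of a Kafka input format (illustration only).
interface KafkaInputFormat {
  type: 'kafka';
  valueFormat: { type: string };
  headerFormat?: { type: string };
  keyFormat?: { type: string };
  headerColumnPrefix?: string; // the property the docs fix refers to
  keyColumnName?: string;
  timestampColumnName?: string;
}

// Example values are assumptions chosen to mirror the documented defaults.
const kafkaInputFormat: KafkaInputFormat = {
  type: 'kafka',
  valueFormat: { type: 'json' },
  headerFormat: { type: 'string' },
  headerColumnPrefix: 'kafka.header.',
  keyColumnName: 'kafka.key',
  timestampColumnName: 'kafka.timestamp',
};

// Hypothetical helper: the column name under which a Kafka header would appear.
function headerColumnName(format: KafkaInputFormat, header: string): string {
  const prefix =
    format.headerColumnPrefix !== undefined ? format.headerColumnPrefix : 'kafka.header.';
  return prefix + header;
}

const encodingColumn = headerColumnName(kafkaInputFormat, 'encoding');
```

With the values above, a Kafka header named encoding would surface as a column prefixed by headerColumnPrefix, which is why using the wrong property name (headerLabelPrefix) in a spec would not have the intended effect.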

@clintropolis clintropolis added this to the 26.0 milestone Apr 5, 2023

@clintropolis clintropolis left a comment

thanks 🤘

Comment on lines 673 to 675
// const ingestionSpec: IngestionSpec = {
// type: 'index_parallel',
// spec: {

Are these commented-out blocks supposed to be here?

Comment on lines 52 to 68
  { type: 'json', name: 'spend', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'string', name: 'id', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'json', name: 'tags', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'json', name: 'nums', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
],
physicalDimensions: [
  { type: 'json', name: 'user', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  {
    type: 'json',
    name: 'followers',
    multiValueHandling: 'SORTED_ARRAY',
    createBitmapIndex: true,
  },
  { type: 'json', name: 'spend', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'json', name: 'id', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'json', name: 'tags', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },
  { type: 'json', name: 'nums', multiValueHandling: 'SORTED_ARRAY', createBitmapIndex: true },

I suppose I changed this underneath you in #14014, but it might be nice to update this at some point. (I also have a bug to fix here: right now the sampler shows the new 'auto' type in the physical schema but still has 'json' in the logical schema, so I think it's OK to hold off on updating this.)
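
To make the logical/physical discrepancy described above concrete, here is a small sketch that lists columns whose type differs between the two schemas. The SimpleDimension type, the findTypeMismatches helper, and the sample values are all hypothetical and exist only for illustration; this is not code from the PR or from the sampler.

```typescript
// Hypothetical, trimmed-down dimension shape (illustration only).
interface SimpleDimension {
  type: string;
  name: string;
}

interface TypeMismatch {
  name: string;
  logicalType: string;
  physicalType: string;
}

// Hypothetical helper: report columns present in both schemas with different types.
function findTypeMismatches(
  logical: SimpleDimension[],
  physical: SimpleDimension[],
): TypeMismatch[] {
  const physicalTypes: Record<string, string> = {};
  for (const d of physical) {
    physicalTypes[d.name] = d.type;
  }

  const mismatches: TypeMismatch[] = [];
  for (const d of logical) {
    const physicalType = physicalTypes[d.name];
    if (physicalType !== undefined && physicalType !== d.type) {
      mismatches.push({ name: d.name, logicalType: d.type, physicalType });
    }
  }
  return mismatches;
}

// Made-up sample: the logical schema still says 'json' where the physical one says 'auto'.
const sampleMismatches = findTypeMismatches(
  [
    { type: 'json', name: 'spend' },
    { type: 'string', name: 'id' },
  ],
  [
    { type: 'auto', name: 'spend' },
    { type: 'string', name: 'id' },
  ],
);
```

A check along these lines could be handy in a snapshot test while the two schemas are expected to disagree, so that the mock data can be updated deliberately once the sampler is fixed.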

export function guessDimensionsFromSampleResponse(sampleResponse: SampleResponse): DimensionSpec[] {
  const { logicalDimensions, physicalDimensions, data } = sampleResponse;
  return logicalDimensions.map(d => {
    // Boolean columns are currently reported as "long" so let's turn them into "string"

I'm still looking into this, since from what I can tell it should depend on the value of druid.expressions.useStrictBooleans. Additionally, 'long' is probably the better choice when using 'auto', since longs do have indexes in this mode, so I'm a bit conflicted about this staying like this long term. But I think it's fine at least until we make 'auto' the default for schemaless ingestion (or add indexes to the classic 'long' schema).

Contributor Author

The reason it picks string is that if you pick long, it forces the true / false values in the input to null (instead of 1 / 0).
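
The trade-off discussed in this thread can be sketched as follows: a column reported as 'long' whose sampled input values are all booleans is switched to 'string' so that true / false survive ingestion. This is a simplified illustration under stated assumptions, not the actual body of guessDimensionsFromSampleResponse; the GuessedDimension type, the flat row shape, and both helper names are hypothetical.

```typescript
// Hypothetical, trimmed-down types (illustration only).
interface GuessedDimension {
  type: string;
  name: string;
}
type SampleRow = Record<string, unknown>;

// True if every non-null sampled value for the column is a boolean
// (and at least one such value exists).
function looksLikeBooleanColumn(rows: SampleRow[], name: string): boolean {
  let sawBoolean = false;
  for (const row of rows) {
    const value = row[name];
    if (value === null || value === undefined) continue;
    if (typeof value !== 'boolean') return false;
    sawBoolean = true;
  }
  return sawBoolean;
}

// Switch 'long' columns that actually hold booleans to 'string'.
function preferStringForBooleans(
  dimensions: GuessedDimension[],
  rows: SampleRow[],
): GuessedDimension[] {
  return dimensions.map(d =>
    d.type === 'long' && looksLikeBooleanColumn(rows, d.name) ? { ...d, type: 'string' } : d,
  );
}

// Made-up sample rows for illustration.
const guessedDimensions = preferStringForBooleans(
  [
    { type: 'long', name: 'active' },
    { type: 'long', name: 'count' },
  ],
  [
    { active: true, count: 3 },
    { active: false, count: null },
  ],
);
```

Here 'active' would become a 'string' dimension while 'count' stays 'long'. If 'long' later becomes the preferred choice under 'auto', as suggested in the review, this override would be the piece to revisit.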

@vogievetsky vogievetsky merged commit 5ee4ece into apache:master Apr 7, 2023
@vogievetsky vogievetsky deleted the flexi-schema branch April 7, 2023 13:28