Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResultSet fillMissingDates not working as expected if dimensions in the query #718

Closed
0is1 opened this issue Jun 11, 2020 · 7 comments
Closed
Assignees
Labels
bug Something isn't working

Comments

@0is1
Copy link

0is1 commented Jun 11, 2020

Describe the bug

According to the documentation (https://cube.dev/docs/@cubejs-client-core#result-set-pivot) resultSet.tablePivot() (and other pivot methods) should have pivotConfig.fillMissingDates true by default.

It seems that there's some sort of bug if dimensions are set to the query. All these (resultSet.tablePivot(), resultSet.tablePivot({fillMissingDates: false}) and resultSet.tablePivot(resultSet.normalizePivotConfig())) return same data for this kind of query:

{
    dimensions: ["xxx.name", "Dates.date"],
    filters: [],
    measures: ["zzz.count"],
    offset: 0,
    timeDimensions: [{
        dateRange: ["2020-06-04T00:00:00.000", "2020-06-10T23:59:59.999"],
        dimension: "Dates.date"
    }],
    timezone: "Europe/Helsinki"
}

resultSet.tablePivot(resultSet.normalizePivotConfig()) returns

[
    {xxx.name: "yyy", Dates.date: "2020-06-04T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy", Dates.date: "2020-06-07T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy", Dates.date: "2020-06-09T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-04T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-07T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-09T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-04T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-07T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy2", Dates.date: "2020-06-09T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy3", Dates.date: "2020-06-04T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy3", Dates.date: "2020-06-07T00:00:00.000", zzz.count: "1"},
    {xxx.name: "yyy3", Dates.date: "2020-06-09T00:00:00.000", zzz.count: "1"},
]

Every different xxx.name should have zzz.count: 0 if there’s no data on that day, right?

If I add granularity: 'day' and use resultSet.tablePivot(resultSet.normalizePivotConfig()) then it returns zero data but only for xxx.name: "yyy". So there’s no data for xxx.name: "yyy2" or xxx.name: "yyy3" in that case.

To Reproduce
See example type of query above. You need to have data with count values >= 0 and use some dimensions. If you don't include any dimensions to the query, this is working as expected.

Expected behavior
All dimension xxx.name values should have zzz.count: 0 if there's no data on that day when using resultSet.tablePivot().

Version:

"@cubejs-client/core": "^0.19.22",
"@cubejs-client/react": "^0.19.22",

Additional context
Slack conversation https://cube-js.slack.com/archives/CC0403RRR/p1591601894030200

@vasilev-alex vasilev-alex added the bug Something isn't working label Jun 11, 2020
@vasilev-alex vasilev-alex self-assigned this Jun 11, 2020
@dvins
Copy link

dvins commented Jun 24, 2020

We just ran into this bug ourselves, still exists in 0.19.35. However, removing the dimension did not solve the issue for us.

@vasilev-alex
Copy link
Member

Hi @dvins, @0is1!

The incorrect behavior was fixed in 0.19.35. You can learn more about pivotConfig here https://cube.dev/docs/@cubejs-client-core#pivot-config

The only thing with the fillMissingDates option is that it works (fills missing dates) only when there's a single time dimension on x axis (otherwise it's ignored)

e.g.

  const pivotConfig = {
    x: ['Orders.ts.week'],
    y: ['Users.country', 'measures'],
    fillMissingDates: true
  }

So, @0is1 in your case the values of xxx.name will lay on the y axis. Please let me know if that's what you want.

@dvins could you please tell more about your use case? Ideally share code examples and the desired outcome.

@dvins
Copy link

dvins commented Jun 24, 2020

Hi @vasilev-alex,

It's pretty basic, our pivotConfig is similar:

   const timeDimension = "Session.startedAtUtc.day";
   const pivotConfig = { x:["measures"], y:[timeDimension], fillMissingDates: true};

We then call as follows:

const columns = resultSet.seriesNames(pivotConfig).map(c => ({
    field: c.key,
    cellStyle: { textAlign: "center"},
    headerName: c.title,
    suppressFieldDotNotation: true
    }));

We can see in the call to .seriesName(pivotConfig) call above that missing dates are not being included. We only get series for those dates for the data in the ResultSet. Similarly, when we later make a .chartPivot(pivotConfig) each record has only the series that was originally returned, there are attributes for the missing dates with any kind of empty/null/zero value.

I can share other details with you privately via the CubeJS Slack group. Will message you there.

@vasilev-alex
Copy link
Member

@dvins, it will work in your case if you put the timeDimension on x and the other dimensions on the y axis.

@0is1
Copy link
Author

0is1 commented Jun 30, 2020

Hi @dvins, @0is1!

The incorrect behavior was fixed in 0.19.35. You can learn more about pivotConfig here https://cube.dev/docs/@cubejs-client-core#pivot-config

The only thing with the fillMissingDates option is that it works (fills missing dates) only when there's a single time dimension on x axis (otherwise it's ignored)

e.g.

  const pivotConfig = {
    x: ['Orders.ts.week'],
    y: ['Users.country', 'measures'],
    fillMissingDates: true
  }

So, @0is1 in your case the values of xxx.name will lay on the y axis. Please let me know if that's what you want.

@dvins could you please tell more about your use case? Ideally share code examples and the desired outcome.

Hi @vasilev-alex !

I did a quick test with the latest version ("@cubejs-client/core": "^0.19.40", and "@cubejs-client/react": "^0.19.37",) and it seems to broke resultSet.tableColumns() logic? Now there seems to be column for every dimension value (before there was only one column per dimension key)?

So tableColumns() return data like this:

[
{key: "Y.firstpublished.day", type: "time", format: undefined},
{key: "Dimension 1 value 1", type: "string", format: undefined, meta: undefined},
{key: "Dimension 1 value 2", type: "string", format: undefined, meta: undefined},
{key: "Dimension 1 value 3", type: "string", format: undefined, meta: undefined},
{key: "Dimension 1 value 4", type: "string", format: undefined, meta: undefined},
{key: "Dimension 1 value 5", type: "string", format: undefined, meta: undefined},
etc...,
]

for query like this:

dimensions: ["X.name"]
filters: []
measures: ["X.count"]
offset: 0
order: {}
timeDimensions: [{granularity: "day", dimension: "Y.firstpublished"}]

I expect tableColumns() to return:

[
{key: "Y.firstpublished.day", type: "time", format: undefined},
{key: "X.name", type: "string", format: undefined, meta: undefined},
]

Did there happen some kind of change to that logic also? So do I need to use some new props when using resultSet.tableColumns() if I want to get only one column per dimension key?

@vasilev-alex
Copy link
Member

vasilev-alex commented Jul 1, 2020

@0is1

and it seems to broke resultSet.tableColumns() logic?

Actually we have fixed the incorrect behavior for tableColums and now it works as expected (more on this here https://cube.dev/docs/@cubejs-client-core#result-set-table-columns)

As for your case you can simply pass pivotConfig to configure the way tablePivot and tableColumns work

I expect tableColumns() to return:

[
{key: "Y.firstpublished.day", type: "time", format: undefined},
{key: "X.name", type: "string", format: undefined, meta: undefined},
]

you can pass this config (see the docs here https://cube.dev/docs/@cubejs-client-core#pivot-config)

const pivotConfig = {
  x: ['Y.firstpublished.day', 'X.name'],
  y: ['measures']
}

But there's a limitation here, when you pass any dimension other than timeDimension (Y.firstpublished.day in your case) to the x axis the fillMissingDates option gets ignored.

As of now it's the expected behavior. But it would be really nice if you shared your thoughts on how it should fill missing dates?

e.g. if you have this kind of data:

const query = {
  measures: ['Orders.count'],
  dimensions: ['Users.country'],
  timeDimensions: [
    {
      dimension: 'Orders.ts',
      granularity: 'day',
      dateRange: ['2020-05-01', '2020-05-31']
    },
  ],
};

// the result of `tablePivot`
const tablePivotResult = [
  {
    'Users.country': 'Italy',
    'Orders.ts.day': '2020-05-02T00:00:00.000',
    'Orders.count': 2,
  },
  {
    'Users.country': 'Italy',
    'Orders.ts.day': '2020-05-04T00:00:00.000',
    'Orders.count': 2,
  },
  {
    'Users.country': 'US',
    'Orders.ts.day': '2020-05-14T00:00:00.000',
    'Orders.count': 1,
  },
  {
    'Users.country': 'France',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 1,
  },
  {
    'Users.country': 'France',
    'Orders.ts.day': '2020-05-10T00:00:00.000',
    'Orders.count': 1,
  },
]

// after `fillMissingDates`

const res = [
  {
    'Users.country': 'Italy',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 0,
  },
  {
    'Users.country': 'Italy',
    'Orders.ts.day': '2020-05-02T00:00:00.000',
    'Orders.count': 2,
  },
  // ...
  {
    'Users.country': 'Italy',
    'Orders.ts.day': '2020-05-31T00:00:00.000',
    'Orders.count': 0
  },
  {
    'Users.country': 'US',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 0,
  },
  // ...
  {
    'Users.country': 'US',
    'Orders.ts.day': '2020-05-31T00:00:00.000',
    'Orders.count': 0,
  },
  // ...
  // ...
]

And what if we add even more dimensions? Like Orders.status to the query

const tablePivotResult = [
  {
    'Users.country': 'Italy',
    'Orders.status': 'shipped',
    'Orders.ts.day': '2020-05-02T00:00:00.000',
    'Orders.count': 2,
  },
  {
    'Users.country': 'Italy',
    'Orders.status': 'shipped',
    'Orders.ts.day': '2020-05-04T00:00:00.000',
    'Orders.count': 2,
  },
  {
    'Users.country': 'US',
    'Orders.status': 'pending',
    'Orders.ts.day': '2020-05-14T00:00:00.000',
    'Orders.count': 1,
  },
  {
    'Users.country': 'France',
    'Orders.status': 'pending',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 1,
  },
  {
    'Users.country': 'France',
    'Orders.status': 'shipped',
    'Orders.ts.day': '2020-05-10T00:00:00.000',
    'Orders.count': 1,
  },
]

In the example above users from Italy have all orders shipped but some users from other countries have pending orders. Should we take it into account and fill the missing dates for all possible statuses?

[
  {
    'Users.country': 'Italy',
    'Orders.status': 'shipped',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 0,
  },
  //...
  {
    'Users.country': 'Italy',
    'Orders.status': 'pending',
    'Orders.ts.day': '2020-05-01T00:00:00.000',
    'Orders.count': 0,
  },
  //...
  //...
]

@0is1
Copy link
Author

0is1 commented Sep 3, 2020

Hello!

Current version 0.20.x seems to work as wanted. Thank you @vasilev-alex 🍻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants