Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement/6623 ga4 widgets real data #6790

Merged
merged 31 commits into from
Apr 5, 2023

Conversation

eugene-manuilov
Copy link
Collaborator

@eugene-manuilov eugene-manuilov commented Mar 29, 2023

Summary

Addresses issue:

Relevant technical choices

PR Author Checklist

  • My code is tested and passes existing unit tests.
  • My code has an appropriate set of unit tests which all pass.
  • My code is backward-compatible with WordPress 5.2 and PHP 5.6.
  • My code follows the WordPress coding standards.
  • My code has proper inline documentation.
  • I have added a QA Brief on the issue linked above.
  • I have signed the Contributor License Agreement (see https://cla.developers.google.com/).

Do not alter or remove anything below. The following sections will be managed by moderators only.

Code Reviewer Checklist

  • Run the code.
  • Ensure the acceptance criteria are satisfied.
  • Reassess the implementation with the IB.
  • Ensure no unrelated changes are included.
  • Ensure CI checks pass.
  • Check Storybook where applicable.
  • Ensure there is a QA Brief.

Merge Reviewer Checklist

  • Ensure the PR has the correct target branch.
  • Double-check that the PR is okay to be merged.
  • Ensure the corresponding issue has a ZenHub release assigned.
  • Add a changelog message to the issue.

Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eugene-manuilov, good work here so far, but there are a few areas that need some attention.

Additionally, I'd suggest the QAB should be a bit more in depth. We should QA for the cases where there are gaps in the Google-returned data, not just the empty response scenario.

In order to find gaps in real data, I wrote a little snippet that can be run in the console. This will flag some gaps when run while connected as oi.ie with the GA4 property 285794360.

let report = await googlesitekit.data.select("modules/analytics-4").getReport({
  startDate: "2022-01-01",
  endDate: "2023-03-13",
  metrics: ["totalUSers"],
  dimensions: ["date"],
  orderby: [{ dimension: { dimensionName: "date" } }],
});

// Run the above again if `report` is `undefined` the first time.

const format = (date) => {
  const y = date.getFullYear();
  const m = date.getMonth() + 1;
  const d = date.getDate();
  return `${y}-${m}-${d}`;
};

let prevDate = undefined;
report.rows.forEach(({ dimensionValues: [{ value }] }) => {
  const [_, y, m, d] = /(....)(..)(..)/.exec(value);
  const date = new Date(0);
  date.setFullYear(y, m - 1, d);
  if (prevDate) {
    if ( date.getTime() - prevDate !== 86400000) {
      console.log("gap", format(prevDate), '-', format(date));
    }
  }
  prevDate = date;
});

I'd suggest this should probably be a QA:Eng task for an engineer to dig in a bit and test the various scenarios with gaps, single vs multiple date ranges, etc.

includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
tests/phpunit/integration/Modules/Analytics_4Test.php Outdated Show resolved Hide resolved
tests/phpunit/integration/Modules/Analytics_4Test.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
Comment on lines 1488 to 1492
$row = new Google_Service_AnalyticsData_Row();
$row->setDimensionValues( array( $dimension_value ) );
$row->setMetricValues( $metric_values );

$rows[] = $row;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here needs to be updated based on how many date ranges there are. This is correct for the case where there is a single date range, but when there are multiple date ranges, we need to add a row for each date range, with a 2nd dimension value which is date_range_X, where X is the zero-based index of the date range.

You can see this in action by requesting a multi-date-range report. For illustration here's one of our fixtures which shows the same thing:

"rows": [
{
"dimensionValues": [
{ "value": "20201201" },
{ "value": "date_range_0" }
],
"metricValues": [ { "value": "55" }, { "value": "8" } ]
},
{
"dimensionValues": [
{ "value": "20201201" },
{ "value": "date_range_1" }
],
"metricValues": [ { "value": "3" }, { "value": "8" } ]
},
{
"dimensionValues": [
{ "value": "20201202" },
{ "value": "date_range_0" }
],
"metricValues": [ { "value": "17" }, { "value": "23" } ]
},
{
"dimensionValues": [
{ "value": "20201202" },
{ "value": "date_range_1" }
],
"metricValues": [ { "value": "10" }, { "value": "44" } ]
},

Furthermore - the logic also needs changing to ensure the rows are inserted in the right order, when date is provided as an order-by dimension. As GA4 reports no longer return rows ordered by default, we now need to explicitly specify for them to be ordered by date for our charts. As things stand this code simply appends the dates to the end of $rows which can result in an out of order sequence when there are some rows returned in the response.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, added the date range index and updated the logic to ensure that rows are sorted correctly.

Copy link
Collaborator

@techanvil techanvil Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, as mentioned on a separate comment though, the logic still needs a bit of an update. Details are on the other comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, but this still needs another tweak, with details on a separate comment.

includes/Modules/Analytics_4/Report.php Outdated Show resolved Hide resolved
}

$property_id = self::normalize_property_id( $settings['propertyID'] );
$request->setProperty( $property_id );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI there's a question mark about whether we should retain this use of setProperty(), as discussed here. It's out of scope for this refactor though.

includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
* @param int|bool $date_range_index The date range index for the current date.
* @return Google_Service_AnalyticsData_Row A new zero-value row instance.
*/
public static function create_report_row( $metric_headers, $current_date, $date_range_index ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hesitant about this choice to make create_report_row() public, when it's apparent it's only public in order to be used as a helper in a test. Ideally, we'd structure our public interfaces to only represent the required interface for use in production. Of course, the usual workaround is to extract the functionality to reuse in tests into a separate class/exported function etc.

With this general principle in mind, I'd prefer to see this made private with a separate test helper utility, or this to be extracted into a separate class. Again, interested to hear your take on this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, your concerns make sense to me. Reworked it to be a trait that we can use both in the response class and in the test class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice one!

includes/Modules/Analytics_4/Report.php Outdated Show resolved Hide resolved
// Loop through all days in the date range and check if there is a metric value
// for it. If the metric value is missing, we will need to add one with a zero value.
$now = $start;
do {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... This will have a side effect of ensuring the rows are always sorted, when empty rows are added, regardless of whether the orderby clause is specified.

I think it's OK, we just need to be careful not to accidentally rely on this behaviour, it could give the impression that the GA4 report endpoint automatically sorts the response like the UA one does...

Just mentioning this for the sake of raising visibility, I don't think we don't need to change it.

$now += DAY_IN_SECONDS;

// Copy the existing row if it is available, otherwise create a new zero-value row.
$rows[ $current_date ] = isset( $existing_rows[ $current_date ] )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is still not quite right. It's currently only inserting a single row per missing date except where there's an overlap with the date ranges.

It needs to be adjusted so that it inserts a row per date, per date range. It's also not sorting holistically across all rows, it's sorting per date range.

In other words, say the requested date ranges are as follows:

'startDate'        => '2023-02-01',
'endDate'          => '2023-02-03',
'compareStartDate' => '2023-01-01',
'compareEndDate'   => '2023-01-03',

The current result looks like this:

array( '20230201', 'date_range_0' ),
array( '20230202', 'date_range_0' ),
array( '20230203', 'date_range_0' ),
array( '20230101', 'date_range_1' ),
array( '20230102', 'date_range_1' ),
array( '20230103', 'date_range_1' ),

But, it should look like this, with rows fully sorted and dates repeated per date range:

array( '20230101', 'date_range_1' ),
array( '20230101', 'date_range_0' ),
array( '20230102', 'date_range_1' ),
array( '20230102', 'date_range_0' ),
array( '20230103', 'date_range_1' ),
array( '20230103', 'date_range_0' ),
array( '20230201', 'date_range_0' ),
array( '20230201', 'date_range_1' ),
array( '20230202', 'date_range_0' ),
array( '20230202', 'date_range_1' ),
array( '20230203', 'date_range_0' ),
array( '20230203', 'date_range_1' ),

When the ranges overlap, care needs to be taken to ensure there's still only one row per date per range:

// Args:

'startDate'        => '2023-01-01',
'endDate'          => '2023-01-03',
'compareStartDate' => '2023-01-02',
'compareEndDate'   => '2023-01-04',

// Result:

array( '20230101', 'date_range_0' ),
array( '20230101', 'date_range_1' ),
array( '20230102', 'date_range_0' ),
array( '20230102', 'date_range_1' ),
array( '20230103', 'date_range_0' ),
array( '20230103', 'date_range_1' ),
array( '20230104', 'date_range_1' ),
array( '20230104', 'date_range_0' ),

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.. you are right. This should be updated now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned, this still needs a bit of a tweak with details on a separate comment.

includes/Modules/Analytics_4.php Outdated Show resolved Hide resolved
Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @eugene-manuilov, nice refactor here. I have left a few more comments - notably, the row-adding logic still needs a bit more work.

Additionally, I'd suggest updating the screenshot in the QAB which still shows numeric zeroes, this might cause an unwanted QA/Execution cycle.

Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @eugene-manuilov, this is looking great, but there are a couple of aspects that still need to be addressed, please see the related comments.

Also, it looks like you didn't spot this comment I left - I've gone ahead and updated the screenshot myself.

Additionally, I'd suggest updating the screenshot in the QAB which still shows numeric zeroes, this might cause an unwanted QA/Execution cycle.

@eugene-manuilov
Copy link
Collaborator Author

Also, it looks like you didn't spot this comment I left - I've gone ahead and updated the screenshot myself.

Additionally, I'd suggest updating the screenshot in the QAB which still shows numeric zeroes, this might cause an unwanted QA/Execution cycle.

Thanks for updating that screenshot, @techanvil, I missed your message.

// Add rows for the current date for each date range.
self::iterate_date_ranges(
$date_ranges,
function( $date ) use ( &$rows, $ranges_count, $metric_headers, $multiple_ranges ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use clause is missing $existing_rows, which means the existing row is never found, so we always return zero data for all rows. It's a small fix, I'll apply the change myself.

Suggested change
function( $date ) use ( &$rows, $ranges_count, $metric_headers, $multiple_ranges ) {
function( $date ) use ( &$rows, $existing_rows, $ranges_count, $metric_headers, $multiple_ranges ) {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eugene-manuilov - although I'd applied this fix myself, hoping to prevent an extra execution/review cycle, I've realised that the test coverage is lacking a check to ensure that the resulting rows which do have data retain this data and aren't overwritten with zeroes. Therefore, please can you update the tests to provide coverage for this aspect?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, tests are updated.

Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @eugene-manuilov, this looks good but as discussed here I've applied a small fix myself and then realized the tests should also be updated. Please can you update the tests as described?

Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @eugene-manuilov, thanks for making the changes to the test.

I do think the resulting test code could be simplified a bit to make it easier to understand, while including the expected values in the @dataProvider data set, to give a clearer picture of the inputs vs expectations.

In order to speed things up I've pushed a commit myself to make this change and illustrate the point. Please review the change and let me know your thoughts. If you're happy with this, I think it's ready to merge.

@eugene-manuilov
Copy link
Collaborator Author

@techanvil looks good to me.

@techanvil techanvil merged commit a886611 into develop Apr 5, 2023
@techanvil techanvil deleted the enhancement/6623-ga4-widgets-real-data branch April 5, 2023 10:03
Copy link
Collaborator

@techanvil techanvil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops - forgot to approve this before merge. LGTM, nice work @eugene-manuilov!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants