Data on KMW tiles does not match what is in Analytics #7214

Open
wpdarren opened this issue Jun 28, 2023 · 4 comments
Labels
blocked upstream (Issues which are blocked by external work), Module: Analytics (Google Analytics module related issues), P2 (Low priority), Type: Bug (Something isn't working)

Comments

@wpdarren
Collaborator

wpdarren commented Jun 28, 2023

Bug Description

While testing the key metrics tiles I noticed that there's a difference between the data for New visitors when looking at the tile compared with the Analytics account dashboard.

Site Kit has 1.5K
image.png

Analytics has 1.6K
image.png

This has an impact on the calculation of the Loyal visitors because it is derived from the new visitors total.

While creating this ticket I also noticed differences in the Most popular content by pageviews tile where the top 3 content by page views is not the same as in Analytics.

In Site Kit, you can see that the popular pages are in a different order and the page view counts are different
image.png

Please note that the comparisons have only been completed on 28 days to create this ticket but we should check all reporting periods in Site Kit, i.e. 7, 14, 28 and 90 days.

Steps to reproduce

  1. Set up Site Kit with Google Analytics.
  2. Make sure that you have oi.ie or another site with data in the developer plugin.
  3. Make sure that UserInput feature flag is enabled so that the tiles load on the dashboard.
  4. Observe the data as per my points above.

#6245 Discrepancies:

Please see my note re the percentage not appearing to be correct - since we have a ticket to investigate this, I do not feel that this should hold us up approving this. Thoughts?

I wanted to highlight that this tile percentage is not calculating as per the Analytics report I created.

The data below is for 28 days.

image

1,845 Organic search / 2,343 Total users x 100 = 78.75%

As you can see from the screenshot below, they don't match. Site Kit shows 80.4%.

image
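
For reference, the comparison above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the figures quoted in this report; the function and variable names are illustrative and do not come from Site Kit.

```python
def share_percent(part: int, total: int) -> float:
    """Return part as a percentage of total, rounded to two decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(part / total * 100, 2)


# Figures from the 28-day Analytics report quoted above.
organic_search_users = 1845
total_users = 2343

expected = share_percent(organic_search_users, total_users)  # 78.75
site_kit_value = 80.4  # percentage shown on the Site Kit tile

# Gap between the two figures, in percentage points.
gap = round(site_kit_value - expected, 2)
print(f"expected {expected}%, Site Kit {site_kit_value}%, gap {gap} points")
```

The gap is reported in percentage points rather than as a relative difference, since both inputs are already percentages.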


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • The following Key Metric Widget tiles should show the same values in Site Kit as the corresponding metrics in the GA4 Analytics Admin dashboard for the exact same date range:
    • New Visitors
    • Loyal Visitors
    • Top Traffic Source
    • Most popular content by pageviews
    • Top cities driving traffic
  • This should also be the case for all the date ranges supported by Site Kit.

Implementation Brief

Test Coverage

QA Brief

Changelog entry

@wpdarren wpdarren added Module: Analytics Google Analytics module related issues Type: Bug Something isn't working labels Jun 28, 2023
@mxbclang mxbclang added the P0 High priority label Jun 29, 2023
@jimmymadon jimmymadon self-assigned this Jul 9, 2023
@jimmymadon jimmymadon removed their assignment Jul 11, 2023
@tofumatt tofumatt self-assigned this Jul 18, 2023
@tofumatt
Collaborator

As long as these metrics are supported by GA4 as well, sounds good to me. Would be worth checking during the IB if they're supported 🙂

ACs 👍🏻 Moving to IB.

@techanvil
Collaborator

techanvil commented Jul 24, 2023

Hey @marrrmarrr, could you please take a look at this issue, and provide a bit of direction?

Having spent some time investigating it, I have identified a few widget-specific aspects to address which I can go on to spec out and/or clarify.

However, before doing so I need to ask for some higher level clarification about how far to go with addressing the core of the issue here, which is the fact that the metrics that Site Kit surfaces relating to user counts can differ from those which are presented in the GA4 Analytics UI (although not always visibly, because the differences can be small enough that the rounded figures match).
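
To illustrate how rounding can hide or expose a difference, here is a minimal sketch. The `compact` helper is a hypothetical stand-in for whatever abbreviation the tiles apply (it only handles thousands), and the counts are made-up values, not data from this issue.

```python
def compact(n: int) -> str:
    """Abbreviate a count to one decimal place in thousands, e.g. 1520 -> '1.5K'.

    Hypothetical stand-in for a dashboard tile's number formatting.
    """
    if n < 1000:
        return str(n)
    return f"{n / 1000:.1f}K"


# Two different underlying counts that display identically.
hidden = (compact(1520), compact(1540))

# Two counts the same distance apart that straddle a rounding boundary.
visible = (compact(1540), compact(1560))

print(hidden, visible)
```

In the first pair the 20-user difference is invisible on the tile; in the second pair the same difference shows up as 1.5K versus 1.6K.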

It boils down to the usage, to date, of the totalUsers metric in Site Kit, while the GA4 Analytics UI uses activeUsers for its user counts (custom reports notwithstanding).

As long as we continue to use the totalUsers metric, we can't hope to address the crux of this issue, and it should be rescoped to addressing inconsistencies/errors in the listed Key Metrics widgets, with the parity between Site Kit and the GA4 Analytics UI figures removed as a requirement.

One complicating factor is the fact we do use activeUsers in a couple of these widgets: Loyal Visitors and New Visitors. So, these specific widgets can have parity with the GA4 Analytics UI - but, at the expense of consistency with the rest of Site Kit, notably the All Users headline figure in the All Traffic widget that is immediately below Key Metrics. By continuing to use activeUsers for Loyal/New Visitors (both of which show the total number of visitors as well as their own headline figure), and totalUsers for All Users, we can end up showing inconsistent data on adjacent widgets.

In other words we could end up with this sort of scenario:

image


Having realised that the GA4 Analytics UI uses activeUsers as its standard measure, I was initially going to simply ask if we can switch all of Site Kit's GA4 metrics from totalUsers to activeUsers. However, @aaemnnosttv has referred me to the thread on Slack where this has been recently discussed, from which I understand we're awaiting some feedback from the GA team on this one.

So, my questions are as follows:

  • Have the GA team replied on the topic, and if so can/should we indeed make a wholesale transition from using totalUsers to using activeUsers for our GA4 metrics in Site Kit?
  • If we are not in a position to transition from totalUsers to activeUsers, can we instead switch from activeUsers to totalUsers in the Loyal Visitors and New Visitors widgets to avoid the inconsistency within Site Kit itself as described above?

Thanks in advance!

@techanvil
Collaborator

I've created a separate issue, #7366, to address the fixes for the listed widgets that aren't dependent on the reply from GA.

@techanvil
Collaborator

Here's an update to help keep track of where things stand on this issue at present.

  • We are currently waiting for an answer on whether we're going to continue using totalUsers for most of our related reports or make a wholesale switch to activeUsers.
  • That aside, it doesn't look like we need to make any further code changes to address the core of the issue. Those changes that were required were split out into #7366 (Address inconsistencies in the Loyal Visitors, New Visitors, and "Most popular content" Key Metrics widgets) and addressed separately.
  • Any perceived differences between Site Kit and Analytics should hopefully be resolvable by taking the following into account:
    • Ensure that we are comparing Site Kit to data on the Analytics Reports UI (or from the reporting API) and not Explorations, as there can be differences in the reported data between these two areas (and Site Kit itself uses the reporting API). See https://support.google.com/analytics/answer/9371379
    • Remember that for all Analytics reports, Site Kit automatically applies a dimension filter for the hostname to match the site URL, or its www. prefixed variant. For example, when viewing oi.ie, SK will apply the dimension filter for the hostname to match oi.ie or www.oi.ie. So we should ensure we're applying the same filter for any reports that we're using for comparison. Note that this may affect some but not all properties, depending on how many sites a property is used on. See here for a bit more on using filters in the GA4 UI: https://support.google.com/analytics/answer/11377859
    • Bear in mind that even when comparing a report from the GA4 API to a Report on the UI, there can be some minor, unexplained discrepancies (up to low double digits in my testing to date). This is mentioned here: https://support.google.com/analytics/thread/180423862/different-report-results-between-google-analytics-4-ui-and-google-data-api-data?hl=en
  • Any discrepancies that cannot be explained by the above could indicate a bug and would merit looking into further.
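
When comparing against the reporting API directly, the hostname filter described above could be reproduced roughly as follows. This is a sketch, not Site Kit's actual request: the helper names are hypothetical, the default date range is an assumption, and the body simply follows the general shape of a GA4 Data API `runReport` request (a `dimensionFilter` on the `hostName` dimension using an `inListFilter`).

```python
from urllib.parse import urlsplit


def hostname_variants(site_url: str) -> list[str]:
    """Return the bare hostname and its www.-prefixed variant for a site URL."""
    host = urlsplit(site_url).hostname or ""
    bare = host[4:] if host.startswith("www.") else host
    return [bare, f"www.{bare}"]


def build_report_request(site_url: str, metric: str = "totalUsers",
                         start: str = "28daysAgo", end: str = "yesterday") -> dict:
    """Build a runReport-style request body restricted to the site's hostnames.

    The date range defaults are assumptions for illustration only.
    """
    return {
        "dateRanges": [{"startDate": start, "endDate": end}],
        "metrics": [{"name": metric}],
        "dimensionFilter": {
            "filter": {
                "fieldName": "hostName",
                "inListFilter": {"values": hostname_variants(site_url)},
            }
        },
    }


# Same filter, two metrics: useful for seeing how far totalUsers and
# activeUsers diverge for the same site and date range.
total_request = build_report_request("https://oi.ie/")
active_request = build_report_request("https://oi.ie/", metric="activeUsers")
```

Sending both bodies to the same property and diffing the results is one way to separate differences caused by the metric choice from differences caused by a missing hostname filter.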
