This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Backfill metrics pt. 1 #1067

Merged
merged 7 commits into from
Mar 23, 2019

@boek boek commented Mar 18, 2019

No description provided.

@ghost ghost assigned boek Mar 18, 2019
@ghost ghost added the review label Mar 18, 2019
boek commented Mar 19, 2019

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?
    • We expand on how users are opening the app
    • Do users use search to get to their destination?
    • Do users use the autocomplete feature?
    • Do users use search suggestions?
    • How many people use Fenix as their default browser?
  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?
    • An increase/decrease will indicate happiness, stickiness and usefulness of the browser
  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?
    • N/A (These are baseline metrics)
  4. Can current instrumentation answer these questions?
    • Currently no, as these are some of the first metrics we're recording
  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.
    • All data is Category 2.
  6. How long will this data be collected?
    • @mmccorks will permanently monitor this data.
  7. What populations will you measure?
    • All release, beta, and nightly users with telemetry enabled.
  8. Please provide a general description of how you will analyze this data.
    • Leanplum
    • Glean
  9. Where do you intend to share the results of your analysis?
    • Only on Leanplum, Glean and with mobile teams.

@boek boek marked this pull request as ready for review March 19, 2019 22:06
@boek boek requested a review from a team as a code owner March 19, 2019 22:06
@boek boek requested a review from liuche March 19, 2019 22:06
boek commented Mar 19, 2019

Data review for @liuche

@travis79 travis79 left a comment

From a glean perspective, this looks good. I left what I hope is a helpful comment on the GleanMetricsService.kt file.


override fun shouldTrack(event: Event): Boolean = Settings.getInstance(context).isTelemetryEnabled
override fun shouldTrack(event: Event): Boolean {
Member

Glean does this internally based on a persistent internal flag that can be toggled using the Glean.setUploadEnabled() function. The idea was to toggle the flag when the user opted out of telemetry and let the glean library handle discarding recorded data and preventing upload.

Contributor Author

@travis79 Good to know. The secondary case for this method is that we need a way to allow or block some metrics from being sent to some providers, to support Leanplum.

Member

Then that's a good case for doing this in the client app. Glean currently doesn't have a way to disable a single specific metric at run-time, only through the metrics.yaml file.

Contributor

It probably wouldn't be too hard to make that work -- but maybe it doesn't buy much...

Contributor Author

@travis79 @mdboom If you look at the code in metrics.kt you can see how I have it set up. Right now I have a generic Event type for things that I want to track inside Fenix. When we track an event, it is dispatched to each MetricsService, which decides whether or not it wants to track that event and then transforms it appropriately into the format that service expects.
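
The dispatch described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code in metrics.kt: `Event`, `MetricsService`, `shouldTrack` and `SearchBarTapped` come from the PR, while `OpenedApp`, `RecordingService`, `MetricController` and the `isTelemetryEnabled` lambda are hypothetical names chosen for the example. It shows why a per-service `shouldTrack` can still be useful in the client app even when Glean handles the global opt-out itself.

```kotlin
// Sketch only: names not taken from the PR are hypothetical.
sealed class Event {
    // Optional key/value payload attached to an event.
    open val extras: Map<String, String>?
        get() = null

    object OpenedApp : Event()

    data class SearchBarTapped(val source: Source) : Event() {
        enum class Source { HOME, BROWSER }

        override val extras: Map<String, String>?
            get() = mapOf("source" to source.name)
    }
}

// Each backend decides per event whether it wants it, then records it
// in whatever format that backend expects.
interface MetricsService {
    fun shouldTrack(event: Event): Boolean
    fun track(event: Event)
}

// Stand-in backend that simply remembers the events it accepted.
class RecordingService(private val accepts: (Event) -> Boolean) : MetricsService {
    val recorded = mutableListOf<Event>()

    override fun shouldTrack(event: Event): Boolean = accepts(event)

    override fun track(event: Event) {
        recorded += event
    }
}

// Fans each event out to every service that opts in to it.
class MetricController(
    private val services: List<MetricsService>,
    private val isTelemetryEnabled: () -> Boolean
) {
    fun track(event: Event) {
        if (!isTelemetryEnabled()) return
        services
            .filter { it.shouldTrack(event) }
            .forEach { it.track(event) }
    }
}
```

With this shape, a Leanplum-like service could accept only `OpenedApp` while a Glean-like service accepts everything, without either library needing a run-time switch for individual metrics.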

// Interaction Events
data class SearchBarTapped(val source: Source) : Event() {
enum class Source { HOME, BROWSER }
override val extras: Map<String, String>?
Contributor

Just a heads up -- the extras map will become Map<Enum, String> in the next android-components release. This lets us check that the keys are valid at compile time rather than run time. Unfortunately, it's a breaking change to the API -- and you're our first external user (lucky you!)

mozilla-mobile/android-components#2403

Contributor Author

I actually really like this change 👍
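
To make the benefit of enum keys concrete, here is a small sketch of the idea, assuming a stand-in `EventMetric` class and a hand-written `SearchBarTappedKeys` enum. Both are hypothetical and are not the android-components API; the linked PR is the authority on the real signatures.

```kotlin
// Sketch only: EventMetric and SearchBarTappedKeys are hypothetical stand-ins.
// The type parameter ties each metric to the set of keys it allows.
class EventMetric<K : Enum<K>>(val name: String) {
    val recorded = mutableListOf<Map<String, String>>()

    // Keys are enum constants, so an unknown key fails to compile
    // instead of being rejected at run time.
    fun record(extras: Map<K, String> = emptyMap()) {
        recorded += extras.mapKeys { (key, _) -> key.name.lowercase() }
    }
}

enum class SearchBarTappedKeys { SOURCE }

val searchBarTapped = EventMetric<SearchBarTappedKeys>("search_bar_tapped")
```

A call such as `searchBarTapped.record(mapOf(SearchBarTappedKeys.SOURCE to "HOME"))` type-checks, whereas a misspelled key has no enum constant to refer to and is caught by the compiler.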

@boek boek changed the title from "Backfill metrics" to "Backfill metrics pt. 1" Mar 20, 2019
@boek boek added the needs:data-review label Mar 20, 2019
@liuche liuche left a comment

Need some changes, mostly de-duplicating documentation (which is good! writing docs once is better)

Thanks for being the guinea pig for Glean SDK usage and documentation. I think parts are working, but would be happy to see some clearer documentation on the Glean SDK side so future consumers of the library have a clearer understanding. @mdboom is that something we could work on for future consumers of Glean? I think this is a valuable first pass!

data_reviews:
- N/A
- https://github.com/mozilla-mobile/fenix/pull/1067#issuecomment-474598673
notification_emails:
- telemetry-client-dev@mozilla.com
Contributor

Will it be useful to have a Product/Team email here as well, to be aware of expiry?

app/metrics.yaml Outdated
data_reviews:
- N/A
- https://github.com/mozilla-mobile/fenix/pull/1067#issuecomment-474598673
notification_emails:
- telemetry-client-dev@mozilla.com
expires: never
Contributor

In terms of expiry, I'm not a fan of "never" expiry - is setting this to 6mo-1yr acceptable? It's also a regular check for whoever the product manager is to be aware of what data we're collecting.

cc @mdboom on seeing this "never" pop up from the sample code

Contributor

Actually I'm not even sure what expiry would mean here, @mdboom is this a date field? Would be really helpful to have this in the sample yaml file.

Contributor Author

@liuche https://mozilla.github.io/glean_parser/metrics-yaml.html#expires It looks like this field isn't for when the data expires, but when we should stop collecting this data.

Definitely an interesting idea! But I don't think we'll want this to expire?

Contributor Author

Ah, after re-reading I understand now! I will set this to a year. I'm not sure if there is a good mailing list to alert yet though.

Contributor

Yes, that's the correct documentation for that field. And as you point out, data retention is a separate issue (which we hope to address at some point...)

Contributor

@mdboom yeah, my point here was more to say, can you set the sample yaml file to actually use an expiration date, rather than never? I think that's a huge problem when the "default" in the sample is "never expire this" 😓

Contributor Author

@boek boek Mar 22, 2019

@liuche I changed it to a date, specifically 1 year from now. I think we can talk with Product about this more later as well.

Contributor

@liuche: I'm not sure how to do that without breaking the sample, though. :(

Contributor

Talked to @mdboom and he said he'll file a bug next week to set a date for the sample, and add a comment for any consumers of it to update the expiry date.
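
As clarified in this thread, `expires` is a cutoff for collecting the metric, not a retention period for the data. The helper below is only a sketch of that semantics and is not how glean_parser implements it: the function name is hypothetical, and treating the expiry date itself as already expired is an assumption.

```kotlin
import java.time.LocalDate

// Sketch only: returns true when a metric with the given `expires` value
// should no longer be collected on `today`.
// "never" means the metric has no collection cutoff.
// Assumption: the metric counts as expired on the expiry date itself.
fun isCollectionExpired(expires: String, today: LocalDate): Boolean =
    expires != "never" && !today.isBefore(LocalDate.parse(expires))
```

For example, with an illustrative value of `"2020-03-22"`, the helper reports the metric as live throughout 2019 and as expired from 22 March 2020 onward, while `"never"` is always live. That open-ended case is the default the reviewers wanted removed from the sample.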

app/metrics.yaml Outdated
@@ -0,0 +1,20 @@
# This file defines the metrics that are recorded by glean telemetry. They are
Contributor

Nit: do you need a license on this?

Contributor

Also I assume this is just copied from the Glean sample file? I wonder if we should include the license there too @mdboom

Contributor

It's easy enough to add. I don't feel too strongly either way. I filed a bug for glean itself here, and @boek and the fenix team can decide about this file.

app/metrics.yaml Outdated
description: >
A User opened the app
extra_keys:
source: "The source from which the app was opened"
Contributor

Can you be more specific about this, including all possible values this can be?

Contributor

Ah I see the higher detail later in the docs! I think the goal of Glean is to avoid having to write those docs anymore, so put all that detail and values and explanation right here in the yaml file itself.

Contributor

👍 whenever possible

@@ -52,6 +52,16 @@ sealed class Event {
get() = mapOf("source" to source.name)
}

data class EnteredUrl(val autoCompleted: Boolean) : Event() {
override val extras: Map<String, String>?
get() = mapOf("autocomplete" to autoCompleted.toString())
Contributor

Honestly I don't love all these booleans-to-strings, because it's less clear what is happening. @mdboom can Glean support non-string, string maps? I thought maps could be printed out as strings without every item being a string.

Contributor

That's a good suggestion, @liuche. I've filed a bug to follow up on this. @boek: I'd suggest you merge this as-is in the PR for now, and we can revisit once the glean team has decided how to move forward with that. It should be easy enough to find these instances and change them...
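
One way the suggestion could look on the client side, while Glean still expects string values, is to keep typed values in the event and stringify them in a single place. This is a sketch under that assumption: `stringifyExtras` is a hypothetical helper, and the `EnteredUrl` here is a simplified standalone version of the class in the diff.

```kotlin
// Sketch only: a simplified EnteredUrl that exposes typed extras.
data class EnteredUrl(val autoCompleted: Boolean) {
    val extras: Map<String, Any>
        get() = mapOf("autocomplete" to autoCompleted)
}

// Hypothetical helper: the only place where values are turned into strings,
// so individual events no longer call toString() themselves.
fun stringifyExtras(extras: Map<String, Any>): Map<String, String> =
    extras.mapValues { (_, value) -> value.toString() }
```

If the SDK later accepts non-string values, only `stringifyExtras` would need to change or be removed.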

@liuche liuche left a comment

data-review+

One tiny nit about LP + typo, and @boek please remember to add a Product email to the expiry notifications (or file a bug if necessary to track).
Also @mdboom to add a bug to change expiry: never to an actual date + comment in the sample.yaml code (please link here).

Data Review Form

  1. Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
    Yes, documented by Glean SDK in metrics.yaml, and can be scraped into a docs page (see "Automate Fenix telemetry documentation to probe-scraper", #1156)

  2. Is there a control mechanism that allows the user to turn the data collection on and off?
    Yes, Fenix has data collection toggle in-app

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?
    Expires in 1 year, open for renewal

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
    Type 2, interaction: app opened, search bar tapped, entered url, performed search, default browser, with either booleans or a set of pre-defined event strings.

  5. Is the data collection request for default-on or default-off?
    Default on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
    No

  7. Is the data collection covered by the existing Firefox privacy notice?
    Yes

  8. Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)
    No, will expire - but @boek needs to add another Product email to these probes so they get notified when the probe collection is about to expire.

  9. Does the data collection use a third-party collection tool?
    No, uses Glean

Fenix sends event pings that allows us to measure feature performance. These are defined inside the It is defined inside the [`metrics.yaml`](https://github.com/mozilla-mobile/fenix/blob/master/app/metrics.yaml) file.
Contributor

Nit: typo in this sentence

Also, you should keep the app_opened ping because the Leanplum key is not documented in Glean I assume, and #981 is still open.

boek commented Mar 22, 2019

Tracking notification reminder here: #1157
