Don't store zero steps #84

Open
wants to merge 2 commits into
base: dev

Conversation

blootsvoets
Member

Attempts to reduce the amount of data sent by ignoring all zero values. Not tested in production, but this should help increase the specificity of the data compliance dashboard.

@@ -67,8 +67,8 @@ public void initialize(RestSourceConnectorConfig config) {
    ZonedDateTime startDate = request.getDateRange().end();

    return iterableToStream(dataset)
        .filter(activity -> activity.get("value") != null && activity.get("value").asInt() != 0)
Member

It still does not solve the problem of telling apart data that is present and valid (someone is sitting, so 0 steps, which may be valuable in some use cases and should be collected) from data recorded while someone is not wearing the device (which needs to be discarded).

Member Author

The point is that for my account, Fitbit always sends 0, whether I'm wearing the device or not. It basically gives no information except that the account is being synced.

Member

@yatharthranjan yatharthranjan Jul 15, 2021

Yes, but it would still be useful to collect steps in cases where 0 steps is valid. For instance, you could find out that someone is at rest when steps are 0 but there is HR data, meaning they are wearing the device but not moving. Similarly, in some tasks (like the 6-minute walk test), subjects are told that they can sit on a chair during the test if needed; this would also be 0 steps and is valuable information. So collecting 0 steps could be useful in certain scenarios, and dropping every record with 0 steps removes this information too.
If this is required on the dashboard, can the query to the db handle it (like WHERE steps != 0)?
Or it could be handled using KSQL so it doesn't even end up in the db.
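
The at-rest case above could be told apart from an uninformative zero by checking for heart-rate data in the same interval. A minimal sketch, assuming a hypothetical `classify` helper and `WearState` enum (neither is part of the connector), with the heart-rate value left empty when no sample exists for the interval:

```java
import java.util.Optional;

public class ZeroStepClassifier {
    /** Hypothetical labels for one step interval; not part of the connector. */
    enum WearState { MOVING, AT_REST, UNKNOWN }

    /** Classifies one interval from its step count and an optional heart-rate sample. */
    static WearState classify(int steps, Optional<Integer> heartRate) {
        if (steps > 0) {
            return WearState.MOVING;
        }
        // Zero steps with a heart-rate sample: the device was worn and the wearer was still.
        if (heartRate.isPresent() && heartRate.get() > 0) {
            return WearState.AT_REST;
        }
        // Zero steps and no heart rate: not worn, not synced yet, or otherwise unknown.
        return WearState.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify(0, Optional.of(62)));
        System.out.println(classify(0, Optional.empty()));
        System.out.println(classify(120, Optional.of(95)));
    }
}
```

This kind of check is only possible downstream if the zero-step records are stored in the first place.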

Member Author

It’s just that with this method, the connector will always update to the latest step count, even if it’s zero and no steps were taken YET. The connector then won’t go back to check retrospectively whether the app reported steps at a later time. I’m just saying a zero step count contains NO information; it could mean any of:

  • No steps were taken.
  • The Fitbit device was not worn during that interval.
  • The Fitbit device was worn with non-zero steps but didn’t sync with the app yet.
  • The Fitbit device was worn with zero steps taken, but didn’t sync with the app yet.

The only cases it eliminates are:

  • The Fitbit device was worn, synced with the app and the participant made steps.
  • The Fitbit connector could not sync.

The latter case you can eliminate by looking at other topics or comparing with the Fitbit website. If so, then having a record with zero steps gives exactly the same information as having no record at all. The advantage of not storing a record, then, is that the request generator will recognize that no valid steps data is available yet for a given time and will try to re-fetch it.
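
The re-fetch argument can be illustrated with a small model. This is only a sketch under assumptions: `StepRecord` and `Offset` are hypothetical stand-ins, not the connector's actual classes, and the real request generator may track its position differently. The idea shown is that when zero records are dropped, the stored position does not move, so the same interval is requested again on the next poll.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class RefetchSketch {
    /** Hypothetical step record: interval timestamp plus step count. */
    static final class StepRecord {
        final Instant time;
        final int steps;

        StepRecord(Instant time, int steps) {
            this.time = time;
            this.steps = steps;
        }
    }

    /** Hypothetical stand-in for the position a request generator keeps per route. */
    static final class Offset {
        private Instant lastStored;

        Offset(Instant start) {
            this.lastStored = start;
        }

        /** The next request starts at the time of the last stored record. */
        Instant nextRequestStart() {
            return lastStored;
        }

        /** Keeps only non-zero records and advances the position past those. */
        List<StepRecord> store(List<StepRecord> fetched) {
            List<StepRecord> kept = new ArrayList<>();
            for (StepRecord r : fetched) {
                if (r.steps != 0) {
                    kept.add(r);
                    if (r.time.isAfter(lastStored)) {
                        lastStored = r.time;
                    }
                }
            }
            return kept;
        }
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2021-07-14T10:00:00Z");
        Offset offset = new Offset(t0);

        // First poll: the app has not synced yet, so the API reports zero.
        offset.store(List.of(new StepRecord(t0.plusSeconds(60), 0)));
        System.out.println("after zero record: " + offset.nextRequestStart());

        // Later poll: the sync happened and the same interval now has steps.
        offset.store(List.of(new StepRecord(t0.plusSeconds(60), 12)));
        System.out.println("after non-zero record: " + offset.nextRequestStart());
    }
}
```

If the zero record were stored instead, the position would advance past that interval and the later, non-zero value would never be requested.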

Member

OK, I see. I understand the reasoning about the request generator and I am happy with the change, but I think we should still consult the data scientists and check how backwards compatible this would be for analyses that already use 0 steps for something. I think 0 steps (although it could be misleading) is more intuitive to understand than missing step data (which would now mean either actual missing data or any of the cases you mentioned above), and we should make sure this is documented extensively (perhaps in the schema and specs).

Member Author

OK. If it would mess with the analysis, I could instead see whether we can make a route-specific change to what qualifies as a successful call, and send the data to Kafka regardless.
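
One way to picture that alternative, as a rough sketch only: every value is still sent on, but each route supplies its own rule for whether the call counts as successful. All names here (`Outcome`, `DEFAULT_ROUTE`, `STEPS_ROUTE`, `handle`) are hypothetical and do not come from the connector code.

```java
import java.util.List;
import java.util.function.Predicate;

public class RouteSuccessSketch {
    /** Outcome of one call: what is sent to Kafka, and whether the call counts as successful. */
    static final class Outcome {
        final List<Integer> toSend;
        final boolean successful;

        Outcome(List<Integer> toSend, boolean successful) {
            this.toSend = toSend;
            this.successful = successful;
        }
    }

    /** Hypothetical default rule: any non-empty response is a successful call. */
    static final Predicate<List<Integer>> DEFAULT_ROUTE = values -> !values.isEmpty();

    /** Hypothetical steps rule: successful only if at least one value is non-zero. */
    static final Predicate<List<Integer>> STEPS_ROUTE =
            values -> values.stream().anyMatch(v -> v != 0);

    /** Sends every value regardless; only the success flag depends on the route's rule. */
    static Outcome handle(List<Integer> values, Predicate<List<Integer>> isSuccessful) {
        return new Outcome(values, isSuccessful.test(values));
    }

    public static void main(String[] args) {
        List<Integer> zeros = List.of(0, 0, 0);
        Outcome steps = handle(zeros, STEPS_ROUTE);
        Outcome other = handle(zeros, DEFAULT_ROUTE);
        System.out.println("steps route: sent=" + steps.toSend.size() + " successful=" + steps.successful);
        System.out.println("default route: sent=" + other.toSend.size() + " successful=" + other.successful);
    }
}
```

With a rule like this, zero-step records would stay available for analysis while an all-zero response could still be retried later.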

Member

Yes, handling the step route as a special case makes sense.

@blootsvoets changed the title from "Don't store zero steps / calories / heart rates" to "Don't store zero steps" on Jul 14, 2021