care partner alerts #715

Open
wants to merge 28 commits into
base: master

@ewollesen ewollesen commented May 6, 2024

This used to be a series of PRs, but that didn't really work out. They're all collapsed into this one.

This shouldn't be merged until tidepool-org/go-common#64 is merged; after that, this PR should have its go-common dependency bumped.

@ewollesen ewollesen requested a review from toddkazakov May 6, 2024 15:55
@ewollesen ewollesen removed the request for review from toddkazakov May 8, 2024 19:59
@ewollesen ewollesen force-pushed the eric-cpa-alerts branch 2 times, most recently from 8549c33 to 8367902 June 24, 2024 19:09
@ewollesen ewollesen changed the title from "adds List and Get methods to alerts client" to "minimal implementation of care partner alerts" Jul 9, 2024
@ewollesen ewollesen requested a review from toddkazakov July 9, 2024 22:50

ewollesen commented Jul 11, 2024

To use this in QA, it must be paired with tidepool-org/hydrophone#145 and tidepool-org/go-common#64

@ewollesen ewollesen force-pushed the eric-cpa-alerts branch 2 times, most recently from c50d589 to 986106b July 11, 2024 22:55

@toddkazakov toddkazakov left a comment


Looks good overall, but the retry mechanism implemented here doesn't satisfy the latency requirements. The current implementation is OK for internal usage, but it's not production-ready. This could be handled in a separate PR if that makes the development and QA process easier.

auth/store/mongo/device_tokens_repository.go (outdated, resolved)
auth/store/test/device_token_repository.go (resolved)
data/events/alerts.go (outdated, resolved)
data/events/alerts.go (resolved)
	}
	handler := asyncevents.NewSaramaConsumerGroupHandler(&asyncevents.NTimesRetryingConsumer{
		Consumer: r.Config.MessageConsumer,
		Delay:    CappedExponentialBinaryDelay(AlertsEventRetryDelayMaximum),
Contributor

I don't think this is a suitable retry strategy given the latency requirements for this service. Kafka's consumer group concurrency is limited to the number of partitions of the topic. This number cannot be very high because Kafka's memory consumption grows linearly with the number of partitions. It follows that the number of partitions is much lower than the number of users we will have, so the data of multiple users will end up in the same partition. Because messages in a single partition are processed serially, failing to evaluate a single user's alerts for one minute (the maximum currently set via CappedExponentialBinaryDelay) will delay every user sharing that partition by at least a minute.

Alert notifications should be near real-time - up to 10 seconds latency is acceptable. I think the solution proposed in this design document is how this should be handled. Other solutions which satisfy the requirements are welcome.

Contributor Author

This will require some more in-depth thought on my part... Will do.

Contributor Author

Yeah, I think you're right. Let's get this review merged, and I'll work on getting a multiple-topic solution set up. Given the flexibility we have now, it shouldn't be too bad.

Contributor Author

I have the multi-tier retry in the eric-alerts-multi-topic-retry branch.

Contributor Author

This should be implemented in this branch now.

data/events/events.go (outdated, resolved)
data/service/service/standard.go (outdated, resolved)
@ewollesen ewollesen force-pushed the eric-cpa-alerts branch 2 times, most recently from c08e1fc to 967c617 July 19, 2024 16:02
@ewollesen ewollesen requested a review from toddkazakov July 29, 2024 22:41
@ewollesen ewollesen force-pushed the eric-cpa-alerts branch 2 times, most recently from 9432468 to eaa652e September 17, 2024 20:07
toddkazakov previously approved these changes Sep 18, 2024
toddkazakov previously approved these changes Oct 5, 2024
@ewollesen
Contributor Author

@toddkazakov I just removed two config env vars. I believe we talked about that before, but it slipped my mind until I was reviewing the helm chart changes today, where they came up again.

So the re-review here only concerns the config parsing in the most recent commit of the PR; nothing else has changed.

The Get endpoint already exists on the service, so only the List endpoint
needed to be added there.

BACK-2554
This functionality will be used by care partner processes to retrieve device
tokens in order to send mobile device push notifications in response to care
partner alerts being triggered.

BACK-2554
This was missed when moving device tokens from the data service to the auth
service in commit a0f5a84.

BACK-2554
Basic steps are taken to allow for other push notification services to be
easily added in the future.

BACK-2554
So that sarama log messages better follow our standards, and will be emitted
as JSON when log.Logger is configured for that.

Before this change, the sarama logs were printed in plain-text without any of
the benefits of the platform log.Logger.

BACK-2554
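
A minimal sketch of the kind of adapter this commit describes. sarama's StdLogger interface consists of Print, Printf, and Println, and sarama reads its logger from the package-level sarama.Logger variable. Everything else here is an assumption: infoLogger is a hypothetical one-method slice of a structured logger, not the platform log.Logger API, and captureLogger exists only so the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// infoLogger is a hypothetical, minimal slice of a structured logger.
type infoLogger interface {
	Info(message string)
}

// saramaAdapter has the three methods of sarama.StdLogger and forwards
// every message to an infoLogger, trimming the trailing newline that
// plain-text loggers usually expect.
type saramaAdapter struct {
	logger infoLogger
}

func (a *saramaAdapter) Print(v ...interface{}) {
	a.logger.Info(strings.TrimSpace(fmt.Sprint(v...)))
}

func (a *saramaAdapter) Printf(format string, v ...interface{}) {
	a.logger.Info(strings.TrimSpace(fmt.Sprintf(format, v...)))
}

func (a *saramaAdapter) Println(v ...interface{}) {
	a.logger.Info(strings.TrimSpace(fmt.Sprintln(v...)))
}

// captureLogger records messages so the example can run without a real logger.
type captureLogger struct{ messages []string }

func (c *captureLogger) Info(message string) { c.messages = append(c.messages, message) }

func main() {
	capture := &captureLogger{}
	adapter := &saramaAdapter{logger: capture}
	adapter.Printf("consumer/broker/%d connected\n", 1)
	adapter.Println("client/metadata", "fetching")
	fmt.Printf("%q\n", capture.messages)
}
```

In a real service the adapter would be assigned to sarama.Logger once at startup, so that sarama's messages flow through the same structured (for example JSON) output as the rest of the logs.
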
The existing FaultTolerantConsumer isn't used because its retry semantics are
hard-wired and aren't compatible with care partner alerting's needs.

Note: A proper implementation of AlertsEventsConsumer to consume events is yet
to be written. It will follow shortly.

BACK-2554
The upload id is necessary to ensure that only the proper device data uploads
are evaluated for care partner alert conditions.

BACK-2554
If the necessary configuration isn't found, then push notifications will
instead be logged.

BACK-2554
These methods return Note objects that can be sent as push notifications.

NotLooping evaluation will be handled in a later commit.

BACK-2554
It uses the new asyncevents from go-common, as alerts processing requires
different retry semantics than the existing solution.

The Pusher interface is moved out of data/service into data/events to avoid a
circular dependency.

BACK-2554
No longer needed
In response to request during code review.
As caught by Todd in code review.

BACK-2554
When a care partner alert encounters an error, the message is moved to a
separate topic that will cause it to be retried after a delay. Any number of
these topics can be configured.

BACK-2499
Instead of a static delay, uses a "not before" time found in a Kafka message
header. Consumption of the message will not be attempted until the time has
passed. This allows for more accurate delays, as the time required to process
an earlier message doesn't further delay the current message's processing.

BACK-2449
These won't be changing at runtime, so there's no need to complicate the
initialization by making these configurable. The topic's prefix is
configurable, and that's the part that will change from environment to
environment at runtime.

BACK-2554
A rebase has picked up work performed by Darin, which removes the need
for this token injection. \o/ Yay!
These tests, and the functionality they cover, were moved into
alerts/client.go in a previous commit.
@@ -16,22 +16,20 @@ import (

// Client for managing alerts configs.
type Client struct {
	client        PlatformClient
	logger        platformlog.Logger
	tokenProvider auth.ServerSessionTokenProvider
Contributor Author

Darin upstreamed some changes that remove the necessity of a separate token provider.

@ewollesen ewollesen changed the title from "minimal implementation of care partner alerts" to "care partner alerts" Dec 11, 2024
This is to parallel the CI server, which is using go-ci-test.

The old test target is now ginkgo-test.
@ewollesen ewollesen force-pushed the eric-cpa-alerts branch 2 times, most recently from 3156400 to 5250e83 December 13, 2024 20:35
- UsersWithoutCommunication endpoint added to data service
- UsersWithoutCommunication endpoint added to alerts client
- implementing no communication alerts via the task service
- evaluation of alerts conditions re-worked
    - The new system recognizes that some alerts are generated by events
      (so-called "Data Alerts") while others are polled (no communication).
    - The new evaluator lives in the alerts package (was data/events)
- implemented tracking of sent notifications
- Recording repo is implemented to record/index the time of the last received
  data from a user

BACK-2558