Commit
source-pendo: backfill a day at a time
For Pendo accounts with a significant amount of data, the strategy of asking for all data up until the present does not work. Pendo's response time is long enough to cause TimeoutErrors, and the responses that do arrive end up OOM-ing the connector. To fix this, there is now a distinct backfill process for events and aggregated events.

Notable changes include:
- Backfills are separated from ongoing incremental replication.
- Backfills request at most a day's worth of data at a time. Incremental replication continues to get all data up to the present.
- The cutoff between backfills and incremental replication is shifted backwards 12 hours because of delays between when an event occurs and when it becomes available in the API.
- The Pendo API response limit is increased to 50k. This might be able to go higher, but I've only done limited testing, enough to confirm that 50k won't OOM the connector.
- API responses are now sorted first by timestamp and then by resource ID. This lets us more effectively "paginate" through documents if multiple occur at the same timestamp.
- If all the documents in a given API response have the same timestamp, the connector will fetch the remaining documents with that exact timestamp before advancing the cursor by a millisecond.
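The windowing and same-timestamp handling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's actual code: `backfill_windows`, `paginate`, and the inner `fetch` helper are hypothetical names, the "API" is an in-memory list of `(timestamp_ms, resource_id)` tuples, and the handling of a full page that spans several timestamps (drop the last timestamp and request it again) is one plausible choice that the commit message does not confirm.

```python
from datetime import datetime, timedelta, timezone

CUTOFF_LAG = timedelta(hours=12)  # backfill stops 12 hours before "now"
WINDOW = timedelta(days=1)        # each backfill request covers at most one day


def backfill_windows(start, now):
    """Yield (lower, upper) windows of at most one day, ending at the cutoff."""
    cutoff = now - CUTOFF_LAG
    lower = start
    while lower < cutoff:
        upper = min(lower + WINDOW, cutoff)
        yield lower, upper
        lower = upper


def paginate(docs, start_ms, limit):
    """Walk docs sorted by (timestamp, resource ID), `limit` documents per request."""
    ordered = sorted(docs)  # tuples sort by timestamp first, then resource ID

    def fetch(min_ts, exact_ts=None, after_id=""):
        # Hypothetical stand-in for one API request.
        if exact_ts is not None:
            hits = [d for d in ordered if d[0] == exact_ts and d[1] > after_id]
        else:
            hits = [d for d in ordered if d[0] >= min_ts]
        return hits[:limit]

    out = []
    cursor = start_ms
    while True:
        page = fetch(cursor)
        if len(page) < limit:
            # A short page means nothing is left at or after the cursor.
            out.extend(page)
            return out
        first_ts, last_ts = page[0][0], page[-1][0]
        if first_ts != last_ts:
            # Assumption: the last timestamp may be truncated, so keep only
            # the earlier documents and request the last timestamp again.
            out.extend(d for d in page if d[0] < last_ts)
            cursor = last_ts
            continue
        # Every document shares one timestamp: drain it by resource ID,
        # then advance the cursor by one millisecond.
        out.extend(page)
        last_id = page[-1][1]
        while True:
            rest = fetch(cursor, exact_ts=last_ts, after_id=last_id)
            out.extend(rest)
            if len(rest) < limit:
                break
            last_id = rest[-1][1]
        cursor = last_ts + 1
```

For example, `list(backfill_windows(start, now))` with `start` at 2024-01-01 00:00 UTC and `now` at 2024-01-03 18:00 UTC gives two full-day windows followed by a six-hour window that ends at the 12-hour cutoff. Sorting by resource ID as a tiebreaker is what makes the drain loop safe: without a stable secondary order, a request for "IDs after the last one seen" could skip or repeat documents.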