-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standardization of ACTIVATE_VERSION
message
#9
Comments
@dmosorast Are there any signals earlier in the stream that the target can use to predict if the stream is later going to receive an ACTIVATE_VERSION message? This could have influence on how the rather chooses to store records during the sync operation. |
@aaronsteers That's a good point. I would say that part of the way we have implemented this in existing taps is that a tap that uses it should always send an ACTIVATE_VERSION message if it doesn't have a version saved in the stream's state. The effect of this as a signal to the target would be what you're suggesting, as well as a notification when a stream's state has been reset, and thus a new historical extraction is coming down the pipe. I haven't had time to sit down, but the gist of my thoughts for outlining purposes here (while I'm here 😄 ) are:
Because it's a low-level signal, it doesn't support any specific feature, but more of a specific behavior which the target can take in the context of its own destination location. That said, the most common is to bring hard deletes to an EL flow when a full dataset refresh is required. |
Super helpful to better understand the details here, @dmosorast. Thanks for this. To answer my question from above in this context:
Side note: If the tap changes upstream to no longer send ACTIVATE_VERSION messages, then target table (or other tracking mechanism) may need to be reconfigured or recreated. But this is a niche case for target developers, which I don't think we need to solve for here. |
I agree. The details of what the target developer is required to do to react to a version message or how to track this state is something that I'd like to keep from specifying. This is just a sort of state-like stream-level metadata. The question of what to do if a table becomes non-versioned. That's probably what I would call a breaking change (like a PK change, or a column datatype change), which requires some case-specific means of handling. In this scenario, if a stream is known to no longer be "versioned", I would expect something like deleting the target table and backfilling with the tap to be the approach to get it "non-versioned", but that scenario is rather awkward. Honestly, there's likely no harm in just tracking a version for every stream, it's just not something that has been common, since many SaaS sources don't allow hard deletes, so we've chosen to not add the complexity in those cases. |
Agreed. 👍 Very well put. |
Linking to related item #8, which would allow taps and targets to advertise that they support ACTIVATE_VERSION messages. |
This is something that's been frequently buzzing about the community. The "activate version" message has been used by taps in the past to track the "version" of some data. The semantics of this message haven't been formalized, so I'd like to track that conversation here, and provide my historical knowledge on the subject.
I don't have a clear picture pulled together just yet, but just getting this topic pinned for now.
The text was updated successfully, but these errors were encountered: