Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add migrated state information to flexible sync client BIND message #6464

Merged
merged 18 commits into from
Apr 12, 2023

Conversation

michael-wb
Copy link
Contributor

@michael-wb michael-wb commented Mar 31, 2023

What, How & Why?

Updated the BIND message for flexible sync to repurpose the server_path field to use as stringified JSON data. If the client is in the FLX migrated state, the original PBS partition value will be added to the JSON data section of the FLX BIND message.
Added unit tests for the message generation portion of the ProtocolCodec class.

Fixes #6442

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • [ ] C-API, if public C++ API changed.

std::string path_data;
if (m_is_flx_sync_session) {
nlohmann::json bind_json_data;
if (protocol_version >= 8) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kind of protocol version specific logic should really go in the protocol codec. It's the thing that decides what version/mode of the wire protocol results in which bytes get put on the wire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had it as two different functions with the FLX one taking a json object, but changed it to this since all the other functions just took strings. I'll change it back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

if (m_state != MigrationState::Migrated) {
return std::nullopt;
}
return m_migrated_partition;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just realizing that this function and get_query_string() are returning a string_view which is an unowned view of a std::string protected by a lock but don't guarantee that the lock is held after returned, so it's possible to call get_migrated_partition() and have the returned string_view pointing to free'd or otherwise invalid memory after return.

I think both this and get_query_string() need to return a std::optionalstd::string

void make_bind_message(int protocol_version, OutputBuffer&, session_ident_type session_ident,
const std::string& server_path, const std::string& signed_user_token,
bool need_client_file_ident, bool is_subserver);
void make_bind_message(OutputBuffer&, session_ident_type session_ident, const std::string& path_data,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may need a FLX/PBS specific version of these functions like we have with IDENT messages to handle the different formats/protocol version handling here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the two different versions back and updated the tests.

template <class A, class B>
bool compare_out_string(const A& expected, const B& out, const std::shared_ptr<util::Logger>& logger)
{
if (expected.size() == out.size() && memcmp(expected.data(), out.data(), out.size()) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you wanted to make this a little more idiomatic, at least to my eyes you could do something like

template <class A, class B>
void check_out_string(const A& expected, const B& out, unit_test::TestContext& test_context)
{
    CHECK_EQUAL(std::string_view(std::data(expected), std::size(expected)), std::string_view(std::data(out), std::size(out)));
}

Then the CHECK_EQUAL macro would just work as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works great - thanks for the recommendation

@@ -333,6 +333,10 @@ class Session {
/// changeset data in a single integration attempt.
size_t flx_bootstrap_batch_size_bytes = 1024 * 1024;

/// Contains the original PBS partition value from before the migration -
/// empty if not migrated
std::optional<std::string_view> migrated_partition;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really just be a std::shared_ptr and then the sync client can just query this as needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the original string value (as an optional) is necessary and I didn't want to extend the lifetime of the migration_store beyond SyncSession. I can change it if you think it would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated it to be a reference to the migration store, since Daniels PR needs the store for creating the subscriptions and completing the migration. I used the same changes from his review.

@@ -0,0 +1,224 @@
/*************************************************************************
*
* Copyright 2016 Realm Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copyright 2023

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops - definitely a copy/paste artifact from reusing another test file as a starting point....

@michael-wb michael-wb requested a review from kmorkos April 4, 2023 05:41
@danieltabacaru
Copy link
Collaborator

The github issue this PR fixes mentions userRequestedPBS to be added as part of BIND. Is that work done in a different pr? I need it for #6460.

@michael-wb
Copy link
Contributor Author

The github issue this PR fixes mentions userRequestedPBS to be added as part of BIND. Is that work done in a different pr? I need it for #6460.

It was removed since the presence of the partition value is enough to imply "userRequestedPBS". The migrated partition value is an optional string that will only be populated when the client is in the migrated FLX state. I will update the issue.

@michael-wb michael-wb requested review from jbreams and kmorkos April 6, 2023 20:09
Copy link
Contributor

@jbreams jbreams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the server has support for backfilling partition values, can we add a test to verify that sending this info in the bind message actually does what we expect on the server side?

@@ -750,7 +751,7 @@ class ClientImpl::Session {

/// Update internal client state when a flx subscription becomes complete outside
/// of the normal sync process. This can happen during client reset.
void non_sync_flx_completion(int64_t version);
void on_sync_flx_completion(int64_t version);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, what a typo!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging by the comment it should be prefixed with non

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so on_non_sync_flx_completion? or just revert it...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on_non_sync_flx_completion does sound better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's calling on_flx_sync_version_complete() on the wrapper, I updated the name to be the same.

json_data_stg = json_data.dump();
}

static_cast<void>(protocol_version);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to static_cast this here. we do it in the pbs function just to avoid an unused variable warning, but we do use it here. we can consider just removing it as a parameter to the pbs version to simplify things a bit.

@michael-wb michael-wb requested a review from jbreams April 11, 2023 16:17
Copy link
Contributor

@jbreams jbreams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, just a few questions.

@@ -11,7 +11,7 @@ constexpr static std::string_view c_flx_migration_started_at("flx_migration_star
constexpr static std::string_view c_flx_migration_completed_at("flx_migration_completed_at");
constexpr static std::string_view c_flx_migration_state("flx_migration_state");
constexpr static std::string_view c_flx_migration_query_string("flx_migration_query_string");

constexpr static std::string_view c_flx_migration_original_partition("flx_migration_original_partition");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm just noticing this now, and we def don't need to change it if you and @danieltabacaru like it this way, but these names will end up with a table called flx_migration and then columns named flx_migration_started_at. so in json it'll look like {flx_migration_started_at: timestamp, flx_migration_state: state, flx_migration_query_string: "foo=bar", flx_migration_original_partition: "bar", flx_migration_completed_at: timestamp} - did we mean for it to look more like {started_at: timestamp, state: state, query_string: "foo=bar", original_partition="bar", completed_at: timestamp}?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - I can update that now

CHANGELOG.md Outdated
@@ -20,6 +20,7 @@
* Fix compiler warnings with MacOS Clang 14.0.3 ([PR #6467](https://github.com/realm/realm-core/pull/6467))
* Perform a client reset to migrate a sync'd realm from PBS to FLX and vice versa ([#6393](https://github.com/realm/realm-core/issues/6393))
* The following unused util headers have been deleted: call_with_tuple.hpp, get_file_size.hpp, inspect.hpp, substitute.hpp, type_list.hpp, and utf8.hpp.
* Add migrated state information to flexible sync client BIND message ([PR #6464](https://github.com/realm/realm-core/pull/6464))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add something about how this will support the server back-filling partition values that aren't in your schema for migrated PBS realms? Just something so SDKs/customers know how this change actually effects them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

src/realm/sync/noinst/migration_store.cpp Outdated Show resolved Hide resolved
save_sync_config_after_migration();
download_fresh_realm(error.server_requests_action);
return;
case sync::ProtocolErrorInfo::Action::RevertToPBS:
// If the client was updated to use FLX natively, but the server was rolled back to PBS,
// propagate the error up to the user
// the server should be sending "SwitchToFLX", throw exception if this error is received.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SwitchToPBS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

std::string server_path = get_virt_path();
logger.debug("Sending: BIND(session_ident=%1, need_client_file_ident=%2 is_subserver=%3 server_path=%4)",
session_ident, need_client_file_ident, is_subserver, server_path);
protocol.make_pbs_bind_message(protocol_version, out, session_ident, server_path, empty_access_token,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated: is there a plan, at some point, to make pbs work with a json so we don't need to treat it differently than flx?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so - the JSON data was "shimmed" into the BIND message for FLX, since the existing server_path section, which is used to send the partition value in PBS, was not used by FLX. In addition, since we want to move away from PBS eventually, there likely won't be a need for adding JSON values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

REALM_ASSERT(m_state == state);
REALM_ASSERT(m_query_string == query_string);
REALM_ASSERT(m_migrated_partition == migrated_partition);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -34,6 +34,10 @@ namespace sync {
// 7 Client takes the 'action' specified in the 'json_error' messages received
// from server. Client sends 'json_error' messages to the server.
//
// 8 Support for PBS->FLX client migration and websocket http errors as websocket
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could split the first sentence in two since it's different things

if (m_is_flx_sync_session) {
nlohmann::json bind_json_data;
if (auto migrated_partition = get_migration_store()->get_migrated_partition()) {
bind_json_data["migratedPartition"] = *migrated_partition;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so server_path is not needed anymore for flx bind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not being used - the call to get_virt_path() returns the partition value provided in SyncConfig, which is always an empty string for FLX sync.

{
std::string json_data_stg;
// Protocol version v8 and above accepts stringified json_data for the first data argument
if (protocol_version >= 8 && !json_data.empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the first check actually needed? I guess this should only be released once the v8 is enabled.

Side note: Is this suppose to work with a server which does not support v8? As it stands, we are not going to send the server_path so I don't know if that's a problem or not (for what is worth, we don't send the server_path either for >= v8)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, since BAAS is the only server used with FLX sync, it will always be up to date to the latest protocol. It may start to get tricky once we have to work with tiered sync, since we can't rely on the server being up to the latest version.
We can remove this, but I'd prefer to do it once we commit to start using protocol v8 once everything is integrated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough. let's make sure we don't forget to remove it then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment to #6342 to remove this.

@michael-wb michael-wb merged commit be4192d into master Apr 12, 2023
@michael-wb michael-wb deleted the mwb/bind-msg-updates branch April 12, 2023 14:30
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add migrated state information to flexible sync client BIND message
4 participants