Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add delays to SM buffer #4407

Merged
merged 7 commits into from
Dec 6, 2024
Merged

Add delays to SM buffer #4407

merged 7 commits into from
Dec 6, 2024

Conversation

gustawlippa
Copy link
Contributor

@gustawlippa gustawlippa commented Dec 3, 2024

This PR adds <delay> elements to stanzas rerouted through mod_stream_management. This element was already present when a user requested stream resumption, and had unacknowledged stanzas.

Currently, when a user reconnects, has unacknowledged stanzas, and doesn't request stream resumption, the stream management rerouting behaviour is to try to resend the stanzas to them, because MIM doesn't know if they have reached the user. However, if the user is using a different resource, the stanzas would get rerouted only after resume_timeout passes, which could be surprising without the <delay> element. Even when rerouting to the same resource before the timeout, I think adding the <delay> could help client developers detect retransmitted stanzas.

The <delay> elements sent by SM and mod_presence will now also include a description of where the delay is coming from (this is in allowed by https://xmpp.org/extensions/xep-0203.html). This may improve debugging in corner cases.

I've also applied some fixes to the sm_SUITE and reworked the resend_unacked_after_resume_timeout so that the testcase reflects the title. With the new changes it is indeed checking the after_resume_timeout part. I've added a similar case which checks the behaviour for users reconnecting with a different resource.

@mongoose-im

This comment was marked as outdated.

Copy link

codecov bot commented Dec 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.27%. Comparing base (d8bb43d) to head (4415dbe).
Report is 43 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4407      +/-   ##
==========================================
- Coverage   85.32%   85.27%   -0.05%     
==========================================
  Files         550      550              
  Lines       33905    33913       +8     
==========================================
- Hits        28929    28919      -10     
- Misses       4976     4994      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

It is rerouted when no session was resumed and the resume_timeout has passed.
This resulted in suprising messages, sometimes after a long time.
@gustawlippa gustawlippa force-pushed the fix-duplicated-mam-msg branch from 014ce65 to 53e2120 Compare December 5, 2024 08:59
@mongoose-im

This comment was marked as outdated.

This is allowed by the delayed delivery XEP. It should make it more clear where
the delay is coming from for the user and in the case of testing/debugging.
@mongoose-im

This comment was marked as outdated.

The tests weren't really waiting for a timeout before. Extracted the tests into
their own group and added one with different resource to show how it is handled
by MIM.
@gustawlippa gustawlippa force-pushed the fix-duplicated-mam-msg branch from 5300f81 to 3072292 Compare December 6, 2024 08:17
@mongoose-im

This comment was marked as outdated.

The initial presence response and delayed message/presence can come in any
order.
@mongoose-im

This comment was marked as outdated.

@gustawlippa gustawlippa marked this pull request as ready for review December 6, 2024 10:32
Copy link
Collaborator

@NelsonVides NelsonVides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very neat. Testing stream_management was always complicated, I remember excavating a long history of issues and tricky edge cases over git. This one looks like a very good improvement (and the PR description is gold) 👌🏽

src/stream_management/mod_stream_management.erl Outdated Show resolved Hide resolved
@mongoose-im

This comment was marked as outdated.

Comment on lines 625 to 626
{_From, _To, El} = mongoose_acc:packet(AccWithTS),
El
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mongoose_acc:element(Acc) will give you the element more cleanly than with packet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, thanks, let me change that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I amended the previous commit

@gustawlippa gustawlippa force-pushed the fix-duplicated-mam-msg branch from e242184 to b8d4427 Compare December 6, 2024 15:22
@gustawlippa gustawlippa force-pushed the fix-duplicated-mam-msg branch from b8d4427 to 4415dbe Compare December 6, 2024 15:22
Copy link
Collaborator

@NelsonVides NelsonVides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me 👌🏽

@mongoose-im
Copy link
Collaborator

mongoose-im commented Dec 6, 2024

elasticsearch_and_cassandra_27 / elasticsearch_and_cassandra_mnesia / 4415dbe
Reports root/ big
OK: 473 / Failed: 0 / User-skipped: 49 / Auto-skipped: 0


small_tests_27 / small_tests / 4415dbe
Reports root / small


small_tests_27_arm64 / small_tests / 4415dbe
Reports root / small


small_tests_26 / small_tests / 4415dbe
Reports root / small


ldap_mnesia_26 / ldap_mnesia / 4415dbe
Reports root/ big
OK: 2354 / Failed: 0 / User-skipped: 912 / Auto-skipped: 0


ldap_mnesia_27 / ldap_mnesia / 4415dbe
Reports root/ big
OK: 2354 / Failed: 0 / User-skipped: 912 / Auto-skipped: 0


dynamic_domains_mysql_redis_27 / mysql_redis / 4415dbe
Reports root/ big
OK: 4726 / Failed: 0 / User-skipped: 154 / Auto-skipped: 0


internal_mnesia_27 / internal_mnesia / 4415dbe
Reports root/ big
OK: 2496 / Failed: 0 / User-skipped: 770 / Auto-skipped: 0


pgsql_cets_27 / pgsql_cets / 4415dbe
Reports root/ big
OK: 4851 / Failed: 0 / User-skipped: 188 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / 4415dbe
Reports root/ big
OK: 4761 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


dynamic_domains_mssql_mnesia_27 / odbc_mssql_mnesia / 4415dbe
Reports root/ big
OK: 4756 / Failed: 0 / User-skipped: 124 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_27 / pgsql_mnesia / 4415dbe
Reports root/ big
OK: 4761 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


pgsql_mnesia_26 / pgsql_mnesia / 4415dbe
Reports root/ big
OK: 5152 / Failed: 0 / User-skipped: 128 / Auto-skipped: 0


cockroachdb_cets_27 / cockroachdb_cets / 4415dbe
Reports root/ big
OK: 4855 / Failed: 2 / User-skipped: 188 / Auto-skipped: 0

graphql_session_SUITE:admin_session:admin_session_cli:admin_list_users_with_status
{error,
  {{assertEqual,
     [{module,graphql_session_SUITE},
      {line,591},
      {expression,"lists : sort ( ActualJIDs )"},
      {expected,
        [<<"alice_admin_list_users_with_status_1224@localhost.bis/res1">>,
         <<"alice_admin_list_users_with_status_1224@localhost/res1">>]},
      {value,[]}]},
   [{graphql_session_SUITE,check_users,2,
      [{file,
         "/home/circleci/project/big_tests/tests/graphql_session_SUITE.erl"},
       {line,591}]},
    {graphql_session_SUITE,admin_list_users_with_status_story,3,
      [{file,
         "/home/circleci/project/big_tests/tests/graphql_session_SUITE.erl"},
       {line,416}]},
    {escalus_story,story,4,
      [{file,
         "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
       {line,72}]},
    {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1794}]},
    {test_server,run_test_case_eval1,6,
      [{file,"test_server.erl"},{line,1303}]},
    {test_server,run_test_case_eval,9,
      [{file,"test_server.erl"},{line,1235}]}]}}

Report log

pubsub_SUITE:dag+manage_subscriptions:modify_node_subscriptions_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_modify_node_subscriptions_test_2974@localhost/res1">>,
          escalus_tcp,<0.77764.0>,
          [{event_manager,<0.77740.0>},
           {server,<<"localhost">>},
           {username,<<"alicE_modify_node_subscriptions_test_2974">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.77740.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_modify_node_subscriptions_test_2974">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,<<"alice_modify_node_subscriptions_test_2974">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,{escalus_auth,auth_plain}},
           {wspath,undefined},
           {username,<<"alicE_modify_node_subscriptions_test_2974">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"0e0061e568b5f10d">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,433}]},
     {pubsub_tools,receive_and_check_response,4,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"}...

Report log


mysql_redis_27 / mysql_redis / 4415dbe
Reports root/ big
OK: 5131 / Failed: 0 / User-skipped: 149 / Auto-skipped: 0


pgsql_mnesia_27 / pgsql_mnesia / 4415dbe
Reports root/ big
OK: 5152 / Failed: 0 / User-skipped: 128 / Auto-skipped: 0


mssql_mnesia_27 / odbc_mssql_mnesia / 4415dbe
Reports root/ big
OK: 5147 / Failed: 0 / User-skipped: 133 / Auto-skipped: 0


cockroachdb_cets_27 / cockroachdb_cets / 4415dbe
Reports root/ big
OK: 161 / Failed: 0 / User-skipped: 2 / Auto-skipped: 0

@NelsonVides NelsonVides merged commit cf2de15 into master Dec 6, 2024
4 checks passed
@NelsonVides NelsonVides deleted the fix-duplicated-mam-msg branch December 6, 2024 16:01
@jacekwegr jacekwegr added this to the 6.3.1 milestone Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants