Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument router and remove global spirals (S2S/component) #4319

Merged
merged 14 commits into from
Jul 17, 2024

Conversation

gustawlippa
Copy link
Contributor

@gustawlippa gustawlippa commented Jul 8, 2024

This PR removes global spirals and changes routingErrors metric into router_no_route_found instrumentation event.

The router_no_route_found event is a bit useless, as it will happen if none of the routing modules have routed a stanza (returned {done, Acc}), but S2S, as the last router, always tries to route a message and returns done. If it can't find the recipient, it sends the error stanzas on its own. So for this event to occur, S2S has to be disabled - and this is how I tested it.

I wasn't sure how to handle the metrics to be similar to C2S ones. There are two issues that we can still decide to change in this PR:

  • naming: I went with the _in/_out convention from Instrument/hook handlers: c2s #4310, but transport metrics for C2S,(which are more similar to the ones from this PR) use _received/_sent convention from Add C2S TCP/TLS listener instrumentation metrics #4304. I think it's best to either change them to use _in/_out as well, or I can change the names to _received/_sent in this PR.
  • number of metrics: I decided for this PR, to change the metrics one for one, but there could be more metrics, to mirror the ones for C2S. I created a PR with a proposed implementation of this: WIP metrics proposal #4322. However, I'm not certain if we need so many additional metrics. Additional idea is that all of them could be combined, and just have labels c2s/s2s/component.

@gustawlippa gustawlippa changed the base branch from master to feature/instrument July 8, 2024 11:49
@mongoose-im

This comment was marked as outdated.

Copy link

codecov bot commented Jul 8, 2024

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 84.41%. Comparing base (4ef5573) to head (578bd1b).

Files Patch % Lines
src/mongoose_transport.erl 89.47% 2 Missing ⚠️
Additional details and impacted files
@@                  Coverage Diff                   @@
##           feature/instrument    #4319      +/-   ##
======================================================
- Coverage               84.41%   84.41%   -0.01%     
======================================================
  Files                     551      551              
  Lines                   33840    33848       +8     
======================================================
+ Hits                    28567    28572       +5     
- Misses                   5273     5276       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

Because S2S always returns {done, Acc}, it has to be disabled for this test to
work. I think this is a bit forced, but at least the event is properly tested.
@mongoose-im

This comment was marked as outdated.

@gustawlippa gustawlippa changed the title WIP Instrument/router Instrument router and global spirals (S2S/component) Jul 10, 2024
@gustawlippa gustawlippa changed the title Instrument router and global spirals (S2S/component) Instrument router and remove global spirals (S2S/component) Jul 11, 2024
@mongoose-im

This comment was marked as outdated.

@gustawlippa gustawlippa marked this pull request as ready for review July 11, 2024 10:18
Copy link
Member

@chrzaszcz chrzaszcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. I added a few minor questions and notes.

doc/operation-and-maintenance/MongooseIM-metrics.md Outdated Show resolved Hide resolved
doc/operation-and-maintenance/MongooseIM-metrics.md Outdated Show resolved Hide resolved
src/mongoose_transport.erl Outdated Show resolved Hide resolved
src/mongoose_transport.erl Show resolved Hide resolved
@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

simple_message was removed from all_tests because it was already contained in
the connections_info testcase. Because of that, the S2S connection was already
up, and the expected number of events was different for both of those cases.
@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

For some reason, case with cachain sends more messages
@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

After removing the last global metric, we need to switch this suite to
use instrumentation, because otherwise ther would be nothing to report
to the Graphite mock.
@chrzaszcz chrzaszcz force-pushed the instrument/router branch from f193cb0 to 30e684c Compare July 16, 2024 11:56
@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the instrument/router branch from 6d28edf to 578bd1b Compare July 16, 2024 15:22
@mongoose-im
Copy link
Collaborator

mongoose-im commented Jul 16, 2024

elasticsearch_and_cassandra_26 / elasticsearch_and_cassandra_mnesia / 578bd1b
Reports root/ big
OK: 457 / Failed: 0 / User-skipped: 45 / Auto-skipped: 0


small_tests_25 / small_tests / 578bd1b
Reports root / small


small_tests_26 / small_tests / 578bd1b
Reports root / small


small_tests_26_arm64 / small_tests / 578bd1b
Reports root / small


ldap_mnesia_25 / ldap_mnesia / 578bd1b
Reports root/ big
OK: 2293 / Failed: 0 / User-skipped: 910 / Auto-skipped: 0


ldap_mnesia_26 / ldap_mnesia / 578bd1b
Reports root/ big
OK: 2292 / Failed: 1 / User-skipped: 910 / Auto-skipped: 0

domain_isolation_SUITE:two_domains:routing_one2one_message_to_another_domain_gets_dropped
{error,{test_case_failed,"Incorrect number of instrumentation events - matched: 0, expected: 1"}}

Report log

domain_isolation_SUITE:end_per_suite
{error,{test_case_failed,"Instrumentation events that were logged, but not tested:\n[{router_stanza_dropped,#{host_type => <<\"localhost\">>}}]\nYou need to test them with instrument_helper:assert/3"}}

Report log


dynamic_domains_mysql_redis_26 / mysql_redis / 578bd1b
Reports root/ big
OK: 4608 / Failed: 0 / User-skipped: 139 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / 578bd1b
Reports root/ big
OK: 4641 / Failed: 0 / User-skipped: 106 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_25 / pgsql_mnesia / 578bd1b
Reports root/ big
OK: 4641 / Failed: 0 / User-skipped: 106 / Auto-skipped: 0


internal_mnesia_26 / internal_mnesia / 578bd1b
Reports root/ big
OK: 2434 / Failed: 1 / User-skipped: 768 / Auto-skipped: 0

pubsub_SUITE:tree+basic:subscribe_options_separate_request_test
{error,{{badmatch,false},
    [{pubsub_tools,check_response,2,
             [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
            {line,444}]},
     {pubsub_tools,receive_response,3,
             [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
            {line,434}]},
     {pubsub_tools,receive_and_check_response,4,
             [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
            {line,424}]},
     {pubsub_SUITE,'-subscribe_options_separate_request_test/1-fun-2-',2,
             [{file,"/home/circleci/project/big_tests/tests/pubsub_SUITE.erl"},
            {line,436}]},
     {escalus_story,story,4,
            [{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
             {line,72}]},
     {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1793}]},
     {test_server,run_test_case_eval1,6,
            [{file,"test_server.erl"},{line,1302}]},
     {test_server,run_test_case_eval,9,
            [{file,"test_server.erl"},{line,1234}]}]}}

Report log


pgsql_mnesia_25 / pgsql_mnesia / 578bd1b
Reports root/ big
OK: 5032 / Failed: 0 / User-skipped: 115 / Auto-skipped: 0


dynamic_domains_mssql_mnesia_26 / odbc_mssql_mnesia / 578bd1b
Reports root/ big
OK: 4638 / Failed: 0 / User-skipped: 109 / Auto-skipped: 0


mysql_redis_26 / mysql_redis / 578bd1b
Reports root/ big
OK: 5013 / Failed: 0 / User-skipped: 134 / Auto-skipped: 0


pgsql_cets_26 / pgsql_cets / 578bd1b
Reports root/ big
OK: 4534 / Failed: 0 / User-skipped: 177 / Auto-skipped: 0


pgsql_mnesia_26 / pgsql_mnesia / 578bd1b
Reports root/ big
OK: 5032 / Failed: 0 / User-skipped: 115 / Auto-skipped: 0


mssql_mnesia_26 / odbc_mssql_mnesia / 578bd1b
Reports root/ big
OK: 5029 / Failed: 0 / User-skipped: 118 / Auto-skipped: 0


ldap_mnesia_26 / ldap_mnesia / 578bd1b
Reports root/ big
OK: 2293 / Failed: 0 / User-skipped: 910 / Auto-skipped: 0


internal_mnesia_26 / internal_mnesia / 578bd1b
Reports root/ big
OK: 2435 / Failed: 0 / User-skipped: 768 / Auto-skipped: 0

@chrzaszcz chrzaszcz marked this pull request as draft July 17, 2024 06:42
@chrzaszcz chrzaszcz marked this pull request as ready for review July 17, 2024 06:42
Copy link
Contributor

@JanuszJakubiec JanuszJakubiec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@JanuszJakubiec JanuszJakubiec merged commit 3d2012d into feature/instrument Jul 17, 2024
4 checks passed
@JanuszJakubiec JanuszJakubiec deleted the instrument/router branch July 17, 2024 08:27
@jacekwegr jacekwegr added this to the 6.3.0 milestone Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants