Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Messages still showing out of order #426

Closed
hloeung opened this issue Apr 9, 2021 · 38 comments · Fixed by #482
Closed

Messages still showing out of order #426

hloeung opened this issue Apr 9, 2021 · 38 comments · Fixed by #482

Comments

@hloeung
Copy link
Collaborator

hloeung commented Apr 9, 2021

Hi,

Thought it was fixed with #419 but it looks like it's still happening (as @axinojolais mentioned a few days ago):

Test one
```
Test two
Test three
```
Test four
Test five

With that, you can see that messages are received out of order below:

MAV2eqW7tg

Even without code blocks, for example:

Test one
Test two
Test three
Test four
Test five

Produces this:

U4OghjXOga

@42wim
Copy link
Owner

42wim commented Apr 14, 2021

Possible to run with debug and paste the output?

@hloeung
Copy link
Collaborator Author

hloeung commented Apr 22, 2021

Sure, planning to try reproduce it when I have time and see if it's easily fixed. It may even be to do with the introduction of an additional IRC client/proxy (for me, it's irssi client -> irssi proxy -> matterircd. For @axinojolais, it's irssi client -> bip (?!?) -> matterircd).

FWIW, I've been running matterircd reverting commit 2c3883a and so far haven't noticed out-of-order messages. Anyways, will keep digging.

@guilhermepiccoli
Copy link
Contributor

I am also facing this issue. My setup is hexchat <-> ZNC <-> matterircd.
I do have debug enabled...will paste a "redacted" example below of the issue

@guilhermepiccoli
Copy link
Contributor

ORIGINAL / Web-client

ONE
TWO
THREE
FOUR
FIVE
SIX

MATTERIRCD

[1d4] ONE
[1d5] TWO
[1d6] THREE
[1d7] FIVE
[1d8] FOUR
[1d9] SIX

Debug logs:

The issues seems to be that messages FOUR and FIVE got "glued" together, it shows as "FOUR\nFIVE" in the debug logs.
Then some parsing split them, but throw them in reverse order.

I hope this is useful for the issue debugging!
Cheers

redacted_dbg_logs.txt

@hloeung
Copy link
Collaborator Author

hloeung commented Apr 23, 2021

So I guess this is just fallout from switching to using buffered channels?

With the fix switching back to unbuffered channels, u.eventChan = make(chan *bridge.Event) or u.eventChan = make(chan *bridge.Event, 0)?

hloeung added a commit to hloeung/matterircd that referenced this issue Apr 23, 2021
@hloeung
Copy link
Collaborator Author

hloeung commented Apr 23, 2021

IIUC, in handleWsActionPost() (https://github.com/42wim/matterircd/blob/master/bridge/mattermost/mattermost.go#L813), a single message with multiple lines is split into multiple messages and put into the buffered channel (#383). Unfortunately, per https://stackoverflow.com/a/25795236, with buffered channels, we can't guarantee order of messages (the delivery?).

Correct me if I'm wrong.

@42wim
Copy link
Owner

42wim commented Apr 23, 2021

Well, we have 1 goroutine handling them and the get put in in order, so that's why I don't understand it or I must be missing something.

@guilhermepiccoli logs above show the issue, a LIFO problem :-)
I'm wondering why the FIVE and FOUR got glued together though.

@hloeung
Copy link
Collaborator Author

hloeung commented Apr 23, 2021

Well, we have 1 goroutine handling them and the get put in in order, so that's why I don't understand it or I must be missing something

Oh, just edited my previous comment as you were replying :)

I checked and we do indeed have only the 1 goroutine.

@guilhermepiccoli
Copy link
Contributor

Thanks @42wim and @hloeung ! Is there anything else I can provide in order to get us to a resolution?
Unfortunately my time is really full the next 2 weeks, or else I'd try to fix myself, hopefully not too complex

@42wim
Copy link
Owner

42wim commented Apr 28, 2021

@hloeung are you sure you were testing with the version that has #419 in it ? I talked to @guilhermepiccoli and he was using an older version that didn't had the fix.

I also can not reproduce it with current master

@hloeung
Copy link
Collaborator Author

hloeung commented Apr 30, 2021

Yeah, definitely using a version with #419 applied.

I wasn't able to easily reproduce it locally either. I'll have another go at reproducing it with debug/trace enabled this weekend.

@hloeung
Copy link
Collaborator Author

hloeung commented May 5, 2021

An update here, still haven't come across this yet and wasn't able to reproduce it myself.

I did stumble on #420 with a PR to fix that though so all is not in vein.

@42wim
Copy link
Owner

42wim commented May 5, 2021

Ok, i'm going to close this, if someone can reproduce it, feel free to reopen.

@42wim 42wim closed this as completed May 5, 2021
@hloeung
Copy link
Collaborator Author

hloeung commented Jun 2, 2021

Seen it today, with the latest matterircd. Sadly, while running without debug/trace logging.

@guilhermepiccoli did you end up sanitising the logs you had?

@guilhermepiccoli
Copy link
Contributor

Hey @42wim and @hloeung !
Finally I was able to sanitize the logs I have, which shows the issue with 0.24.1, trace enabled - hope this is helpful.

ooo_msg_0.24.1-sanitized.txt

@guilhermepiccoli
Copy link
Contributor

Also, can we reopen the issue @42wim ? Saw this issue a lot past few days, but only have the above log collection with trace enabled. Have a bunch with just the debug logs though...

Cheers!

@42wim 42wim reopened this Jun 9, 2021
42wim added a commit that referenced this issue Jun 9, 2021
This should fix any unordered messages, but no idea why they are
unordered in the first place...
@42wim 42wim closed this as completed in e097242 Jun 9, 2021
@42wim
Copy link
Owner

42wim commented Jun 9, 2021

@guilhermepiccoli and closed again.. no idea why it happens, I've now removed buffered channels for mattermost/slack.
master should fix it

@guilhermepiccoli
Copy link
Contributor

Thanks a bunch @42wim , gonna try now (will run the artifact generated by the test build of this commit).

@axinojolais
Copy link

I'm running with the patch above, and so far, no out-of-order messages.

Thanks !

@furai
Copy link

furai commented Jun 11, 2021

@guilhermepiccoli and closed again.. no idea why it happens, I've now removed buffered channels for mattermost/slack.
master should fix it

I'll give it a try as well. I've been experiencing a lot of out of order messages lately.

As to why it got closed (if you were actually asking about it, if not, I'm sorry for providing explanation anyway) - it's because of the commit message. If you put in it keywords like Fixes #21312312 then when it's gets included in the main branch - it will automatically close the mentioned issue. Source.

@guilhermepiccoli
Copy link
Contributor

Unfortunately, I'm still seeing the issues even with @42wim 's patch =/

@furai
Copy link

furai commented Jun 21, 2021

Yeah, I'm also experiencing out of order messages.
It's like if they get grouped by mattermost app into one then they appear in wrong order in matterircd. Probably enough to reproduce is to type 2 messages in quick succession on mattermost web or maybe inside the mattermost app. If they get grouped then they appear in IRC in reversed order.

@furai
Copy link

furai commented Jun 22, 2021

In web UI:
image

On IRC:
image

I'm using latest release build

@furai
Copy link

furai commented Jun 24, 2021

Adding to the above - the messages sent by this user were most likely sent using desktop mattermost app.

@42wim 42wim reopened this Jun 24, 2021
@42wim
Copy link
Owner

42wim commented Jul 4, 2021

Can the people (@hloeung, @guilhermepiccoli, @furai ) that still have the issue download this special build: https://github.com/42wim/matterircd/pull/445/checks?check_run_id=2985395608 (and select your correct artifact)

https://github.com/42wim/matterircd/suites/3155567672/artifacts/72567237 should be a direct link to the 64-bit linux one.

Now add ootrace=true (same way as trace=true) in your matterircd.toml (so not under the mattermost header)
You don't need to run with full --debug, the ootrace option will show info messages with timestamps of every (multiline) message (and content) going to the eventchan and received from the eventchan on the irc side.

Hopefully this way we can see if there's a channel out of order issue or something else, if you get hit by the issue, paste the last OOTRACE lines here.

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 4, 2021

Per IRC, I haven't come across or noticed it since the latest patch. I've pulled in the latest with OOTRACE logging anyways. Will update if I notice.

Thanks!

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 7, 2021

Okay, here it is:

INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ${MSG1}  to eventChan on 1625646007267419568: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ``` to eventChan on 1625646007267530504: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ${MSG2} to eventChan on 1625646007267579861: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ``` [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007268263014  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ${MSG1} [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007268319657  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ``` to eventChan on 1625646007272937234: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ${MSG2} [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007273574391  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: ${USER} sent ${MSG1}
```
${MSG2}
```
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ``` [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007275923369  module=matterircd

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 7, 2021

Another:

INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG1} to eventChan on 1625635903100315921: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG2} to eventChan on 1625635903100436639: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG3} to eventChan on 1625635903100489927: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG2} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903101329460  module=matterircd
INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG1} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903101347667  module=matterircd
INFO[2021-07-07T05:31:43Z] OOTRACE: ${USER} sent ${MSG1}

${MSG2}

${MSG3}

INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG3} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903107285778  module=matterircd

@42wim
Copy link
Owner

42wim commented Jul 11, 2021

@hloeung does MSG1 MSG2 or MSG3 contain ``` ?
or a reaction?

I have no idea btw how this is possible, msg's get set on the channel in order but get read out of order.

@42wim
Copy link
Owner

42wim commented Jul 11, 2021

@hloeung and other peeps, can you try https://github.com/42wim/matterircd/pull/445/checks?check_run_id=3041968404 ?
https://github.com/42wim/matterircd/suites/3211116698/artifacts/74279412 direct linux 64-bit build.

The change here is to remove the buffered channel completely although this shouldn't make any difference because it's getting set to 0 when initializing mattermost/slack but maybe I'm missing something :)

If you're seeing OOTRACE: creating non-buffered eventchan on start-up when connected to mattermost you're running the correct version.

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 12, 2021

Yep, upgraded yesterday and am running with the latest patch since then.

[ubuntu@irc-proxy mattermost]$ ./matterircd
INFO[2021-07-12T10:16:20Z] Running version 0.24.3-dev                    module=matterircd                                                                                                                                                              
INFO[2021-07-12T10:16:20Z] WARNING: THIS IS A DEVELOPMENT VERSION. Things may break.  module=matterircd
INFO[2021-07-12T10:16:20Z] Listening on 127.0.0.1:6697                   module=matterircd                                                                                                                                                              
INFO[2021-07-12T10:17:11Z] New connection: 127.0.0.1:44932               module=matterircd
INFO[2021-07-12T10:17:11Z] OOTRACE: creating non-buffered eventchan on start-up  module=matterircd                                                                                                                                                      
slack-go/slack2021/07/12 10:17:11 websocket_managed_conn.go:181: Connecting to RTM
...

I can see a difference in the logs already. Seems to be got, sending, got, sending, got, sending vs. previously it's bunch of gots followed by bunch of sendings.

Will keep watching.

@42wim
Copy link
Owner

42wim commented Jul 20, 2021

so fixed now? I'll do a release tomorrow then.

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 21, 2021

Yeah, I think so. Not noticed any more OOO messages.

@hloeung
Copy link
Collaborator Author

hloeung commented Jul 21, 2021

Spoke too soon.

On IRC:

|20:30 <user1> ```bash [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> mount error 110 = Connection timed out [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> msg1 [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> ``` [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest [ui1pfumdkf8nmkwd8zwc46zqch]

Matterircd OOTRACE logs:

INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg msg1 to eventChan on 1626813013329295391: 0
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg ```bash to eventChan on 1626813013329423442: 0                                                                                                                                                          
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest to eventChan on 1626813013329487053: 0                                                                                                 
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg mount error 110 = Connection timed out to eventChan on 1626813013329544542: 0                                                                                                                           
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg ```bash [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331397999 module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg msg1 [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331004892 module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331507745  module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg mount error 110 = Connection timed out [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013341611377  module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg ``` to eventChan on 1626813013345213396: 0
INFO[2021-07-20T20:30:13Z] OOTRACE: user1 sent msg1
```bash
$ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest                                                                                                                                                                                        
mount error 110 = Connection timed out
```
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg ``` [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013348128027  module=matterircd

So wtf?!?!?

@hloeung
Copy link
Collaborator Author

hloeung commented Aug 13, 2021

Still happening, a lot less frequent, but still seeing it. Most recent being:

INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 39       2 ftpuser to eventChan on 1628810281942359650: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 39       2 ftpuser to eventChan on 1628810281948257769: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 40       2 deploy to eventChan on 1628810281948595596: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: got      msg 40       2 deploy [@@3sbkx9gjypgg5d1f7frf79xfxc] on 1628810281953273716  module=matterircd
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: got      msg 39       2 ftpuser [@@3sbkx9gjypgg5d1f7frf79xfxc] on 1628810281954677351  module=matterircd
WARN[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: Message out of order, got msg 39       2 ftpuser [@@3sbkx9gjypgg5d1f7frf79xfxc] but seen 40  module=matterircd
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 40       2 deploy to eventChan on 1628810281955665837: 0                                                                                                     INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 41       2 chia to eventChan on 1628810281955869657: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 41       2 chia to eventChan on 1628810281955993984: 0                                                                                                       INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 42       2 centos to eventChan on 1628810281956149911: 0

Weird, I'm at a loss at where this would be.

@furai
Copy link

furai commented Oct 14, 2021

Still no fix for this?

@42wim
Copy link
Owner

42wim commented Nov 7, 2021

@furai well, if we don't know the root cause we can't fix it :(

42wim added a commit that referenced this issue Oct 14, 2022
This fixes #480 and as a bonus
also fixes #426 as this is now
just 1 big message for irc
@42wim
Copy link
Owner

42wim commented Oct 15, 2022

#482 should fix this as we're not splitting the lines from mattermost anymore before sending it to irc
So anyone still interested, please try this PR :)

hloeung pushed a commit to hloeung/matterircd that referenced this issue Oct 16, 2022
This fixes 42wim#480 and as a bonus
also fixes 42wim#426 as this is now
just 1 big message for irc
@42wim 42wim closed this as completed in #482 Nov 1, 2022
@42wim 42wim closed this as completed in e47926b Nov 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants