Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Since December, there has been a freeze issue occurring on jbnc #70

Open
Madriix opened this issue Apr 9, 2024 · 20 comments
Open

Since December, there has been a freeze issue occurring on jbnc #70

Madriix opened this issue Apr 9, 2024 · 20 comments

Comments

@Madriix
Copy link
Collaborator

Madriix commented Apr 9, 2024

Since December 2023, I've been experiencing a random issue with JBNC, here's the explanation:

I have an IRC web client with approximately 550 concurrent users during peak hours (around 9 p.m.) and about 200 in the afternoon during school hours, so on JBNC, there are around 350 users during peak hours. Here's how to reproduce this random bug; it doesn't happen every time but at least once every 24 or 48 hours. Here's how to do it:

  • For example, it's 11 p.m. You start JBNC from an IRC web client, and this web client must have 2 channels on autojoin. Preferably, start it with an Android mobile phone.
  • You disconnect after a minute, for example, you turn off your phone without disconnecting JBNC from the IRC server. For this, I've set the bouncerTimeout to 1800 seconds, which is 30 minutes; it will disconnect after 30 minutes.
  • So from 11 p.m. until 8 a.m., I sleep, and then I turn on my computer, let's say 5 minutes later, from Firefox, I go to the IRC web client (jbnc), and I connect. Occasionally, I find myself stuck in the status tab of my web client.

I see myself connected from an mIRC (no-jbnc) on the channels where I have autojoin with JBNC, I am present on them, but with JBNC, I am stuck on status, I am on no channel. I then typed the PASS command to connect to JBNC from an mIRC client, and indeed I see that I am not on any channel, although I am present in the channels, JBNC refuses to display the channel on mIRC and on the applet. The only way for it to work is either to restart JBNC by typing "/jbnc quit" or I can also type "/part #channels" and then join the 2 channels, and then it works fine again. This is a significant UX issue because I have noticed that this problem not only happens to me but also to many others, so it's losing customers. I also encountered this bug for the first time in December 2023 during a change of dedicated server, but the problem is that neither the IRC web client nor JBNC is installed on the new dedicated server. However, to try to fix this issue, in January, I migrated the IRC web client and JBNC to the new dedicated server, but this did not fix the problem. I feel like the problem comes either from:

  • the web IRC client (for example, it joins the channels too quickly, I've tried both setting autojoin at raw 005 and also at the end of the MOTD)
  • or JBNC (I've tried to fix things related to "gone=settimeout()" and all that is automatic quit)
  • or from UnrealIRCd 6.1.1.1 (but I haven't updated it for a year and it was working fine before December); however, I activated the RPC stuff, but no idea if it could be because of that. Maybe a SASL problem too, because the irc and jbnc webclient uses it for 100% of users who have an account

Please note also that the problem does not exclusively come from leaving a space without doing anything between 11 p.m. and 8 a.m. because one day I was coding and trying to reconnect to the web IRC client (with JBNC), and I had to type "/jbnc quit" each time, and in an hour, this freeze happened quite often, to quickly fix it, I reconnected by typing /jbnc quit. There might be a conflict, for example, if one joins a channel and someone else joins it at the same time, causing the freeze? I'm not sure; I can't locate the bug.
I also noticed that this bug could also occur by stopping JBNC with "pm2 stop bouncer" and then starting it again with "start", with at least 300 people reconnecting within 10 minutes; here the bug also occurs randomly; it may not happen to me, but it probably does to other users.
This bug could be related to a connection issue, or quitting, or joining a channel, it's one of the three. I think, in my opinion, it doesn't come from quitting.

Do you have any idea where this problem might be coming from?

Have you ever experienced this bug?

Best regards.

@realrasengan
Copy link
Collaborator

If I understand correctly, are you saying that it isn't tracking the channels a user should be in correctly?

If a user turns their phone off, it should disconnect due to the "SHACK" which is the PING/PONG trades between jbnc and the user. I'm going to read through everything from connection to channels as well as the reconnect and see if I can figure this out!

@realrasengan
Copy link
Collaborator

realrasengan commented Apr 9, 2024

              if(!this.irc.password) {
                this.write(":*jbnc NOTICE * :*** This is a JBNC Server.  You must set a password.\n");
                this.badauth=true;
                this.end();
              }
              else if(commands[1]) {
                this.irc.nick=commands[1].trim();
                if(false&&this.irc.user) {
                  this.hash=hash(this.irc.nick+this.irc.user+this.irc.password+this.irc.server+this.irc.port.toString());
                  if(connections[socket.hash]) {
                    clientReconnect(this);
                  }
                  else {
                    clientConnect(this);
                  }
                }
              }
              break;
            case 'USER':
              if(!this.irc.password) {
                this.write(":*jbnc NOTICE * :*** This is a JBNC Server.  You must set a password.\n");
                this.badauth=true;
                this.end();
              }
              else if(commands.length >= 5) {
                this.irc.user = commands[1].trim();
                this.irc.realname = commands.slice(4).join(" ");
                if(this.irc.realname.substr(0,1)==':')
                  this.irc.realname=this.irc.realname.substr(1);
                if(BOUNCER_USER.length>0 && this.irc.user!=BOUNCER_USER) {
                  this.write(":*jbnc NOTICE * :*** Incorrect Username ***\n");
                  this.end();
                }
                else {
                  if(this.irc.nick) {
                    this.hash=hash(this.irc.nick+this.irc.password+this.irc.server+this.irc.port.toString());
                    if(connections[socket.hash]) {
                      clientReconnect(this);
                    }
                    else {
                      clientConnect(this);
                      if(DEBUG)
                        console.log("Connecting to "+this.irc.server+":"+this.irc.port);
                    }
                  }
                }
              }
              else {
                this.write(":*jbnc NOTICE * :*** Your IRC client is faulty. ***\n");
                this.badauth=true;
                this.end();
              }
              break;```

I think we should rewrite this. if(false&&this.irc.user) means that reconnect or connect will never be called if NICK is sent and USER is already set. Whereas, it will only be called if USER is set and NICK is already set. I wonder if this is the underlying issue?

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 9, 2024

Oh, I remember I had put 'false' here because I saw that it was already set to 'USER'. Actually, I put 'false' because on my IRC web client, every time people reconnected, sometimes they would write twice. The client socket was open two or three times, and people were writing in duplicates, flooding. So, I put 'false' here. Eventually, some time later, I noticed that the random flood bug continued, but after a few more months, it resolved itself. The issue was with the IRC web client, not with JBNC. It hasn't been doing it for a long time (or rarely), and I haven't removed this 'false' because since it was working well, I thought I might as well leave it as 'false'. I will test removing the 'false' and see how it goes for 2-3 days to observe the difference

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 10, 2024

Removing the false didn't change anything. Currently, I'm facing this bug:

001

002

If I refresh the page, it still doesn't work. Even when I connect to JBNC with mIRC using the /PASS command, I'm still not connected to the channel, and the /debug command doesn't show any trace of the channels where I'm normally present.

It's strange because this issue happens every morning lately, perhaps under the condition that the last connection was yesterday evening around 11 p.m. using a mobile phone, leaving the bouncer on but turning off the mobile phone (power off). Then, after 30 minutes, the bouncer automatically disconnects from the IRC with the "gone: settimeout()" message, and there's a long gap of several hours while I sleep. When I connect in the morning, this bug occurs.

Maybe during this time, there are many connections to JBNC from other users? Perhaps the bug occurs at 7:30 a.m., and I should wait until 8 a.m. or 4 p.m. before reconnecting to see if it still occurs. However, to be sure, I need to connect to JBNC with my mobile and wait until around 8 or 9 hours., then connect with my PC at 4 p.m. That's the only way I can see to reproduce the bug, but it's worth noting that it doesn't happen every time. Alternatively, it could be that my session doesn't close properly with the 'gone:settimeout()' command, or the web client IRC is joining channels too quickly upon connection.

It could also be a bug related to the joins channels.

@realrasengan
Copy link
Collaborator

Looking at the screenshot it's unclear, are you receiving this on reconnect

socket.write(":*jbnc PRIVMSG "+connection.nick+" :Attaching you to the network\n");

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 10, 2024

@realrasengan The screenshot is of the connection; otherwise, yes, upon reconnecting, I do receive this. I didn't capture the screenshot of the reconnection

@realrasengan
Copy link
Collaborator

That's really peculiar.

I'm very confused. I wonder if there is a way to log the data the irc client is receiving and sending to understand if something is not firing/is missing, or if its something else.

That would be the easiest way to track this down.

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 10, 2024

Yes, it is true that it would be good if my client could record all the logs only for my pseudo because otherwise it will not be easy to find the right lines. However, such a logging system does not exist, it will be long to set up, and I will have to think about how I could do it.

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 11, 2024

@realrasengan Unfortunately, the bug didn't occur this morning, even though I activated debugging by recording everything from the 'on('raw')' of IRC Framework connecting to JBNC. So, it records all the lines in an array called 'let DEBUG = []', and this DEBUG is only activated for my nickname 'Carly'. I even created a small function that allows downloading the result of the 'DEBUG' array into a file named 'result.txt'. So, I'll see next time, maybe tomorrow.

I also noticed a bug this morning on one of my bots created with IRC Framework. It's a bot named 'Tapavu' which does the same thing as '!Seen Pseudo' except it's '!tapavu pseudo'. I noticed there was a bug in the regex because when typing '!tapavu pseudo|away' with the '|', it only retrieves 'pseudo' and not '|away'. So, I fixed that with ".replace(/[-/\^$*+?.()|[]{}]/g, '\$&')" but then I had the idea to search all 'regex, test(' and 'match(' and 'replace(/' on JBNC to see if there could also be bugs with this '|' character as a value, and imagine here what it could do?"

                  for(x=0;x<this.channels[key].names.length;x++) {
                    if(this.channels[key].names[x].replace(/(&|~|@|%|\+)/,"")==_sender) {
                      _statut = ( /(&|~|@|%|\+)/.test(this.channels[key].names[x].substr(0,1)) ? this.channels[key].names[x].substr(0,1) : "" );

Let's imagine that this.channels[key].names[x] returns "@Pseudo", so here it's okay. But if it returns "@Pseudo|away" with the "|", could there be a bug? Maybe it doesn't cause an issue because there's never a "@" after the "|", but the search behavior could definitely glitch at some point over an entire month, right?

There's also this :

if(curchan.names[c].replace(/(&|~|@|%|\+)/,"")==_mode_target[_mode_count]) {

curchan.names[c] = Pseudo|away

if(this.channels[_channel].names[x].replace(/(&|~|@|%|\+)/,"")==_target)

this.channels[_channel].names[x] = If it doesn't find the "@" in the first "Pseudo," it searches for the "@" in "away." That's a bit conflicting. _target = Pseudo|away

Normally there shouldn't be a problem; it probably returns the right results as it's positioned now

Could there be problems for those who put Pseudo[away] or Pseudo{away} or Pseudo||[away] ?

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 11, 2024

I restarted the dedicated server 30 minutes ago after 60 days without a reboot. So, I started the ircd and jbnc, anope, the whole package, and 3 minutes later, I connected to jbnc and the bug occurred. Here are the two logs attached:

https://labs.discussionner.com/logs/First_connection_immediately_afte_ircd_started.txt : This is the first connection to the ircd with jbnc immediately after the dedicated server restarted. Here, the bug on the #ados channel is frozen; it doesn't want to display as a tab on either the web client or mIRC.

https://labs.discussionner.com/logs/A_new_connection_which_does_not_bug.txt : Here, I disconnected from the irc with /jbnc quit and then reconnected. This is the log of the second connection just after /jbnc quit. Here, the #ados channel is not frozen

If needed, I also have the log while reconnecting to jbnc without having done /jbnc quit, but there is nothing interesting in it

I don't see anything special in the two logs. Perhaps it could be a bug with either UnrealIRCd (with the who nick %tcuhsnfdaorl), or with jbnc, or maybe there's an issue with the webirc client, but I'm not sure. Alternatively, it could be a problem with the dedicated server. Would sending these two logs to UnrealIRCd be useful to explain the issue?

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 12, 2024

On the first connection that's bug, there's this:
@realname=15/F/01/Dnc;account=Carly;msgid=mkV4uDCVQsAqmTzM9vehQT-nas/I7IKzhq/FwrNxuWpCA;time=2024-04-12T11:19:37.889Z :Carly!o-dcs-fr816e6d8-1@dis-81D05E96.ip2ca47e81.cloud-4a840e.irccity.com JOIN #ados Carly :15/F/01/Dnc

But not on the new connection that isn't glitching. Perhaps there's an issue around this line:
https://github.com/freenode/jbnc/blob/master/bouncer.js#L545

I also tried with a second channel, which is #otaku. Similarly, it doesn't appear in the glitching connection but does appear in the non-glitching one:
@realname=15/F/01/Dnc;account=Carly;msgid=jQJ10Nv2BPdR5wISubzdB7-4y9ndpWR2Btml7YUUaNPdQ;time=2024-04-12T11:19:37.890Z :Carly!o-dcs-fr816e6d8-1@dis-81D05E96.ip2ca47e81.cloud-4a840e.irccity.com JOIN #otaku Carly :15/F/01/Dnc

Maybe it's because of the "extended-join"?

@realrasengan
Copy link
Collaborator

It could be. I haven't actually looked that deeply into all of extended-join code -- I think that was entirely contributed by you; so you might have the best luck, but please let me know and I'll try to help out as well if I can make some time!

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 12, 2024

I compared the code archive from July 2020 to the current one (April 12, 2024), and I didn't find anything suspicious.

Here are all the new habits I adopted between October and December 2023:

  • On my web IRC client, I removed individual commands like /join #channel1, /join #channel2, /join #channel3. Instead, I replaced them with /join #channel1,#channel2,#channel3.
  • I switched to a different dedicated server for the web IRC client. Previously, I was using CentOS 7.9 with cPanel, and now I'm on Debian 12 with Virtualmin.
  • I updated Node.js in December; I remember it was version 18 (probably).

I'll think about this new thing in one line: /join #channel1,#channel2,#channels3, maybe that's what's causing the problem

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 12, 2024

I manage to reproduce the bug exactly by modifying the IRC framework:
https://github.com/kiwiirc/irc-framework/blob/master/src/commands/handlers/channel.js#L209

Simply by removing this line handler.emit('join', data);, the channels on jbnc freeze identically.

There seems to be something peculiar between jbnc and the IRC framework. I have the impression that the "JOIN" section of the IRC framework is always duplicated.

At the end of the code, I added this:

if (command.nick == "Liska" && handler.network.cap.isEnabled('extended-join')) {
    console.log("Test", handler.network.cap.isEnabled('extended-join'));
    process.exit(1);
}

Every time I change handler.network.cap.isEnabled('extended-join') to !handler.network.cap.isEnabled('extended-join') or replace it with gecos_idx==1 or gecos_idx==2, it still detects both lines, as if jbnc is receiving messages in duplicate. I'm connected to an IRC web client whose server is the IRC framework, and I'm using jbnc between the client and the IRC framework. I'll need to try to understand why messages are arriving in duplicate from the IRC framework, perhaps because I'm connected to both the IRC web client (IRC framework) and jbnc is also connected to the IRC framework and processing its own messages, there seems to be some kind of conflict.

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 13, 2024

I located the bug, it doesn't come from unrealircd or jbnc. It originates either from the web client IRC, or from the IRC framework. In the event listener for "join" on the IRC framework or the web client, simply add if (data.channel=="#channel") return false;, and the channel freezes. It's related to that.

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 14, 2024

Finally, I don't know, I manage to reproduce the bug by doing this:

So by doing these three, I manage to reproduce the bug, but the bug does not occur identically because afterwards, I can connect to BNC with mIRC (without disconnecting from the IRC server first, leaving it online in the BNC), so here the blocked channels join correctly whereas with the original bug everything freezes, unable to display the channels.

I have the impression that the bug does not come from IRC Framework.
It may come from the BNC server (but it has been tested on two different dedicated servers, and very few lines have been modified since January 2023). The BNC server, even on its old dedicated server, the bug occurred.
Either it comes from the web IRC client, but I have no clue.
Or the bug comes from the new versions of NodeJS from November or December 2023.
Or the bug comes from UnrealIRCd (for example, in December, I may have activated things like exceptions on accounts, IP, and also tinkered with everything related to RPC).
Or the bug comes from the new dedicated server (Debian 12), because the first time I located the freezing bug on the channels was the first day UnrealIRCd was installed on the Debian, I turn everything on properly, the BNC server is still on its original server and there, the first startup of the UnrealIRCd server on this dedicated server, I connect and I was directly blocked, it had never done that to me before. I also found something strange on Debian 12 with UnrealIRCd, is that to do ./Config and make it's really fast on Debian 12 (maybe 2 minutes) and on the old CentOS 7.9 server, it took 5 minutes, and yet this server was more powerful than the new dedicated server (and also more expensive, €90 more per month for old server).
Either the bug is related to the new dedicated server where I use Apache2 and it may be misconfigured but I don't have the impression.
Or my dedicated server has a problem with "blocking" things, maybe an IPv4 or IPv6 conflict, I'm not sure.

This bug is very random, I manage to reproduce it only at the following times:

  • Either by shutting everything down (unrealircd/service/apache) and restarting the dedicated server. The freezing bug on channels will always occur after all restarts.
  • Or accidentally, at random times, but it's extremely rare and I have no idea if it's the same identical bug because I never checked if it was possible to see if I was in the channels by reconnecting to BNC with mIRC.
  • The bug is always related to the BNC server and a SASL account, it never occurs with anything else.

Maybe it's the Firefox browser, I'll test tomorrow with Chrome (I've always tested with Firefox), because a second person I was able to see that she had the identical bug, and I just noticed that she also had Firefox

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 15, 2024

Today, I restarted the dedicated server, and the bug did not occur. It's the first time (no idea if it's just a coincidence). Just before restarting the dedicated server, I modified JBNC to remove a "return;" in those places, around number 1433: https://github.com/freenode/jbnc/blob/master/bouncer.js#L1433 Do you know if it's risky to place a return; at that location?
Of course it wasn't a pure return, there were also ifs to manage privmsg and ctcp rejection

@realrasengan
Copy link
Collaborator

What was the logic around it?

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 16, 2024

@realrasengan
In the CTCP if statement, there was a "return;" which I replaced with "continue;" instead. I'm not sure if the "return;" could have caused any issues.

The join freeze bug hasn't recurred. We should restart the dedicated server and check if it happens again.

This code is placed on line 1433 :

          // Check the last 400 messages that will be sent on the irc web client to sort the new messages
          if (MSG_REDISTRIBUTION && this.messagetags && lines[n].startsWith("@")) {
                // Ignore CTCP request/responses
                if (
                    (lines[n].split(" ")[2] === 'PRIVMSG' || lines[n].split(" ")[2] === 'NOTICE') &&
                    lines[n].split(" ").splice(4).join(' ').substr(1) && lines[n].split(" ").splice(4).join(' ').substr(1)[0] === '\x01'
                ) {
                    // We do want to log ACTIONs though
                    if (!lines[n].split(" ").splice(4).join(' ').substr(1).startsWith('\x01ACTION ')) {
                      if(DEBUG)
                        console.log("Ignoring CTCP");
                      //return; // Old
                      continue; // new test
                    }
                }

                if (lines[n].split(" ")[2] === "PRIVMSG" || lines[n].split(" ")[2] === "NOTICE") {
                  for(key in this.buffers) {
                    if(this.buffers.hasOwnProperty(key)) {
                      _n = lines[n].split(" ")[3]; //<target> in PRIVMSG <hereTarget> :... - Nom du salon ou nom du pv

                      let message = lines[n];
                      let atIndex1 = message.indexOf(" :");
                      let atIndex2 = message.indexOf("!");
                      let source = message.substring(atIndex1 + 2, atIndex2);


                      //console.log("TEST : ",lines[n]);
                      let count = 0;

                      for (let x = 0; x < this.buffers[key].privmsgnotice.length; x++) {
                        if (this.buffers[key].privmsgnotice[x].target === _n) {
                          count++;
                      
                          if (count >= 300) {
                            this.buffers[key].privmsgnotice.splice(x, 1);
                            break;
                          }
                        }
                      }

                      // Adding all messages
                      this.buffers[key].privmsgnotice.push({
                        source: source,
                        target: _n, 
                        line: lines[n]+"\n"
                      });
                    }
                  }
                }
                else if (lines[n].split(" ")[2] && validActionsRedist.has(lines[n].split(" ")[2].trimEnd())) {
                  for(key in this.buffers) {
                    if(this.buffers.hasOwnProperty(key)) {
                      _n = "server.irc";

                      //console.log("TEST : ",lines[n]);
                      let count = 0;

                      for (let x = 0; x < this.buffers[key].privmsgnotice.length; x++) {
                        if (this.buffers[key].privmsgnotice[x].target === _n) {
                          count++;
                      
                          if (count >= 300) {
                            this.buffers[key].privmsgnotice.splice(x, 1);
                            break;
                          }
                        }
                      }
                      
                      // Adding all messages
                      this.buffers[key].privmsgnotice.push({
                        target: _n, 
                        line: lines[n]+"\n"
                      });
                    }
                  }
                }
          
          }

@Madriix
Copy link
Collaborator Author

Madriix commented Apr 21, 2024

@realrasengan For a while, I've noticed an issue where sometimes the IRC client wouldn't connect to the server, although it happened rarely. To fix the problem, I had to restart the web client. I might have found a clue about what was causing it:

https://github.com/freenode/jbnc/blob/master/lib/ClientConnect.js#L147

Above line 147, I added a console.log(lines); and observed that sometimes it was failing on the very first line, randomly. It wasn't getting placed into an object/array; it was being written directly without an array and with a comma at the very beginning of the line.

On this line: https://github.com/freenode/jbnc/blob/master/lib/ClientConnect.js#L139, I replaced .split('\n'); with .split('\r\n'); It seems to be working better now; all lines are properly within an array, including the first one. Perhaps this change should make it more efficient now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants