Site sluggishness #1548
* Link in the FAQ for this. Post OpenUserJS#944 OpenUserJS#970 OpenUserJS#389 ... missed somewhere around OpenUserJS#976 to OpenUserJS#1208 *(vaguely recall this was on the script homepage originally and moved to the source code page)*. Needed for OpenUserJS#1548 to calm network traffic issues, which appear to be global with Level3. Over 17,000 sites are down according to pingdom.
Had a friend in Wisconsin (say cheese ;) try it and it's not loading for him either. Here's my traceroute:

```
$ time traceroute openuserjs.org
traceroute to openuserjs.org (104.236.255.50), 30 hops max, 60 byte packets
1 *(Intranet)* 0.614 ms 0.646 ms 0.672 ms
2 *(ISP)* 10.865 ms 10.872 ms 16.676 ms
3 *(ISP hop)* 20.595 ms 20.588 ms 20.592 ms
4 *(ISP hop)* 17.291 ms 17.301 ms 17.815 ms
5 *(ISP hop)* 16.543 ms 17.109 ms 17.116 ms
6 *(ISP City hop)*.Level3.net 17.123 ms 16.745 ms 16.750 ms
7 ae-2-3602.ear4.Newark1.Level3.net (4.69.211.181) 64.112 ms 59.858 ms 59.861 ms
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
real 0m25.048s *(Time to site way too long for these requests)*
user 0m0.003s
sys 0m0.004s
```

Of the sites monitored by https://livemap.pingdom.com/ it's up to ~18,000 now, but the Level3 map URL is more specific to when this issue was opened.
I don't believe this is a MongoLabs or AWS connection issue, as dev is working at normal speed. Thinking the internet backbone is damaged via Level3 (owned by CenturyLink now... which probably answers why it is broken). Anyhow... no new status updates. Spent a wonderful time with my ISP confirming they deny anything is problematic with their service (and cross-referenced with a different provider out of state). Temporarily downgraded node... not the issue, and tested a rollback to about a85b989 ... no change.
Some summary info and tests our VPS provider had me run:
I've triple-checked the firewall. Functional and okay. Still awaiting any further suggestions, including some preferred results... until then it's the waiting game. Ughh. Still sitting around 1% to 9% process usage (a few spikes when downloading certain items, but nominal otherwise). Removed the letsencrypt package and that seemed to improve things from 3% (or there are just more ppl trying). Google Public DNS is still hammering UDP (plus I'm trying it myself instead of my ISP's... no perceptible difference versus native). I'd block it, but then anyone using it wouldn't be able to reach us. Heh.
When it rains it pours... yet another issue, bleh (cascading is my guess). Should rule out node by using nvm atm; e.g. it passed all tests back then and even now... however, since this issue appeared out of "thin air" I'm led to believe we need a VPS migration to a newer distro. It's near the holidays and I'm super busy, but I'll try to squeeze something in after @sizzlemctwizzle responds. Will try the project in a VM first though, as I've had that prepped since March. Until then this is BLOCKING, which means sizzle has to unblock it when migration occurs. Found that the async dep is lagging on
Facing 2-3 minute loads on every page over a VPN-secured connection through Iceland from Pakistan. I'm on a 20 Mbps up/down fiber connection from a Tier 2 ISP. Tracert:
@brazenvoid Thanks for the continued patience. This is going to be interesting to get done in between the holiday stuff I have planned, because I'll be AFK for quite a bit of it... so please continue to be patient. This might take a couple of weeks since I don't have full access. Sorry... but I'm trying. :) Until then just try to use the site as is... that's the best recommendation at the moment.
Hmmm, the new distro via MongoDB only has MongoDB 4... that can present a problem/delay with express-brute-mongo.
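For those new here: express-brute-mongo sits directly on a raw driver collection, which is why a MongoDB server/driver jump is touchy. A minimal sketch of the usual wiring per its README at the time; the URI and collection name are placeholders, not our production values:

```js
// Sketch only: README-style express-brute-mongo wiring (mongodb 2.x driver).
// The 2.x driver passes `db` to connect()'s callback; 3.x+ passes a `client`
// instead, so this glue is exactly where a driver bump can break things.
const express = require('express');
const ExpressBrute = require('express-brute');
const MongoStore = require('express-brute-mongo');
const { MongoClient } = require('mongodb');

const app = express();

const store = new MongoStore(function (aReady) {
  // Placeholder URI; production value intentionally not shown.
  MongoClient.connect('mongodb://127.0.0.1/test', function (aErr, aDb) {
    if (aErr) throw aErr;
    aReady(aDb.collection('bruteforce-store'));
  });
});

const bruteforce = new ExpressBrute(store);

// Any route guarded by `prevent` starts answering 429 when hit too often.
app.post('/auth', bruteforce.prevent, function (aReq, aRes) {
  aRes.send('Success!');
});

app.listen(8080);
```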
OOOH.. not good... local pro delay with the next version of the distro in a VM. Guess I'll need to add that async test back in a local branch. It is about peak internet time, so that could be a little bit of it too. Will do some more thorough testing.
Okay... pro was dragging AWS and MongoLabs down in local pro. Killed pro, and local pro is at top speed (my usual perusing of the site) atm. Still have the lurking Level3 issue in my outbound network as a possible additional reason. Restored pro to "online-expected".
I'm still recommending the VPS upgrade to a new one, esp. to (hopefully) resolve this issue. Adding the extra security will take a bit more time on the VPS, but once it's all in place I can't move the DNS; that's something you'll need to do. Please stick with IPv4 for now; some of our deps don't currently do well with IPv6. Plus this will cost some extra during the setup to migrate from them. I'm building a list of what needs to be done. We really need the next level up with more vCores, which is more moolah per month. I'm about beat from lack of sleep, so I'll await your response(s) please.
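On the "stick with IPv4" point, for reference: a minimal sketch (assuming mongoose; the URI is a placeholder) of pinning the database connection to IPv4 so the deps that misbehave over IPv6 never see an AAAA result:

```js
// Minimal sketch: force IPv4-only DNS resolution for the Mongo connection.
// `family: 4` is a standard mongoose/driver connection option; the URI here
// is a placeholder, not our production value.
const mongoose = require('mongoose');

mongoose.connect('mongodb://example.mongolab.com:27017/somedb', {
  useNewUrlParser: true,
  family: 4 // skip IPv6 entirely; avoids deps that don't do well with it
});
```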
Hmmm, looks like I have the DNS access change... we'll know for sure in ~24 to 48 hours.
So before the DNS propagated to me, it was fine using the direct IP. Granted, no one else had that IP. There are very few things left that I can think of. The server was recreated in the same data center but with better CPU/mem/SSD stats. I can try to migrate it to another data center, but I'd have to do a backup first (and another IP change, and some other details too).
I'm also needing a break after 12 hours of this. P.S. It's still lagging, but lagging quicker, heh. Note the pingdom URL is up to > 21,000 atm. Misc notes:
**Misc test note**

Local pro test (for those new here, this means I'm not using our VPS provider's hosting, but actual pro(duction) is running atm) with this diff, at a code block point where the async dep has already contacted MongoLabs and succeeded in the callback but somehow messes up async now (this code block point hasn't changed in quite some time):

```diff
diff --git a/controllers/user.js b/controllers/user.js
index 37ec725..3fa8f27 100644
--- a/controllers/user.js
+++ b/controllers/user.js
@@ -390,6 +390,9 @@ exports.userListPage = function (aReq, aRes, aNext) {
async.parallel([
function (aCallback) {
+
+ console.timeEnd('userListPage()');
+
if (!!!options.isFlagged || !options.isAdmin) { // NOTE: Watchpoint
aCallback();
return;
@@ -440,6 +443,9 @@ exports.userListPage = function (aReq, aRes, aNext) {
tasks.push(execQueryTask(userListQuery, options, 'userList'));
//---
+
+ console.time('userListPage()');
+
async.parallel(tasks, asyncComplete);
};
@@ -462,6 +468,9 @@ exports.view = function (aReq, aRes, aNext) {
async.parallel([
function (aCallback) {
+
+ console.timeEnd('view() ' + username);
+
if (!options.isAdmin) { // NOTE: Watchpoint
aCallback();
return;
@@ -525,6 +534,9 @@ exports.view = function (aReq, aRes, aNext) {
tasks = tasks.concat(stats.getSummaryTasks(options));
//---
+
+ console.time('view() ' + username);
+
async.parallel(tasks, asyncComplete);
});
};
```

... produces this output on some random clicks of users and the user list:

```
view() -JesperJod: 332.085ms
view() -hoverboard: 69.237ms
view() -_ArmandLevas: 63.867ms
view() -lavienrose: 64.703ms
view() 00000H: 67.742ms
view() 0097gvk: 71.954ms
view() 04MR17: 75.263ms
view() 01018575475: 69.633ms
view() 007: 70.182ms
view() -mg-: 67.090ms
userListPage(): 405.968ms
view() 1544cman2000gmail.com: 70.499ms
view() 1solutions: 32389.779ms
view() 160004000: 25381.381ms
view() 1solutions: 64.616ms
...
```

Some are quick... some are realllllllllllllly slow. Same test with some more:

```
userListPage(): 134.631ms
userListPage(): 139.506ms
userListPage(): 109.424ms
userListPage(): 148.130ms
userListPage(): 147.365ms
userListPage(): 132.894ms
userListPage(): 136.901ms
userListPage(): 125.249ms
userListPage(): 129.335ms
userListPage(): 127.833ms
userListPage(): 119.766ms
userListPage(): 135.394ms
userListPage(): 132.066ms
userListPage(): 234.387ms
userListPage(): 769.262ms
userListPage(): 121.794ms
userListPage(): 111.775ms
userListPage(): 122.818ms
userListPage(): 108.172ms
userListPage(): 388.980ms
userListPage(): 352.293ms
userListPage(): 149.513ms
userListPage(): 172.532ms
userListPage(): 150.295ms
userListPage(): 594.304ms
userListPage(): 221.400ms
userListPage(): 195.818ms
userListPage(): 174.826ms
userListPage(): 198.745ms
userListPage(): 204.019ms
userListPage(): 135.008ms
userListPage(): 105.093ms
userListPage(): 112.189ms
userListPage(): 93.537ms
userListPage(): 111.668ms
userListPage(): 235.212ms
userListPage(): 259.961ms
userListPage(): 304.103ms
userListPage(): 318.409ms
userListPage(): 489.174ms
userListPage(): 582.791ms
userListPage(): 139.654ms
userListPage(): 122.359ms
userListPage(): 771.824ms
userListPage(): 601.540ms
userListPage(): 199.029ms
view() 93Akkord: 19528.984ms
view() 99aintenough: 9438.322ms
view() 9tfall: 111.546ms
view() 9kopb: 69.801ms
view() AAK: 66.806ms
view() ADRENALINE1234: 71.058ms
view() AJMansfield: 24289.024ms
...
```

(These are the last 10 or so.)
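For anyone wanting to reproduce this kind of measurement outside the controllers: the technique in the diff above is just console.time/console.timeEnd labels wrapped around async.parallel. A simplified, self-contained version (the stub task simulates the MongoLabs round trip; label and delay values are made up):

```js
// Simplified sketch of the instrumentation above: label a batch of parallel
// tasks and report how long dispatch-to-completion takes.
const async = require('async');

function execQueryTaskStub(aName, aMs) {
  // Stand-in for the real execQueryTask(); pretends to be a Mongo query.
  return function (aCallback) {
    setTimeout(function () {
      aCallback(null, aName);
    }, aMs);
  };
}

const tasks = [
  execQueryTaskStub('userList', 120),
  execQueryTaskStub('flagged', 90)
];

console.time('userListPage()');
async.parallel(tasks, function (aErr, aResults) {
  console.timeEnd('userListPage()'); // e.g. "userListPage(): 123.456ms"
  if (aErr) {
    throw aErr;
  }
  console.log(aResults); // [ 'userList', 'flagged' ]
});
```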
Finally!!! Found evidence and confirmation of a network issue on production (from the VPS to MongoLabs):

```
2018-12-18 13:30:17.100 +00:00: Group rating NOT updated
aErr := MongoNetworkError: connection 10 to *clipped*.mongolab.com:*portclipped* timed out
aGroup := undefined
```
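That MongoNetworkError is the driver's socket timeout firing. One mitigation knob, hedged since exact defaults vary by driver version, is widening the connection's timeout/reconnect options; the values below are illustrative, not what production uses:

```js
// Sketch: widen driver network timeouts so transient backbone latency spikes
// surface less often as MongoNetworkError. Values are illustrative only.
const mongoose = require('mongoose');

mongoose.connect('mongodb://example.mongolab.com:27017/somedb', {
  useNewUrlParser: true,
  connectTimeoutMS: 30000, // time allowed to establish the TCP connection
  socketTimeoutMS: 45000,  // how long a socket may sit idle before timing out
  reconnectTries: 30,      // keep retrying rather than giving up outright
  reconnectInterval: 1000  // ms between retry attempts
});
```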
* This didn't seem to help in a direct test on production, but doing it since it's the logical thing to do with our current process manager. NOTE(S): We don't currently have clustering management in the project itself but may at some point in the future, so this could eventually use some improvement. Trying to keep this simple at the start for everyone.
* Go back to the original multiplier
* This value is per thread, otherwise it would be huge in each thread. Durr.
* Try triple multiplier
* Since commit notes said it didn't seem to help, let's try tripling the multiplier

Applies to #1548
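The commit notes above don't quote the actual setting, so hedging here: "per thread" plus a "multiplier" reads like libuv threadpool sizing. If that's the knob, the usual adjustment looks like the sketch below; the multiplier of 3 mirrors the "triple" wording and is otherwise arbitrary.

```js
// Hedged sketch -- assuming the setting in question is the libuv threadpool.
// It must be sized before any pooled I/O happens and applies per process, so
// under a process manager each worker gets this many threads (default is 4).
process.env.UV_THREADPOOL_SIZE = String(4 * 3); // triple the default

// ...only require the rest of the app after this point, e.g.:
// require('./app.js');
```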
**Misc test note**

I've temporarily audited URL usage (on pro) this morning for about 4 minutes on a single thread, and most of the requests are to .meta.js and .user.js, which is AWS and not MongoLabs.
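An audit like that only takes a few lines of Express middleware. A throwaway sketch; the categories, window, and counters are made up for illustration:

```js
// Throwaway request audit: tally hits per rough URL category for a few
// minutes on one thread, then dump and reset. Categories are illustrative.
const express = require('express');

const app = express();
const counts = { meta: 0, user: 0, other: 0 };

app.use(function (aReq, aRes, aNext) {
  if (/\.meta\.js$/.test(aReq.path)) {
    counts.meta++;  // metadata block requests (served via AWS)
  } else if (/\.user\.js$/.test(aReq.path)) {
    counts.user++;  // full script installs (also AWS)
  } else {
    counts.other++; // everything else, i.e. the MongoLabs-backed pages
  }
  aNext();
});

setInterval(function () {
  console.log(new Date().toISOString(), JSON.stringify(counts));
  counts.meta = counts.user = counts.other = 0;
}, 4 * 60 * 1000); // dump roughly every 4 minutes

app.listen(8080);
```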
* Delete op retested
* Please read their CHANGELOG... however it's a bug fix that we're looking for, and there is one.

May apply to OpenUserJS#1548
The main site is still sluggish for me (several minutes to load)... but the requests for .user.js (AWS, not Mongo?) are completely unresponsive. Nothing but HTTP code 429 (too many requests) 95% of the time, and the other 5% return HTTP code 444 (unknown).
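Since 429 is an explicit back-off signal, the polite client-side handling is a delayed retry rather than an immediate re-request. A sketch using Node's https module; the URL is illustrative, and real user agents/scripts would have their own retry policy:

```js
// Sketch: treat HTTP 429 as "slow down" -- retry with a growing delay,
// honoring Retry-After when the server provides it. URL is illustrative.
const https = require('https');

function fetchWithBackoff(aUrl, aAttempt) {
  aAttempt = aAttempt || 0;

  https.get(aUrl, function (aRes) {
    if (aRes.statusCode === 429 && aAttempt < 5) {
      aRes.resume(); // drain and discard the throttled response

      const retryAfter = parseInt(aRes.headers['retry-after'], 10);
      const delayMs = retryAfter
        ? retryAfter * 1000
        : 1000 * Math.pow(2, aAttempt); // 1s, 2s, 4s, ...

      setTimeout(fetchWithBackoff, delayMs, aUrl, aAttempt + 1);
      return;
    }

    console.log(aRes.statusCode, aUrl);
    aRes.resume();
  }).on('error', function (aErr) {
    console.error(aErr.message); // e.g. socket hang up on a 444-style close
  });
}

fetchWithBackoff('https://openuserjs.org/install/username/scriptname.user.js');
```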
* Change a few classes around for UI coloring and display
* Shows exclamation on the script homepage and script lists that there **may** be user initiated, system initiated, etc. notices on the Install button or Raw Source button, e.g. the Source Code tab contents.
* Few mustache identifier name changes for symmetry

Applies to #1548 #432
This is still true for lockdown, but there is some advance warning now other than the about page. The blue Install button becomes orange-ish if there are any "Source Code notices", including script source lockdown. All may still be found in the dropdown.
* Little less prominent for warnings vs. a possible critical issue
* More in line with the docs that mostly say "blue install button"
* Add `updateURL` check for all modes and display if present
* Reorder the UI notices a bit.
* Some line length conformance

Post OpenUserJS#1632 and applies to OpenUserJS#1548 OpenUserJS#432
* Chromium 75.0.3770.90 started spewing this out and it's not in the *mime-db* dep (yet?)... Relates to `/install/<username>/<scriptname>.meta.js`. Don't think it has an extension spec based off skimming the doc
* Relaxing is temporary atm in lieu of a more aggressive re

Post OpenUserJS#1633 OpenUserJS#1632 OpenUserJS#944 and applies to OpenUserJS#1548 OpenUserJS#432
* Please read their CHANGELOGs
* Delete op retested
* Post OpenUserJS#1628 partial rollback now that v12.x is LTS and no longer the issue in OpenUserJS#1548
Created a new label for this since I put it on... it still means we should work on debugging, but AM can remove it.
* Add another default. This is a breaking change for third-party instances, so version bump

Closes OpenUserJS#1745 and applies to OpenUserJS#1548

NOTE:
* Issue OpenUserJS#1745 has been hung for over a year. If needed it may be reopened by appropriate personnel.
* Libraries will be affected in lockdown atm... first usage.
* Not sure visible (graceful) messages are needed, but we'll give it a whirl atm.

Post OpenUserJS#944 OpenUserJS#1548
* Second usage
* Autoban comes much sooner for bad actors

Post OpenUserJS#944 OpenUserJS#1548
* More can be done if needed

Post OpenUserJS#944 OpenUserJS#1548
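For context on "autoban comes much sooner": with express-brute that amounts to tightening the limiter options. The option names are from the express-brute README; the values below are illustrative, and the project's real thresholds aren't shown in these commits:

```js
// Hedged sketch of tightening express-brute so bad actors hit the wall
// sooner. By default the blocked response is a 429, which matches the
// "too many requests" replies reported above.
const ExpressBrute = require('express-brute');

// MemoryStore keeps the sketch self-contained; production would use the
// Mongo-backed store shown earlier in this thread.
const store = new ExpressBrute.MemoryStore();

const bruteforce = new ExpressBrute(store, {
  freeRetries: 2,         // fewer free attempts before throttling kicks in
  minWait: 5 * 60 * 1000, // first lockout lasts 5 minutes
  maxWait: 60 * 60 * 1000 // lockouts grow up to an hour
});
```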
So I've been contacting all kinds of people over the last 24 hours, with no clear resolution on why the site is sometimes super fast vs. super slow.
When I know anything more I'll let everyone know; however, there isn't much to be done, as everything has been triple-checked on our end (hence the dep updates a couple of times in the last few days [that's not typical for my updating], the server restarts, and an unannounced backup since it's about that time).
Anyhow... just letting everyone know we're on top of what we can do. Apologies for what we can't do.
P.S. When it's sitting idle in the browser (spinner spinning), it's doing nothing in the network management tools; i.e. your request isn't reaching us every time atm from the test points we tried, and our VPS (real person) confirmed that occasionally it's taking an excessive period in their testing.