QueuedTracking setup on multiple servers (new guide) #134

Open
mattab opened this issue May 15, 2020 · 32 comments

Comments

@mattab
Member

mattab commented May 15, 2020

Here are some notes I wrote earlier; I thought it might be useful to put them in the FAQ?

How do I set up QueuedTracking on multiple tracking servers?

Say you have

  • 4 servers,
  • set up 8 workers in QueuedTracking,
  • and disabled the option in the settings to process the queue during the tracking request,

then on each of your 4 frontend servers, you need to:

  • set up 2 crontab entries that run every minute and execute the command:
    ./console queuedtracking:process --queue-id=X

where X is the queue ID. Each server handles 2 queues, so the 4 servers handle all 8 queues.

Queue IDs start at 0.
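For example, on server 1 the two cron entries could look like the sketch below (run them as the web server user; /path/to/matomo is a placeholder, and servers 2-4 would use queue IDs 2-3, 4-5 and 6-7):

    # server 1: process queues 0 and 1, once per minute each
    * * * * * php /path/to/matomo/console queuedtracking:process --queue-id=0 >/dev/null 2>&1
    * * * * * php /path/to/matomo/console queuedtracking:process --queue-id=1 >/dev/null 2>&1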

Notes:

  • execute the command ./console queuedtracking:monitor to track the state of the queue
  • When using multiple workers it might be worth lowering the "Number of requests to process" setting to e.g. 15 in "Plugin Settings". By default, 25 requests are inserted in one step using a transaction, which means different workers might have to wait for each other. Lowering that number means each worker blocks the DB for less time.
  • you can optionally receive an email when the number of requests queued in a single queue reaches a configured threshold. You can configure this in your config/config.ini.php file using the following settings:
    [QueuedTracking]
    notify_queue_threshold_emails[] = example@example.org
    notify_queue_threshold_single_queue = 250000
@okossuth

okossuth commented Feb 3, 2021

Hello
Can multiple workers process the same queue ID? We have a situation where we originally had 4 workers processing 4 queues, but due to slowness in our setup those workers were not fast enough to process the queues, and now we have a ton of requests pending to be processed. Can we use multiple workers to process these specific 4 queues somehow? Thanks

@danielsss

> Hello
> Can multiple workers process the same queue ID? We have a situation where we originally had 4 workers processing 4 queues, but due to slowness in our setup those workers were not fast enough to process the queues, and now we have a ton of requests pending to be processed. Can we use multiple workers to process these specific 4 queues somehow? Thanks

+1

@okossuth

okossuth commented Feb 3, 2021

Forgot to mention we are using Matomo 3.14.1.

@tsteur
Member

tsteur commented Feb 4, 2021

Hi @okossuth @danielsss
Multiple workers work on the same queue automatically if you don't set the queue-id option. However, they don't work on it at the very same time in parallel; they work on it one after another. It's not possible for multiple workers to work on the very same queue in parallel (only one after another), as otherwise the tracked data could end up wrong and random visits with e.g. 0 actions could be created.
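For example, every server could run the same cron entry without the queue-id option (the path is a placeholder); each worker then works through the queues one after another rather than in parallel:

    * * * * * php /path/to/matomo/console queuedtracking:process >/dev/null 2>&1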

@uglyrobot

Just a suggestion: if we used LPOP, or better BLPOP, that would eliminate potential race conditions, allow the use of only one shared queue, and let an unlimited number of workers process the same queue with no need for complicated locking. It would also scale to any level. Our workers stopped for a day, and now we have a 60GB queue that we are trying to catch up with, but it's taking forever as only one worker can process each queue.

The main downside is that if processing of the popped data fails, there are no retries. However, I don't think that's a big deal, and even if it is, you can work around it by adding the data back to the beginning of the list, or into a failed queue.
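To illustrate the idea only (the key name is made up, and this is not how the plugin works today): with a single shared list, BLPOP pops atomically, so any number of workers could consume the same queue without extra locking:

    # producer side: append a serialized tracking request to one shared list
    redis-cli RPUSH trackingRequests '<serialized request>'
    # any worker: atomically pop the oldest entry, blocking for up to 5 seconds if the list is empty
    redis-cli BLPOP trackingRequests 5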

@tsteur
Member

tsteur commented Feb 4, 2021

Thanks @uglyrobot. The problem is less about Redis and more about Matomo and how it tracks data. There's a related issue in core, e.g. matomo-org/matomo#6415: basically, if two workers were to work on the same queue and one worker processed the second tracking request of a visit slightly faster than another worker processed the first tracking request, Matomo could store wrong data in its database and sometimes even create multiple visits.

@StevieKay90

I seem to be having an issue with the following command, which is stopping me from executing this correctly:

./console queuedtracking:process --queue-id=X

When running ./console queuedtracking:process --queue-id=0, specifically for queue-id=0, it doesn't work; I get this error:

ERROR [2020-07-06 09:10:58] 4700 Uncaught exception: C:\inetpub\wwwroot\vendor\symfony\console\Symfony\Component\Console\Input\ArgvInput.php(242): The "--queue-id" option requires a value.

It works fine for ./console queuedtracking:process --queue-id=1

Is this a known issue or am I doing something incorrectly?

@tsteur
Member

tsteur commented Mar 10, 2021

@StevieKay90 could you send us the output of your system check? See https://matomo.org/faq/troubleshooting/how-do-i-find-and-copy-the-system-check-in-matomo-on-premise/ . The output should be anonymised automatically.

@StevieKay90

Thanks for the quick response!

It's here:

matomo_system_check.txt

@StevieKay90

StevieKay90 commented Mar 10, 2021

Hi Thomas, I've just found out that if you run ./console queuedtracking:process --queue-id=00 it works. Good help from the community!

One thing which is vexing me, though, is why queue 0 seems to be the fullest; it's not evenly distributing the load. The other queues have just a handful of requests in them, but queue 0 has over 200.
Is there a way to stop this?

[screenshot]

@tsteur
Member

tsteur commented Mar 10, 2021

Thanks for this. I still can't reproduce it just yet. @sgiehl any chance you have a Windows machine running Matomo and can try to reproduce this? I'm wondering if it's maybe Windows-related.

@sgiehl
Member

sgiehl commented Mar 10, 2021

@tsteur I don't have a Matomo running directly on Windows, but I could check if my Windows VM where I had set this up once is still running. I guess it's already outdated, though, and I would need to set it up again. Let me know if it's important enough to spend time on it.

@StevieKay90

StevieKay90 commented Mar 10, 2021

Hi all @tsteur @sgiehl, thanks for taking a look into this.

As you can see, it's quickly becoming a big problem for me; I'm going to have to stop queued tracking.

[screenshot]

This has happened since the upgrade; previously I hadn't run into this issue. Any interim advice would be great.

@tsteur
Member

tsteur commented Mar 10, 2021

@StevieKay90 could you remove the queue-id parameter? Then the requests in the first queue should get processed

@StevieKay90

@tsteur I have done; I'm not using the command line at all now, I'm using the "Process during tracking request" option.
It just seems to put the vast majority of requests into one queue, and as it's one worker at a time, it can't handle all the requests in the queue.

@tsteur
Member

tsteur commented Mar 10, 2021

@StevieKay90 it will likely catch up and process these requests. If, on the other hand, it consistently pushes more requests into the first queue overall, that might be because a lot of the requests are coming from the same IP address, or a lot of them use the same visitorId or userId (if the userId feature is used). It's possible that the visits in queue 0 simply weren't processed in the past because of the error you were getting.

@tsteur
Member

tsteur commented Mar 10, 2021

btw you could maybe also try --queue-id="0", not sure if that makes a difference on Windows

@StevieKay90

@tsteur the command --queue-id=00 seems to work on Windows to process queue 0. However, the problem I'm now suffering from is way deeper (I thought this was the issue, like you, but now I don't think it is). Previously, not stating an ID did actually process queue 0; it's just that
a) queue 0 seemed to be much bigger, and also b) write speed got really slow as the Redis DB grew, processing something like 500 records in 2 minutes. I've got pretty high-spec servers, so that was surprising. It could never clear it all, and it reached massive levels until Redis choked.
So now I'm wondering whether it was an error in the upgrade or a software config thing.

@StevieKay90

StevieKay90 commented Mar 11, 2021

@tsteur OK, I've done some research and have some very interesting findings!

Forcing queue ID: 0 : This worker finished queue processing with 3.2req/s (150 requests in 46.91 seconds)
Forcing queue ID: 1 : This worker finished queue processing with 39.01req/s (125 requests in 3.20 seconds)
Forcing queue ID: 2 : This worker finished queue processing with 42.12req/s (150 requests in 3.56 seconds)
Forcing queue ID: 3 : This worker finished queue processing with 38.92req/s (125 requests in 3.21 seconds)
Forcing queue ID: 4 : This worker finished queue processing with 44.05req/s (100 requests in 2.27 seconds)
Forcing queue ID: 5 : This worker finished queue processing with 39.85req/s (125 requests in 3.14 seconds)

So it's not that more requests are being routed to queue ID 0; it's just that the processing time of this specific queue is incredibly slow in comparison to the others!

UPDATE

I have now opted for 16 workers, as I figured that the relative speed of the other 15 would counterbalance that of the slow-moving queue 0.

[screenshot]

However, now queue 0 is performing a lot better (at about 12-20 req/s), but queue number 6 is now the naughty boy! There was nothing especially wrong in the verbose process output when I processed this queue manually, only the fact that it was slow and I could read most of the lines as they went by, when normally it's just a black and white fuzzy blur.

@tsteur
Member

tsteur commented Mar 11, 2021

@StevieKay90 any chance you're using our Log Analytics, for example, to track / import data? That would explain why more requests go into the first queue and why it's slower, since every request might consist of multiple tracking requests. Or, in case you do custom tracking with bulk tracking requests, that would explain it too.

It would likely be expected that another queue now has more entries if you're not using the regular JS tracker. It would be great to know how you track the data @StevieKay90

@StevieKay90

StevieKay90 commented Mar 12, 2021 via email

@tsteur
Member

tsteur commented Mar 14, 2021

Let us know how you go with the downgrade to Matomo 3. Generally, there wasn't really any change in queued tracking, so I don't think it would make a difference. It would be interesting to see, though.

@StevieKay90

@tsteur out of interest, is queued tracking compatible with PHP 8?

@tsteur
Member

tsteur commented Mar 17, 2021

AFAIK it should be @StevieKay90

@bitactive

Hi,

we are using QueuedTracking on 3 frontend servers, each with 24 cores, and a backend DB + Redis server with 128 cores and 1 TB RAM.
We are tracking a single website with a billion monthly pageviews.
The DB has little workload; Redis uses ~24% of a CPU core.

Having 16 queues, 10 requests per batch, and processing 6 queues on the 1st frontend and 5 queues on each of the second and third frontends, each queue processor is hitting ~80% CPU, but the frontend servers still have spare CPU power. Is it possible to increase the number of queues beyond 16 to get even more performance?
We have already written start scripts for the queue processors so they restart immediately after reaching NumberOfMaxBatchesToProcess and do not wait for cron for the remaining seconds until the full minute.
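For reference, a minimal sketch of such a start script (ours is more elaborate; the path and the pause are illustrative):

    #!/bin/bash
    # keep one worker pinned to a single queue and restart it immediately after it
    # reaches NumberOfMaxBatchesToProcess, instead of waiting for the next cron minute
    QUEUE_ID="$1"
    while true; do
        php /path/to/matomo/console queuedtracking:process --queue-id="$QUEUE_ID"
        sleep 1   # short pause so an empty queue does not cause a busy loop
    done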

Do you have any other advice to increase QueuedTracking capacity here?

@snake14
Contributor

snake14 commented Nov 15, 2023

Hi @bitactive. I'm sorry you're experiencing issues. Sadly, 16 is currently the maximum number of queues supported. You could try adjusting the number of requests processed in each batch. I believe that the default is 25. Any other recommendations @AltamashShaikh ?

@AltamashShaikh
Contributor

@bitactive We would recommend increasing the number of requests here.

@bitactive

@snake14 @AltamashShaikh I increased the number of requests from 10 per batch to 25 per batch. Now each of the 16 workers is at ~80% CPU, and total throughput (processed requests per second) increased by ~15%. We are still not able to process the queue in real time during peak hours with 16 workers, each at 80% CPU on 3.8 GHz cores.

What are further possible steps to increase efficiency, e.g. by an additional 100%? We do track one big website and have nearly unlimited resources for this (machines / CPU cores / memory).

@AltamashShaikh
Contributor

@bitactive What if you change the number of requests to 50?

@bitactive

bitactive commented Feb 23, 2024

@snake14 @AltamashShaikh Changing requests per batch to 50 gives another 10-15% throughput increase. Will try 100 soon as traffic increases.

In the meantime, I have another question about this configuration.

If I wanted to add a second big project to this Matomo instance, is it possible to configure it so that, for example, Matomo project #1 uses Redis queue 0 and Matomo project #2 uses Redis queue 1, and then run 16 workers for queue 0 and 16 workers for queue 1?

As far as I know, different Matomo projects can be processed independently, so it should be possible to direct requests from one project to one Redis queue and requests from the second project to another Redis queue, and then process them independently with another 16 workers?

@snake14
Contributor

snake14 commented Feb 25, 2024

Hi @bitactive. I'm glad that helped. As far as I can tell, each Matomo instance would need a separate Redis database. Can you confirm, @AltamashShaikh?

@AltamashShaikh
Contributor

@bitactive You can specify the database if you want to use the same Redis for 2 instances.
[Screenshot from 2024-02-26 06-53-24]
