add basic metrics of multilevel task queue #27

sticnarf · 2020-01-06T04:02:29Z

It's quite useful to know how long tasks run at each level and each level's chance of being scheduled when investigating how multilevel feedback scheduling affects behavior.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

BusyJay · 2020-01-06T05:38:28Z

Why not use rust-prometheus instead? In that case, users of the library don't even have to make their own wrappers.

sticnarf · 2020-01-06T05:42:39Z

@BusyJay We can't assume the user uses prometheus.

BusyJay · 2020-01-06T05:45:37Z

They can still access the data via prometheus's APIs.

/cc @breeswish, is it easy for other metrics implementations to work with rust-prometheus?

sticnarf · 2020-01-06T06:15:59Z

I don't think it's a good idea to use rust-prometheus in a library, because the registry needs to interact with the metrics. If the version of the registry differs from that of the metrics, which I believe is likely, then using prometheus in the library brings no benefit.

BusyJay · 2020-01-06T06:32:16Z

Why do we need to use the registry? What I mean is to use prometheus's counters and histograms. It should behave like rocksdb's metrics: https://github.com/facebook/rocksdb/blob/master/include/rocksdb/statistics.h.

sticnarf · 2020-01-06T06:45:45Z

Then what's the benefit of using prometheus's counters, just an atomic wrapper?
If the versions don't match, the counters and histograms cannot be used with the prometheus registry that the user uses, so it doesn't bring any more convenience. Why bother introducing a library only for its counter type?

sticnarf · 2020-01-06T06:53:21Z

It is easy for us to unify the version of rust-prometheus used by tikv and this library. But that doesn't change the fact that it's generally not a good idea to use rust-prometheus in a library.

BusyJay · 2020-01-06T07:12:57Z

> Then what's the benefit of using prometheus's counters, just an atomic wrapper?

The benefits are:

- We don't need to define our own types, such as counter, histogram, or registry. For example, to check the performance of a benchmark, we can dump the metrics easily instead of building a struct and customizing its output from scratch.
- It's very easy to integrate with other systems; all that needs to be done is registry.register.

> If the versions don't match, the counters and histograms cannot be used with the prometheus registry

I think it's very similar to the log crate. The log crate has a similar layout, but that doesn't prevent it from being widely used by both libraries and binaries. I don't think version mismatches are a reason to avoid using another library.

sticnarf · 2020-01-06T07:21:27Z

OK. I can switch to prometheus if you have a strong opinion.

PS: The log crate uses a trick to deal with compatibility. log 0.3 depends on log 0.4, so the types can be used across versions. We can apply this trick to rust-prometheus too, I think. /cc @breeswish

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf · 2020-01-07T03:16:17Z

@BusyJay PTAL again.
I chose to use a global static MetricVec so that if the user runs several thread pools at the same time, it is convenient to register the vec only once and get information about all thread pools.
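As a rough illustration of the "global static, shared by every pool" idea, here is a minimal sketch that uses only std atomics as a stand-in for rust-prometheus. The names `LevelCounterVec`, `LEVEL_ELAPSED_US`, `Pool`, and `LEVEL_NUM` are hypothetical and are not taken from this PR.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Number of queue levels; a hypothetical value for this sketch.
pub const LEVEL_NUM: usize = 3;

/// Stand-in for a labelled counter vector: one counter per level.
/// This is NOT the rust-prometheus type, only an std-based illustration.
pub struct LevelCounterVec {
    counters: [AtomicU64; LEVEL_NUM],
}

impl LevelCounterVec {
    pub const fn new() -> Self {
        Self {
            counters: [AtomicU64::new(0), AtomicU64::new(0), AtomicU64::new(0)],
        }
    }

    /// Adds `micros` to the counter of `level`.
    pub fn inc_by(&self, level: usize, micros: u64) {
        self.counters[level].fetch_add(micros, Ordering::Relaxed);
    }

    /// Reads the counter of `level` without resetting it.
    pub fn get(&self, level: usize) -> u64 {
        self.counters[level].load(Ordering::Relaxed)
    }
}

/// One process-wide instance: every pool writes here, so a consumer
/// only has to look at (or register) a single object.
pub static LEVEL_ELAPSED_US: LevelCounterVec = LevelCounterVec::new();

/// Hypothetical thread pool handle; real pools carry much more state.
pub struct Pool;

impl Pool {
    /// Records time spent on a task at `level` into the shared static.
    pub fn record_elapsed(&self, level: usize, micros: u64) {
        LEVEL_ELAPSED_US.inc_by(level, micros);
    }
}
```

With this layout, two `Pool` values recording into level 0 both show up in `LEVEL_ELAPSED_US.get(0)`, which is the convenience described above; the trade-off is that the pools are no longer distinguishable unless a pool label is added.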

BusyJay · 2020-01-18T03:32:49Z

src/queue/multilevel.rs

-        let total = self.total_elapsed.0.load(SeqCst);
-        if Duration::from_micros(total) < ADJUST_CHANCE_INTERVAL {
-            // Another thread just adjusted the chances.
+        let total = self.total_elapsed_us.get();


BusyJay: I think the memory ordering will be Relaxed for metrics. Is that OK?

sticnarf: I think it's OK as long as each task doesn't take very long to handle, so the inaccuracy doesn't have much effect.

BusyJay: I'm worried that total_diff can be zero now that reordering is allowed.

sticnarf: If total_diff is zero, the function returns at L297. The calculation is different now, and there is no risk of division by zero as there was before. Is anything else worrying?

BusyJay · 2020-01-19T05:56:58Z

src/queue/multilevel.rs

-            // Another thread just adjusted the chances.
+        let total = self.total_elapsed_us.get();
+        let total_diff = total - self.last_total_elapsed_us.get();
+        if total_diff < ADJUST_CHANCE_INTERVAL_US {


BusyJay: So the metrics report interval should be less than ADJUST_CHANCE_INTERVAL_US; otherwise this condition could always be true.

sticnarf: Sorry, I don't get what you mean. Reporting the metrics shouldn't reset their values, and last_total_elapsed_us is only set at L304.

BusyJay: Oh, it turns out I misunderstood how the metrics are processed.
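The two points settled in this thread (an early return when `total_diff` is below the interval, and reporting that never resets the counter) can be sketched as follows. This is a simplified stand-in built on std atomics, not the code in src/queue/multilevel.rs: the `ElapsedTracker` type, its method names, the interval value, and the use of `saturating_sub` are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering::Relaxed};

/// Assumed interval for this sketch; the real constant may differ.
pub const ADJUST_CHANCE_INTERVAL_US: u64 = 1_000_000;

/// Hypothetical holder of the two counters discussed in the thread.
pub struct ElapsedTracker {
    total_elapsed_us: AtomicU64,
    last_total_elapsed_us: AtomicU64,
}

impl ElapsedTracker {
    pub fn new() -> Self {
        Self {
            total_elapsed_us: AtomicU64::new(0),
            last_total_elapsed_us: AtomicU64::new(0),
        }
    }

    /// Called after a task is handled.
    pub fn add_elapsed(&self, micros: u64) {
        self.total_elapsed_us.fetch_add(micros, Relaxed);
    }

    /// Reporting only reads the monotonically growing total.
    pub fn report(&self) -> u64 {
        self.total_elapsed_us.load(Relaxed)
    }

    /// Returns `Some(total_diff)` when an adjustment should run.
    /// A zero or small diff returns early, so no later division by
    /// `total_diff` can hit zero.
    pub fn begin_adjust(&self) -> Option<u64> {
        let total = self.total_elapsed_us.load(Relaxed);
        let last = self.last_total_elapsed_us.load(Relaxed);
        // saturating_sub is a defensive choice of this sketch.
        let total_diff = total.saturating_sub(last);
        if total_diff < ADJUST_CHANCE_INTERVAL_US {
            return None;
        }
        // The only place where the "last" snapshot is updated.
        self.last_total_elapsed_us.store(total, Relaxed);
        Some(total_diff)
    }
}
```

Because `report` never writes, how often metrics are scraped has no influence on when `begin_adjust` fires; only the accumulated task time since the last adjustment matters.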

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

BusyJay · 2020-02-06T04:51:15Z

@AndreMouche @breeswish PTAL

AndreMouche

LGTM

AndreMouche · 2020-02-07T02:03:55Z

src/queue/multilevel.rs

-const MIN_LEVEL0_CHANCE: u32 = 1 << 31; // 0.5
-const MAX_LEVEL0_CHANCE: u32 = 4_209_067_949; // 0.98
-const ADJUST_AMOUNT: u32 = (MAX_LEVEL0_CHANCE - MIN_LEVEL0_CHANCE) / 8; // 0.06
+const INIT_LEVEL0_CHANCE: f64 = 0.8;


AndreMouche: Could we add a description of this value?

sticnarf: I added some comments to it. PTAL.
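For readers comparing the removed `u32` constants with the new `f64` one, the following sketch shows how the old values map to the fractions in their comments, assuming they are interpreted as a share of the `u32` range. The helper `chance_as_fraction` is hypothetical and not part of the PR.

```rust
// Constants copied from the diff above.
pub const MIN_LEVEL0_CHANCE: u32 = 1 << 31; // 0.5
pub const MAX_LEVEL0_CHANCE: u32 = 4_209_067_949; // 0.98
pub const ADJUST_AMOUNT: u32 = (MAX_LEVEL0_CHANCE - MIN_LEVEL0_CHANCE) / 8; // 0.06
pub const INIT_LEVEL0_CHANCE: f64 = 0.8;

/// Hypothetical helper: interprets a `u32` chance as a fraction of
/// `u32::MAX`. This interpretation is an assumption of the sketch.
pub fn chance_as_fraction(chance: u32) -> f64 {
    f64::from(chance) / f64::from(u32::MAX)
}
```

Under that assumption, `chance_as_fraction(MIN_LEVEL0_CHANCE)` is about 0.5, `chance_as_fraction(MAX_LEVEL0_CHANCE)` about 0.98, and `chance_as_fraction(ADJUST_AMOUNT)` about 0.06, which matches the trailing comments; the `f64` form states the probability directly.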

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

AndreMouche

LGTM

sticnarf added 3 commits January 6, 2020 11:52

add metrics for multilevel queue

881127f

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Merge branch 'master' into add-metrics

7bb2f8f

add queue_statistics to ThreadPool

50f4037

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf requested review from AndreMouche and BusyJay January 6, 2020 04:02

add static metrics for multilevel queue

f50e6f5

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf force-pushed the add-metrics branch from 074f36e to f50e6f5 Compare January 6, 2020 12:49

BusyJay reviewed Jan 18, 2020

BusyJay reviewed Jan 19, 2020

BusyJay previously approved these changes Jan 19, 2020

use latest prometheus and support setting namespace

1f11ed5

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf dismissed BusyJay’s stale review via 1f11ed5 February 6, 2020 04:28

sticnarf mentioned this pull request Feb 6, 2020

*: add metrics for unified read pool tikv/tikv#6534

Merged

AndreMouche previously approved these changes Feb 7, 2020

add some more comments to some constants

f01b6b7

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf dismissed AndreMouche’s stale review via f01b6b7 February 7, 2020 04:02

BusyJay approved these changes Feb 7, 2020

AndreMouche approved these changes Feb 7, 2020

sticnarf merged commit fe59cbc into tikv:master Feb 7, 2020
