Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify and make counter atomic #567

Closed
wants to merge 1 commit into from

Conversation

kevgs
Copy link
Contributor

@kevgs kevgs commented Jan 21, 2018

additionally fix data race which looks like this:

WARNING: ThreadSanitizer: data race (pid=30515)
  Write of size 8 at 0x0000039ee908 by thread T21:
    #0 ib_counter_t<long, 64, counter_indexer_t>::add(unsigned long, long) storage/innobase/include/ut0counter.h:132:16 (mysqld+0x21cd166)
    #1 ib_counter_t<long, 64, counter_indexer_t>::add(long) storage/innobase/include/ut0counter.h:122:34 (mysqld+0x21cc102)
    #2 rw_lock_x_lock_wait_func(rw_lock_t*, unsigned long, long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:489:38 (mysqld+0x21cb91f)
    #3 rw_lock_x_lock_low(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:538:3 (mysqld+0x21c9339)
    #4 rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:698:6 (mysqld+0x21c8c4d)
    #5 pfs_rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/include/sync0rw.ic:568:3 (mysqld+0x1afbdb4)
    #6 buf_page_get_gen(page_id_t const&, page_size_t const&, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*) storage/innobase/buf/buf0buf.cc:4782:3 (mysqld+0x1b04e28)
    #7 btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long) storage/innobase/btr/btr0cur.cc:1312:10 (mysqld+0x1bf362c)
    #8 btr_pcur_open_low(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, unsigned long, mtr_t*) storage/innobase/include/btr0pcur.ic:457:8 (mysqld+0x20eecd0)
    #9 row_search_on_row_ref(btr_pcur_t*, unsigned long, dict_table_t const*, dtuple_t const*, mtr_t*) storage/innobase/row/row0row.cc:1030:3 (mysqld+0x20ee1b4)
    #10 row_purge_reposition_pcur(unsigned long, purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:103:23 (mysqld+0x20d7f6f)
    #11 row_purge_reset_trx_id(purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:678:6 (mysqld+0x20dcbcd)
    #12 row_purge_record_func(purge_node_t*, unsigned char*, que_thr_t const*, bool) storage/innobase/row/row0purge.cc:1062:4 (mysqld+0x20db46f)
    #13 row_purge(purge_node_t*, unsigned char*, que_thr_t*) storage/innobase/row/row0purge.cc:1111:18 (mysqld+0x20d8aa4)
    #14 row_purge_step(que_thr_t*) storage/innobase/row/row0purge.cc:1190:3 (mysqld+0x20d872c)
    #15 que_thr_step(que_thr_t*) storage/innobase/que/que0que.cc:1055:9 (mysqld+0x1fcfa6f)
    #16 que_run_threads_low(que_thr_t*) storage/innobase/que/que0que.cc:1117:14 (mysqld+0x1fcdd6e)
    #17 que_run_threads(que_thr_t*) storage/innobase/que/que0que.cc:1157:2 (mysqld+0x1fcd908)
    #18 srv_task_execute() storage/innobase/srv/srv0srv.cc:2520:3 (mysqld+0x21a6684)
    #19 srv_worker_thread storage/innobase/srv/srv0srv.cc:2567:7 (mysqld+0x21a6247)

  Previous write of size 8 at 0x0000039ee908 by thread T20:
    #0 ib_counter_t<long, 64, counter_indexer_t>::add(unsigned long, long) storage/innobase/include/ut0counter.h:132:16 (mysqld+0x21cd166)
    #1 ib_counter_t<long, 64, counter_indexer_t>::add(long) storage/innobase/include/ut0counter.h:122:34 (mysqld+0x21cc102)
    #2 rw_lock_x_lock_wait_func(rw_lock_t*, unsigned long, long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:489:38 (mysqld+0x21cb91f)
    #3 rw_lock_x_lock_low(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:538:3 (mysqld+0x21c9339)
    #4 rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:698:6 (mysqld+0x21c8c4d)
    #5 pfs_rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/include/sync0rw.ic:568:3 (mysqld+0x1afbdb4)
    #6 buf_page_get_gen(page_id_t const&, page_size_t const&, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*) storage/innobase/buf/buf0buf.cc:4782:3 (mysqld+0x1b04e28)
    #7 btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long) storage/innobase/btr/btr0cur.cc:1312:10 (mysqld+0x1bf362c)
    #8 btr_pcur_open_low(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, unsigned long, mtr_t*) storage/innobase/include/btr0pcur.ic:457:8 (mysqld+0x20eecd0)
    #9 row_search_on_row_ref(btr_pcur_t*, unsigned long, dict_table_t const*, dtuple_t const*, mtr_t*) storage/innobase/row/row0row.cc:1030:3 (mysqld+0x20ee1b4)
    #10 row_purge_reposition_pcur(unsigned long, purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:103:23 (mysqld+0x20d7f6f)
    #11 row_purge_reset_trx_id(purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:678:6 (mysqld+0x20dcbcd)
    #12 row_purge_record_func(purge_node_t*, unsigned char*, que_thr_t const*, bool) storage/innobase/row/row0purge.cc:1062:4 (mysqld+0x20db46f)
    #13 row_purge(purge_node_t*, unsigned char*, que_thr_t*) storage/innobase/row/row0purge.cc:1111:18 (mysqld+0x20d8aa4)
    #14 row_purge_step(que_thr_t*) storage/innobase/row/row0purge.cc:1190:3 (mysqld+0x20d872c)
    #15 que_thr_step(que_thr_t*) storage/innobase/que/que0que.cc:1055:9 (mysqld+0x1fcfa6f)
    #16 que_run_threads_low(que_thr_t*) storage/innobase/que/que0que.cc:1117:14 (mysqld+0x1fcdd6e)
    #17 que_run_threads(que_thr_t*) storage/innobase/que/que0que.cc:1157:2 (mysqld+0x1fcd908)
    #18 srv_task_execute() storage/innobase/srv/srv0srv.cc:2520:3 (mysqld+0x21a6684)
    #19 srv_worker_thread storage/innobase/srv/srv0srv.cc:2567:7 (mysqld+0x21a6247)

I'm contributing this new code of the whole pull request, including one or several files that are either new files or modified ones, under the BSD-new license.

additionally fix data race which looks like this:

WARNING: ThreadSanitizer: data race (pid=30515)
  Write of size 8 at 0x0000039ee908 by thread T21:
    #0 ib_counter_t<long, 64, counter_indexer_t>::add(unsigned long, long) storage/innobase/include/ut0counter.h:132:16 (mysqld+0x21cd166)
    #1 ib_counter_t<long, 64, counter_indexer_t>::add(long) storage/innobase/include/ut0counter.h:122:34 (mysqld+0x21cc102)
    #2 rw_lock_x_lock_wait_func(rw_lock_t*, unsigned long, long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:489:38 (mysqld+0x21cb91f)
    #3 rw_lock_x_lock_low(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:538:3 (mysqld+0x21c9339)
    #4 rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:698:6 (mysqld+0x21c8c4d)
    #5 pfs_rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/include/sync0rw.ic:568:3 (mysqld+0x1afbdb4)
    #6 buf_page_get_gen(page_id_t const&, page_size_t const&, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*) storage/innobase/buf/buf0buf.cc:4782:3 (mysqld+0x1b04e28)
    #7 btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long) storage/innobase/btr/btr0cur.cc:1312:10 (mysqld+0x1bf362c)
    #8 btr_pcur_open_low(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, unsigned long, mtr_t*) storage/innobase/include/btr0pcur.ic:457:8 (mysqld+0x20eecd0)
    #9 row_search_on_row_ref(btr_pcur_t*, unsigned long, dict_table_t const*, dtuple_t const*, mtr_t*) storage/innobase/row/row0row.cc:1030:3 (mysqld+0x20ee1b4)
    #10 row_purge_reposition_pcur(unsigned long, purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:103:23 (mysqld+0x20d7f6f)
    #11 row_purge_reset_trx_id(purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:678:6 (mysqld+0x20dcbcd)
    #12 row_purge_record_func(purge_node_t*, unsigned char*, que_thr_t const*, bool) storage/innobase/row/row0purge.cc:1062:4 (mysqld+0x20db46f)
    #13 row_purge(purge_node_t*, unsigned char*, que_thr_t*) storage/innobase/row/row0purge.cc:1111:18 (mysqld+0x20d8aa4)
    #14 row_purge_step(que_thr_t*) storage/innobase/row/row0purge.cc:1190:3 (mysqld+0x20d872c)
    #15 que_thr_step(que_thr_t*) storage/innobase/que/que0que.cc:1055:9 (mysqld+0x1fcfa6f)
    #16 que_run_threads_low(que_thr_t*) storage/innobase/que/que0que.cc:1117:14 (mysqld+0x1fcdd6e)
    #17 que_run_threads(que_thr_t*) storage/innobase/que/que0que.cc:1157:2 (mysqld+0x1fcd908)
    #18 srv_task_execute() storage/innobase/srv/srv0srv.cc:2520:3 (mysqld+0x21a6684)
    #19 srv_worker_thread storage/innobase/srv/srv0srv.cc:2567:7 (mysqld+0x21a6247)

  Previous write of size 8 at 0x0000039ee908 by thread T20:
    #0 ib_counter_t<long, 64, counter_indexer_t>::add(unsigned long, long) storage/innobase/include/ut0counter.h:132:16 (mysqld+0x21cd166)
    #1 ib_counter_t<long, 64, counter_indexer_t>::add(long) storage/innobase/include/ut0counter.h:122:34 (mysqld+0x21cc102)
    #2 rw_lock_x_lock_wait_func(rw_lock_t*, unsigned long, long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:489:38 (mysqld+0x21cb91f)
    #3 rw_lock_x_lock_low(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:538:3 (mysqld+0x21c9339)
    #4 rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/sync/sync0rw.cc:698:6 (mysqld+0x21c8c4d)
    #5 pfs_rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned int) storage/innobase/include/sync0rw.ic:568:3 (mysqld+0x1afbdb4)
    #6 buf_page_get_gen(page_id_t const&, page_size_t const&, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*) storage/innobase/buf/buf0buf.cc:4782:3 (mysqld+0x1b04e28)
    #7 btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long) storage/innobase/btr/btr0cur.cc:1312:10 (mysqld+0x1bf362c)
    #8 btr_pcur_open_low(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, unsigned long, mtr_t*) storage/innobase/include/btr0pcur.ic:457:8 (mysqld+0x20eecd0)
    #9 row_search_on_row_ref(btr_pcur_t*, unsigned long, dict_table_t const*, dtuple_t const*, mtr_t*) storage/innobase/row/row0row.cc:1030:3 (mysqld+0x20ee1b4)
    #10 row_purge_reposition_pcur(unsigned long, purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:103:23 (mysqld+0x20d7f6f)
    #11 row_purge_reset_trx_id(purge_node_t*, mtr_t*) storage/innobase/row/row0purge.cc:678:6 (mysqld+0x20dcbcd)
    #12 row_purge_record_func(purge_node_t*, unsigned char*, que_thr_t const*, bool) storage/innobase/row/row0purge.cc:1062:4 (mysqld+0x20db46f)
    #13 row_purge(purge_node_t*, unsigned char*, que_thr_t*) storage/innobase/row/row0purge.cc:1111:18 (mysqld+0x20d8aa4)
    #14 row_purge_step(que_thr_t*) storage/innobase/row/row0purge.cc:1190:3 (mysqld+0x20d872c)
    #15 que_thr_step(que_thr_t*) storage/innobase/que/que0que.cc:1055:9 (mysqld+0x1fcfa6f)
    #16 que_run_threads_low(que_thr_t*) storage/innobase/que/que0que.cc:1117:14 (mysqld+0x1fcdd6e)
    #17 que_run_threads(que_thr_t*) storage/innobase/que/que0que.cc:1157:2 (mysqld+0x1fcd908)
    #18 srv_task_execute() storage/innobase/srv/srv0srv.cc:2520:3 (mysqld+0x21a6684)
    #19 srv_worker_thread storage/innobase/srv/srv0srv.cc:2567:7 (mysqld+0x21a6247)
@svoj
Copy link
Contributor

svoj commented Jan 23, 2018

Two reasons why suggested change is not great:

  1. scalability: multi-instance counter was there exactly for this reason. Even with multi-instance counter I see srv_stats high in profiler when running concurrent benchmarks, it is going to become a disaster if we have just single instance.
  2. performance: even single thread performance may suffer since we touch these counters too often, though we can probably live with that.

I like atomic operations, but in this case we need to put additional effort rethinking srv_stats.

@svoj svoj self-assigned this Jan 23, 2018
@svoj svoj added this to the 10.3 milestone Jan 23, 2018
@svoj
Copy link
Contributor

svoj commented Jan 24, 2018

Closing as there're no objections so far. Feel free to reopen is you believe I'm wrong.

@svoj svoj closed this Jan 24, 2018
@kevgs
Copy link
Contributor Author

kevgs commented Jan 24, 2018

Hi. Micro benchmark shows that atomic counter is faster: https://github.com/kevgs/counter_benchmark/blob/master/README.md

@svoj
Copy link
Contributor

svoj commented Jan 24, 2018

I don't think benchmark is completely fair. E.g. ha_innobase::general_fetch() doesn't calculate random number, instead it uses thd_get_thread_id(trx->mysql_thd).

Second thing is: 4 cores is nothing these days. You can't get reasonable contention on such CPU.

@laurynas-biveinis
Copy link
Contributor

@kevgs
Copy link
Contributor Author

kevgs commented Jan 25, 2018

Without my_timer_cycles write performance is much better indeed:

BM_SimpleWrite64/threads:64                 1 ns          6 ns  120101504
BM_ComplexWrite64/threads:64                0 ns          0 ns 64000000000
BM_SimpleRead64/threads:64                  0 ns          0 ns 2158920192
BM_ComplexRead64/threads:64                 4 ns         24 ns   28624960
BM_SimpleMostlyWrite64/threads:64           9 ns         57 ns   12339072
BM_ComplexMostlyWrite64/threads:64          6 ns         40 ns   17363584
BM_SimpleMostlyRead64/threads:64            2 ns         12 ns   57982464
BM_ComplexMostlyRead64/threads:64          35 ns        230 ns    3027264

That's on the same 4 cores. I don't have a CPU with much cores and thus have no ability to test performance for a relevant case.

@svoj, thanks for your time.
@laurynas-biveinis, thanks for links. It was interesting to read.

@kevgs kevgs deleted the atomic_counter branch January 27, 2018 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants